Classics in the History of Psychology

An internet resource developed by
Christopher D. Green
York University, Toronto, Ontario

PHYSICAL AND MENTAL TESTS[1]
Baldwin, J.M., Cattell, J.M., & Jastrow, J. (1898)

First published in Psychological Review, 5, 172-179.


Professor Jastrow, of the University of Wisconsin, opened the discussion with a paper entitled 'Popular Tests of Mental Capacity.' He said that the determination of the extent, accuracy, alertness, efficiency and other measurable qualities of mental processes offers many interesting and perplexing problems to the practical psychologist. The primary motive in such experimentation is quite different from that which guides analytical research in the hands of the psychological expert. A very considerable and valuable portion of modern contributions to psychological principles and doctrine is the result of careful and ingenious analysis on the part of well-trained and scientifically self-observant experimentalists; and a main part of the equipment of a psychological laboratory may be wisely devoted to the purposes of such research. Supplementary to this is a line of investigation which aims to establish the normal capacity of simple and typical sensory, motor and intellectual endowments, as they occur in the average individual or in specially selected groups; such study, moreover, naturally includes as well the problems relating to the distribution of such powers, their development in child-growth, their relation to practical and daily pursuits, and their correlation with one another. As such investigation involves in most cases what is best described as a test of some specified capacity, and as such tests must be suited to the mental experience and endowment of the untrained and non-expert, the problem thus outlined may be appropriately referred to as the determination of the most suitable popular tests of mental capacity.

The problem falls into two divisions: (1) the selection of the capacities to be tested, and (2) the practical methods of testing them. The principles guiding the selection of tests will naturally vary with the special purposes for which various groups of tests may be undertaken. A usual and important purpose is that of collecting material for the study of general typical, characteristic endowment, much as the student of anthropometry desires first to establish standards of the principal [p.173] dimensions and proportions of the human frame. The selection of typical measurements in physical anthropometry is a much simpler matter than in mental anthropometry. Weight, height, girth, muscular power and the like are obviously the chief respects in which our bodies differ. To a certain extent it may be possible to obtain a consensus of standard mental measurements; and this is one of the main purposes of the Committee appointed by this Association.

The tests, apart from a few personal and anthropometric data included to make possible a comparison between physical and mental endowment, fall naturally into (a) the senses, (b) the motor capacities, and (c) the more complex mental processes. Certain general desiderata may, perhaps, be suggested as applicable to each of these groups. It is well to have each test give information regarding a single or very limited group of powers; specific typical tests are better than general ones. It is better to select, even if in part arbitrarily, one form of a certain sense capacity and to test that sufficiently to yield a definite result, rather than to attempt to test superficially various forms of manifestation of the same function. It is important to arrange a test so that it is definitely clear just what the capacity tested is; and, if necessary, preliminary experiments of an analytical character should be performed to determine this point. It is desirable that the form of capacity chosen shall be related to the activities of daily life; but it is also desirable that a high degree of efficiency in any test shall not be dependent upon experience, which certain groups of persons have a decidedly greater opportunity of acquiring than others. Hence it is often best to choose, as the basis of tests, sense-impressions which are unfamiliar to all. The conditions of the tests should be simple, easily intelligible, and, if possible, interesting, so as to induce on the part of the subject a willing coöperation, a natural attitude and a desire to do his best. The tests must occupy as short a time as possible, the apparatus be not easily disarranged by unskilled handling, not too expensive, and, in brief, be practically efficient.

Passing to details, the question of method is preëminent; this obviously differs essentially with the various types of tests. For the limits of sense capacity (sensitiveness), as determined by the minimum visible, the minimum audible, the minimum tangible, the question of method is substantially the question of experimental conditions, choice of apparatus and material. For the powers of discerning small differences between similar sense-impressions (sensibility), the methods are various and have been the subject of considerable controversy, some of which has become of historical interest. The question must be [p. 174] reconsidered in the light of popular, practical requirements, and, first of all, the necessity of securing a definite, even if not precise, result in a brief time. The method of reproducing a standard sense impression (involving in its calculation of results the determination of the average error) may be highly recommended as satisfying the requirement; but it is unfortunately not readily applicable to all the senses, being most naturally serviceable in the case of those senses which receive impressions quickly and pass readily from one sense impression to another. Ingenuity in arranging the apparatus may do much to minimize this disadvantage. The best substitute for the method of reproduction is the method of selection, the subject selecting one of a given series of sense-impressions arranged in orderly sequence as the equivalent of the standard impression. Of the two other most frequently employed methods, commonly known as the method of right and wrong cases, and of the just observable difference, the former occupies too much time and the latter is too vague in its results.
These methods may be variously modified to make them more suitable, as, for instance, the method of arranging a group of sense-impressions in their true order of size or degree, but the difficulty of interpreting the degree of error of the results thus obtained seems to outweigh their other advantages. Sensory tests may involve distinctions of kind as well as of degree, and in these the element of time as well as correctness may be profitably utilized as a criterion of efficiency. This is particularly applicable to distinctions of form and color.
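The average error that the method of reproduction relies on can be illustrated with a short sketch. This is a modern illustration only, not a procedure given in the discussion: it assumes the subject reproduces one standard stimulus several times, and the function name, the millimetre unit and the trial values are all hypothetical.

```python
from statistics import mean


def average_error(standard, reproductions):
    """Mean absolute deviation of the subject's reproductions from the standard.

    Assumption: every reproduction is expressed in the same unit as the standard.
    """
    return mean(abs(r - standard) for r in reproductions)


# Hypothetical data: a 100 mm standard line reproduced five times by one subject.
standard_mm = 100.0
trials_mm = [98.0, 103.0, 101.0, 96.0, 102.0]

print(average_error(standard_mm, trials_mm))  # 2.4
```

A handful of trials already yields one definite number per subject, which is the practical advantage claimed for this method over the method of right and wrong cases.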

In regard to motor tests, method is again mainly a question of the choice of apparatus and of the groups of muscles to be tested. The qualities of motor response of greatest importance in this connection are strength, which may mean the maximum efficiency at a given moment, or may mean endurance, rapidity of muscular contraction, steadiness or precise voluntary control, accuracy of movement both in itself and in coordination with the eye. The muscle groups to be selected are those which are frequently and familiarly used and easily subject to voluntary control.

The more complex mental tests form a somewhat heterogeneous group. Many of those which are apt to be considered in a popular investigation will involve in various ways a series of more or less complicated distinctions, and of appropriate responses to or modes of indicating the appreciation of such distinctions; while another group arises to test certain phases of memory, or association, or attention or imagination. The former group would be naturally termed reactions, [p.175] the time element or alertness furnishing the main test. Typical simple reactions to show the quickness of appreciation of the presence of a stimulus, and a sufficient variety of adaptive reactions to indicate clearly the strength of the powers of distinction and of choice, should form a part of the test. Here, perhaps, more than elsewhere, the adoption of suitable standards is essentially dependent upon coöperative effort. Here, too, the question of apparatus and material is of unusual importance; first, because apparatus for timing is apt to become elaborate, and, second, because the naturalness of the mental processes involved in the test is a function of the details of arrangement. Little can be said as yet of the tests of memory, attention, association, imagination and the like. They are eminently desirable, but in part seem to involve more accurate conditions than it is usually practicable to secure. The most hopeful plan is to have different investigations take up extensive series of tests of special forms of the above powers and let actual experience decide as to their merits.

These remarks are offered by way of a summary discussion of principles underlying the selection of specific tests; their application to the problems of mental anthropometry, now at issue, is a task to the solution of which, it is hoped, the members of the committee, as well as others, will contribute in the near future.

Professor Baldwin remarked upon the need of giving the tests as psychological a character as possible, thus justifying his preference for the exclusion of such physical tests as 'breathing capacity,' 'height sitting,' etc. He criticised Professor Jastrow's proposals as not giving a separate and prominent place to memory tests. He explained the three methods of testing memory which he developed in an earlier paper (reported to the Association at the New York meeting, December, 1893, by Warren, and published in full in the PSYCHOLOGICAL REVIEW for May, 1895).[2] He expressed a preference for the method of 'identification,' as opposed to those of 'selection' and 'reproduction,' on the grounds that reproduction measures or tests expression (which is very complicated) as well as memory-faithfulness, and that selection involved complications arising from contrast, suggestion, etc. The objection that testing by identification involves the use of the [p.176] method of right and wrong cases in computing the results is met when one takes a number of persons together (which need not occupy a longer time); while testing by reproduction, besides being liable to the objection stated above, is open to the difficulty of estimating the errors by the method of mean errors. For example, the calculation of errors in a reproduced series of numbers assumes a criterion as to what sort of a variation in the series shall be called 'one' error. The speaker reported further work in his laboratory on the memory methods. He hoped soon to test a new method suggested by him in 1892, which he called the 'dynamogenic method' ('Proc. London Congress,' p. 51; cf., 'Ment. Devel. in the Child and the Race,' 2d ed., p. 395).
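The difficulty of deciding what counts as 'one' error in a reproduced series of numbers can be made concrete with a small sketch. Neither criterion below is taken from the discussion. Both are modern illustrations chosen here as assumptions: a position-by-position comparison, and an edit distance that counts single insertions, omissions and substitutions. The function names and the digit series are hypothetical.

```python
from itertools import zip_longest


def positional_errors(shown, recalled):
    """Criterion A (assumed): every position that fails to match is one error."""
    return sum(a != b for a, b in zip_longest(shown, recalled))


def edit_distance(shown, recalled):
    """Criterion B (assumed): each omission, insertion or substitution is one error."""
    prev = list(range(len(recalled) + 1))
    for i, a in enumerate(shown, start=1):
        curr = [i]
        for j, b in enumerate(recalled, start=1):
            curr.append(min(prev[j] + 1,              # digit omitted by the subject
                            curr[j - 1] + 1,          # extra digit inserted
                            prev[j - 1] + (a != b)))  # substituted (or matching) digit
        prev = curr
    return prev[-1]


# Hypothetical trial: the subject drops the second digit and recalls the rest in order.
shown = [3, 7, 1, 9, 4, 2]
recalled = [3, 1, 9, 4, 2]

print(positional_errors(shown, recalled))  # 5
print(edit_distance(shown, recalled))      # 1
```

The same reproduction scores five errors under one criterion and one under the other, so a mean error computed from such scores depends on the criterion adopted.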

Professor Cattell, chairman of the committee on physical and mental tests of the Association, said that the report of the committee presented at the Boston meeting, being placed at the end of a crowded program, could not be discussed or even read. The committee have consequently not undertaken to prepare a new report for the present meeting, regarding it as more profitable to present before the Association the individual opinions embodied in the report, with a view to securing a full discussion.

The committee agreed to take as the basis of their report a series of tests that could be made on one subject within one hour, and to select the tests and methods with special reference to college students tested in a psychological laboratory. We urged that such tests be made as far as possible in all psychological laboratories, and recommended that a variety of tests and methods be tried, and the results reported to the committee.

The report, already presented and published,[3] shows the individual experience and opinions of the five members of the committee, whence have come to light some agreement and some diversity. The following tests, with some variations in the methods of carrying them out, are recommended unanimously by the committee:

Preliminary data: Date of birth; birthplace; birthplace of father; birthplace of mother; occupation (including class in college, or if not a student, the last educational institution attended); occupation of father; any measurements previously made.
Physical measurements: height, weight and size of head; Keenness of vision; Color vision; Keenness of hearing; [p.177] Sensitivity to pain; Perception of weight or of force of movement; Dynamometer pressure of right and left hands; Reaction time with sound; Rate of discrimination and movement; Perception of time; Memory; Imagery.

All the members of the committee except Professor Baldwin recommend the following: Breathing capacity; Fineness of touch or sensation-areas; Rapidity of movement; Visual perception of size.
All except Professor Witmer recommend: Perception of pitch.

We have consequently, in addition to preliminary records, nineteen tests, all measurements except imagery, recommended with tolerable unanimity. It would require from 30 to 40 minutes to make these, and we have consequently only about 20 minutes of the hour left for additional tests, and I am of the opinion that it is desirable to have that much diversity of work in different laboratories. It seems profitable that at Princeton they should try Ebbinghaus' test of apperception; at Clark, Bergström's card-sorting with practice; at Wisconsin, throwing a marble at a target; at Pennsylvania, will-power and the knee-jerk, and that we at Columbia should measure after-images. I differ from my colleagues on the committee only in so far as I do not place after-images in the series recommended to all laboratories, but reserve it for our private exploitation at Columbia.
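The figure of nineteen tests and the time left over in the hour can be checked with a short tally. This is a sketch under one assumption that the text does not state outright: height, weight and size of head are counted as three separate tests. The variable names are illustrative only.

```python
# Tests recommended unanimously (height, weight and head size assumed to count separately).
unanimous = [
    "Height", "Weight", "Size of head",
    "Keenness of vision", "Color vision", "Keenness of hearing",
    "Sensitivity to pain", "Perception of weight or of force of movement",
    "Dynamometer pressure of right and left hands", "Reaction time with sound",
    "Rate of discrimination and movement", "Perception of time",
    "Memory", "Imagery",
]
all_but_baldwin = [
    "Breathing capacity", "Fineness of touch or sensation-areas",
    "Rapidity of movement", "Visual perception of size",
]
all_but_witmer = ["Perception of pitch"]

recommended = unanimous + all_but_baldwin + all_but_witmer
measurements = [t for t in recommended if t != "Imagery"]

hour = 60
remaining_min = hour - 40  # if the series takes the full 40 minutes
remaining_max = hour - 30  # if the series takes only 30 minutes

print(len(recommended))              # 19
print(len(measurements))             # 18
print(remaining_min, remaining_max)  # 20 30
```

On this reading the "about 20 minutes" available for additional tests corresponds to the upper estimate of 40 minutes for the common series.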

In addition to tests recommended by all or nearly all the committee, and those recommended by one member only, there are certain tests recommended by two or three of us, and these deserve special consideration. A test for muscular fatigue is recommended by Professors Baldwin and Witmer and by me. I regard this as a desirable test. Ten pressures on a dynamometer (I prefer the thumb and forefinger) can be registered on a kymograph as quickly as and more accurately than two or three readings can be taken with the hand dynamometer. We secure a strength record subject to mental conditions, and the fatigue curve is typical of the attitude and temperament of the observer. I believe that the only other test recommended by me, and not by at least four members of the committee, is the measurement of height sitting. I do not regard this as important, but it gives a typical individual [p.178] racial distinction, and it is desirable to have a case for the study of the correlation of measurements, where the error of measurement is small. I think it desirable, as recommended by Professors Baldwin and Sanford, to both read and show the numerals in the memory test. We are doing this at Columbia, and the test is consequently now recommended by a majority of the committee.

The tests recommended by two or three members of the committee, but not by me, are as follows: Accuracy of aim in touching a point with the hand is recommended by Professors Baldwin, Sanford and Witmer, and an analogous test, throwing a ball at a target, by Professor Jastrow. At Columbia we also use a form of this test, letting the observer join two points with a pencil, but we do not find it very satisfactory, and I do not include it in my series. It seems to be somewhat difficult to secure uniform method, but I shall be glad to see results when published.

Card-sorting as a measure of quickness of distinction and movement is recommended by Professors Baldwin, Jastrow and Sanford. The difficulty here seems to me to be that we do not know whether we are measuring the natural quickness of the student or how late he stays up at night playing whist or poker. I think that it might be better to use counters or marbles. Should the subject be required to sort one hundred colored balls, it might prove a good test for color discrimination, as well as for quickness. The analogous test of marking objects on a sheet of paper, recommended by all of us, seems to me, however, largely to cover the ground.

Professors Jastrow and Witmer recommend Jastrow's combined test of memory, association and finding time and my test of the accuracy of observations and recollection. The former of these seems to me a good test if used universally and with uniform method, but I think it best to await publication of the results obtained, before recommending it for a short series. My test on observation and recollection suffers also from the artificial character of the questions asked, and further from the difficulty of using the same questions with students who will discuss the matters together before they are all tested. Professors Jastrow and Witmer recommend the accuracy of movements to the right and to the left, but Professor Jastrow does not insist on this as a test of great importance. I am inclined to think, that series of movements made simultaneously by the two hands intended to be equal and registered in a kymograph would be a good test, but that it should be further studied before being recommended for this series.

Time will not permit me to discuss the tests recommended by one [p.179] member of the committee only, or the variations in method. I should suppose that all the tests recommended in the series proposed, respectively, by Professors Baldwin, Sanford and Witmer, could not be made in one hour, but do not know whether they have tried them or not. For example, Professor Sanford recommends testing perception of pitch to one-half vibrations with tuning-forks, whereas it seems to me that this test alone would require at least one hour. The tests of movement, fatigue and attention proposed by Professor Witmer would also, I should suppose, take an hour. The measurement of motor and sensory reactions recommended by Professor Baldwin seems to me rather a subject for research than for anthropometric tests. But I admit that exactly the tests most interesting to the psychologist are those most difficult to make in three minutes. We can measure the body and make certain tests upon the senses quickly and accurately, but others of our tests, owing to variations in the process and method, scarcely give the real individual aptitude or difference. However, if we test a hundred students, we secure at all events a satisfactory class measure and variation.
 

Footnotes
 

[1]Abstracts of the discussion presented by members of the committee of the American Psychological Association at the Ithaca meeting.

[2]Similar methods were independently expounded by Binet in 1894, in his Psychologie expérimentale. Comparing his methods with these, in the Année Biologique, I, p. 608, he says that they were "proposés par V. Henri et moi pour la première fois; Baldwin a indiqué des méthodes analogues, d'une manière tout à fait indépendante" ("proposed by V. Henri and me for the first time; Baldwin has indicated analogous methods, in an entirely independent manner"), evidently overlooking the report of them before this Association in New York.

[3]The Psychological Review, March, 1897; Science, February 5, 1897.