Psychological measurement plays an important role in modern society. Teachers have schoolchildren tested for dyslexia or hyperactivity, parents have their children's interests and capacities assessed by commercial research bureaus, countries test entire populations of pupils to decide who goes to which school or university, and corporate firms hire other corporate firms to find the right person for the job. The diversity of psychological characteristics measured in such situations is impressive. There exist tests for measuring an enormous range of capacities, abilities, attitudes, and personality factors; these tests are said to measure concepts as diverse as intelligence, extraversion, quality of life, client satisfaction, neuroticism, schizophrenia, and amnesia. The ever-increasing popularity of books of the test-your-emotional-intelligence variety has added to the acceptance of psychological testing as an integral element of society.
When we shift our attention from the larger arena of society to the specialized disciplines within scientific psychology, the list of measurable psychological attributes does not become shorter but longer. Within the larger domain of intelligence measurement, we then encounter various subdomains of research where subjects are being probed for their levels of spatial, verbal, numerical, emotional, and perceptual intelligence; from the literature on personality research, we learn that personality is carved up into the five factors of extraversion, neuroticism, conscientiousness, openness to experience, and agreeableness, each of these factors itself being made up of more specific subfactors; and in clinical psychology we discover various subtypes of schizophrenia, dyslexia, and depression, each of which can be assessed with a wide variety of psychological tests. In short, scientific psychologists have conjured an overwhelming number of psychological characteristics, each of which can be measured with an equally overwhelming number of testing procedures. How do these procedures work?
Consider, as a prototypical example, the measurement of intelligence. Intelligence tests consist of a set of problems that are verbal, numerical, or figural in character. As can be expected, some people solve more problems than other people. We can count the number of problems that people can solve and look at the individual differences in the computed scores. It so happens that the discovered individual differences are relatively stable across adulthood. Also, different tests for intelligence tend to be positively correlated, which means that people who solve more verbal problems, on average, also solve more numerical and figural problems. There is thus a certain amount of consistency of the observed differences between people, both across time periods and across testing procedures.
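The procedure just described can be made concrete with a small sketch. It is only an illustration under stated assumptions: the response data are invented, the helper names `sum_scores` and `pearson` are hypothetical, and scoring an item as 1 (solved) or 0 (not solved) is a simplification of how actual intelligence tests are scored.

```python
from math import sqrt


def sum_scores(responses):
    """Count the problems each person solved (1 = solved, 0 = not solved)."""
    return [sum(person) for person in responses]


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented data: five people (rows), four problems per test (columns).
verbal = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
numerical = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

verbal_scores = sum_scores(verbal)        # individual differences on the verbal test
numerical_scores = sum_scores(numerical)  # individual differences on the numerical test

# A positive value means people who solve more verbal problems
# also tend to solve more numerical problems.
r = pearson(verbal_scores, numerical_scores)
print(verbal_scores, numerical_scores, round(r, 2))
```

In this toy data set the two sets of sum scores correlate positively, which is the kind of consistency across testing procedures referred to above. Note that the computation itself says nothing about what, if anything, the scores measure.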
As soon as a way is found to establish individual differences between people, all sorts of correlations between the test scores and other variables can be computed. So, we can investigate whether people with higher intelligence test scores, when compared with people who obtain lower test scores, are more successful on a job; whether they make more money, vote differently, or have a higher life expectancy. We can look into differences in intelligence test scores as a function of background variables like sex, race, or socio-economic status. We can do research into the association between intelligence test scores and neural speed, reaction time, or the amount of grey matter inside the skull. We find a diverse array of associations and mean differences. Some are large and stable, others small and difficult to replicate. And so the mechanism of science has been set in motion. Under which conditions, precisely, do certain effects occur? Which variables mediate or moderate relations between intelligence test scores and other variables? Are these relations the same in different groups of people? Once the scientific engine is running, more research is always needed.
However, a nagging question remains. Do such tests really measure something and, if so, what is it?
This book originates from my attempts to make sense of this question, which is encountered in virtually every field where psychological tests are used. In the past century, it has become known as the problem of test validity. Test validity has proven to be an elusive concept, which is illustrated by the fact that empirical validation research tends to be highly inconclusive. The issue remains problematic, in spite of the fact that psychological tests have come to belong to the standard equipment of social science research, and in spite of the enormous amounts of empirical data that have been gathered in psychological measurement. In fact, after a century of theory and research on psychological test scores, for most test scores we still have no idea whether they really measure something, or are no more than relatively arbitrary summations of item responses.
One of the ideas behind the present book is that this situation persists partly because too little attention has been given to a conceptual question about measurement in psychology: what does it mean for a psychological test to measure a psychological attribute? The main goal of this book is to investigate the possible answers that can be given in response to this question, to analyse the consequences of the positions they entail, and to make an informed choice between them.