09 Jun
In Module 5, we considered the third in our three-part series on research design. Specifically, the focus was on longitudinal studies, in which the researcher explicitly accounts for change over time in order to understand specific effects. Also, as with the prior week, this module's assigned readings provided a bit of historical context for the debate between scholars who argue that criminology/criminal justice research either should or should not incorporate longitudinal designs, for various reasons.
In this week's discussion, select any topic of your choice within criminology/criminal justice (it could be a specific theory, policy, justice system response, etc.) and explain to your classmates how that example either would or would not benefit from longitudinal study. But wait! Don't JUST tell us that example XYZ might change over time, and therefore it should be studied with longitudinal designs. Instead, explain WHY change over time might matter in your example. Is it because of generational effects between GenX and Millennials? Is it because of technology or some other external factor? Is it because society sees certain problems differently? Explain why and justify your points. You don't necessarily need to fetch journal articles to help, but if you wish to cite published evidence, that's okay too.
THE METHODOLOGICAL ADEQUACY OF LONGITUDINAL RESEARCH ON CRIME*
MICHAEL GOTTFREDSON TRAVIS HIRSCHI
University of Arizona
This paper argues that the increasing dominance in contemporary criminology of the longitudinal or cohort study is not justified on methodological grounds, that this research design has taken criminological theory in unproductive directions, has produced illusory substantive findings, and has promoted policy conclusions of doubtful utility. In addition, it is noted that longitudinal research is very expensive and therefore has high opportunity costs, costs that have not been properly evaluated. The positive thesis is that many of the apparent benefits of longitudinal research can be obtained by carefully designed and reasonably conceptualized cross-sectional studies, at substantially reduced cost.
As a by-product of research on the relationship between age and crime (Hirschi and Gottfredson, 1983), we came to the conclusion that the concept of a "career criminal" was inconsistent with the evidence (Gottfredson and Hirschi, 1986) and that longitudinal research designs were not necessary for the study of crime and criminality. The conclusion of age invariance in crime proved to be controversial (Greenberg, 1985; Farrington, 1986a), as did the conclusions about the value of the criminal careers concept. If controversy implies disagreement, the least controversial conclusion of this research was that longitudinal research is not required for the study of crime. Almost no one agrees with this conclusion; on the contrary, nearly everyone agrees, it seems, that longitudinal research is essential for the proper study of crime causation.
Indeed, groups of the nation's leading scholars have recently called for even greater emphasis on longitudinal designs in criminology. (These groups include the National Academy of Sciences panels on deterrence and incapacitation (Blumstein, Cohen, and Nagin, 1978) and on criminal careers (Blumstein, Cohen, Roth, and Visher, 1986), and a MacArthur Foundation panel on criminological knowledge (Farrington, Ohlin, and Wilson, 1986).) Obviously, if more longitudinal research is required for the proper study of crime causation, our conclusion about the value of this design is wrong. Since the
*We wish to thank David Riley, Doug Wholey, and David Farrington for comments on an early draft of this paper. Work on portions of this manuscript was supported by National Science Foundation grant SES8500244.
CRIMINOLOGY VOLUME 25 NUMBER 3 1987 581
582 GOTTFREDSON AND HIRSCHI
conclusion about longitudinal designs derived from substantive and theoretical considerations, the question of the methodological adequacy of longitudinal designs may be seen as a question of theory and substance. But this position is not new. On the contrary, the adequacy of a research design is always judged by its relevance to questions of substance, theory, or policy (and not vice versa). Therefore, the preference for longitudinal designs must also stem from theoretical or policy preferences. Since methods cannot be divorced from substance and theory, they cannot be evaluated in isolation from issues of substance and theory. The purpose of this paper is to subject arguments favoring longitudinal designs in criminology to substantive, theoretical, and methodological analysis.
Early Longitudinal Research. Systematic empirical research on crime in this country is perhaps best traced to the pioneering work of Glueck and Glueck (1930, 1940). The Gluecks' early work (for example, 500 Criminal Careers, 1930) was primarily longitudinal, following large numbers of offenders over long periods of time. A major problem with such research was location of a reasonable comparison group. The Gluecks initially relied on statistics from the general population for this purpose,1 but they eventually turned to comparisons of matched samples of offenders and nonoffenders (on such things as age, IQ, and neighborhood) in a standard cross-sectional design. Their major cross-sectional study was published in 1950 as Unraveling Juvenile Delinquency. The Gluecks continued to follow over many years the 500 delinquents and 500 nondelinquents first identified for Unraveling, and published the results in 1968 under the title Delinquents and Nondelinquents in Perspective.
The Gluecks' major work is Unraveling Juvenile Delinquency. Although citations are only one measure of influence of scholarly works, Wolfgang, Figlio, and Thornberry (1978) report that as of 1972 Unraveling was the most heavily cited book in criminology. Since publication of the longitudinal work (Delinquents and Nondelinquents in Perspective), the cross-sectional study has been cited 403 times (through 1984) and the longitudinal study 89 times, less
1. In comparison with the general population, the delinquents studied by the Gluecks in the 1920s were more likely to have dropped out of school, to have come from unhealthy home situations, to have bad companions, to come from densely populated areas, to be educationally retarded, to have started work at an early age, to have moved frequently from place to place at an early age, and to have engaged in sex, gambling, and drugs (Glueck and Glueck, 1930: 306-309). Despite the myriad criticisms of the Gluecks' work, their findings have proven to be amazingly consistent with subsequent research.
At the time the Gluecks were collecting data for Unraveling Juvenile Delinquency, the Cambridge-Somerville Youth Study got under way. This important study (Powers and Witmer, 1951; McCord and McCord, 1959) is essentially a long-term follow-up of the effects of a counseling program on potential delinquents. It is mentioned here as further evidence of the long history of longitudinal research in criminology, a history not always acknowledged by recent calls for increased emphasis on this design.
LONGITUDINAL RESEARCH 583
than one fourth as often (Social Science Citation Index, 1966-1984). Given the tremendous overlap in measurement, conceptualization, and analysis, the disproportionate influence of the cross-sectional study, Unraveling, over the longitudinal study, Delinquents and Nondelinquents, does not seem to favor the idea that, all else equal, longitudinal research is more important or valuable for criminology. (After all, to use a criterion much favored by longitudinal researchers, in this comparison the Gluecks serve as their own control.) Despite the claims of those favoring longitudinal designs (Farrington et al., 1986), other comparisons of longitudinal and cross-sectional research lead to the same conclusion: the assertion that longitudinal research, study for study, has had greater impact than cross-sectional research is inconsistent with the evidence.
What, then, accounts for the widespread view that longitudinal designs in criminology should be preferred to cross-sectional designs? We see two answers to this question. The first involves alleged methodological superiority, according to which longitudinal research solves causal questions beyond the reach of cross-sectional designs. The second involves alleged substantive superiority, according to which the facts of crime and criminality require longitudinal designs for their explication. The adequacy of each answer will be considered in turn.
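The citation comparison above is simple arithmetic and can be checked directly. The following is a minimal sketch that uses only the two counts reported in the text; the variable names are ours and carry no other meaning.

```python
# Citation counts as reported in the text (Social Science Citation Index, 1966-1984).
cross_sectional_citations = 403  # Unraveling Juvenile Delinquency
longitudinal_citations = 89      # Delinquents and Nondelinquents in Perspective

# Ratio of longitudinal to cross-sectional citations.
ratio = longitudinal_citations / cross_sectional_citations

print(f"{ratio:.3f}")   # 0.221
print(ratio < 0.25)     # True: less than one fourth as often
```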
DESIGN FEATURES OF LONGITUDINAL RESEARCH
The ideal design in scientific research is the true experiment, where subjects are randomly assigned to treatment conditions and the effects of the various treatments are then compared (see Campbell and Stanley, 1963: 3, who refer to experimentation as "the basic language of proof, as the only decision court for disagreement between rival theories …"). This design, with sufficient replication, uniquely satisfies the three criteria of causation: (1) association between cause and effect; (2) temporal precedence of cause over effect; (3) nonspuriousness (Hirschi and Selvin, 1967; Cook and Campbell, 1979). All other designs are inferior, but some are better than others. For example, Cook and Campbell (1979) persuasively argue that next to the true experiment in scientific adequacy is the quasi-experiment, a design involving some of the active intervention of the true experiment without its control over extraneous conditions. (This design satisfies the first two criteria of causality and sometimes, if one is lucky, the third as well.) Further down the list of scientific adequacy, one finds passive observational designs, where the investigator takes what nature gives and attempts to infer the elements of causation through correlational or similar statistical methods. Passive designs are able to establish correlations between variables, but they have difficulty distinguishing cause from effect, and they can only weakly approximate the experiment by statistically controlling for other variables that may be producing the correlations of
interest. These difficulties cannot be overcome by advances in multivariate statistics, since such statistics remain ambiguous substitutes for manipulation and randomization (Cook and Campbell, 1979: 9).
Of course, experimentation, even quasi-experimentation, is not often possible with phenomena such as crime, and researchers are forced to do the best they can with the less than ideal designs available to them. (It should be noted that designs ideal in theory are themselves often less than ideal in practice. Thus, many true experiments are simply uninformative, and others are more trouble and expense than they are worth.) Nonexperimental designs also differ among themselves with respect to the extent to which they satisfy the criteria of causation, with respect to other valid scientific criteria such as external validity of their results and their compatibility with the phenomenon at issue, and with respect to such nonscientific but important criteria as cost in time and money. Obviously, selection of an appropriate research design involves consideration of a good many things, a simple fact that makes a priori design preference by funding agencies or panels of academicians (Morris, 1986; Farrington et al., 1986; Blumstein et al., 1986) hard to justify. Such considerations apply as well to critics of the longitudinal design. There can be no a priori objection to longitudinal research. The question is whether longitudinal research is appropriate to the study of crime and criminality, whether its methodological strengths compensate for its methodological deficiencies, and whether its efficiencies outweigh its costs.
The longitudinal design involves repeated measures of the same subjects, where the frequency and duration of the "follow-up" is a function of the phenomenon in question. In longitudinal studies of crime, the researcher sometimes collects data every year (for example, Elliott, Huizinga, and Ageton, 1985), sometimes every two years (West and Farrington, 1977), but usually at longer intervals (McCord and McCord, 1959; Glueck and Glueck, 1968; Wolfgang, Figlio, and Sellin, 1972). The longitudinal design does not entail any particular method or frequency of data collection, sampling strategy, method of analysis, or project duration. However, some argue that the timing of data collection is a crucial distinction among longitudinal studies, with the "prospective" study (where subjects are identified before the events of interest occur) usually seen as preferable to the "retrospective" study (where subjects are identified after the events of interest have taken place). (The prospective study does have advantages. For example, the measurement of independent variables cannot be affected by knowledge of the values of the dependent variable. However, the prospective study has disadvantages as well: it costs more (in time, money, and sample size) and is considerably more risky (in terms of theory, data analysis, sample adequacy, and the like).) Some (Farrington, 1979) regard the frequency of data collection as a crucial design issue, with frequent data collection being superior to infrequent collection for purposes of making causal inferences. Unfortunately, more data are
not necessarily better than less data, especially if they are essentially the same data. Practice or testing effects are often substantial in survey research, and the stability of characteristics is often insufficiently known to justify a priori claims to superiority of "more frequent" data collection. (After all, some stability in behavior is required to justify the "developmental" or longitudinal study in the first place.)
Some regard sampling distinctions to be critical in evaluating these designs. When all subjects share a common experience (for example, are born in a single hospital or in a single year), longitudinal studies are called "cohort" studies. When the sample includes people from more than one cohort (for example, people born in two different years), longitudinal studies are called multicohort studies. When such a study completes a second wave of data collection, it becomes a multiwave, multicohort study. The multiwave, multicohort study is designed to allow separation of the effects of age, period, and cohort. It is designed to determine whether it matters that the subjects were born in a particular year (or hospital), that they are a particular age, and that the study was conducted at a particular period of time. Obviously, the multiwave, multicohort study is typically thought to be better than the single-wave, single-cohort study and, of course, better than the cross-sectional study, which, in terms thus far introduced, turns out to be a retrospective, single-wave, multicohort study with minimum frequency of data collection.
One of the alleged strengths of the prospective longitudinal study is that it entails lack of knowledge of "outcome" variables. In other words, the longitudinal researcher does not know which subjects are going to be delinquents and which are going to be nondelinquents. When confronted with this problem, standard sampling strategy would be to stratify the population on variables known to be closely associated with the dependent variable and to oversample subjects likely to be delinquent (a strategy that assumes stability in the correlates of crime or in the tendency to commit criminal acts). Longitudinal researchers in crime and delinquency tend not to adopt this common procedure (Elliott et al., 1985; Wolfgang et al., 1972; Tracy, Wolfgang, and Figlio, 1985). Some see the frequency of data collection as critical, restricting the longitudinal study to those cases where there is more than one "follow-up" or more than one contact involving information about the subjects. Others allow the term to be applied to one-time studies that gather information about more than one point in time in the subject's life. All of these distinctions must be treated as measurement, sampling, or theoretical issues rather than design issues, since none of them bears on the adequacy of the longitudinal design as a basis for causal inference. Moreover, these research choices have direct counterparts in all other designs, whether experimental, quasi-experimental, or, as in the case of the longitudinal study, preexperimental or passive observational.
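The design vocabulary in this paragraph can be made concrete by listing which (cohort, period, age) cells each design observes. The sketch below is ours and rests on stated assumptions: the birth years and survey years are hypothetical, the function names are invented for illustration, and age is taken to be survey year minus birth year.

```python
from collections import defaultdict

def design_cells(birth_years, survey_years):
    """List (cohort, period, age) for every cohort observed at every wave."""
    return [(b, s, s - b) for b in birth_years for s in survey_years if s >= b]

def cohorts_per_age(cells):
    """Map each observed age to the set of cohorts observed at that age."""
    by_age = defaultdict(set)
    for cohort, _period, age in cells:
        by_age[age].add(cohort)
    return dict(by_age)

# Hypothetical designs, chosen only for illustration.
cross_section = design_cells([1960, 1962, 1964], [1980])           # single wave, multicohort
single_cohort = design_cells([1960], [1976, 1978, 1980])           # multiwave, single cohort
multi = design_cells([1960, 1962, 1964], [1978, 1980, 1982])       # multiwave, multicohort

# In the cross-section every age belongs to exactly one cohort, so age and
# cohort cannot be told apart without outside information.
print(cohorts_per_age(cross_section))

# In the multiwave, multicohort design the same age is observed for several
# cohorts (at different periods), which is what the design relies on.
print(cohorts_per_age(multi))
```

In every cell, age plus cohort still equals period, so the three quantities are never free to vary independently; the sketch only shows which comparisons each design makes available.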
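The "standard sampling strategy" mentioned above, stratifying on a correlate of the outcome and oversampling the stratum likely to contain delinquents, can be sketched as follows. This is an illustration under our own assumptions, not a procedure taken from any of the studies cited: the population, the stratum labels, the sample sizes, and the function name are all hypothetical.

```python
import random

def stratified_oversample(population, stratum_of, n_per_stratum, seed=0):
    """Draw a fixed number of units from each stratum and attach design weights.

    Each sampled unit gets weight N_h / n_h (stratum size over stratum sample
    size), so weighted totals still estimate population totals.
    Assumes every stratum is asked for at least one unit.
    """
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for name, units in strata.items():
        n = min(n_per_stratum[name], len(units))
        weight = len(units) / n
        for unit in rng.sample(units, n):
            sample.append((unit, name, weight))
    return sample

# Hypothetical population of 1,000 youths: ids 0-99 flagged "high_risk" on some
# known correlate of delinquency, the remaining 900 "low_risk".
population = list(range(1000))
stratum_of = lambda i: "high_risk" if i < 100 else "low_risk"

# Oversample the small high-risk stratum: 50 from each stratum.
sample = stratified_oversample(population, stratum_of,
                               {"high_risk": 50, "low_risk": 50})

high = [w for _, s, w in sample if s == "high_risk"]
print(len(sample), high[0], sum(w for _, _, w in sample))
```

High-risk youths make up half of the sample although they are a tenth of the population, and the weights (2.0 and 18.0 here) restore the population proportions at the analysis stage.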
CAUSAL ORDER
In the criminological literature, it is frequently asserted that the longitudinal design is superior to the cross-sectional design when one is interested in the problem of causal order. Elliott and Voss (1974: 7-8) assert that
Because of the difficulty involved in establishing the temporal order of variables, causal inferences are difficult to derive from cross-sectional data. Data gathered at one point in time generally preclude insight into developmental sequences or processes that lead to delinquent behavior or dropout. . . . [T]he availability of data gathered at different points in time permits assessment of the direction and amount of change in these scores during the course of the study and enables us to derive causal inferences.
Farrington (1986a: 212) asserts: "Another advantage of a longitudinal study is its superiority over cross-sectional research in establishing cause and effect, by showing that changes in one factor are followed by changes in another." Petersilia (1980: 337) claims that longitudinal research is "superior to cross-sectional if one is primarily interested in drawing causal inferences." Blumstein et al. (1986: 199) provide a list of substantive areas in which the longitudinal design is "required":
Many issues about criminal careers cannot be adequately addressed in cross-sectional research: the influence of various life events on an individual's criminal career; the effects of interventions on career development; and distinguishing between developmental sequences and heterogeneity across individuals in explaining apparent career evolution. Answering these and related questions requires a prospective longitudinal study of individuals of different ages.2
Such statements illustrate the extent of the belief that longitudinal designs solve the problem of causal order (although they suggest that faith in the design extends beyond the causal order question). They do not, however, provide evidence or even illustration of the actual ability of the design to produce such solutions. Nor, for that matter, do these proponents of longitudinal research show that causal order is an especially difficult problem for criminology. Is causal order especially problematic in crime and delinquency
2. Blumstein et al. (1986) urge the longitudinal design, apparently for the same reasons it has been urged for about 100 years. According to Wolfgang et al. (1972: 5), Kohner asserted in 1893 that "correct statistics of offenders can be developed by a study of the total life history of individuals." Without discussing the merits of the theoretical issues raised by the Blumstein et al. characterization of the problem as being the "inherently longitudinal and dynamic characteristics of the criminal career," it is apparent that causal order is only one of the several claims for methodological superiority being attributed to the longitudinal design.
research? However defined, criminal acts and delinquencies are not temporally ambiguous. Typically, their occurrence can be pinpointed to the minute or hour. When was the liquor store robbed? When did the assault or burglary or homicide or arson take place? These are not inherently difficult or ambiguous questions. Some crimes or delinquencies are of course more difficult to locate precisely in time. When did the child begin to use cigarettes or drugs? When did the child become incorrigible? But even these more ambiguous offenses can be located in time with sufficient precision to allow unambiguous conclusions about temporal sequence, at least with respect to most nontrivial causal variables.
So, even if the researcher's interest is in sequences of criminal events (in order to more accurately describe the life history of the career criminal), it should be easy to say where an event occurred in a sequence of events.
It would appear, then, that the causal order problem must be attributable to difficulties in establishing the order of crime and its potential causes. Since crime is, in these terms, relatively nonproblematic, one is led to infer that the potential causes of crime are especially problematic in terms of when they occur.
What causes of crime are seen by longitudinal researchers as problematic in these terms? One can infer from their discussions four categories of variables for which temporal order is problematic to longitudinal researchers: (1) age, period, and cohort; (2) standard causal variables thought to be implicated in crime and delinquency causation; (3) treatment and criminal justice interventions; and (4) the effects of ordinary life events.
AGE, PERIOD, AND COHORT
In standard cross-sectional research, the observation that age is correlated with crime is subject to alternative interpretations. Differences apparently due to age (for example, higher rates among the young) may in fact be due to recent changes in economic or social factors important in crime causation. Suggestions that apparent age effects may be period or cohort effects are a major justification for longitudinal research for those urging a greater emphasis on this design (Farrington, 1986a; Blumstein and Cohen, 1979; Blumstein et al., 1986; Greenberg, 1985). If a single cohort (persons born within a limited period of time) is followed from birth to death, age differences cannot be due to between-cohort differences. Unfortunately, they may be due to period effects; that is, it may be that high-rate ages reflect nothing more than high rates of crime in the society when the cohort was at a particular age. Obviously, the cohort design must be complicated to allow resolution of this problem (or the researcher must look at data not collected as part of the cohort design), but the reader will note that none of this bears on the question of causal order, the question with which we are now dealing. However complex
age, period, and cohort questions may become in terms of determining which of these three variables may be responsible for observed differences, there seems to be little controversy about whether they precede or follow crime. Since crime cannot cause age, period, or cohort, a longitudinal study is not required to answer the causal order question.
The much-touted ability of the complex longitudinal study to separate age, period, and cohort effects could not be of less theoretical or practical consequence. In fact, a good case could be made for the view that concern for this distinction has distracted attention from more interesting crime data and has caused the field to misinterpret long available data on the issue. For example, longitudinal researchers frequently wonder whether the apparent age distribution of crime could be a "cohort" or a "period" effect rather than an age effect (Cohen and Land, 1987; Farrington, 1986a; Blumstein and Cohen, 1979; Greenberg, 1985) when an empirical answer to this question may be had by examining the age distribution of crime for differing periods and cohorts (Hirschi and Gottfredson, 1983; Gottfredson and Hirschi, 1986). For that matter, the longitudinal study appears to be a grossly inefficient method of discovering period effects. The post-World War II crime wave was well documented by ordinary cross-sectional data long before it was reported by longitudinal researchers (Tracy et al., 1985). Concern for cohort effects is even more puzzling. Suppose it is found that a given cohort has a higher crime rate than an adjacent cohort, when age and period effects have been removed. What is to be made of this difference? That is, what life circumstances distinguish one cohort from the other? The answer, alas, is that the number of possible explanatory variables is for all practical purposes unlimited. It could be the size or composition of the cohorts; it could also be that they were of different ages when one of many natural catastrophes occurred. (A further irony of interest in "cohort effects" is that they can be identified only long after their occurrence. They are therefore immune to manipulation and devoid of policy significance.)
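The empirical check described above, examining the age distribution of crime for differing periods, amounts to comparing the shape of the distribution after removing differences in overall level. The sketch below illustrates the computation only. The counts are invented toy numbers, not data from any study cited here, and the function names are ours.

```python
def age_shares(counts_by_age):
    """Convert offense counts by age into each age's share of the total."""
    total = sum(counts_by_age.values())
    return {age: n / total for age, n in counts_by_age.items()}

def max_share_gap(a, b):
    """Largest absolute difference between two age distributions."""
    ages = set(a) | set(b)
    return max(abs(a.get(age, 0.0) - b.get(age, 0.0)) for age in ages)

# Toy counts (hypothetical): period B has twice the volume of period A,
# which would register as a period effect on the level of crime.
period_a = {15: 20, 20: 40, 25: 30, 30: 10}
period_b = {15: 40, 20: 80, 25: 60, 30: 20}

gap = max_share_gap(age_shares(period_a), age_shares(period_b))
print(gap)  # the two shapes coincide in this toy case, so the gap is zero
```

A gap near zero across many periods or cohorts would be consistent with a stable age distribution despite level differences; a large gap would indicate that the shape itself changes. Each set of counts in such a comparison can come from an ordinary cross-sectional source.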
STANDARD CAUSAL VARIABLES
The recent report of the National Academy of Sciences Panel on Research on Criminal Careers (Blumstein et al., 1986) summarizes the crime research literature using its own "criminal career paradigm." Since this report strongly supports longitudinal research, its list of causal variables should be representative of those considered important by longitudinal researchers and can help structure a consideration of the causal literature. What are these variables?
Sex. The Blumstein et al. panel reports that "the most consistent pattern with respect to gender is the extent to which male criminal participation in serious crimes at any age greatly exceeds that of females, regardless of source of data, crime type, level of involvement, or measure of participation" (1986:
40; see also Hirschi and Gottfredson, 1983; compare Farrington, 1986a). If sex differences are sufficiently robust that they survive all these conditions, including age, longitudinal research is not required to discover them. If the sex difference is the same at every age, as argued by Blumstein et al. (see also Hirschi and Gottfredson, 1983), examination of this difference at any age will be sufficient to determine its magnitude, and sex differences in crime cannot be used to justify longitudinal research, however crime might be measured or defined.
Race. Blumstein et al. of course do not suggest that the race-crime correlation is problematic with respect to causal order. What, then, might be the value of a longitudinal design with respect to a race-crime connection? It is possible that race is more important at some ages than at others. In fact, the Wolfgang et al. cohort study (1972) suggested that blacks "start earlier" than whites. Reanalysis of the Wolfgang et al. data (Hirschi and Gottfredson, 1983) suggested that "age of onset" differences between blacks and whites were in fact merely rate differences in crime, differences that could be easily determined by cross-sectional research at any age. In other words, a cohort study was not required to discover them. This conceptual issue is not recognized by Blumstein et al., who rely on other research to suggest that the race-crime relation depends on age. This research is the Elliott et al. National Survey of Youth, a multiple cohort longitudinal study. According to Blumstein et al., the National Survey of Youth shows that race ratios differ by age. In the example cited, the black-to-white robbery ratio falls from 2.25:1 when the cohort members were 11-17 years of age to 1.5:1 when the cohort members were 15-21 years of age, four years later.
In this instance, the National Youth Survey might be better used to illustrate the weakness rather than the strength of the longitudinal design. Putting aside the disconcerting overlap in the age ranges compared, the cited differences do not appear to be statistically significant at conventionally accepted levels. The large standard errors of these age-race-crime specific estimates stem from the sample limitations imposed by the National Youth Survey's longitudinal design. In fact, this survey is unable simultaneously to disaggregate by sex in making race comparisons. Since race-sex interactions are routinely reported by cross-sectional research (Hindelang, Hirschi, and Weis, 1981; Hindelang, 1981), there would be reason to question the study's findings even were they to survive tests of statistical conclusion validity. (Incidentally, the Elliott et al. survey reports as much variability in crime-specific sex ratios over age as it reports in crime-specific race ratios over age (Blumstein et al., 1986, Table A-4, p. 242), suggesting that the panel did not apply consistent standards in its review of the literature.) In any event, there is as of now little evidence to support the view that longitudinal research would shed light on the race-crime relation over and above that shed by less costly designs.
Age. The Blumstein et al. panel argues that to resolve important age and crime issues "data are needed for a common sample on crime-specific age distributions of initiation and current participation according to both official records and self reports" (1986: 42). They then conclude, however, that demographic patterns "are of little use in explaining participation or in providing a basis of policy intended to prevent initiation of criminal careers" (1986: 42). Why the research community needs useless data is not clear. Certainly such need should not be used to justify longitudinal research. In any event, many believe that the age-crime relation is of crucial significance for criminological theory and crime control policy (Gottfredson and Hirschi, 1986; Hirschi and Gottfredson, 1983, 1986; Greenberg, 1977, 1985; Shavit and Rattner, 1986; Cohen and Land, 1987), and there can be no doubt that understanding of this relation is central to the issue of the alleged "need" for longitudinal research. As a result, the significance of the relation between age and crime goes far beyond the causal order question (which is again nonproblematic). In the authors' view, the evidence supports the notion that the age-crime relation is for all practical and theoretical purposes invariant. In the view of those promoting longitudinal research, the evidence supports the notion that the age-crime relation is sufficiently variable to justify expensive longitudinal research aimed at understanding how the causes of crime vary from one age to another. Making explicit what has been implicit in this controversy: proponents of longitudinal designs argue that potential age-causal variable interactions are of great theoretical and policy significance. We argue that such interactions are trivial compared to the theoretical and policy implications of a large and direct influence of age on crime.
Current attempts to document age-causal variable interactions have done little more than shift attention from a major factor in crime causation to minor and inexplicable fluctuations in particular data sets.
Our original conclusion that the age-crime relation is invariant was based on inspection of available data, much of it of course cross-sectional. Conclusions about age effects from cross-sectional data are traditionally