Econometrics, the statistical branch of economics, can be regarded as one of the main innovations that turned twentieth-century economics into an engineering, or tool-based, science, in which each application of economic theory requires special shaping to circumstances, whether for scientific purposes or in the policy domain (see Morgan 2001). The development and use of statistical tools for economics emerged in the early part of the twentieth century and, by mid-century, problems had been defined, solutions approached, and usable concepts developed, so that one could legitimately refer to a distinct body of knowledge embracing both theory and practice. After 1950 econometrics became a mature field and the dominant method of applied economics.
The particularities of the history of econometrics have often been linked to the nature of its tasks and aims (see De Marchi and Gilbert 1989, Heckman 2000, Morgan 1990, Qin 1993). Economists have understood their fundamental methodological problem to be: how to establish knowledge about economic phenomena and economic laws given that most economic activity cannot be subjected to controlled experiments. Econometricians have assumed that this means that descriptions, regularities, and relations of the economy must be extracted from ‘passive’ observations: statistical data that result from the natural course of economic activities, with the use of statistical theories and tools. Econometrics therefore suffers from all the usual problems of an inductive science. When measurement or testing of theories is at stake, severe correspondence problems arise because of the huge gaps between the assumptions of those economic theories and the conditions of data collection.
The history of the field also has its external dynamics (Desrosieres 1998, Epstein 1987). To begin with, technological changes play an important role, evidenced in the development of the field of mathematical statistics and of machines of computation. The provision of data, the availability of trained econometricians, and fashions in economic theory are all relevant to the narrative, as are the stimuli from twentieth-century economic events and associated demands for econometric expertise. Econometrics in the later twentieth century became a mainstay of economic policy analysis and a tool for economic intervention, yet at the same time it grew into one of the most technically demanding and methodologically sophisticated areas of economics.
1. The Development Era
In the latter nineteenth century, a few economists began to use graphs and tables of statistical data to provide empirical support for their arguments, or to describe behavioral characteristics of economic phenomena (Klein 1997). They were quick to take up certain statistical techniques developed for biometrics, namely correlation and regression. But these ‘measures of association’ were not entirely adequate for economists’ questions (Aldrich 1995, Morgan 1997). Where biometricians had, for example, a sample of heights of fathers and their sons, whose relation could be assessed with correlation or regression coefficients under reasonable assumptions about the individuals in the sample, economists typically used time-series observations that provided only one datum for each date. Not only were such ‘samples’ considered peculiar from the viewpoint of the correlation theory of that time, which assumed that the observations within the sample had been independently drawn, but economists also believed that such time-series observations reflected a number of concurrent behavioral characteristics or causal relationships varying with the time element and unit chosen.
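The difficulty can be made concrete with a small modern simulation (an illustration, not a historical reconstruction; all numbers are assumed). Two time series that are entirely unrelated, but share an upward trend over time, display a strong correlation in levels that largely vanishes once the trend is removed by differencing:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two unrelated series that happen to share an upward trend (assumed values).
t = np.arange(100)
x = 0.5 * t + rng.normal(0, 5, 100)
y = 0.3 * t + rng.normal(0, 5, 100)

# Correlation in levels is dominated by the common trend ...
r_levels = np.corrcoef(x, y)[0, 1]

# ... but the period-to-period changes are essentially uncorrelated.
r_changes = np.corrcoef(np.diff(x), np.diff(y))[0, 1]
```

The high level correlation here reflects no behavioral relation at all, which is exactly why early econometricians found biometric correlation theory ill-suited to time-series observations.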
This problem of extracting causal relationships between variables from data which contained different time patterns dominated early econometric work, but the questions attacked were driven by economic and policy concerns. Measurements, for example, of market demand and supply relations for agricultural goods were undertaken at the American Bureau of Agricultural Economics during the 1920s, when farmers faced falling prices in international markets, and followed earlier European breakthroughs on these questions. Similarly, the problems posed by ‘business cycles’ were widely recognized by commerce, politicians, and economists, for the crash of 1920-1921 was only outdone by the even more spectacular and longer-lived depression of 1929-1933. Business cycle research institutes, which had begun to emerge even before the end of World War I, sprang up everywhere in the 1920s from Massachusetts to Moscow, but their truly international quest for description and explanation of the business cycle showed that the task of establishing measures of, let alone explanations for, that phenomenon, its characteristics, and its time patterns was incredibly difficult.
A concerted movement to create a more ‘scientific’ economics, based on both statistical and mathematical tools, resulted in the foundation of The Econometric Society (in 1930) and its house journal Econometrica (in 1933). The most critical innovation of that decade, often overlooked, was that of the ‘econometric model’, formulated as an intermediary device to bridge the gap between economic theory and economic data. Such models represented elements of the economic theory in mathematical form, but were also attuned to the peculiarities of the time patterns of data evidence. With the use of such models, various kinds of knotty correspondence problems at the intersection between economic theory and data began to be understood and solutions proposed. For example, the problem of unraveling several statistical relations holding within a variable set (the multi-collinearity of statistically correlated variables) was understood as different from the problem of separating out different economic relations which involved some of the same overlapping variable set (the identifiability of several behavioral economic relations). At the same time, problems of measurement errors (‘errors in variables’), or of omitted variables from the model (‘errors in equations’), were attacked with the development of statistical tools and methods. But these 1930s answers were proposed as one-off solutions to specific applied problems, not general solutions for generally recognized problems.
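The identifiability problem can likewise be illustrated with a small modern simulation (all parameter values assumed for illustration). When demand and supply both relate price and quantity, a regression of quantity on price recovers neither behavioral slope; only an observed shifter of one relation (here, a supply shifter used as an instrument) identifies the other:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Assumed structural model:
#   demand: q = -1.0 * p + u_d
#   supply: q =  1.0 * p + 1.0 * s + u_s   (s: observed supply shifter)
u_d = rng.normal(0, 1, n)
u_s = rng.normal(0, 1, n)
s = rng.normal(0, 1, n)

# Equilibrium price and quantity solve the two equations jointly.
p = (u_d - u_s - s) / 2.0
q = -p + u_d

# OLS of q on p mixes both relations and recovers neither slope.
b_ols = np.cov(p, q)[0, 1] / np.var(p)

# Using the supply shifter s as an instrument identifies the demand slope (-1).
b_iv = np.cov(s, q)[0, 1] / np.cov(s, p)[0, 1]
```

Without the shifter s, the demand relation would not be identifiable at all, however large the sample, which is the distinction from mere statistical multi-collinearity.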
However, econometrics was not without its detractors, and the move to codify and formalize these somewhat disparate concepts, problems, and emergent solutions seems to have been triggered by the Keynes-Tinbergen debate in the late 1930s. Jan Tinbergen, a young Dutch economist, was the most creative of the econometricians of the 1930s in the development of modeling, in understanding conceptual problems, and in suggesting solutions. He was also responsible for building and using the first macroeconometric model to analyze policy options. In this work, he developed ideas of Ragnar Frisch, a Norwegian econometrician (work that later made the two joint winners of the first Nobel Prize in economics, in 1969), who in turn had drawn on statistical work by the English statistician Yule and the Russian econometrician Slutsky.
Tinbergen’s macroeconometric modeling work was heavily attacked by Keynes, and this attack prompted the development in the 1940s of rigorous foundations for econometrics. Despite their difference of opinion, both Tinbergen and Keynes can be seen, in retrospect, as reshaping economic theory and policy in the aftermath of the Great Depression so that, in the 1950s, Lawrence Klein (Nobel Prize 1980) could develop the post-war generation of macroeconometric models on Keynesian theories with more secure statistical foundations.
The key development in this foundation work was the ‘probability approach’, fashioned in the early 1940s by Trygve Haavelmo (Nobel Laureate in 1989 for this work), a Norwegian econometrician, who proposed to use probability theory to establish a bridge between economic theory and passive economic observations. His bridge was designed to carry the burden in both directions: to provide guidance on how to specify econometric models as statistical hypotheses so that not only could economic theory be logically related to economic data but that there would also be a path for inference back from the data to the theory.
2. Establishing the Mature Discipline
Haavelmo’s probability blueprint for econometrics had indicated directions for both theoretical and applied work and has been interpreted as creating a ‘revolution’ in econometrics. Parts of Haavelmo’s program were immediately elaborated at the Cowles Commission (later Foundation), an American hothouse of econometric research in the mid 1940s (see Hildreth 1986). They assumed that the driving relations of the economy were best represented as a system of simultaneous equations that operated at a level hidden beneath the time-series observations. Their research was primarily concerned with developing appropriate identification conditions and estimation (measurement) techniques for such econometric models. In the years after 1950 an established orthodoxy developed based on Cowles’ work and enshrined in textbooks in the 1960s. But, despite an active program of research at the theory level, providing technical extensions to Cowles’ groundwork, cracks in that orthodoxy began to appear. There were alternatives to the Cowles program.
Applied econometric research revealed serious practical and methodological limitations in the Cowles approach, particularly in regard to model specification, model choice, and associated testing procedures. Their assumption that econometric work is anchored on a priori theories meant that it was rarely possible to specify, and identify uniquely, structural models for those theories. This made theories virtually unfalsifiable and the fitting of models to data an arbitrary affair, as Henri Theil and T. C. Liu pointed out in different contexts in the 1950s. This was made transparent in the specification and sensitivity analysis using Bayesian methods developed by Ed Leamer in the 1970s (Qin 1996). The Cowles assumption of simultaneity, and the associated lack of attention to the characteristics of time-series data, were both open to challenge. The Swedish econometrician Herman Wold, who had trained in the Russian probability school and in time-series analysis in the 1930s, attacked the assumption of simultaneity and developed alternative dynamic interdependent equation models in the 1950s (Morgan 1991). These developments found a generalized form in the 1970s in the American econometrician Christopher Sims’ time-series versions of Wold’s models, called Vector Auto-Regressions (VARs) (Qin and Gilbert 2001). Whereas Wold had shown how the Cowles structural models were special cases of his causal chain models, VARs turned out to be the least arbitrary representation of the information in time-series data.
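A VAR treats every variable as depending on lags of all the variables, with no a priori exclusions, so estimation reduces to ordinary least squares, equation by equation. A minimal sketch, using a bivariate VAR(1) with assumed coefficient values:

```python
import numpy as np

rng = np.random.default_rng(2)

# True bivariate VAR(1): y_t = A @ y_{t-1} + e_t  (A assumed for illustration).
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
T = 5_000
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A @ y[t - 1] + rng.normal(0, 1, 2)

# Estimation is just OLS of current values on lagged values, jointly for
# both equations; no structural restrictions are imposed.
X, Y = y[:-1], y[1:]
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T
```

With a long enough series, `A_hat` approximates the true coefficient matrix `A`; the point of the exercise is that nothing beyond the choice of variables and lag length was assumed in advance.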
The time-series problems of economic data, and possible solutions for the Cowles measurement techniques, were analyzed at the Department of Applied Economics at Cambridge in the 1940s under Richard Stone (Nobel Laureate in 1984). However, it was the 1970s experience of stagflation in many economies (high unemployment combined with high inflation) that spurred a more radical refashioning of economic theory and econometric approaches. The dynamic elements of economic modeling, taken seriously in the 1930s, once again became the focal issue of econometric measurement. Descriptive statistical time-series models (i.e., models not relying on economic theory) underwent a certain renaissance in the wake of the disappointing prediction records of many large-scale macro-econometric models during the 1970s upheavals. Meanwhile, economic theories were being reformulated to provide a more elaborate account of uncertainty at the microeconomic level, in the form of assumptions about ‘rational expectations’ and their relation to the macro-level. This formed an important stimulus for Sims’ work, but while VARs can be understood as the most general form of dynamic models, they were not necessarily the most interpretable in terms of economic phenomena. An alternative dynamics, analogous in form to servo-mechanistic control engineering and associated with David Hendry and the London School of Economics, was the ‘error correction model’, which could be more readily understood in terms of individuals’ reactions to changing economic events. These developments effectively reversed the Cowles approach, for they start from a summary of the data and move towards the simplest possible interpretable model that is coherent with the data.
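In an error correction model, the change in a variable responds partly to current changes in related variables and partly to last period's deviation from a long-run equilibrium relation. The following sketch (parameter values assumed, in the spirit of a two-step procedure) simulates a pair of variables tied by a long-run relation and recovers both the relation and the speed of adjustment back to it:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 5_000

# x: a random walk; y is tied to x by an assumed long-run relation y = 2x,
# with a stationary AR(1) deviation v (persistence 0.5).
x = np.cumsum(rng.normal(0, 1, T))
v = np.zeros(T)
for t in range(1, T):
    v[t] = 0.5 * v[t - 1] + rng.normal(0, 1)
y = 2.0 * x + v

# Step 1: estimate the long-run relation by OLS in levels.
beta = np.cov(x, y)[0, 1] / np.var(x)
z = y - beta * x                      # estimated equilibrium error

# Step 2: regress the change in y on the lagged error and the change in x.
dy, dx = np.diff(y), np.diff(x)
X = np.column_stack([z[:-1], dx])
alpha, gamma = np.linalg.lstsq(X, dy, rcond=None)[0]
```

Here `alpha` is the error-correction coefficient: it is negative, meaning that a deviation above the long-run relation pulls y back down in the next period, which is the sense in which such models are readable as individuals' reactions to disequilibrium.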
A distinct microeconometrics emerged concurrently with these changes in macro-econometrics. In the early 1950s, Stone had struggled with the immense task of providing measurement structures appropriate for the separate components of demand in a total consumer expenditure system: a system simultaneous at the micro-level. By the 1970s, various statistical models appropriate for explaining behavior at the individual level had been adapted from 1930s biometrics into econometrics by Dan McFadden and James Heckman (joint Nobel Prize in 2000). Their treatment of survey data enabled measurement models to be developed for the discrete choices made by individuals. Such discrete choices dominate economic decision making in consumption decisions, work decisions, family formation decisions, etc. These measurement models were quickly taken up, for they matched the individual utility-maximizing framework of neoclassical economic theorizing, the main orthodoxy of post 1950s western economics. The gradual appearance of large-scale ‘panel data sets’ (a set of individuals measured repeatedly over time) encouraged both theoretical and applied work at this microeconomic level and created a wider territory for the expansion of econometrics. Time-series work in micro-econometrics was not completely abandoned, however, for in the 1990s a more data-instigated style of analysis, reminiscent of the 1930s business cycle work, was developed for the analysis of financial data.
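The core of such discrete-choice models is a latent-utility story: an individual takes an option when its unobserved utility advantage is positive, and logistic taste shocks yield the binary logit. A minimal maximum-likelihood sketch (coefficient values assumed for illustration), fitted by Newton-Raphson:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000

# Latent utility difference: b0 + b1*x plus a logistic taste shock;
# the option is chosen whenever the difference is positive.
x = rng.normal(0, 1, n)
b_true = np.array([0.5, -1.0])          # assumed coefficients
u = b_true[0] + b_true[1] * x + rng.logistic(0, 1, n)
choice = (u > 0).astype(float)

# Maximum likelihood for the logit via Newton-Raphson iterations.
X = np.column_stack([np.ones(n), x])
b = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ b))            # choice probabilities
    grad = X.T @ (choice - p)                   # score of the log-likelihood
    hess = -(X * (p * (1 - p))[:, None]).T @ X  # Hessian (negative definite)
    b = b - np.linalg.solve(hess, grad)
```

With survey-sized samples the estimates recover the assumed utility coefficients, which is why these models fitted so naturally into the utility-maximizing framework described above.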
Post 1950s data-based research has not been bound by any straitjacket methodology but has spawned several hybrid approaches. Guy Orcutt, believing that macro time-series data were inadequate for inference about the economy, developed the method of micro-simulation, an intermediary style of econometric research involving modeling, analysis, and simulation. This approach, difficult to achieve when he began his work in the later 1950s, has become practicable with modern computing power, and is now almost a necessity in certain areas of policy analysis. In the 1980s, attempts to model business cycles combined simulation, computable general equilibrium analysis, and the VAR model to produce another hybrid, one that replaced ‘estimation’ (using statistical data to measure relations) with ‘calibration’: the matching of certain statistical properties predicted by the models with those found in time-series data.
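Calibration in this sense can be sketched in miniature (the model and target moment are assumed for illustration): rather than estimating a parameter from data directly, one searches for the parameter value whose simulated model moment matches the moment observed in the data.

```python
import numpy as np

def simulated_autocorr(rho, T=4_000, seed=5):
    """First-order autocorrelation of an AR(1) simulated with persistence rho."""
    rng = np.random.default_rng(seed)   # common random numbers across rho values
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = rho * y[t - 1] + rng.normal(0, 1)
    return np.corrcoef(y[:-1], y[1:])[0, 1]

# The moment 'observed' in the data (an assumed value, for illustration).
target = 0.8

# Calibration: pick the parameter whose simulated moment matches the target.
grid = np.arange(0.0, 0.96, 0.05)
rho_star = min(grid, key=lambda r: abs(simulated_autocorr(r) - target))
```

Reusing the same random draws for every candidate value of `rho` makes the simulated moment a smooth function of the parameter, a standard device in simulation-based work of this kind.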
3. The Place of Econometrics in Economics
During the early years of The Econometric Society, ‘econometrics’ had been defined as the union of economics, mathematics, and statistics. And, in developing model intermediaries based on mathematical economic theory in that period, econometricians had followed a distinctive path compared with those taken in psychometrics and sociometrics in the same period. After 1950, as economic theorizing became generally expressed in mathematical terms, the term econometrics stabilized on its current meaning: the use of statistical reasoning and methods as means to establish data-based descriptions of economic phenomena and empirically based counterparts for, and tests of, economic theories.
The spread of econometrics as an established part of economics in the post 1950s period may be understood as due not only to the strengthening of its foundations in statistical theory but also to the continuing expansion of data (‘official’ statistics), its establishment in the core undergraduate teaching program, and the development of cheap desk-top computing. Together these meant that econometric work became a standard tool of policy work in governments and international agencies, as well as becoming endemic, in various different forms, in the sub-fields of scientific economic research. Econometric theory has developed into a formidable body of specialist statistical theory, and the increasing gap between the difficult technical and theoretical questions and the apparent ease of applications might suggest a field in which applications have come loose from theoretical work. This has been mitigated by another tool, the development of specific software packages for econometrics, in which theoretical and technical developments can be quickly translated into modeling, measurement, and testing regimes at the level of applications.
This widening use of econometrics did not denote a settled field. The community in the latter part of the twentieth century found itself immersed in a number of vibrant and important arguments about how to do econometrics, based upon applied econometric experience as much as on considerations of statistical theory and philosophy of science. Should models be theoretically based or descriptively accurate? What are the relevant criteria, economic and statistical, which can be used to test the adequacy of these different models? Should the modeling process start with simple models that grow more complex, or start general and aim to become simple? Should the probability framework be classical or Bayesian? What implications does this difference have for the types of measurements made and the possibilities of inference? How far can applied econometrics be formalized and codified apart from tacit knowledge? The terms of the continued debates around such themes point to a mature field, one intimately connected to both statistical and economic theory, but in which the foundational arguments raised in early work continue to reverberate to the present day.