
Political Statistics for the United States: Observations on Some Major Data Sources*

Published online by Cambridge University Press:  01 August 2014

Edward R. Tufte*
Affiliation:
Princeton University

Abstract

Thirteen major data sources for the study of American politics are examined with regard to their conceptual orientation, error structure, and inferential utility. A great deal of ephemera and measurement without theory is discovered. Few of the documents contain any serious discussion of error structure, although some do report “standard errors” based on naive sampling models. In addition to suggestions for improving the compilation of political statistics, recommendations for a basic minimum library of data sources for American politics are made: The Almanac of American Politics and the Statistical Abstract of the United States, followed by the Guide to U.S. Elections.

Type
Review Essay
Copyright
Copyright © American Political Science Association 1977

Footnotes

*

The Almanac of American Politics, 1976. By Michael Barone, Grant Ujifusa, and Douglas Matthews. (New York: E. P. Dutton, 1975. Pp. xviii, 1054. $7.95.)

I am indebted to Vincent Barabba, former Director of the Bureau of the Census, for his very detailed comments on an early draft of the manuscript. We do, however, remain in disagreement on some issues. Criticism was also provided by John McCarthy (University of California, Berkeley), Margaret E. Martin (NAS-NRC Committee on National Statistics), and Donald Stokes and Dennis Thompson at Princeton. William Kruskal (University of Chicago) and Robert Russell, O.S.A. (St. Thomas Monastery) made valuable epigraphical contributions. Financial support came from the Woodrow Wilson School of Public Affairs at Princeton University and from a fellowship at the Center for Advanced Study in the Behavioral Sciences. I wish also to thank several anonymous reviewers and Dr. Ellen Y. Siegelman of the Review for their helpful comments. These individuals and institutions bear no responsibility for the faults or the views of this report.

References

1 Good discussions of these perspectives are found in Koopmans, Tjalling C., “Measurement without Theory,” reprinted with replies and counter-replies in The Scientific Papers of Tjalling C. Koopmans (New York: Springer-Verlag, 1970), pp. 112–163; Fienberg, Stephen E. and Goodman, Leo, “Social Indicators, 1973: Statistical Considerations,” in Social Indicators, 1973: A Review Symposium, ed. Dusen, Roxann A. Van (Washington, D.C.: Social Science Research Council, 1974), pp. 63–82; and Morgenstern, Oskar, On the Accuracy of Economic Observations, 2nd edition (Princeton: Princeton University Press, 1963).

2 This question was newly added in 1972, apparently as a contribution to the dialogue between the networks and the administration with respect to the “question whether voting turnout in the three West Coast States (and even more so in Alaska and Hawaii) might be affected by the fact that the polls were closed in the populous Northeast and election returns and media projections were being disseminated throughout the Nation, while the polling places in these Western States were still open” (p. 7). No substantive findings on this issue are reported although there is a cryptic bar chart showing the turnout at 6 p.m. (local time) for different regions of the country. The previous literature on the effects of election forecasts and on the time of day people vote is not cited; for example, Mendelsohn, Harold and Crespi, Irving, Polls, Television, and the New Politics (Scranton, Pennsylvania: Chandler, 1970) and the many items cited there; also Fuchs, Douglas A. and Becker, Jules, “A Brief Report on the Time of Day When People Vote,” Public Opinion Quarterly, 32 (Fall, 1968), 437–440.

3 What continuity there has been in the Gallup surveys has yielded many useful inquiries, including the rapidly multiplying studies of whatever it is that determines presidential approval ratings over the years. A recent example is Stimson, James A., “Public Support for American Presidents: A Cyclical Model,” Public Opinion Quarterly, 40 (Spring, 1976), 1–21. Although all of the current discussions of approval ratings begin with the Truman years, the Gallup data actually go back to 1935 and are available in Clark, Wesley C., Economic Aspects of a President's Popularity (Ph.D. dissertation, University of Pennsylvania, 1943). The Gallup Poll also reports some tabulations similar to Clark's as does the valuable compilation of Gallup electoral material in “Campaign '76,” The Gallup Opinion Index, 125 (November-December, 1975). Beniger, James R. also makes good use of the continuity in the Gallup polls in his “Winning the Presidential Nomination: National Polls and State Primary Elections, 1936–1972,” Public Opinion Quarterly, 40 (Spring, 1976), 22–38.

4 The Congressional District Data Book is supplemented by the Bureau of the Census Congressional District Data Profile (available in the form of a computer printout) for each district. The Profile provides many additional district variables.

5 The County and City Data Book yields 800 figures per penny; the worst buy, numerically speaking, is The Gallup Poll, which, at $95 for the set, yields 8.5 figures per penny.
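The arithmetic behind the second figure can be checked directly. As a rough sketch, assuming only the $95 set price and the rate of 8.5 figures per penny quoted in the note (the function name is illustrative and not from the essay), the implied number of figures in The Gallup Poll is about 80,750:

```python
def implied_figures(price_dollars: float, figures_per_penny: float) -> float:
    """Total number of figures implied by a price and a figures-per-penny rate."""
    pennies = price_dollars * 100  # convert dollars to pennies
    return pennies * figures_per_penny


# Values quoted in the note: $95 for the set, 8.5 figures per penny.
gallup_figures = implied_figures(95, 8.5)
print(round(gallup_figures))  # 80750
```

The same function cannot be applied to the County and City Data Book here, because the note gives its rate (800 figures per penny) but not its price.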

6 A helpful bibliography is Runyon, John H., Verdini, Jennefer, and Runyon, Sally S., Source Book of American Presidential Campaign and Election Statistics, 1948–1968 (New York: Frederick Ungar, 1971). Also Tufte, Edward R., Data Analysis for Politics and Policy (Englewood Cliffs, New Jersey: Prentice-Hall, 1974), pp. 164–170.

7 The book is now supplemented by David, Paul T., “Party Strength in the United States: Changes in 1972,” Journal of Politics, 36 (August, 1974), 785–796; and David, Paul T., “Party Strength in the United States: Some Corrections,” Journal of Politics, 37 (May, 1975), 641–642. David's volume has been more widely reviewed than most electoral compilations; see the lengthy discussions by Burnham, Walter Dean, American Political Science Review, 67 (March, 1973), 218–220; Cummings, Milton G. Jr., “Elections in the United States: Some Interesting Findings,” Virginia Quarterly Review, 48 (Autumn, 1972), 590–594; and McCarthy, John L., American Historical Review, 78 (June, 1973), 736–737. A fine guide to state electoral data is Burnham's manuscript, “Sources of American Election Data.”

8 These data can be pieced together from The Gallup Poll, and University of Michigan elections studies, and from a remarkable 858-page data dump published by the Subcommittee on Intergovernmental Relations, Senate Committee on Government Operations, Confidence and Concern: Citizens View American Government – A Survey of Public Attitudes (Washington, D.C.: U.S. Government Printing Office, 1973).

9 One additional item of political interest in the Statistical Abstract must be mentioned. Page 835 of the 1973 Abstract recorded a testament to the statistical wisdom of Richard Nixon for his creation of ten standard federal administrative regions for which, among other things, statistics can be compiled. He was the first president ever to be eulogized in the long history of the Abstract, published in 1973 by the Bureau of the Census, in the Social and Economic Statistics Administration, of the Department of Commerce. (Since the new regions were states, the new statistical compilations consisted entirely of adding up the numbers for the four or five states contained in each region.) The eulogy, reprinted in the 1974 Abstract (p. 847), was altered in the 1975 Abstract (p. 867) by purging Mr. Nixon's name entirely and putting the passage in the passive voice.

10 Mosteller, Frederick and Bush, Robert R., “Selected Quantitative Techniques,” in Handbook of Social Psychology: Volume I, Theory and Method, ed. Lindzey, Gardner (Cambridge, Massachusetts: Addison-Wesley, 1954), p. 331. The internal quotation is from Wallis, W. Allen, “Statistics of the Kinsey Report,” Journal of the American Statistical Association, 44 (December, 1949), 471.

11 Clausen, Aage R., “Response Validity: Vote Report,” Public Opinion Quarterly, 32 (Winter, 1968–1969), 588–606. An interesting variation on Clausen's theme is Adamany, David and DuBois, Philip, “The ‘Forgetful’ Voter and an Underreported Vote,” Public Opinion Quarterly, 39 (Summer, 1975), 227–231. Additional research and citations are in Schreiber, E. M., “Dirty Data in Britain and the USA: The Reliability of ‘Invariant’ Characteristics Reported in Surveys,” Public Opinion Quarterly, 39 (Winter, 1975–1976), 493–506.

12 The paragraph is found both in U.S. Bureau of the Census, Current Population Reports: Voting and Registration in the Election of November 1970, Series P-20, No. 228 (Washington, D.C.: U.S. Government Printing Office, 1971), p. 6; and Voting and Registration in the Election of November 1972, p. 8.

13 Glenn, Norval D., “Trend Studies with Survey Data: Opportunities and Pitfalls,” Social Indicators Newsletter, 5 (October, 1974), 4–5. A useful statement concerning the details of error structure in large-scale surveys is found in Converse, Philip E., “The Availability and Quality of Sample Survey Data in Archives within the United States,” in Merritt, Richard L. and Rokkan, Stein, eds., Comparing Nations: The Use of Quantitative Data in Cross-National Research (New Haven: Yale University Press, 1966), pp. 419–440. See also Kish, Leslie, Survey Sampling (New York: Wiley, 1965), chap. 13; Glenn, Norval D., “Trend Studies with Available Survey Data: Opportunities and Pitfalls,” in Social Science Research Council, Data for Trend Analysis (Williamstown, Massachusetts: Roper Public Opinion Research Center, 1975), pp. 6–35; and Asher, Herbert B., “Some Consequences of Measurement Error in Survey Data,” American Journal of Political Science, 18 (May, 1974), 469–485.

14 Tarrance, V. Lance Jr., Texas Precinct Votes '68 (Dallas: Southern Methodist University Press, 1970), p. 181. See also his Texas Precinct Votes '70 (Austin: Foundation for the Study of Democratic Processes, 1972), pp. xi–xii, 185–188.

15 Useful considerations are found in Hoaglin, David C. and Andrews, David F., “The Reporting of Computation-Based Results in Statistics,” The American Statistician, 29 (August, 1975), 122–126. A subtle computational problem has arisen with respect to random number generators; see Coldwell, Robert Lynn, “Correlational Defects in the Standard IBM 360 Random Number Generator and the Classical Ideal Gas Correlation Function,” Journal of Computational Physics, 14 (February, 1974), 223–226.

16 Would political research fare better in replication than the studies described in this report published in 1962? “Last spring a graduate student at Iowa State University required data of a particular kind in order to carry out a study for his master's thesis. In order to obtain these data he wrote to 37 authors whose journal articles appeared … between 1959 and 1961. Of these authors, 32 replied. Twenty-one of these reported the data misplaced, lost, or inadvertently destroyed. Two of the remaining 11 offered their data on the conditions that they be notified of our intended use of their data, and stated that they have control of anything that we would publish involving these data. We met the former condition but refused the latter for those two authors since we felt the raw data from published research should be made public upon request when possible and economically feasible. Thus raw data from 9 authors were obtained. From these 9 authors, 11 analyses were obtained, four of these were not analyzed by us since they were made available several months after our request. Of the remaining 7 studies, 3 involved gross errors. One involved an analysis of variance on transformed data where the transformation was clearly inappropriate. Another analysis contained a gross computational error so that several F ratios near one were reported to be highly significant. The third analysis incorrectly reported insignificant results due to the use of an inappropriate error term.…” Wolins, Leroy, “Responsibility for Raw Data,” American Psychologist, 17 (1962), p. 657.

17 See Farmer, James, Springer, Colby, and Strumwasser, Michael J., “Cheating the Vote-Count Systems,” Datamation, May, 1970, pp. 76–80; and Patrick, Robert L. and Dahl, Aubrey, “Voting Systems,” Datamation, May 1970, pp. 81–82. The currently definitive document is Saltman, Roy G., Effective Use of Computing Technology in Vote-Tallying (Washington, D.C.: National Bureau of Standards, March, 1975), prepared for the Clearinghouse on Election Administration, Office of Federal Elections.

18 Persistent problems with data errors and failures to replicate have cropped up even in the technically sophisticated examinations of the correlations between electoral and economic time series; see the papers by Goodman and Kramer and by Arcelus and Meltzer in the December, 1975 American Political Science Review. For a study of the correlation between time series chosen at random from Historical Statistics of the United States, see Ames, Edward and Reiter, Stanley, “Distributions of Correlation Coefficients in Economic Time Series,” Journal of the American Statistical Association, 56 (September, 1961), 637–656.

19 Cochran, W. G., “Errors of Measurement in Statistics,” Technometrics, 10 (November 1968), 665.

20 See Clausen, “Response Validity: Vote Report,” and Mosteller, Frederick et al., The Pre-Election Polls of 1948 (New York: Social Science Research Council, 1949), chap. 5. Other worthwhile reports on error are Morgenstern, On the Accuracy of Economic Observations; Kruskal, William H. and Telser, Lester G., “Food Prices and the Bureau of Labor Statistics,” The Journal of Business, 33 (July, 1960), 258–279; a thorough study of the population undercount in the 1970 census and its consequences is Siegel, Jacob S., “Coverage of the Population in the 1970 Census and Some Implications for Public Programs,” U.S. Bureau of the Census, Current Population Reports, Series P-23 (Washington, D.C.: U.S. Government Printing Office, 1975); for data problems in economic policy making, Denison, Edward F., Accounting for United States Economic Growth 1929–1969 (Washington, D.C.: The Brookings Institution, 1974), pp. 84–101. Also of interest is U.S. Bureau of the Census, “Standards for Discussion and Presentation of Errors in Data,” Technical Paper No. 32 (Washington, D.C.: U.S. Government Printing Office, 1974). Good general discussions of error are Eisenhart, Churchill, “Expression of the Uncertainties of Final Results,” Science, 160 (June 14, 1968), 1201–1204; and Youden, W. J., “Enduring Values,” Technometrics, 14 (February, 1972), 1–11.

21 Such a project is now moving forward: Warren E. Miller and Arthur H. Miller are preparing American Political Trends: A Sourcebook of Political Indicators, 1952–1974.

22 Flanigan, William H. and Zingale, Nancy H., Political Behavior of the American Electorate, 3rd edition (Boston: Allyn and Bacon, 1975).

23 Kruskal, William, “Science Indicators: Themes and Variations,” in Toward a Metric of Science: Reflections on Science Indicators, ed. Elkana, Yehuda, Lederberg, Joshua, Merton, Robert K., Thackray, Arnold, and Zuckerman, Harriet (New York: Wiley, 1977).