Political Statistics for the United States: Observations on Some Major Data Sources*†

Edward R. Tufte

doi:10.2307/1956972


Published online by Cambridge University Press: 01 August 2014


Affiliation: Princeton University


Abstract

Thirteen major data sources for the study of American politics are examined with regard to their conceptual orientation, error structure, and inferential utility. A great deal of ephemera and measurement without theory is discovered. Few of the documents contain any serious discussion of error structure, although some do report “standard errors” based on naive sampling models. In addition to suggestions for improving the compilation of political statistics, recommendations for a basic minimum library of data sources for American politics are made: The Almanac of American Politics and the Statistical Abstract of the United States, followed by the Guide to U.S. Elections.

Information

Type: Review Essay
Information: American Political Science Review, Volume 71, Issue 1, March 1977, pp. 305–314

DOI: https://doi.org/10.2307/1956972
Copyright: © American Political Science Association 1977


Footnotes

* The Almanac of American Politics, 1976. By Michael Barone, Grant Ujifusa, and Douglas Matthews. (New York: E. P. Dutton, 1975. Pp. xviii, 1054. $7.95.)

† I am indebted to Vincent Barabba, former Director of the Bureau of the Census, for his very detailed comments on an early draft of the manuscript. We do, however, remain in disagreement on some issues. Criticism was also provided by John McCarthy (University of California, Berkeley), Margaret E. Martin (NAS-NRC Committee on National Statistics), and Donald Stokes and Dennis Thompson at Princeton. William Kruskal (University of Chicago) and Robert Russell, O.S.A. (St. Thomas Monastery) made valuable epigraphical contributions. Financial support came from the Woodrow Wilson School of Public Affairs at Princeton University and from a fellowship at the Center for Advanced Study in the Behavioral Sciences. I wish also to thank several anonymous reviewers and Dr. Ellen Y. Siegelman of the Review for their helpful comments. These individuals and institutions bear no responsibility for the faults or the views of this report.

References

¹ Good discussions of these perspectives are found in Koopmans, Tjalling C., “Measurement without Theory,” reprinted with replies and counter-replies in The Scientific Papers of Tjalling C. Koopmans (New York: Springer-Verlag, 1970), pp. 112–163; Fienberg, Stephen E. and Goodman, Leo, “Social Indicators, 1973: Statistical Considerations,” in Social Indicators, 1973: A Review Symposium, ed. Roxann A. Van Dusen (Washington, D.C.: Social Science Research Council, 1974), pp. 63–82; and Morgenstern, Oskar, On the Accuracy of Economic Observations, 2nd edition (Princeton: Princeton University Press, 1963).

² This question was newly added in 1972, apparently as a contribution to the dialogue between the networks and the administration with respect to the “question whether voting turnout in the three West Coast States (and even more so in Alaska and Hawaii) might be affected by the fact that the polls were closed in the populous Northeast and election returns and media projections were being disseminated throughout the Nation, while the polling places in these Western States were still open” (p. 7). No substantive findings on this issue are reported, although there is a cryptic bar chart showing the turnout at 6 p.m. (local time) for different regions of the country. The previous literature on the effects of election forecasts and on the time of day people vote is not cited; for example, Mendelsohn, Harold and Crespi, Irving, Polls, Television, and the New Politics (Scranton, Pennsylvania: Chandler, 1970) and the many items cited there; also Fuchs, Douglas A. and Becker, Jules, “A Brief Report on the Time of Day When People Vote,” Public Opinion Quarterly, 32 (Fall, 1968), 437–440.

³ What continuity there has been in the Gallup surveys has yielded many useful inquiries, including the rapidly multiplying studies of whatever it is that determines presidential approval ratings over the years. A recent example is Stimson, James A., “Public Support for American Presidents: A Cyclical Model,” Public Opinion Quarterly, 40 (Spring, 1976), 1–21. Although all of the current discussions of approval ratings begin with the Truman years, the Gallup data actually go back to 1935 and are available in Clark, Wesley C., Economic Aspects of a President's Popularity (Ph.D. dissertation, University of Pennsylvania, 1943). The Gallup Poll also reports some tabulations similar to Clark's, as does the valuable compilation of Gallup electoral material in “Campaign '76,” The Gallup Opinion Index, 125 (November-December, 1975). James R. Beniger also makes good use of the continuity in the Gallup polls in his “Winning the Presidential Nomination: National Polls and State Primary Elections, 1936–1972,” Public Opinion Quarterly, 40 (Spring, 1976), 22–38.

⁴ The Congressional District Data Book is supplemented by the Bureau of the Census Congressional District Data Profile (available in the form of a computer printout) for each district. The Profile provides many additional district variables.

⁵ The County and City Data Book yields 800 figures per penny; the worst buy, numerically speaking, is The Gallup Poll, which, at $95 for the set, yields 8.5 figures per penny.

⁶ A helpful bibliography is Runyon, John H., Verdini, Jennefer, and Runyon, Sally S., Source Book of American Presidential Campaign and Election Statistics, 1948–1968 (New York: Frederick Ungar, 1971). See also Tufte, Edward R., Data Analysis for Politics and Policy (Englewood Cliffs, New Jersey: Prentice-Hall, 1974), pp. 164–170.

⁷ The book is now supplemented by David, Paul T., “Party Strength in the United States: Changes in 1972,” Journal of Politics, 36 (August, 1974), 785–796; and David, Paul T., “Party Strength in the United States: Some Corrections,” Journal of Politics, 37 (May, 1975), 641–642. David's volume has been more widely reviewed than most electoral compilations; see the lengthy discussions by Burnham, Walter Dean, American Political Science Review, 67 (March, 1973), 218–220; Cummings, Milton G. Jr., “Elections in the United States: Some Interesting Findings,” Virginia Quarterly Review, 48 (Autumn, 1972), 590–594; and McCarthy, John L., American Historical Review, 78 (June, 1973), 736–737. A fine guide to state electoral data is Burnham's manuscript, “Sources of American Election Data.”

⁸ These data can be pieced together from The Gallup Poll, from the University of Michigan election studies, and from a remarkable 858-page data dump published by the Subcommittee on Intergovernmental Relations, Senate Committee on Government Operations, Confidence and Concern: Citizens View American Government – A Survey of Public Attitudes (Washington, D.C.: U.S. Government Printing Office, 1973).

⁹ One additional item of political interest in the Statistical Abstract must be mentioned. Page 835 of the 1973 Abstract recorded a testament to the statistical wisdom of Richard Nixon for his creation of ten standard federal administrative regions for which, among other things, statistics can be compiled. He was the first president ever to be eulogized in the long history of the Abstract, published in 1973 by the Bureau of the Census, in the Social and Economic Statistics Administration, of the Department of Commerce. (Since the new regions were groups of states, the new statistical compilations consisted entirely of adding up the numbers for the four or five states contained in each region.) The eulogy, reprinted in the 1974 Abstract (p. 847), was altered in the 1975 Abstract (p. 867) by purging Mr. Nixon's name entirely and putting the passage in the passive voice.

¹⁰ Mosteller, Frederick and Bush, Robert R., “Selected Quantitative Techniques,” in Handbook of Social Psychology: Volume I, Theory and Method, ed. Gardner Lindzey (Cambridge, Massachusetts: Addison-Wesley, 1954), p. 331. The internal quotation is from Wallis, W. Allen, “Statistics of the Kinsey Report,” Journal of the American Statistical Association, 44 (December, 1949), 471.

¹¹ Clausen, Aage R., “Response Validity: Vote Report,” Public Opinion Quarterly, 32 (Winter, 1968–1969), 588–606. An interesting variation on Clausen's theme is Adamany, David and DuBois, Philip, “The ‘Forgetful’ Voter and an Underreported Vote,” Public Opinion Quarterly, 39 (Summer, 1975), 227–231. Additional research and citations are in Schreiber, E. M., “Dirty Data in Britain and the USA: The Reliability of ‘Invariant’ Characteristics Reported in Surveys,” Public Opinion Quarterly, 39 (Winter, 1975–1976), 493–506.

¹² The paragraph is found both in U.S. Bureau of the Census, Current Population Reports: Voting and Registration in the Election of November 1970, Series P-20, No. 228 (Washington, D.C.: U.S. Government Printing Office, 1971), p. 6; and Voting and Registration in the Election of November 1972, p. 8.

¹³ Glenn, Norval D., “Trend Studies with Survey Data: Opportunities and Pitfalls,” Social Indicators Newsletter, 5 (October, 1974), 4–5. A useful statement concerning the details of error structure in large-scale surveys is found in Converse, Philip E., “The Availability and Quality of Sample Survey Data in Archives within the United States,” in Merritt, Richard L. and Rokkan, Stein, eds., Comparing Nations: The Use of Quantitative Data in Cross-National Research (New Haven: Yale University Press, 1966), pp. 419–440. See also Kish, Leslie, Survey Sampling (New York: Wiley, 1965), chap. 13; Glenn, Norval D., “Trend Studies with Available Survey Data: Opportunities and Pitfalls,” in Social Science Research Council, Data for Trend Analysis (Williamstown, Massachusetts: Roper Public Opinion Research Center, 1975), pp. 6–35; and Asher, Herbert B., “Some Consequences of Measurement Error in Survey Data,” American Journal of Political Science, 18 (May, 1974), 469–485.

¹⁴ Tarrance, V. Lance Jr., Texas Precinct Votes '68 (Dallas: Southern Methodist University Press, 1970), p. 181. See also his Texas Precinct Votes '70 (Austin: Foundation for the Study of Democratic Processes, 1972), pp. xi–xii, 185–188.

¹⁵ Useful considerations are found in Hoaglin, David C. and Andrews, David F., “The Reporting of Computation-Based Results in Statistics,” The American Statistician, 29 (August, 1975), 122–126. A subtle computational problem has arisen with respect to random number generators; see Coldwell, Robert Lynn, “Correlational Defects in the Standard IBM 360 Random Number Generator and the Classical Ideal Gas Correlation Function,” Journal of Computational Physics, 14 (February, 1974), 223–226.

¹⁶ Would political research fare better in replication than the studies described in this report published in 1962? “Last spring a graduate student at Iowa State University required data of a particular kind in order to carry out a study for his master's thesis. In order to obtain these data he wrote to 37 authors whose journal articles appeared … between 1959 and 1961. Of these authors, 32 replied. Twenty-one of these reported the data misplaced, lost, or inadvertently destroyed. Two of the remaining 11 offered their data on the conditions that they be notified of our intended use of their data, and stated that they have control of anything that we would publish involving these data. We met the former condition but refused the latter for those two authors since we felt the raw data from published research should be made public upon request when possible and economically feasible. Thus raw data from 9 authors were obtained. From these 9 authors, 11 analyses were obtained, four of these were not analyzed by us since they were made available several months after our request. Of the remaining 7 studies, 3 involved gross errors. One involved an analysis of variance on transformed data where the transformation was clearly inappropriate. Another analysis contained a gross computational error so that several F ratios near one were reported to be highly significant. The third analysis incorrectly reported insignificant results due to the use of an inappropriate error term.…” Wolins, Leroy, “Responsibility for Raw Data,” American Psychologist, 17 (1962), p. 657.

¹⁷ See Farmer, James, Springer, Colby, and Strumwasser, Michael J., “Cheating the Vote-Count Systems,” Datamation, May, 1970, pp. 76–80; and Patrick, Robert L. and Dahl, Aubrey, “Voting Systems,” Datamation, May, 1970, pp. 81–82. The currently definitive document is Saltman, Roy G., Effective Use of Computing Technology in Vote-Tallying (Washington, D.C.: National Bureau of Standards, March, 1975), prepared for the Clearinghouse on Election Administration, Office of Federal Elections.

¹⁸ Persistent problems with data errors and failures to replicate have cropped up even in the technically sophisticated examinations of the correlations between electoral and economic time series; see the papers by Goodman and Kramer and by Arcelus and Meltzer in the December, 1975 American Political Science Review. For a study of the correlation between time series chosen at random from Historical Statistics of the United States, see Ames, Edward and Reiter, Stanley, “Distributions of Correlation Coefficients in Economic Time Series,” Journal of the American Statistical Association, 56 (September, 1961), 637–656.

¹⁹ Cochran, W. G., “Errors of Measurement in Statistics,” Technometrics, 10 (November, 1968), 665.

²⁰ See Clausen, “Response Validity: Vote Report,” and Mosteller, Frederick et al., The Pre-Election Polls of 1948 (New York: Social Science Research Council, 1949), chap. 5. Other worthwhile reports on error are Morgenstern, On the Accuracy of Economic Observations; Kruskal, William H. and Telser, Lester G., “Food Prices and the Bureau of Labor Statistics,” The Journal of Business, 33 (July, 1960), 258–279; a thorough study of the population undercount in the 1970 census and its consequences is Siegel, Jacob S., “Coverage of the Population in the 1970 Census and Some Implications for Public Programs,” U.S. Bureau of the Census, Current Population Reports, Series P-23 (Washington, D.C.: U.S. Government Printing Office, 1975); for data problems in economic policy making, Denison, Edward F., Accounting for United States Economic Growth 1929–1969 (Washington, D.C.: The Brookings Institution, 1974), pp. 84–101. Also of interest is U.S. Bureau of the Census, “Standards for Discussion and Presentation of Errors in Data,” Technical Paper No. 32 (Washington, D.C.: U.S. Government Printing Office, 1974). Good general discussions of error are Eisenhart, Churchill, “Expression of the Uncertainties of Final Results,” Science, 160 (June 14, 1968), 1201–1204; and Youden, W. J., “Enduring Values,” Technometrics, 14 (February, 1972), 1–11.

²¹ Such a project is now moving forward: Warren E. Miller and Arthur H. Miller are preparing American Political Trends: A Sourcebook of Political Indicators, 1952–1974.

²² Flanigan, William H. and Zingale, Nancy H., Political Behavior of the American Electorate, 3rd edition (Boston: Allyn and Bacon, 1975).

²³ Kruskal, William, “Science Indicators: Themes and Variations,” in Toward a Metric of Science: Reflections on Science Indicators, ed. Yehuda Elkana, Joshua Lederberg, Robert K. Merton, Arnold Thackray, and Harriet Zuckerman (New York: Wiley, 1977).
