A Re-analysis of the Reliability of Psychiatric Diagnosis

  • Robert L. Spitzer and Joseph L. Fleiss

Classification systems such as diagnosis have two primary properties, reliability and validity. Reliability refers to the consistency with which subjects are classified; validity, to the utility of the system for its various purposes. In the case of psychiatric diagnosis, the purposes of the classification system are communication about clinical features, aetiology, course of illness and treatment. A necessary constraint on the validity of a system is its reliability. There is no guarantee that a reliable system is valid, but assuredly an unreliable system must be invalid.
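
As a rough illustration of what "consistency with which subjects are classified" can mean numerically, the sketch below computes a chance-corrected agreement coefficient (Cohen's kappa) for two raters. The choice of statistic is an assumption made for this example, since the abstract names none; the function name `cohens_kappa` and the patient labels are hypothetical, not data from the paper.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty lists of labels")
    n = len(rater_a)
    # Observed proportion of subjects given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical diagnoses for six patients (illustrative data only).
a = ["schizophrenia", "depression", "depression",
     "neurosis", "schizophrenia", "depression"]
b = ["schizophrenia", "depression", "neurosis",
     "neurosis", "depression", "depression"]
kappa = cohens_kappa(a, b)
print(f"raw agreement = {4 / 6:.2f}, kappa = {kappa:.2f}")
```

In this made-up case the raters agree on 4 of 6 patients, yet kappa is only 11/23 (about 0.48), because some of that agreement would be expected by chance from how often each rater uses each label.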

The British Journal of Psychiatry
  • ISSN: 0007-1250
  • EISSN: 1472-1465