
Case finding and screening clinical utility of the Patient Health Questionnaire (PHQ-9 and PHQ-2) for depression in primary care: a diagnostic meta-analysis of 40 studies

Published online by Cambridge University Press:  02 January 2018

Alex J. Mitchell*
Department of Cancer Studies, University of Leicester, and Department of Psycho-Oncology, Leicestershire Partnership NHS Trust, Leicester, UK
Motahare Yadegarfar
Medical School, University of Leicester, Leicester, UK
John Gill
Medical School, University of Leicester, Leicester, UK
Brendon Stubbs
Institute of Psychiatry, Psychology and Neuroscience, King's College London, and Physiotherapy Department, South London and Maudsley NHS Foundation Trust, UK
Alex J. Mitchell, Psycho-Oncology, Department of Cancer Studies, University of Leicester, Leicester LE1 5WW, UK. Email:

The Patient Health Questionnaire (PHQ) is the most commonly used measure to screen for depression in primary care, but there is still a lack of clarity about its accuracy and optimal scoring method.


To determine via meta-analysis the diagnostic accuracy of the PHQ-9-linear, PHQ-9-algorithm and PHQ-2 questions to detect major depressive disorder (MDD) among adults.


We systematically searched major electronic databases from inception until June 2015. Articles were included if they reported the accuracy of PHQ-9 or PHQ-2 questions for diagnosing MDD, defined according to standard classification systems, in primary care. We carried out meta-analysis, meta-regression, moderator and sensitivity analyses.


Overall, 26 publications reporting on 40 individual studies were included representing 26 902 people (median 502, s.d.=693.7) including 14 760 unique adults of whom 14.3% had MDD. The methodological quality of the included articles was acceptable. The meta-analytic area under the receiver operating characteristic curve of the PHQ-9-linear and the PHQ-2 was significantly higher than that of the PHQ-9-algorithm, a difference that was maintained in head-to-head meta-analysis of studies. Our best estimates of sensitivity and specificity were 81.3% (95% CI 71.6–89.3) and 85.3% (95% CI 81.0–89.1) for the PHQ-9-linear, 56.8% (95% CI 41.2–71.8) and 93.3% (95% CI 87.5–97.3) for the PHQ-9-algorithm, and 89.3% (95% CI 81.5–95.1) and 75.9% (95% CI 70.1–81.3) for the PHQ-2. For case finding (ruling in a diagnosis), none of the methods was suitable, but for screening (ruling out non-cases), all methods were encouraging with good clinical utility, although the cut-off threshold must be carefully chosen.
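The contrast between case finding and screening can be illustrated with a short calculation. The sketch below is not the authors' analysis: it simply applies Bayes' rule to the pooled point estimates of sensitivity and specificity quoted above, assuming the overall MDD prevalence of 14.3% applies to each method. It also assumes the usual definition of the clinical utility index (CUI+ = sensitivity × PPV for ruling in, CUI− = specificity × NPV for ruling out). The function and variable names are hypothetical, and the resulting figures need not match those reported in the full paper, which pools study-level data.

```python
# Illustrative only: derived from the abstract's pooled point estimates,
# not from the study-level data used in the meta-analysis.

def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def clinical_utility(sensitivity, specificity, prevalence):
    """Return PPV, NPV and the clinical utility indices (assumed definition)."""
    ppv, npv = predictive_values(sensitivity, specificity, prevalence)
    return {
        "ppv": ppv,
        "npv": npv,
        "cui_pos": sensitivity * ppv,   # case finding (ruling in)
        "cui_neg": specificity * npv,   # screening (ruling out)
    }


# Pooled sensitivity and specificity as quoted in the abstract.
POOLED_ESTIMATES = {
    "PHQ-9-linear": (0.813, 0.853),
    "PHQ-9-algorithm": (0.568, 0.933),
    "PHQ-2": (0.893, 0.759),
}
PREVALENCE = 0.143  # assumption: overall MDD prevalence applied to every method

if __name__ == "__main__":
    for method, (sens, spec) in POOLED_ESTIMATES.items():
        result = clinical_utility(sens, spec, PREVALENCE)
        print(
            f"{method}: PPV={result['ppv']:.3f} NPV={result['npv']:.3f} "
            f"CUI+={result['cui_pos']:.3f} CUI-={result['cui_neg']:.3f}"
        )
```

Under these assumptions, every method yields a much higher rule-out index than rule-in index, which is consistent with the abstract's conclusion that the PHQ is better suited to screening than to case finding.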


The PHQ can be used as a first-step assessment in primary care, and the PHQ-2 is adequate for this purpose with good acceptability. However, neither the PHQ-2 nor the PHQ-9 can be used to confirm a clinical diagnosis (case finding).

Creative Commons
This is an open access article distributed under the terms of the Creative Commons Non-Commercial, No Derivatives (CC BY-NC-ND) licence (
Copyright © The Royal College of Psychiatrists 2016


Declaration of interest


