Development of the RMT20, a composite screener to identify common mental disorders

Background There are few very brief measures that accurately identify multiple common mental disorders. Aims The aim of this study was to develop and assess the psychometric properties of a new composite measure to screen for five common mental disorders. Method Two cross-sectional psychometric surveys were used to develop (n = 3175) and validate (n = 3620) the new measure, the Rapid Measurement Toolkit-20 (RMT20) against diagnostic criteria. The RMT20 was tested against a DSM-5 clinical checklist for major depression, generalised anxiety disorder, panic disorder, social anxiety disorder and post-traumatic stress disorder, with comparison with two measures of general psychological distress: the Kessler-10 and Distress Questionnaire-5. Results The area under the curve for the RMT20 was significantly greater than for the distress measures, ranging from 0.86 to 0.92 across the five disorders. Sensitivity and specificity at prescribed cut-points were excellent, with sensitivity ranging from 0.85 to 0.93 and specificity ranging from 0.73 to 0.83 across the five disorders. Conclusions The RMT20 outperformed two established scales assessing general psychological distress, is free to use and has low respondent burden. The measure is well-suited to clinical screening, internet-based screening and large-scale epidemiological surveys.


Existing screening tools for common mental disorders
Mental disorders are common in the community but often go unrecognised, contributing to low rates of treatment. 1,2 Using screening measures to identify individuals who might be experiencing a mental health problem may lead to better uptake of evidence-based treatments. 3 Accurate screening measures may also be used to identify individuals who require more comprehensive assessment, 4,5 used to guide the tailoring of treatment in the context of internet-based and face-to-face interventions (for example Batterham et al 6 and Kiropoulos et al 7 ), or used to assist individuals in the community to recognise whether they are likely to be experiencing a specific mental disorder. 8 However, there are few brief measures that can be used to accurately screen for a range of common mental health problems. 8,9 There are two broad approaches to screening for mental health problems: assessing general psychological distress and assessing the presence of specific common mental disorders. Screeners that assess general psychological distress capture internalising symptoms (i.e. symptoms of mood or anxiety disorders) at a transdiagnostic level. 10,11 Psychological distress measures, such as the Distress Questionnaire-5 (DQ5 12 ) and Kessler-6/Kessler-10 (K6/K10 13 ), are accurate for identifying a range of mental disorders. However, this approach to screening does not provide direction as to which disorder is most likely to be present, may miss specific manifestations of distress, and may not cover symptoms of disorders other than major depressive disorder (MDD) or generalised anxiety disorder (GAD). 10,12 There have been attempts to generate brief composite screeners to identify a broader spectrum of mental disorders in online, community or clinical settings. 8,9 However, finding an appropriate balance between the brevity of the screener and its accuracy has proven challenging.
Screening measures typically aim to maximise sensitivity (low rates of false negatives), so as not to miss any individuals who may meet clinical criteria for a disorder, along with adequate levels of specificity. 14 Multidisorder composite screeners that typically consist of between two and five items per disorder have been found to perform with varying degrees of success in identifying common mental disorders in the community. 8

Use of item banks to assess mental disorders
Our team has recently developed item banks to assess a range of specific mental disorders, labelled as Rapid Measurement Toolkit (RMT) item banks. 15 The RMT item banks complement the existing Patient Reported Outcomes Measurement Information System (PROMIS) emotional health measures that assess depression and anxiety. 16 RMT and PROMIS item banks were developed through extensive multistage processes that included evaluation of items with experts and people with lived experience of mental illness, and validation in population-based samples using item-response theory methods. 15-17 Items within these banks have been demonstrated to be invariant across age, gender and education. Items are also free of local dependence, that is, items are not correlated after accounting for variance in the latent construct, which makes them appropriate candidates for use in unidimensional measurement tools. 15 By combining items from the RMT and PROMIS item banks, it may be possible to identify a parsimonious set of items that enable accurate and efficient screening for common mental disorders. However, no existing studies have assessed whether items from these dimensional measures also provide accurate screening to identify individuals who meet clinical criteria for a disorder.

Aims
The aim of the present study was to develop a composite screener for common mental disorders and, in two population-based samples, assess its psychometric properties in identifying clinical 'caseness' for five common mental disorders: MDD, GAD, social anxiety disorder (SAD), panic disorder and post-traumatic stress disorder (PTSD). Two measures of general psychological distress, the DQ5 and K10, were used as comparators for identifying individuals who met criteria for each disorder, on the basis of area under the receiver operating characteristic curve (AUC). The new multidisorder screener, called the Rapid Measurement Toolkit-20 (RMT20), was designed to have high sensitivity with acceptable specificity in assessing the presence of five mental disorders. The screener was also designed using items that have been shown to be precise in assessing severity of symptoms for each of the five included disorders. 15,16 Screeners that can also provide an indication of severity to the clinician may have higher utility than those that only assess the presence or absence of a disorder. 18,19

Method

Participants and procedures
Two independent samples were recruited using virtually identical methods, separated by 13 months. These samples are referred to as the 'development' sample (n = 3175) and the 'validation' sample (n = 3620): the screener was developed using data from the development sample 15 and then validated using data from the validation sample. 20 All participants were recruited using advertisements on the online social media platform Facebook, with the development sample recruited between August and December 2014 and the validation sample recruited between January and February 2016. Advertisements linked directly to the survey and targeted Australian adults aged 18 years or older.
The content of the advertisement was designed to attract oversampling of people with symptoms of a mental disorder, emphasising that the research was on the topic of mental health. The survey was implemented online using Qualtrics survey software. From 39 945 individuals who clicked on the advertisement for the initial (development) survey, 10 082 (25%) consented and commenced the survey and 5011 (50%) completed the full survey, with 3175 allocated to complete all of the scales included in the present study (a short form of the survey was administered to the remaining 1836 participants). For the second (validation) survey, 5379 individuals clicked on the advertisement, 5220 (97%) consented and commenced the survey and 3620 (69%) of these completed the full survey. There were no missing data as responses were required for all questions except age and gender, with participants encouraged to discontinue if they were uncomfortable with the survey.
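The recruitment percentages above follow directly from the reported counts. As a quick check of that arithmetic, a minimal Python sketch (the function and variable names are illustrative only and are not part of the study materials) is:

```python
def pct(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)

# Counts as reported in the text for each survey.
development = {"clicked": 39945, "commenced": 10082, "completed": 5011}
validation = {"clicked": 5379, "commenced": 5220, "completed": 3620}

def funnel(counts: dict) -> tuple:
    """Percentage who commenced (of clicks) and who completed (of those who commenced)."""
    return (pct(counts["commenced"], counts["clicked"]),
            pct(counts["completed"], counts["commenced"]))

# Development completers split into the full-scale group and the short-form group.
full_scale_group = 3175
short_form_group = development["completed"] - full_scale_group
```

Under these counts, `funnel(development)` and `funnel(validation)` reproduce the rounded percentages given in the text, and `short_form_group` equals the 1836 participants who received the short form.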
Written informed consent was obtained from all participants. Participants were given details of local and national mental health resources, along with the contact details of the research team, to facilitate access to mental health support if required.
The development survey included pools of items to assess SAD, panic disorder, PTSD, obsessive-compulsive disorder (OCD), adult attention-deficit hyperactivity disorder (ADHD) and substance use disorder, described previously. 15,17 In addition, the PROMIS-depression, PROMIS-anxiety and PROMIS-alcohol use item banks 16,21 were administered, but only in the development sample as these measures are established. A number of other existing scales related to mental health and suicide prevention were also included, but are not the focus of the present study.
Each survey took approximately 40-60 min to complete. Participants in the development survey were offered the opportunity to enter into a draw for one of four iPad Minis; no incentive was provided to participants in the validation survey. The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008. All procedures involving human patients were approved by the ANU Human Research Ethics Committee (protocols: #2013/509 and #2015/717).

Item banks
The item banks used to assess MDD and GAD were the PROMIS-depression and PROMIS-anxiety item banks, whereas item banks for SAD, panic disorder and PTSD were the respective RMT item banks. All items used a first-person perspective. PROMIS item banks are rated based on the past 7 days, whereas RMT item banks are based on the past 30 days. Responses to all items are on a 5-point frequency Likert scale: never (1), rarely (2), sometimes (3), often (4), always (5). The complete item banks have previously been published. 16,22 The item banks were designed through systematic item selection and refinement processes, resulting in unidimensional, accurate measures to assess specific mental disorders. 15,16,23 The authors of the current study developed the RMT item banks but were not involved in the development of the PROMIS item banks.
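To illustrate the response format, the following Python sketch codes the five frequency labels as 1-5 and combines a set of item responses. It assumes a simple sum score, which is a common convention for brief scales but is not stated in this section; the names `RESPONSE_CODES` and `score_items` are hypothetical.

```python
# Response labels and numeric codes as described in the text.
RESPONSE_CODES = {"never": 1, "rarely": 2, "sometimes": 3, "often": 4, "always": 5}

def score_items(responses: list[str]) -> int:
    """Sum the coded responses for a set of items (assumed sum scoring)."""
    if not responses:
        raise ValueError("At least one response is required")
    total = 0
    for label in responses:
        key = label.strip().lower()
        if key not in RESPONSE_CODES:
            raise ValueError(f"Unknown response label: {label!r}")
        total += RESPONSE_CODES[key]
    return total
```

Under this assumption, a four-item set would yield totals between 4 (all 'never') and 20 (all 'always').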

Clinical diagnoses
Clinical diagnoses were made using the DSM-5 symptom checklist, developed by the authors as a self-report assessment for clinical diagnosis based on DSM-5 criteria. 22,23 The checklist queried respondents about the presence or absence of symptoms based on DSM-5 definitions for each disorder of interest. Eight items assessed SAD; 21 for panic disorder; 14 for GAD; 15 for MDD (including items to exclude hypomania); 22 for PTSD; 14 for OCD; and 21 for ADHD. Each item reflected a single DSM-5 criterion for the disorder of interest, although some criteria were probed across multiple questions and additional items were used to exclude alternative diagnoses.
Example items included: 'In the past six months, did social situations nearly always make you feel frightened or anxious?', 'During the past six months, has your behaviour or difficulty in paying attention caused problems at home, work, school, or socially?' and 'In the past month, has there been a time when you unexpectedly felt intense fear or discomfort?' The checklist was designed along similar principles to the electronic version of the Mini International Neuropsychiatric Interview (MINI 24 ) in terms of structure (binary and categorical self-report items with conditional skip logic) and response burden. However, the checklist used in the current study was developed independently from the MINI, non-proprietary and based on DSM-5 rather than DSM-IV criteria. The full checklist has been published previously 22 and is available from the authors.

Comparator scales
Comparator scales to test the relative precision of the new screener were the DQ5 12 and K10. 13 Both scales are accurate unidimensional measures of general psychological distress and are accurate in identifying individuals who are likely to meet clinical criteria for a range of common internalising disorders. 10,13 The DQ5 (α = 0.91) and K10 (α = 0.94) both had excellent internal consistency in the validation sample.
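The internal consistency values above are Cronbach's α coefficients. For readers unfamiliar with the statistic, a minimal Python sketch of the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), is given below. It uses sample variances and toy data; it is not the authors' analysis code.

```python
from statistics import variance

def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("Alpha requires at least two items")
    columns = list(zip(*rows))                      # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

Alpha approaches 1 when items rise and fall together across respondents, which is the pattern implied by the values reported for the DQ5 and K10.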

Demographic factors
Demographic factors were collected to describe the participants and were based on self-reported measures of age group, gender (male, female, other), educational attainment, employment status, location (metropolitan, regional, rural) and language spoken at home.

Analysis
From the five item banks (PROMIS-depression, PROMIS-anxiety, RMT social anxiety, RMT panic, RMT PTSD), four items for each disorder were chosen on the basis of their accuracy in assessing clinical criteria for the disorder of interest. These 20 items formed the RMT20 screener. We initially tested screeners with 3-6 items for each disorder but found that adding items beyond four typically did not substantially improve the sensitivity and specificity of the screener within the development sample. The items were chosen to provide coverage across the spectrum of severity that is measured by the full item banks. 25 Specifically, items were selected based on item response theory (IRT) discrimination and difficulty parameters as well as item information curves when measuring a single unidimensional construct representing either panic disorder, SAD or PTSD. 25 This approach ensured that the screeners were accurate across the continuum of the latent construct.
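The idea of choosing items whose information is spread across the severity continuum can be sketched with a simplified dichotomous two-parameter logistic (2PL) model. This is an assumption made for illustration: the 5-point items in this study would call for a polytomous IRT model, and the parameter values below are hypothetical rather than estimates from the item banks.

```python
import math

def prob_endorse(theta: float, a: float, b: float) -> float:
    """2PL probability of endorsing an item at latent severity theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical (discrimination a, difficulty b) pairs, not values from the study.
ITEMS = {"item_low": (2.0, -1.0), "item_mid": (2.0, 0.0), "item_high": (2.0, 1.0)}

def most_informative(theta: float) -> str:
    """Name of the hypothetical item carrying the most information at theta."""
    return max(ITEMS, key=lambda name: item_information(theta, *ITEMS[name]))
```

In this sketch each item is most informative near its own difficulty, so a small set of items with difficulties spaced along the continuum measures well at mild, moderate and severe levels of the construct.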
The PROMIS screeners were only administered within the development sample, as they are established item banks, whereas screeners for the RMT measures were then tested in the validation sample. The AUC was the indicator of the precision of each subscreener. AUCs for the new subscreeners were compared with AUCs for the DQ5 and K10 for each of the five disorders.
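The AUC can be read as the probability that a randomly chosen case scores higher on the screener than a randomly chosen non-case, with ties counted as one half. A minimal Python sketch of that pairwise definition follows; it is illustrative only, uses toy data, and does not reproduce the statistical test used to compare AUCs, which is not described in this section.

```python
def auc(scores: list[float], labels: list[int]) -> float:
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    labels: 1 = meets criteria for the disorder, 0 = does not.
    This O(n_cases * n_noncases) form is for clarity, not efficiency.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    noncases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not noncases:
        raise ValueError("Both cases and non-cases are required")
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in noncases)
    return wins / (len(cases) * len(noncases))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from non-cases.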
Cut-points were defined based on Youden indices, although with a view to maximising sensitivity when there were comparable choices. Sensitivity and specificity (with 95% CI) were estimated at each cut-point and compared with the sensitivity and specificity of the DQ5 and K10 based on prescribed cut-points. 12
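The Youden index for a cut-point is J = sensitivity + specificity − 1. The Python sketch below selects the cut-point with the largest J and, when two cut-points have the same J, prefers the one with higher sensitivity. That tie-break is only an approximation of the judgement described above, and the function names and toy data are hypothetical.

```python
def sens_spec(scores: list[float], labels: list[int], cut: float) -> tuple:
    """Sensitivity and specificity when a score >= cut counts as screen-positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cut)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cut)
    return tp / (tp + fn), tn / (tn + fp)   # assumes both classes are present

def youden_cut(scores: list[float], labels: list[int]) -> tuple:
    """Return (cut, sensitivity, specificity) maximising J, ties broken on sensitivity."""
    best = None
    for cut in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, cut)
        # Round J so floating-point noise does not hide genuine ties.
        key = (round(sens + spec - 1, 12), sens)
        if best is None or key > best[0]:
            best = (key, cut, sens, spec)
    return best[1], best[2], best[3]

# Toy data: total scores and checklist caseness (1 = case).
toy_scores = [4, 6, 8, 10, 12, 14]
toy_labels = [0, 0, 1, 0, 1, 1]
cut, sens, spec = youden_cut(toy_scores, toy_labels)
```

In the toy data, cut-points of 8 and 12 give the same Youden index, and the sketch returns 8 because it misses no cases, mirroring the preference for sensitivity in a screening context.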

Results
The characteristics of the two samples are provided in Table 1. There were significant differences in all variables except for language, GAD caseness, panic disorder caseness, PTSD caseness and K10 score. Participants in the validation sample appeared to be slightly younger, better educated, had higher employment rates, and a greater proportion resided in urban areas. In addition, the validation sample had a lower prevalence of depression and SAD and less severe distress than the development sample. Nevertheless, differences were typically small (for example Cohen's d = 0.12 for DQ5), which suggests that the statistical significance of these comparisons may be more related to sample size than clinically meaningful differences. The eight PROMIS items and 12 RMT items selected for the final composite screener (RMT20) are provided in Table 2, including mean (s.d.) for individuals with and without the specific disorder of interest. It should be noted that these means are likely higher than would be seen in the general population, so should not be considered normative data. All items significantly differentiated those with and without the disorder of interest. Table 3 details the precision of the subscreeners in identifying clinical caseness for the five mental disorders based on AUC, with comparison with the two measures of psychological distress: DQ5 and K10. The table indicates that the disorder-specific subscreeners from the RMT20 were significantly more precise in screening for all disorders of interest, except in the case of GAD where the difference between the RMT20 and DQ5 was not significant. For SAD and panic disorder in particular, the RMT20 had considerably greater precision than both the DQ5 and K10, with an increase of up to 9% in AUC. Table 4 shows the performance of the subscreeners at the identified cut-points. 
All screeners had high sensitivity, approximately 85% or greater, and specificity above 70%, indicating their accuracy across independent samples. The subscreeners also had high internal consistency, at or above 0.9. Performance of the RMT20 was stronger overall than the distress screeners, with similar or higher sensitivity and specificity at prescribed cut-points.

Main findings
The psychometric properties of the RMT20 composite screening measure suggest it provides an accurate and rapid method to identify individuals who meet clinical criteria for specific common mental disorders within the general population. The RMT20 screener outperformed two established measures of general psychological distress across all disorders, suggesting the composite screener approach may be more effective and efficient when there is a need to identify the presence of specific mental disorders. The gains in precision were most evident for social anxiety, panic disorder and PTSD, which are not typically captured as well as depression and generalised anxiety by measures of general psychological distress. 10 The RMT20 was designed for online use but may also have relevance in a range of clinical settings where identifying specific forms of psychopathology would be beneficial.
Measures of general psychological distress remain useful for identifying individuals who are likely to meet clinical criteria for one or more mental disorders. Indeed, measures such as the DQ5 have been shown to be highly accurate and efficient in screening for a range of mental disorders, using only five items. However, distress measures are unable to differentiate the specific disorder that an individual is most likely to be experiencing. The 20-item composite screener presented here provides a compromise between distress measures and lengthy batteries of mental health measures. For example, using common existing measures assessing five disorders included in the composite screener would require presentation of at least 34 items from five scales, each with different stems and response frames. The RMT20 also allows flexibility in the scope of screening, enabling the subscreeners to be administered as needed.

Administration of the RMT20
The RMT20 performed well in relation to other brief composite screeners tested previously, 8,9 which typically perform better for some disorders (for example MDD) than others (for example GAD, PTSD). The RMT20 also performed similarly to computer adaptive tests, which typically require 4-15 items per domain to deliver a similar level of accuracy. 26-28 Computer adaptive tests, although similarly efficient to brief composite screeners, require considerable infrastructure to administer and do not appear to provide markedly greater precision for identifying specific mental disorders in the community. 25 In contrast, the RMT20 is easy to administer in online or paper-and-pencil settings, with minimal loss of diagnostic accuracy.

Potential applications
While our main focus in the development of this composite screener was on tailoring internet interventions to prevent and treat mental disorders in the community, 6 further potential applications are extensive. Poor recognition of mental disorders in the community 29 could be improved by provision of screening tools to the public, in conjunction with feedback to support appropriate help-seeking. Screening programmes within general practice, hospital settings and community-based mental health services often require brief and accurate indicators for a range of specific mental health problems. As a result of time constraints and consideration of patient burden, healthcare settings typically screen using psychological distress measures that may be suboptimal, or focus only on depression and/or generalised anxiety. The current findings suggest that distress screeners may not be as accurate as a composite screener in identifying particular common mental disorders such as panic disorder and PTSD. The present composite screener may provide an alternative approach to screening that is more accurate and can be administered in 1-2 min. Although the current population sample reported high levels of psychopathology, further investigation of the performance of the RMT20 in clinical settings would be warranted.

Strengths and limitations
This study used separate development and validation data-sets to establish the psychometric properties of the new composite screener. Both data-sets consisted of large samples of adults recruited from the community, with oversampling of people with mental health problems. However, there are some limitations of the present research. First, the RMT20 did not include externalising disorders or less common internalising disorders, although there is scope to add modules for these domains in future. The sample was not representative of the population nor of a treatment-seeking clinical sample, so further data may be needed to provide population norms. The clinical outcome measure was a self-report checklist, which is similar to the approach used in other population-based studies. Such checklists provide an indication for probable disorder only, so further evaluation of the RMT20 against a clinician diagnosis would be valuable. Furthermore, the development of the RMT20 was on the basis of accuracy in assessing DSM-5 disorders, so some degree of circularity in the definitions used in RMT20 may exist, excluding broader manifestations of these disorders. Although psychological distress scales are widely used for screening, a stronger future comparison for the RMT20 may be against a battery of more traditional disorder-specific screening measures (see for example Zuromski et al, 30 Kroenke et al 31 and Batterham et al 32,33 ). The PROMIS measures were only included in the development sample, as they are established measures, precluding us from measuring their psychometric properties across independent data-sets.
Finally, the method for selecting items was designed to maximise the accuracy of the screener across the dimensions of each disorder, rather than using methods to maximise classification, such as decision trees. The dual-function screeners may therefore be used in assessing both severity and presence of disorder. However, alternative methods for identifying subsets of items may have provided greater accuracy in capturing diagnostic criteria.

Implications
The RMT20 is a composite measure that is accurate in screening for five mental disorders in the community. The measure outperforms two established scales assessing general psychological distress, is free to use and is associated with low respondent burden, which makes it well-suited to busy clinical settings, internet-based screening and large-scale epidemiological surveys.