Validation of the Cohen-Mansfield Agitation Inventory Observational (CMAI-O) tool

ABSTRACT Objectives: Behaviours associated with agitation are common in people living with dementia. The Cohen-Mansfield Agitation Inventory (CMAI) is a 29-item scale widely used to assess agitation and is completed by a proxy (family carer or staff member). However, proxy informants introduce possible reporting bias when blinding to the treatment arm is not possible, and potential accuracy issues due to irregular contact between the proxy and the person with dementia over the reporting period. An observational measure completed by a blinded researcher may address these issues, but no agitation measures with comparable items exist. Design: Development and validation of an observational version of the CMAI (CMAI-O), to assess its validity as an alternative or complementary measure of agitation. Setting: Fifty care homes in England. Participants: Residents (N = 726) with dementia. Measurements: Two observational measures (CMAI-O and PAS) were completed by an independent researcher. Measures of agitation, functional status, and neuropsychiatric symptoms were completed with staff proxies. Results: The CMAI-O showed adequate internal consistency (α = .61), criterion validity with the PAS (r = .79, p < .001), incremental validity in predicting quality of life beyond the Functional Assessment Staging of Alzheimer's disease (β = 1.83, p < .001 at baseline) and discriminant validity from the Neuropsychiatric Inventory Apathy subscale (r = .004, p = .902). Conclusions: The CMAI-O is a promising research tool for independently measuring agitation in people with dementia in care homes. Its use alongside the CMAI could provide a more robust understanding of agitation amongst residents with dementia.

Behaviours associated with agitation are common amongst people living with dementia, particularly in care home settings, with prevalence estimates of clinically significant symptoms ranging from 40 to 85% in various countries including the UK, Norway, and Holland (Livingston et al., 2017; Testad et al., 2007; Zuidema et al., 2007). The aetiology of agitation is not clear but is likely to be multifactorial, with care approaches, the physical and social environment, medical comorbidities such as pain, genetics and the progression of dementia pathology all being contributing factors (Lanctôt et al., 2017). Consistent with its multifactorial aetiology, there are many behaviours typically associated with agitation, including repetitive mannerisms, hoarding, screaming, hitting, wandering, verbal aggression, and general restlessness, which can be extremely distressing to the person with dementia, their carers and others around them. Recent consensus criteria for agitation state that symptoms of agitation should: (1) occur in the context of cognitive impairment or dementia; (2) be consistent with emotional distress; (3) manifest as excessive motor activity, verbal or physical aggression; and (4) not be solely attributable to another disorder (Cummings et al., 2015). While these criteria provide a useful and much needed framework for the assessment of agitation, particularly with respect to inclusion in clinical trials, there is still a need to refine and validate assessment tools to accurately evaluate agitation as a clinical outcome.
The Neuropsychiatric Inventory (NPI; Cummings et al., 1994) and the Cohen-Mansfield Agitation Inventory (CMAI; Cohen-Mansfield and Billig, 1986) are both commonly used to evaluate agitation. The CMAI consists of 29 items forming four subscales: physically aggressive behaviour (e.g. hitting others), physically non-aggressive behaviour (e.g. pacing), verbally aggressive behaviour (e.g. swearing) and verbally non-aggressive behaviour (e.g. repetitive sentences). The CMAI incorporates both the frequency and severity of behaviours associated with agitation and allows the quantification of agitated behaviours into a continuous measure, which is sensitive to change. The combination of these factors has led to its widespread use in clinical trials of pharmacological (e.g. Porsteinsson et al., 2014) and psychosocial interventions (e.g. Ballard et al., 2018).
Proxy informant interviews to complete outcome measures have clear benefits for populations of people with dementia, who are likely to have significant cognitive impairment or communication difficulties that often make direct interviews difficult (Moyle et al., 2007). Notable drawbacks include informant recall/knowledge (due to factors such as the amount of contact between the proxy and research participant during the reporting period) and, in the case of psychosocial interventions, lack of informant blinding to treatment, both of which can affect measurement accuracy and increase the chance of unwanted reporting error. In clinical trial settings, where there is an increasing number of clinical trials relating to agitation (e.g. Creese et al., 2018), and where close attention is being paid to the accuracy of outcome measurements and minimising placebo response rates, these drawbacks illustrate the need to explore alternative and complementary methods for outcome measurements.
Observational tools are one avenue for addressing the issues outlined above, as independent observers (i.e. members of the research team who are not part of the intervention delivery) are able to remain blinded to treatment and are not subject to recall issues. There are a number of observational measures of agitation in people with dementia (Curyto et al., 2008; Zeller and Rhoades, 2010). However, some measure a broader range of behaviours that include but are not restricted to agitation (Auer et al., 1996; Beck et al., 1997; McCann et al., 1997; Morgan and Stewart, 1998; Van Haitsma et al., 1997), some are not appropriately validated (Camberg et al., 1999; Cohen-Mansfield et al., 1989; Yudofsky et al., 1997) or are not validated specifically for use with people with dementia. Others include only a narrow range of agitated behaviours, such as aggression (e.g. Almvik et al., 2000; Perlman and Hirdes, 2008), and others are not available (Whall, 1999).
There are then few observational agitation measures appropriately validated for use with people with dementia that include a range of potential agitated behaviours. These include the Agitated Behaviour Scale (ABS; Bogner et al., 1999; Corrigan, 1989) and the Pittsburgh Agitation Scale (PAS; Rosen et al., 1994). However, these too have limitations. The ABS, for example, was originally developed for use in people with traumatic brain injury, and some items are relevant only to a hospital rather than care home environment (e.g. pulling at tubes, restraints etc). The scale also does not capture the frequency of occurrence of a particular behaviour over an extended period of time, due to the 20-minute observation periods used. The PAS measures the severity of four groups of agitated behaviours and is completed by trained researchers or clinical staff based on observations of participant behaviour over a shift or similar extended period of time. The PAS lacks detail in key areas including the breadth of symptoms assessed and the frequency with which symptoms occur. These additional pieces of information are vital for any assessment scale to capture the heterogeneous range of symptoms that can constitute an agitated clinical syndrome. Likewise, due to the different approaches to measurement of agitation, none of the measures are appropriate for use as a direct comparator measure to the CMAI to detect potential reporting error by proxy reporters.
We sought to address these issues by developing and testing an observational version of the CMAI and including it in a randomized controlled trial where the primary outcome measure was resident agitation measured by the CMAI. This paper presents data that evaluates the psychometric properties and reliability of the CMAI-O and its criterion, incremental, and discriminant validity in comparison to existing measures. It also discusses its potential as an alternative or complementary measure of agitation in people with dementia. This was achieved through secondary analysis of the [name redacted] DCM EPIC trial dataset (see Surr et al., 2016 for trial protocol).

Method
Participants
Participants (N = 726) were recruited from 50 randomly selected care homes (mean = 15 residents per care home) in three regions across England (Yorkshire, Oxfordshire and London) for a randomized controlled trial. Residents were eligible to participate if they lived in the care home permanently and had a formal diagnosis of dementia or scored ≥ 4 on the Functional Assessment Staging of Alzheimer's disease (FAST; Reisberg, 1988). Residents were ineligible to participate if they had been formally admitted to an end of life care pathway or were cared for in bed. For inclusion in the present work, participants were required to have the CMAI-O (completed by an independent observer) completed for at least 2 hours, and at least one measure completed by a staff proxy; therefore, the sample differed at each time point due to loss to follow-up and missing data.
[...] (as assessed by their key worker status and/or the judgement of the home manager) for at least several months. Additionally, this person must have been in regular contact with the resident during the previous 2 weeks. Researchers provided information on the completion of measures, including the period that the staff proxy should consider, and the use of rating scales.
Where possible, residents provided consent to participate. Where potential participants were deemed not to have capacity, a personal consultee (relative or friend) was approached to provide advice on the person's wishes. For potential participants whose personal consultee did not respond to contact or who did not have anyone to act as a personal consultee, a nominated consultee (care staff member) was approached to provide this advice.

Ethical considerations
Ethical approval for the clinical trial was obtained from the Bradford Leeds NHS REC committee and subsequent ethical approval for this sub-study was obtained from the Leeds Beckett University research ethics committee. Where possible, residents provided informed written consent to participate. Where potential participants were deemed not to have capacity, in line with the Mental Capacity Act (2005) and following standard guidance (Medical Research Council, 2007), a personal consultee (relative or friend) was approached to provide advice on the person's wishes. For potential participants whose personal consultee did not respond to contact or who did not have anyone to act as a personal consultee, a nominated consultee (care staff member) was approached to provide this advice. For reasons of privacy and dignity, CMAI-O and PAS observations took place only in the public areas of the care home (e.g. lounge, dining room, corridors), and no private or personal care, or observations in bedrooms or bathrooms were conducted.

Measures
Six measures were completed at three time points: baseline, 6 months post care home randomisation (first follow-up) and 16 months post randomisation (second follow-up; see Table 1 for an overview). The independent observer, who was blinded to intervention allocation, conducted observations (for both the CMAI-O and PAS) between approximately 10am and 12pm, and 2pm and 5pm, on a single day, to reach a total of 5 hours of observation. All other measures, completed by a researcher with the staff proxy, were completed within 4 weeks of the observational measures.
Researchers were able to act as the independent observer if they had not previously visited the care home to consent participants or complete any data collection. Prior to this, inter-rater reliability with another researcher of at least 80% agreement on the CMAI-O over a 1-hour period was required.
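As an illustration of the 80% criterion, the following minimal sketch computes simple item-level percent agreement between two observers. The text does not state how agreement was calculated, so the item-by-item definition, the function name `percent_agreement` and the example scores are all assumptions made for illustration.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage (0-100) of items on which two observers gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty item set")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical CMAI-O item scores (1-4) from two observers for one resident.
observer_1 = [1, 1, 2, 4, 1, 3, 1, 1, 2, 1]
observer_2 = [1, 1, 2, 3, 1, 3, 1, 1, 1, 1]

agreement = percent_agreement(observer_1, observer_2)
meets_threshold = agreement >= 80.0  # assumed reading of "at least 80% agreement"
```

Percent agreement does not correct for chance agreement; a chance-corrected statistic could be substituted if that is closer to what a given study requires.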

COHEN-MANSFIELD AGITATION INVENTORY
The CMAI (Cohen-Mansfield and Billig, 1986) measures 29 behaviours typically associated with agitation or aggression. Proxy reporters identify the frequency of the 29 behaviours during the past 2 weeks, on a seven-point Likert scale ranging from "never" to "several times an hour", with higher scores indicating more agitation. The measure has moderate concurrent validity with the NPI-NH Agitation sub-scale (r = .52; Wood et al., 2000) and converges with an observational scale of agitation, the Agitated Behaviours Mapping Instrument (r = .32-.39; Cohen-Mansfield and Libin, 2004). The CMAI showed internal consistency of α = .85 at baseline.
The CMAI-O includes the same 29 items as the CMAI, but is completed through observations of the participant by a trained researcher. The CMAI-O is scored on a four-point Likert scale of 1 = "never", 2 = "less than once per hour", 3 = "once per hour" and 4 = "several times an hour", with higher scores indicating more agitation. In the present study, data were collected across 5 hours split into two time points during a single day. Up to 15 observations (individuals) were conducted simultaneously. The CMAI-O showed internal consistency of α = .61 at baseline.
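For readers unfamiliar with the internal consistency statistic reported for each scale, the sketch below shows the standard Cronbach's α formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The study's analyses were run in SPSS; the function name `cronbach_alpha` and the small data set here are hypothetical and serve only to illustrate the calculation.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants, each a list of item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the participants' total scores.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: four participants rated on three items (scale 1-4).
perfectly_consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
alpha_max = cronbach_alpha(perfectly_consistent)

# A less consistent hypothetical set, where items do not move together.
mixed = [[1, 2, 1], [2, 1, 3], [3, 3, 1], [4, 2, 2]]
alpha_mixed = cronbach_alpha(mixed)
```

When most participants score at the floor of most items, item variances are small and α can be depressed even if the items are conceptually coherent.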

PITTSBURGH AGITATION SCALE (PAS)
The PAS (Rosen et al., 1994) is an observational rating of agitation and was completed by the independent observer concurrently with the CMAI-O. This measures intensity of agitation on four domains: aberrant vocalization (e.g. screaming), motor agitation (e.g. wandering), aggressiveness, and resisting care, each scored on a five-point Likert scale from 0 (not present) to 4 (high intensity of agitated behaviour). The measure also asks whether any interventions were used by care staff to support or manage agitated behaviours, e.g. seclusion or restraint. PAS observations should be conducted for between 1 and 8 hours, and in the present study, data were collected across 5 hours in a single day. The measure has internal consistency of α = .80 (Rosen et al., 1994) and in the current study, had internal consistency of α = .63 at baseline.
The NPI-NH (Cummings, 1997; Cummings et al., 1994) identifies psychopathology in people with dementia in care homes. It assesses the presence and occupational disruptiveness of 12 behaviours, such as agitation/aggression, anxiety, and disinhibition. The frequency (within the past 2 weeks) and severity of each behaviour are rated on a four-point (1-4) and three-point (1-3) Likert scale, respectively. A score is then calculated for each behaviour by multiplying the severity and frequency scores (1-12). A total score is then calculated by adding the scores of all 12 subscales. A composite score comprising agitation/aggression, irritability/lability, disinhibition and aberrant motor behaviour accounts for 60% of the variance of a total CMAI score (Wood et al., 2000). The measure showed internal consistency of α = .63 at baseline. Benhabib et al., 2013). The NPI-NH showed internal consistency of α = .74 at baseline.
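The NPI-NH scoring arithmetic described above (frequency multiplied by severity per behaviour, then summed over the 12 behaviours) can be sketched as follows. The function names and example ratings are hypothetical, and scoring a behaviour that is not present as 0 is an assumption that the text does not state.

```python
def npi_domain_score(frequency, severity):
    """Score for one behaviour: frequency (1-4) multiplied by severity (1-3)."""
    if not 1 <= frequency <= 4:
        raise ValueError("frequency must be between 1 and 4")
    if not 1 <= severity <= 3:
        raise ValueError("severity must be between 1 and 3")
    return frequency * severity


def npi_total(ratings):
    """Sum of the 12 behaviour scores.

    Each entry is a (frequency, severity) tuple, or None when the behaviour
    is absent (assumption: an absent behaviour contributes 0).
    """
    if len(ratings) != 12:
        raise ValueError("expected ratings for 12 behaviours")
    return sum(0 if r is None else npi_domain_score(*r) for r in ratings)


# Hypothetical resident: three behaviours present, nine absent.
example_ratings = [(4, 3), (2, 1), None, (1, 2)] + [None] * 8
example_total = npi_total(example_ratings)
```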
The Functional Assessment Staging of Alzheimer's Disease (FAST; Reisberg, 1988) records how severely dementia affects individuals' daily functioning. Scores range from 1 (no dementia) to 7 (severe dementia), with levels 6 and 7 having sub-levels, focusing on deficits associated with personal care and communication. It was completed by a researcher through interview with a staff proxy.

MISSING DATA
The levels of missing data were relatively high by the second follow-up (43.8%), due to loss to follow-up across the 16 months of the trial (see Table 2). During data analysis, missing data were identified and recoded before being incorporated or excluded from all analyses. For the incremental validity and discriminant validity analyses, all missing data recorded from each scale were excluded from the analysis. When performing correlation analyses between the PAS and CMAI-O, multiple imputation was used (Rubin, 1987). This technique creates complete data sets by generating several possible values for any missing values. Analyses are conducted across all of these data sets, and outputs provide estimates for each data set of the results that would have been expected if there had been no missing values in the original data set. In the present study, five data imputations were created. Furthermore, the CMAI-O was only completed for participants who were in a communal area during data collection (either AM, PM, or both). The number of participants with completed CMAI-O at each time point and reasons why people were not in a communal area of the care home were recorded (see Table 3).
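To make the multiple imputation step more concrete, the sketch below combines estimates from several imputed data sets using Rubin's rules: the pooled estimate is the mean across imputations, and the total variance is the within-imputation variance plus (1 + 1/m) times the between-imputation variance. Whether the reported results were pooled in exactly this way is not stated in the text; the function name `pool_rubin` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their squared standard errors."""
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    pooled = mean(estimates)
    within = mean(variances)       # average sampling variance
    between = variance(estimates)  # variance of the estimates across imputations
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical estimates from five imputed data sets (e.g. Fisher z values)
# with their squared standard errors.
z_estimates = [1.08, 1.11, 1.05, 1.10, 1.09]
z_variances = [0.0016, 0.0016, 0.0016, 0.0016, 0.0016]
pooled_z, pooled_se = pool_rubin(z_estimates, z_variances)
```

Correlations are commonly pooled on the Fisher z scale and back-transformed afterwards, which is why the hypothetical inputs are labelled as z values.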

Data Analysis
Data manipulation, analyses and graphical visualisations were performed in IBM SPSS 24. Prior to performing inferential statistical analyses, each dataset was assessed for parametric assumptions. A skewed distribution was consistently observed across all data, with scores concentrated at the lower end of each scale. This was expected, as the majority of residents within care homes score low on neuropsychiatric outcome measures, whilst only a minority of residents typically score highly. In order to account for these non-normal data, parametric bootstrapping was performed on all analyses (see Davison and Hinkley, 1997). Data were deemed to meet all other parametric assumptions.
Several forms of validity were tested: criterion validity, incremental validity and discriminant validity. Criterion validity refers to the degree of correlation between a proposed novel measure and a pre-existing, validated assessment that targets the same phenomenon. It is important that a new measure demonstrates criterion validity as it assures researchers and practitioners that any results derived can be assumed to be consistent with alternative methods of assessment. Incremental validity is used to determine whether a new measure increases the predictive ability beyond that provided by an existing tool. It is important to demonstrate this form of validity as it shows that a measure elicits additional information relative to other measures. Discriminant validity illustrates whether scores on a scale are truly independent of scores on another scale that measures distinctly different phenomena. Previous evidence has found no correlation between agitation and apathy in a sample of nursing home residents living with dementia (Mouriz-Corbelle et al., 2017). Therefore, we hypothesised that the CMAI-O would be unrelated to the NPI Apathy subscale if discriminant validity is high.
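The general idea of bootstrapping a correlation on skewed data can be sketched as below. Note that this sketch uses nonparametric case resampling with a percentile interval, which is a different (and simpler) variant than the parametric bootstrapping named in the text, and it is not the SPSS procedure used in the study. The function names, the seed and the data are hypothetical.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the Pearson correlation."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # correlation is undefined for a constant resample
        stats.append(pearson_r(xs, ys))
    if not stats:
        raise ValueError("no usable bootstrap resamples")
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lower, upper


# Hypothetical, positively skewed scores for 12 residents on two measures.
scores_a = [29, 29, 30, 30, 31, 31, 32, 33, 35, 38, 44, 52]
scores_b = [0, 0, 1, 0, 1, 2, 1, 3, 2, 5, 7, 9]
ci_lower, ci_upper = bootstrap_ci(scores_a, scores_b)
```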

Preliminary analyses
The prevalence of each symptom at baseline, measured using the CMAI and CMAI-O can be seen in Table 4. Within this table, for the CMAI, only items that were present 'several times a week' or more were included, in order to be comparable with the CMAI-O. This is due to the CMAI-O being completed during one single day; therefore, items that are displayed once per week or less were unlikely to be displayed during a single day observation. Generally, levels of agitation were lower for the CMAI-O, with the exception of 'performing repetitious mannerisms', which was seen more commonly.
Correlations between the CMAI-O and the CMAI, PAS and NPI, at each time point were also conducted (see Table 5). Pearson correlations for the CMAI-O and the CMAI as scored by the staff proxy were significant: baseline r = .44, 6 months r = .23 and 16 months r = .28 (all p < .001). The NPI Agitation subscale had a weak but statistically significant correlation with the CMAI-O at baseline and 6 months (r = .38, p < .001 and r = .12, p = .021); however, no significant correlation was found between the two measures at 16 months (r = .09, p = .199). Finally, both the AM and the PM scores for the PAS showed a significant correlation with total CMAI-O scores across all time points (p < .001).
To understand whether scores on the CMAI-O were related to scores on another scale measuring agitation, correlations were calculated between this and the PAS. There is no existing evidence of validation of the CMAI and the PAS. Given the PAS is the only existing observational measure of agitation, it was used to compare to the CMAI-O, to establish criterion validity. A partial correlation was completed with bootstrapping to account for non-normally distributed data, controlling for the data collection time point (baseline/6-month follow-up/16-month follow-up) and the time of day (AM or PM) when both were completed. When controlling for time point and the time of day completed, a significant partial correlation was found (r = .80, p < .001).
Additionally, we compared CMAI-O scores to the NPI Agitation subscale. Whilst not an observational measure, it represents a psychometric tool validated to measure agitation in nursing home residents, thus a significant correlation between this and the CMAI-O would support criterion validity. A partial correlation analysis of the CMAI-O and the NPI Agitation subscale, controlling for data collection time point, found a small but significant relationship between the measures (r = .24, p < .001).
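A partial correlation of the kind reported above removes the linear influence of the control variables from both measures before correlating them. The sketch below uses the standard recursive formula, r(xy.z) = (r(xy) − r(xz)·r(yz)) / √((1 − r(xz)²)(1 − r(yz)²)), applied once per control variable. It is an illustration only: the function names and data are hypothetical, control variables are treated as numeric codes (a three-level time point would more properly be dummy coded), and it does not reproduce the SPSS analysis.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, controls):
    """Correlation between x and y, controlling for each sequence in controls."""
    if not controls:
        return pearson_r(x, y)
    z, rest = controls[0], controls[1:]
    # Remove z from all three pairwise relationships, given the other controls.
    r_xy = partial_r(x, y, rest)
    r_xz = partial_r(x, z, rest)
    r_yz = partial_r(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data in which x and y are related only through z.
z = [1, 1, 1, 1, -1, -1, -1, -1]
x = [2, 2, 0, 0, 0, 0, -2, -2]
y = [2, 0, 2, 0, 0, -2, 0, -2]

raw = pearson_r(x, y)            # association before controlling for z
adjusted = partial_r(x, y, [z])  # association after controlling for z
```

In this constructed example the raw correlation is entirely explained by the shared control variable, so the partial correlation drops to zero; in the study the opposite pattern is reported, with the association remaining strong after adjustment.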
To assess whether the CMAI-O increased the predictive ability beyond that provided by an existing method of assessment, we hypothesised that agitation as measured by the CMAI-O would predict quality of life as measured by the QUALID, above the prediction from scores on the FAST. It has been suggested in previous studies that functional status significantly impacts the quality of life of a person living with dementia (e.g. Andersen et al., 2004) with those with more severe dementia experiencing poorer quality of life.
In step one of a hierarchical multiple regression, scores on the FAST were included as a predictor of quality of life. The FAST score significantly predicted the quality of life score across all time points: baseline (β = .21, p < .001), 6 months (β = .202, p = .001) and 16 months (β = .198, p = .006). In step two of the analysis, when the CMAI-O score was also included as a predictor, an R² change of .095 (R² = .139, F(2, 390) = 66.87, p < .001) was observed at baseline, an R² change of .027 (R² = .068, F(2, 286) = 33.7, p < .001) at 6 months and an R² change of .042 (R² = .082, F(2, 191) = 32.12, p < .001) at 16 months. The FAST remained significant at each of these time points also [6 months (β = 3.45, p = .001), 16 months (β = 3.09, p = .002)]. These analyses demonstrated that the observational CMAI had incremental predictive value in the measurement of quality of life beyond levels that could be predicted by participants' stage of dementia, as determined by functional status (FAST).
To establish whether the CMAI-O failed to correlate with a measure that should be conceptually unrelated, bootstrapped correlation analyses between the CMAI-O and all NPI subscales were conducted, controlling for the data collection time point. We hypothesized that the CMAI-O would show no relationship with the NPI Apathy subscale. No relationship was found between the CMAI-O and the NPI Apathy subscale when controlling for time point (r = .004, p = .902). This suggests that agitation as measured by the CMAI-O is unrelated to apathy, in line with previous research.
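The logic of the two-step regression can be illustrated with the R² change calculation below. With one predictor, R² is the squared correlation; with two predictors it can be written in terms of the three pairwise correlations. The F statistic for the change, with 1 and n − 3 degrees of freedom, is ΔR² / ((1 − R² at step two) / (n − 3)). This is a sketch of the general technique with hypothetical names and data, not a re-analysis of the trial data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r2_change(y, x1, x2):
    """R-squared for y ~ x1, for y ~ x1 + x2, their difference, and F for the change."""
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)
    r2_step1 = r_y1 ** 2
    r2_step2 = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    delta = r2_step2 - r2_step1
    n = len(y)
    # One added predictor; residual df = n - 2 predictors - 1 intercept.
    f_change = delta / ((1 - r2_step2) / (n - 3))
    return r2_step1, r2_step2, delta, f_change


# Hypothetical data: the outcome depends on both predictors plus noise.
stage = [1, 1, 1, 1, -1, -1, -1, -1]      # stands in for the step-one predictor
observed = [1, 1, -1, -1, 1, 1, -1, -1]   # stands in for the step-two predictor
outcome = [4, 2, 2, 0, 0, -2, -2, -4]

step1, step2, delta_r2, f_delta = r2_change(outcome, stage, observed)
```

A positive ΔR² with a significant F for the change is what is meant by the second predictor adding incremental predictive value beyond the first.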

Discussion
The present study describes the development and validation of the CMAI-O. This tool assesses agitation over a shorter period than the original CMAI, can be completed by an independent observer rather than relying on proxy reports and therefore could form part of an agitation assessment battery for research, including clinical trials. The psychometric properties of the CMAI-O were examined. Scores on the CMAI-O correlated with scores on the CMAI, PAS and NPI-NH, and the tool had incremental predictive value when measuring quality of life beyond levels predicted by functional status. No relationship was found between the CMAI-O and NPI Apathy subscale, suggesting that agitation as measured by the CMAI-O is unrelated to apathy. The internal consistency of the CMAI-O, whilst being adequate, suggested that there may be issues with the items within the scale. Further examination of this is required, to establish whether this was caused by general low levels of agitation within our sample, or issues within the measure. Within the current study, we sought to assess the psychometric properties of an alternative method of administration; therefore, we have not suggested refinement of the CMAI-O items.
To identify whether the CMAI-O identified behaviours typically associated with agitation, comparisons between the CMAI-O and CMAI were also conducted. Scores on the CMAI-O were consistently slightly lower than those on the CMAI. This may be due to the restricted locations and time periods of observation. For example, observations took place in the communal areas; therefore, agitated behaviours in other areas were not recorded, in particular during personal care. Previously, personal care activities including bathing and toileting, particularly when initiated by a caregiver rather than the resident themselves, have been found to significantly increase agitated behaviours (Cohen-Mansfield et al., 1992).
There are several limitations with the present research. One difficulty when using observational methods for assessing agitation is that many behaviours do not occur often; therefore, long observations are sometimes required to detect behaviours. This is especially true for the less common behaviours like aggression, which are some of the most important in terms of impact on others (Cohen-Mansfield, 1996). A standardised period of 5 hours within a single day in this study provided data that were adequately correlated with alternative proxy measures and, therefore, is the recommended minimum observation period for use of the CMAI-O. Secondly, we recruited participants from 50 care homes in three areas of England, which may not represent average care homes across the UK. For example, those under admissions bans due to breaches of statutory regulations were not eligible to participate and those approached but not interested in taking part in research did not consent to take part. The levels of agitation seen in the present sample were lower than previous samples (e.g. Zuidema et al., 2007). This may reflect improved understanding of how to effectively manage behaviours associated with agitation by care home staff over recent years. Understanding these as a reflection of unmet needs rather than symptoms of dementia is becoming a more common approach and recent research has found lower levels of agitation amongst people living with dementia in care homes (Livingston et al., 2017). Furthermore, the average CMAI score in the present study was similar to that of participants in another recent psychosocial clinical trial (Ballard et al., 2016). However, comparing levels of agitation in this study to those reported in earlier care home studies identified lower levels in this sample generally.
For example, 16% of individuals experienced general restlessness, in comparison to 44% of individuals in the Netherlands (Zuidema et al., 2007), although our sample was comparable to the 22% of individuals who experienced general restlessness within a Norwegian sample (Testad et al., 2007). However, it may be that given the relationship between agitation and poor quality care, individuals living in care homes where there are breaches of statutory regulations, such as unsafe staffing levels, might experience greater levels of agitation, and such individuals were not recruited in the present study as their care homes were ineligible to participate. Additionally, we only observed participants in communal areas, which may have impacted on levels of agitation. This may be particularly relevant as some of the measures completed by staff proxy informants included specific questions about personal care, which are not observed due to issues around privacy and dignity.
By developing the CMAI-O we have addressed the disadvantage of the CMAI relating to potential reporting biases from proxy reporters. However, observational measures of agitation also have limitations. Levels of agitation observed by the CMAI-O were generally lower than those reported by care staff members in the CMAI, although scores on the two were significantly correlated. It may be that staff proxies over-report the prevalence of these behaviours on the CMAI, particularly if they find the behaviours difficult to support. Alternatively the CMAI-O may miss some agitation due to its use in this study only in communal areas during the daytime. Research indicates higher levels of agitation may be seen in the evenings and during personal care (e.g. Sloane et al., 2004). However, the CMAI-O does have the benefit of using continual observation over two sustained periods, which may offer a more accurate picture of agitation than measures that use short time-sampling observations. Additionally, observer fatigue is a concern with longer observation periods and one which should be considered. Therefore, further research on optimal observation length and time(s) of day when using the CMAI-O is required, as well as exploration of ethical and practical issues around the potential for observations to be conducted in other areas where care may be delivered. Researchers have rightly questioned the accuracy of recall of agitated behaviours by staff members working in care homes, due to the impact of caregiver burden. Care staff reporting a lower burden have been found to provide higher quality of life ratings than those experiencing more caregiver burden (Graske et al., 2014). One possible explanation for this is that care staff experiencing more burden also expect residents to be experiencing this burden (Graske et al., 2014). Alternatively, it may be that those experiencing more burden are less able to see positive aspects of residents' lives. 
However, there is currently very little research examining this amongst formal care staff. A systematic review of the relationship between family caregiver distress and quality of life ratings concluded that family caregivers who reported experiencing more stress associated with caregiving provided more negative reports of their relative's quality of life (Neumann et al., 2000). However, more recently no relationship has been found between family caregiver distress and their proxy quality of life ratings (Sheehan et al., 2012). Furthermore, there is scope for influence by interpretation or perceptual biases of individual proxies. For example, a resident whose physical mobility has improved following an exercise intervention and who is spending more time walking may be noted as 'wandering' (Brett et al., 2017) by care home staff. This may be due to a perception that residents who are not sitting are more problematic to care for, or that 'wandering' is a common problem in dementia and therefore all walking behaviours must be wandering. Perceiving or interpreting this behaviour as wandering would indicate an increase in agitation when completing the CMAI, when the behaviour is instead walking that reflects an increase in physical ability and activity. Use of the CMAI-O, by trained researchers who receive instruction on how to observe and interpret resident behaviours may allow a more objective understanding of the behaviour of individuals, through concentrated observation of them over a substantial period of time. This may provide a more accurate measure of agitation than untrained staff recall of behaviours over the previous 2 weeks. Future research should consider the influence of proxy reporters and their subjective experiences on research outcomes.
Furthermore, a single researcher can complete the CMAI-O for multiple participants over several time points, whereas it is particularly difficult to recruit staff proxies for multiple time points, because of turnover and shift patterns. In addition, the use of staff proxies to report on agitation over a 2-week period raises concerns about reporting accuracy, since they work shifts and even when on duty do not always spend long periods of time with individual residents. Therefore, assumptions about frequency of agitated behaviours may be made based on intensity over a short period or reports of disruptiveness from colleagues. In one recent study, to overcome this, staff members completed a tally score of the CMAI at the end of each shift for a 2-week period (Brett et al., 2017). The CMAI-O addresses this potential reporting bias, as a researcher captures all behaviours across a single day, leading to decreased likelihood of over- or under-reporting of agitated behaviours. However, observation on a single day does limit the ability to observe less common but potentially impactful behaviours; therefore, further research is needed to examine the test-retest reliability of the CMAI-O and establish the optimal periods over which to use the measure to most accurately capture the full extent and range of agitated behaviours. Future research should also extend the validation of the CMAI-O to confirm the factor structure of the measure amongst individuals experiencing higher levels of agitation and establish whether it is suitable for use in other settings, such as hospital wards, or with other populations such as individuals living with severe and enduring mental health problems, where agitation is a concern.
The CMAI-O offers a novel approach for evaluating the presence of behaviours associated with agitation amongst people with dementia in care homes (both residential and nursing) within research and practice. The CMAI-O may be particularly useful to include in research where the CMAI is the primary outcome, since it may provide a measure of potential performance or reporting bias from staff members who cannot be blinded to treatment arm. Moreover, the CMAI-O is a more suitable observational measure of agitation than the PAS or ABS because its items are directly aligned with the CMAI, and thus its results are complementary to those obtained on the CMAI. As such, future researchers will be able to confidently provide composite CMAI scores using information from both proxy-informant and researcher observations.

Conclusion
In conclusion, this initial validation of the CMAI-O has demonstrated that the tool is appropriate for use with people living with dementia in care homes. The tool provides a complementary or alternative measure to staff proxy-completed measures of agitation. This may be particularly valuable when evaluating care home interventions. It is hoped that the development of the CMAI-O leads to increased use of independent assessment to understand behaviours associated with agitation amongst people living with dementia in care homes, and to its use as an outcome measure to evaluate change in agitation.

Conflicts of interest
The authors declare no conflicts of interest.

Description of authors' roles
AWG: design of the study, data acquisition, data analysis, interpretation of data and drafting of this paper.
CPA: data acquisition, data analysis, interpretation of data and drafting of this paper.
NLB: data acquisition, data analysis, interpretation of data and drafting of this paper.
BC: data acquisition, data analysis, interpretation of data and drafting of this paper.
RW: data analysis, interpretation of data and drafting of this paper.
IH: data analysis, interpretation of data and drafting of this paper.
JS: data acquisition and drafting of this paper.
CAS: conception and design of the study, data acquisition, interpretation of data and drafting of this paper.

Funding
This project was funded by the National Institute for Health Research Health Technology Assessment programme (15/11/13). The views and opinions expressed therein are those of the authors and do not necessarily reflect those of the HTA, NIHR, NHS or the Department of Health and Social Care.