Multimorbidity Treatment Burden Questionnaire (MTBQ): Translation, Cultural Adaptation, and Validation in French-Canadian

Abstract
Reliable treatment burden measures are needed given the aging population and the associated increase in multimorbidity and polypharmacy. Treatment burden is defined as the effort to care for one’s health and the resulting impact on one’s daily life. This study aimed to translate the Multimorbidity Treatment Burden Questionnaire (MTBQ) for French-Canadians and assess its reliability and validity. The MTBQ was translated and tested with cognitive debriefing interviews, and the French version (MTBQ-F) was then administered 2 times among 105 participants. Reliability and validity were examined using the intra-class correlation coefficient (ICC), Cronbach’s alpha, and Spearman’s correlations. The median global MTBQ-F scores were 32.69 (interquartile range [IQR]: 21.15-48.08) and 30.77 (IQR: 21.15-46.15) for the first and second administrations, respectively. Test-retest (ICC: 0.73; 95% CI: 0.63-0.81) and internal consistency reliability (Cronbach’s alpha: 0.80) were good. There was a moderate positive correlation between the MTBQ-F score and the number of self-reported conditions (rho: 0.28). This valid instrument could identify patients experiencing a high treatment burden and assess the impact of interventions among them.


Introduction
Valid treatment burden measures for patients with multimorbidity are needed given the aging population and an associated increase in polypharmacy (Guthrie, Makubate, Hernandez-Santiago, & Dreischulte, 2015) and treatment burden (Sheehan et al., 2019). Treatment burden is defined as the "effort of looking after one's health and the impact that this has on everyday life". It includes taking complex medication regimens, coordinating health care appointments, self-monitoring, and making lifestyle changes. High treatment burden is associated with poor medication adherence, low quality of life, dementia, and depression (Duncan et al., 2018a; Tran et al., 2014). Multimorbidity is defined as the co-occurrence of two or more chronic diseases (Pefoyo et al., 2015) and is associated with high health care utilization and costs (Marengoni et al., 2011), nonadherence to treatment, adverse drug events (Guthrie et al., 2015), and poor quality of life (Fortin et al., 2004; Marengoni et al., 2011). In 2011-2012 in Canada, around seven per cent of people aged 35 to 49 reported having at least two chronic conditions. This proportion rises to 16 per cent and 31 per cent for people aged 55 to 64 and 65 and over, respectively (Roberts, Rao, Bennett, Loukine, & Jayaraman, 2015).
Being able to measure treatment burden adequately is crucial to identify patients experiencing a high treatment burden and to assess the impact of interventions designed to optimize treatment or reduce treatment burden. Treatment burden is often unknown and undetected by health care professionals who do not have valid tools to assess treatment burden (May, Montori, & Mair, 2009). From a clinical perspective, using such tools could also lead to a better understanding of areas of burden for patients. Given the significant consequences of a high treatment burden, interventions that could lessen the most common modifiable burdens could then be developed and implemented, at the individual or populational level. For instance, pharmacists could work with physicians and patients to optimize pharmacotherapies and, consequently, reduce the burden associated with a high number of different medications or daily doses (Samir Abdin, Grenier-Gosselin, & Guénette, 2020). Physicians could also lessen the burden associated with appointments, for example, by offering various time slots in the evenings and weekends or new ways to meet with the patients.
There are existing questionnaires (Boyd et al., 2014; Eton et al., 2017; Tran et al., 2012, 2014) that measure treatment burden, but all have significant limitations. The 15-item Treatment Burden Questionnaire (Tran et al., 2014) was developed and validated in France. Although test-retest (ICC: 0.77; 95% CI: 0.70-0.82) and internal consistency reliability (Cronbach's alpha: 0.90) were good, this questionnaire is not specific to patients with multimorbidity, has complicated wording, and is not culturally adapted to French-Canadians. Other measures of treatment burden have not been validated in French. The Patient Experience with Treatment and Self-management (PETS) is a comprehensive, 48-item questionnaire developed in the United States (Eton et al., 2017) and limited by its length. An abbreviated version (36 items) of the PETS has recently been created (Eton et al., 2020). Among multimorbid adults, this version demonstrated good internal consistency (Cronbach's alphas ≥ 0.80), evidence of known-groups validity and responsiveness to change, but is still relatively long. The Multimorbidity Illness Perceptions Scale (MULTIPleS) (Gibbons et al., 2013) and health care task difficulty (HCTD) (Boyd et al., 2014) questionnaires do not include all aspects of treatment burden.
The Multimorbidity Treatment Burden Questionnaire (MTBQ) is a brief, self-administered 10-item questionnaire (with three additional optional questions), validated in adults with multimorbidity in the United Kingdom (Duncan et al., 2018a). The MTBQ has demonstrated good content validity, construct validity, internal consistency reliability and responsiveness (Duncan et al., 2018a), and has been used in other studies (Friis, Lasgaard, Pedersen, Duncan, & Maindal, 2019; Mangin et al., 2020; McCarthy et al., 2020). This study aimed to translate and adapt the MTBQ to French-Canadians and to assess its validity.

Methods
The study comprised three steps conducted from December 2019 to September 2020: (a) translation of the MTBQ into French, (b) French-Canadian cultural adaptation and content validity assessment, and (c) reliability and construct validity assessment. Ethical approval was obtained from the local research ethics board (no. 2020-4959).
Step 1: French Translation
A back-and-forth translation (Eremenco, Cella, & Arnold, 2005; Reeve et al., 2013; Wild et al., 2005) was conducted. Two professional bilingual translators translated the instrument into French, and two others, blinded to the original English version of the MTBQ, back-translated the questionnaire into English. Six bilingual research team members independently read the translations, commented on them, and selected their preferred French version for each item. The developer of the original MTBQ was consulted. Four members of the research team then discussed all the comments and agreed on the version to retain for each item to create the French-Canadian version of the MTBQ (MTBQ-F).
Step 2: Cultural Adaptation and Content Validity Assessment
To assess item clarity, wording, and relevance for individuals with multimorbidity, the MTBQ-F was tested with a sample of Canadians who met the following eligibility criteria: (a) age ≥ 18 years, (b) prescribed ≥ three long-term medications, and (c) able to read and speak French. To capture a range of perspectives, we recruited individuals with various sociodemographic characteristics from community pharmacies in the Quebec City area (n = 6 individuals recruited through this means) and the research team's networks (n = 5). Two trained research assistants conducted in-depth cognitive debriefing interviews with participants, in person or remotely (using the Zoom platform because of coronavirus disease [COVID-19] pandemic restrictions). During these interviews, participants were invited to read the tool and freely give their comments aloud or by writing them on the tool. Interviewers were provided with key general questions (probes) that they could ask people to clarify their thoughts (e.g., What does this mean to you? What do you understand? Can you give me examples?). They could also present alternative terms that were discussed among the research team in step 1. Interviewers reviewed all instructions, questions, and answers with the participants. Such qualitative testing is recommended for evaluating the translation of a questionnaire. All participants received a CAD $50 compensation.
Interviews were audio-recorded, and a summary of the participants' comments was compiled. After an initial wave of six participants, the research team met to discuss the data and consulted the developer. The wording of some items that seemed less clear for participants was refined, and the resulting version of the MTBQ-F was further tested on a second wave of five participants. In wave 2, additional alternative terms that had been suggested by participants in the previous wave were presented during the interviews. A research assistant completed a synthesis grid following the interviews. These summaries were validated by a second research assistant who could listen to interview excerpts as needed. The research team met again and agreed on each item's wording based on the participants' comments compiled in the synthesis. The process ended after interpreting the second wave's data, as data became redundant.

Step 3: Reliability and Validity Assessment
We recruited participants across the province of Quebec. Inclusion criteria were the same as in step 2, but participants who had taken part in step 2 were not eligible to participate in step 3. Several recruitment methods were used across different organizations with the intent to vary the characteristics of potential participants. Forty-one community organizations offering services to patients with various chronic conditions shared invitation posters with their members using Facebook groups or their websites (n = 78 individuals recruited through this means). Invitation e-mails were also sent to the list of employees of a hospital centre (n = 10) and the list of members of a university (n = 12). Five persons were recruited because they heard about the study from someone close to them.
The MTBQ-F questions were administered to participants by telephone by the same trained research assistant at two time points, 1 month apart. This method was selected as we intend to use the instrument in a subsequent study performed among older people with cognitive impairment, and self-administration would have been challenging in this population. We continued recruiting participants until 101 test-retests were completed. This number is sufficient to estimate an expected intra-class correlation coefficient (ICC) of 0.7 with a 95 per cent confidence interval of 0.6 to 0.8. Participants received CAD $10 after the second completion.
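The source does not state which formula was used for the sample size calculation. As a minimal sketch, one commonly used approximation for the number of subjects needed to estimate an ICC with a confidence interval of a given total width (often attributed to Bonett) reproduces the target of 101 from the planning values given above: an expected ICC of 0.7, a 95 per cent interval from 0.6 to 0.8 (width 0.2), and two administrations. The function name and the choice of this approximation are assumptions made for illustration, not the authors' documented calculation.

```python
import math
from statistics import NormalDist


def icc_sample_size(rho, width, k=2, conf=0.95):
    """Approximate number of subjects so that the confidence interval
    around an expected ICC `rho` has total width `width`, with `k`
    measurements per subject (assumed approximation, see lead-in)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # 1.96 for a 95% interval
    n = (8 * z**2 * (1 - rho) ** 2 * (1 + (k - 1) * rho) ** 2
         / (k * (k - 1) * width**2)) + 1
    return math.ceil(n)


# Planning values taken from the text: ICC 0.7, interval 0.6 to 0.8, two time points
n_required = icc_sample_size(rho=0.7, width=0.2, k=2)
print(n_required)  # 101
```

A narrower interval or a lower expected ICC increases the required sample size quickly, which is why the planning values matter more than the formula itself.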

Analysis
We used descriptive statistics to describe the participants and the distribution of responses for each questionnaire item. Each item was scored as follows: 0 ("not difficult"/"does not apply"), 1 ("a little difficult"), 2 ("quite difficult"), 3 ("very difficult"), and 4 ("extremely difficult") (Duncan et al., 2018b). Floor and ceiling effects were deemed present if more than 15% of respondents answered "0" or "4" to an item, respectively (Dou, Huang, Duncan, & Guo, 2020). To calculate a global score ranging from 0 to 100, we computed each participant's average score and multiplied it by 25. Scores were interpreted as suggested by the authors of the original MTBQ instrument: "no burden" (score 0), "low burden" (< 10), "medium burden" (10-22), and "high burden" (> 22) (Duncan et al., 2018b). We examined test-retest reliability using the intra-class correlation coefficient, and it was deemed good (> 0.9) or acceptable (0.70-0.90), as suggested by Fayers and Machin (2000). Internal consistency was measured with Cronbach's alpha and deemed acceptable if > 0.70 (Nunnally & Bernstein, 1994). We explored the dimensional structure with factor analysis, more precisely using a principal component analysis. This statistical technique can be useful to identify unnecessary items and possible subscales. The original MTBQ has one dimension (treatment burden) but includes questions related to (a) medication taking, (b) pharmacy and physician visits, (c) health problem management and support, and (d) lifestyle. We tested the number of dimensions of the MTBQ-F with a scree plot test. Moreover, we looked at the number of factors to obtain a cumulative proportion of variance around 99 per cent and did a χ² test on the number of factors that should be retained. Finally, we measured the correlation between the MTBQ-F score and the number of self-reported long-term conditions, using Spearman's rank correlations, to assess construct validity.
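The scoring rules described above can be sketched as follows. This is an illustration only: the function names and the example responses are hypothetical, and the category boundaries are coded exactly as stated in the text (0, < 10, 10-22, > 22).

```python
def global_score(item_scores):
    """Mean of the item scores (each coded 0 to 4) multiplied by 25,
    giving a global score between 0 and 100."""
    return 25 * sum(item_scores) / len(item_scores)


def burden_category(score):
    """Interpretation thresholds as stated in the text."""
    if score == 0:
        return "no burden"
    if score < 10:
        return "low burden"
    if score <= 22:
        return "medium burden"
    return "high burden"


def floor_ceiling(item_responses, threshold=0.15):
    """Return (floor_effect, ceiling_effect) for one item: True when more
    than `threshold` of respondents answered 0 or 4, respectively."""
    n = len(item_responses)
    floor = sum(r == 0 for r in item_responses) / n
    ceiling = sum(r == 4 for r in item_responses) / n
    return floor > threshold, ceiling > threshold


# Hypothetical participant answering the 13 items (sum of item scores = 17)
participant = [1, 2, 0, 0, 1, 2, 3, 1, 0, 0, 2, 4, 1]
score = global_score(participant)
print(round(score, 2), burden_category(score))  # 32.69 high burden

# Hypothetical responses of ten participants to a single item
print(floor_ceiling([0, 0, 1, 2, 3, 4, 1, 1, 1, 1]))  # (True, False)
```

With 13 items, global scores can only take values that are multiples of 25/13 (about 1.92), which is why reported scores such as 32.69 and 30.77 fall on this grid.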

Results
Table 1 shows the characteristics of the 11 cognitive debriefing interview and 105 test-retest participants. They were mainly women (72.7% and 82.9%, respectively). The mean age was 71.2 ± 8.2 years for the cognitive interview and 54.3 ± 17.1 years for test-retest participants. Participants of the cognitive interviews reported having 2.7 ± 1.1 chronic conditions and using 6.4 ± 2.5 prescribed medications on average, while these figures were 3.8 ± 1.6 and 8.9 ± 5.0, respectively, for test-retest participants. The selection process for test-retest participants is presented in Figure 1. In total, 110 individuals were eligible to participate, but 4 were no longer interested after having received details on the study. Among the remaining 106 participants, 1 withdrew after completing the baseline questionnaire, and 105 completed both the baseline and follow-up questionnaires. The participant who withdrew from the study was not included in the analysis.

Cultural Adaptation and Content Validity Assessment
Participants understood the translation well, and most items applied to the French-Canadian culture and context. Some items were culturally adapted. Moreover, participants thought that the instrument was comprehensive and covered all aspects of the treatment burden. Supplemental Table 1 presents the original English MTBQ version and the different versions of the MTBQ-F tested during the cognitive interviews (waves 1 and 2). A summary of comments from participants, the team decisions and rationale, and the final version are also presented. For instance, the term "impacts" in the instructions was replaced by "consequences" between waves 1 and 2 as one participant of the first wave found it unclear. This modification was kept in the final version as all participants of the second wave found the instructions clear. Table 2 presents the distribution of responses for each item of the baseline MTBQ-F (13 items). Except for items 9 (31.4%) and 10 (36.2%), very few respondents said the questions did not apply to them, and there were no missing responses. There was a floor effect on all items except item 12 as the proportion of participants with a score of 0 (i.e., answering "not difficult" or "does not apply") was high. The floor effect is explained by a higher proportion of "not difficult" responses rather than "does not apply". Items 7, 9, and 12 also had a ceiling effect when applying the 15 per cent limit.

Internal Consistency and Dimensions of MTBQ-F
Internal consistency was good with a Cronbach's alpha of 0.80 for the global score, including all 13 items. It was slightly improved when removing item 3 or 4 (Table 3). The exploratory factor analysis on all 13 items suggests that the MTBQ-F has three dimensions (see Table 3). The first dimension relates to managing medication, care, and services from different professionals. The second is about self-management and social support, and the third refers to the burden of visiting health care professionals. Factor loading was significant (≥ 0.40) for all items but items 3 and 12.
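For readers who want to reproduce this type of analysis, Cronbach's alpha and the "alpha if item deleted" check can be sketched as below. This is a generic illustration using the standard formula, not the authors' code; the function names and the small data set (five hypothetical respondents, three items) are assumptions.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item
    scores: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows):
    """Alpha recomputed after dropping each item in turn."""
    k = len(rows[0])
    return [cronbach_alpha([row[:j] + row[j + 1:] for row in rows])
            for j in range(k)]


# Hypothetical data: items 1 and 2 agree perfectly, item 3 is noisier
responses = [
    [0, 0, 1],
    [1, 1, 0],
    [2, 2, 2],
    [3, 3, 1],
    [4, 4, 3],
]
print(round(cronbach_alpha(responses), 2))
print([round(a, 2) for a in alpha_if_item_deleted(responses)])
```

An item whose removal raises alpha, as reported above for items 3 and 4, contributes less to the shared variance of the scale than the other items.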
Finally, there was a moderate positive correlation between the MTBQ-F global score and the number of conditions (rho: 0.28; p < 0.01).
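Spearman's rank correlation, used here for construct validity, is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a generic standard-library illustration; the function names and example values are hypothetical and it does not compute the p-value reported above.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: number of conditions and global scores for six people
conditions = [1, 2, 2, 3, 4, 6]
scores = [15.38, 9.62, 32.69, 30.77, 21.15, 48.08]
print(round(spearman_rho(conditions, scores), 2))
```

Because the number of conditions takes few distinct values, ties are frequent in this kind of data, so the average-rank handling matters.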

Discussion
The 13-item MTBQ-F is an easy-to-understand and valid instrument to measure treatment burden among French-Canadians. It demonstrated good internal consistency (Cronbach's alpha = 0.80) and construct validity. Test-retest reliability was also good with an intra-class correlation coefficient of 0.73. Results suggest that this instrument can effectively assess treatment burden, identify patients with a high treatment burden, and guide health care providers in their interventions to reduce this burden. Although test-retest reliability was good, it was lower than what was observed for the Chinese version of the MTBQ (C-MTBQ) (ICC: 0.94) (Dou et al., 2020). There are two possible explanations for this difference. First, we readministered the questionnaire after 1 month, while in the Chinese study, the delay between the two tests was 2 weeks. Some participants may have experienced a real change in their treatment burden between the two questionnaires. Second, we performed this study during the first wave of the COVID-19 pandemic, and several restrictions were implemented and changed during this time. This context might have influenced the perceived treatment burden for our participants, and this perception might have fluctuated, depending on the restrictions in place at the time that the participants responded.
Internal consistency was good (Cronbach's alpha: 0.80) and similar to the original English MTBQ (0.83) and the C-MTBQ (0.76). As for the C-MTBQ (Dou et al., 2020), factor analyses suggested more than one dimension in the MTBQ-F. Analyses also suggest that some items might not be relevant to measure treatment burden in our population, either because factor loading was not significant (items 3 and 12) or because a large proportion of respondents answered that it did not apply to them (items 9 and 10). Items 3 (paying for medication and equipment), 9 (getting health care in the evening and at weekends), and 10 (getting help from community services) were also found not relevant for the U.K. population with 76 per cent, 41 per cent, and 50 per cent, respectively, of "does not apply" answers (Duncan et al., 2018a). Thus, one could choose to include or not include these items, depending on the population and on the health care systems, drug plans, and organization of care in place. For instance, the burden of paying for medication and equipment might be particularly important in populations not covered by health insurance. If in doubt or if time is not limited, all items should be included.
We found a moderate and positive correlation between the MTBQ-F global score and the number of chronic conditions (rho: 0.28), which is similar to 0.32 reported by Duncan et al. (2018a). This finding suggests a good construct validity of the MTBQ-F, as treatment burden is likely to increase with the number of chronic conditions. The fact that the correlation was moderate also confirms that it measures more than the number of chronic conditions alone.
The elevated median score (32.69) and the large proportion of participants found to have a high treatment burden (73.3%) indicate that treatment burden was substantial in our study compared to others (Dou et al., 2020; Duncan et al., 2018a; Salisbury et al., 2018). By comparison, 26.6 per cent and 45.5 per cent of participants from the U.K. and China MTBQ validation studies, respectively, were found to have a high treatment burden (Dou et al., 2020; Duncan et al., 2018a). High perceived treatment burden is associated with being young (Duncan et al., 2018a; Tran et al., 2012) and female (Duncan et al., 2018a), two prevalent characteristics in our study.
This study has several strengths. First, the translation process was done according to the required standards, and several cognitive interviews and discussions between co-authors enabled the wording to be adapted to suit the French-Canadian culture and context. Second, cognitive interviews were performed in a diverse sample of older adults, ensuring the terms used were simple and understood by older and less educated people. Third, we validated the MTBQ-F with a large sample at two time points. Participants' sociodemographic characteristics such as chronic disease, income, education, age, and region of residency were highly diversified, so our results may be generalizable to a diverse French-Canadian population with multimorbidity.
However, this study also has some limitations. The original MTBQ is a self-administered questionnaire, but, in the current study, a research assistant administered it by telephone, which may have influenced the participants' results (Bowling, 2005). However, a friendly interviewer can help the participants with their responses and increase item response rates (Bowling, 2005), which may explain why no data were missing in the present study. Although the absence of missing data is positive, our results might not apply to a self-administration of the MTBQ-F. Also, we used convenience sampling, targeting patient websites and Facebook groups, and sending e-mail invitations. These methods were selected in accordance with COVID-19 pandemic restrictions. Doing so, we may have introduced a selection bias since only people who had access to the Internet could participate (Frippiat & Marquis, 2010) and Internet users are usually younger (Bernier, 2017). The employees of the hospital centre and the members of the university were probably more educated than the general population. People involved with community organizations offering services related to various chronic conditions may have distinct characteristics associated with treatment burden. Moreover, people experiencing a high treatment burden might have been more interested in participating, which may explain the high treatment burden rates observed in our study. In addition, although we recruited participants for whom at least three medications had been prescribed as a proxy for multimorbidity, six participants reported having only one chronic condition. A sensitivity analysis excluding these participants did not change study results (not shown). Finally, as briefly discussed, the COVID-19 pandemic is also part of the limitations of this study. Indeed, at the time of data collection, many participants had not seen their doctor recently because of the pandemic and their outside activities and contacts were very limited. The disruption to the health care system and feelings of loneliness and isolation may have exacerbated the stress of their chronic condition and increased the burden of treatment (Comité en Prévention et Promotion de l'Institut National de Santé Publique du Québec, 2020). In times of a pandemic, it is more difficult to see a doctor and access "non-emergency" health care services, so these events may have influenced participants' responses. With the gradual recovery of health services, it would be interesting to reassess the treatment burden of participants to see whether it has decreased as the pandemic subsides.

Conclusion
The French-Canadian MTBQ (MTBQ-F) demonstrated good internal consistency, test-retest reliability, and construct validity. This brief questionnaire could be a valuable tool to assess the impact of interventions designed to optimize treatment and reduce treatment burden. Clinicians could also use the MTBQ-F to identify patients experiencing a high treatment burden.

Note (Table 3): Factor loading values of 0.40 or greater are in bold and regarded as significant.