Relative validity of a short screener to assess diet quality in patients with severe obesity before and after bariatric surgery

Objective: To determine the relative validity and reproducibility of the Eetscore FFQ, a short screener for assessing diet quality, in patients with (severe) obesity before and after bariatric surgery (BS). Design: The Eetscore FFQ was evaluated against 3-d food records (3d-FR) before (T0) and 6 months after BS (T6) by comparing index scores of the Dutch Healthy Diet index 2015 (DHD2015-index). Relative validity was assessed using paired t tests, Kendall’s tau-b correlation coefficients (τb), cross-classification by tertiles, weighted kappa values (κw) and Bland–Altman plots. Reproducibility of the Eetscore FFQ was assessed using intraclass correlation coefficients (ICC). Setting: Regional hospital, the Netherlands. Participants: One hundred and forty participants with obesity who were scheduled for BS. Results: At T0, the mean total DHD2015-index score derived from the Eetscore FFQ was 10·2 points higher than the food record-derived score (P < 0·001) and showed an acceptable correlation (τb = 0·42, 95 % CI: 0·27, 0·55). There was fair agreement, with a correct classification of 50 % (κw = 0·37, 95 % CI: 0·25, 0·49). Correlation coefficients of the individual DHD components varied from 0·01 to 0·54. Similar results were observed at T6 (τb = 0·31, 95 % CI: 0·12, 0·48; correct classification of 43·7 %; κw = 0·25, 95 % CI: 0·11, 0·40). Reproducibility of the Eetscore FFQ was good (ICC = 0·78, 95 % CI: 0·69, 0·84). Conclusion: The Eetscore FFQ was acceptably correlated with the DHD2015-index derived from 3d-FR, but absolute agreement was poor. Considering the need for dietary assessment methods that reduce the burden for patients, practitioners and researchers, the Eetscore FFQ can be used for ranking according to diet quality and for monitoring changes over time.

using data from multiple food records, 24-h dietary recalls or a single FFQ. Unfortunately, these methods are time-consuming and burdensome and therefore less likely to be used in everyday clinical practice. For this reason, a short screener, the Eetscore FFQ, was developed to estimate the DHD2015-index in time-limited situations. The Eetscore FFQ was acceptably correlated with the DHD2015-index derived from a full-length FFQ in a normal-weight adult population (15). However, the Eetscore FFQ has not been evaluated in patients with (severe) obesity before or after undergoing BS.
Accurate measures of diet quality are needed to optimise nutritional care provided to these patients during the BS programme, but validated dietary assessment tools in this specific population are lacking (16). Therefore, this study aimed to evaluate the relative validity and reproducibility of the Eetscore FFQ as a screener for diet quality in patients with (severe) obesity before and 6 months after BS.

Study design and participants
Between October 2018 and September 2019, patients with obesity who were eligible and scheduled for BS at Vitalys Obesity Clinic, part of Rijnstate hospital (Arnhem, the Netherlands), were asked to participate in this prospective cohort study. Participants were included approximately 6 weeks pre-surgery (T0) and followed up until 6 months post-surgery (T6). Exclusion criteria were a non-Dutch eating pattern, suffering from an eating disorder, inability to fill in questionnaires or food records and a previous bariatric procedure other than an adjustable gastric band.
In total, 200 participants signed the informed consent and were included in the study. Both before and after BS, we evaluated the Eetscore FFQ against 3-d food records (3d-FR) as reference method by comparing index scores of the DHD2015-index derived from both methods. At both time points, demographic information was collected and participants were asked to complete the Eetscore FFQ, followed by a 3d-FR as reference method. At T0, the Eetscore FFQ was completed twice (Eetscore FFQ1, Eetscore FFQ2) with an interval of approximately 5 weeks in order to analyse reproducibility.
From the total study sample of 200 participants, we excluded 60 participants with no Eetscore FFQ and 3d-FR (n 18), a missing Eetscore FFQ (n 5) or a missing/incomplete 3d-FR (n 37) at T0. The final study sample for data analysis at T0 consisted of 140 participants, of whom 116 completed both Eetscore FFQ1 and Eetscore FFQ2 (Fig. 1). For the study sample at T6, we additionally excluded 37 participants with no Eetscore FFQ and 3d-FR (n 22), a missing Eetscore FFQ (n 4) or a missing 3d-FR (n 11) at T6, resulting in a final study sample of 103 participants for data analysis at T6 (Fig. 1).

Data collection
Demographic information
Socio-demographic (age, sex and educational level) and health-related information (anthropometrics, type of surgery, comorbidities and smoking status) were obtained from electronic patient records. Educational level was defined as low (primary education and prevocational secondary education), medium (senior general secondary education, pre-university education and secondary vocational education) or high (higher vocational education and university). Anthropometric measurements were performed during standard visits at the hospital. Body weight was measured to the nearest 0·1 kg with a digital weighing scale (Tanita BC-420MA), after removal of heavy clothing and shoes. Height was measured in standing position with a wall-mounted stadiometer (Seca 206). BMI was calculated as weight (kg) divided by squared height (m²). Total body weight loss (TBWL) at 6 months was calculated as weight loss divided by body weight before surgery, multiplied by 100 %.
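As a brief illustration of the two anthropometric formulas above, the following sketch computes BMI and percentage total body weight loss. The function names and the example participant are hypothetical and are not taken from the study data.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by squared height (m^2)."""
    return weight_kg / height_m ** 2


def tbwl_percent(weight_before_kg: float, weight_after_kg: float) -> float:
    """Weight loss as a percentage of body weight before surgery."""
    return (weight_before_kg - weight_after_kg) / weight_before_kg * 100


# Hypothetical participant: 125.0 kg and 1.70 m before surgery,
# 95.0 kg at 6 months after surgery.
print(round(bmi(125.0, 1.70), 1))           # 43.3
print(round(tbwl_percent(125.0, 95.0), 1))  # 24.0
```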
Physical activity at T0 was assessed with the validated Baecke Questionnaire (17) that evaluates a person's habitual physical activity and separates it into three domains: work index, sports index and leisure index. Each domain could receive a score from 1 to 5 points, resulting in a total score ranging from 3 to 15. A score of 15 indicates being physically active at a high intensity.

DHD2015-index
The development of the DHD2015-index has been previously described (13). The DHD2015-index consists of fifteen components representing the Dutch food-based dietary guidelines of 2015 (14): vegetables, fruit, wholegrain products, legumes, nuts, dairy, fish, tea, fats and oils, coffee, red meat, processed meat, sweetened beverages, alcohol and Na. Additionally, the component 'unhealthy food choices' was added based on the guideline of the Netherlands Nutrition Centre (18). Food items that contributed most to total energy, saturated fat, and mono- and disaccharide intake according to the Dutch National Food Consumption Survey (DNFCS) 2007-2010 were included in this component, such as sweet spreads, pastries, chocolate, savoury snacks, sauces and use of sugar in coffee or tea.
A complete overview of the sixteen components and their cut-off and threshold values is presented in Table 1. For every component, the score ranges from 0 (no adherence) to 10 points (complete adherence), resulting in a total score between 0 and 160 points. A graphic presentation of the scoring of the different types of components can be seen in Supplemental Fig. 1. For adequacy components (vegetables, fruit, legumes, nuts, fish and tea), no intake is awarded 0 points and intakes between the cut-off and threshold value are scored proportionally. For moderation components (red meat, processed meat, sweetened beverages, Na, alcohol and unhealthy food choices), intakes between the cut-off and threshold value are also scored proportionally, but no intake is awarded 10 points. Optimum components (dairy) have an optimal range of intake, and ratio components (fats and oils) reflect replacement of less preferred foods (e.g. solid fats) by more preferred foods (e.g. liquid fats and oils). The wholegrain product component is scored based on two sub-components: an adequacy component for wholegrain consumption and a ratio component to reflect replacement of refined grain products by wholegrain products. The coffee component is a qualitative component, based on the type of coffee (filtered v. unfiltered). As information on the type of coffee used was not available from the food records, this component could not be included in the validity analyses. For this reason, the total score ranged between 0 and 150 for this part of the study.
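The proportional scoring of adequacy and moderation components can be sketched as follows. This is a minimal illustration only: the actual cut-off and threshold values are those in Table 1, which is not reproduced here, so the numbers below (200 g/d for an adequacy component, 45 and 100 g/d for a moderation component) are hypothetical placeholders, as are the function names.

```python
def adequacy_score(intake: float, full_score_intake: float) -> float:
    """Adequacy component: 0 points at no intake, rising proportionally
    to 10 points once intake reaches full_score_intake."""
    return 10.0 * min(max(intake, 0.0) / full_score_intake, 1.0)


def moderation_score(intake: float, full_score_intake: float,
                     zero_score_intake: float) -> float:
    """Moderation component: 10 points at or below full_score_intake,
    0 points at or above zero_score_intake, proportional in between."""
    if intake <= full_score_intake:
        return 10.0
    if intake >= zero_score_intake:
        return 0.0
    span = zero_score_intake - full_score_intake
    return 10.0 * (zero_score_intake - intake) / span


# Hypothetical values in g/d (NOT the values from Table 1).
vegetables = adequacy_score(150, full_score_intake=200)
red_meat = moderation_score(60, full_score_intake=45, zero_score_intake=100)

print(vegetables)                       # 7.5
print(round(red_meat, 1))               # 7.3
print(round(vegetables + red_meat, 1))  # 14.8
```

The total index score is the sum of the component scores; optimum, ratio and qualitative components follow other rules and are not covered by this sketch.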
The Eetscore FFQ
The development of the Eetscore FFQ has been described in detail elsewhere (15). Briefly, the Eetscore FFQ was developed to assess the DHD2015-index as a measure of adherence to the Dutch food-based dietary guidelines. The Eetscore FFQ assesses dietary intake over the previous month, based on fifty-five food items that account for 85 % of energy intake from the adult population of the DNFCS 2007-2010 (19). The six answer categories for questions on frequency of consumption range from 'never' to 'every day' for regularly consumed foods and from 'not this month' to '4 times a month' for episodically consumed foods. Portion sizes are assessed in standard portions and commonly used household measures. Average daily intakes of food items are calculated by multiplying frequency of consumption by portion size in grams. The Eetscore FFQ directly reports index scores of the sixteen components of the DHD2015-index.
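The frequency-by-portion calculation described above can be illustrated with a short sketch. The conversion of answer categories to numbers is not specified here, so the function names, the 30-day month and the example answers are all assumptions made for illustration.

```python
def daily_intake_weekly_item(days_per_week: float, portions_per_day: float,
                             portion_g: float) -> float:
    """Average daily intake (g/d) of a regularly consumed food."""
    return days_per_week * portions_per_day * portion_g / 7


def daily_intake_monthly_item(times_per_month: float, portions_per_time: float,
                              portion_g: float, days_in_month: int = 30) -> float:
    """Average daily intake (g/d) of an episodically consumed food.
    The 30-day month is an assumption of this sketch."""
    return times_per_month * portions_per_time * portion_g / days_in_month


# Hypothetical answers: fruit on 5 days a week, 2 portions of 140 g each day;
# fish twice a month, 1 portion of 120 g each time.
print(daily_intake_weekly_item(5, 2, 140))   # 200.0
print(daily_intake_monthly_item(2, 1, 120))  # 8.0
```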
Three-day food records
A 3-d estimated food record was used as the reference method. This method is considered acceptable for the assessment of usual dietary intake and is commonly used in dietary validation studies (20). We used structured open-ended food records containing predefined food groups (including the option 'others') at six food occasions (breakfast, lunch, dinner and three eating occasions between the main meals). All participants received verbal instructions and were provided with a written example. They were asked to record all foods and beverages consumed over the 3 d in as much detail as possible, to describe the amounts consumed in units, household measures or provide weights when known, to report cooking methods and to include the recipes for any mixed dishes. At both time points, recorded days were randomly selected and consisted of 2 weekdays (Monday-Thursday) and 1 weekend day (Friday-Sunday) within a 1-week period. Completed food records were reviewed by the researcher for completeness with regard to portion sizes, cooking methods and description of foods. Telephone interviews with the participants were conducted in case of any uncertainties. Dietary intake data were entered in Compl-eat™, a computer-based nutrition calculation programme that is linked to the Dutch Food Composition Database (NEVO-online, version 2016) (21). All foods and beverages from the food records were categorised into one of the fifteen DHD components (excluding coffee) to calculate the scores of the DHD2015-index. In case of missing recipes for mixed meals such as pasta or rice dishes, standard recipes of the Dutch Food Composition Database (NEVO-online, version 2016) were used (21). Food items that did not fall into one of the DHD components (e.g. potatoes and soups) were not included. Total dietary intake of the fifteen DHD components in grams was averaged over the number of completed days before calculating corresponding index scores.

Statistical analysis
General characteristics of the study population are reported as medians and interquartile ranges (Q1-Q3) for continuous data and as frequencies and percentages for categorical data. Total DHD2015-index score and individual component scores calculated from the Eetscore FFQ and the 3d-FR are presented as means and standard deviations.
Relative validity of the Eetscore FFQ compared to the 3d-FR was assessed by calculating Kendall's tau-b (τb) as well as Spearman's rho (ρ) correlation coefficients between the DHD index scores derived from both methods. At T0, we used data of the Eetscore FFQ that was completed in the same month as the 3d-FR. CI for the correlations were obtained using Fisher's z-transformation. Correlation coefficients less than 0·20 were classified as poor, 0·20-0·49 as acceptable and ≥ 0·50 as good (22). Additionally, total DHD2015-index scores derived from the Eetscore FFQ and the 3d-FR were categorised into tertiles. If ≥ 50 % of the participants were classified into the same tertile and/or ≤ 10 % into the opposite tertile, this was considered a good outcome (22). Weighted kappa coefficients (κw) were calculated to further evaluate the relative level of agreement. κw coefficients less than 0·20 indicated a poor level of agreement, 0·21-0·40 fair agreement, 0·41-0·60 moderate agreement, 0·61-0·80 good agreement and greater than 0·80 a very good level of agreement (23). Paired t tests were used to test the mean differences in the DHD index scores between the two methods. Bland-Altman plots with 95 % limits of agreement were used to visualise the differences in the total DHD2015-index score.
We additionally explored the degree of potential misreporting of dietary intake by comparing reported energy intake calculated from the food records at T0 with energy requirements as identified by the revised Goldberg cut-off method (24). BMR was estimated using the Mifflin-St Jeor equation (25) as this method provides the best estimation in individuals with (severe) obesity (26–28). We used a physical activity level of 1·55, reflecting a moderately active lifestyle that was in line with the median physical activity score resulting from the Baecke Questionnaire.
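The correlation and Bland-Altman steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not the SPSS procedure used in the study: the standard error 1/√(n − 3) is the usual Fisher form for a correlation coefficient and is assumed here for τb, and all function names and example scores are hypothetical. With that standard error, the interval reported in the abstract for T0 (τb = 0·42, n = 140) is reproduced after rounding.

```python
import math
from statistics import NormalDist, mean, stdev


def kendall_tau_b(x, y):
    """Kendall's tau-b from all pairwise comparisons (O(n^2))."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n_pairs = n * (n - 1) / 2
    return (concordant - discordant) / math.sqrt(
        (n_pairs - ties_x) * (n_pairs - ties_y))


def fisher_z_ci(r, n, level=0.95):
    """CI for a correlation via Fisher's z-transformation.
    Assumes the standard error 1 / sqrt(n - 3)."""
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def bland_altman(method_a, method_b):
    """Mean difference (bias) and 95 % limits of agreement."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


print(round(kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3]), 2))  # 0.8

low, high = fisher_z_ci(0.42, 140)
print(round(low, 2), round(high, 2))  # 0.27 0.55

# Hypothetical total scores from the screener and the food records.
bias, lower, upper = bland_altman([100, 90, 110, 95], [90, 85, 95, 90])
print(round(bias, 2), round(lower, 2), round(upper, 2))  # 8.75 -0.63 18.13
```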
All analyses were conducted using SPSS Statistics 25.0 (IBM).

Misreporting
According to the revised Goldberg cut-off method, 57·1 % of the participants were classified as potential under-reporters of energy intake at T0 and 58·3 % at T6. We did not identify potential over-reporters of energy intake. Excluding potential misreporters did not markedly affect our results regarding the relative validity of the Eetscore FFQ at both time points (see online supplementary material, Supplemental Table 2a and b).
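The classification reported above can be sketched as follows, assuming the usual form of the revised Goldberg cut-off for a single individual and the Mifflin-St Jeor equation in kcal/d. The coefficients of variation (23 %, 8·5 % and 15 %) are values commonly cited with this method and are an assumption of this sketch, since the values used in the study are not stated here; the function names and the example participant are hypothetical.

```python
import math


def mifflin_st_jeor_bmr(weight_kg, height_cm, age_years, sex):
    """Estimated basal metabolic rate in kcal/d (Mifflin-St Jeor equation)."""
    base = 10 * weight_kg + 6.25 * height_cm - 5 * age_years
    return base + 5 if sex == "male" else base - 161


def goldberg_cutoffs(pal=1.55, days=3, n=1,
                     cv_intake=23.0, cv_bmr=8.5, cv_pal=15.0, sd=2.0):
    """Lower and upper limits for the ratio of reported energy intake to BMR.
    The CV defaults are commonly cited values, assumed for this sketch."""
    s = math.sqrt(cv_intake ** 2 / days + cv_bmr ** 2 + cv_pal ** 2)
    factor = math.exp(sd * (s / 100) / math.sqrt(n))
    return pal / factor, pal * factor


def classify_reporter(energy_intake_kcal, bmr_kcal, cutoffs):
    """Label a participant by the reported intake to BMR ratio."""
    lower, upper = cutoffs
    ratio = energy_intake_kcal / bmr_kcal
    if ratio < lower:
        return "under-reporter"
    if ratio > upper:
        return "over-reporter"
    return "plausible reporter"


# Hypothetical participant: woman, 120 kg, 168 cm, 45 years,
# reporting 1700 kcal/d averaged over 3 recorded days.
bmr = mifflin_st_jeor_bmr(120, 168, 45, "female")
cutoffs = goldberg_cutoffs(pal=1.55, days=3)

print(bmr)                                    # 1864.0
print(classify_reporter(1700, bmr, cutoffs))  # under-reporter
```

With these assumed coefficients, the lower limit for a 3-d record at a physical activity level of 1·55 is close to 1·0, so any participant reporting less energy than their estimated BMR would be flagged.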

Discussion
In this study, we determined the relative validity and reproducibility of the Eetscore FFQ as a screener for diet quality in patients with (severe) obesity before and after BS by comparing index scores of the DHD2015-index derived from the Eetscore FFQ to the scores derived from 3d-FR (reference method). We demonstrated an overall reasonable relative agreement between the two methods, although the Eetscore FFQ showed higher index scores in comparison with the 3d-FR and absolute agreement between the two methods was poor. Correlation coefficients for the DHD component scores varied widely with best coefficients observed for fruit and tea, and worst for legumes and red meat. Reproducibility of the Eetscore FFQ was considered good. We observed lower correlations for the total DHD2015-index score based on fifteen components (excluding coffee) between the Eetscore FFQ and 3d-FR than reported in the study of de Rijk et al., who compared the Eetscore FFQ to a full-length FFQ (15). They reported a Kendall's tau-b coefficient of 0·51 (95 % CI: 0·47, 0·55) for the total DHD2015-index score based on thirteen DHD components (excluding fish, fats and oils, and coffee). This could be explained by a difference in the number of DHD components included in the total score as well as a difference in reference method. The Eetscore FFQ is also an FFQ; therefore, more correlated errors might be expected with a full-length FFQ, resulting in higher correlations. Yet, a full-length FFQ might capture habitual dietary intake more accurately than three food records. Although all days of the week were equally represented across all records, foods that are not consumed on a daily basis, for example fish or legumes, could have been underestimated when recording only 3 d. This is also reflected in relatively large absolute differences for these components.
It has been suggested that when dietary methods assessing habitual dietary intake, such as the Eetscore FFQ, are validated against food records, a certain degree of disagreement can be expected due to the greater within-subject variations that occur over the shorter reference period of a food record (20). In a study by Papadaki et al., a Pearson's correlation coefficient of 0·52 was observed when comparing the English version of the 'Mediterranean Diet Adherence Screener' to 3d-FR in patients with high cardiovascular risk in the UK (30). Schröder et al. found a Pearson's correlation coefficient of 0·61 when they compared the 'Diet Quality Index' derived from the 'Short Diet Quality Screener' to ten 24-h dietary recalls in a Spanish population (31). In the same study, they also observed a correlation of 0·40 for the 'Modified Mediterranean Diet Score' derived from the 'Brief Mediterranean Diet Screener' compared with the score derived from ten 24-h dietary recalls (31). These values are comparable to the Spearman's rho correlations observed in the current study (ρ = 0·60, 95 % CI: 0·47, 0·70 at T0 and ρ = 0·44, 95 % CI: 0·26, 0·59 at T6).
In contrast to the findings on relative agreement, absolute agreement between the Eetscore FFQ and the 3d-FR was poor. According to the Bland-Altman plots, the Eetscore FFQ systematically overestimated the total DHD2015-index score compared to the 3d-FR at both time points with relatively wide limits of agreement. However, no significant proportional bias was observed. This is in line with other studies that also found higher mean index scores derived from a diet screener in comparison with food records (15,30–32). Like most FFQ, the Eetscore FFQ can be considered more appropriate for ranking patients according to their diet quality or monitoring relative differences over time, rather than assessing absolute individual scores. It is however important to note that a food record is not a gold-standard reference method either and has its own limitations with regard to assessing dietary intake. Furthermore, we evaluated the intake of food groups rather than nutrients, which is more difficult because of the high day-to-day variation. This may have impacted our findings with respect to the poor absolute agreement between the two methods.
With regard to the individual DHD components, correlations varied widely with highest values found for fruit and tea, and lowest values for legumes and red meat. For legumes, we observed many participants with an extreme difference of 10 points between the index score derived from the Eetscore FFQ compared to the food record-derived score, meaning that these participants had a score of 10 for legumes according to the Eetscore FFQ, whereas their score was 0 based on the food records. This resulted in large mean differences for this component (5·7 v. 0·8 points at T0 and 5·5 v. 1·4 points at T6, P < 0·001). This could be due to the fact that food records might not accurately capture habitual dietary intake, especially for foods that are not consumed on a daily basis such as legumes, as mentioned earlier. This is in concordance with an Australian study (age ≥ 70) validating a six-item dietary screener against three 24-h dietary recalls that also observed a poor agreement for legume intake (κw = 0·12) (33).
For red meat, we observed poor correlations of < 0·20 at both time points, whereas mean index scores for this component were fairly similar between the two methods (8·7 v. 8·9 points at T0 and 9·5 v. 9·2 points at T6, P > 0·05). This might be explained by a low variation in the index scores for red meat. Over half of the participants scored 10 points based on the Eetscore FFQ as well as the 3d-FR. As a result, the few observations with (relatively) large differences in index score could have biased the correlation towards zero.
We also aimed to identify participants who substantially under- or over-reported their dietary intake by using the revised Goldberg cut-off method in which energy intake is compared with (estimated) energy expenditure. However, adequately estimating energy expenditure in subjects with (severe) obesity is challenging. In a study by Cancello et al. (26), predictive equations for resting energy expenditure were compared to indirect calorimetry in 4247 subjects with obesity (69 % women, mean age 48 ± 19 years, mean BMI 44 ± 7 kg/m²). The authors found that the Mifflin-St Jeor equation had the highest performance for both accuracy and bias but emphasised that the accuracy is still far from ideal (26). Furthermore, the revised Goldberg cut-off method cannot be applied after BS as the condition of weight stability is violated, resulting in an invalid ratio between reported energy intake and energy requirement. We therefore assumed that participants who were identified as potential misreporters of dietary intake at T0 also misreported their intake at T6.
At both time points, the rate of potential misreporters was relatively high with 57·1 % of the study population potentially under-reporting their dietary intake at T0 and 58·3 % at T6. According to a review by Poslusna et al., the percentage of under-reporters in studies using estimated food records ranged from 12 to 44 % (34), which is lower than the observed percentages in the present study. The higher rate in our study is in line with previous research showing that a higher BMI is associated with under-reporting of dietary intake (35).
Overall, excluding potential misreporters did not markedly affect our results, although caution is needed in the interpretation because of the aforementioned limitations in the use of the Goldberg cut-off method within this population.
Reproducibility of the Eetscore FFQ before surgery was considered good. The observed ICC of 0·78 was slightly lower than reported in previous research by de Rijk et al., who found an ICC of 0·91 for the total DHD2015-index score (15). This could be due to a difference in study population as well as the multidisciplinary lifestyle programme that all participants started before undergoing BS. During this programme, patients received general information on healthy eating behaviour and dietary counselling. For most participants, the first Eetscore FFQ was administered before entering the multidisciplinary programme while they completed the second Eetscore FFQ during the programme. It is therefore plausible that participants already implemented beneficial changes with respect to their diet. This might explain the slightly higher DHD2015-index score resulting from the second Eetscore FFQ. Future studies are needed to confirm our findings while limiting the influence of such external factors. For the individual DHD components, most correlation coefficients ranged between 0·5 and 0·7, which is common in reproducibility studies of FFQ (20).
Dietary assessment is an important component in the BS programme. Currently, dietary intake of patients undergoing BS is often assessed by a dietitian with the use of food records. This assessment method is very time-consuming, might be prone to reactivity and recall bias and only reflects the intake of the past days. The Eetscore FFQ is a short, web-based tool that can be used to assess general aspects of a healthy nutrient-dense diet such as the consumption of fruits and vegetables, wholegrains and dairy. However, the Eetscore FFQ does not include additional information about patients' eating behaviour including the distribution of food intake (e.g. few large meals or frequent smaller feedings) and the separation of food and beverages. Also, other factors affecting dietary intake may be missed by the Eetscore FFQ, such as food preparation methods and non-included food items (e.g. plant-based dairy, meat substitutes and fast food).
The Eetscore FFQ can therefore be used as an additional dietary assessment tool in the BS programme rather than as a replacement for the current methodology.
Considering the need for dietary assessment methods that reduce the burden for patients, practitioners and researchers, the Eetscore FFQ can be used for ranking patients according to diet quality and for monitoring relative changes in intake over time in order to indicate an improvement or a deterioration in diet quality. This can be relevant before undergoing surgery, during annual follow-up in the late post-operative phase or in case of weight regain. Dietary assessment methods assessing actual intake may be preferred in the early postoperative phase when patients are still adapting to the new eating habits and in case of food-related complaints such as dumping syndrome or hypoglycaemia.
The main strength of this study is the validation of an existing dietary assessment tool in patients with (severe) obesity before and after BS as there is a clear lack of validated, easy-to-use tools within this patient population. Another strength is the use of multiple statistical tests to provide a comprehensive insight into various facets of validity. As Kendall's tau-b correlation coefficients tend to be smaller, we also reported Spearman's rho correlations to allow for comparison with other research. Furthermore, by choosing 3d-FR as reference method, we minimised the risk of correlated measurement errors between the two methods (20).
We aimed to determine the relative validity of the Eetscore FFQ both before and after BS, but thirty-seven participants dropped out between T0 and T6, resulting in two different study populations. We are aware that the study populations at T0 and T6 are therefore not mutually exclusive and that direct comparisons between the populations cannot be made. Nonetheless, both populations and the dropouts were similar with respect to sex, age, BMI, smoking status, education, physical activity, prevalence of comorbidities and type of surgery. Moreover, the study populations at both T0 and T6 were found to be representative of the general Dutch bariatric patient population (36), indicating a minor risk of selection bias.
Another limitation is the lack of a gold-standard reference method for dietary intake. To reduce participant burden, we chose 3d-FR using household measures, which are prone to reporting bias and are not ideal for foods that are not consumed daily. For future research, we suggest evaluating the Eetscore FFQ against dietary biomarkers that are suitable for patients after BS to provide an objective measure of dietary intake.

Conclusion
The Eetscore FFQ is a short screener of diet quality that assesses adherence to the Dutch dietary guidelines. Based on our findings, the Eetscore FFQ was considered an acceptable screener for ranking individuals according to their diet quality and showed good reproducibility to monitor relative changes in diet quality over time. However, the tool showed poor absolute agreement and is not suitable for assessing diet quality at the individual level. Future research is needed to improve the use of the Eetscore FFQ for this purpose.