Penn Healthy Diet survey: pilot validation and scoring

Abstract
Though diet quality is widely recognised as linked to risk of chronic disease, health systems have been challenged to find a user-friendly, efficient way to obtain information about diet. The Penn Healthy Diet (PHD) survey was designed to fill this void. The purposes of this pilot project were to assess the patient experience with the PHD, to validate the accuracy of the PHD against related items in a diet recall and to explore scoring algorithms with relationship to the Healthy Eating Index (HEI)-2015 computed from the recall data. A convenience sample of participants in the Penn Health BioBank was surveyed with the PHD, the Automated Self-Administered 24-hour recall (ASA24) and experience questions. Kappa scores and Spearman correlations were used to compare related questions in the PHD to the ASA24. Numerical scoring, regression tree and weighted regressions were computed for scoring. Participants assessed the PHD as easy to use and were willing to repeat the survey at least annually. The three scoring algorithms were strongly associated with HEI-2015 scores using National Health and Nutrition Examination Survey 2017–2018 data from which the PHD was developed and moderately associated with the pilot replication data. The PHD is acceptable to participants and at least moderately correlated with the HEI-2015. Further validation in a larger sample will enable the selection of the strongest scoring approach.

Dietary patterns of food intake have been linked to the most common noncommunicable diseases in developed countries (1–5). Most clinical care utilises electronic medical record systems that represent a trove of information that can be leveraged for quality improvement and research. Unfortunately, many electronic medical records contain very limited nutrition intake information (6), data that would be particularly helpful not only for nutrition counseling but also for nutrition epidemiology research in clinical contexts.
Self-reported recalls of the previous day's diet are commonly used to assess food intake and diet quality. The validated Automated Self-Administered 24-hour recall (ASA24®) is an online tool that allows individuals to self-report their dietary intake using either a computer or a smartphone (7). The automated multi-pass method requires multiple queries about the list of the previous day's foods, amounts eaten, food preparation methods, condiments or seasoning added and details forgotten, a cognitively demanding process. It is estimated that most individuals are able to complete the ASA24 in less than 30 min (8). While self-administered 24-h dietary recalls are an advance for research (7,9), challenges remain. Individuals who never learned to type, who do not work with computers routinely or who have limited internet access may need to complete the ASA24 using their smartphone. If they are visually challenged, hurried or have limited cell phone minutes or web access, their dietary recall information may be compromised or incomplete. Moreover, the amount of time required for completion and data analysis may lead to selection bias in research and generally make use of self-administered 24-h dietary recalls impractical for clinical care. To assess food patterns on a larger scale with lower participant burden, a simpler solution is needed.
Dietary screening tools hold promise for filling this gap in nutrition information when care is provided in time-limited settings (10–16). Screeners can obtain information about key aspects of the diet quickly, typically in less than 5 min (17,18). The Penn Healthy Diet (PHD) screening tool (19) was developed using dietary intake data in the 2017-2018 National Health and Nutrition Examination Survey (NHANES) (20) with a focus on food groups that comprise the Healthy Eating Index (HEI)-2015 (21,22). To better reflect the diversifying USA population, since 2011 NHANES has oversampled several groups including Hispanics, Non-Hispanic Blacks, Non-Hispanic Asians, Non-Hispanic Whites, persons over age 80 years and those living below 185 % of the federal poverty line (20). The PHD queries the number of times (0 to 5 or more) items from twenty-nine food groups were consumed during the previous day. Reporting the number of exposures was chosen in lieu of portion sizes, which are both difficult to assess and cognitively challenging to estimate. Commonly reported foods in the NHANES 2017-2018 diet recall data were added to each food group to provide currently available examples in the USA food supply. Items delineating types of protein foods and behavioural items were added to facilitate nutrition counseling goals.
A simple scoring algorithm identified twelve food groups (fruit juice, fruit, green/leafy vegetables, red/orange vegetables, whole grains, milk, seafood, plant proteins, nuts/seeds, sugary beverages, refined grains and cheese) and three behavioural items (full fat dairy, added butter/gravy and added oil) that were moderately or strongly associated with HEI-2015 variables. The simple scoring approach with PHD data correlated strongly (Spearman's rho = 0•75) with the HEI-2015 score from NHANES diet recall information (19). However, in the initial study, participant responses to the PHD screener questions were simulated based on their reported food intake in the NHANES 24-h dietary recalls. Further comparison of the PHD against related items in a concurrently collected 24-h recall is needed to confirm these initial findings and to refine the scoring approach.
The purposes of this research were to assess participant experience with the PHD screener in a clinical sample, to validate the accuracy of the PHD screening tool against related items in the ASA24 and to explore new scoring algorithms with relationship to the Healthy Eating Index-2015 computed from the ASA24 data. We hypothesised that the PHD screener would be acceptable to participants, that the screener would identify the consumption of similar foods to those identified with the ASA24 and that a scoring algorithm based on data collection via the PHD would correlate with parameters in the HEI generated from ASA24 data.

Participants
The Penn Medicine BioBank (PMBB) has enrolled 174 712 racially diverse participants (56 % female; 17 % Black, 71 % White, 4 % Asian, 3 % Other and 6 % Unknown) with a median of seven years of prospective health and disease data mapped to ICD diagnostic codes as well as the complete electronic medical record data (imaging, clinical laboratory measures, procedures, clinical notes, etc.) for each participant (23). The PMBB also includes a biorepository of blood and tissue samples for genetic and other omics assays; to date, whole exome sequence and genome-wide genotype data are available for approximately 45 000 participants. However, dietary intake data are not collected in any systematic way. The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board (or Ethics Committee) of the University of Pennsylvania (protocol codes: 808346 approved 07/01/2008, 813913 approved 4/3/2013 and 817977 approved 6/6/2013) for studies involving humans. Potential respondents provided informed consent by their electronic response to the approved email invitation.

Design
A convenience sample of 385 PMBB participants who were recently treated by the gastroenterology service was invited to participate by email or telephone call. Email invitations were sent via the secure Research Electronic Data Capture (REDCap) application (24), with a follow-up reminder email one week later if a patient did not respond to the initial invitation. The email invitation led to the PHD screener survey and provided a unique login credential and password for the ASA24 website to complete the dietary recall. Participants who did not have an email on record with the PMBB were contacted by telephone and offered the opportunity to participate by email invitation, by mail to their home address or verbally by telephone. If a patient did not respond to the telephone call, a voicemail message was left which detailed the participation and remuneration process. Patient responses were recorded in the REDCap survey system. If a patient requested telephone assistance to complete the ASA24, questions regarding the patient's previous day's dietary intake were asked in the order and format in which they appear in the ASA24 system. The patient's answers were entered directly into the ASA24 site as they responded to the questions. Participants received incentives of $5 for completing the PHD and $20 for completing the ASA24, using the Greenphire® ClinCard program. The incentive amounts were designed to recognise the likely time commitment required to complete the surveys.
Participant experience questions were included at the end of the screener to gauge the ease of participation, the device used to complete the survey and willingness to complete the PHD again. Time to complete the screener was computed by subtracting start time from completion time in REDCap. The ASA24 site reports time to complete the ASA24.
Participant demographics were abstracted from the electronic medical record. Diagnosis codes (ICD-10) were collected and used to calculate the Charlson Comorbidity Index (25,26).

Statistical analyses
Summary statistics were computed for participants' clinical and demographic characteristics using means and standard deviations for continuous measures and frequencies and percentages for categorical measures. Likert-scale responses to the patient experience questions about the PHD were described as frequencies and percentages.
The PHD screener queries the number of times (0 to 5+) on the previous day that foods from twenty-nine food groups were eaten (water, coffee/tea, fruit juice, whole fruit, whole grains, refined grains, alcohol, sugary beverages, diet soda, milk, yogurt, cheese, eggs, poultry, seafood, plant proteins, red meat/pork, cured meat, fried foods, fast food meals, nuts, desserts, sweet snacks and salty snacks) (19). Similarly to the HEI-2015, the PHD asks for the number of servings of unsweetened fruit juice. However, fruit drink and fruit punch are included in the sugary beverages question to capture this significant source of added sugar and processed foods. The PHD also includes seven behavioural questions with significant information for diet counseling (adding energy content to coffee/tea such as sweeteners or cream; adding artificial sweeteners; using full fat dairy products; adding salt at the table; adding butter/gravy and using oils in cooking). While the PHD was designed based on the HEI-2015, the HEI-2015 does not address intake of alcohol, water or coffee/tea, or the behavioural variables in the PHD in a discrete way. These items were included in the PHD to aid in diet counseling for adequate fluid intake or potential sources of energy content or salt.
The simple scoring approach originally developed for the PHD was based on those food groups that had at least a moderate association with HEI-2015 variables (19). The score added one point for each time on the previous day that healthy foods were reported (fruit juice; fruit; green/leafy vegetables; red/orange vegetables; whole grains; milk; fish/seafood; plant proteins and nuts/seeds) and reverse scored those with negative correlations (sugary beverages; refined grains and cheese). Two behavioural questions added one point for a negative response (added butter/gravy; full fat dairy) and one added a point for a positive response (added oil). The score range was 0-63, with a higher score indicating a healthier diet.
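The arithmetic of the simple score can be illustrated with a short sketch. This is not the authors' code: the item names are hypothetical, and reverse scoring is assumed here to mean 5 minus the reported count. Under that assumption the total is consistent with the stated 0-63 range (nine positive items × 5, three reverse-scored items × 5 and three behavioural points).

```python
# Hypothetical item names; the published PHD item identifiers may differ.
POSITIVE_ITEMS = [
    "fruit_juice", "fruit", "green_leafy_veg", "red_orange_veg",
    "whole_grains", "milk", "seafood", "plant_proteins", "nuts_seeds",
]
REVERSE_ITEMS = ["sugary_beverages", "refined_grains", "cheese"]
MAX_TIMES = 5  # responses are capped at "5 or more"


def simple_phd_score(times, added_butter_gravy, full_fat_dairy, added_oil):
    """Return an illustrative simple PHD score in the range 0-63.

    times: dict mapping item name -> times eaten the previous day (0-5).
    The three behavioural arguments are booleans (True means "yes").
    """
    def capped(item):
        # Missing items count as 0; values are clamped to the 0-5 range.
        return min(max(int(times.get(item, 0)), 0), MAX_TIMES)

    score = sum(capped(item) for item in POSITIVE_ITEMS)
    # Assumption: reverse scoring means 5 minus the reported count.
    score += sum(MAX_TIMES - capped(item) for item in REVERSE_ITEMS)
    # One point each for NOT adding butter/gravy and NOT using full fat dairy.
    score += int(not added_butter_gravy) + int(not full_fat_dairy)
    # One point for using oil.
    score += int(added_oil)
    return score
```

For example, a respondent reporting fruit twice, whole grains once and one sugary beverage, who added butter, avoided full fat dairy and cooked with oil, would score 3 + 14 + 1 + 1 = 19 under these assumptions.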
The food group file provided by the ASA24 comprises self-reported serving size of foods ingested by the respondent. By contrast, the PHD asks only how many times a food was eaten the previous day, without consideration of serving size. Each food item reported by participants in the ASA24 diet recall was coded by a nutrition science expert into categories which correspond to each of the twenty-nine screener food group items. The three questions about eating behaviours in the simple scoring approach were not included because no equivalent item was found in the ASA24 food list. While these behaviours have recognised impact on health and intake of energy and saturated fat, they are not recognisable as discrete variables in the HEI-2015 score.
To evaluate the similarity of related responses to the PHD relative to the ASA24, two approaches were used based on the presence or absence of a food in the ASA24 report. For each food group, observations were coded as insertions if the food was reported in the screener but not in the diet recall. Observations were coded as deletions if the food was reported in the diet recall but not in the screener. Lastly, observations were coded as congruent when a food group was reported consistently in both the diet recall and the screener. Frequency tables were produced to summarise insertions, deletions and congruent reporting across each food group. Cohen's Kappa statistic was computed to assess the extent of agreement in the presence and absence of each food group's reporting across the screener and diet recall responses. Cohen's Kappa assesses the extent to which two categorical ratings agree beyond chance, where negative scores indicate less than chance agreement and scores 0-1 indicate greater than chance agreement (27). Kappa values were interpreted as follows: values ≤ 0 as no agreement, 0-0•20 as no to slight agreement, 0•21-0•40 as fair, 0•41-0•60 as moderate, 0•61-0•80 as substantial and 0•81-1•00 as almost perfect agreement (28,29).
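The coding scheme and the kappa calculation described above can be sketched as follows. This is a minimal illustration rather than the study's analysis code (which was run in SAS); the function names are hypothetical, and treating "absent in both instruments" as congruent is an assumption based on the presence/absence framing.

```python
def classify(screener_times, recall_present):
    """Code one food-group observation as insertion, deletion or congruent."""
    screener_present = screener_times > 0
    if screener_present and not recall_present:
        return "insertion"   # in the screener but not in the recall
    if recall_present and not screener_present:
        return "deletion"    # in the recall but not in the screener
    return "congruent"       # assumption: both present OR both absent


def cohens_kappa(screener, recall):
    """Cohen's kappa for two equal-length lists of booleans (presence/absence)."""
    n = len(screener)
    agree = sum(1 for s, r in zip(screener, recall) if bool(s) == bool(r))
    p_observed = agree / n
    p_s = sum(1 for s in screener if s) / n
    p_r = sum(1 for r in recall if r) / n
    # Agreement expected by chance from the two marginal proportions.
    p_expected = p_s * p_r + (1 - p_s) * (1 - p_r)
    if p_expected == 1:
        return float("nan")  # kappa is undefined when there is no variation
    return (p_observed - p_expected) / (1 - p_expected)


def interpret_kappa(kappa):
    """Map a kappa value to the interpretation bands quoted in the text."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "no to slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

In use, `cohens_kappa` would be called once per food group, with one boolean per respondent for each instrument.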
Two additional, novel data-driven approaches to developing a screener scoring algorithm were taken. Both used simulated screener responses derived from NHANES 2017-2018 data as the training set and then tested the strength of the models in the 3P study data. Using the NHANES data as a training dataset, simulated screener responses from the NHANES participants were operationalised as predictors of the HEI-2015 score in a regression tree model. The regression tree model was selected based on Spearman correlations between simulated screener values and HEI-2015 scores. The model utilised reduced error pruning and employed a minimum leaf size of ten and a maximum tree depth of nine to avoid potential overfitting to the training dataset. The selected regression tree model was then operationalised as a scoring algorithm to compute predicted HEI-2015 scores for the 3P study data. Utilising the 3P study data, the Spearman correlation between the PHD screener score and the ASA24-derived HEI-2015 score was then computed to determine concurrent validation of the scoring algorithm. Regression tree modelling was conducted using Proc HPSPLIT in SAS 9•4.
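To make the two overfitting controls concrete, the sketch below grows a small regression tree in plain Python with a minimum leaf size of ten and a maximum depth of nine. It is an illustration only: it is not Proc HPSPLIT, it does not implement reduced error pruning, and the function names and data layout (rows of item counts, a list of HEI-2015 scores) are assumptions.

```python
def _sse(values):
    """Sum of squared errors around the mean of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_tree(rows, targets, depth=0, max_depth=9, min_leaf=10):
    """Grow a regression tree on screener counts (rows) and HEI scores (targets).

    Stops at max_depth, or when no split leaves at least min_leaf
    observations on each side. No pruning is applied in this sketch.
    """
    mean = sum(targets) / len(targets)
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return {"value": mean}
    best = None  # (sse after split, feature index, threshold)
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows})[:-1]:
            left = [t for row, t in zip(rows, targets) if row[j] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[j] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            sse = _sse(left) + _sse(right)
            if best is None or sse < best[0]:
                best = (sse, j, threshold)
    if best is None or best[0] >= _sse(targets):
        return {"value": mean}  # no admissible split reduces the error
    _, j, threshold = best
    left_idx = [i for i, row in enumerate(rows) if row[j] <= threshold]
    right_idx = [i for i, row in enumerate(rows) if row[j] > threshold]
    return {
        "feature": j,
        "threshold": threshold,
        "left": fit_tree([rows[i] for i in left_idx],
                         [targets[i] for i in left_idx],
                         depth + 1, max_depth, min_leaf),
        "right": fit_tree([rows[i] for i in right_idx],
                          [targets[i] for i in right_idx],
                          depth + 1, max_depth, min_leaf),
    }


def predict(tree, row):
    """Return the predicted HEI-2015 score (leaf mean) for one respondent."""
    while "value" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]
```

A tree fitted on training data in this way can then be applied with `predict` to new screener responses, which mirrors the train-on-NHANES, score-the-pilot workflow described above.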
The second approach was a multivariable regression model fitted to the NHANES training data. In this model, each simulated screener response (excluding dichotomous behavioural questions) was included as a predictor of the HEI-2015 scores. Regression coefficients derived from the multivariable model were then applied as item weights to compute a screener score for both the NHANES and 3P study data. Spearman correlations were computed to assess scoring algorithm performance on the NHANES data, as well as to determine concurrent validation using the 3P study data. Statistical analyses were conducted using SAS 9•4.

Participants
A total of 385 individuals were invited to participate in the pilot survey. Participants were contacted by email (n 355) if an email address was available or by telephone call (n 33). For two participants contacted by telephone who were interested in completing the PHD screener by telephone, each question and possible response was read out in its entirety. One patient requested to complete the ASA24 recall with assistance over the telephone. Overall, 60 invitees (15•6 %) responded, and 23 (5•9 %) of the total invited went on to complete the ASA24. However, another 27 (45 %) never logged in, and 5 (8 %) logged in but quit before completing the dietary recall.
The demographic characteristics of respondents to the surveys are shown in Table 1. The mean age was 59 years, 75 % were female, and respondents were predominantly non-Hispanic White and obese. The Charlson Comorbidity Index score was 1•32.

Patient experience with the Penn Healthy Diet
Responses to the patient experience questions and response time are shown in Table 2. All respondents reported that the PHD was easy to complete, and 99 % would be willing to repeat the survey. The most commonly selected frequency for repeating the survey was monthly, but all would complete it at least annually. The most common device used to complete the PHD was the cell phone (52 %), followed by the computer (39 %) and tablet (5 %). The median times to completion of the screener were 5 min for cell phone and tablet and 7•5 min for the computer. While the ASA24 site does not identify which device was used to complete the dietary recall, the median (IQR) time required for completion was 21 (13-39) min.

Accuracy of Penn Healthy Diet screener item responses relative to dietary recall
A comparison of PHD responses with the same patient's responses to the ASA24 is shown in Table 3. Congruent responses occurred far more commonly than insertions or deletions. Only three items had < 70 % congruence. The whole grain bread item appeared on the PHD before white bread, and 52 % of responses reported whole grain bread on the PHD but not the ASA24. By contrast, 44 % of respondents reported white bread on the ASA24 but not on the PHD. Nuts were reported congruently by 65 %, as insertions on the PHD by 30 % and as deletions by 4 %. In assessing statistical agreement of food categories in the screener and diet recall, Kappa statistics ranged from −0•12 to 0•88 (mean = 0•44), with 15/29 screener items exhibiting moderate to strong agreement with the ASA24.

Screener scoring approaches
Using the simple scoring algorithm recently published (19), in respondents to both the PHD and the ASA24, the PHD score was 14•5 ± 4•68 and the HEI-2015 score based on ASA24 data was 52•7 ± 16•5. The screener score exhibited a moderate correlation with the HEI-2015 total score (r = 0•59, P = 0•0034). Online Supplementary Fig. 1 presents a heatmap which visualises associations between screener items and total score v. HEI-2015 subcomponents and total score. As with the HEI-2015, a higher screener score implies a diet with greater nutrient than energy density.
The regression tree model selected the same twelve variables as the simple scoring algorithm, but added nine additional variables (alcohol, desserts, cured meat, red meat/pork, fast food meals, coffee/tea, poultry, savory snacks and eggs).
The multivariable regression model constructed using NHANES 2017-2018 data with simulated screener responses also selected the same twelve variables as the simple approach and the same nine additional variables as the regression tree, and further added diet soda and yogurt (online Supplementary Table 4). The five most important food groups with negative betas were refined grains [...] Table 4 contains the correlation coefficients comparing responses to the PHD score to the HEI-2015 score from dietary [...] Because the screener scores are of different magnitudes, a statistical comparison of the three approaches is not possible.

Discussion
There is substantial need for more efficient and user-friendly methods to gauge diet quality. We developed the PHD screener to help meet this need (19). In this study, we demonstrated that the patient experience of the PHD screener was generally positive, and the time to completion was in the range expected for a dietary screener. Reported numbers of times foods were eaten were predominantly congruent with the ASA24 data. All three PHD scoring approaches identified similar food groups; the correlations based on 3P replication data were only moderate, while those based on the NHANES derivation set were strong. The PHD screener bears some similarities to other tools developed for use in European clinical settings. The Mediterranean Diet Adherence Screener was developed for use in a large Mediterranean diet intervention in Spain (15), was recently validated for use in the UK (14) and was recently updated with an energy-restricted version (16). While the PHD screening tool targets similar foods that are more commonly available in the USA diet, neither the PHD nor the Mediterranean Diet Adherence Screener permits assessment of energy intake. However, the close alignment of the PHD with the HEI-2015, a tool that is energy-adjusted, suggests that higher scores with the PHD would indicate a diet that is more nutrient- than energy-dense. A brief diet quality assessment tool for use in French-speaking Quebec was developed relative to the Alternative Healthy Eating Index score (12), but it has not yet been piloted in healthcare settings. Two tools have been developed in Finland, one for use in adult primary care (11) and the other for dietary counseling by professionals without specific training in nutrition (13). The PHD survey was designed to support dietary counseling and to give a sense of overall diet quality.
As with other recent European and Canadian screener evaluations (10–16), a strength of the PHD is that it was tested here in a sample from the setting in which it would be useful. The favourable participant experience data are also encouraging about the feasibility of employing the tool more broadly. However, the relatively low response rate to our unsolicited email request to participate suggests that a different approach to obtaining data from biobank participants is indicated. The low response rate also carries the risk that responders were not fully representative of other invitees in terms of interest in diet or willingness to report detailed dietary information. We note that respondents had similar self-reported race but a larger proportion of females when compared with published data from the PMBB. Women have been described by others as more interested in nutrition and healthy diet than men (31,32). While these similarities are encouraging, we cannot evaluate how representative this sample is of general hospital samples in the USA. A strength of the 3P pilot project is that the same subjects responded to both the PHD and the ASA24 surveys. While we cannot determine whether the same device was used for the ASA24 as for the PHD, the completion time was considerably shorter with the PHD, as expected. The most common device used for the PHD, and probably the ASA24, was the cell phone. If the requirement of multiple screen views for the ASA24 is tedious by cell phone, this may be why so many started the diet recall but stopped before completion. It is encouraging that the same food groups were identified with both novel approaches to scoring: those with positive betas were foods scored positively (adequacy items) and those with negative betas were foods scored negatively (to take in moderation) by the Healthy Eating Index-2015.
A weakness in the original version of the PHD tested here was that respondents appear to have entered white bread as whole grain bread, possibly because the whole grain option appeared earlier in the survey. Since white grains are eaten more commonly than whole grains in the USA, we will list these options prior to the whole grain options in future surveys to avoid confusion in responses for subjects who are not clear about the meaning of whole grain bread. A limitation of this pilot project, and of the strength of our conclusions about a scoring algorithm, is the small sample size. A larger validation study is underway using the reordered white v. whole grain options that will improve our ability to choose the best-performing scoring approach while also assessing the value of the screener to predict metabolomic signatures. We will continue to use the original simple scoring algorithm until a stronger option is clear.
In conclusion, the PHD screener has demonstrated high levels of acceptance by participants. The PHD screener score is at least moderately correlated with the HEI-2015 derived from 24-h dietary recalls. However, further research is needed to identify the optimal scoring algorithm for clinical and research purposes. Nonetheless, the routine use of electronic medical records creates the potential to automate delivery and scoring of the PHD within the context of routine care. As such, the PHD holds enormous promise for nutrition-focused clinical care and research.

Table 1. Demographic characteristics of survey respondents

Table 2. Patient experience with Penn Healthy Diet screener

Table 3. Comparison of Penn Healthy Diet screener responses to ASA24 responses