Harmonising dietary datasets for global surveillance: methods and findings from the Global Dietary Database

Objective: The Global Dietary Database (GDD) expanded its previous methods to harmonise and publicly disseminate individual-level dietary data from nutrition surveys worldwide. Design: Analysis of cross-sectional data. Setting: Global. Participants: General population. Methods: Comprehensive methods to streamline the harmonisation of primary, individual-level 24-h recall and food record data worldwide were developed. To standardise the varying food descriptions, FoodEx2 was used, a highly detailed food classification and description system developed and adapted for international use by European Food Safety Authority (EFSA). Standardised processes were developed to: identify eligible surveys; contact data owners; screen surveys for inclusion; harmonise data structure, variable definition and unit and food characterisation; perform data checks and publicly disseminate the harmonised datasets. The GDD joined forces with FAO and EFSA, given the shared goal of harmonising individual-level dietary data worldwide. Results: Of 1500 dietary surveys identified, 600 met the eligibility criteria, and 156 were prioritised and contacted; fifty-five surveys were included for harmonisation and, ultimately, fifty two were harmonised. The included surveys were primarily nationally representative (59 %); included high- (39 %), upper-middle (21 %), lower-middle (27 %) and low- (13 %) income countries; usually collected multiple recalls/ records (64 %) and largely captured both sexes, all ages and both rural and urban areas. Surveys from low- and lower-middle v. high- and upper-middle income countries reported fewer nutrients (median 17 v. 30) and rarely included nutrients relevant to diet-related chronic diseases, such as n-3 fatty acids and Na. Conclusions: Diverse 24-h recalls/records can be harmonised to provide highly granular, standardised data, supporting nutrition programming, research and capacity development worldwide.

community-level nutrition surveys at the individual level have been conducted worldwide (15) .However, these are rarely harmonised, hence not comparable between and within countries, populations and over time, due to differences mainly in representativeness, diet assessment tools used and data analysis and reporting.
Over the past 10 years, dietary surveys from around the world have been collated and harmonised at the food group level, e.g. total fruit intake, as part of the Global Dietary Database (GDD).However, such efforts have not previously accounted for the often large variation in the classification and description of individual food items, for example related to differing food definitions, preparation methods, local food names and more.For example, a food may be described by a simple food name (e.g.chicken), but not capture multiple additional levels of detail (e.g.baked or fried; light or dark meat; with or without skin and with or without added oil or salt).Local food names can also denote different foods in varying countries or even within the same country (e.g.biscuit in the USA v. the UK).Such heterogeneity can lead to inconsistent food names, definitions, groupings, food matching (referring to the process by which a food is assigned to its corresponding nutrient content using a food composition table or database) and ultimately discrepancies in analysis and reporting.Furthermore, inaccurate or inconsistent food aggregation (or classification) of individual food items with similar characteristics into larger food groups/ categories in a hierarchical manner can compound the errors.This could include, for example, heterogeneity in classifying a food (e.g.avocado and tomato) as a fruit v. vegetable or in different levels of food groups (e.g. total vegetables and starchy vegetables).Accurate food classification is especially crucial for mixed dishes and packaged foods, which are an increasing part of the global food supply and must be disaggregated into their ingredients in a standardised fashion to capture intake from all sources.Moreover, even if standardised, such dietary data are rarely publicly available, limiting use by and impact for national and global nutrition communities (16) .
To address these critical gaps and advance the quality and quantification of dietary intakes worldwide, the GDD, in collaboration with the European Food Safety Authority (EFSA) and the Food and Agriculture Organization of the United Nations (FAO), developed and implemented comprehensive and standardised methods to streamline the collation, harmonisation and public dissemination of primary individual-level dietary datasets worldwide.

Study overview
The methods for this work build and expand upon the approach of the GDD 2018 to identify, collate and standardise existing dietary data and create a comprehensive and comparable database of diet intakes globally (15) .In brief, GDD systematically searched for and identified surveys with quantitative intake data for any of fifty-four foods, beverages and nutrients of interest; extracted intakes using standardised definitions and units per dietary factor, jointly disaggregated by age (23 age groups from 0-6 months to 99þ years), sex, education (low, middle and high), urban/ rural residence and pregnancy/lactation status, as available and incorporated these inputs along with relevant covariates into a Bayesian hierarchical prediction model to estimate stratum-specific mean intakes worldwide.
This current project expanded these methods and protocols (17)(18)(19) to identify, retrieve, harmonise and disseminate individual-level 24-h dietary recall or record data at their finest level (hereinafter referred to as microdata).Standardised criteria, processes and materialsas described in detail belowwere developed (Fig. 1, see online supplementary material, Supplemental Table 1).

Collaborations
Given the shared goal of harmonising individual-level dietary data worldwide, the GDD joined forces with FAO and EFSA (20) .In 2017, FAO and the WHO developed the FAO/WHO Global Individual Food consumption data Tool (FAO/WHO GIFT) (16,21) , aimed at harmonising and disseminating individual-level dietary microdata, as well as presenting summary statistics in the areas of food consumption, nutrition and food safety (16) .EFSA, a European agency providing independent scientific advice on food-related risks, developed and maintains the FoodEx2 food classification and description system (FCDS) and has been working, for over 10 years, on the harmonisation of dietary microdata.Each initiative developed and used their own processes and materials with regards to data harmonisation and dissemination.Yet, through our collaboration, we aimed to maximise efficiency by exchanging resources, methodological approaches and complementary experience and expertise.

Eligibility and prioritisation of dietary surveys
Surveys were eligible if they had valid quantitative 24-h recall or food record data and aiming to assess the general population of any country (see online supplementary material, Supplemental Table 2).Surveys with less than 100 participants, evidence of strong selection bias (e.g.special population subsets such as with medical conditions), published before 1980, or not agreeing to make the harmonised microdata publicly available through the GDD platform were excluded.Surveys needed to include a minimum level of detail required for FoodEx2 mapping (i.e.use of the FoodEx2 FCDS to code all reported foods): specifically, foods should not be reported as 'generic' terms (e.g.fruit instead of apple, orange, etc.; legumes instead of beans, lentils, etc.).
Due to the time-and resource-intensive nature of FoodEx2 mapping (see below), the GDD aimed to ultimately include and harmonise 40-50 dietary datasets.Surveys were prioritised based on key characteristics, such as being nationally representative and from populous countries (see online supplementary material, Supplemental Table 3) and aiming to include low (LIC), lower-middle (LMIC), uppermiddle (UMIC) and high (HIC) income countries (22) .Subnational and community-level surveys were considered if national data were unavailable for a priority country or if the FoodEx2 mapping had been already performed.Surveys that were more recent, included both men and women and diverse ages, and captured both rural and urban areas were further prioritised.For publicly available surveys, those with datasets in English or Spanish that could be readily managed by the GDD investigator team were prioritised.

Survey identification
The GDD searched for public and non-public dietary surveys in multiple electronic databases and through extensive direct communications with data owners, nutrition experts and collaborators worldwide.The survey lists of the GDD 2018 (15,23) , EFSA (20,24) and FAO/WHO GIFT database were screened (21) .The GDD 2018 incorporated 1220 public and non-public surveys, including 286 with 24-h recall data covering all world regions.The EFSA database comprised sixty-nine surveys with 24-h recall/ food record data from European countries (fourteen in common with GDD 2017) for which harmonisation and FoodEx2 mapping had been completed, but the datasets were not publicly available (20) .The FAO had identified 336 surveys with 24-h recall data (110 in common with GDD 2018) for harmonisation (21) .Complementary online Fig. 1 The GDD harmonisation process: steps, actions, and developed material.The Global Dietary Database aimed to harmonise dietary surveys with quantitative 24-h recall or food record data from around the world.For details on material description and use, refer to Table S3 searches were performed in PubMed and Google to identify more recent surveys through December 2019.  1, Supplemental Figure 1) were developed for contacting the data owners of eligible surveys.To augment internal and external consistency and data validity, the GDD, in collaboration with FAO, aimed to establish contact with all prioritised surveys, including publicly available ones.Data owners were invited to participate, harmonise and share their data under one of three options: (1) structure their data per the GDD requirements (see next section for details) and perform the FoodEx2 mapping; (2) structure their data for the FoodEx2 mapping to be performed by the GDD investigators or (3) share their original raw microdata to be structured and mapped to FoodEx2 by the GDD investigators.
Interested data owners completed a nine-question Short Survey Questionnaire (see online supplementary material, Supplemental File 1) to confirm 24-h recall or food record data availability and structure.Questions assessed mixed dish disaggregation, English translation, food matching, FoodEx2 mapping, number of unique food items and mixed dishes and data sharing option (of the three described above).One investigator screened responses, with questions resolved by consensus.The Short Survey Questionnaire did not apply to already harmonised EFSA and FAO/WHO GIFT surveys.For eligible non-public datasets surveys, data owning institutions signed a data sharing agreement; for public datasets, eligibility for reposting was determined by their terms of use.
Data owners were offered remuneration for harmonisation efforts of up to $7700 per harmonised survey; 57 % of data owners accepted remuneration (mean $2584 per survey).The survey-specific amount was determined based on an algorithm, which was informed by the Short Survey Questionnaire and accounted for the complexity of the required efforts.

Data retrieval and harmonisation
Standardised protocols and forms were developed to retrieve survey information and data in a systematic manner and achieve high-quality harmonisation.Survey information was retrieved using a metadata formadapted from the FAO relevant oneand it captured, among other characteristics, survey name, country, year(s), sampling methods, representativeness, sample (size, age and sex), diet assessment method (seasons and days covered, single v. multiple, portion size estimation aids) and food composition table/database (FCT/FCDB) (see online supplementary material, Supplemental File 2).Individuallevel participant and dietary data (termed microdata) were retrieved using a standardised codebook and extraction template, building on those developed by EFSA.The codebook included instructions on how data owners should structure their data; the variables (name, definition and values) to use for reporting participant data, such as socio-demographics (age, sex and education), anthropometry (weight, height and BMI), pregnancy/lactation status, breast-feeding status and physical activity (see online supplementary material, Supplemental Table 4) and the variables (name, definition, unit of measurement and values) to use for the dietary data, which included all food items consumed per participant, meal type (e.g.breakfast, lunch), time and place of consumption, amount consumed and contents of energy and thirty-nine nutrients (see online supplementary material, Supplemental Table 5).

Microdata harmonisation
Microdata harmonisation included standardising variable definitions and coding, data structure and food description.For each step, a standardised process (see online supplementary material, Supplemental Figure 2) and instructions (see online supplementary material, Supplemental Figure 3) were developed to maximise validity and consistency.Each variable in the participant and dietary datasets was characterised according to the standardised codebook previously mentioned (see online supplementary material, Supplemental Tables 4 and 5).The dietary data structure included individual food items consumed per participant, each presented in a separate row along with the corresponding amount and nutrient content.Mixed dishes were disaggregated, as possible, into all ingredients and corresponding amounts (e.g.lentils (200 g) = lentils (130 g), tomatoes (50g), onion (10g), olive oil (10g)); if disaggregation was not part of the data collection, data owners were asked to perform it post hoc, based on commonly used recipes among the reference population.Nutrients were assigned to each food item and disaggregated ingredients; food matching was performed by the data owners using the FCT/FCDB they considered as more appropriate for their needs.All reported data were retrieved in English.For datasets harmonised by EFSA and FAO, variables were renamed, recoded and retrieved (e.g.nutrient intakes for EFSA datasets and additional socio-demographics, such as education, for all datasets), as appropriate, so that all harmonised datasets included the GDD standardised set of variables.

FoodEx2 mapping
Each food item and disaggregated ingredient was coded according to FoodEx2 (25) , an FCDS including >4540-terms organised across twenty-one top-level food groups, each further split into up to seven levels of food groups or foods (see online supplementary material, Supplemental Table 6) and a twenty-eight-facet description system (see online supplementary material, Supplemental Table 7).FoodEx2 facets include, for example, cooking and other processes (frying, mincing, sugar-coating, etc.), sweetening agent (including low-or non-energetic sweeteners), production method (organic farming, aquaculture, etc.); yet, not all facets are relevant to food consumption (e.g.packaging material is relevant to food safety).All FoodEx2 foods and facet descriptors have a unique five-character alphanumeric code.For all datasets, each individual food item, after mixed dish disaggregation, was mapped to the appropriate FoodEx2 food term.For food items that were further described, as many facets possible were used to describe all additional characteristics reported (see example below).Mapping was performed manually for each food item by a trained coder using the FoodEx2 Catalogue browser (26,27) , a desktop application which includes all FoodEx2 foods and facets, allows iterative building of coding and accounts for multiple mapping rules.The GDD investigators received a rigorous 3-day in-person formal training at EFSA's headquarters (28) and observed a data owner training session delivered by FAO.To enable data owners to also perform high-quality FoodEx2 mapping themselves, a structured training process and relevant materials (see online supplementary material, Supplemental Table 1) were developed; all were reviewed and approved by EFSA.The training materials included a presentation on FoodEx2 theory, concepts, rules and mapping examples (see online supplementary material, Supplemental Figure 3), and a thirty-two-food item exercise to practice the theory learned; EFSA's FoodEx2 technical report (25) and webinars (29) were further shared as reference materials.Moreover, a 2-hour web call was held to provide clarifications on FoodEx2 mapping and map survey-specific foods.Success of training was validated by asking data owners to map 100 food items from their dataset and handchecking the results (see online supplementary material, Supplemental Figure 2), with follow-up personalised training as needed.

Quality control
Data integrity and quality were assessed throughout the retrieval and harmonisation process.All metadata forms were reviewed by one investigator to confirm all fields were completed correctly; with random double checks by a second investigator.Microdata retrieval and harmonisation included specific actions and milestones (see online supplementary material, Supplemental Figure 2), which served as intermediate check points.This included whether datasets had all variables as dictated by the codebook, whether mixed dishes were fully disaggregated, assessment for outliers and errors in amounts consumed based on FoodEx2-code specific EFSA plausibility thresholds and visual inspection of nutrient intakes at the food item level.In addition, all FoodEx2 mapping performed by data owners was reviewed by one investigator for errors with random second reviews of 20-25 % of the foods by a second investigator.For FoodEx2 mapping performed by GDD, up to 50 % of the foods were reviewed randomly by a second investigator.Surveys already harmonised by EFSA and FAO were checked using each initiative's internal protocols.Data checks were performed using Stata v14•0 (StataCorp LLC).Identified potential errors, inconsistencies or implausible values were discussed within the GDD team, followed by direct communication with data owners to obtain clarifications and additional consultation with EFSA for remaining uncertainties.

Dissemination
For each survey, the metadata, microdata and corresponding standardised codebook were prepared.All harmonised surveys are publicly available and free to download through the GDD website, (23) with parallel availability through the FAO/WHO GIFT website for consenting surveys; any future harmonised surveys will be added on a rolling basis.Three files are available for each harmonised survey: the participant dataset, the diet dataset and the codebook, which includes the citation of the harmonised survey, survey metadata, all data variables and instructions on how to use FoodEx2 to classify foods into food groups.

Survey identification, screening and inclusion
Of 1500 surveys identified, 600 were eligible for inclusion (Fig. 2).Of those, GDD prioritised and contacted 156 surveys.Of the 156 surveys, twelve (8 %) were excluded based on the Short Survey Questionnaire, e.g.related to qualitative data collection or microdata no longer available; twenty three (15 %) refused to participate; thirty eight (25 %) never responded and twenty-eight surveys (18 %) responded but never made a final decision about participating or not.Overall, fifty-five surveys (35 %) were ultimately included for harmonisation, of which forty two (76 %) were not publicly available (Table 1).Of the fiftyfive surveys, twenty four were retrieved directly from data owners, fifteen from EFSA and sixteen from FAO/ WHO GIFT.

Data harmonisation
For the twenty-four surveys retrieved from data owners, eleven agreed to provide both data structure (including mixed dish disaggregation and food matching) and FoodEx2 mapping; five agreed to data structure only and the GDD performed the FoodEx2 mapping and eight provided their data as is, including seven publicly available ones, which the GDD fully harmonised (Table 2).For fifteen surveys retrieved through EFSA that were lacking the food matching, thirteen agreed to perform this following contact by the GDD.All sixteen surveys retrieved through the FAO/WHO GIFT platform were fully harmonised.Among data owners performing the harmonisation, the most resource-intensive action was mixed dish Fig. 2 Screening and selection process of dietary surveys with valid 24-h recall or food record data for GDD harmonisation.All public and non-public priority surveys were contacted, with the exception of the three NHANES surveys, for which the harmonisation could be performed independently.Of the 55 surveys included for harmonisation, the harmonisation has not been completed for three of these due to time and resource restrictions *As sub-nationals were considered the surveys which were representative of a major region of the country and as community-level the surveys which were representative of a single city, town or village.†Rolling programs refer to surveys whose administration is repeated every year or every few years; these surveys may or may not follow the same population.‡The reported sample size, age range and sex refer to the participants with available dietary data and not to the overall survey sample.§Country-income level was based on the World Bank classification of countries into low-income (LIC), lower-middle income (LMIC), upper-middle income (UMIC) and high-income (HIC) countries. (22).||The number in the parenthesis is the number of recalls or records collected per participant.When multiple recall/ record is reported, it applies to either part of the sample or all participants.
¶Financial support was offered to the data owners of all surveys, with the exception of publicly available datasets and surveys whose harmonisation had been already completed by FAO.**The leading initiative was responsible for harmonising the dietary data or for guiding the data owners to perform the harmonisation themselves.* †Refers to the data sharing option selected by data owners, and it is only applicable to surveys whose harmonisation was led by GDD.Option A refers to data owners agreeing to structure their data per the GDD requirements and perform the FoodEx2 mapping; option B includes agreement to data structure only and GDD to perform the FoodEx2 mapping and option C refers to sharing data as is, without structure or FoodEx2 mapping and GDD performs the overall harmonisation.NA refers to surveys that have been harmonised by FAO or EFSA.
disaggregation (generally 2-6 months to complete), followed by FoodEx2 mapping (1-4 months) and food matching (1-3 months).With experience, such as in the GDD investigator team working with surveys over time, the harmonisation of each public dataset took from 1 to 3 months.Common challenges included understanding the reported information and data, especially related to variable dataset documentation, and FoodEx2 mapping for highly localised food items.

Survey characteristics
The fifty-five surveys included for harmonisation were from thirty-five countries (Fig. 3) and included twenty-four surveys from HIC, ten from UMIC, fifteen from LMIC and six from LIC (Table 1).Most surveys (n 32, 58 %) were nationally representative, fifteen (27 %) were sub-national, and eight (15 %) were at the community level, especially in LIC and LMIC where nationally representative 24-h recall data were limited (Table 2).Half of the surveys (n 27) were rolling programs.About two-thirds (n 34, 62 %) were performed after 2010 and few (n 2, 4 %) before 2000.The predominant diet assessment method was 24-h recall (n 43, 78 %), followed by food records (n 9, 16 %), whereas three (5 %) surveys used both methods.Most surveys (n 38, 69 %) collected multiple recall/ record data for a subset or all participants.Generally, LIC and LMIC had smaller-scale surveys (median 826 participants; IQR 434-1665), compared with HIC and UMIC (1832; 893-6860).Most surveys (n 44, 80 %) collected data for both sexes; most (n 49, 89 %), on children and adolescents (ages 0-19 years), including twenty-seven surveys with data on children under 5 years of age; and nearly two-thirds (n 32, 58 %), from both rural and urban areas.Twelve of the twenty-one surveys from LIC and LMIC were administered in children under 5 years of age and/or women of reproductive age.Data reporting language varied across and within datasets (twenty-eight languages and multiple dialects).Surveys HIC, high-income countries; LMIC, lower-middle income countries; LIC, low-income countries; UMIC, upper-middle income countries; FCDS, food classification and description system; IQR, interquartile range.*The sum of surveys across the different assessment methods is more than the overall sum because three surveys used both 24-h recall and food record.†The institutional home refers to the data owning institution/organisation of the survey data.The data owning institution could be either governmental (e.g.ministry, statistical agency), an academic institution (e.g.university), other institution (e.g.non-governmental organisations, independent research institutes), or individuals who self-funded the survey data collection.‡This applies only to the twenty-four GDD surveys.The surveys the GDD received from FAO and EFSA were already harmonised.§Values refer to the harmonisednot to the originaldataset.Only surveys whose harmonisation was complete (n 52 of fifty-five available) were used.
were originally collected using pen and paper in forty-four surveys (80 %), with computer-assisted interviews only in HIC and UMIC.Survey data owners varied between governmental institutions (n 22), academic institutions (n 22), independent research or non-governmental organisations (n 9) and individual scientists (n 2).

Survey data
Surveys originally had either no (n 39) or a national or survey-specific (n 16) food classification system in place.
The number of reported unique food items varied across datasets (median 719, IQR 232-1804, max 6063) (Fig. 4).All LIC surveys had less than 250 unique food items (149, 92-184); LMIC (269, 87-684) and UMIC (717, 458-1096) surveys included, in general, less than 1000 food items; and HIC surveys generally included over 1000 unique food items (1602, 1068-2784).Mixed dish disaggregation was originally available for twelve of the twenty-four surveys retrieved from data owners; for ten, disaggregation was performed as part of the harmonisation; for two publicly available surveys, this information was not available.For eight datasets, the food description remained in the original language, because FoodEx2 mapping, which enables translation into English, was performed by data owners.All datasets captured the whole diet; more than half (n 31) further captured water consumption.Food matching in HIC and UMIC relied mainly on national FCT/FCDB (twenty-five of twenty-nine surveys with nutrient intakes reported) that contained a large number of available foods (1626, 1040-4300) and nutrients.LIC and LMIC relied also primarily on local national FCT (fourteen of eighteen surveys with nutrient intakes reported), but with a lower number of available food items (343, 152-696) and nutrients; to incorporate additional nutrients into their dataset, data owners generally attempted to use large FCDB of HIC, such as the USDA Food Database (the metadata information that is published as part of the harmonised data includes the specific FCT/FCDB used for each survey).
Four surveys only reported on foods, of which three publicly available ones did not permit the nutrient composition to be made public and 1 did not manage to add the food matching due to constraints related to the COVID-19 pandemic.In the remaining forty-eight surveys, twenty-four (Sd: ±8) nutrients were available on average per dataset (Table 2), with generally fewer available in LIC and LMIC (19, ±5) v. in HIC and UMIC (28, ±7).Energy was available in all forty-eight datasets reporting nutrient intakes, followed by total protein, carbohydrates and total fat (each N 47; Fig. 5).The most frequently reported vitamins and minerals were Ca (n 47), Fe (n 46), vitamin B 1 (n 46), vitamin B 2 (n 46) and vitamin C (n 46) and the least, vitamin K (n 9) and iodine (n 9).Other less frequently reported dietary factors included protein subtypes (n 1-6 surveys) and plant n-3 fatty acids (n 6).

Public dissemination
As of October 2023, the final output of the harmonisation process are fifty-two surveys, which have been harmonised and made publicly available and free to download through the GDD website (www.globaldietarydatabase.org) (23) ; twenty seven of these are also available through the FAO/WHO GIFT website (www.fao.org/gift-individualfood-consumption) (21).

Discussion
This investigation reports on the process and results for identification, retrieval, harmonisation and public dissemination of individual-level dietary data at their finest level from dietary surveys around the world.The GDD identified and included for harmonisation fifty-five surveysfifty two were ultimately harmonisedusing 24-h recalls (78 %), food records (16 %) or both (5 %) from thirty-five countries.Most surveys (58 %) were nationally representative, and countries of all income levels were represented.The surveys largely captured both sexes and diverse ages from birth to late in life and generally collected data for both rural and urban areas.Notably, the majority of these surveys (76 %) were not previously publicly available, requiring extensive work to contact data owning institutions, access data and complete data sharing agreements for public dissemination.Thus far, these harmonised surveys represent the most comprehensive collection of comparable, standardised dietary data at their most granular level globally.
Survey methods for collecting and reporting data were variable, including dataset structure, reporting language, available level of detail in recorded foods and use of an FCDS.This meant that data harmonisation and analysis from diverse countries were particularly complex undertakings.For example, reporting language included over twenty-eight languages and dialects across the fifty-five datasets, stressing the importance of harmonisation to a common language and terminology across datasets.In addition, the combination of open-ended questions in a 24-h recall or record plus local food names resulted in similar items being reported differently across datasets and with varying detail (e.g.'grilled beef' v. 'steak, cow, barbequed').Extensive efforts and clarifications by data owners were often required to fully understand the foods reported.Many surveys had not previously disaggregated their mixed dishes, preventing valid estimates of intakes of different food groups (e.g.fruits, vegetables and meats).Finally, most surveys had not used an FCDS prior to this effort, and those that did used a local or survey-specific one, not comparable across datasets.This effort demonstrated that such foundational challenges can be addressed with a coordinated effort to harmonise and share the original survey information and it greatly advanced data completeness, comparability and availability.The availability of nutrient estimates following harmonisation and comprehensiveness of the FCT used also varied substantially.In general, HIC and UMIC reported more nutrients than LIC and LMIC (median 30 v. 17  carbohydrates, total fat and key vitamins and minerals relevant to specific micronutrient deficiencies in maternal and child health, such as Ca, Fe, Zn and vitamins A, B1, B2 and C.These surveys rarely reported nutrients relevant to diet-related chronic diseases, such as total polyunsaturated fat, plant or seafood n-3 fat, trans fat, Na or potassium.FCT were also typically more comprehensive in HIC and UMIC than in LIC and LMIC (median ∼1626 v. ∼343 food items).This partly reflects the greater diversity of available food items commonly consumed in high-income nations, such as specific packaged and processed foods, compared with low-income nations where staples and variety are much smaller.However, this also reflects less available local food composition data in middle-and low-income nations.
Given the societal burdens of malnutrition in all its forms in every country worldwide, including diet-related chronic diseases (13) , our findings highlight the need for more comprehensive nutrient assessments in LIC and LMIC worldwide, coupled with investment in expanded local FCT.
Our extensive searches only identified twenty-one 24-h recall or record surveys from LIC and LMIC available for inclusion in this effort.These surveys were usually administered within a specific region or community; were small-scale and aimed to evaluate specific nutrients and, as such, nutrient deficiencies, especially focused on maternal and young child nutrition.Such surveys also most often collected data via pen and paper, raising challenges for standardised data collection and future data manipulation (30) .Half of the included surveys (49 %) were rolling programs, i.e. with planned additional cycles of dietary assessment.This raises the possibility of far more efficient, or even prospective, data harmonisation and FoodEx2 mapping in future survey cycles.However, most rolling programs were in HIC or UMIC (n 22), fewer were in LMIC (n 5), and none were identified in LIC.These findings stress such disparities in dietary data availability and quality across countries of different income levels, emphasizing the need for substantial new investments in dietary surveillance and harmonisation efforts in LIC.For example, ongoing major investments in supplementation programs, crop biofortification and food fortification will be more inefficient and less effective without reliable, valid surveillance data on the specific nutrient gaps in specific population subgroups and the specific food intakes that should be targets of fortification delivery intakes.
The current investigation has several strengths.Systematic searches were performed across multiple online databases, complemented by extensive contacts with data owners and nutrition experts worldwide to identify surveys with valid 24-h recall or food record data.The collaboration with FAO and EFSA advanced global coordination including sharing of expertise, experiences, methodological approaches and networks.Standardised methods, protocols and materials were developed for all steps of this effort in order to reduce the likelihood of error during data acquisition and achieve high-quality harmonisation.Individual-level dietary data at their finest level, as well as detailed socio-demographic information per participant, were collected and harmonised, which altogether capture the overall diet of the population, minimise errors in reporting and analysis and provide critical information on dietary heterogeneity within countries and populations.A common FCDS, FoodEx2, was used to address the challenge of food description and classification variations, improve data interpretability, allow for detailed analysis and enable comparability of estimated diet intakes within and between nations, further overcoming any language barriers.To ensure data integrity and maximise internal and external consistency, data owners were extensively trained on restructuring and harmonizing their datasets using FoodEx2 and rigorous data checks throughout the harmonisation process were performed.The outcome of this process, the harmonisation methods and final datasets were made publicly available and freely accessible.These can serve as a critical public resource for researchers, health agencies and governments to inform future dietary data collection efforts and promote global collaboration and capacity development.Such timely, reliable and comparable nutrition methods and data can be leveraged to understand, quantify and address the corresponding nutrition burden locally, nationally and across the world.
Potential limitations should be considered.Nationally representative 24-h recall/record data were less common in LIC and LMIC, requiring greater reliance on subnational surveys.Variability in primary data collection methods and details, as expected, precluded collecting all variables of interest, such as socio-demographics, dietary information and nutrients and leveraging all FoodEx2 facets for every food item.However, by using the rigorous protocols and materials developed, all available information in the primary datasets were converted or derived and standardised.While ultimately fifty-five surveys were included by the GDD for harmonisation, not all potentially eligible data owning institutions were able to participate.Interestingly, the majority of exclusions were due to nonresponse, no final decision, unwillingness to share their data, with only a few noting workload or resource challenges.This raises the question of whether strengthened global coordination, including prioritisation from national governments, could significantly increase such dietary harmonisation efforts with additional resource investment.
In summary, this effort demonstrates that new investments in institutional support and training can result in diverse global 24-h dietary recalls/records being harmonised to provide highly granular, standardised and publicly available dietary data from around the world.Such highquality comparable data can promote and support nutrition programming, research and capacity development worldwide.This work also identifies key gaps in 24-h recall data collection, facilitates and promotes country capacity development that can alleviate many of the barriers that prevent routine harmonised collection of 24-h recall data and highlights the importance of collecting comprehensive harmonised dietary data systematically in all nations.

Fig. 3
Fig. 3 Availability of harmonised dietary surveys per country.The map presents the 55 surveys, from 35 countries, that have agreed to the GDD harmonisation.Of these, the harmonisation has not been completed for three surveys (one in Brazil, one in Taiwan and one in the USA) due to time and resource restrictions

Fig. 4 .
Fig. 4. Number of unique food items per dietary survey by country income level.The top panel shows the distribution of surveys from low-income (LIC), lower-middle income (LMIC), upper-middle income (UMIC), and high-income countries (HIC) across 4 range categories of unique food items per survey.All 55 surveys included for harmonisation were used.The bottom panel shows the absolute number of unique food items per survey grouped by country income level.Boxplots are shown, with horizontal lines representing the median value; shaded bars representing the interquartile range (IQR); whiskers representing maximum (top) and minimum (bottom) value excluding outliers, x's representing the mean value and circles representing outliers.Only surveys whose harmonisation was complete were included (n 52 of 55) ). LIC and LMIC typically focused on energy, total protein,

Fig. 5
Fig. 5 Availability of energy and nutrients across harmonised dietary datasets, by survey representativeness.Only surveys whose harmonisation has been completed and they report nutrient intakes are presented (n 48 of 55).The figure presents the 40 nutrients requested; datasets may contain additional nutrients (e.g. total sucrose, retinol) that are not listed here The main search terms included: 'nutrition survey' OR 'diet survey' OR 'dietary survey' OR 'nutrient survey' AND '24-h recall' OR 'diet recall' OR 'food record' OR 'diet record' OR 'food diary' AND 'national' OR 'nationally representative' AND ['country of interest'].

Table 1
Characteristics and progress status of the surveys included for harmonisation

Table 2
Survey and dietary data characteristics of the included 24-h recall/record dietary surveys by country income category