Valid estimation of habitual food and nutrient intakes is fundamental to the study of relationships between diet and disease. Food-frequency questionnaires (FFQs) are commonly used to estimate dietary intakes in epidemiological studies because they are relatively inexpensive and easy to administer and analyse for large numbers of peopleReference Willett1. Epidemiological studies most commonly measure associations between dietary intake and disease risk via odds ratios or relative risks. Therefore, FFQs must be able to rank individuals reliably according to their nutrient intakes, so that those with low intakes can be differentiated from those with high intakes for any particular nutrient.
Because there is currently no ‘gold standard’ for measuring food and nutrient intakes, the relative validity of an FFQ is determined through comparison with other methods such as weighed food records (WFRs) or a series of 24-hour recalls, which have their own, but largely unrelated, errors of measurementReference Willett1. The most common means of assessing agreement between different methods has been to calculate correlation coefficientsReference Burley, Cade, Margetts, Thompson and Warm2. However, the calculation of weighted kappa values assessing agreement between tertiles, quartiles or quintiles of nutrient intakes has also been frequently usedReference Burley, Cade, Margetts, Thompson and Warm2. More recently, the Bland–Altman methodReference Bland and Altman3 has been suggested as a more appropriate way of assessing agreement between different dietary assessment tools, although this assertion has been challengedReference Masson, McNeill, Tomany, Simpson, Peace and Wei4. It has recently been suggested that in order to thoroughly validate an FFQ, a variety of statistical methods should be used, including correlations, weighted kappa values and the Bland–Altman methodReference Burley, Cade, Margetts, Thompson and Warm2.
Accurate assessment of habitual dietary intake is of importance because Australia and many other nations are experiencing an epidemic of diet-related conditions such as overweight and obesity, type 2 diabetes, cardiovascular disease and several major cancersReference O'Brien5–7. Research to date has identified relationships between excessive energy, saturated fat, alcohol and sodium intakes, and insufficient dietary fibre and antioxidant intakes, and many of these lifestyle-related diseases6. There is currently no evidence that high dietary carbohydrate intakes increase the risk of developing these diseases, independent of total energy intake6. There is growing evidence, however, that diets with either a high glycaemic index (GI) or a high glycaemic load (GL) are linked to the development of obesityReference Ma, Olendzki, Chiriboga, Hebert, Li and Li8–Reference Spieth, Harnish, Lenders, Raezer, Pereira and Hangen11, type 2 diabetesReference Salmeron, Ascherio, Rimm, Colditz, Spiegelman and Jenkins12–Reference Schulze, Liu, Rimm, Manson, Willett and Hu14, cardiovascular diseaseReference Oh, Hu, Cho, Rexrode, Stampfer and Manson15–Reference Amano, Kawakubo, Lee, Tang, Sugiyama and Mori17, and cancer of the colonReference Michaud, Fuchs, Liu, Willett, Colditz and Giovannucci18–Reference Franceschi, Dal Maso, Augustin, Negri, Parpinel and Boyle20 and breastReference Silvera, Jain, Howe, Miller and Rohan21–Reference Augustin, Dal Maso, La Vecchia, Parpinel, Negri and Vaccarella24.
The underlying mechanism linking high-GI/GL diets to the risk of developing these lifestyle-related diseases is thought to be the postprandial glycaemic response. Both the quality and the quantity of carbohydrate determine the postprandial glycaemic response to a food or mealReference Sheard, Clark, Brand-Miller, Franz, Pi-Sunyer and Mayer-Davis25. By definition, the GI compares equal quantities of available carbohydrate in foods and provides a measure of carbohydrate quality or glycaemic potential. The GL, on the other hand, is a function of a food’s GI and its total available carbohydrate content and is defined as: GL = GI (%) × carbohydrate(g). The higher the GL, the greater the expected elevation in postprandial blood glucose levelsReference Foster-Powell, Holt and Brand-Miller26.
To date, over 30 prospective cohort studies have investigated the link between GI, GL and the risk of developing chronic lifestyle-related disease using FFQs. Surprisingly, to our knowledge, none has directly assessed the ability of an FFQ to rank individuals accurately according to their GI and GL values. Instead, most have used total carbohydrate, dietary fibre and other carbohydrate fractions as surrogate measures. The lack of direct validation of these measures could be a significant cause of the widespread variation in associations found between GI, GL and disease risk, and deserves further investigation.
We aimed in the present study to assess the relative validity of the Blue Mountains Eye Study (BMES) FFQ by comparing it with the BMES WFR, to determine how well it ranked older Australians according to their mean daily GI, GL and carbohydrate intakes.
The methods used in the BMES have been described previouslyReference Attebo, Mitchell and Smith27, Reference Smith, Mitchell, Reay, Webb and Harvey28. Briefly, the present study concerns the baseline study (BMES I), which identified 4433 eligible non-institutionalised permanent residents, aged 49 years or more, in a door-to-door census conducted during 1991, of whom 3654 (82%) participated in detailed examinations during the period 1992–1994. The study was approved by both the Western Sydney Area Health Service and the University of Sydney Human Ethics Committees, and written informed consent was obtained from all participants. Of the 779 (18%) persons identified in the census who did not participate, 353 (8%) permitted only a brief interview, 148 (3%) refused, 210 (5%) had moved out of the area, and 68 (1.5%) had died before the examinations were conducted. Baseline differences between participants and non-participants were minimalReference Mitchell, Smith, Wang, Cumming, Leeder and Burnett29.
All study participants were invited to attend a local clinic for a medical history and examination, which included anthropometry, history of chronic lifestyle-related diseases and associated risk factors. Fasting pathology tests, including blood lipids and plasma glucose, were obtained for 88% of the 3654 residents at a second visitReference Mitchell, Smith, Wang, Cumming, Leeder and Burnett29.
The 145-item semi-quantitative FFQ was modified for the Australian diet and vernacular from an earlier FFQ of Willett et al.Reference Willett, Sampson, Browne, Stampfer, Rosner and Hennekens30, and included portion size estimates and additional questions about the type of breakfast cereals, to increase the accuracy of the GI estimates. The FFQ has previously been validated against a series of WFRs for most macro- and micronutrients, but not for GI or GLReference Mitchell, Smith, Wang, Cumming, Leeder and Burnett29, Reference Flood, Smith, Webb and Mitchell31. The FFQ was attempted by 3267 participants (89%), and 2868 of the returned questionnaires were usable (79% of those examined, 88% of those who attempted the FFQ). FFQs with more than 10 items missingReference Smith, Mitchell, Reay, Webb and Harvey28 or with implausible, extreme energy intakes (<2090 kJ/day or >16 720 kJ/day) were excluded. Respondents were asked about the foods eaten in the previous 12 months, and seasonal variation in fruit and vegetable intake was allowed for during analysis by weighting seasonal fruits and vegetables.
Subjects in the validation study
In 1994, a random selection of 186 BMES subjects, weighted to include more older people (65–85 years), were invited to take part in the validation study. Each subject was required to complete three 4-day WFRs approximately four months apart. Average time from completion of the first FFQ was 6 months. Of the 150 people who agreed to participate, 139 began recording food intake and 121 provided completed, usable first WFRs. Of these, 100 completed a second usable WFR, 4 months later, and 90 completed a third WFR, 8 months after the first. Of these, 12 were excluded because they either did not complete a second FFQ or the WFRs were incompleteReference Mitchell, Smith, Wang, Cumming, Leeder and Burnett29. A final total of 78 subjects (52% of those who agreed to participate) completed all three 4-day WFRs, which were included in the secondary analysis of GI, GL and carbohydrates.
A dietitian coded data from the semi-quantitative FFQ into a custom-built database (DBASE IV; Borland International Inc., 1991), which incorporated the Australian food composition tables (NUTTAB 90)32 plus published GI values using the glucose = 100 scaleReference Foster-Powell, Holt and Brand-Miller26. Additional GI data were obtained from the Sydney University Glycemic Index Research Service (SUGiRS) online database (www.glycemicindex.com). The same dietitian coded the WFR data into the FoodWorks (Xyris Software, 2006) dietary analysis software, using the same food composition tables and GI values. In total, 88.9% of the GI values were obtained direct from published values, while the remaining 11.1% were interpolated from similar food items.
The overall GI of each participant’s diet was calculated by summing the weighted GI of individual foods in the diet, with the weighting proportional to the contribution of individual foods to total carbohydrate intake. The GL of each food item was calculated by multiplying each food’s GI by the amount of available carbohydrate (g) per serving: GL = GI (%) × carbohydrate(g). Overall dietary GL was then determined as the product of the food’s GL and the participant’s frequency of consumption, summed over all foods.
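The calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the study's DBASE IV or FoodWorks implementation. The `FoodItem` class, the function names and the example diet are hypothetical, and "GI (%)" is read as GI divided by 100.

```python
from dataclasses import dataclass


@dataclass
class FoodItem:
    name: str
    gi: float                # glycaemic index on the glucose = 100 scale
    carb_g: float            # available carbohydrate per serving (g)
    servings_per_day: float  # reported frequency of consumption


def food_gl(item):
    """GL of one serving: GI (%) x available carbohydrate (g)."""
    return item.gi / 100.0 * item.carb_g


def dietary_gl(items):
    """Daily GL: per-serving GL times frequency, summed over all foods."""
    return sum(food_gl(i) * i.servings_per_day for i in items)


def dietary_gi(items):
    """Daily GI: mean of the food GI values, weighted by each food's
    contribution to total available carbohydrate intake."""
    total_carb = sum(i.carb_g * i.servings_per_day for i in items)
    if total_carb == 0:
        raise ValueError("diet contains no available carbohydrate")
    weighted = sum(i.gi * i.carb_g * i.servings_per_day for i in items)
    return weighted / total_carb


# Made-up illustrative diet; the values are not taken from NUTTAB or the study.
diet = [
    FoodItem("bread", gi=70, carb_g=15, servings_per_day=4),
    FoodItem("apple", gi=40, carb_g=15, servings_per_day=2),
    FoodItem("milk", gi=30, carb_g=12, servings_per_day=1),
]

print(round(dietary_gi(diet), 1), round(dietary_gl(diet), 1))
```

Under these definitions the dietary GL equals the dietary GI (%) multiplied by total available carbohydrate, so errors in either component carry through to the GL.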
Concurrent validity of the FFQ compared with the WFRs was assessed using four primary methods:
1. Comparison of group means.
2. Pearson product–moment and Spearman ranked correlations.
3. Bland–Altman limits of agreement (LOA)Reference Bland and Altman3, in which the mean difference between the two methods, i.e. FFQ and WFR, was calculated. The LOA define the limits within which 95% of these differences are expected to fall (mean ± two standard deviations of the differences). The differences between the two methods were plotted against the average of the two methods. Any dependency of the differences on the magnitude of intake was tested by fitting the regression line of the differences on the averages (H0: β = 0, α = 0.05); ideally, if the two methods are equally variable, the correlation between the differences and the averages would equal zero. Natural log (ln) transformation was performed since the dietary data were skewed, as recommended by Bland and AltmanReference Bland and Altman3.
4. Joint classification of nutrient intake by the FFQ and the average of the three WFRs was assessed using quintiles of intake for each nutrient from the FFQ and WFR, respectively. An individual was considered grossly misclassified when one dietary assessment method classified the individual’s intake into the lowest quintile and the other method classified it into the highest quintile. Cohen’s weighted kappa values were calculated comparing quintiles of intake for each nutrient from the FFQ and WFRReference Fleiss33.
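As an illustration of the third method, the sketch below computes the mean difference, the limits of agreement and the slope of the differences on the averages for ln-transformed paired intakes. It is a hedged example using only the Python standard library, not the SPSS procedure used in the study. The function name and the sample values are hypothetical, and the back-transformed ratio is a common convention on the ln scale that the text does not report explicitly. The significance test of the slope is omitted because it needs a t distribution, which the standard library does not provide.

```python
import math
from statistics import mean, stdev


def bland_altman(ffq, wfr):
    """Bland-Altman summary for paired, positive intakes on the ln scale."""
    if len(ffq) != len(wfr) or len(ffq) < 3:
        raise ValueError("need at least three paired observations")
    ln_ffq = [math.log(x) for x in ffq]
    ln_wfr = [math.log(y) for y in wfr]
    diffs = [a - b for a, b in zip(ln_ffq, ln_wfr)]      # FFQ minus WFR
    avgs = [(a + b) / 2 for a, b in zip(ln_ffq, ln_wfr)]

    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD (n - 1 denominator)

    # Least-squares slope of the differences on the averages; a slope near
    # zero suggests the disagreement does not depend on the level of intake.
    avg_bar = mean(avgs)
    sxx = sum((a - avg_bar) ** 2 for a in avgs)
    if sxx == 0:
        raise ValueError("averages are all identical; slope is undefined")
    sxy = sum((a - avg_bar) * (d - mean_diff) for a, d in zip(avgs, diffs))

    return {
        "mean_diff": mean_diff,
        "loa_lower": mean_diff - 2 * sd_diff,
        "loa_upper": mean_diff + 2 * sd_diff,
        "ratio": math.exp(mean_diff),  # back-transformed FFQ/WFR ratio
        "slope": sxy / sxx,
    }


# Made-up daily GL values for five hypothetical subjects.
result = bland_altman([120.0, 150.0, 98.0, 175.0, 133.0],
                      [110.0, 139.0, 101.0, 150.0, 128.0])
print(result)
```

A positive slope in this sketch corresponds to the pattern in which the FFQ exceeds the WFR by more at higher intakes.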
For all methods, all nutrients were adjusted for total energy using the residual methodReference Willett and Stampfer34. Energy adjustment was done separately for each FFQ and the average of the three WFRs.
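The residual method can be sketched as follows: the nutrient is regressed on total energy, and each person's residual is added to the intake predicted at the mean energy intake, so the adjusted values stay on the original scale. This is a minimal illustration under those assumptions. Adding the constant follows the usual description of the method rather than anything stated in the text, and the function name and sample values are hypothetical.

```python
from statistics import mean


def energy_adjust(nutrient, energy):
    """Energy-adjusted intakes by the residual method (simple linear regression)."""
    if len(nutrient) != len(energy) or len(energy) < 2:
        raise ValueError("need at least two paired observations")
    e_bar, n_bar = mean(energy), mean(nutrient)
    sxx = sum((e - e_bar) ** 2 for e in energy)
    if sxx == 0:
        raise ValueError("energy intakes are all identical")
    slope = sum((e - e_bar) * (n - n_bar) for e, n in zip(energy, nutrient)) / sxx
    intercept = n_bar - slope * e_bar

    # Intake predicted for a person with the mean energy intake.
    expected_at_mean = intercept + slope * e_bar

    # Residual (observed minus predicted) plus the constant above.
    return [n - (intercept + slope * e) + expected_at_mean
            for n, e in zip(nutrient, energy)]


# Made-up carbohydrate (g/day) and energy (kJ/day) values for four subjects.
adjusted = energy_adjust([180.0, 190.0, 260.0, 210.0],
                         [6000.0, 8000.0, 10000.0, 9000.0])
print([round(a, 1) for a in adjusted])
```

In the study the adjustment was made separately for the FFQ and for the average of the three WFRs, which in this sketch would mean calling `energy_adjust` once per method.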
All statistical analyses were performed using SPSS version 14.0.0 (SPSS Inc., 2002).
Individuals who completed a usable FFQ were, on average, 1 year younger than the study population as a whole, but they were no more likely to have serious eye disease (data not shown). The proportion of men (45%) and women (55%) participating in the validation and BMES studies was similar. By design, those participating in the validation study were older than those participating in the BMES study, by a mean difference of approximately 5 years (age 65 years in BMES and 70 years in the validation study; P < 0.0001).
Comparison of group means
The mean daily intakes of carbohydrate fractions, GI and GL for the average of the three 4-day WFRs and the FFQ are shown in Table 1. With the exception of starch and GI, the FFQ gave higher mean estimates than the WFR for all of the nutrients. With the exception of GL, all differences were statistically significant (Wilcoxon signed-rank test: carbohydrate, P = 0.012; sugar, P < 0.0001; starch, P < 0.0001; fibre, P < 0.0001; GI, P < 0.0001; GL, P = 0.3327).
Notes to Table 1: LOA, limits of agreement. * Energy-adjusted using the residual method34 and ln-transformed. † One dietary method classifies intake into the bottom quintile; the other method classifies intake into the top quintile.
Pearson product–moment correlation and Spearman rank correlation (crude, adjusted for energy, ln-transformed and deattenuated) for comparisons of nutrient intakes from the FFQ and the three 4-day WFRs are shown in Table 1. All of the nutrients had deattenuated Pearson or Spearman correlations greater than 0.5, with the exception of GL. The correlation between total carbohydrate and GL was 0.92 for the FFQ and 0.89 for the WFR.
Bland–Altman analyses were performed for GI, GL and all carbohydrate fractions. Figures 1 and 2 illustrate the findings for GI and GL, respectively (other carbohydrate fractions not illustrated). For GI, sugar, starch and fibre, the regression lines indicated a non-significant linear trend (P = 0.07, P = 0.360, P = 0.277 and P = 0.097, respectively). For GL and total carbohydrate, however, the regression lines indicated a significant linear trend (P = 0.006 and P < 0.0001, respectively), suggesting dependency existed between the difference of the two methods and the average of the two methods; as individuals’ GL and total carbohydrate increased, the magnitude of the error between the FFQ and WFR increased.
Classification into categories of consumption
The proportion of subjects classified into the same or an adjacent quintile by both methods was over 50% for all of the nutrients (Table 1). Gross misclassification was relatively low, at less than 10% for all nutrients. All weighted kappa values indicated moderate to good agreement, with the exception of that for GL, which indicated only fair agreement.
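The quantities summarised above can be illustrated with a short sketch that assigns rank-based quintiles, counts agreement within one quintile and gross misclassification, and computes a weighted kappa. Linear disagreement weights are an assumption here, since the text does not state the weighting scheme. Breaking ties by position is a simplification, and the function names and data are hypothetical.

```python
def categories(values, k=5):
    """Assign each value a category 0..k-1 by rank (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    cats = [0] * len(values)
    for rank, idx in enumerate(order):
        cats[idx] = rank * k // len(values)
    return cats


def cross_classify(ffq, wfr, k=5):
    """Agreement between two methods after classification into k categories."""
    if len(ffq) != len(wfr) or len(ffq) < k:
        raise ValueError("need paired data with at least k observations")
    qa, qb = categories(ffq, k), categories(wfr, k)
    n = len(qa)
    within_one = sum(abs(a - b) <= 1 for a, b in zip(qa, qb)) / n
    gross = sum(abs(a - b) == k - 1 for a, b in zip(qa, qb)) / n

    # Observed joint proportions and their marginal totals.
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(qa, qb):
        obs[a][b] += 1 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    # Weighted kappa with linear disagreement weights |i - j| / (k - 1).
    d_obs = sum(abs(i - j) / (k - 1) * obs[i][j]
                for i in range(k) for j in range(k))
    d_exp = sum(abs(i - j) / (k - 1) * row[i] * col[j]
                for i in range(k) for j in range(k))
    return {"within_one": within_one, "gross": gross,
            "weighted_kappa": 1 - d_obs / d_exp}


# Made-up intakes for ten hypothetical subjects.
ffq = [210.0, 180.0, 250.0, 190.0, 300.0, 160.0, 230.0, 270.0, 200.0, 240.0]
wfr = [190.0, 185.0, 230.0, 170.0, 260.0, 175.0, 240.0, 235.0, 205.0, 215.0]
print(cross_classify(ffq, wfr))
```

A weighted kappa of 1 indicates identical classification by both methods, and values near 0 indicate agreement no better than chance.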
This study assessed, using different statistical methods, the validity of a 145-item semi-quantitative FFQ in a representative sample of older Australians, by comparing carbohydrate intakes, GI and GL values obtained from the FFQ with those derived from three 4-day WFRs. Comparison of group means, correlation coefficients, Bland–Altman analyses and weighted kappa values showed that this FFQ is capable of ranking individuals according to their total carbohydrate, sugar, starch and dietary fibre intakes and their dietary GI, but ranks them less well according to their GL. The FFQ had a tendency to overestimate total carbohydrate, sugar, dietary fibre and GL, but not starch and GI. For GL and total carbohydrates, the overestimation was greatest in those with higher intakes.
The BMES FFQ is subject to errors common to these kinds of tools, namely the reliance on long-term memory, a relatively restricted list of foods, interpretation of frequencies and average serving sizes, and a poor ability of some individuals to estimate and describe their usual food intake. While WFRs do not rely on memory, limited food lists or average serving sizes, they do place a substantial burden on individuals and their families, which may inadvertently affect usual food intake. In addition to this, food intake may vary considerably over the course of a week, month and even more so over a yearReference Willett1. However, our analyses demonstrate that for total carbohydrate, sugar, starch, dietary fibre and GI, the estimates of both methods were sufficiently similar to indicate an ability to rank individuals reliably according to their habitual intakes of these essential nutrients.
The BMES FFQ was not originally designed to measure differences in the GI of foods, and the GI values of certain foods like breads and breakfast cereals are very brand-specific. However, Australia has a more extensive GI database than most other countriesReference Foster-Powell, Holt and Brand-Miller26, minimising this potential source of error. It is likely that the errors associated with estimating the GI, combined with those associated with estimating total carbohydrate, were compounded, resulting in the relatively poor ability of the FFQ to estimate GL.
The generalisability of the results of this validation study to the whole BMES population is considered reasonable. A slightly older age range was deliberately selected for the validation study because we were mainly interested in age-related diseases less common in younger people and we wanted to be sure of a valid instrument for those individuals who were likely to become cases in our future prospective analyses. The participation rate in the validation study was not high, but acceptable, and the loss to follow-up was not unduly large for such a long-term study with high subject burdenReference Mitchell, Smith, Wang, Cumming, Leeder and Burnett29.
The results of this validation study have important implications for studies that have used FFQs to investigate associations between dietary carbohydrates, GI, GL and risk of chronic lifestyle-related disease. To our knowledge, none published to date has directly assessed the ability of its FFQ to rank individuals accurately according to their GI and GL values. Instead, the vast majority of studies have used the correlation coefficient for total carbohydrate, and occasionally other carbohydrate fractions, as surrogates.
Brunner et al.Reference Brunner, Stallone, Juneja, Bingham and Marmot35 have suggested that a value ‘of about 0.5 for most nutrients and 0.8 for alcohol between methods is good evidence that the FFQ has the ability to rank individuals … according to nutrient intake’. Indeed, of the 34 prospective cohort studies investigating the link between dietary carbohydrates, GI, GL and chronic-disease risk published by mid-2006Reference Salmeron, Ascherio, Rimm, Colditz, Spiegelman and Jenkins12, Reference Salmeron, Manson, Stampfer, Colditz, Wing and Willett13, Reference Oh, Hu, Cho, Rexrode, Stampfer and Manson15, Reference Michaud, Fuchs, Liu, Willett, Colditz and Giovannucci18, Reference Higginbotham, Zhang, Lee, Cook, Giovannucci and Buring19, Reference Silvera, Jain, Howe, Miller and Rohan21, Reference Hodge, English, O'Dea and Giles36–Reference Mayer-Davis, Dhawan, Liese, Teff and Schulz63, 24 (70%) have correlations for total carbohydrate that meet this target; as did our own FFQ. While the correlation between our FFQ and WFR was similarly acceptable for GI, it was not for GL, and this was confirmed by further statistical analysis. Therefore, our data suggest that just because an FFQ has a correlation for total carbohydrate with a WFR of ≥0.5, we cannot automatically assume that it is able to rank individuals according to their GI and GL. Its ability to do so needs to be assessed independently, using a variety of different statistical methodsReference Burley, Cade, Margetts, Thompson and Warm2.
Of concern, fourReference Silvera, Jain, Howe, Miller and Rohan21, Reference Terry, Jain, Miller, Howe and Rohan48, Reference Silvera, Rohan, Jain, Terry, Howe and Miller51, Reference Silvera, Rohan, Jain, Terry, Howe and Miller53 of the 34 studies (12%) investigating the relationship between dietary carbohydrates, GI, GL and chronic disease risk were not validated for any available carbohydrate fraction, let alone GI or GL. All four of these studies were based on the Canadian National Breast Screening Study. The self-administered dietary questionnaire used in this study was validated for a variety of nutrients that at the time were thought to be related to cancer aetiology including fats, protein and a number of vitamins, with generally acceptable results. However, available carbohydrate was not considered to be a risk factor, and as a consequence was not included in the validation processReference Jain, Harrison, Howe and Miller64. A further six (18%) studiesReference Meyer, Kushi, Jacobs, Slavin, Sellers and Folsom39, Reference Stevens, Ahn, Juhaeri, Houston, Steffan and Couper40, Reference Nielsen, Olsen, Christensen, Overvad and Tjonneland46, Reference Silvera, Rohan, Jain, Terry, Howe and Miller53, Reference Michaud, Liu, Giovannucci, Willett, Colditz and Fuchs56, Reference Mayer-Davis, Dhawan, Liese, Teff and Schulz63 had correlation coefficients for total carbohydrates less than 0.5. The ARIC (Atherosclerosis Risk in Communities) studyReference Stevens, Ahn, Juhaeri, Houston, Steffan and Couper40 and the pancreatic cancer study of Michaud et al.Reference Michaud, Liu, Giovannucci, Willett, Colditz and Fuchs56 used the original 61-item FFQ of Willett et al., which had an energy-adjusted correlation coefficient for total carbohydrate of 0.45 with a WFRReference Willett, Sampson, Stampfer, Rosner, Bain and Witschi65. 
The breast cancer study of Nielsen et al.Reference Nielsen, Olsen, Christensen, Overvad and Tjonneland46 used an FFQ with an energy-adjusted correlation coefficient for total carbohydrate of 0.47Reference Tjonneland, Overvad, Haraldsdottir, Bang, Ewertz and Jensen66. The Insulin Resistance Atherosclerosis Study, reported by Liese et al.Reference Liese, Schulz, Fang, Wolever, D'Agostino and Sparks62 and Mayer-Davis et al.Reference Mayer-Davis, Dhawan, Liese, Teff and Schulz63, used an FFQ with an energy-adjusted correlation of only 0.37Reference Mayer-Davis, Vitolins, Carmichael, Hemphill, Tsaroucha and Rushing67. Finally, the FFQ used in the type 2 diabetes study of Meyer et al.Reference Meyer, Kushi, Jacobs, Slavin, Sellers and Folsom39 had an energy-adjusted correlation of 0.45Reference Munger, Folsom, Kushi, Kaye and Sellers68. Despite this, all these authors suggested that their FFQs were able to rank individuals according to their GI and GL, and drew conclusions about the effect of GI and/or GL on disease risk on the basis of this assumption which, according to our own analysis, may be unjustifiedReference Barclay and Brand-Miller69.
Our validation study of dietary carbohydrates, GI and GL has found that a Willett-derived FFQ used in the BMES can rank individuals according to their intakes of total carbohydrate, sugar, starch, fibre and GI, but less well for GL. To date, studies investigating the link between habitual intake of glycaemic carbohydrate and chronic disease risk have not directly assessed whether their FFQ is able to adequately rank individuals according to their GI and GL, with the majority relying on surrogate measures like correlations with total carbohydrate. Our study has demonstrated that this assumption is not entirely supported, and may have led to some erroneous conclusions. More comprehensive validation techniques that include direct assessment of an FFQ’s ability to rank individuals according to their carbohydrate fractions, GI and GL will improve the quality of evidence investigating the effect of these factors on chronic disease risk.
Competing interests: Nil identified.