Validity of a web-based dietary questionnaire designed especially to measure the intake of phyto-oestrogens

A diet questionnaire (DQ) designed to assess habitual diet and phyto-oestrogen intake was developed. This study aimed to examine the validity of the DQ in men, with and without having prostate cancer. The DQ was validated against alkylresorcinol metabolites measured in urine as objective biomarkers of whole grain wheat and rye (WG) intake, and a 4-d estimated food record (FR) was used for relative comparison. Participants (n 61) completed both methods and provided spot urine samples. We found a statistically significant correlation between the DQ and FR for reported whole grain intake and isoflavonoids, as well as for intake of macronutrients, except protein. The correlation coefficient between the two methods was on average r 0·30, lowest for lignans (r −0·11) and highest for alcohol (r 0·65). Reported energy intake was lower in the DQ compared with FR (8523 v. 9249 kJ (2037 v. 2211 kcal), respectively; P = 0·014). Bland–Altman plots showed an acceptable agreement; most cases were within the limits (95 % CI) of agreement on reported energy intake, as well as intake of macronutrients, except protein (which was underestimated in the DQ compared with the FR). The correlation of alkylresorcinol with WG intake was statistically significant in the DQ (r 0·31, P = 0·015), but not in the FR (r 0·18, P = 0·12) and the weighted κ was 0·29 and 0·11, respectively. In conclusion, the results showed that the DQ have a reasonable validity for measuring WG intake and most nutrients, and, after some adjustments regarding protein intake assessment have been made, the DQ will be a promising tool.

In nutrition research, various dietary components are explored in order to study relationships between dietary exposures and disease outcome. However, it is well known that the various subjective dietary assessment methodologies used in nutrition research have limitations, which we need to cater for. To be able to investigate the diet-disease association, the dietary assessment tool of choice needs to capture the dietary components of interest in a satisfactory manner.
In Europe, 417 000 new cases of prostate cancer were identified in 2012, and it is the most common cancer among men in Westernised countries (1) . Many appear to be willing to alter their diet in hope of decreasing the tumour growth rate, and we have previously shown that 50 % of all men with prostate cancer choose to self-medicate with a variety of dietary supplements upon no scientific basis (2) . In the absence of data with high-grade evidence on how the diet is affecting prostate tumour growth, no adequate dietary advice can be given to these patients.
A diet rich in phyto-oestrogens (e.g. soya beans, rye bran and flaxseeds) has been shown to reduce the risk of developing prostate cancer (3)(4)(5)(6)(7)(8) , especially among individuals with a particular genetic type of the oestrogen receptor-β gene (ER-β) (9) . ER-β has demonstrated a cancer-inhibiting effect and is expressed in normal prostate epithelium, but the level of expression decreases gradually during the development of prostate cancer. Since phyto-oestrogens are structurally similar to female sex hormones and bind to ER-β with high affinity, phyto-oestrogens and ER-β should be able to interact during the development of cancer (10,11) . This association needs to be further investigated in order to draw any conclusions on the effect of phyto-oestrogen intake and the progress of prostate cancer.
To meet the demand of convenient and updated tools for assessing dietary intake in large-scale studies, we developed a new web-based diet questionnaire (DQ) based on two previously validated paper-based questionnaires (12,13) . The questionnaire was designed to measure the habitual diet, and specifically the intake of phyto-oestrogen-rich foods, while at the same time covering the food items on the market today. The questionnaire is planned to be used in an intervention study to measure dietary intake in a group of men with prostate cancer, but also in the general population. Therefore, this study aimed to examine the validity of the questionnaire against a reference method (a 4-d diet record), in elderly men with and without prostate cancer. Also, alkylresorcinol metabolites measured in urine were used as objective markers of whole grain wheat and rye intake. To ensure that the questions in the DQ were correctly understood, a 'face-to-face' validation was conducted with the first ten study participants.

Study population
Two groups of men were recruited during the period of March 2011 to June 2011; one including men with prostate cancer from the Department of Urology, Sahlgrenska University Hospital, Gothenburg, Sweden and one including randomly selected men from the Swedish population register. The inclusion criteria were men aged 57-71 years from the healthcare region of Västra Götaland, Sweden. The exclusion criteria were: treatment with antibiotics during the last 3 months (due to the metabolism of phyto-oestrogens), other severe mental or physical illness, food allergy or inability to understand written Swedish. Men from the population register were excluded if they had ever been diagnosed with cancer.
All men with prostate cancer who met the inclusion criteria were informed and received written information about the study by the treating physician; fifty-one persons in total. In total, thirty-eight men agreed to participate.
A total of 125 randomly selected men from the population register were informed about the study by letter and later by telephone; forty-five men agreed to participate in the study. Of the initial eighty-three participants (both groups), sixty-one participants were eligible in the final analyses (Fig. 1).

Study protocol
All participants enrolled were provided with a urine sampling kit, written information about the study and a booklet to keep a food record (FR) in. Participants were scheduled for a visit at the study centre at the Department of Internal Medicine and Clinical Nutrition, Sahlgrenska Academy, Sweden and instructed on how to keep a FR and to collect a spot urine sample the morning after completing the FR. At the study centre all participants signed a written consent form and filled in the DQ. All FR were reviewed by the study dietitian during the visit to ensure that they were properly filled in, as was the DQ. The first ten study participants underwent a face-to-face interview to ensure that the DQ was perceived correctly. This study was conducted according to the guidelines laid down in the Declaration of Helsinki and all procedures involving human subjects were approved by the Gothenburg Regional Ethics Committee (no. 765-10).
Dietary assessments Diet questionnaire. Based on a previously developed and validated paper-based DQ that has been used in several studies, for example, the Swedish Obese Study and the Cancer of the Prostate in Sweden (CAPS) study (12,13) , we further developed the questions regarding food intake. The food items and dishes that are now included are more in line with current food trends and have a better coverage of phyto-oestrogen-rich foods.
The new web-based DQ was designed to capture eating habits during the past 3 months, as well as questions regarding lifestyle, such as physical activity, intake of food supplements, alcohol consumption, tobacco use and use of medication (i.e. antibiotics and diabetes medications). The questionnaire also included questions on socio-economic parameters and anthropometric measurements. An extended version of the questionnaire was developed for men with prostate cancer with questions regarding family history of prostate cancer, and if their lifestyle habits or intake of dietary supplements had changed after being diagnosed with cancer.
In total, 184 food items and complex dishes were included. The questions regarding dietary intake were divided into twelve food group categories as follows: sandwiches; porridge and muesli; nuts and seeds; drinks; vegetables and fruit; meat and meat products; fish and seafood; vegetarian dishes and eggs; potatoes, pasta and grains; and candy, cakes and snacks. Within each food group, the frequency of intake was listed first with optional answering frequencies divided into: daily; weekly; monthly or never. As a follow-up question, food items were specified within each food group category and ranked according to intake during the last ten eating occasions. For an example, see the nuts and seeds section of the questionnaire (Fig. 2). Meal sizes were calculated using standard portion sizes, based on information from the Swedish National Food Agency.

4-d Food record.
An estimated 4-d FR was completed as a reference method. Participants were asked to keep record of all foods and drinks consumed during four consecutive days, including a weekend day. Meal size was estimated by describing the amounts in grams or household measurements,  3 journals.cambridge.org/jns such as tablespoons, decilitres, glasses, bowls or pieces. Participants were also asked to fill in as detailed information as possible regarding the manufacturers' name (on pre-cooked dishes), cooking methods and fats used for cooking.
Energy and nutrient calculation. Intake of energy and specific nutrients, as well as phyto-oestrogens, was estimated by linking information from the DQ and 4-d FR to the Swedish National Food Agency's nutritional database (version 2010-08) (13) and to a previously developed phytooestrogen database (14) . Phyto-oestrogen exposure was categorised into: total lignan intake, which included intake (μg/d) of secoisolariciresinol, matairesinol, lariciresinol, isolariciresinol, pinoresinol, syringaresinol and medioresinol; total dietary isoflavonoid intake (μg/d), which included genistein, daidzein, formononetin, biochanin A and equol. For the DQ, the calculations were carried out with a specially developed calculation software. For the 4-d FR, the software Dietist XP version 3.1 was used (15) . The reported intakes of whole grain wheat and rye were calculated using a table presenting the whole grain content of various foods, provided by the Swedish National Food Agency. All individual nutrient values were calculated as average intake per d.

Biomarker
Urinary alkylresorcinol metabolites. All men collected spot urine samples (3 × 10 ml) after an overnight fast (22.00 hours) the morning after finishing the 4-d FR. Urine samples were stored in a household freezer until the visit at the study centre, and there it was stored at −80°C until analysis. The amounts of alkylresorcinol metabolites 3,5-dihydroxybenzoic acid (DHBA) and 3-(3,5-dihydroxyphenyl)-1-propanoic acid (DHPPA) in urine were analysed by a GC-MS method according to Marklund et al. (16) at the Swedish University of Agricultural Sciences (SLU), Department of Food Science, Uppsala. Creatinine was determined by a photometric method at the Sahlgrenska University Hospital, Laboratory Medicine, Clinical Chemistry.

Statistical analysis
Baseline characteristics for men with and without prostate cancer were compared using two-sided t tests for equal means for continuous, normally distributed variables, independentsamples Mann-Whitney U tests for non-normally distributed variables and χ 2 tests for categorical variables. The mean values and standard deviations, and median (25 and 75 percentile) of total energy and nutrient intake are presented. To compare unadjusted energy and nutrient intake estimated by the two methods, the 4-d FR and the DQ, the Wilcoxon signed-rank test was performed and Spearman correlation coefficients were calculated. Nutrient density was obtained by dividing the estimated nutrient intake (μg/d, mg/d, g/d) by total energy intake (MJ/d) (17) . Correlation analyses on urinary alkylresorcinol metabolites were adjusted for creatinine (alkylresorcinols μmol/l)/(creatinine mmol/l). Because most nutrients were not normally distributed, all variables were log transformed prior to further analyses. Pearson correlation coefficients were then used on energy-adjusted variables. Subgroup analyses were also conducted to evaluate the influence of other covariates, such as BMI, smoking status, intake of saturated fat, alcohol intake and age (cut-off set by median). The ability to rank individuals by reported intake of whole grain and phyto-oestrogens (using energy-adjusted values) was examined by dividing the study population into tertiles of dietary intakes, as for levels of alkylresorcinol metabolites. Through a cross-tabulation the Cohen's weighted κ was obtained. The absolute agreement between the two methods was evaluated with Bland-Altman plots (18) . The plot obtained illustrates the differences between the two nutrient intake measurements against the mean of both methods. A 95 % CI calculated as the mean difference ±1·96 SD enables the evaluation between the methods within the limits of agreement. All statistical analyses were two-sided with a significance level at α < 0·05. To achieve 80 % power to reject the null hypothesis if the true correlation coefficient between alkylresorcinol and whole grain intake is 0·5, the number of participants was calculated to be at least twenty-nine persons in each group (Pearson correlation, two-sided test, α = 5 %, z-approximation). Statistical analyses were performed using SPSS Statistics for Windows, version 20.0 (released 2011; IBM Corp.) and SAS 9.2 for Windows (SAS Institute, Inc.)

Results
Characteristics of the study participants are presented in Table 1. Mean age was 65·7 years and the proportion of obese individuals (BMI >30 kg/m 2 ) was 11·5 %. The proportion of current smokers was 10 % and the proportion of men with a higher degree of education corresponded to the general population in Sweden (19) , which was around 33 %. There were no statistically significant differences between the two groups of men in any of the background characteristic variables; hence all further analyses were performed with both groups combined.
Mean average daily intakes of dietary variables estimated by the two different methods (DQ and FR) are shown in Table 2. Reported energy intake was lower measured by the DQ compared with FR (8523 v. 9249 kJ/d (2037 v. 2211 kcal/d); P = 0·014, respectively). Of the macronutrients, the largest discrepancy was shown for protein where the reported mean daily intake was on average 20 g lower in the DQ compared with the FR. Reported intake of whole grains (wheat and rye) was higher in the DQ compared with the FR (48 v. 34 g; P < 0·001). Reported daily intake of total phyto-oestrogens was higher in the FR compared with the DQ (P < 0·001). Rye bread contributed most to the phyto-oestrogen intake for both methods (91 % of the intake in FR and 85 % in DQ) and the estimated median intake of rye bread was 70 g/d in FR and 41 g/d in DQ. Reported intake of phyto-oestrogens did not differ between men with or without prostate cancer estimated by either of the methods (data not shown) and nearly all phyto-oestrogens (99 %) were derived from lignans.
We found a statistically significant correlation between the two methods for reported whole grain wheat and rye intake, as well as for intake of macronutrients, except for protein ( Table 2). The lowest correlation coefficient was seen for lignans and the highest for alcohol intake. The correlation on isoflavonoid intake between the two methods was statistically significant (r 0·42; P < 0·001). The unadjusted nutrient intake correlation between the methods was on average r 0·29 (range −0·03 to 0·66) and the correlation coefficient was improved to r 0·30 (range −0·11 to 0·65) when energy-adjusted variables were used.
Bland-Altman plots (Fig. 3) shows an acceptable agreement: most cases were within the 95 % limits of agreement on reported energy intake ( Fig. 3(a)), as well as intake of carbohydrate ( Fig. 3(d)), fat (Fig. 3(c)) and whole grains (Fig. 3(e)). Reported protein intake was underestimated in the DQ compared with the FR (Fig. 3(b)). Reported intake of phytooestrogens showed larger discrepancy at higher intakes, where intakes were predominantly higher and the intake distribution was skewed in the FR (Fig. 3(f)).
We found a statistically significant correlation between reported intake of whole grains (wheat and rye) measured with the DQ and levels of alkylresorcinol metabolites in urine (r 0·31; P = 0·015), but not with the FR (r 0·18; P = 0·12) ( Table 3). Reported intake of phyto-oestrogens did not correlate with levels of alkylresorcinols. All analyses were also performed with Pearson partial correlations and adjusted for BMI, age, smoking status, fat and alcohol intake, but none of these variables affected the correlation coefficients significantly (data not shown).
Results of cross-tabulation between the methods shows that the proportion of individuals categorised in the same tertile ranged from 29·4 % for lignans, to 55·7 % for whole grains, and the weighted κ values were −0·04 and 0·37, respectively (Table 3). Agreement on isoflavonoid intakes placed 39·4 % in the same tertile, with a weighted κ value of 0·14. The ranking of whole grains (wheat and rye) against alkylresorcinols performed better in the DQ than FR, with a weighted κ of 0·29 and 0·11, respectively.

Discussion
In this study, the validity of a new web-based DQ was examined. Alkylresorcinol metabolites measured in urine were used as an objective biomarker of whole grain wheat and rye intake, and the results showed that the DQ had a satisfactory validity on whole grain wheat and rye assessment. Also, the DQ was in concordance with most nutrients compared with the FR. The questionnaire was perceived as easy to use and to understand according to the face-to-face validation.

Strengths and limitations
A limitation of the study is that no objective biomarkers of phyto-oestrogen intake were used. However, to our knowledge, there are no adequate reference methods or biomarkers for measuring intake of phyto-oestrogens. We have previously investigated the correlation between lignan intake assessed with an FFQ and serum enterolactone levels, and found no correlation (13) . Low correlations between lignan intake and serum enterolactone levels have also been shown elsewhere (20)(21)(22)(23) . These poor correlations are often attributed to the large individual variations in absorption, metabolism and   differences in the gut microflora (24)(25)(26) . We therefore chose to validate the whole grain (wheat and rye) component in the questionnaire, of which there is a valid biomarker (27,28) . Although the sample size was relatively small (sixty-one individuals), a sample size of at least fifty individuals would be considered as sufficient when using biomarkers in dietary validation studies (29) . In our study, the reference method of choice was a paperbased 4-d FR, because FFQ and FR have traditionally been considered to have unrelated measurement biases (30) .
Compared with a paper-based assessment method, the administration of a web-based questionnaire is easier and reduces costs in terms of data collection and management (31) (although errors in self-reported dietary intake measurement will occur in both instruments).
The cohort comprised men from the Swedish population register, as well as men having been diagnosed with prostate cancer, which enhances generalisation to the study population where the questionnaire is aimed to be used later. After conducting the face-to-face validation on the first ten study subjects, some small adjustments were made in the construction of the DQ to ensure that all questions were properly understood.

Energy and nutrient assessment
Although the mean energy intake in the DQ was lower than in the FR, it is still in line with what other dietary surveys have reported in similar populations using paper-based FR (32,33) and slightly higher than a study using an FFQ (34) . Reported energy intake was also consistent with the Swedish National Food Agency's nutrition survey 'Riksmaten' in 2010 for the same age group (35) . The lower energy intake was most likely to be derived from the low protein intake assessed in the DQ, and was demonstrated both by the low correlation coefficients against the reference method and by the Bland-Altman plots. It could be questioned if the portion sizes in the DQ on protein sources were too small, as standard portions would also be adapted for women. There might also be a lack of crucial protein sources in the food list. Our intention is to use the DQ in future studies and therefore we will identify and add important sources of protein to the food items list that we expect to be consumed in this population. Carbohydrate and fat intakes had statistically significant correlations between the methods and improved when energy-adjusted variables were used.
The largest discrepancy in reported intake was seen for lignans, which consequently also had the lowest correlation coefficient. By comparing intake of lignan-rich foods, which in both methods mainly consisted of rye bread and flaxseeds in this cohort, it is clear that the distribution for intake of these foods was highly skewed in the FR. This was demonstrated both by the large standard deviation in mean intake but also with the Bland-Altman plots with larger discrepancy at higher intake values, where intake was predominantly higher in the FR. The median intake of rye breads reported in the FR was also very high, 70 g/d in comparison with 43 g/d as displayed in the Riksmaten survey for men (35) . It could be that 4 d of recording is not long enough to be able to capture the habitual intake of rye bran and flaxseeds, and therefore a FR might not be a relevant measure for comparison of lignan intake. In a multicentre study, the European Prospective Investigation into Cancer and Nutrition (EPIC), phyto-oestrogen intake was estimated by a 24-h recall method. Among all study centres, the average intake was estimated to 2664 µg/d, and in the Swedish cohort to 1737-2089 µg/d (36) . These values are well in line with the phyto-oestrogen intake assessed with the DQ. Another FFQ, aiming to measure lignan intake, estimated the average intake to 1616 µg/d among Swedish women (23) . This is also well in line with the results from our DQ.
For isoflavonoids, the concordance between the DQ and the FR was better than for lignans. In the typical Swedish diet, major sources of lignan precursors are spread over more food items than sources of isoflavonoids, and are therefore more difficult to capture. For some food items, e.g. flaxseed and rye bran, not everyone may be aware of eating them because a wide variety of breads on the Swedish market contains some amount of rye brans and/or flaxseeds even though it might not be apparent. In contrast, it is more likely that a person is aware of having consumed a large amount of beans or soya products, which are rich sources of isoflavonoids, because they are not commonly used in Swedish products. Because of the large day-to-day variations in intake of foods rich in phyto-oestrogens, we believe that our DQ is more reliable for measuring phyto-oestrogen intake than a 4-d FR is. However, the DQ appears to be a better tool to measure intake of isoflavonoids than intake of lignans.
Bland-Altman plots on reported energy intake, as well as intake of carbohydrate, fat and whole grains, showed a good agreement between the methods, where most cases were within the limits of agreement.
Assessment of whole grain wheat and rye intakeobjective validation Alkylresorcinol metabolites 3,5-dihydroxybenzoic acid (DHBA) and 3-(3,5-dihydroxyphenyl)-1-propanoic acid (DHPPA) were measured in urine and analysed according to standard methods (16) . To minimise the burden on study participants only spot urine was collected, that is, one urine sample collected in the morning rather than a 24-h urine collection. Urinary creatinine was therefore used as an adjustment for the concentration of urine in the biomarker analysis. This, however, introduces a potential source of error since creatinine levels can vary among individuals on a day-to-day basis (37) . Both the individual variation of creatinine secretion but also of alkylresorcinol 8 journals.cambridge.org/jns metabolism can affect the outcome of the correlations. Alkylresorcinol metabolites measured in urine has the ability to reflect a short-term intake of whole grains (38) , and for that reason we expected urinary metabolites to correlate better with the FR because the urine sampling was conducted immediately after finishing the diet record. Our results showed the opposite, with a moderate yet statistical significant correlation between the reported intake of whole grains in the DQ against alkylresorcinols and a non-significant correlation against FR. When using concentration biomarkers as objective comparisons, a fully linear relationship is not to be expected (39) (as in the case of recovery biomarkers, e.g. the doubly labelled water method for energy intake). No previous validation study has used urinary alkylresorcinol metabolites before, but they have shown to be as valid as plasma alkylresorcinol homologues with the advantage that collecting fasting urine samples are less invasive than taking blood samples (40,41) . A previous, although small, study validating an FFQ specifically designed to measure whole grain cereal food intake in the diet, using plasma alkylresorcinols as the biomarker, displayed a correlation of 0·54 on total whole grain intake and plasma alkylresorcinols (42) . Former studies have found correlations between 0·25 to 0·34 on rye bread and plasma alkylresorcinols (43,44) , and other intervention studies between 0·35 and 0·58 for whole grain intake and plasma alkylresorcinols (45,46) . In nutrition epidemiology, dietary exposures are rarely assessed as absolute amounts but rather as the ranking of exposure, often in quartiles. Therefore, the ability to rank individuals on low, medium and high intake of whole grains was examined. According to the agreement with alkylresorcinols, a weighted κ coefficient of 0·29 was obtained which concludes that the ranking capacity can be considered as moderately good with the DQ (47) .
In conclusion, the results showed that the new DQ has a satisfactory validity for measuring intake of whole grains (wheat and rye) and most nutrients. Sources and portion sizes regarding protein intake need to be revised, but after these adjustments have been made, the DQ will be a promising tool.