Research investigating links between intake of specific foods and health requires accurate assessment of dietary exposure(Reference Jenab, Slimani and Bictash1). Conventional methods of measuring dietary exposure such as FFQ(Reference Kristal, Peters and Potter2, Reference Willett and Hu3) depend upon estimates of food intake and are subject to well-recognised errors, derived largely from participant bias, which can confound interpretation of subsequent data(Reference Bingham4, Reference Bingham, Gill and Welch5). To address this problem, recent studies have described the targeted analysis, in blood and urine samples, of specific nutrients and metabolites derived from key foods that may have value as direct biomarkers of dietary exposure. In addition, quantification of biomarker concentrations in accessible biofluids can be used to help validate intake data obtained from FFQ and other conventional assessment methods(Reference Holmes, Powell and Campos6–Reference Talegawkar, Johnson and Caritbers10), which is an important aspect of study design(Reference Bingham, Gill and Welch5, Reference Cade, Thompson and Burley11, Reference Jia, Craig and Aucott12). To date, putative biochemical markers are available for only a relatively small number of specific foods and food components, and validation of food intake using conventional dietary assessment instruments in large cohorts of free-living participants remains a significant challenge(Reference Spencer, Mohsen and Minihane13). For example, a large number of studies have proposed that the antioxidant properties of dietary polyphenols from fruits and vegetables may protect consumers against several diseases(Reference Mennen, Sapinho and Ito7, Reference Spencer, Mohsen and Minihane13). 
However, as a result of substantial metabolism after ingestion(Reference Bredsdorff, Nielsen and Rasmussen14, Reference Del Rio, Costa and Lean15), it can be technically challenging to use the levels of specific secondary metabolites as an accurate estimate of dietary intake of purportedly beneficial foods. To address this issue, it has recently been proposed that the comprehensive chemical analysis of accessible human biofluids, using metabolomics methodology, may provide more suitable dietary intake biomarker leads(Reference Favé, Beckmann and Draper16–Reference Walsh, Brennan and Malthouse20). Methods utilising NMR(Reference Holmes, Loo and Stamler17, Reference Walsh, Brennan and Malthouse20) and particularly MS(Reference Favé, Beckmann and Draper16, Reference Llorach, Urpi-Sarda and Jauregui18, Reference Scalbert, Brennan and Fiehn19, Reference Beckmann, Parker and Enot21) are now implemented relatively routinely, and certain metabolite fingerprinting techniques are becoming high throughput, with the potential for automation(Reference Beckmann, Parker and Enot21). Recently, we have developed volunteer handling, data normalisation and non-targeted metabolite fingerprinting methodology to aid measurement of metabolome changes in urine and other biofluids in response to acute dietary interventions(Reference Favé, Beckmann and Draper16, Reference Favé, Beckmann and Lloyd22). Based on these studies that used flow injection electrospray-ionisation MS (FIE-MS)(Reference Beckmann, Parker and Enot21) in conjunction with supervised multivariate data classification analysis(Reference Enot, Lin and Beckmann23) and electrospray ionization-MS signal annotation tools(Reference Draper, Enot and Parker24), we describe an approach to validate the use of FFQ dietary component descriptors without the need for prior knowledge of putative biochemical markers indicative of exposure to specific dietary components. 
As an example, we demonstrate that FFQ estimates of citrus exposure in small groups of free-living human subjects were correlated with distinct quantitative differences in the urine metabolome that are related to metabolites found in citrus fruits(Reference Mennen, Sapinho and Ito7, Reference Bredsdorff, Nielsen and Rasmussen14, Reference Brett, Hollands and Needs25, Reference Kawaii, Tomono and Katase26). The utility of these potential biomarkers, discovered by non-targeted fingerprinting, is compared with dietary exposure estimates made by the targeted analysis of urine samples for metabolites derived from the abundant polyphenols found in oranges(Reference Major, Williams and Wilson27) and proposed previously as putative biomarkers for citrus exposure(Reference Mennen, Sapinho and Ito7, Reference Spencer, Mohsen and Minihane13, Reference Bredsdorff, Nielsen and Rasmussen14, Reference Brett, Hollands and Needs25, Reference Kawaii, Tomono and Katase26). Additionally, to develop metabolomic procedures suitable for future epidemiological studies, we demonstrate the value of overnight and, particularly, fasting urine samples for the discovery of potential chemical biomarkers indicative of habitual dietary exposure.
Ethical approval and subject recruitment
The present study was approved by the Newcastle and North Tyneside 2 Research Ethics Committee (reference no. 07/H0907/136) and registered with the Newcastle upon Tyne Hospitals NHS Foundation Trust (registration no. 4392). The present study was conducted according to the guidelines laid down in the Declaration of Helsinki, and all procedures involving human subjects were approved by the Newcastle and North Tyneside 2 Research Ethics Committee. Written informed consent was obtained from all subjects after a detailed explanation of the study protocol at an induction visit to the Clinical Research Facility (CRF) (Royal Victoria Infirmary, Newcastle upon Tyne, UK). The project constitutes part of the Metabolomics to characterise Dietary Exposure (MEDE) research programme(Reference Favé, Beckmann and Draper16), which aimed to develop a standardised protocol for nutritional metabolomics investigations(Reference Favé, Beckmann and Lloyd22). In the present study, study 1 participants were sampled during phase 2 of the MEDE project and study 2 participants were sampled during phase 3 of the MEDE project(Reference Favé, Beckmann and Draper16). The volunteers were recruited through word of mouth and by advertisement in Newcastle University, UK. They were assessed for suitability via a screening questionnaire, which included the following exclusion criteria: aged under 18 years; for women being premenopausal; having a BMI < 18·5 kg/m2 or >30 kg/m2; being a smoker, non-milk drinker and/or non-fish eater; having a history of substance abuse or alcoholism (alcohol consumption higher than 30 units/week); being allergic to any test food; suffering from any significant health problem and/or planning to change dietary or physical activity habits. Demographic data for each study participant are presented in Table S1 of the supplementary material (available online at http://www.journals.cambridge.org/bjn).
Levels of habitual exposure to dietary citrus foods based on FFQ information
Habitual diet was characterised using the validated FFQ employed by the European Prospective Investigation into Cancer and Nutrition study(Reference Bingham, Gill and Welch28), which was modified slightly to include foods consumed frequently in the North East of England. The detailed study design and protocols will be published elsewhere(Reference Favé, Beckmann and Lloyd22), and a detailed standard operating procedure is available on the NuGO website (http://www.nugo.org/sops/40 878/41 026). Volunteers were classified into three levels (low, medium and high) of habitual exposure to dietary citrus foods based on the analysis of FFQ information (Table 1). Exposure ratings for three specific food groups(Reference Bingham, Gill and Welch28) (see Table S2 of the supplementary material, available online at http://www.journals.cambridge.org/bjn) were combined to provide estimates of total ‘citrus’ intake. Individuals in the ‘low citrus’ exposure category consumed citrus food products fewer than two times per week, those with ‘medium’ exposure levels consumed citrus foods almost every day, and those with ‘high’ intakes consumed two to three citrus portions/d.
* The scoring system is described in Table S1 of the supplementary material (available online at http://www.journals.cambridge.org/bjn).
† Sum of columns 2, 3 and 4 per volunteer.
Sample collection and acute exposure study design
Study 1 volunteers (twelve individuals) attended two identical test days in the CRF, which were held several months apart. Volunteers were asked to collect all urine samples produced after consumption of a standardised evening meal, up to and including the morning void before attending the CRF, identified as the ‘PRE’ sample. On each study day, volunteers came to the CRF after a 12 h minimum fast, and ‘fasting’ urine samples were collected. Volunteers received a standardised test breakfast, and further urine samples were collected after 2, 4, 6 and 8 h. The test breakfast consisted of 200 ml orange juice, 190 ml tea with 14 ml skimmed milk and 12 g sugar, a 35 g butter croissant and 25 g cornflakes with 125 g semi-skimmed milk (1·7 % fat) (see Favé et al. (Reference Favé, Beckmann and Lloyd22) for full details of all food items). A standardised light lunch, provided 4 h after the breakfast, consisted of two poached free-range eggs (approximately 2 × 50 g), two slices of sliced white bread (2 × 36 g) and 500 ml still mineral water (see Favé et al. (Reference Favé, Beckmann and Lloyd22) for full details of all food items). Study 2 volunteers (eleven individuals) attended six identical test days in the CRF, which were held at least 1 week apart, over the duration of a year. Only ‘PRE’ and ‘fasting’ samples were collected. Urine samples were frozen immediately at − 20°C and moved to − 80°C within 24 h(Reference Favé, Beckmann and Lloyd22).
Metabolite fingerprinting and data analysis – feature selection
FIE-MS was carried out as described previously(Reference Beckmann, Parker and Enot21, Reference Favé, Beckmann and Lloyd22). Aliquots of thawed urine samples (50 μl) were diluted in 450 μl of pre-chilled methanol–water (3·5:1), vortexed, shaken for 15 min at 4°C and then centrifuged for 5 min at 14 000 g. For each urine sample, data were acquired in alternating positive and negative ionisation modes over four scan ranges (mass:charge ratio (m/z) 15–110, 100–220, 210–510 and 500–1200), with an acquisition time of 5 min, on a linear trap quadrupole linear ion trap (Thermo Electron Corporation, San Jose, CA, USA). The resulting mass spectrum was the mean of twenty scans about the apex of the infusion profile. Raw data dimensionality was reduced by electronically extracting signals with ± 0·1 Da mass accuracy, which resulted on average in one signal per mass bin (in the following referred to as a ‘nominal mass’). Data were log transformed and normalised to total ion current before data analysis(Reference Beckmann, Parker and Enot21).
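The data-reduction steps described above (binning to nominal mass, normalisation to total ion current and log transformation) can be illustrated with a minimal Python sketch. It is not the FIEmspro implementation: the unit-width bins centred on integer masses, the order of normalisation and log transformation, the log offset and the example peak list are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_to_nominal_mass(peaks):
    """Sum (m/z, intensity) pairs into unit-width bins centred on integer masses."""
    bins = defaultdict(float)
    for mz, intensity in peaks:
        bins[math.floor(mz + 0.5)] += intensity
    return dict(bins)


def normalise_to_tic(bins):
    """Express each bin as a fraction of the total ion current (sum of all bins)."""
    tic = sum(bins.values())
    return {mz: value / tic for mz, value in bins.items()}


def log_transform(bins, offset=1e-6):
    """log10 transform; the small offset (an assumption) avoids log(0)."""
    return {mz: math.log10(value + offset) for mz, value in bins.items()}


# Hypothetical peak list for one averaged spectrum: (m/z, intensity).
peaks = [(144.10, 100.0), (144.12, 50.0), (166.08, 50.0)]
binned = bin_to_nominal_mass(peaks)
normalised = normalise_to_tic(binned)
fingerprint = log_transform(normalised)
```

Each fingerprint is then a vector of log-scaled, normalised intensities indexed by nominal mass, which is the form assumed by the classification steps that follow.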
Comprehensive data mining was carried out following the FIEmspro workflow validated previously in Aberystwyth(Reference Enot, Lin and Beckmann23) and accessible at http://users.aber.ac.uk/jhd/. Principal component (PC) analysis transformed the sample data matrix into a coordinate system, where each new projection (PC) is a linear combination of the original variables, thus reducing data dimensionality. This was followed by supervised PC-linear discriminant analysis in which plots of the first two discriminant functions allowed visualisation of the goodness of class separation. Eigenvalues (Tw), defined as the ratio of the between- and within-group standard deviations of the discriminant variables, were used to evaluate the performance of PC-linear discriminant analysis; discrimination was considered good for eigenvalues of >2·0 and poor for eigenvalues < 1·0(Reference Enot, Lin and Beckmann23). Random forest (RF), an ensemble classifier that consists of many decision trees and outputs the class that is the mode of the classes output by individual trees, was also employed in the analysis of the multivariate data. In RF, each tree is trained on a bootstrap sample of the training data, and predictions are made by a majority vote of the trees. At each node, the best split is chosen from a random subset of the variables rather than from all variables. To assess the classification performance, the RF classification margin, defined as the proportion of votes for the correct class minus the maximum proportion of votes for the other classes, was used. Average margin values ≥ 0·3 indicate adequate classification in metabolomic experiments(Reference Enot, Lin and Beckmann23).
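The RF classification margin defined above can be written out directly. The following Python sketch computes the margin for one sample from per-class tree votes and averages it over samples; the function names and the vote counts are hypothetical and serve only to show the arithmetic.

```python
def classification_margin(votes, true_class):
    """Proportion of votes for the correct class minus the largest
    proportion of votes for any other class.

    votes: dict mapping class label -> number of trees voting for it.
    """
    total = sum(votes.values())
    correct = votes.get(true_class, 0) / total
    others = [n / total for label, n in votes.items() if label != true_class]
    return correct - max(others, default=0.0)


def average_margin(samples):
    """Mean margin over (votes, true_class) pairs."""
    return sum(classification_margin(v, t) for v, t in samples) / len(samples)


# Hypothetical votes from a 1000-tree forest for two urine samples.
samples = [
    ({"high": 800, "low": 200}, "high"),  # margin 0.6
    ({"high": 450, "low": 550}, "high"),  # margin -0.1 (misclassified)
]
mean_margin = average_margin(samples)
```

A margin is positive when the forest favours the correct class and negative when it does not, so an average of at least 0·3 corresponds to trees agreeing on the correct class by a clear majority.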
Feature selection techniques were used to select the nominal mass signals (m/z) that were responsible for discriminating between different sample classes. A combination of three methods (RF, area under the receiver-operating characteristic curve (AUC) and Welch's t test) was used in feature selection to produce a full feature rank list based on their statistical score values(Reference Enot, Lin and Beckmann23). RF feature selection was obtained by calculating importance scores, being the mean decrease in accuracy over all classes when a feature is omitted from the data. The AUC was the area under the curve of sensitivity (true-positive rate) plotted against 1 − specificity (false-positive rate), and Welch's t test ranked the features by their false discovery rate-corrected P values. Randomised re-sampling strategies using bootstrapping were applied in the process of classification and feature selection to counteract the effect of any unknown structured variance in the data. In the present data analysis, 100 bootstraps were used for classification and feature selection with RF using 1000 trees.
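Two of the univariate ranking statistics named above can be sketched in a few lines of Python. The AUC is computed here through its rank interpretation (the probability that a randomly chosen sample from one class has a higher intensity than a randomly chosen sample from the other, with ties counted as one half), and Welch's t statistic uses unequal-variance standard errors. This is an illustrative sketch, not the FIEmspro code; the function names and intensity values are hypothetical, and false discovery rate correction and bootstrapping are omitted.

```python
import math
from statistics import mean, variance


def auc(high, low):
    """Area under the ROC curve for one feature, via pairwise comparison."""
    wins = 0.0
    for h in high:
        for l in low:
            if h > l:
                wins += 1.0
            elif h == l:
                wins += 0.5
    return wins / (len(high) * len(low))


def welch_t(a, b):
    """Welch's t statistic (sample variances, no equal-variance assumption)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical intensities of one nominal mass bin in two classes.
high_citrus = [3.0, 4.0, 5.0]
low_citrus = [1.0, 2.0, 3.0]
feature_auc = auc(high_citrus, low_citrus)
feature_t = welch_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

An AUC of 0·5 indicates no discrimination and 1·0 indicates perfect separation, so features can be ranked by how far their AUC lies from 0·5.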
Pearson's correlation coefficients between selected variables were calculated using the R-function cor (correlation function). Variables with correlation coefficients >0·7 were considered to belong to a cluster indicative of different ionisation or potential biotransformation/breakdown products of a metabolite.
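The correlation-based grouping of signals can be sketched as follows. The study used the R function cor; the Python version below is only an illustration of the same idea. Grouping signals by linking every pair with a coefficient above 0·7 (single linkage) is an assumption, as the text does not state how clusters were formed, and the intensity values are invented.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_clusters(signals, threshold=0.7):
    """Group signals whose pairwise correlation exceeds the threshold.

    signals: dict mapping nominal m/z -> list of intensities across samples.
    Returns a list of sets of m/z values (connected components).
    """
    names = sorted(signals)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(signals[a], signals[b]) > threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())


# Hypothetical intensities of three nominal mass bins across four samples.
signals = {
    144: [1.0, 2.0, 3.0, 4.0],
    166: [2.0, 4.0, 6.0, 8.5],
    104: [4.0, 1.0, 3.0, 2.0],
}
clusters = correlated_clusters(signals)
```

In this toy example m/z 144 and 166 fall into one cluster and m/z 104 remains on its own. A signal with constant intensity would give a zero denominator and would need to be filtered out beforehand.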
Targeted metabolite analysis – validation of features
Selected variables revealed by FIE-MS data mining were investigated further using targeted Nano-Flow (TriVersa NanoMate; Advion BioSciences Limited, Norfolk, UK) linear trap quadrupole-Fourier-transform ion cyclotron resonance mass spectroscopy ultra (FT-ICR-MS; where ultra referred to the high-sensitivity ICR-cell). For each biological class, pools were constructed using urine samples from four volunteers chosen at random and reconstituted in methanol–water (80:20, v/v). For each spray on the TriVersa Nanomate, a sample volume of 13·0 μl was used, and 2 μl of air were aspirated after the sample. Gas pressure was maintained between 0·2 and 0·6 psi, with the voltage at 1·4–1·7 kV (generally higher for negative ionisation mode) to achieve currents at 80–120 and − 100 to − 60 nA in positive and negative ionisation mode, respectively. Operating the FT-ICR-MS in narrow mode, a resolution of 100 000 was chosen, and mass range was scanned for 1 min. A minimum of three samples per class or treatment containing the specific selected mass were required for successful accurate mass verification. The system was calibrated with the linear trap quadrupole Fourier-transform calibration solution prepared according to the instrument manufacturer's instructions.
For metabolite signal identification, the accurate mass values were then queried against MZedDB, an interactive accurate mass annotation tool that can be used to provide tentative annotation of signals by means of neutral loss and/or adduct formation rules(Reference Draper, Enot and Parker24). Further metabolite signal identification was obtained using FIE-MS/MS in which the scan window was set for twenty scans, with an isolation width of m/z 1 and normalised collision energy of 40 V. An activation coefficient Q of 0·250 and an activation time of 30 ms were chosen, with wideband activation turned on and source fragmentation of 20 V. Mass range settings were dependent upon the molecular weight of the target ion. Chemical standards investigated with FT-ICR-MS and FIE-MS/MS were obtained commercially at highest purity, and solvents were of HPLC-MS grade. Standards were prepared by dissolving 1 mg of the metabolite standard in 1 ml extraction solvent and reconstituted in methanol–water (80:20, v/v). To confirm the identity of salt adducts, additional aqueous Na+ or K+ ion-containing solutions (50 mM as bicarbonate in the case of Na+ and chloride in the case of K+) were added to aid adduct formation.
Classification of habitual consumption of citrus foods using metabolite fingerprint analysis of urine
PC-linear discriminant analysis was used to determine how well ‘PRE’ and ‘fasting’ urine samples from study 1 volunteers in each of the habitual citrus exposure levels (high, medium and low) were discriminated using both positive and negative ionisation modes and four overlapping mass ranges (m/z 15–110, 100–220, 210–510 and 500–1200). In positive ionisation mode (Table 2), the mass range m/z 100–220 had the most classification power using both ‘PRE’ and ‘fasting’ urine samples. The same mass range was the most informative using negative ionisation mode data (see Table S3 of the supplementary material, available online at http://www.journals.cambridge.org/bjn). Good discrimination of each citrus exposure class is evident in PC-linear discriminant analysis score plots comparing FIE-MS fingerprints of ‘PRE’ (Fig. 1(a)) and ‘fasting’ urine samples (Fig. 1(b)). The eigenvalue (Tw) for separation between high and low citrus consumers in the first discriminant function dimension (DF1) is >2 for fasting urine FIE-MS fingerprint models (m/z 100–220), which indicates robust classification of the habitual consumption of citrus foods (Fig. 1(b)) in study 1 volunteers.
m/z, Mass:charge ratio; Tw, discriminant function 1 eigenvalue; AUC, area under the receiver-operating characteristic curve; Margin, random forest (RF) classification margin.
* Principal component-linear discriminant analysis (PC-LDA) and RF classification of data acquired by FIE-MS (positive ion mode) analysis of pre-test day overnight urine samples ‘PRE’ and ‘fasting’ urine samples, after a 12 h (minimum) fast, from twelve individuals. For PC-LDA, ‘dietary citrus exposure’ from Table 1 was the class structure applied consisting of ‘high’, ‘medium’ and ‘low’ citrus consumers (twelve volunteers, two repeat samples provided). Pairwise RF comparisons were made between five ‘high’ and four ‘low’ citrus consumers.
Identification of urine metabolite fingerprint signals potentially explanatory of habitual citrus food consumption level
A total of five volunteers in study 1 were considered to be high-level habitual consumers of citrus foods, while four individuals were classed as reporting low-level exposure to citrus foods (Table 1). Analysis of FFQ data from study 2 volunteers (see Table S1 of the supplementary material, available online at http://www.journals.cambridge.org/bjn) revealed that a similar number of volunteers could be considered either high or low habitual consumers of citrus foods (six individuals were categorised as high consumers and five were low consumers). For each volunteer in study 1, two independent fasting and two independent overnight void (PRE) urine samples were available. Fasting and overnight void urine samples were collected from study 2 volunteers on six independent occasions spread throughout a 14-month period. In both studies, volunteers had consumed a freely chosen diet for several weeks before collecting urine samples, and thus for the purpose of the present analyses, each was considered an independent class (i.e. high or low citrus consumer) replicate. Metabolite fingerprints from ‘PRE’ and ‘fasting’ urine samples representative of high and low habitual citrus exposure classes (study 1 samples, eighteen: ten high citrus and eight low citrus; study 2 samples, sixty-six: thirty-six high citrus and thirty low citrus) were subjected to pairwise comparison, using machine-learning techniques in which a combination of three methods (RF, AUC and Welch's t test) was employed to rank features for discrimination power. To maximise predictability, re-sampling using the bootstrap method was applied. As a ‘rule of thumb’ we have shown, using a range of other FIE-MS datasets, that the threshold for significance in a pairwise analysis lies within an importance score range of 0·0015–0·003(Reference Enot, Lin and Beckmann23). 
The curve inflection occurring at approximately 0·002 shows that the top fifteen to twenty of the m/z signals conferred the majority of discriminatory power in both PRE and fasting urine samples (Fig. 2(a) and (b)) in both studies. Fig. 2(c) shows that seven common signals (m/z 104, 166, 167, 169, 182, 183 and 201; shaded in black) were explanatory of high v. low citrus exposure levels in both PRE and fasting urine samples and for both studies (highly ranked in three or all four of the datasets) (expanded lists are shown in Table S4 of the supplementary material, available online at http://www.journals.cambridge.org/bjn). Of these signals, m/z 104, 169 and 201 did not correlate with other highly ranked signals (data not shown). However, m/z 166, 167, 182 and 183 were strongly correlated with each other (Fig. 2(d)), together with m/z 144 and 145 (ranked in the top fifteen of the study 1 data and slightly lower ranked in the study 2 data), suggesting that these signals may be isotopes and/or salt adducts of ionised metabolites (Fig. 2(d)). In PRE urine samples, two further signals (m/z 160 and 198) formed part of the same correlation grouping.
These high-ranked nominal mass bins within this clade (Fig. 2(d)) were investigated in detail by ultra-high mass resolution FT-ICR-MS. Table 3 summarises the accurate mass FT-ICR-MS analysis of the correlated explanatory mass bins in both PRE and fasting urine samples. Querying the identity of the accurate mass signals in MZedDB(Reference Draper, Enot and Parker24) suggested that these correlated explanatory signals were ionisation adducts and isotopes of proline betaine (stachydrine) and of hydroxyproline betaine (Table 3). A comparison of spectra derived from FIE-MS/MS fragmentation of m/z 144 (Fig. 3(a)), with an authentic sample of synthetic proline betaine [M+H]1+ (Fig. 3(b)) confirmed this annotation. In addition, the correlated explanatory signals, proposed to be salt adduct and isotopes of proline betaine (Table 3), were also confirmed by FIE-MS/MS fragmentation with standards (data not shown). FIE-MS/MS spectra of the nominal mass bin containing predicted hydroxyproline betaine [M+H]1+ (m/z 160) from a single fasting individual (Fig. 3(c)) substantially matched that of the FIE-MS/MS spectra of an authentic sample of 4-hydroxyproline betaine [M+H]1+ (Fig. 3(d)). However, the presence of fragment ions at m/z 60, 102 and 116 in the spectra derived from FIE-MS/MS analysis of this particular individual's urine suggested the presence of more than one chemical in this nominal mass bin. The FIE-MS/MS spectra of m/z 160 from a ‘pool’ derived from urine collected from four random fasting volunteers (Fig. 3(e)) showed an enhancement of fragment ions m/z 60 and 116. In addition, the correlated signal proposed to be a K+ adduct of hydroxyproline betaine (Table 3) was also confirmed by FIE-MS/MS fragmentation (data not shown). The structures of proline betaine and 4- and 3-hydroxyproline betaine are shown in Fig. 4(a).
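The adduct assignments above can be checked with simple mass arithmetic. The Python sketch below computes monoisotopic m/z values for singly charged H+, Na+ and K+ adducts and rounds them to nominal mass. The molecular formulae (C7H13NO2 for proline betaine and C7H13NO3 for hydroxyproline betaine) and the atomic masses are standard values supplied here rather than taken from the text, and the function names are hypothetical; this is not how MZedDB itself is implemented.

```python
# Monoisotopic atomic masses (Da), rounded to seven decimal places.
MONO = {"C": 12.0, "H": 1.0078250, "N": 14.0030740, "O": 15.9949146}
CATION = {"H": 1.0078250, "Na": 22.9897693, "K": 38.9637067}
ELECTRON = 0.0005486

PROLINE_BETAINE = {"C": 7, "H": 13, "N": 1, "O": 2}
HYDROXYPROLINE_BETAINE = {"C": 7, "H": 13, "N": 1, "O": 3}


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given element counts."""
    return sum(MONO[element] * count for element, count in formula.items())


def adduct_mz(formula, cation):
    """m/z of the singly charged [M+cation]+ ion (one electron removed)."""
    return neutral_mass(formula) + CATION[cation] - ELECTRON


def nominal(mz):
    """Nearest integer mass, i.e. the nominal mass bin."""
    return int(mz + 0.5)


pb_bins = [nominal(adduct_mz(PROLINE_BETAINE, c)) for c in ("H", "Na", "K")]
hpb_bins = [nominal(adduct_mz(HYDROXYPROLINE_BETAINE, c)) for c in ("H", "K")]
```

Under these assumptions the proline betaine adducts fall in nominal mass bins 144, 166 and 182, and the hydroxyproline betaine [M+H]1+ and [M+K]1+ ions fall in bins 160 and 198. A single 13C substitution adds approximately 1·003 Da, which is consistent with the correlated signals at m/z 145, 167 and 183 being isotope peaks.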
Acute exposure to a test breakfast containing orange juice demonstrates proline betaine, hesperidin and narirutin biotransformation and excretion
The possibility of biotransformation and excretion of proline betaine was examined in spot urine samples collected 2 and 8 h after consumption of a standard breakfast including 200 ml orange juice in study 1 volunteers. Signals discriminating fasting urine samples from either a 2 or 8 h postprandial urine sample in both positive- and negative-ion mode FIE-MS data are shown in Table 4. Comparison of signals at both 2 and 8 h after consumption of orange juice allowed an assessment of the potential contribution of metabolite signals derived from the colonic fermentation of ingested food residues (present in 8 h but not in 2 h samples). Explanatory signals common to both urine sampling times in positive-ion data (italicised in Table 4) corresponded with those derived from proline betaine and hydroxyproline betaine. The two explanatory mass bins, m/z 223 and m/z 319, in negative-ion data (italicised in Table 4) discriminated strongly between fasting and postprandial urine samples at both 2 and 8 h. These two signals were highly correlated both with each other (Fig. 5(a)) and with the positive-ion signals representative of proline betaine (Fig. 5(b)). Detailed FIE-MS/MS analysis of m/z 223 showed a loss of m/z 80, suggesting that it was a sulphonated derivative of proline betaine (Table 5). Analysis of m/z 319 showed fragment ions at m/z 175 and 113 corresponding to glucuronate ‘fingerprint’ ions(Reference Gu, Zhong and Chen29). Both m/z 223 and m/z 319 produced a fragment ion at m/z 143, of which the second-generation fragment ions (m/z 125, 115, 113 and 99) matched the fragmentation spectra of an authentic sample of synthetic proline betaine m/z 143 (Table 5). In addition, FIE-MS/MS analysis of m/z 161, which also correlated with m/z 223 and m/z 319 (Fig. 5(a) and (b)), yielded a fragment ion at m/z 143, with identical second-generation fragment ions (data not shown). 
In addition, two other correlated negative-ion features, m/z 345 and 343, are most probably glucuronides of a further, as yet unidentified, compound (data not shown).
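The fragmentation evidence for the conjugates reduces to differences between precursor and fragment m/z. As a minimal sketch, the Python snippet below looks up such differences in a small table of nominal neutral losses. The table is an assumption introduced here: 80 Da for loss of SO3 follows the text, whereas 176 Da for loss of a glucuronic acid residue is a commonly used value that the text does not state explicitly (it identifies the glucuronide from the m/z 175 and 113 fingerprint ions).

```python
# Nominal neutral losses (Da); an illustrative, deliberately short table.
KNOWN_LOSSES = {
    80: "sulphate (SO3)",
    176: "glucuronic acid residue (C6H8O6)",
}


def annotate_loss(precursor_mz, fragment_mz):
    """Return a tentative label for the precursor-to-fragment mass difference."""
    return KNOWN_LOSSES.get(precursor_mz - fragment_mz, "unassigned")


# Negative-ion precursors from the text, each yielding a fragment at m/z 143.
annotations = {mz: annotate_loss(mz, 143) for mz in (223, 319)}
```

Both precursors differ from the shared m/z 143 fragment by a loss that matches a conjugating group, which is consistent with the proposed sulphate and glucuronide derivatives of proline betaine.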
* Rank in a random forest classification of fasting v. postprandial urine samples. Fasting, spot urine samples collected after a 12 h (minimum) fast. Signals associated with proline betaine are italicised; flavanone conjugate signals are bold: m/z 381, hesperetin sulphate [M − H]1 − ; m/z 447, naringenin glucuronide [M − H]1 − ; m/z 477, hesperetin glucuronide [M − H]1 − ; twelve individuals, twenty-four samples per time point.
* Proline betaine standard with abundant signals at m/z 125, 115, 113 and 99.
† MS3 analysis of the MS2 product ion m/z 143 produced the same fragmentation as the proline betaine standard.
The flavonoid glycosides hesperidin (hesperetin-7-rutinoside) and narirutin (naringenin-7-rutinoside) are abundant in oranges and have previously been suggested as possible biomarkers for citrus exposure(Reference Mennen, Sapinho and Ito7, Reference Spencer, Mohsen and Minihane13–Reference Del Rio, Costa and Lean15, Reference Brett, Hollands and Needs25, Reference Kawaii, Tomono and Katase26). As expected(Reference Bredsdorff, Nielsen and Rasmussen14), conjugates of the flavonoid aglycones (hesperetin sulphate, hesperetin monoglucuronide and naringenin monoglucuronide; m/z 381, 477 and 447, respectively) only appeared in urine 6–8 h after consumption of the standard breakfast (Fig. 6) when monitored by targeted analysis of specific m/z in negative-ion mode data (confirmed by FIE-MS/MS, data not shown). Although these ions appeared in urine 8 h after consumption of orange juice in the breakfast (signals bold in negative-ion data; Table 4), these signals were not explanatory of habitual citrus exposure levels in either ‘PRE’ or ‘fasting’ urine samples (see Table S5 of the supplementary material, available online at http://www.journals.cambridge.org/bjn).
Proline betaine and its biotransformation products as potential biomarkers of habitual, in addition to acute, citrus exposure
Within 2 h of acute exposure to orange juice (from the standard breakfast), proline betaine and hydroxylated derivatives were present in urine and persisted in detectable concentrations for at least 8 h. In addition, in both ‘fasting’ and ‘PRE’ urine samples, positive-ion signals derived from both of these chemicals were strongly explanatory of citrus intakes estimated by FFQ (Table 3). The targeted analysis of proline betaine m/z signals in urine samples from individual volunteers representing low, medium and high habitual citrus exposure classes demonstrated a potential quantitative relationship between exposure level and signal intensity (Fig. 7). The false discovery rate-adjusted P values indicate a significant difference between the high and low consumers for all three proline betaine m/z signals (m/z 144, 166 and 182) in both ‘PRE’ and ‘fasting’ urine samples (P < 0·05; see Table S6 of the supplementary material, available online at http://www.journals.cambridge.org/bjn). However, it was not possible to distinguish between medium and either high- or low-level consumers (false discovery rate-adjusted P values >0·05; see Table S6 of the supplementary material, available online at http://www.journals.cambridge.org/bjn). The elevation of excretion of these three proline betaine m/z signals in the high citrus consumers compared with low citrus consumers in PRE urine samples showed a sensitivity of 84·7–92·2 % and a specificity of 74·2–94·1 %, depending on the adduct (see Table S7 of the supplementary material, available online at http://www.journals.cambridge.org/bjn). In the fasting urine samples, the elevation of excretion of these three proline betaine m/z signals showed a sensitivity of 80·8–89·2 % and a specificity of 79·6–89·0 % (see Table S7 of the supplementary material, available online at http://www.journals.cambridge.org/bjn). 
In addition, negative-ion signals associated with sulphonated or glucuronidated proline betaine biotransformation products were also present at low levels in fasting and PRE urine samples (data not shown).
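The sensitivity and specificity figures quoted above depend on a decision threshold for signal intensity. The text does not state how that threshold was chosen, so the Python sketch below simply takes a cutoff as an input and counts correct calls in each class; the function name, intensities and cutoff are hypothetical.

```python
def sensitivity_specificity(high, low, cutoff):
    """Classify a sample as 'high consumer' when intensity >= cutoff.

    Returns (sensitivity, specificity):
      sensitivity = true positives / all high consumers
      specificity = true negatives / all low consumers
    """
    true_pos = sum(1 for x in high if x >= cutoff)
    true_neg = sum(1 for x in low if x < cutoff)
    return true_pos / len(high), true_neg / len(low)


# Hypothetical normalised intensities of one proline betaine signal.
high_consumers = [5.0, 6.0, 7.0, 2.0]
low_consumers = [1.0, 2.0, 3.0, 6.0]
sens, spec = sensitivity_specificity(high_consumers, low_consumers, cutoff=4.0)
```

Raising the cutoff trades sensitivity for specificity, which is why the reported values vary with the adduct and with the type of urine sample.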
In the present study, we used a non-targeted metabolomics approach to discover and structurally identify urinary biochemical markers of citrus exposure, in a small group of volunteers. Subsequently, we confirmed this observation in a second group of volunteers of similar size, but who provided a larger number (six) of replicate urine samples collected at intervals over a 14-month period. We observed that proline betaine, an abundant component of citrus fruits(Reference de Zwart, Slow and Payne30–Reference Slow, Donaggio and Cressey32), was strongly explanatory of both acute and habitual exposure to citrus-containing foods. Previous reports have described the rapid excretion of proline betaine in urine following acute exposure to either the pure chemical(Reference Atkinson, Downer and Lever33) or orange juice(Reference Heinzmann, Brown and Chan31, Reference Atkinson, Downer and Lever33). A recent investigation of proline betaine excretion using NMR analysis of postprandial urine samples suggested that it was cleared from the body rapidly and could not be detected easily 14 h after consuming orange juice(Reference Heinzmann, Brown and Chan31). Subsequent validation of urinary proline betaine as a potential biomarker of citrus consumption was undertaken using samples and data from the International Study of Macro- and Micro-Nutrients and Blood Pressure study, in which participants were dichotomised into citrus consumers and citrus non-consumers on the basis of two consecutive multipass 24 h dietary recalls repeated after 3 weeks and analysis of two 24 h urine sample collections made concurrently(Reference Heinzmann, Brown and Chan31). The previous report concluded that proline betaine was an effective biomarker of citrus exposure, where 24 h dietary recall data indicated that citrus products had been consumed within the previous 24 h. The rapid clearance kinetics of proline betaine reported by Heinzmann et al. (Reference Heinzmann, Brown and Chan31) might seem to limit the utility of this metabolite as a biomarker of citrus consumption. However, as well as confirming that proline betaine can be detected by MS in 2 and 8 h postprandial urine samples after acute exposure to orange juice, we demonstrated that this metabolite is present at elevated levels in overnight void (‘PRE’) and ‘fasting’ urine samples in individuals reporting habitually high intake of citrus foods. Furthermore, the present study shows that the quantitative relationship between habitual citrus intake, estimated by FFQ, and the levels of proline betaine in fasting urine samples is not dependent on the knowledge of citrus fruit consumption on the day of urine collection, nor is it compromised by unreported factors associated with the timing of citrus intake before urine sampling.
Unlike previous reports that used only positive ionisation mode liquid chromatography–MS procedures(Reference Atkinson, Downer and Lever33) or NMR fingerprinting(Reference Heinzmann, Brown and Chan31) to detect proline betaine, we have demonstrated that biotransformed proline betaine derivatives are detectable in urine samples and are explanatory of habitual citrus exposure levels. Betonicine (4-hydroxyproline betaine) is a component of citrus fruits, present at a lower concentration than proline betaine(Reference Atkinson, Downer and Lever33), and thus its appearance in postprandial urine samples following exposure to orange juice (m/z signals 160 and 198 in positive-ion mode data; Table 4) is unsurprising. Analysis of biotransformation products in rat urine samples has suggested hydroxylation of proline betaine at carbon 3(Reference Chen, Shen and Han34). The present FIE-MS/MS analysis of m/z 160 revealed three additional fragment ions (m/z 60, 102 and 116). Of these three fragment ions, two (m/z 60 and 102) matched abundant signals in the previously reported fragmentation spectra of an authentic sample of synthetic 3-hydroxyproline betaine [M+H]+, but the origin of the fragment ion at m/z 116 is currently unknown(Reference Chen, Shen and Han34). Therefore, our data (Fig. 3(e)) suggest that 3-hydroxyproline betaine is also present in human urine, probably derived by biotransformation of proline betaine. Additionally, we demonstrate for the first time (using negative ionisation mode FIE-MS fingerprinting) that proline betaine is also conjugated in human subjects to form sulphate and monoglucuronide derivatives. This relatively complex phase II metabolism was not identified in a previous study of citrus fruit metabolism using NMR(Reference Heinzmann, Brown and Chan31), probably because of the relative insensitivity of NMR-based methodology.
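The nominal positive-ion signals discussed in this section (m/z 160 and 198 for the hydroxylated betaine; m/z 144, 166 and 182 for proline betaine) can be checked against simple adduct arithmetic. The following is a minimal sketch, not the annotation procedure used in the study: it assumes the neutral formulae C7H13NO2 (proline betaine) and C7H13NO3 (hydroxyproline betaine), assumes that the observed signals are singly charged [M+H]+, [M+Na]+ and [M+K]+ adducts, and uses illustrative names (`FORMULAE`, `adduct_mz`) that do not come from the original work.

```python
# Monoisotopic masses (u) of the most abundant isotope of each element.
ATOM_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858

# Mass added by each charge-carrying cation (neutral atom minus one electron).
ADDUCT_MASS = {
    "[M+H]+": 1.00782503 - ELECTRON,
    "[M+Na]+": 22.98976928 - ELECTRON,
    "[M+K]+": 38.96370649 - ELECTRON,
}

# Assumed elemental compositions of the neutral (zwitterionic) compounds.
FORMULAE = {
    "proline betaine": {"C": 7, "H": 13, "N": 1, "O": 2},
    "hydroxyproline betaine": {"C": 7, "H": 13, "N": 1, "O": 3},
}


def monoisotopic_mass(formula):
    """Sum the monoisotopic atom masses for an element-count mapping."""
    return sum(ATOM_MASS[element] * count for element, count in formula.items())


def adduct_mz(formula):
    """Return the m/z of each singly charged adduct for the given formula."""
    neutral = monoisotopic_mass(formula)
    return {name: neutral + delta for name, delta in ADDUCT_MASS.items()}


if __name__ == "__main__":
    for compound, formula in FORMULAE.items():
        for adduct, mz in adduct_mz(formula).items():
            print(f"{compound:24s} {adduct:8s} {mz:9.4f}  (nominal {round(mz)})")
```

Under these assumptions, the calculated nominal values for proline betaine are 144, 166 and 182, and those for the hydroxylated form are 160, 182 and 198. At nominal mass resolution, the sodium adduct of the hydroxylated form would therefore share m/z 182 with the potassium adduct of proline betaine.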
The present study shows that proline betaine conjugates are present in urine within 2–8 h after consuming orange juice, and that signal intensities of these derivatives are substantially lower than those reflecting the presence of non-modified proline betaine. Further work is required to describe quantitatively the kinetics of appearance and excretion of sulphate and glucuronide derivatives of proline betaine to help evaluate the possible utility of these biotransformation products as putative biomarkers of citrus exposure. In contrast to proline betaine, citrus food flavanone biomarkers would be subject to considerable diurnal variation dependent on the timing of major phases of colonic fermentation activity required for their absorption(Reference Bredsdorff, Nielsen and Rasmussen14, Reference Del Rio, Costa and Lean15). This is in agreement with a recent review(Reference Pérez-Jiménez, Hubert and Hooper35), which also concludes that the biotransformation products of the flavanone glycosides hesperidin and narirutin are unlikely to be suitable biomarkers of habitual exposure to citrus.
In the present study, we provided a fruit-free standardised meal(Reference Favé, Beckmann and Draper16, Reference Favé, Beckmann and Lloyd22) on the evening before collection of overnight (PRE) and fasting urine samples and were able to distinguish between low, medium and high habitual intakes of citrus foods (estimated by FFQ) based on urinary proline betaine measurements, despite studying samples from only a relatively small number of volunteers. Although six replicate samples were available for each volunteer in study 2, the inclusion of extra replicates did not significantly improve classification robustness over that achieved with study 1 volunteers. In support of the potential of this metabolite as a biomarker of citrus intake, we have observed strong links between orange juice consumption in a standard breakfast and urinary proline betaine excretion 2–8 h later. In addition, we have provided preliminary evidence that proline betaine may be metabolised in human subjects to a number of derivatives including sulphates and glucuronides – an observation contrary to the assumption that proline betaine is metabolically inert(Reference Heinzmann, Brown and Chan31). A potential quantitative relationship between high and low dietary citrus consumption and urinary excretion of proline betaine signals (m/z 144, 166 and 182) in positive-ion data was demonstrated (adjusted P values < 0·05; see Table S6 of the supplementary material, available online at http://www.journals.cambridge.org/bjn). However, differences in intensity levels of these adducts between medium citrus consumers and either low or high consumers were not statistically significant (P>0·05). This may be because individuals who consume either a high amount (at least once a day) or very low amounts of a particular food are generally able to estimate their consumption (using an FFQ reporting system) more accurately than individuals who consume these foods at ‘medium’ levels. 
In addition, of course, it is technically easier to detect the larger differences between ‘high’ and ‘low’ intakes. Relatively high sensitivities and specificities (80·8–92·2 and 74·2–94·1 %, respectively; see Table S7 of the supplementary material, available online at http://www.journals.cambridge.org/bjn) for the three elevated proline betaine adducts (H+, Na+ and K+) were demonstrated in participants who reported high citrus consumption (in both pooled overnight urine and spot fasting urine samples), thus further validating the potential biomarker status of this metabolite. This potential quantitative relationship between high and low dietary citrus consumption and urinary excretion of proline betaine could be further explored using standard triple quadrupole MS technology(Reference Urpi-Sarda, Garrido and Monagas36, Reference Urpi-Sarda, Monagas and Khan37).
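For readers less familiar with these two measures, the sketch below shows how sensitivity and specificity are obtained when a single adduct intensity is used to separate high from low citrus consumers. It is an illustration only: the intensity values, the threshold and the function name `sensitivity_specificity` are hypothetical, and the code does not reproduce the classification procedure or the data reported in Table S7.

```python
def sensitivity_specificity(intensities, is_high_consumer, threshold):
    """Classify a sample as 'high' when its intensity reaches the threshold,
    then compare the predictions with the FFQ-reported class."""
    tp = fp = tn = fn = 0
    for intensity, truly_high in zip(intensities, is_high_consumer):
        predicted_high = intensity >= threshold
        if predicted_high and truly_high:
            tp += 1
        elif predicted_high and not truly_high:
            fp += 1
        elif truly_high:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both consumer classes must be represented")
    sensitivity = tp / (tp + fn)  # high consumers correctly detected
    specificity = tn / (tn + fp)  # low consumers correctly excluded
    return sensitivity, specificity


# Hypothetical normalised intensities for one adduct (not study data).
signal = [9.1, 8.4, 7.9, 3.2, 2.8, 6.5, 2.1, 1.7]
reported_high = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(signal, reported_high, threshold=5.0)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these made-up values, one high consumer falls below the threshold and one low consumer exceeds it, so both measures equal 0·75. Moving the threshold trades one measure against the other, which is why the two are reported together.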
In conclusion, the present metabolomics-based study provides prima facie evidence that urinary excretion of proline betaine (and possibly some of its metabolites) is a potentially useful biomarker of habitual citrus consumption. However, we have not attempted to test the utility of this metabolite as a biomarker for different types of citrus fruit nor have we examined, extensively, dose– and time–response relationships between citrus food consumption and patterns of urinary excretion, which would be necessary to establish the sensitivity and specificity of our proposed biomarker. These areas require further investigation.
The present study was supported by the UK Food Standards Agency (project N05073). The authors thank the volunteers for their commitment; the CRF, Newcastle, for nursing support; Claire Kent, Heather E. Gifford, Julie Coaker and Linda Penn for their practical support; and Marks & Spencer for donating the chocolate éclairs (standardised evening meal). We thank Kathleen Tailliart for FIE-MS analysis and for technical support, and Dr Wanchang Lin (Manchester School of Biomedicine) for support with the statistical analysis of data. The authors' contributions to the study were as follows: A. J. L. conducted the data analysis, produced figures, researched the literature and wrote the manuscript; M. B. developed the urine extraction procedures, designed the metabolite fingerprinting experiments, supervised MS support staff, pre-processed data for analysis and edited the manuscript; G. F. undertook the volunteer recruitment, coordinated the volunteer CRF visits, supervised CRF support staff, refined sampling methodology and edited the manuscript; J. C. M. coordinated the study, supervised the study at Newcastle University, designed the volunteer handling protocols and edited the manuscript; J. D. coordinated the study, supervised the study at Aberystwyth, designed the figures and wrote the manuscript. None of the authors has a conflict of interest with respect to the study.