It is widely accepted that current eating habits in general tend to be less than healthy(1). For example, the financial burden on the National Health Service of food-related ill health is more than double that related to smoking(2). One reason for this is the increase in consumption of pre-packaged foods and a subsequent reduction in the number of consumers who cook food every day(3). A change in people's dietary approach is therefore crucial in order to provide a healthier diet(4). One way to give consumers dietary information is nutrition labelling on foods, which allows them to make an informed choice about what they eat.
However, Black and Rayner's(5) influential study of nutrition labels found that people often do not use nutrition information, or just use one item on the nutrition label (usually fat) to guide judgements on the healthiness of a foodstuff. Furthermore, people found nutrition information difficult to comprehend and use. For example, consumers found it difficult to determine whether a specified amount of a nutrient was a low, medium or high amount.
More recently, Higginson et al.(6) examined verbal protocols when consumers made their normal weekly shop and when they purchased the healthiest version of nine items listed by the researchers. Nutrition labels were examined on only 4.2% and 33.0% of occasions for the weekly and healthiest version shops, respectively. For both types of shopping, the two main nutrients examined were fat and energy.
It is clear that there are problems with the current nutrition label. Firstly, consumers find it difficult to understand the information presented. Secondly, there is a lack of understanding as to what nutrients are important to examine, with consumers mainly attending to only fat and energy. In fact, it has been known for some time that the information given regarding nutrition may not quite match what the consumer wants or needs(7).
Recently, the Food Standards Agency (FSA) tested several methods of banding nutrition information based on signpost labelling concepts(8), assessing the extent to which consumers could quickly assess the nutritional content and healthiness of a product. The method preferred by both consumers and the FSA was the ‘key nutrients’ concept (option D), which rated each of four nutrients as high (red), medium (amber) or low (green) in what we will refer to as a ‘traffic light’ system.
The present study examines the effect of the traffic light label on consumers' perception of the health rating of foodstuffs. It adds to existing research on nutrition labelling by systematically varying the levels of each nutrient in order to examine precisely which nutrients people take account of. For example, if people mainly examine the energy and fat content on nutrition labels, does this mean they only use these two items to make health judgements? Do they use only one (or none) of the two? Or do they use other nutrients even though they rarely examine these nutrients?
In conjunction with the systematic variation of the levels of nutrients, eye tracking equipment is used in order to obtain precise values regarding the amount of time spent examining each area of the nutrition label. Eye tracking equipment provides an objective measure rather than a subjective measure of which nutrients consumers examine most often, providing a more sensitive measure of the importance of particular nutrients.
The remainder of this paper is organised as follows. Firstly, we describe how the levels of each nutrient were systematically varied. Secondly, we give details of the methodology of the study. Thirdly, we present the results obtained. Finally, we discuss the results.
Definition of nutrient variables
The eight standard nutrients found on most nutrition labels (energy, protein, fat, saturates, carbohydrates, sugars, fibre and sodium) were systematically varied across 18 nutrition labels using a balanced fractional factorial and orthogonal design. The orthogonal design controls collinearity in the regression analysis, and the balanced fractional factorial provides sufficient independent variability to disambiguate the separate effects of each nutrient.
Each of the eight nutrients was assigned a value of high, medium or low across 18 orthogonal combinations. High, medium and low levels map on to the traffic light system proposed by the FSA(8). The 18 combinations were based upon a random sample of possible combinations, and orthogonality was checked so that correlations between nutrients were < 0.40. Furthermore, the design was balanced so that across the 18 labels there were six instances of high, medium and low for each nutrient. Table 1 shows the high, medium and low levels of each nutrient for each of the 18 devised labels.
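The construction described above can be illustrated with a minimal sketch. It assumes a simple rejection-sampling approach: balanced columns are shuffled at random and the design is redrawn until every pairwise correlation falls below 0.40. The source states only that a random sample of combinations was checked for orthogonality, so the procedure, the function names and the 0/1/2 coding of low/medium/high are illustrative assumptions rather than the authors' method.

```python
import random
from itertools import combinations

NUTRIENTS = ["energy", "protein", "fat", "saturates",
             "carbohydrates", "sugars", "fibre", "sodium"]
LEVELS = (0, 1, 2)   # assumed coding: 0 = low, 1 = medium, 2 = high
N_LABELS = 18


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def balanced_column(rng):
    """Six lows, six mediums and six highs in random order."""
    column = [level for level in LEVELS
              for _ in range(N_LABELS // len(LEVELS))]
    rng.shuffle(column)
    return column


def max_abs_correlation(design):
    """Largest absolute correlation over all pairs of nutrients."""
    return max(abs(pearson(design[a], design[b]))
               for a, b in combinations(design, 2))


def make_design(seed=0, threshold=0.40, max_tries=100_000):
    """Redraw balanced random designs until all correlations < threshold."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        design = {name: balanced_column(rng) for name in NUTRIENTS}
        if max_abs_correlation(design) < threshold:
            return design
    raise RuntimeError("no acceptable design found within max_tries")
```

Calling `make_design(seed=1)` returns a dictionary mapping each nutrient to its 18 level codes; balance holds by construction, and the correlation check is the acceptance criterion.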
Table 2 shows the actual quantities that constitute high, medium and low levels of each nutrient based upon guideline daily amounts (GDA) derived from Rayner et al.(13) and FSA definitions of ‘a little’ (3.3% or less of GDA) and ‘a lot’ (20% or more of GDA)(11).
Energy levels in kilocalories (kcal) were converted to kilojoules (kJ) by multiplying by 4.184.
In order to ensure that the actual quantity of a nutrient varied across labels, the nutrient quantities were randomly assigned within a 12.5% band at each of the low/medium/high levels. This ensured that, for example, two labels with ‘high’ levels of fat did not have exactly the same gram quantities of fat. Table 3 shows the low/medium/high quantity ranges for each nutrient. Note that the actual level used for ‘medium’ had to be lowered in order to prevent unrealistic labels, such as medium levels of saturated fat exceeding low levels of fat. An example of the actual numerical quantities for one of the labels (label 1) can be seen in Table 4.
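A minimal sketch of this randomisation is given below. It assumes that the ‘12.5% band’ means a uniform draw within ±12.5% of a nominal value, which the text does not state explicitly. The nominal gram values are hypothetical placeholders, not the figures from Table 2 or Table 3, and the function names are illustrative.

```python
import random

BAND = 0.125  # assumed interpretation: +/-12.5% around a nominal value

# Hypothetical nominal fat quantities (g per 100 g); NOT taken from the tables.
NOMINAL_FAT_G = {"low": 2.0, "medium": 10.0, "high": 22.0}


def jitter(nominal, rng, band=BAND):
    """Draw a quantity uniformly within +/-band of the nominal value."""
    return rng.uniform(nominal * (1 - band), nominal * (1 + band))


def label_quantities(levels, seed):
    """Return one jittered fat quantity per label for the given level codes."""
    rng = random.Random(seed)
    return [round(jitter(NOMINAL_FAT_G[level], rng), 1) for level in levels]
```

Using a different seed for a second set of labels keeps the high/medium/low banding identical while changing the figures displayed.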
Method
Participants
Ninety-two participants (25 male, 67 female; mean age 31.5 years) were paid for their participation in the study. All participants had normal or corrected-to-normal vision and were either staff or students at the University of Derby. None of the participants worked in food or nutrition areas.
Design
A 2 (label type: label A – per 100 g and per serving information; label B – per 100 g, per serving and traffic light information) × 9 (nutrition type: amounts of energy/kcal, energy/kJ, protein, fat, saturates, carbohydrates, sugars, fibre and sodium) repeated measures design was employed. The dependent variables were the perceived healthiness rating for a nutrition label (on a scale of 1–10 with 1 being less healthy and 10 being more healthy; healthiness ratings such as this have been used effectively in the past(9)) and the areas of the nutrition label that participants examined.
Materials
A Cambridge Systems Video Eyetracker Toolbox, dual screen RM 2.8 GHz Pentium PC running Microsoft Windows 2000 Professional SP4 and Video Eye Trace 2.0.1 software recorded the raw eye movement data files. A 17 inch monitor was placed directly in front and 67 cm away from the eyetracker. Raw eye movement data were analysed by in-house software written using Microsoft Visual Basic.
Two label types were devised: type A (the standard eight nutrients plus an additional energy nutrient in kJ) displayed at levels of per 100 g and per serving; and type B (as per label A plus fat, saturates, sugars and salt also being displayed as high/medium/low traffic light symbols) (see Fig. 1 for an example). Eighteen labels were produced for both label types A and B based on the nutrition levels in Table 1. To ensure consumers would not recognise that the same nutrition levels were being used, type B labels were produced from a different random seed, i.e. the underlying high, medium and low banding levels were maintained, but the actual figures presented varied. Macromedia Authorware 6 was used to present the nutrition labels and record healthiness ratings. The ‘per serving’ information was set at 250 g, reflecting a believable figure and one which produces sufficiently different values from the per 100 g values. A 250 g serving was within the range of typical servings based on a small survey of 12 items.
Procedure
Participants completed a pre-study questionnaire (assessing how often they shopped, how often they examine nutrition labels, etc.) before completing the two-part eye movement study. The first part displayed type A nutrition labels, and the second part displayed type B nutrition labels. Note that label displays were not counterbalanced: displaying label B (normal label plus traffic lights) prior to label A may have guided consumers as to the important nutrients to examine in label A.
For each of the two parts, participants were calibrated to the eye tracking system before each set of 18 nutrition labels was displayed. Labels were displayed in a random sequence, with the participant being asked to examine the nutrition information and judge the label for healthiness. A break of 30 s was given between part one and part two. Upon completion, participants filled in a post-study questionnaire (assessing what they thought of the traffic light labels, etc.) and were debriefed as to the nature of the study. The questionnaire data are not presented in this paper.
Results
Derivation of fixation data
The eye tracking equipment recorded the x, y position of the eye on the nutrition label every 20 ms. These data were summarised into fixation locations if the gaze remained in a fixed location (or at a location subtended by a maximum angle of 1° from the original x, y position) for 200 ms. The time spent examining each amount of a nutrient on a nutrition label was the cumulative amount of the fixation time at an x, y location that was located within the numeric figures for the nutrient.
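The derivation above can be sketched as a simple anchor-based procedure. This is an illustration under stated assumptions rather than the in-house software: gaze samples are taken to be (x, y) positions in centimetres on the screen, the 1° limit is converted to a radius using a 67 cm viewing distance, each sample is counted as 20 ms of viewing time, and the rectangle coordinates for each nutrient figure are hypothetical.

```python
import math

SAMPLE_MS = 20            # one gaze sample every 20 ms
MIN_FIXATION_MS = 200     # minimum dwell time to count as a fixation
VIEW_DISTANCE_CM = 67.0   # assumed eye-to-screen distance
MAX_ANGLE_DEG = 1.0       # maximum angle from the original position
RADIUS_CM = VIEW_DISTANCE_CM * math.tan(math.radians(MAX_ANGLE_DEG))


def detect_fixations(samples):
    """Group consecutive samples that stay within RADIUS_CM of an anchor.

    Returns a list of (x, y, duration_ms) tuples, one per fixation.
    """
    fixations = []
    i = 0
    while i < len(samples):
        anchor = samples[i]
        j = i + 1
        while j < len(samples) and math.dist(anchor, samples[j]) <= RADIUS_CM:
            j += 1
        duration = (j - i) * SAMPLE_MS   # each sample counted as 20 ms
        if duration >= MIN_FIXATION_MS:
            fixations.append((anchor[0], anchor[1], duration))
            i = j                        # continue after this fixation
        else:
            i += 1                       # too short: try the next sample
    return fixations


def time_per_area(fixations, areas):
    """Sum fixation time falling inside each (left, top, right, bottom) area."""
    totals = {name: 0 for name in areas}
    for x, y, duration in fixations:
        for name, (left, top, right, bottom) in areas.items():
            if left <= x <= right and top <= y <= bottom:
                totals[name] += duration
                break
    return totals
```

Dividing each total by the sum of all totals gives the percentage of fixation time per nutrient figure.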
The fixation data analysis for label A was based on 64 participants and the fixation data for label B was based on 71 participants. Eye tracking data that were not of a sufficient standard were discarded.
Summary of fixation data
Figures 2 and 3 show the percentages of time spent examining each of the nutrient quantities for label types A and B, respectively. There was a clear difference between the two label types – for label A, carbohydrate sugars were examined most often, whereas for label B, fat was examined most often. The traffic light label clearly affected the areas of the label that participants examined. The most explicit indication of this was that when all areas relating to each specific nutrient were totalled (e.g. fibre per 100 g and fibre per serving; fat per 100 g, per serving and traffic light), the traffic light nutrients all headed the list of the nutrients examined most often.
Regression analysis
Label type A – per 100 g and per serving
Multiple regression analysis was used to examine the relationship between healthiness ratings and the nutrients. All 1656 ratings of perceived healthiness for the 18 labels were included in the analysis. The independent variables (IVs) or predictors were the amounts of each of the eight nutrients (since energy kcal and energy kJ specified similar information). The dependent variable was the healthiness ratings given for each label. As there were multiple data for each participant, dummy variables to identify each participant were entered in the first block in order to control for variability due to individual differences(12). The main IVs were then entered. Diagnostic checks for collinearity and cases exerting undue influence were performed and showed no reason for concern. Table 5 provides a summary of the regression results.
Table notes: * significant at P < 0.05 or better; † nutrients indicated by the traffic light system.
The between-participants dummy variables in model 1 gave R = 0.55 and adjusted R² = 0.26. The model including the eight nutrients gave R = 0.63 and adjusted R² = 0.36 (F(99,1556) = 10.33, P < 0.01). The R² change figure suggested that 9.5% of the variance in healthiness ratings was related to some combination of the eight nutrients.
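As a rough consistency check on these figures, the following sketch applies the standard formulas for adjusted R² and the overall F statistic to the reported (rounded) R values. It assumes 92 participants × 18 labels = 1656 ratings, 91 participant dummy variables in the first block and 8 nutrient predictors in the second; the function names are illustrative.

```python
N_RATINGS = 92 * 18            # 1656 ratings in total
K_DUMMIES = 92 - 1             # assumed: one dummy per participant, less one
K_FULL = K_DUMMIES + 8         # dummies plus the eight nutrients


def adjusted_r2(r, n, k):
    """Adjusted R-squared from multiple R, sample size n and k predictors."""
    return 1 - (1 - r ** 2) * (n - 1) / (n - k - 1)


def overall_f(r, n, k):
    """Overall F statistic for a model with k predictors and n cases."""
    r2 = r ** 2
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def r2_change(r_block1, r_block2):
    """Increase in R-squared from block 1 to block 2."""
    return r_block2 ** 2 - r_block1 ** 2
```

With R = 0.55 and R = 0.63, `adjusted_r2` rounds to 0.26 and 0.36 respectively, the residual degrees of freedom are 1656 − 99 − 1 = 1556, and `r2_change(0.55, 0.63)` gives roughly 0.094. Small differences from the reported 9.5% and from F = 10.33 are to be expected because the published R values are rounded to two decimals.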
Standardised regression coefficients for each nutrient suggested that an increase in perceived healthiness was associated with decreases in fat (−0.161, t(1556) = 6.19, P < 0.001), saturated fat (−0.194, t(1556) = 7.72, P < 0.001), energy (−0.092, t(1556) = 4.18, P < 0.001) and carbohydrate sugars (−0.055, t(1556) = 2.45, P < 0.05), and increases in fibre (0.086, t(1556) = 3.37, P < 0.01).
Label type B – per 100 g, per serving and traffic lights
The same analysis performed for label A was repeated for label B. The between-participants dummy variables in model 1 gave R = 0.40 and adjusted R² = 0.11. The model including the eight nutrients gave R = 0.75 and adjusted R² = 0.53 (F(99,1556) = 19.97, P < 0.01). The R² change figure suggested that 39.7% of the variance in healthiness ratings was related to some combination of the eight nutrients.
Standardised regression coefficients for each nutrient suggested that an increase in perceived healthiness was associated with decreases in fat (−0.331, t(1556) = 15.25, P < 0.001), saturated fat (−0.251, t(1556) = 11.41, P < 0.001), energy (−0.049, t(1556) = 2.67, P < 0.01), carbohydrate sugars (−0.230, t(1556) = 12.27, P < 0.001), fibre (−0.062, t(1556) = 2.91, P < 0.01) and sodium (−0.204, t(1556) = 10.14, P < 0.001).
Do people pay the most attention to the nutrients they use to make healthiness judgements?
Table 6 shows the percentage fixation time and the absolute standardised β values for each of the nutrients for labels A and B. Percentage fixation times indicate which nutrients participants examined most often, and the absolute standardised β values indicate which nutrients participants placed the most importance on for arriving at a healthiness rating.
There was no correlation between fixation times and standardised β values for label A (r(6) = −0.01, P > 0.05) but there was a significant correlation for label B (r(6) = 0.88, P < 0.01). The traffic light system in label B clearly helps in guiding participants' attention to the most appropriate areas of the nutrition label.
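The correlation test reported here can be sketched as follows, assuming a Pearson correlation across the eight nutrients (hence 8 − 2 = 6 degrees of freedom) and the usual conversion of r to a t statistic. The helper names are illustrative, and no values from Table 6 are reproduced.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic with n - 2 degrees of freedom for a correlation r."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Two-tailed critical t for 6 degrees of freedom at the 0.01 level.
T_CRIT_DF6_P01 = 3.707
```

For r = 0.88 with eight nutrients, `t_from_r(0.88, 8)` is about 4.5, which exceeds the critical value of 3.707 and is consistent with the reported P < 0.01. In practice, `pearson_r` would be applied to the eight percentage fixation times and the eight absolute standardised β values.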
Are participants accurate in their healthiness ratings?
Health scores were calculated for each of the nutrition labels displayed based on the SSAg/1 system of calculating healthiness(11). The SSAg/1 system was used because it maps onto specific nutrient values that are depicted in the nutrition label, and it gives a score with clear minimum and maximum values (0–8). Table 7 shows the SSAg/1 health scores for the 18 nutrition labels together with the mean perceived healthiness rating for that label, for each of the two label types. For ease of comparison, the mean perceived healthiness ratings were scaled to be from 0 to 8 and were reversed so that scores tending towards 0 represented ‘more healthy’ and scores tending towards 8 represented ‘less healthy’ (as per SSAg/1 scores).
Table note: perceived healthiness ratings are adjusted and reversed such that 0 = very healthy and 8 = very unhealthy.
The amount of ‘error’ in healthiness ratings was then calculated based on the difference between the participants' perceived healthiness rating for each label and the actual SSAg/1 health score for each label. The mean error in perceived healthiness ratings was 2.22 (standard deviation (SD) 0.77) for label A and 1.77 (SD 0.76) for label B. The mean error was significantly lower for label B (t(17) = 3.57, P < 0.01), indicating that for label B, participants’ perceived healthiness ratings were closer to the actual SSAg/1 health score than they were for label A.
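A minimal sketch of this comparison follows. It assumes a linear rescaling of the 1–10 ratings onto the reversed 0–8 scale, takes ‘error’ to be the absolute difference from the SSAg/1 score, and uses a paired t-test across the 18 labels (17 degrees of freedom). These choices, and the function names, are assumptions; the exact computation is not given in the text.

```python
import math
import statistics


def rescale_rating(rating):
    """Map a 1-10 rating (10 = healthiest) onto 0-8 (0 = healthiest)."""
    return (10 - rating) * 8 / 9


def label_error(mean_rating, ssag_score):
    """Absolute gap between the rescaled rating and the SSAg/1 score."""
    return abs(rescale_rating(mean_rating) - ssag_score)


def paired_t(errors_a, errors_b):
    """Paired t statistic and degrees of freedom for two matched lists."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error, len(diffs) - 1
```

Here `errors_a` and `errors_b` would each hold 18 values, one per label, computed with `label_error` from the mean ratings for label types A and B.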
Discussion
The results showed a clear benefit of label type B (the traffic light label). Firstly, for label A, the nutrients that people examined bore little resemblance to the nutrients that people actually used when making a healthiness judgement. This changed with label B: the traffic light guided people to the important nutrients and thus people mainly examined the nutrients that they used to make their healthiness judgements. Secondly, the regression analyses showed that the variance in healthiness ratings accounted for by some combination of the nutrients was only 9.5% for label A whereas this increased to almost 40% for label B. Thirdly, when the traffic lights were present, people directed a lot of their attention towards them. Fourthly, healthiness ratings more closely approximated actual health scores for the nutrition labels when the traffic light was present than when it was not present.
One of the main findings was the remarkable effect the traffic light had upon the information that people examined on the labels and their resulting healthiness judgements. Both the eye movement data and the regression data indicated clear benefits for label B, i.e. two independent measures (eye movements and healthiness ratings) both suggest the effectiveness of the traffic lights. There was also a clear indication from these results that the traffic lights guide people to the most important nutrients to consider, and therefore that they helped to educate the consumer about the important nutrients to factor in when judging the healthiness of foodstuffs.
The difference in results between label A and label B suggests that for the standard nutrition label, there may be too much information for consumers to comprehend, and this supports the previous literature that has examined nutrition labels(5). The traffic lights reduce the number of nutrients that people have to examine and, furthermore, they reduce the amount of calculation that consumers have to perform, because they indicate levels of the nutrient rather than requiring the consumer to compute what a numeric value of a nutrient means. As such, the cognitive workload of the consumer is reduced so that there is more opportunity to make an informed decision about the foodstuff.
However, the tightly controlled methodology required for objectivity together with a computer-based presentation does result in the need for some caution when interpreting the results. Firstly, the traffic light label was presented alongside the nutrition label, whereas it was actually intended to be on the front-of-pack. Secondly, no further information was given with the nutrition label, whereas in real life, nutrition decisions may be affected by a variety of information such as the foodstuff itself, nutrient claims, ingredients, etc. Thirdly, the task itself was to assess the healthiness of a foodstuff, whereas consumers often only use nutrition information for comparisons with other foodstuffs(6). Fourthly, participants were only given one type of task to do – it remains to be seen if the traffic light label outperforms the standard nutrition label for other types of task, such as comparisons across labels. Nevertheless, the study presented is important because it provides a baseline measure of performance, enabling comparisons when greater context and realism are added in future research.
Using a systematic approach and measuring eye movements and healthiness ratings, this article has shown that consumers find the standard nutrition label difficult to interpret, whereas the traffic light label helps to guide consumers’ attention and hence contributes to a marked improvement in health perception of foodstuffs. Further research is required to examine the extent to which the findings apply in a real-world setting.
Acknowledgements
Sources of funding: This research was funded by the Allied Health Group at the University of Derby. The Allied Health Group is a funding body within the University of Derby that funds independent health-related research.
Conflict of interest declaration: This research was carried out independently. The authors and all bodies associated with this paper have no conflicting interests to declare.
Authorship responsibilities: Both authors of this paper provided substantial contributions to this research.
Acknowledgements: The authors would like to thank Anastasios Vaporidis for his help in collecting the data presented, Alan Wright for his help in developing the eye movement software and the Allied Health Group for help in funding this research.