Nutrient composition databases in the age of big data: foodDB, a comprehensive, real-time database infrastructure

BMJ Open. 2019 Jun 27;9(6):e026652. doi: 10.1136/bmjopen-2018-026652.

Abstract

Objectives: Traditional methods for creating food composition tables struggle to cope with the large number of products and the rapid pace of change in the food and drink marketplace. This paper introduces foodDB, a big data approach to the analysis of this marketplace, and presents analyses illustrating its research potential.

Design: foodDB has been used to collect data weekly on all foods and drinks available on six major UK supermarket websites since November 2017. As of June 2018, foodDB has 3 193 171 observations of 128 283 distinct food and drink products measured at multiple timepoints.

Methods: Nutrition and availability data for each product were extracted weekly from the product webpages of the supermarket websites. This process was automated with a codebase written in Python.
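As an illustration of this kind of extraction step, the following is a minimal sketch and not the foodDB codebase. It assumes a hypothetical product page whose nutrition panel is a simple two-column HTML table; the names `NutritionTableParser`, `extract_observation`, `SAMPLE_HTML` and the product identifier are invented for the example, and real supermarket pages will differ in structure.

```python
from datetime import date
from html.parser import HTMLParser


class NutritionTableParser(HTMLParser):
    """Collect each table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def extract_observation(html, product_id, collected_on):
    """Return one observation: product, collection date, nutrients in g per 100 g."""
    parser = NutritionTableParser()
    parser.feed(html)
    nutrients = {}
    for row in parser.rows:
        if len(row) != 2:
            continue
        name, value = row
        try:
            nutrients[name.lower()] = float(value.rstrip("g"))
        except ValueError:
            continue  # header rows and non-numeric cells are skipped
    return {"product_id": product_id, "date": collected_on, "nutrients": nutrients}


# Hypothetical page fragment, assumed for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Nutrient</th><th>Per 100g</th></tr>
  <tr><td>Fat</td><td>9.5g</td></tr>
  <tr><td>Saturates</td><td>3.2g</td></tr>
  <tr><td>Sugars</td><td>2.1g</td></tr>
  <tr><td>Salt</td><td>0.8g</td></tr>
</table>
"""

observation = extract_observation(SAMPLE_HTML, "pizza-001", date(2018, 3, 5))
```

Storing the collection date with every observation is what allows the same product to be tracked across weekly timepoints.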

Results: Analyses using a single weekly timepoint of 97 368 total products in March 2018 identified 2699 ready meals and pizzas, and showed that lower price ready meals had significantly lower levels of fat, saturates, sugar and salt (p<0.001). Longitudinal analyses of 903 pizzas revealed that 10.8% changed their nutritional formulation over 6 months, and 29.9% were either discontinued or new market entries.
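A longitudinal comparison of this kind can be sketched as follows. This is a simplified illustration and not the authors' method: it assumes each timepoint is a mapping from a product identifier to its nutrient values, treats any difference in those values as a reformulation, and uses all products seen at either timepoint as the denominator. The function name `compare_snapshots` and the sample data are hypothetical.

```python
def compare_snapshots(earlier, later):
    """Summarise reformulation and market churn between two timepoints.

    earlier, later: dicts mapping product id -> dict of nutrient values.
    """
    earlier_ids, later_ids = set(earlier), set(later)
    persisting = earlier_ids & later_ids

    # Products present at both timepoints whose nutrient values differ.
    reformulated = {pid for pid in persisting if earlier[pid] != later[pid]}
    discontinued = earlier_ids - later_ids
    new_entries = later_ids - earlier_ids

    total = len(earlier_ids | later_ids)
    if total == 0:
        return {"reformulated_pct": 0.0, "churn_pct": 0.0}
    return {
        "reformulated_pct": 100 * len(reformulated) / total,
        "churn_pct": 100 * (len(discontinued) + len(new_entries)) / total,
    }


# Invented sample data: B is reformulated, D is discontinued, E is new.
earlier = {
    "A": {"fat": 9.5, "salt": 0.8},
    "B": {"fat": 11.0, "salt": 1.2},
    "C": {"fat": 7.0, "salt": 0.9},
    "D": {"fat": 8.0, "salt": 1.0},
}
later = {
    "A": {"fat": 9.5, "salt": 0.8},
    "B": {"fat": 10.0, "salt": 1.0},
    "C": {"fat": 7.0, "salt": 0.9},
    "E": {"fat": 6.5, "salt": 0.7},
}

summary = compare_snapshots(earlier, later)
```

In practice a tolerance on the nutrient differences may be preferable to exact equality, so that rounding changes on a label are not counted as reformulations.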

Conclusions: foodDB is a powerful new tool for monitoring the food and drink marketplace. Its comprehensive sampling and granular data collection provide the power to reveal relationships between the nutritional quality and marketing of branded foods, and to observe product reformulation and other changes to the food marketplace in a timely manner.

Keywords: big data; databases; front of pack food labelling; supermarkets; web scraping.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Commerce / statistics & numerical data*
  • Consumer Health Information / statistics & numerical data*
  • Data Collection
  • Databases, Factual
  • Dietary Fats / analysis*
  • Dietary Sucrose / analysis*
  • Fast Foods / analysis*
  • Fast Foods / economics
  • Food Labeling
  • Food-Processing Industry* / economics
  • Humans
  • Longitudinal Studies
  • Marketing
  • Meals
  • Nutrition Policy
  • Nutritional Status
  • Nutritive Value
  • Sodium Chloride, Dietary / analysis*
  • United Kingdom / epidemiology

Substances

  • Dietary Fats
  • Dietary Sucrose
  • Sodium Chloride, Dietary