Food Composition Databases (FCDBs): A Bibliometric Analysis

Food composition databases (FCDBs) are important tools that provide information on the nutritional content of foods. Previously, it was largely unclear what nutritional contents and which FCDBs were involved in highly cited papers. This bibliometric study aimed to identify the most productive authors, institutions, and journals. The chemicals/chemical compounds with high average citations and the FCDBs used by highly cited papers were also identified. In July 2023, the online database Web of Science Core Collection (WoSCC) was queried to identify papers related to FCDBs. A total of 803 papers were identified and analyzed. The first paper indexed in WoSCC was published in 1992 by Pennington, which described the usefulness of FCDB for researchers to identify core foods for their own studies. In that paper, the FCDB described was the USDA 1987–88 NFCS (the United States Department of Agriculture 1987–88 Nationwide Food Consumption Survey). The most productive author was Dr. Paul M. Finglas, the Head of the Food Databanks National Capability at the Quadram Institute (Norwich, UK) and the Managing Director of EuroFIR. His most cited paper among this dataset was about the development of an online Irish food composition database together with EuroFIR. The most productive institutions were the USDA and the World Health Organization (WHO) instead of universities. Flavonoids were the most frequently recurring chemical class among the highly cited ones. The anti-oxidative properties and protective effects against heart disease and cancer of flavonoids might be some of the reasons for their popularity in research. Among the highly cited papers, the most heavily used FCDBs were the USDA database for the flavonoid content of selected foods, Fineli, the USDA National Nutrient Database for Standard Reference (USNDB), EuroFIR eBASIS-Bioactive Substances in Food Information Systems, and Phenol-Explorer.
High-quality national and international FCDBs should be promoted and made more accessible to the research and public communities to promote better nutrition and public health on a global scale.


Introduction
Food composition databases (FCDBs) are important tools that provide information on the nutritional content of foods (mainly simple and non-cooked processed foods), including macronutrients (e.g., carbohydrates, proteins, and fats), micronutrients (e.g., vitamins and minerals), and other components (e.g., dietary fiber and water) [1,2]. Nutritional assessment via diet analysis requires two steps: evaluating food consumption qualitatively and quantitatively, followed by converting the food into nutrient intake with the aid of FCDBs [3].
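The second step, converting consumed foods into nutrient intake, can be illustrated with a minimal sketch. The food names, nutrient fields, and per-100 g values below are hypothetical placeholders and are not taken from any real FCDB; the function name is likewise an assumption made for illustration.

```python
# Hypothetical composition values per 100 g of food (illustrative only,
# not taken from any real FCDB).
FCDB = {
    "apple": {"energy_kcal": 50.0, "fiber_g": 2.0},
    "bread": {"energy_kcal": 250.0, "fiber_g": 4.0},
}


def nutrient_intake(consumption_g, fcdb):
    """Convert consumed amounts (grams per food) into total nutrient intake."""
    totals = {}
    for food, grams in consumption_g.items():
        for nutrient, per_100g in fcdb[food].items():
            # FCDB values are expressed per 100 g, so scale by grams / 100.
            totals[nutrient] = totals.get(nutrient, 0.0) + per_100g * grams / 100.0
    return totals


# Step 1 (consumption assessment) is assumed to have produced these amounts.
intake = nutrient_intake({"apple": 200.0, "bread": 80.0}, FCDB)
```

In practice, a food missing from the database would raise a `KeyError` here, which mirrors the data-gap problem that incomplete FCDBs pose for dietary assessment.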
Besides nutritional content, the non-nutrient components have also gained attention in recent years, some of which are called bioactive compounds that exist in plant-based food items and have "health promoting/beneficial and/or toxic effects when ingested" [4]. The data stored in FCDBs is used by nutritionists, dietitians, and researchers to assess the nutritional quality of diets, plan meals, and evaluate how food intake is associated with health [5]. Food manufacturers can also utilize FCDBs to develop and label products [6], and policymakers can use them to derive dietary guidelines and regulations [7].
According to Church [9], the very first food composition table was published in 1818 by Percy and Vauquelin [8]. Since then, many countries have developed their own databases, and there are now several global databases available. One prominent online tool is FoodExplorer [10], developed by the European Food Information Resource Network of Excellence (EuroFIR), which allows access to multiple national FCDBs mainly based in Europe but also in North America, South Africa, Australia, and Japan.
At first glance, it would seem intuitive to think that FCDBs should have international/global coverage so that all researchers and users could refer to identical reference values compiled from a single, standardized dataset. However, there are several challenges. For instance, there is a lack of standardization in food analysis methods [11]. Different laboratories may adopt different analysis methods for food samples, resulting in variations in the reported nutrient content. Moreover, some foods may not be commonly consumed or available around the food laboratories involved in the sampling, resulting in a lack of data or limited data. These scenarios could lead to data inconsistency and incompleteness, rendering it difficult for a FCDB to cover all food items on a global scale. On the other hand, national FCDBs can be very useful, as they can be compiled to record the details of major and even minor food items and food dishes consumed on a national level [12][13][14]. Besides, there exists regional variability in food composition due to many factors, such as climate, technologies, soil, and different cultivars, and hence using a national or regional FCDB could sometimes be more accurate [11].
Global harmonization in methodology is certainly needed among regional databases to allow data interchange. One important initiative in Europe was the formation of the EuroFIR Association Internationale Sans But Lucratif (AISBL, meaning an international non-profit association), which had a mission to "promote harmonization and exploitation of high-quality food composition data and foster cooperation and participation in development with national compiler organizations" [15]. Formed in 2009, EuroFIR AISBL advocates for improved data quality, storage, and accessibility for food information in Europe and the rest of the world. A prior bibliometric analysis showed that over 100 papers published since 2005 mentioned EuroFIR, and they were mainly published in journals dealing with Agricultural and Biological Sciences, Nursing, Medicine, and Chemistry [15].
With the growing literature dealing with FCDBs, it would be beneficial to examine the relevant literature from a bibliometric perspective so that the most productive researchers and institutions could be identified for further knowledge exchange and research collaboration. The highly cited chemicals/chemical compounds could be identified to reveal what nutritional contents received more attention by the scientific community. Moreover, the exact FCDBs used by the highly cited papers were identified, so that future studies and follow-up studies could choose the same FCDBs for easier comparison or choose different FCDBs with different coverage of food items and parameters, depending on their research aims.

Materials and Methods
In July 2023, the online database Web of Science Core Collection (WoSCC) was accessed with the following search string: FCDB OR FCDBs OR "food composition datab*". WoS is a comprehensive literature database with a long history and is the most widely used database in bibliometrics [16]. The search string was applied to the title, abstract, and author keyword fields of indexed publications. No additional filters were placed on other bibliographic aspects, such as publication year or publication language (Table 1). The search yielded 803 publications.
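As a rough illustration of what the search string matches, the sketch below approximates it with a regular expression applied to the three searched fields. This is only an approximation under stated assumptions: the actual WoSCC search engine handles tokenization, punctuation, and word variants in its own way, and the function name is hypothetical.

```python
import re

# Approximation of the query: FCDB OR FCDBs OR "food composition datab*".
# The trailing * is a wildcard, so "database" and "databases" both match.
# Case-insensitive matching is assumed here.
PATTERN = re.compile(r"\bFCDBs?\b|\bfood composition datab\w*", re.IGNORECASE)


def matches_query(title, abstract, keywords):
    """Return True if the title, abstract, or author keywords contain a query term."""
    fields = (title, abstract, " ".join(keywords))
    return any(PATTERN.search(field) for field in fields)
```

For example, a record whose abstract mentions "food composition databases" would be retrieved, whereas one that only mentions "nutrition" would not.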
Publication and citation counts were extracted directly from the WoSCC database. Counts from England, Scotland, Northern Ireland, and Wales were merged to represent the United Kingdom. The complete records of the publications were exported into VOSviewer (version 1.6.19, Centre for Science and Technology Studies of Leiden University, Leiden, The Netherlands) [17] for processing and visualization of a term map, with default parameters applied. In brief, the "Create a map based on text data" function was chosen. Then the option "Read data from bibliographic database files" was chosen. In the "Choose fields" step, title and abstract fields were chosen, and the options "Ignore structured abstract labels" and "Ignore copyright statements" were checked. Binary counting of the terms was chosen. The term map showed the recurring terms from the title and abstract of the analyzed publications. To improve visual clarity, the map showed only terms appearing in at least 1% (n = 8) of the publications, a threshold commonly adopted by previous studies [18–20]. Each term is represented as a node, with the node size indicating the publication count, its color indicating the citations per publication (CPP), and the inter-node distance indicating the frequency of co-occurrence.
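The binary counting, the 1% threshold, and the per-term CPP can be sketched as follows. This is not VOSviewer's implementation: term extraction from titles and abstracts is assumed to have been done already, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def term_statistics(papers, min_share=0.01):
    """Compute publication count and CPP for each sufficiently frequent term.

    papers: list of (terms, citation_count) pairs, one per publication.
    Binary counting: a term counts once per paper, however often it occurs there.
    """
    doc_count = defaultdict(int)
    citations = defaultdict(int)
    for terms, cites in papers:
        for term in set(terms):  # set() enforces binary counting
            doc_count[term] += 1
            citations[term] += cites
    # Minimum number of publications a term must appear in (1% of 803 rounds to 8).
    threshold = max(1, round(min_share * len(papers)))
    return {
        term: (count, citations[term] / count)
        for term, count in doc_count.items()
        if count >= threshold
    }
```

Each retained term maps to a pair (publication count, CPP), which correspond to the node size and node color in the term map, respectively.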

Results
The cumulative publication and citation counts of the FCDBs research are shown in Figure 1. The 803 papers were cited 21,813 times in total, with an h-index of 76 and a CPP of 27.2. The first paper indexed in WoSCC was published in 1992 by Pennington [21] and described the usefulness of FCDB for researchers to identify core foods for their own studies. In that paper, the FCDB described was the USDA 1987–88 NFCS (the United States Department of Agriculture 1987–88 Nationwide Food Consumption Survey). Since this paper was published, the publication and citation counts have gradually increased over the years. The ratio of original articles (n = 677, CPP = 28.4) to reviews (n = 56, CPP = 42.2) was 12:1. The top 5 most productive authors, institutions, countries, journals, and journal categories are listed in Table 2. The most productive author was Dr. Paul M. Finglas, the Head of the Food Databanks National Capability at the Quadram Institute (Norwich, UK) and the Managing Director of EuroFIR. His most cited paper among this dataset was about the development of an online Irish food composition database together with EuroFIR [22]. Interestingly, the most productive institutions were the USDA and the World Health Organization (WHO) instead of universities. Meanwhile, the most productive countries were led by the United States, followed by European countries and Australia. The top five most productive journals consisted of some traditional journals, such as the Journal of Food Composition and Analysis (started in 1987; 2022 impact factor = 4.3; Q2 in JCR Food Science and Technology and Chemistry, Applied), as well as newer journals, such as Nutrients (started in 2009; 2022 impact factor = 5.9; Q1 in JCR Nutrition and Dietetics).
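For readers less familiar with these indicators, the sketch below shows how an h-index and a CPP are conventionally computed from per-paper citation counts. The function names are assumptions for illustration; in this study the values were taken directly from WoSCC rather than recomputed.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def citations_per_publication(total_citations, n_papers):
    """CPP: total citation count divided by the number of publications."""
    return total_citations / n_papers


# Reported totals: 21,813 citations across 803 papers.
overall_cpp = round(citations_per_publication(21813, 803), 1)
# Reported document types: 677 original articles and 56 reviews.
article_review_ratio = round(677 / 56)
```

With the reported totals, `overall_cpp` evaluates to 27.2 and `article_review_ratio` to 12, consistent with the figures above.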
The recurring terms from the title and abstract of the analyzed publications are visualized as a term map in Figure 2. Terms with the highest CPP (yellow nodes) were concentrated at the top right corner of the figure. Many of these terms were chemicals or compounds (Table 3).
The FCDBs used as the data source or involved in the methodology among the top 50 most cited papers were recorded (Table 4). The top five FCDBs were the USDA database for the flavonoid content of selected foods, Fineli, the USDA National Nutrient Database for Standard Reference (USNDB), EuroFIR eBASIS-Bioactive Substances in Food Information Systems, and Phenol-Explorer.
The top 10 most-cited papers are reported here. Many of them concerned flavonoids and polyphenols and ranged from establishing new FCDBs to using data from existing FCDBs for their analysis (Table 5).
It was an overview of dietary flavonoids, covering the nomenclature, occurrence, and intake, and reviewed the estimated intakes of selected subclasses of flavonoids in several countries based on data from FCDBs. This paper commented that teas provide rich dietary sources of flavan-3-ols, flavonols, and derived tannins in many countries, but FCDB values for tannin derivatives are "weak at best and in most cases nonexistent".
590 [49] Flavonols, flavones, and flavanones (Review). It covered the evidence from epidemiological studies on the association between human health and the intake of flavonols, flavones, and flavanones, such as a risk reduction of age-related chronic diseases ranging from cancer to cardiovascular disease and other chronic conditions. The authors concluded that clinically controlled trials should be conducted to further test the associations identified from these epidemiological studies.
[50] Polyphenols (Review). It briefed readers on the chemistry, occurrence, and human health of polyphenols. It covered the most common classification of the phenolic compounds into two major groups: flavonoids and non-flavonoids called polyphenols. Due to their applications in food preservation and therapeutic usage, much research has been conducted to elucidate the association between their intake and numerous diseases, ranging from diabetes and hypertension to cardiovascular disease and cancer.
[43] Rice antioxidants (Article). By compiling data from over 300 papers, this article built a FCDB on the contents of various antioxidants contained in rice, such as phenolic acids, flavonoids, gamma-oryzanol, anthocyanins, proanthocyanidins, phytic acid, tocopherols, and tocotrienols. It also highlighted that black rice had the highest antioxidant activities, followed by purple, red, and brown varieties, and japonica varieties had a higher antioxidant content than indica varieties. The FCDB was constructed based on data/values provided by published papers around the world, so the presence of certain food compounds and values depended on their availability in the existing literature.
365 [29] Non-specific (Article). It described the development, validation, and calibration of a quantitative food frequency questionnaire designed to target Singapore Chinese and the subsequent development of a FCDB for analyzing the collected dietary data.
330 [51] Tocopherol, tocotrienol, and plant sterol (Article). It reported the tocopherol, tocotrienol, and plant sterol contents of 14 vegetable and 9 industrial fats/oils commercially available in Finland. Results were compared to the values listed by Fineli and the USDA National Nutrient Database.
[52] Polyphenols (Article). It collected dietary records from adults to estimate the quantity of dietary intake and the major sources of polyphenols in Finland. Results found that phenolic acids comprised the dominant group of polyphenol intake, followed by proanthocyanidins, anthocyanidins, and other flavonoids. Coffee, cereals, berries, and fruits were the major sources.
[53] Phytosterol (Article). It analyzed the phytosterol composition of nuts and seeds commonly available in the United States. Results found that sesame seed and wheat germ had the highest phytosterol content, whereas Brazil nuts had the lowest. Among the common snack foods, pistachios and sunflower kernels had the highest phytosterol levels, though they were behind sesame seed and wheat germ.
[54] Non-specific (Review). Published before the establishment of EuroFIR, it suggested that the development of a pan-European FCDB should be considered to standardize the quantification method, the determination of the consumption pattern of individual foods, and the integration of the likelihoods of large amounts of consumption and chemical quantity at these high levels.
[55] Berry phenolics (Review). It was a review of the antioxidant and antimicrobial activities of berry phenolics in Finland. Evidence from studies of cranberries, cultivated and wild blueberries, black currants, cloudberries, lingonberries, and red raspberries was discussed.

Discussion
The bibliometric analysis of FCDBs literature has found that the 803 papers were cited 21,813 times in total, with an h-index of 76 and a CPP of 27.2. The ratio of original articles to reviews was 12:1. The most productive institutions were USDA and WHO instead of universities. This was largely different from related research fields such as nutraceuticals and functional foods [56] and ethnopharmacology [57], both of which were dominated by university research. Among the top 50 most cited papers, the heavily used FCDBs were the USDA database for the flavonoid content of selected foods, Fineli, the USDA National Nutrient Database for Standard Reference (USNDB), EuroFIR eBASIS-Bioactive Substances in Food Information Systems, and Phenol-Explorer.
FCDBs are important resources for multiple stakeholders that provide detailed information on the nutritional composition of food items, such as data on essential nutrients, vitamins, minerals, and other bioactive components present in the diverse food products indexed by the databases. The information contained in FCDBs could be applied in numerous fields, including nutrition, healthcare, food science, agriculture, and public health.
As the top five recurring journal categories were Nutrition Dietetics, Food Science Technology, Chemistry Applied, Public Environmental Occupational Health, and Endocrinology Metabolism, the relevance of FCDBs to some of these research fields is briefly covered here. In nutrition and dietetics, dieticians could rely on FCDBs to develop personalized meal plans and provide dietary recommendations for individuals with specific dietary needs. In food science, FCDBs also provide valuable information on the nutritional composition of raw materials and food dishes. One Indonesian study devised a "low sodium, high potassium" healthy diet based on information from the Indonesian food composition database; however, it found that a high potassium and high fiber diet made the menu more expensive [58]. Moreover, the food industry could rely on FCDBs for product development and marketing, whereas consumers could make informed food choices and have increased trust in the validity of food product nutritional labels. The Food Label Information Program (FLIP) from the University of Toronto was a good example [6]. It provided comprehensive food product nutrition information (from package labels) for Canadian pre-packaged food and beverages. In terms of public health, FCDBs could play an important role in the development and assessment of public health policies and interventions. For instance, a large-scale study in France referred the dietary records of participants to a FCDB to check the extent of processing of the food they consumed and found that the intake of ultra-processed foods was associated with a gain in body mass index and a higher risk of overweight as well as obesity [59]. As such, the government might promote the consumption of minimally processed foods. Besides, the data provided by FCDBs could assist governments and international agencies in devising food fortification programs and designing strategies to address malnutrition.
Nutritional data on edible insects, for instance, were entered into the FAO/INFOODS Food Composition Database for Biodiversity (BioFoodComp) so that people could have a reliable data source regarding the protein and micronutrient contents of the common species [60]. For endocrinology and metabolism, FCDBs could provide data in epidemiological research to elucidate the associations between diet, nutrition, and health outcomes by estimating nutrient intake and associating it with the incidence of chronic diseases, such as cardiovascular diseases and type 2 diabetes. One pan-European study demonstrated an inverse association between flavonoids, especially flavanols and flavonols, and the incidence of type 2 diabetes [61]. It suggested the beneficial role of flavonoids in preventing diabetes.
The information from FCDBs could also be relevant to agriculture. Usually, the nutritional values of vegetables listed in FCDBs do not consider or reflect seasonal variations. For instance, one study identified vitamin C rich vegetables from the USNDB and sampled them in different seasons [62]. Results found that vitamin C content was much higher in winter-sampled spinach, in potatoes sampled in summer/fall, and in oranges sampled in winter/spring, implying that the average values stored in FCDBs might be over- or under-estimated when seasonal changes are considered. Accurate knowledge of the nutritional composition of crops is essential for agricultural planning and decision-making. FCDBs help in identifying nutrient-rich crop varieties and promoting their cultivation to enhance food security and combat malnutrition. Additionally, they provide insights into the potential impact of climate change and agricultural practices on food composition, allowing for the development of adaptive strategies to safeguard the nutritional quality of crops.

Limitations
This bibliometric study had some limitations. First, this study relied on a single database, WoSCC. This was done because each database records citation counts differently. However, publications not indexed by WoSCC would be missed. Second, by means of "obliteration by incorporation" [63,64], some older papers would not be cited by newer papers anymore when the initial findings have been regarded as "common sense" or "general knowledge" by the current standard. Hence, the citation count might not completely reflect the impact of some papers. Meanwhile, VOSviewer also had its limitations, such as the inability to pre-define some word phrases to be counted or to identify the exact papers that contributed to the counts of a particular term or country.

Conclusions
Overall, FCDBs are very important resources and tools for stakeholders in many fields, such as nutrition, food science, public health, and healthcare. FCDBs supported laboratory and human research studies, evidence-based policymaking, and consumer education. Many of the FCDB papers dealt with flavonoids, whose anti-oxidative properties and protective effects against heart disease and cancer might be some of the reasons for their popularity in research. Meanwhile, one of the most commonly used FCDBs among the most cited papers was the USDA database for the flavonoid content of selected foods. High-quality national and international FCDBs should be promoted and made more accessible to the research and public communities to promote better nutrition and public health on a global scale.
Funding: Publication made possible in part by support from the HKU Libraries Open Access Author Fund sponsored by the HKU Libraries.