Quality of Systematic Reviews of the Foods with Function Claims in Japan: Comparative Before- and After-Evaluation of Verification Reports by the Consumer Affairs Agency

Background: In Japan, a new type of foods with health claims, called Foods with Function Claims (FFC), was introduced in April 2015 in order to make more products available that are clearly labeled with certain health functions. Regarding substantiating product effectiveness, scientific evidence for the proposed function claims must be explained by systematic reviews (SRs), but the quality of SRs was not clear. The objectives of this review were to assess the quality of SRs based on the FFC registered on the Consumer Affairs Agency (CAA) website in Japan, and to determine whether the CAA’s verification report in 2016 was associated with improvement in the quality of SRs. Methods: We evaluated the reporting quality of each SR by the AMSTAR checklist on methodological quality. We searched the database from 1 April to 31 October 2015 as the before-SR and from 1 July 2017 to 31 January 2018 as the after-SR. Results: Among the 104 SRs reviewed, 96 final products were included: 51 (53.1%) were supplements, 42 (43.8%) were processed foods without supplements, and 3 (3.1%) were fresh foods. Of the 104 SRs, 92 (88.5%) were qualitative reviews (i.e., without meta-analysis) and 12 (11.5%) performed a meta-analysis. The average quality score of before-SRs and after-SRs was 6.2 ± 1.8 and 5.0 ± 1.9, respectively, a statistically significant decrease (p < 0.001). Conclusion: Overall, the methodology and reporting quality of after-SRs based on the FFC were poorer than those of before-SRs. In particular, there were very poor descriptions and/or implementations of study selection and data extraction, search strategy, evaluation methods for risk of bias, assessment of publication bias, and formulating conclusions based on methodological rigor and scientific quality of the included studies.


Background
The Codex Alimentarius Commission (CAC) is an intergovernmental organization that was founded in 1962 to develop food standards, guidelines, and codes of practice [1]. The basic principles of CAC are that health claims should be substantiated by currently sound and sufficient scientific evidence, provide truthful and nonmisleading information that consumers can use to choose healthy diets, and be supported by specific consumer education [2].
In accordance with CAC guidelines, only government-approved Foods for Specified Health Uses (FOSHU) and foods with nutrient function claims (FNFC) can make function claims on food labels in Japan, and these must comply with specifications and standards designated by the government [3]. The FOSHU are scientifically accepted for their usefulness in maintaining and promoting health and are therefore permitted to contain food effects and safety claims that have been evaluated by the government.
Foods approved as FNFC can be used to supplement or complement the nutrients (vitamins, minerals, etc.) that are in insufficient quantities in an individual's daily diet. These foods can carry a nutrient function claim prescribed by the government standards and can be freely manufactured and distributed without any permission from or notification to the national government [3]. In addition to these categories, a new type of foods with health claims, called Foods with Function Claims (FFC), was introduced in April 2015 (Figure 1). The FFC allows manufacturers to submit labeling to the Secretary-General of the Consumer Affairs Agency (CAA) in Japan that indicates the food is expected to have a specific effect on health, except for reducing the risk of diseases. Unlike the strict evaluation criteria applied through the FOSHU and FNFC processes, the FFC is only a notification system in which food manufacturers must meet five unique and specific criteria (Table 1). Although the government does not evaluate the safety and effectiveness of the submitted product (i.e., it is a notification system), the industry (applicant) must fulfill several procedures to submit a notification. All the FFC criteria submitted by the manufacturers are disclosed on the website of the CAA, which gives approval for the labeling of food products.
For a food product to claim effectiveness on its label, evidence for its proposed function claims must be substantiated by one of two standard scientific methods: clinical trials such as randomized controlled trials (RCTs), or systematic reviews (SRs). Details about the use of these two methods for food with function claims have been published on the CAA website [4]. A notable point in this system is that not only RCTs but also SRs are permitted. Since promoting deregulation is a national goal, SRs have the advantage of being easier for small and medium-sized enterprises to submit because they are less expensive than RCTs.
Table 1. Characteristics of the foods with function claims in Japan.
1. Foods with Function Claims are for people not suffering from any disease (excluding minors, pregnant women (and those planning a pregnancy), or lactating women).
2. All food products including fresh produce are subject to this system.*
3. Prior to market entry (at least 60 days before), food business operators are required to submit information, such as on food safety and effectiveness and the system in place to collect information on adverse health effects, to the Secretary-General of the Consumer Affairs Agency.
4. Unlike Foods for Specified Health Uses, the government does not evaluate the safety and effectiveness of the submitted product (i.e., it is a notification system).
5. All submitted information is disclosed on the website of the Consumer Affairs Agency.**
Modified partially for this study based on the Consumer Affairs Agency website in Japan.
* Excluding Foods for Special Dietary Uses (including FOSHU), FNFC, alcohol-containing beverages, and food products that may lead to the excessive consumption of fat, cholesterol, sugar (limited to mono- and disaccharides, excluding sugar alcohols), or sodium.
** All information was written only in Japanese.
The methodology of an SR with or without meta-analysis may not be familiar to general or nutritional researchers in the food industry. An SR addresses a question that is carefully formulated to be answered by analysis of all available evidence. It performs an objective literature search by applying predetermined inclusion and exclusion criteria to critically appraise what literature is relevant [5]. The SR is an important method that can help researchers to identify evidence of an effective intervention from a large volume of published biomedical literature.
However, although the methodology of an SR is important in terms of evidence-based nutrition (EBN), an SR has the weakness that assessment of fresh foods that most people eat daily is very difficult. Additionally, an SR may be of limited use if the methods used to conduct the SR are flawed and reporting of the SR was incomplete [6]. Moreover, the scientific validity of an SR is based on deductive planning and clear documentation of the methodological approach that was employed to design and conduct the SR [7]. We were interested in evaluating whether or not SRs of the FFC, which is based on a notification system, had been conducted by appropriate scientific methods, and we hoped to formulate a research challenge for future SRs of the FFC.
In our previous study [8], we adopted a well-known measurement tool for the 'assessment of multiple SR' (AMSTAR checklist) [9] and assessed the quality of 49 SRs that were based on the FFC registered on the CAA website from 1 April to 27 October 2015. Results from that study showed that the methodology and reporting quality of SRs were in the poor description category (mean ± SD: 6.2 ± 1.8 points, range 2-11 points out of a maximum of 11 points). Based on scientific quality, the SRs had very poor descriptions and/or implementation of the registration, poor evaluations of publication bias, and questionable conclusions.
On the other hand, the CAA in 2016 formed an expert working group (methodologists for SR) in order to identify issues for appropriate operation of the FFC system and perform a verification [10] according to the PRISMA statement [11]. Fifty-one submitted SRs were selected for evaluation of quality. These SRs were all registered on the CAA website from 1 April to 31 October 2015. To complete basic standard-level SRs, considering the difficulty in handling foods, this project team attached "appropriate description in SRs based on the 'PRISMA Checklist: an extended version for submitted SRs of Foods with Function Claims'" to its final report and included detailed proposals and examples. Most authors of the present study (HK, HO, TK, JK, MS, and HTO) also participated in this CAA project. Since the report was a specific guideline for performing and submitting new SRs and was distributed to food business operators in Japan, we assumed that all researchers in this field watched it closely, followed the checklist, and completed an appropriate description afterward. Therefore, we hypothesized that the CAA's 2016 verification report based on PRISMA [10], together with our 2017 article on quality evaluation [8], was associated with improvement in the quality of subsequently submitted SRs.
The present study design was based on a previous comparative before- and after-evaluation in which RCTs published in 1994 (pre-CONSORT) were compared with RCTs published in the same journals in 1998 (post-CONSORT) [12].
The objectives of this review were to assess the quality of SRs based on the FFC registered on the CAA website in Japan, and to determine whether the CAA's verification report in 2016 was associated with improvement in the quality of SRs.

Scope of This Review
The basic scientific approach of the FFC system ensures safety, functionality, and effectiveness. The purpose of this study was to assess only the quality of SRs, and it therefore focused on face and content validity for measuring the methodological quality of SRs. Whether each product or functional substance involved is effective is a separate research issue and was not in the scope of this study.
The PRISMA statement is a respected reporting guideline designed to improve the completeness of an SR report [11]. Furthermore, although a new critical appraisal tool for SRs (AMSTAR 2) was available [13], we had already performed the before-evaluation based on the AMSTAR checklist [9]. Therefore, we performed the after-evaluation with the same tool in order to compare the evaluation results.

Criteria for Considering Studies Included in This Review
Criteria for considering studies that were included in this review were based on those in the predefined protocol.

Types of Studies
Studies were eligible if they were SRs (with or without a meta-analysis).

Types of Participants
This study was a review based on SRs and was therefore restricted to original SRs of healthy adults (people not suffering from any disease).

Comparator(s)/Control
In the original SRs, controls were defined as healthy adults identified from preplanned stratified analyses of (a) placebo controls or waiting list controls, (b) intervention groups that compared different types of products or ingredients, and (c) low-or medium-level intake groups of the same product or ingredient.

Types of Intervention and Language
For processed foods in the form of supplements, studies included at least one intervention group in which the functional ingredient and the final product were applied. For fresh food and other processed food, studies included at least one exposure group in which the functional ingredient and the final product were applied and included observational studies and intervention studies.

Types of Outcome Measures
Outcome measures included many types of positive contributions to health, to the improvement of a function, or to preserving health as an outcome. In effect, we included all notified SRs.

Search Strategies
Our search of the databases on the CAA website covered the period from 1 April 2015 (starting date) through 27 October 2015 for the before-SRs [8], and from 1 July 2017 to 31 January 2018 for the after-SRs. The special search strategies contained the elements and terms (i.e., a specific search method based on keywords) on the CAA website. All references in identified SRs were screened. The search was performed by the steering author (HK).

Hand-Searching and Reference Checking
Since this study was limited to SRs registered in the CAA database, hand-searching and reference checking were not applicable.

Selection of Studies
To select the studies that were to be reviewed, all criteria were applied by the steering author (HK) to the full library of articles published on the CAA website. Studies were selected when (i) the design was an SR based on an intervention study, (ii) the study was appropriately notified by the CAA, and (iii) the study was published on the website. Excluded studies (notifications) are presented with the reasons for exclusion.

Quality Assessment of Included Studies
To ensure that variation was not caused by systematic errors in study design or execution, four review authors (YW, TY, MS, and JK) independently assessed the quality of articles (i.e., every two reviewers were paired). Disagreements and uncertainties were resolved by discussion with another author (HK). A full quality appraisal of these papers was made using a combined tool based on the AMSTAR checklist that was developed to assess methodological quality of SRs. Each item was scored as 'present' (Yes), 'absent' (No), 'unclear or inadequately described' (Can't answer), or 'Not applicable' (n/a). Depending on the study design (with or without meta-analysis), some items were not applicable; therefore, the "n/a" score was not considered to be an error in the calculation for quality assessment. Although the original AMSTAR has 11 check items, two meanings can be applied to item #3, which reads as follows: "Was a comprehensive literature search performed. At least two electronic sources should be searched. The report must include years and databases used. Key word and/or MESH terms must be stated and where feasible the search strategy should be provided (continued)." Because the guideline for FFC notification on the CAA website requires the use of at least two electronic databases, we divided #3 into two parts in order to detect any trend arising from the use of databases: "#3a; which databases did the SR use or number of the other databases", and "#3b; Did the SR use MESH terms and related search function to detect comprehensively".
All authors attended one 3-hour consensus-training session based on the AMSTAR checklist before starting the quality assessment to ensure that they used the same criteria and correctly evaluated the check-items for an SR.
For each of the 11 check items (excluding #3a), the percentage of SRs in which the description was present was determined. Then, based on this percentage, which reflects the risk of poor methodology and/or bias, each item was assigned to one of the following categories: good description (80-100%), poor description (50-79%), or very poor description (0-49%).
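As an illustration only, the item-level categorization described above could be sketched in Python as follows. The function names, the rating labels, and the decision to drop 'n/a' ratings from the denominator are assumptions made for this sketch and are not details confirmed by the study; the cut-offs of 80% and 50% follow the three bands given in the text.

```python
# Hypothetical rating labels; the study used Yes / No / Can't answer / n/a.
YES, NO, CANT_ANSWER, NOT_APPLICABLE = "yes", "no", "cant_answer", "n/a"


def percent_present(ratings):
    """Return the percentage of 'yes' ratings among applicable SRs.

    Assumption: 'n/a' ratings are excluded from the denominator.
    Returns None when no SR is applicable for the item.
    """
    applicable = [r for r in ratings if r != NOT_APPLICABLE]
    if not applicable:
        return None
    return 100.0 * sum(r == YES for r in applicable) / len(applicable)


def description_category(percent):
    """Map a percentage to the three bands used in the text."""
    if percent >= 80:
        return "good description"
    if percent >= 50:
        return "poor description"
    return "very poor description"


# Made-up ratings for one checklist item across five SRs.
example = [YES, YES, NO, CANT_ANSWER, NOT_APPLICABLE]
share = percent_present(example)      # 2 of the 4 applicable SRs
label = description_category(share)
print(f"{share:.1f}% -> {label}")
```

With the made-up ratings, two of the four applicable SRs are rated 'yes', so the item falls in the poor description band.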
Disagreements and uncertainties were resolved by discussion with other authors (HK, TK, and HO). Interrater reliability was calculated by the steering author (HK) on a dichotomous scale using percentage agreement and Cohen's kappa coefficient (k).
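For readers unfamiliar with these two agreement measures, a minimal Python sketch is given below. It assumes that each rater's judgments have been collapsed to a dichotomous code (1 for 'Yes', 0 otherwise); that coding, the function names, and the example ratings are hypothetical and are not taken from the study data.

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters gave the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    labels = set(rater_a) | set(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(
        (rater_a.count(lab) / n) * (rater_b.count(lab) / n) for lab in labels
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Made-up dichotomous ratings for ten checklist items.
rater_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
rater_2 = [1, 0, 0, 0, 1, 1, 1, 1, 0, 1]
agreement = percent_agreement(rater_1, rater_2)
kappa = cohens_kappa(rater_1, rater_2)
print(f"agreement = {agreement:.1%}, kappa = {kappa:.3f}")
```

In this made-up example the raters agree on 8 of 10 items, while chance agreement is 0.52, so kappa is noticeably lower than the raw percentage agreement.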

Characteristics of Studies and Data Extraction
Two authors (HK and HO) described the characteristics from each article based on information on the CAA website but did not produce a structured abstract for SRs, which is recommended [14]. Because this study focused on evaluating quality of SRs of the FFC, it did not summarize evidence for the effectiveness of each SR.

Research Protocol Registration
We submitted and registered our research protocol to PROSPERO (CRD42017080833) and UMIN-CTR (UMIN000029821). PROSPERO is an international database of prospectively registered SRs in health and social care [15]. Key features from a review protocol are recorded and maintained as a permanent record in PROSPERO. UMIN-CTR is a Japanese and international database of prospectively registered clinical trials and other trials with SRs in health and social care and was accepted as an international registry database by the International Committee of Medical Journal Editors in 2007. In a previous study [8], we implemented our protocol before UMIN-CTR was formally launched on 1 April 2015, and we planned to continue checking target SRs prospectively from our study start date. In the present study, we also planned to continue reviewing target SRs prospectively from 1 July 2017 to 31 January 2018.

Statistical Analysis
A two-sample t test was employed for comparisons between the two periods (number of databases and before- and after-evaluation scores) with continuous variables in the analysis. The χ² test and Fisher's exact test were performed with discrete variables (i.e., number and % of good description on each item). Statistical analysis was performed with SPSS version 23.0 (IBM Corporation, Armonk, NY, USA) for Windows. For all analyses, p-values less than 0.05 were considered statistically significant.
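As a rough plausibility check rather than a reproduction of the SPSS analysis, the t statistic for the score comparison can be approximated from the summary values reported in this article (6.2 ± 1.8 for before-SRs and 5.0 ± 1.9 for after-SRs). The sketch below assumes group sizes of 49 and 104 SRs, as stated elsewhere in the text, and an equal-variance (pooled) test; both are assumptions, and the function name is hypothetical.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's two-sample t statistic from means, SDs, and group sizes.

    Assumes equal variances in the two groups (pooled variance).
    Returns the t statistic and its degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# Summary values from the text; the group sizes are assumed (49 and 104).
t_stat, dof = pooled_t_from_summary(6.2, 1.8, 49, 5.0, 1.9, 104)
print(f"t = {t_stat:.2f}, df = {dof}")
```

Under these assumptions the statistic is about 3.7 with 151 degrees of freedom, which is in line with the reported p < 0.001.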

Study Selection and Characteristics
Of the 294 potentially relevant notifications identified in the search, 198 were excluded because they did not meet the eligibility criteria (Figure 2). A total of 104 SRs (including eight multiple claims) met all inclusion criteria. The language of all eligible publications was Japanese.

Quality Assessment
We evaluated 11 items from the AMSTAR checklist in more detail (Table 3 and Table S1). Interrater reliability metrics for the quality assessment indicated substantial agreement (71.7%, k = 0.558) for all 1144 items (11 items multiplied by 104 SRs). Overall, there was a decrease over time in evaluation score. The average quality score for before-SRs and after-SRs was 6.2 ± 1.8 and 5.0 ± 1.9, respectively, which was a statistically significant decrease (p < 0.001).
There was a good description and/or implementation for the following items: "Was an 'a priori' design provided?" (before-, 4% and after-, 86%, p < 0.001); "A list of included and excluded studies should be provided." (before-, 100% and after-, 98%, p = 0.329); "Were the methods used to combine the findings of studies appropriate?" (before-, 56% and after-, 92%, p = 0.116); and "Were the characteristics of the included studies provided?" (before-, 84% and after-, 88%, p = 0.521). These items either retained a good description or improved in the after-evaluation.
There continued to be a poor description and/or implementation for the item, "Was the scientific quality of the included studies assessed and documented?" (before-, 73% and after-, 59%, p = 0.076).
The other items had a very poor description and/or implementation: "Was there duplicate study selection and data extraction?" (decreased from 65% to 41%, p < 0.01); "Did the SR use the MESH terms and related search function to detect comprehensively?" (decreased from 53% to 47%, p = 0.492); "Was the status of publication used as an inclusion criterion?" (decreased from 24% to 3%, p < 0.001); "Was the scientific quality of the included studies used appropriately in formulating conclusions?" (decreased from 27% to 26%, p = 0.94); "Was the likelihood of publication bias assessed?" (increased from 12% to 13%, p = 0.964); and "Was the COI stated?" (decreased from 78% to 25%, p < 0.001).
For the component of #3a, "which databases did the SR use or number of the other databases?", the number of databases used was the same in the before- and after-evaluations (mean ± SD was 3.8 ± 1.8, p = 1.000). In the before- and after-evaluations, respectively, the most frequently used databases were PubMed (93.9% and 100%), JDream III (a Japanese database, 79.6% and 60.6%), Ichushi-Web (a Japanese database, 67.3% and 64.4%), The Cochrane Library (with CENTRAL, 49.0% and 55.8%), and UMIN-CTR (a Japanese clinical trial registry, 18.4% and 23.1%).

Discussion
This is the first prospective before- and after-evaluation of SRs of the FFC registered on the CAA website in Japan.
The FFC in Japan is an original and unique system regarding health claims. A food business operator must submit a completed notification and related documents to the Secretary-General of the CAA 60 days prior to the launch date. Therefore, all consumers can check all content such as safety, functional mechanism, and effectiveness (i.e., total evidence) of the product, resulting in high transparency.
We propose that this study will be helpful to researchers and government officials who want to know about new health claims in advanced countries. We expected that the total quality of after-SRs might be improved significantly by the CAA's verification report in 2016 [10], but this study instead showed deterioration in quality. Therefore, it is necessary to discuss the interpretation of these findings and propose a practical future strategy for this issue.

Quality Assessment of Target SRs
Overall, the quality of articles significantly decreased in conduct and reporting. Although four items (#1, #5, #6, and #9) in the after-SRs group improved or retained a good description, the other seven items were poor or very poor.
The methodology for most SRs did not attempt to include so-called grey literature by the use of many other types of databases and classical literature searches. Grey literature was defined here as studies that are unpublished, have limited distribution, and/or are not included in the bibliographical retrieval system [16]. The importance of including grey literature in all SRs has been previously discussed [17]. Implementers of SRs need to recognize the importance of also searching grey literature.
"Was the likelihood of publication bias assessed?" also remained flawed in the assessment process. Additionally, many SRs used "at least two electronic sources", but these were only Japanese databases and/or not the more traditional English databases like EMBASE or MEDLINE. Furthermore, it has been pointed out that there is a bias in coverage with only one database (i.e., PubMed) [18,19]. Researchers performing SRs therefore need to use multiple databases.
Publication bias remains an area of contention amongst researchers who assess the quality of SRs [20,21]. However, it remains a research priority because it is unclear what impact publication bias has on making decisions in healthcare [9]. We assume that the new FFC guideline provides a better description of how to assess publication bias, especially for a qualitative SR.
"Was the conflict of interest stated?" was a serious problem. Although most SRs described a part of the COI, they did not include all necessary information such as the SR's sponsor, SR's funding, author's affiliation, SR's outsourcing information (research agency), supervision allowance, and consulting fees for an SR. In fact, the targeted SRs included those that were conducted only by the company itself, those conducted by other companies, such as raw material makers, those conducted by a research agency, and those supervised by academia researchers. We assume that the primary reason reviewers of quality assessment judged many SRs as 'unclear or inadequately described' (Can't answer) was because they could not cover these elements properly. The International Committee of Medical Journal Editors (ICMJE) emphasizes that when authors submit a manuscript of any type or format, they are responsible for disclosing all financial and personal relationships that might bias or be seen to bias their work [22].
"Was there duplicate study selection and data extraction?" was also a very poor description. This was described clearly in the CAA's verification report in 2016 [10]. Because everyone makes mistakes occasionally, there should be at least two independent data extractors, and a consensus procedure for disagreements should be in place. It was not clear why two additional researchers did not perform independent assessments for some of the SRs.
"Did the SR use the MESH terms and related search function to detect comprehensively?" got worse in the after-evaluation. The guideline [4] instructs that "To search comprehensively, a search formula made by combining free items and controlled terms (including MeSH for PubMed) appropriately will be set per bibliographic databases." In addition, the report [10] points out that it is essential to design an optional search formula by combining keywords and thesauruses (such as MeSH) appropriately for each clinical question according to each database characteristic.
We assume that there were multiple reasons for the significant decline in quality. The FFC system is just a notification system, so the CAA does not evaluate the safety and effectiveness of a submitted product. The number of notifications has increased since the system was launched in 2015, partly to support sales promotion, but one reason for this increase might be that multiple companies purchased copies of SRs that had already been accepted by the CAA and submitted them to the CAA as the basis for their evidence. Therefore, many low-quality SRs may have entered the FFC system, which may explain the deterioration in the after-evaluation.

Validity and Reliability of Quality Assessment by AMSTAR Checklist
For the before-evaluation, we adopted a measurement tool used for the 'assessment of multiple SR' (AMSTAR checklist). The "R-AMSTAR" [23] was also developed as an approach to minimize bias of any kind in SRs. In terms of interrater reliability and validity, the AMSTAR score was very high compared with scores from other tools. In terms of feasibility, it was very appropriate that scoring time of the AMSTAR was short (between 10 and 20 min) [24]. Furthermore, a recent methodological study showed that AMSTAR and the risk of bias in systematic reviews (ROBIS) had similar interrater reliability but differed in their construct and applicability [25].
Although we had one consensus-training session and all reviewers had conducted a quality assessment of SRs more than once, the interrater reliability metrics for the quality assessment indicated substantial agreement (71.7% on average, k = 0.558). This can also be interpreted to mean that, for many SRs, the quality reviewers were unsure whether an item should be rated "Yes", "No", or "Can't answer".
Additionally, the reviewers seemed to have some ambiguity about the details of each item. Recently, a quality assessment checklist, AMSTAR 2, was developed that allows for individual responses that do not impart judgment for each item [13]. AMSTAR 2 retains 10 of the original domains, has 16 items in total (compared with 11 in the original), has simpler response categories than the original AMSTAR, includes a more comprehensive user guide, and includes the identification of high-quality SRs. It might be useful to evaluate the detailed quality for each item of the SRs in a future study. Table 4 shows the future research challenges for studies on the health enhancement effects of the FFC and related healthy foods. We assumed that there are three important dimensions and six tasks for improving systematic reviews. Regarding the food industry, researchers must study the current standard rules of an SR (i.e., AMSTAR 2, PRISMA checklist, and PRISMA-NMA checklist for meta-analysis) [26] before research is conducted. If an applicant is concerned about the implementation of an SR, they should immediately consult with experts on research methodology, which will avoid creating inappropriate SRs. Moreover, since the CAA only performs formal confirmation of documents, the methodology of the SRs that had already been notified was not always correct. Therefore, if another company's SR is reused for a notification, it becomes necessary for an applicant to carefully examine the SR before deciding to confidently introduce its own product to the market.
Table 4. Research challenges for systematic reviews of the foods with function claims.
#1 The applicants should conduct research based on the AMSTAR 2 checklist.
#2 The applicants should conduct research based on the PRISMA checklist and PRISMA-NMA (for meta-analysis).
#3 The applicants should examine the quality of an SR when using the SR of another company that had already been accepted by the CAA.
#4 The applicants should consult with academic researchers about unclear points in methodology.
#5 Academic researchers should provide support for food companies and others to implement the SR properly:
   Study plan (study selection and data extraction, search strategy, and evaluation method of bias risk)
   Implementation (assessment of publication bias, and formulating conclusions based on methodological rigor and quality)
#6 The authorities should evaluate not only the formal confirmation* in the document but also the quality (certain level or higher) of the SR.
* Currently, the government intends to deregulate the food industry, so the CAA cannot examine the quality of each SR.
Academia should provide its own support for food companies and other companies to implement the SR properly, and academic researchers will need to continue to convey appropriate SR methodologies to the food industry. In the present study, it became clear that there were many methodological deficiencies in targeted SRs. The FFC system in Japan relies on one SR or one clinical trial, such as an RCT, as a basis for efficacy. However, a Japanese research group recently identified problems with the reporting quality and associated issues for RCTs of the FFC [27]. There was insufficient information on items associated with sample size, allocation and blinding, results of outcomes and estimation, generalizability of the results, and study registration numbers. Because it is a notification system, it is essential for academic researchers, including our group, to monitor all SRs and clinical trials for the FFC.
Considering that the Japanese government has introduced the world's most advanced FFC system as part of its growth strategy (i.e., deregulation), it may be difficult for the CAA to review individual SRs. Therefore, to protect consumers, we assume it is necessary to confirm that the notification SR is above a certain level of quality.
Either way, even for an SR that has already been accepted, it will be necessary to issue the latest (updated) version 5 to 10 years later. The prospect of this future requirement will encourage all existing SRs to be conducted by scientifically correct methodologies.

Limitations
This review had several limitations that should be acknowledged. First, publication bias was possible because there was not enough use of multiple databases for each SR. Second, we could not perform an evaluation using the PRISMA checklist. Third, our study design focused on the quality of SRs; therefore, we could not validly assess the safety or the functional mechanism of any of the products reviewed in the SRs. Lastly, because we did not conduct a retrospective analysis of the quality of "primary studies cited or used as references" that were described in submitted SRs, the effectiveness of functional substances or finished products could not be addressed.

Conclusions
Overall, the quality of methodology and reporting in after-SRs based on the FFC was poorer than that based on before-SRs. In particular, there were very poor descriptions and/or implementation of study selection, data extraction, search strategy, evaluation methodology for risk of bias, assessment of publication bias, and formulating conclusions based on methodological rigor and scientific quality of the included studies.
To develop SRs of the FFC and launch a similar global food claim notification system, the following factors will be important: (i) applicants will need to use some global standard checklist such as AMSTAR 2, PRISMA, or PRISMA-NMA; (ii) applicants will need to critically examine the quality when using another applicant's SR; (iii) academic researchers should support the food industry in order to perform an SR and/or clinical trial properly; and (iv) country authorities should confirm that the notification SR is above a certain level of quality.

Conflicts of Interest:
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request. When there was a case of conflict of interest (COI) (e.g., consulting on the study, shareholder of the firm), the reviewer did not evaluate the study directly, and other reviewers performed the evaluation. HK supervised SRs on five corporations (FANCL CORPORATION, Morishita Jintan Co., Ltd., KAGOME Co., Ltd., Fujifilm Corporation, and AJINOMOTO AGF, Inc.) and was compensated for that work. YT analyzed an SR on three corporations (KAGOME Co., Ltd., Fujifilm Corporation, and AJINOMOTO AGF, Inc.) and was compensated for that work. Therefore, HK and YT did not evaluate SRs involving those corporations, and instead other authors did the work.