ANN-Based Integrated Risk Ranking Approach: A Case Study of Contaminants of Emerging Concern of Fish and Seafood in Europe

Seafood, one of the most important food commodities consumed worldwide, is considered a high-quality, healthy, and safe food option. However, marine ecosystems are the ultimate destination for a large group of chemicals, including contaminants of emerging concern, and seafood consumption is a major pathway of human exposure. With growing awareness of food safety and food quality, and increased demand for information on the risk of contaminants of emerging concern, there is a need to assess food safety issues related to harmful contaminants in seafood and ensure the safety of marine food resources. In this study, the risks of emerging compounds (endocrine disruptors, brominated flame retardants, pharmaceuticals and personal care products, and toxic elements) in fish and seafood were analyzed according to their PBT (persistence, bioaccumulation, toxicity) properties as well as their concentration levels in seafood. A hazard index (HI) was estimated for each compound by applying an artificial neural network (ANN) approach known as Self-Organizing-Maps (SOM). Subsequently, an integrated risk index (IRI) was developed considering the values of HI and the concentrations of emerging compounds in seafood species gathered from the scientific literature. The results identified HHCB, MeHg, NP, AHTN and PBDE209 as the five highest-ranked compounds present in seafood, according to the 50th percentile (mean) of the IRI. However, this ranking changed slightly when the 99th percentile of the IRI was considered, with the toxic elements methylmercury and inorganic arsenic showing the highest risk. The outcome of this study identifies the priority contaminants and should help regulators and scientific panels to design screening programs and to take appropriate safety measures.


Introduction
Seafood is one of the most important food commodities consumed worldwide, being recognized as a high-quality, healthy and safe food item. However, seafood consumption is also a relevant pathway of human exposure to environmental pollutants. The issue of seafood safety is even more important in view of the growth of the international fish trade, which has undergone a tremendous expansion in the last three decades, increasing from USD 8 billion in 1976, to a record export value of USD 102.5 billion in 2010 [1]. Consumption of seafood has also seen a continuous uptrend, with an average world consumption of 11.5 kg/capita/year in 1970 compared to 19.2 kg/capita/year in 2012 [2]. Therefore, safety of seafood is central to any society, and it has a wide range of economic, social and, in many cases, environmental consequences.
Marine ecosystems are the ultimate destination for a large group of chemicals, receiving these pollutants through rivers, direct discharges and atmospheric deposition. Fish and shellfish have been identified as the food group showing the highest concentrations of a number of toxic elements [3,4]. Some contaminants can bioaccumulate in marine organisms and biomagnify along the marine food web, likely being transferred to the human food chain, with subsequent potential problems for seafood safety [5,6]. This gives rise to concern from an environmental and public health point of view. Specifically, for seafood, maximum levels for a range of contaminants are outlined in the legislation, and seafood is regularly controlled by monitoring programs for a selection of environmental contaminants. So far, the focus has mainly been on well-known chemical pollutants, such as polycyclic aromatic hydrocarbons (PAHs), polychlorinated biphenyls (PCBs), certain marine toxins and some toxic elements [7][8][9][10][11]. However, there is no regulation in place for recently detected substances for which no maximum levels have been established in EU legislation and for which a potential risk cannot be excluded. Although it is not fully implemented yet, a new EC directive on priority substances in the field of water policy revised crucial rules on determining the chemical quality of surface water [12]. Furthermore, contaminants of emerging concern may also include previously identified substances for which maximum levels have been laid down but need revision due to new hazard information (re-emerging contaminants) [6]. Therefore, there is an increasing need for knowledge about the presence and potential effects of the so-called "contaminants of emerging concern" in seafood [6,13,14].
Special attention has been paid to pollutants belonging to four important groups of contaminants: toxic elements, endocrine disrupting compounds (EDCs), pharmaceuticals and personal care products (PPCPs), and brominated flame retardants (BFRs).
Toxic elements are widespread in the environment from either natural or anthropogenic sources [15]. Some of these elements can occur in food because of their presence in the environment or due to contamination during food production and storage. Some elements are essential to maintain good health in humans, but exposure to others can lead to severe adverse health effects [16]. Elements may change their chemical form in the environment, but they cannot be degraded over time. This means that they are environmentally persistent and may bioaccumulate [17,18]. The maximum levels of lead, cadmium and total mercury in seafood are regulated by European Commission Regulation 1881/2006 [7][8][9][19]. For other toxic elements or specific chemical forms, no maximum levels have been laid down in the European legislation, partly due to a lack of information about their presence in seafood. From a toxicological point of view, the chemical form (i.e., the elemental speciation) in which the element is ingested plays a significant role [20,21]. Knowledge about the chemical form(s) of certain elements (e.g., inorganic arsenic, InAs, and methylmercury, MeHg) present in seafood is therefore required in order to improve the assessment of seafood safety beyond simply knowing the total elemental amount.
There is also a growing interest in EDCs due to their ability to interfere with the endocrine system of different organisms, causing important alterations in development. Because of the lipophilic and persistent nature of most EDCs and their metabolites, many of them can bioaccumulate and biomagnify in different environmental compartments, including in marine biota [22].
PPCPs are another diverse group of potential pollutants. They can enter aquatic ecosystems from municipal wastewater treatment plant discharges, runoff from agricultural areas that utilize veterinary therapeutics and releases from aquaculture sites [23,24]. As a result, they have been increasingly detected in the environment during recent years [25][26][27][28][29]. Another source of accumulation of PPCPs in fish and seafood is the prophylactic or therapeutic use of pharmaceuticals in aquaculture. Residues of these drugs can remain in tissues creating a potential exposure for consumers [24]. The presence of pharmaceuticals in seafood may potentially act as a risk for consumers, either through direct effect of allergy and toxicity or indirectly through potential microbial resistance [24].
Flame retardants comprise a large group of chemical substances that are widely used in many industrial and household products [30]. Currently, because of their high-performance efficiency and low cost, the largest market group of flame retardants is the brominated flame retardants (BFRs) group [31]. BFR-treated products, whether in use or waste, release BFRs into the environment. Unfortunately, these contaminants may then pass into the food chain causing toxic effects to human health [32][33][34][35][36][37][38][39].
Humans may potentially be exposed to emerging environmental contaminants by eating contaminated fish and seafood. However, the group of contaminants of emerging concern is so large that it is impossible to monitor all compounds. Considering the length of the list and the need for a cost-effective use of resources, priorities for screening emerging contaminants in seafood should be set. Therefore, tools to combine and simplify large data collections are essential for risk managers and decision-makers. In recent years, many frameworks have been proposed to prioritize contaminants using a range of approaches classified as qualitative, semi-quantitative or quantitative methods [40,41]. Semi-quantitative risk assessment provides an intermediary level between the contextual evaluation of qualitative risk assessment and the numerical evaluation of quantitative risk assessment, by evaluating risks with a score [42,43]. A quantitative approach offers a more consistent and rigorous way to assess and compare risks and risk management strategies, and avoids some of the greater ambiguities that a qualitative risk assessment may produce. However, qualitative approaches do not require the same mathematical skills and amount of data as quantitative risk assessments, which means they can be applied to risks and strategies where precise data are missing [42]. Recently, van der Fels-Klerx et al. [43] performed an extensive systematic literature review identifying and characterizing the available methodologies for risk ranking in the fields of feed and food safety. The following methods of risk ranking for chemical hazards were identified: risk assessment, risk ratio, scoring methods, risk matrices, multi-criteria decision analysis (MCDA), and flow charts/decision trees. Some of these methods are also classified as new approach methodologies (NAM), which have been recommended as complementary tools for the integrated approach to testing and assessment (IATA) strategy [44,45].
Among these approaches, relative scoring methods are the most widely reported; they allow a list of chemical compounds to be ranked by aggregating a selection of parameters. Relative scores indicate where a particular chemical stands within a specified normative sample of chemicals. For example, physicochemical parameters such as persistence, bioaccumulation and toxicity (PBT) are often used to build the hazard index (HI), a coefficient widely implemented to prioritize chemicals [46,47]. The applications of artificial neural network (ANN) or machine learning (ML) algorithms in chemical health and safety studies date back to the mid-1990s [48]. Most of these applications were in toxicity classification and prediction studies; lately, however, these algorithms have also been used in hazardous property prediction and consequence analysis [44,49]. Ranking has traditionally been developed using various data aggregation methods such as partial order ranking [50], utility function or simple additive ranking [51], fuzzy-based risk [52,53], Bayesian network classification [54,55], and clustering-based ANN methods, such as Self-Organizing-Maps (SOM) [20,47]. In the latter case, HI is determined by intrinsic parameters of the chemicals (PBT) and risk can be described as a function of hazard (toxicity) and exposure (dose). Due to its ability to group data according to similar characteristics, the SOM algorithm was previously used to create PBT-based rankings of chemical pollutants [20,47,56]. Recently, SOM was also applied to elaborate an ecological hazard index of a series of pollutants found in Ebro River waters (Spain) [52,53]. Integration of PBT parameters with exposure levels in target food groups can be an interesting approach to obtain realistic information for food safety policies.
The objective of the present study was to prioritize a selection of contaminants of emerging concern by means of an artificial neural network (ANN) based approach integrating PBT properties and the concentration levels of these pollutants in seafood species. Firstly, an HI was generated for each individual compound by using SOM. Secondly, an integrated risk ranking was developed by combining the HI and concentration level of each compound in seafood, assuming linearity between concentration levels in the food source and the possible dose. Finally, a prioritized list of emerging contaminants was produced by ranking the chemicals according to the integrated risk score.

List of Chemicals
A list of 62 emerging chemicals was elaborated according to the availability of concentration data on seafood species in the ECsafeSEAFOOD database [6] (Table 1). Chemicals from four important groups of contaminants were incorporated in this study: toxic elements (n = 2), EDCs (n = 19), PPCPs (n = 31), and BFRs (n = 10).

PBT Parameters
The values of three parameters (persistence, bioaccumulation and toxicity) were obtained from the quantitative structure-activity relationship (QSAR) modelling software Estimation Program Interface (EPI Suite™) [57]. EPI Suite is a Windows-based software package developed by the U.S. Environmental Protection Agency (EPA) Office of Pollution Prevention and Toxics and Syracuse Research Corporation (SRC) [57]. This screening-level tool is used to estimate the physical and chemical properties, environmental fate and aquatic toxicology of chemicals, integrating data on more than 41,000 compounds from the PHYSPROP© database (Syracuse Research Corporation, Syracuse, NY, USA). It is a powerful tool for obtaining estimated values when experimental information is not available.
(a) Persistence: environmental half-lives of each chemical were estimated using the Biowin™ tool [57], which predicts the primary aerobic and anaerobic biodegradability of organic chemicals using 7 different models. The results were converted to a semi-quantitative time scale with the following scores: 5 = hours, 4 = days, 3 = weeks, 2 = months, and 1 = years [58].
(b) Bioaccumulation: the logarithm of the bioconcentration factor (log BCF) was obtained from BCFWin™ [57] through the octanol-water partition coefficient (K_ow).
(c) Toxicity: toxicity was estimated through the Ecological Structure Activity Relationships (EcoSAR™) tool [57], which estimates acute and chronic toxicity to aquatic organisms of different trophic levels: fish, aquatic invertebrates and green algae (Sanderson et al., 2003). The toxicity data used to build the SARs were collected from publicly available experimental studies and confidential submissions provided to the U.S. EPA New Chemicals Program.

Contamination Levels
The information currently available about emerging environmental contaminants is rather dispersed. A database was developed [6] to compile all the information from the scientific literature concerning emerging contaminant levels in seafood. Based on the information available in this database, the mean and range of the concentrations of each pollutant in seafood were estimated. Only studies reporting concentration levels expressed in wet weight were considered, as conversion factors from dry or lipid weight to wet weight were not available for all species. Thus, for each of the 62 contaminants, distribution data of concentration levels (in wet weight) in marine fish, mollusks and crustaceans were gathered, and the minimum, mean and maximum concentrations were reported (Table 2). Because PPCP data on marine biota were unavailable, studies on freshwater biota were used. As data on contamination levels in European seafood are scarce, non-European studies were also included. Because this study assesses the risk of seafood consumption, only edible fractions were considered. For fish and crustaceans, only levels in meat were considered; levels in liver/gonads/blood of fish and in hepatopancreas/gonads of crustaceans were excluded. For mollusks, levels in the whole body were used.

Hazard Index
The compilation and organization of large amounts of data can be handled by data mining tools, such as ANNs [59]. Among the different kinds of ANNs, Kohonen's Self-Organizing-Map (SOM) is one of the most commonly applied methods [60]. SOM uses an unsupervised learning algorithm that reduces the dimensionality of large input data and utilizes a neighborhood function to preserve the topological properties of the input space [61]. The results are generally visualized in two-dimensional maps, allowing for clustering of the input information by grouping similar data characteristics. The final result is, on the one hand, a low-dimensional map (or Kohonen's map) showing the discretized representation of the multidimensional input space, and on the other hand, a set of component planes showing the clusters created by the algorithm in the Kohonen's grid. The ability of SOM to group data and cluster the analyzed parameters has been extensively applied in environmental toxicology, but little is known about its applicability in food toxicology [58,[62][63][64][65][66]. The interpretation of the SOM clusters begins with the map visualization. Each SOM node (a neuron, drawn as a hexagonal cell of the grid) has a specific weight vector; these weights, arranged in a two-dimensional lattice, are adjusted during training to cluster the original information, akin to multidimensional scaling. The map can also be divided into as many c-planes (component planes) as there are data variables, each representing the contribution of one variable to each node in the map [20].
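As an illustration of the competitive-learning idea described above, the following is a minimal sketch of a SOM in plain Python. It is not the SOM Toolbox implementation used in the study: the 3 × 3 rectangular grid, the random initialization, the linear decay schedules, the parameter values and the four input vectors are all assumptions chosen for brevity.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the grid node whose weight vector is closest to x (squared Euclidean)."""
    return min(weights, key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)))

def train_som(data, rows=3, cols=3, steps=500, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initialization for brevity (the study reports a linear initialization).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for t in range(steps):
        decay = 1.0 - t / steps
        lr = lr0 * decay                  # learning rate shrinks toward zero
        sigma = sigma0 * decay + 0.1      # neighborhood radius shrinks but stays positive
        x = rng.choice(data)              # present one randomly chosen input vector
        bmu = best_matching_unit(weights, x)
        for node, w in weights.items():
            grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))   # Gaussian neighborhood function
            for k in range(dim):
                w[k] += lr * h * (x[k] - w[k])            # pull the node toward the input
    return weights

# Hypothetical, already normalized (P, B, T) vectors for four made-up compounds.
data = [
    [0.05, 0.10, 0.05],
    [0.10, 0.05, 0.10],
    [0.90, 0.95, 1.00],
    [1.00, 0.90, 0.95],
]
weights = train_som(data)
for x in data:
    print(x, "->", best_matching_unit(weights, x))
```

Each input is assigned to its best matching unit, and the neighborhood function makes nearby nodes move together, which is what tends to place similar inputs on the same or adjacent map units.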
The integration of PBT parameters was performed with built-in functions of the SOM Toolbox for MATLAB™. The HI for each chemical was built by integrating the PBT parameters through a distance measure (such as the Euclidean distance), namely the average distance between a node's weight vector and those of its closest neighbors, as used in Kohonen's algorithm. A linear initialization was applied for SOM clustering. The competitive learning phase consisted of 10,000 steps, and the tuning phase added another 10,000 steps. After iterative training, the SOM is organized so that inputs with similar features are mapped to the same map unit or to neighboring units, creating a smooth transition of related individuals over the entire map. HI was considered as the sum of the PBT values for each compound after SOM training. As low output values from the Biowin™ and EcoSAR™ tools correspond to higher persistence and toxicity, respectively, and thus to a higher hazard, the inverse of these values was used in building the HI [58]. The three full datasets were normalized to obtain a variance equal to one for each parameter. The default ranges of the PBT parameters were re-scaled to 0-10, and hazard indexes were normalized using Equation (1) and re-scaled from 0 to 10.
$$C_{norm} = \frac{C_i - C_{min}}{C_{max} - C_{min}} \quad (1)$$

where C_norm is the normalized value, C_i is the parameter value of species i, C_min is the lowest and C_max is the maximum concentration value.
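The re-scaling step can be sketched as follows, assuming Equation (1) is the usual min-max normalization implied by the definitions of C_min and C_max. For simplicity the sketch sums the re-scaled P, B and T scores directly and skips the SOM training step; the compound names and scores are hypothetical, and the scores are assumed to be already oriented so that higher means more hazardous.

```python
def min_max_rescale(values, upper=10.0):
    """Min-max normalize a list of numbers and stretch the result to [0, upper]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]   # degenerate case: no spread to rescale
    return [upper * (v - lo) / (hi - lo) for v in values]

# Hypothetical (P, B, T) scores for three made-up compounds.
pbt = {
    "compound_A": (1.0, 2.0, 3.0),
    "compound_B": (4.0, 4.0, 4.0),
    "compound_C": (2.0, 1.0, 0.0),
}
names = list(pbt)

# Re-scale each parameter (column) to 0-10 across all compounds.
columns = list(zip(*(pbt[n] for n in names)))
scaled = [min_max_rescale(list(col)) for col in columns]

# Hazard index: sum of the three re-scaled scores, then re-scaled to 0-10 again.
raw_hi = [sum(col[i] for col in scaled) for i in range(len(names))]
hazard_index = dict(zip(names, min_max_rescale(raw_hi)))

for name in sorted(hazard_index, key=hazard_index.get, reverse=True):
    print(name, round(hazard_index[name], 3))
```

Because the index is relative, the compound with the highest summed score always receives 10 and the lowest receives 0, so the values are only meaningful within the set of chemicals being compared.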

Risk Ranking
In order to weight the hazard by contamination levels, PBT parameters were integrated with the distribution data of concentration levels of pollutants in seafood as follows (Equation (2)):

$$RI_t = HI_t \times C_{t,s} \quad (2)$$

where RI_t is the risk index for the contaminant t, HI_t is the hazard index for the contaminant t and C_{t,s} is the sth sample of the concentration level of the contaminant t in seafood, drawn from the concentration distribution generated with mean µ_t and standard deviation σ_t. Uncertainties in the concentration were included in the risk ranking calculation by simulating the concentration distribution. If a mean concentration (and standard deviation) of the contaminant in seafood was available, a normal distribution was assumed. If only the min-max range was available, the distribution was assumed to be uniform. The mean and standard deviation of the risk index were calculated and are reported in Table 3.
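The simulation described above can be sketched as follows, assuming that the risk index is the product of the hazard index and a sampled concentration. The function names, the sample size, the seed and the input values are hypothetical. Negative draws from the normal distribution are left untruncated here, since the text does not state how they were handled.

```python
import random
import statistics

def sample_risk_index(hi, conc, n=10000, seed=0):
    """Draw n concentration samples and multiply each by the hazard index.

    conc is a dict with either {"mean", "sd"} (normal distribution assumed)
    or {"min", "max"} (uniform distribution assumed).
    """
    rng = random.Random(seed)
    if "mean" in conc:
        samples = [rng.gauss(conc["mean"], conc["sd"]) for _ in range(n)]
    else:
        samples = [rng.uniform(conc["min"], conc["max"]) for _ in range(n)]
    return [hi * c for c in samples]

def summarize(risk):
    """Mean and sample standard deviation of the simulated risk index."""
    return statistics.mean(risk), statistics.stdev(risk)

# Hypothetical inputs: one compound with mean/SD data, one with only a range.
risk_normal = sample_risk_index(hi=3.0, conc={"mean": 2.0, "sd": 0.5})
risk_uniform = sample_risk_index(hi=2.0, conc={"min": 1.0, "max": 3.0})

print("normal case:  mean=%.3f sd=%.3f" % summarize(risk_normal))
print("uniform case: mean=%.3f sd=%.3f" % summarize(risk_uniform))
```

Keeping the full list of simulated values, rather than only the mean and standard deviation, also makes it possible to read off upper percentiles later.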

Results and Discussion
The application of the SOM clustering algorithm to the PBT data of all the compounds listed in Table 4 resulted in a grouping of chemicals based on their PBT properties. The clustering map structure was based on a two-dimensional grid of 100 (10 × 10) cells. The training consisted of two steps: a primary training phase and a tuning phase. The labeled cluster map of Kohonen's map (Figure 1A) visualizes distances between neighboring map units, and thus shows the cluster structures of the map obtained from the iterative process of unsupervised learning. The c-planes show the distance measure obtained from normalized values of persistence, bioaccumulation and toxicity after the iterative SOM procedure (Figure 1B).
The SOM-based HI for each emerging contaminant is summarized in Table 5 and ranked according to its absolute score. Considering the HI values, the highest scores were attributed to BDE209 (4.473), octylphenol (4.367), triclocarban (4.367), MeHg (2.625), tetrabromobisphenol A (2.625), BDE47 (2.504), perfluorooctanesulfonamide (2.493), and hexabromocyclododecane (2.493). While BDE209 was identified as the most toxic compound, the high HI values of the remaining top compounds were due to their high bioaccumulation (MeHg, TBBPA, PFOSA, BDE47, HBCD) or persistence (OP, TCAR). Both toxic elements, MeHg and InAs, were found in the upper part of the ranking.
PPCPs and EDCs are evenly distributed throughout the ranking, while BFRs reached the highest scores as a group. The BCF cluster map clearly showed three clusters, while those for persistence and toxicity showed two main clusters (Figure 1B). As the Kohonen-based HI is a relative scoring method in which scores are computed from the property data of the chemicals under study, comparisons with other studies are complicated. Nonetheless, Fabrega et al. [58] implemented a similar methodology on PPCPs, EDCs, pesticides, perfluorinated compounds (PFCs), illicit drugs and UV filters. In that study, the most hazardous pollutants were identified to be six PFCs (PFHxDA, PFODA, PFTeDA, PFTrDA, PFDoA, and PFUdA).
Since the PBT-based HI values do not reflect the current situation in terms of consumer safety, we implemented a second step by integrating the HI with the concentration of these emerging compounds in commercial seafood. This integration was performed by multiplying the HI score by the concentration level of the compounds in seafood (as explained in Section 2.3), so that contamination levels weight the final score. Since the concentration data account for the distributional variability of different samples reported in the literature, the risk index is calculated as mean and standard deviation by propagating the uncertainty of concentration in the integrated risk index (IRI) calculation. The resulting integrated index provides a new ranking of chemicals considering the current contamination of seafood, reported as mean ± SD (Table 3). In the overall ranking based on the maximum value (99th percentile) of the IRI score, the toxic elements (MeHg and InAs) occupy the highest ranks, followed by HHCB, BDE209, AHTN and NP, which belong to different groups of compounds (Figure 2A). Nonylphenol (NP), perfluorooctane sulfonate (PFOS) and bisphenol A (BPA) (Figure 3A) were the endocrine disrupting compounds with the highest risk, whereas most PFCs showed the lowest risk index due to their low concentration values (Table 2). Among the PPCPs, galaxolide (HHCB) and tonalide (AHTN) (Figure 3B) were estimated to be the riskiest contaminants. However, since the PPCP risk index was calculated using concentration values in freshwater biota, an overestimation may have occurred. In the BFR group, BDE209, HBCD and BDE47 (Figure 3C) were ranked as the flame retardants with the highest risk.
Uncertainty is inherent in the process even when using the most accurate data and the most sophisticated models. However, in this case, only uncertainty due to concentration variability was considered. "Uncertainty" is, in this case, the description of the imperfect knowledge of the true value of concentration level, or its real variability in samples or observations. The dataset for this study is very heterogeneous with different sample sizes and it is based on reported values in the literature [6]. The degree of uncertainty of the IRI was analyzed using linear propagation of uncertainty of concentration levels (Equation (2)). Using mean values of IRI (low degree of conservatism), the risk ranking of these compounds changed significantly, in comparison to the 99th percentile of the IRI (high degree of conservatism).
In the overall ranking based on mean values (50th percentile) of the IRI score, HHCB occupied the highest position, followed by MeHg, NP, AHTN and BDE209 in the top five ranked compounds (Figure 2B). Detailed IRI scores with the mean and standard deviation of all compounds in this study are presented in Table 3. Comparing the ratio of mean and maximum values of the IRI score (Figure 3D-F), it is evident that the IRI score is skewed towards the higher end of the distribution. This is mainly due to the small samples of concentration data, which are biased by extreme outliers (high concentration values).
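The effect of the chosen summary statistic on the ranking can be illustrated with a small sketch. The two compounds and their simulated IRI values are made up, and the nearest-rank percentile definition is an assumption, since the text does not state which definition was used.

```python
import math
import statistics

def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of the data at or below it."""
    ordered = sorted(values)
    k = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[k - 1]

def rank_by(scores):
    """Compound names sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical simulated IRI values: one steady compound, one dominated by outliers.
iri_samples = {
    "steady_compound": [5.0] * 100,
    "outlier_compound": [1.0] * 98 + [100.0] * 2,
}

mean_scores = {name: statistics.mean(v) for name, v in iri_samples.items()}
p99_scores = {name: percentile(v, 99) for name, v in iri_samples.items()}

print("ranking by mean:", rank_by(mean_scores))
print("ranking by 99th percentile:", rank_by(p99_scores))
```

In this constructed case the outlier-driven compound ranks second by mean but first by 99th percentile, which mirrors how a few extreme concentration values can move a contaminant up the more conservative ranking.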
A limitation of the method used in this study is the use of theoretical values as HI parameters. Since this study focuses on contaminants of emerging concern, data on persistence, bioaccumulation and toxicity are unavailable in the scientific literature for most compounds and were thus estimated by applying the US EPA EPI Suite™ software [57]. The process of modeling PBT data may be associated with uncertainty, especially for the "toxicity" variable. In this study, only fish toxicity values, obtained with the EcoSAR™ tool [57], were used to estimate the HI. This may lead to a significant bias, as the relationship with human toxicity is not taken into account. In this framework, further improvements of the hazard index should focus on incorporating experimental PBT values whenever new data become available.
Another important limitation of this study is the exposure parameter. Since dietary exposure is determined by both concentration and consumption, dietary consumption should be considered as an additional parameter in future studies. In this study, the complexity of the database did not allow an individual input to be provided for each pollutant and fish species. However, if the mean level of consumption for all species (the mean European fish intake) is used, it is the same for all pollutants and does not affect the relative ranking. Moreover, contamination levels considered in this study were averaged over all seafood species, sometimes including freshwater species, as the availability of levels of contaminants of emerging concern was limited. A specific ranking for each species should be performed in future studies once the levels of every contaminant become available for each seafood species.

Conclusions
Risk ranking frameworks for chemical hazards have been mainly developed to set priorities in order to reduce environmental problems related to pollutants, as well as to provide risk managers and decision makers with an objective tool for optimizing resources. They also provide a user-friendly visualization and data analysis approach that can be used as a risk communication and management strategy. In this study, the objectivity of risk ranking was improved by applying a quantitative approach in the form of a SOM-based methodology for ranking the risk of contaminants of emerging concern in food safety. By combining the SOM-based HI, built from PBT parameters, with contamination levels in seafood, the IRI was estimated for each environmental pollutant. The highest HI values were estimated for BDE209, octylphenol, triclocarban, MeHg, tetrabromobisphenol A, BDE47, perfluorooctanesulfonamide, and hexabromocyclododecane. Nonetheless, the integration of concentration levels with the HI modified this ranking, resulting in HHCB, MeHg, NP, AHTN and BDE209 emerging as the top five ranked compounds, according to the 50th percentile (mean) score of the IRI. Furthermore, considering the 99th percentile of the IRI score, the risk ranking changed slightly, with toxic elements (MeHg and InAs) posing the highest risk, followed by HHCB, BDE209, AHTN and NP.
Uncertainty is introduced at every step of the health risk assessment. Unfortunately, in this particular case, the uncertainty associated with PBT values was not accounted for, owing to the scarce information in the QSAR model for emerging contaminants. The availability of homogeneous, high-quality data determines the accuracy and the uncertainties associated with the final results of this method. As information on PBT values and contamination levels in seafood is very heterogeneous and scarce for contaminants of emerging concern, theoretical values need to be used. Further improvements in the use of this method should focus on incorporating homogeneous experimental values and modeling the uncertainty of PBT values. Besides these improvements, other aspects, such as consumption levels, could be added to improve the risk ranking method, while other emerging pollutants should ideally also be incorporated.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.