Alternative Approach for Potency Assessment: in Vitro Methods

Over the last decade, remarkable progress has been made in the development of non-animal tests to assess contact hypersensitivity. Four methods have been successfully validated and Organisation for Economic Cooperation and Development (OECD) guidelines are available or soon will be. Currently validated methods are useful for hazard identification, classification and labeling. However, to achieve a complete replacement of animals in skin sensitization assessment, dose-response information and evaluation of relative skin sensitizing potency to support effective risk assessment are necessary. In this context, potency is based on the concentration of chemical needed to induce a positive response. This will require a better understanding of the mechanisms determining potency, including pathway analysis and marker signature identification (selection of an appropriate immune-mediated response to serve as the basis), together with quantitative and qualitative correlations between marker signatures and potency of chemicals in relation to T cell responses. This review aims to discuss the state-of-the-art in the field of in vitro assessment of the no induction sensitization level of contact sensitizers.


Introduction
Recent general population epidemiological studies suggest that the prevalence of contact allergy is ~15%-20% [1], making hypersensitivity reactions resulting from environmental chemical exposure a major health problem. More than 4000 low molecular weight chemicals (<500 Dalton) have been shown to induce allergic contact dermatitis (ACD) in humans [2]. As a consequence, chemical allergies are of considerable importance to the toxicologist, who has the responsibility of identifying and characterizing the skin and respiratory sensitizing potential of chemicals, and estimating the risk they pose to human health. Regulatory authorities worldwide require testing for ACD and appropriate hazard labeling to minimize exposure. Skin sensitization needs to be assessed within the framework of existing and forthcoming legislation for all chemicals, with the exception of substances produced or imported in Europe below 1 tonne, and it is a key endpoint for cosmetic ingredients. The evaluation of the contact sensitization potential of chemicals is currently done using the local lymph node assay (LLNA), as described in the OECD guideline 429, and its non-radioactive variants (TG442A, LLNA:DA and TG442B, LLNA:BrdU-ELISA), or by the traditional guinea pig assays (i.e., the guinea pig maximization test and the Buehler test).
Chemical allergy refers to immune-mediated adverse health effects, including allergic sensitization and diseases, caused by exposures to low molecular weight chemicals. The cellular and molecular events that are associated with the induction of skin sensitization and the elicitation of allergic contact dermatitis are well-defined. ACD is a delayed-type hypersensitivity immune reaction mediated by T-lymphocytes, resulting from repeated exposure to an allergen, primarily on the skin [3].
In common with other allergic diseases, ACD develops in two phases. In the first (induction) phase, exposure to a chemical allergen causes immunological priming and skin sensitization. The second (elicitation) phase is triggered if a sensitized individual is exposed subsequently to the same chemical allergen.
While a number of methods, reflecting the major key events described in the OECD adverse outcome pathway for skin sensitization (OECD GD168), are at various stages of acceptance, validation, development and use, currently it is not possible to rank chemicals for their sensitizing potency, an issue that is important for a full safety assessment [4]. It is expected that a predictive method to totally replace animal testing will be in the form of a test battery comprising molecular, cell-based, and/or computational methods, the so-called "Integrated Approaches to Testing and Assessment" (IATA). Currently available in vitro methods support the discrimination between skin sensitizers (i.e., UN GHS Category 1, where UN GHS stands for United Nations Globally Harmonized System of Classification and Labelling of Chemicals) and non-sensitizers in combination with other complementary information (i.e., in the context of an IATA). Depending on the regulatory framework, positive results may be used on their own to classify a chemical to UN GHS Category 1. They cannot, however, be used on their own to sub-categorize skin sensitizers into UN GHS subcategories 1A and 1B or to predict potency for safety assessment decisions, which is necessary to fully replace animal testing.
This review article focuses on the possibility of using in vitro methods for the prediction of sensitizer potency (the threshold dose for sensitization induction, which can be used as a point of departure for quantitative risk assessment). This is required for an accurate evaluation of relative skin sensitizing potency to support effective risk assessment.

Potency Assessment
Potency assessment is important because: (1) potency data are required for hazard classification and can lead to improvements in risk management; (2) potency data can facilitate risk assessment for skin sensitization. The evidence is that chemicals vary by up to four or five orders of magnitude with respect to their relative skin sensitizing potency [5].
Current mechanistic understanding of allergy is such that it can be assumed that the development of sensitization (and also elicitation) is a threshold phenomenon. The challenge is to identify levels of exposure below which sensitization will not occur. Prevention of sensitization in naive individuals, rather than prevention of elicitation in those already sensitized, appears to be a more suitable and health-protective endpoint, since predicting a safe exposure level for sensitized individuals is difficult due to high variability in individual susceptibility. Based on the estimation of the concentration of chemical required to induce a stimulation index of three relative to concurrent vehicle-treated controls (EC3 value), the LLNA has proven very useful in assessing the skin sensitizing potency of chemicals: low EC3 values correlated well with sensitizers known to be potent in man, whereas high EC3 values were usually associated with weak human sensitizers [6]. It has been accepted that an EC3 value of >2% (weight/volume) leads to a GHS classification of sub-category 1B, indicating weaker sensitization; all others are classified as sub-category 1A, indicating stronger sensitization.
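The EC3 derivation and the sub-categorization rule described above can be sketched as follows. Linear interpolation between the two test concentrations that bracket a stimulation index of 3 is a commonly used way of deriving EC3 from LLNA data, but it is not spelled out in this article and is an assumption here. The function names and the dose-response values are hypothetical and serve only as an illustration.

```python
from typing import Optional, Sequence, Tuple


def ec3_by_interpolation(points: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Estimate EC3 (% concentration giving SI = 3) by linear interpolation.

    `points` holds (concentration_percent, stimulation_index) pairs.
    Returns None if no pair of adjacent doses brackets SI = 3.
    """
    ordered = sorted(points)
    for (c_low, si_low), (c_high, si_high) in zip(ordered, ordered[1:]):
        if si_low < 3.0 <= si_high:
            fraction = (3.0 - si_low) / (si_high - si_low)
            return c_low + fraction * (c_high - c_low)
    return None


def ghs_subcategory(ec3_percent: float) -> str:
    """Rule stated in the text: EC3 > 2% gives 1B, otherwise 1A."""
    return "1B" if ec3_percent > 2.0 else "1A"


# Hypothetical dose-response data, for illustration only.
example = [(0.5, 1.2), (1.0, 2.0), (2.5, 4.0)]
ec3 = ec3_by_interpolation(example)
print(ec3, ghs_subcategory(ec3))
```

A chemical whose lowest tested dose already exceeds SI = 3 is not handled by this sketch and would need an extrapolation rule, which is deliberately left out.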
A central event in immune sensitization is the presentation of antigen by dendritic cells (DC) to antigen-responsive T lymphocytes. The process of immune activation is orchestrated by DC and their cytokine and chemokine products, and by factors released by other cell types, including epithelial cells. Antigen activation of responsive T lymphocytes causes clonal expansion and differentiation into various functional subpopulations, some of which will migrate back to the site of exposure. At this point, the individual is sensitized and capable of mounting a clinically relevant hypersensitivity response to the same, or a cross-reactive, chemical allergen upon re-exposure. The development of allergic contact dermatitis requires an early activation of innate immune cells in the skin, including keratinocytes (KC), necessary for the maturation and migration of DC, and DC themselves, required for the activation of antigen-specific CD8+ cytotoxic T lymphocytes and CD4+ helper T cells. DC are a heterogeneous population of antigen-presenting cells that play a central role in the initiation and regulation of adaptive immune responses. Contact sensitizers have been demonstrated in vitro to induce phenotypic and functional changes in DC, enhancing their antigen-presenting capacity, which ultimately could modulate the T cell response. It is generally assumed that three signals are required for activation of naïve T helper cells. Signal 1, or stimulation, is the recognition by the T-cell receptor (TCR) of antigenic peptides presented by major histocompatibility complex (MHC) class II molecules expressed on DC. Signal 2, or co-stimulation, is mainly provided by the triggering of CD28 on the T cell by CD80 and CD86 molecules on the DC. Signal 3, or polarization, refers to signals delivered from the DC to the T cell that determine its differentiation into various effector phenotypes such as Th1 and Th2. Both the quality of DC maturation signals and the quantitative differences in the maturation signals conveyed by chemical allergens can critically influence Th1 polarization and contact hypersensitivity reactions. The hypothesis is that the extent of chemical allergen-induced KC-DC activation/maturation and the lifespan may drive T cell polarization (e.g., Th1, Th2, Treg) and the magnitude of activation, which in turn results in different in vivo potencies. In addition, the clonal diversity of T lymphocytes that are allergen-reactive (degree of immunogenicity) has a significant impact on the potency of sensitization. The argument presented here is that more potent contact allergens will evoke a stronger skin inflammation/trauma than less potent allergens will.

Validated in Vitro Methods and Potency Assessment
Four non-animal tests have been recently validated, namely the direct peptide reactivity assay (DPRA), which measures depletion of a cysteine- and a lysine-containing peptide in the presence of the test chemical [7]; the KeratinoSens™ assay [8], which is based on a luciferase reporter gene under the control of an anti-oxidant response element of the human AKR1C2 gene stably inserted into HaCaT keratinocytes; the human cell line activation test (h-CLAT) [9], addressing CD86 and CD54 upregulation in THP-1 cells; and the IL-8 Luc assay, which measures the effects of chemicals on IL-8 promoter activity in the IL-8 reporter cell line THP-G8 [10]. The first three methods have been validated by EURL-ECVAM, and the last by JaCVAM. The article by Urbisch et al. (2015) provides detailed information on the applicability domain and predictive capacities of the different methods [11].
These methods not only yield a yes or no response helpful for hazard identification, but also provide quantitative data. Therefore, in principle, comparing these data with quantitative human and animal data is possible. The use of such quantitative data in the context of an integrated testing strategy for GHS classification and for estimating the potency class (weak, moderate, strong, extreme) will not be discussed, as several articles are already available [5,12-15]. Instead, the possibility of using the validated in vitro methods, and other methods not yet validated, for the prediction of the no induction sensitization level or the point of departure for skin sensitization is discussed below. For completeness, it is important to mention that several integrated strategies combining different non-animal assay systems have been proposed to predict in a more holistic way the skin sensitizing potential and relative potency of chemicals. Potency might be estimated through Quantitative Mechanistic Modeling or Bayesian statistics, which give promising results in predicting four sensitizer potency classes, but not yet in predicting a dose per unit area or EC3 value [16]. The estimation of a no induction sensitization level is required for risk assessment, and it is still a challenge.

DPRA
The majority of skin sensitizers are electrophilic in nature, e.g., acylating agents, Michael acceptors, SNAr and SN2 electrophiles, and Schiff base formers, which underpins their ability to react with the nucleophilic amino acids of skin proteins. This haptenation process is reproduced in vitro in the DPRA, which quantifies by HPLC the amount of unbound peptide remaining in the reaction mixture after covalent bond formation between the hapten and amino acid residues in the peptide. When the mean depletion of the cysteine and lysine peptides is greater than 6.376%, the chemical is judged as a sensitizer. However, as chemical reactivity varies markedly between functional groups, the reaction rate of test chemicals with the DPRA peptides may not be linearly related to their in vivo sensitization potency, making the DPRA unsuitable for potency assessment. This is also reflected in the OECD guideline, where the different DPRA classes of reactivity (i.e., low, moderate and high) are all assigned a positive prediction, without any subclassification. The overall lack of correlation between in chemico reactivity and in vivo LLNA potency is also evident from Table 3, "Comparison of peptide reactivity and potency data", in the publication by Gerberick et al. [17], where, for example, strong in vivo sensitizers can have low (e.g., trimellitic anhydride) or moderate (e.g., formaldehyde) in vitro reactivity, and weak in vivo sensitizers can have high in vitro reactivity (e.g., ethyl acrylate, 2,3-butanedione).
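The DPRA decision rule quoted above can be sketched in a few lines. The 6.376% cut-off on the mean of the cysteine and lysine depletions is taken from the text; expressing depletion as the percent loss of peptide peak area relative to a reference control is an assumption about how the HPLC read-out is converted, and the function names and peak areas are hypothetical.

```python
DPRA_THRESHOLD_PERCENT = 6.376  # mean depletion cut-off quoted in the text


def peptide_depletion(control_peak_area: float, sample_peak_area: float) -> float:
    """Percent depletion of a peptide relative to a reference control (assumed form)."""
    return (1.0 - sample_peak_area / control_peak_area) * 100.0


def dpra_positive(cys_depletion: float, lys_depletion: float) -> bool:
    """Judge a chemical as a sensitizer if the mean depletion exceeds the cut-off."""
    mean_depletion = (cys_depletion + lys_depletion) / 2.0
    return mean_depletion > DPRA_THRESHOLD_PERCENT


# Hypothetical HPLC peak areas, for illustration only.
cys = peptide_depletion(control_peak_area=100.0, sample_peak_area=80.0)
lys = peptide_depletion(control_peak_area=100.0, sample_peak_area=98.0)
print(round(cys, 1), round(lys, 1), dpra_positive(cys, lys))
```

Note that the output is a yes/no call only, which mirrors the point made above: the rule carries no potency information.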

KeratinoSens
The KeratinoSens™ assay is based on a luciferase reporter gene under the control of an anti-oxidant response element (ARE; regulated by Nrf2) of the human AKR1C2 gene stably inserted into HaCaT keratinocytes [8]. It measures a cellular defense mechanism triggered by chemical allergen-induced oxidative stress. For each chemical, the assay allows the calculation of the EC1.5, EC2 and EC3 (concentration in µM for 1.5-, 2- and 3-fold induction of the luciferase activity) and the IC50 (concentration in µM resulting in a 50% reduction in cellular viability). A substance is considered to be a sensitizer if an induction equal to or exceeding 1.5-fold compared to the vehicle control is observed at a concentration below 1 mM at which cells remain >70% viable. The correlation analysis of the individual parameters (logarithmically normalized) for the 244 chemicals against the EC3 in the LLNA gave an R² of 44.8% for EC2 and 33.5% for IC50, statistically significant but modest. Similar values were obtained when comparing in vitro results with human data [15]. While the assay can be useful for hazard identification and labeling, it cannot provide an estimation of the no induction sensitization level or the point of departure for skin sensitization.
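The positivity criterion stated above can be written as a small check over a concentration series. This is a simplified sketch that covers only the three conditions given in the text (fold induction of at least 1.5, concentration below 1 mM, viability above 70%); any further acceptance criteria of the full prediction model are not represented. The `Reading` type and the example values are hypothetical.

```python
from typing import Iterable, NamedTuple


class Reading(NamedTuple):
    concentration_um: float   # test concentration in µM
    fold_induction: float     # luciferase activity relative to vehicle control
    viability_percent: float  # cell viability relative to vehicle control


def keratinosens_positive(readings: Iterable[Reading]) -> bool:
    """True if any reading meets all three conditions stated in the text."""
    return any(
        r.fold_induction >= 1.5
        and r.concentration_um < 1000.0
        and r.viability_percent > 70.0
        for r in readings
    )


# Hypothetical concentration series, for illustration only.
series = [
    Reading(31.25, 1.1, 98.0),
    Reading(125.0, 1.6, 85.0),
    Reading(500.0, 2.4, 60.0),
]
print(keratinosens_positive(series))
```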
One of the limitations of all traditional cell cultures is that cells are grown in aqueous solutions, in which the intrinsic poor solubility of lipophilic substances or the instability of certain chemicals (e.g., anhydrides) in water can affect the correct potency classification, resulting in a weak correlation. The other bottleneck of many in vitro methods, which may give false negative results, is the limited metabolic capacity of the cell lines used. The possibility of using reconstituted human epidermis to overcome these problems is discussed later.

h-CLAT
The h-CLAT assay measures the selective induction of the surface markers CD54 and CD86 in the human monocytic leukemia cell line THP-1, which functions as a DC surrogate [9]. A substance is considered a sensitizer if the relative fluorescence intensity (RFI) of CD86 and/or CD54 is greater than 150% or 200%, respectively, at any tested dose in at least two out of three experiments. From the dose-response curves of the three experiments, the median concentration inducing 150% of CD86 RFI and/or 200% of CD54 RFI (EC150 or EC200) is calculated in the same way as the EC3 value in the LLNA. The lower of the two EC values is considered the minimal induction threshold (MIT). Based on the MIT value, the h-CLAT can be used to score sensitizers as strong if MIT < 10 µg/mL and weak if MIT = 10-5000 µg/mL [13]. The same limitations described for the KeratinoSens also apply to the h-CLAT. While the assay can be useful for hazard identification and labeling, it cannot provide an estimation of the no induction sensitization level or the point of departure for skin sensitization.
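The MIT-based scoring described above reduces to taking the lower of the two effective concentrations and comparing it with the cut-offs given in the text. The sketch below assumes that EC150 and EC200 have already been derived from the dose-response curves; the function names, the handling of missing values and the example concentrations are hypothetical.

```python
from typing import Optional


def minimal_induction_threshold(
    ec150_cd86: Optional[float], ec200_cd54: Optional[float]
) -> Optional[float]:
    """Return the lower of EC150 (CD86) and EC200 (CD54), in µg/mL.

    Pass None for a marker that never reached its threshold.
    """
    values = [v for v in (ec150_cd86, ec200_cd54) if v is not None]
    return min(values) if values else None


def hclat_potency_class(mit_ug_per_ml: Optional[float]) -> str:
    """Score potency from the MIT using the cut-offs quoted in the text."""
    if mit_ug_per_ml is None:
        return "no positive marker response"
    if mit_ug_per_ml < 10.0:
        return "strong"
    if mit_ug_per_ml <= 5000.0:
        return "weak"
    return "outside the range described in the text"


# Hypothetical effective concentrations, for illustration only.
mit = minimal_induction_threshold(ec150_cd86=25.0, ec200_cd54=8.0)
print(mit, hclat_potency_class(mit))
```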

IL-8 Luc Assay
More recently, JaCVAM, the Japanese Center for the Validation of Alternative Methods, validated the IL-8 Luc assay. The assay evaluates the effect of chemicals on IL-8 promoter activity in the IL-8 reporter cell line THP-G8 [17]. It is a high throughput method for detection of IL-8 mRNA induction by sensitizers. With regard to potency, a rank classification is possible, with an accuracy of 70.9%, an over-prediction rate of 10.7% and an under-prediction rate of 18.4% [10]. In this context, over-prediction is linked to specificity, as it indicates the erroneous classification of negative substances as positive, while under-prediction is linked to sensitivity, as it indicates the erroneous classification of positive substances as negative. The EC1.4 for induction of the nSLO-LA of THP-G8 cells treated with sensitizers did not show a significant correlation with the EC3 evaluated using the LLNA [17]. This assay, too, cannot provide an estimation of the no induction sensitization level or the point of departure for skin sensitization.

New in Vitro Models and the Possibility of Potency Assessment
Besides the validated tests, there are several other methods at different stages of development and acceptance. Among them, of particular interest are the GARD (Genomic Allergen Rapid Detection) assay [18], the SENS-IS assay [19], the EpiSensA [20], and the RhE IL-18 potency test [21].
The GARD is based on the signature of predictive genes differentially regulated in the human myeloid cell line MUTZ-3 following exposure to contact allergens. The classification of unknown compounds as sensitizers or non-sensitizers is performed with a support vector machine (SVM) model [18]. By transcriptional profiling, 33 canonical pathways were identified, and results indicate that the chemical reactivity groups seem to gradually engage more pathways and more molecules in each pathway with increasing sensitizing potency of the chemical used for stimulation. The high sensitivity (96%) and specificity (95%) of the GARD indicate that false negative and false positive results do not occur frequently, even for pre- and pro-haptens. The GARD was found to correlate well with the human potency of substances as defined by Basketter et al. [22], triggering further optimization of the current prediction model. The general trend is that both metabolic and cell cycle associated pathways are engaged gradually, and in correlation with potency [23]. As the authors concluded, the presented results may have implications for the design of a predictive in vitro test for assessment of sensitization potency, and suggest that genomic approaches contain valuable mechanistic information associated with potency assessment of chemicals. In addition to the genetic differentiation between substances of different potency, there is evidence suggesting that the observed effects are dose dependent. The dose required to observe cellular effects in the GARD correlates inversely with the in vivo potency of the compound. While this information is indeed extremely interesting, it is not yet clear how the in vitro derived concentrations reflect the in vivo no induction sensitization level.
The other three methods mentioned are all based on the use of reconstructed human epidermis. Currently, methods based on skin models are being explored to resolve sensitizing potency. These methods have several advantages over traditional cell cultures, and are expected to have broader applicability. Topical application in relevant vehicles (e.g., in the vehicle used in animal tests or in a cosmetic/dermatological formulation) is only possible in 3D models, which mimic the in vivo bio-availability of a chemical more closely, and which may therefore lead to improved assessment of sensitizer potency [15]. Furthermore, reconstructed human epidermis models and human skin have been reported to display similar gene expression profiles of drug metabolizing enzymes [24,25], which may be useful for the identification of pre/pro-haptens.
In the EpiSensA [20], skin sensitizers are identified based on the induction of three genes (four-fold increase as the positive criterion for ATF3, and two-fold increase for DNAJB4 and GCLM), which are all related to the cellular stress response, following 6 h of topical treatment. No correlation with potency estimation is reported.
In contrast, in the SENS-IS assay, potency class estimation is proposed [19]. Similar to the EpiSensA assay, the SENS-IS assay is based on the assessment of a set of genes (191 genes analyzed by qRT-PCR) following a 6 h topical exposure. However, the number of regulated genes correlates better with the sensitization potential of a chemical than does the intensity of the up-regulation, as was also observed in the GARD assay [23]. Potency is estimated according to the extended and final SENS-IS prediction model (cut-off value for the number of over-expressed (fold induction > 1.3) genes: 10 for the ARE anti-oxidant responsive genes, and 18 for SENS-IS genes). Chemicals positive at concentrations ≤50%, 10%, 1% or 0.1% are considered as weak, moderate, strong or extreme sensitizers, respectively. The correlation with the potency classification obtained in the LLNA was excellent, >90% (48/50); only cinnamic aldehyde and butyl glycidyl were classified differently (both strong in the SENS-IS, but moderate and weak, respectively, in the LLNA).
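The concentration-based potency classes described above can be expressed as a simple lookup on the lowest concentration at which a chemical tests positive. This is one reading of the sentence in the text (each class is assigned when the lowest positive concentration does not exceed the corresponding value), not a reproduction of the published prediction model; the gene-count criteria that define a positive result are not modeled, and the function name is hypothetical.

```python
from typing import Optional

# (upper concentration bound in %, potency class), ordered from most to least potent.
SENS_IS_CLASSES = [
    (0.1, "extreme"),
    (1.0, "strong"),
    (10.0, "moderate"),
    (50.0, "weak"),
]


def sens_is_potency(lowest_positive_percent: Optional[float]) -> str:
    """Map the lowest positive test concentration (%) to a potency class."""
    if lowest_positive_percent is None:
        return "not positive at any tested concentration"
    for upper_bound, label in SENS_IS_CLASSES:
        if lowest_positive_percent <= upper_bound:
            return label
    return "outside the range described in the text"


for concentration in (0.1, 1.0, 10.0, 50.0):
    print(concentration, sens_is_potency(concentration))
```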
More recently, Ahmed et al. [26] published an article on the use of human skin explants for the identification of sensitization hazards and the assessment of relative skin sensitizing potency. The proposed approach is quite elaborate and partly dependent on the user's judgment to evaluate the histological damage, which could limit its reproducibility. The test is based on the measurement of the histological damage in human skin as a readout of the immune response induced by a potential skin sensitizer. In addition, sensitizers and non-sensitizers identified as positive or negative by the skin explant test were shown to induce high or low T cell proliferation and IFNγ production, respectively, with non-sensitizers showing low T cell proliferation and an SI < 3, with the exception of phenol. A high degree of inter-individual variability was observed. While a total of 44 chemicals were tested for yes/no classification, dose-response experiments for potency assessment were performed with only two chemicals, namely oxazolone and cinnamic alcohol. The results are not encouraging, with the T cell priming assay failing to determine their different potencies. Of course, this may simply reflect an unfortunate choice of the two reference compounds. The data presented for potency assessment are preliminary, and it is difficult at this stage to judge their relevance.
Understanding how contact allergens promote allergic contact dermatitis is important, and the precise role of each cell type (i.e., keratinocytes, skin resident Langerhans and dendritic cells), together with the sequence of events, is yet to be fully understood. For the relatively limited number of contact allergens studied so far, it appears that they are sensed by Toll-like receptors (TLRs) and the inflammasome [27]. As TLRs evolved to sense microbial pathogens, the inflammation induced by contact allergens would thus appear to be an accident of nature, but it raises the possibility that the activation of these pathways in keratinocytes and dendritic cells is a trait shared by all compounds with skin sensitizing potential. In line with these observations are our data, which show the possibility of discriminating contact allergens using the selective induction of IL-18 in keratinocytes, including reconstructed human epidermis [21,28-32]. The argument is that more potent contact allergens will provoke, at lower concentrations, a stronger innate inflammatory reaction than will less potent allergens, which will in turn influence the migration of skin dendritic cells and their degree of activation and, therefore, the quality of the immune responses elicited [33]. We demonstrated the possibility of combining the epidermal equivalent potency assay, based on the irritation potential, with the assessment of IL-18 release (RhE IL-18 potency assay) to provide a single test for identification and classification of skin sensitizing chemicals, including chemicals of low water solubility or stability [21]. A protocol was developed using different 3D-epidermal models, including the in-house VUMC model, epiCS® (previously EST1000™, Cell Systems, Troisdorf, Germany), MatTek EpiDerm™ (MatTek In Vitro Life Science Laboratories, Bratislava, Slovak Republic) and SkinEthic™ (Episkin, Lyon, France) RhE. Following topical exposure for 24 h, a robust increase in IL-18 release was observed only with contact allergens. Correlating the in vitro RhE sensitizer potency data, expressed as the chemical concentration that results in 50% cytotoxicity (EE-EC50), with human and animal data showed a superior correlation with human DSA05 (µg/cm²) data (Spearman r = 0.8500; p value (two-tailed) = 0.0061) compared to LLNA data (Spearman r = 0.5968; p value (two-tailed) = 0.0542). Also, a good correlation was observed for release of IL-18 (SI-2) into culture supernatants with human DSA05 data (Spearman r = 0.8333; p value (two-tailed) = 0.0154). We propose a very simple approach to estimate in vitro the expected induction sensitization level of an unknown compound. A reference contact allergen curve is created using the RhE IL-18 potency test [22], testing well-known skin sensitizers of different potency (in the example reported in Figure 1, DNCB, eugenol, isoeugenol and benzocaine were used) and plotting LLNA EC3 and Human threshold values against in vitro EC50 or IL-18 SI-2. LLNA EC3 and Human threshold values were obtained from the ICCVAM database [34]. The LLNA EC3 and Human threshold values for unknown chemicals are then predicted based on the in vitro EC50 or IL-18 SI-2 values obtained in the RhE IL-18 potency test.
In the example reported in Figure 1, in vitro data relative to EC50 and IL-18 SI-2 for chlorpromazine (in vivo LLNA EC3 = 1.85%, Human threshold = 1150 µg/cm²) and diethyl maleate (in vivo LLNA EC3 = 3.65%, Human threshold = 1600 µg/cm²) were used to predict the in vivo LLNA EC3 and Human threshold values (manuscript in preparation). For the four reference allergens used, the best correlation was found for IL-18 SI-2 and Human threshold values (R² = 0.9970, p = 0.0015), which is in agreement with previous data [22]. For the other parameters, similar correlations were found (Figure 1).
In vitro EC50 and IL-18 SI-2 values for chlorpromazine were 0.35% and 0.16%, respectively, and for diethyl maleate 1.10% and 0.28%, respectively. These values were used to predict the in vivo LLNA EC3 and Human threshold values as reported in the insets of Figure 1.
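The reference-curve idea can be sketched as an ordinary least-squares line fitted through the reference allergens, from which the in vivo value of an unknown is read off. The sketch below is illustrative only: the reference points are hypothetical placeholders and are not the data behind Figure 1, the use of untransformed values is an assumption (the text does not state whether values were log-transformed), and the function names are invented for this example.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(slope: float, intercept: float, x_new: float) -> float:
    """Read the predicted in vivo value off the reference line."""
    return slope * x_new + intercept


# HYPOTHETICAL placeholder values for four reference allergens:
# x = in vitro read-out (e.g., IL-18 SI-2), y = in vivo value (e.g., LLNA EC3).
reference_in_vitro = [1.0, 2.0, 3.0, 4.0]
reference_in_vivo = [2.0, 4.0, 6.0, 8.0]

slope, intercept = fit_line(reference_in_vitro, reference_in_vivo)
unknown_in_vitro = 2.5  # hypothetical in vitro result for an untested chemical
print(predict(slope, intercept, unknown_in_vitro))
```

With only four reference points, a fit of this kind is sensitive to the choice of reference allergens, so any predicted value should be read as an estimate within the range spanned by the references.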
Further compounds must be tested to better define the applicability and limitations of the RhE IL-18 potency test, but it is possible that, using reference contact allergens, standard curves can be created to predict in vivo data, which could be useful in risk assessment.

Conclusions
Despite progress in the field, there is still much to be achieved and the best target for potency prediction is still an open question. The prediction of a probable EC3 value is advantageous since it would allow a continuous scale of potency predictions, which is necessary for setting the no induction sensitization level to be used in risk assessment. In vitro potency estimation is still a challenge and further work is necessary. This will also require a better understanding of the basic mechanisms determining the in vivo potency before an effective in vitro strategy can be put in place.

Figure 1. Reference contact allergen curves and prediction of "unknowns". Linear regression curves were created using four reference contact allergens, namely DNCB, isoeugenol, eugenol and benzocaine. In vitro EC50 and IL-18 SI-2 values for chlorpromazine were 0.35% and 0.16%, respectively, and for diethyl maleate 1.10% and 0.28%, respectively. These values were used to predict the in vivo LLNA EC3 and Human threshold values as reported in the insets.