Improved Detection Criteria for Detecting Drug-Drug Interaction Signals Using the Proportional Reporting Ratio

There is a current demand for “safety signal” screening, not only for single drugs but also for drug-drug interactions. Methods for detecting drug-drug interaction signals using the proportional reporting ratio (PRR), such as the combination risk ratio (CRR), have been reported. However, the CRR does not consider the overlap between the lower limit of the 95% confidence interval of the PRR of concomitant-use drugs and the upper limit of the 95% confidence interval of the PRR of single drugs. In this study, we proposed the concomitant signal score (CSS), with improved detection criteria, to overcome the issues associated with the CRR. “Hypothetical” true data were generated through a combination of signals detected using three detection algorithms. The signal detection accuracy of the analytical model under investigation was verified using machine learning indicators. The CSS presented improved signal detection when the number of reports was ≥3, with respect to the following metrics: accuracy (CRR: 0.752 → CSS: 0.817), Youden’s index (CRR: 0.555 → CSS: 0.661), and F-measure (CRR: 0.780 → CSS: 0.820). The proposed model significantly improved the accuracy of signal detection for drug-drug interactions using the PRR.


Introduction
Pre-marketing randomized clinical trials typically focus on establishing the safety and efficacy of a single drug rather than investigating drug-drug interactions; therefore, patients who use drugs other than the one under investigation are usually excluded. However, unlike in pre-marketing trials, it is common to use multiple drugs for treatment post-marketing. Therefore, attention should be paid not only to the adverse events caused by a given drug, but also to those arising as a result of interactions between two or more drugs. In some reports, the proportion of adverse events caused by drug-drug interactions was estimated to be approximately 30% of the unexpected adverse events [1]. The use of a spontaneous reporting system is believed to be beneficial for the early detection of drug-induced adverse events post-marketing. Spontaneous reporting systems do not include the number of drug users; therefore, the incidence of adverse events cannot be calculated, and unknown adverse events are instead searched for using safety signals [2].
Several detection algorithms [3][4][5][6] have been developed based on disproportionality analysis; for example, the proportional reporting ratio (PRR) [3] is often used as an algorithm for single-drug-induced adverse events. In addition to these algorithms, several signal detection algorithms for drug-drug interactions have been reported [7][8][9].
Susuta et al. proposed the combination risk ratio (CRR) as a signal detection algorithm for drug-drug interactions [10]. As suggested by these authors, the CRR is calculated by dividing the PRR point estimate for the concomitant use of drug D1 and drug D2 (PRR drug D1 ∩ drug D2) by the PRR point estimate for drug D1 or drug D2 (PRR drug D1 or PRR drug D2), as shown in Table 1 and Equations (1)-(3) [10].
However, the number of adverse events reported during concomitant use is generally lower than that of single-drug-induced adverse events, and the 95% confidence interval (95%CI) tends to be wider for PRR drug D1 ∩ drug D2. In other words, as shown in Figure 1, when the individual 95%CIs are considered, it is possible that the lower limit of the 95%CI of PRR drug D1 ∩ drug D2 (=PRR025 drug D1 ∩ drug D2) overlaps with the upper limit of the 95%CI of PRR drug D1 or PRR drug D2 (=PRR975 drug D1 or PRR975 drug D2).
Figure 1. The association between the combination risk ratio (CRR) and disproportionality score.
If such an overlap occurs, a risk signal may not be detected for concomitant use. Therefore, with reference to the interaction signal score (INTSS) [11], we proposed the concomitant signal score (CSS) (Figure 2) and verified its applicability for improving the detection criteria for the CRR proposed by Susuta et al. [10].
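The overlap problem can be illustrated with a short sketch. It assumes the widely used log-normal approximation for the 95%CI of the PRR and takes the CRR as the combined-use PRR divided by the larger of the two single-drug PRRs; both are assumptions made for illustration, since Table 1 and Equations (1)-(3) are not reproduced here. The function names and all report counts are hypothetical.

```python
import math


def prr_with_ci(a, b, c, d, z=1.96):
    """Return (PRR, lower, upper) for one 2x2 table.

    a: reports of the target event with the exposure of interest
    b: reports of other events with the exposure of interest
    c: reports of the target event in the comparator group
    d: reports of other events in the comparator group
    """
    prr = (a / (a + b)) / (c / (c + d))
    # Standard error of ln(PRR); requires a > 0 and c > 0.
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(prr) - z * se)
    upper = math.exp(math.log(prr) + z * se)
    return prr, lower, upper


def crr(prr_combo, prr_d1, prr_d2):
    """Assumed CRR form: combined-use PRR over the larger single-drug PRR."""
    return prr_combo / max(prr_d1, prr_d2)


# Hypothetical counts, chosen only to illustrate the overlap.
combo = prr_with_ci(5, 95, 100, 9800)     # few reports, wide interval
d1 = prr_with_ci(40, 1960, 100, 9800)
d2 = prr_with_ci(30, 2970, 100, 9800)

ratio = crr(combo[0], d1[0], d2[0])
# Overlap: lower limit for combined use falls below a single-drug upper limit.
overlap = combo[1] < max(d1[2], d2[2])
print(f"CRR = {ratio:.2f}, intervals overlap: {overlap}")
```

In this hypothetical case the point estimates alone give a CRR above 2, yet the interval for concomitant use still overlaps the interval for drug D1, which is the situation the CSS is intended to address.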

Model Evaluation Using Receiver Operating Characteristic (ROC) and Precision-Recall (PR) Curves and the Area Under the Curve (AUC)
The receiver operating characteristic (ROC) curve and precision-recall (PR) curve of the CSS are shown in

Model Evaluation Using Machine Learning Indicators
In all cases, 739 pairs were detected using the      Table 3).

Discussion
We evaluated the setting criteria and accuracy of the CSS, a newly proposed analytical model to overcome potential issues with the CRR. In this study, 3924 drug D1-drug D2-Stevens-Johnson syndrome (SJS) combinations in the Japanese Adverse Drug Event Report (JADER) database were evaluated. Data on unknown adverse events do not exist; therefore, no "real" true data for adverse events were available. To verify the accuracy of the CSS, we therefore had to prepare "hypothetical" true data pertaining to adverse events. These "hypothetical" true data were also used to validate the subset analysis for detecting drug-drug interaction signals [12].
The ROC curve was generated to determine the cutoff value of the CSS. The highest value for Youden's index was 0.632, at a cutoff value of 1.041 (F-measure: 0.645). Furthermore, the PR curve must be considered when evaluating imbalanced data such as those used in the present study. The F-measure was the highest at a cutoff value of 1.271 (F-measure: 0.645, Youden's index: 0.623). Considering these results, the detection criterion that we proposed (CSS > 1) would be preferable.
The highest number of signals was detected using the CSS, with 1862 pairs (47.5% of the total combinations, accuracy: 0.740, Youden's index: 0.629, and F-measure: 0.633), followed by the CRR, with 739 pairs (18.8% of the total combinations, accuracy: 0.817, Youden's index: 0.424, F-measure: 0.569), in all cases. One reason for this difference in the number of detections is that, unlike the CSS, the CRR cannot detect combinations for n111 < 3. Therefore, we also investigated the difference in the number of detections for n111 ≥ 3.
As shown in Tables 2 and 3, 621 signal pairs were detected using the CSS (57.8% of the total combinations, accuracy: 0.817, Youden's index: 0.661, and F-measure: 0.820), whereas 739 pairs were detected using the CRR (68.7% of the total combinations, accuracy: 0.752, Youden's index: 0.556, and F-measure: 0.780). These results demonstrate the significantly improved signal detection accuracy of the newly proposed CSS in comparison with that of the CRR. However, the CSS exhibited slightly lower accuracy for detecting drug-drug interaction signals in comparison with the Ω shrinkage measure.
These results suggest that the CSS might be more suitable than the CRR for detecting drug-drug interaction signals using the PRR.
Our previous study [13] showed that the Ω shrinkage measure is the most conservative signal detection method. Therefore, in this study, we also investigated the similarity between the CSS and the Ω shrinkage measure.
In all cases, the similarity metrics between the CSS and the Ω shrinkage measure were κ (95%CI): 0.330 (0.314-0.345), P positive: 0.505, and P negative: 0.759, whereas those between the CRR and the Ω shrinkage measure were κ: 0.718 (0.703-0.733), P positive: 0.771, and P negative: 0.948 [13]. As mentioned earlier, the CRR cannot detect combinations for n111 < 3, whereas the CSS can. This difference in detection criteria is considered to have affected the similarity.
Indeed, for n111 ≥ 3, the CSS was more similar to the Ω shrinkage measure than the CRR was: the similarity metrics between the CSS and the Ω shrinkage measure were κ (95%CI): 0.729 (0.708-0.750), P positive: 0.880, and P negative: 0.849, whereas those between the CRR and the Ω shrinkage measure were κ: 0.621 (0.597-0.646), P positive: 0.850, and P negative: 0.763 [13]. Unlike the CRR, the CSS has the advantage of being able to detect combinations for n111 < 3; however, if more conservative signal detection is needed, adding "n111 ≥ 3" to the criteria should be considered.
However, this study, like our previous study [12], has three limitations: (1) The true data used in this study consist of statistics-based drug D1-drug D2-adverse event (SJS) combinations rather than pharmacology-based combinations. Unfortunately, data on unknown adverse events do not exist; thus, it was necessary to use "hypothetical" true data instead of "real" true data for validation. Therefore, we used a combination of signals detected using the three algorithms as "hypothetical" true data for detecting drug-drug interaction signals. (2) It is important to compare detection trends using all adverse events recorded in the validation datasets. However, numerous drug D1-drug D2-adverse event combinations are expected, and a study design including all such combinations is not practical. Therefore, SJS was the target adverse event in this study. Although this adverse event has been used previously in other comparative studies by our group [12,13] and other researchers [10,14], different performance characteristics may be obtained when different reference sets are used. (3) Differences in the approach adopted by regulatory authorities may result in differences in the tendency to register adverse events in the spontaneous reporting system. For example, JADER did not accept reports from patients for a long time, whereas the Food and Drug Administration Adverse Events Reporting System (FAERS) includes reports from non-medical professionals. It is unclear how these differences in registration tendencies would affect the results of this study [12]. However, as verified by Caster et al. [15], the statistical impact of differences in the number of cases enrolled in the spontaneous reporting system on this study might be small.

Data Sources
The validation dataset was created from the Japanese Adverse Drug Event Report database (JADER), using data from the first quarter of 2004 to the fourth quarter of

Definitions of Adverse Drug Events
The drugs targeted for the survey were all those registered and classified as "suspect drugs" in the validation dataset. The adverse event targeted in this study was Stevens-Johnson syndrome (SJS), identified using the preferred term (PT) in the Medical Dictionary for Regulatory Activities Japanese version (MedDRA/J).

"Hypothetical" True Data of Adverse Events for Comparative Verification
There are no "real" true data for unknown adverse events. We considered that known adverse events alone should not be used as "real" true data for validation, because detection algorithms require the power to detect unknown adverse events. Therefore, the "hypothetical" true data used in the previous study [12] were also used in this study.
The "hypothetical" true data consisted of the combination of signals detected by three algorithms (the additive model [16], the multiplicative model [16], and the Chi-squared statistical test [17]).
(4) combination risk ratio (CRR) > 2; this was regarded as a signal of a drug-drug interaction.

New Model (Concomitant Signal Score) and Criteria
Two new models and criteria were considered. The lower limit (PRR025) and the upper limit (PRR975) of the 95%CI for the PRR were calculated as shown in Figure 1 and Equations (2) and (4).
We proposed the CSS (Equation (5)) as a new model. The newly proposed model is the ratio of PRR025 drug D1 ∩ drug D2 to PRR975 drug D1 (or PRR975 drug D2). This principle is similar to that of the interaction signal score (INTSS) [16].
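A minimal sketch of the CSS follows. It assumes that the denominator is the larger of the two single-drug upper limits, which is one reading of "PRR975 drug D1 (or PRR975 drug D2)", and that the 95%CI uses the usual log-normal approximation; Equation (5) itself is not reproduced here. The function names and report counts are hypothetical.

```python
import math


def prr_ci(a, b, c, d, z=1.96):
    """Return (PRR025, PRR975), the approximate 95%CI limits of the PRR.

    a, b: target-event and other-event reports with the exposure of interest
    c, d: target-event and other-event reports in the comparator group
    """
    log_prr = math.log((a / (a + b)) / (c / (c + d)))
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return math.exp(log_prr - z * se), math.exp(log_prr + z * se)


def css(combo, d1, d2):
    """Assumed CSS form: PRR025 (combined use) / max(PRR975 of D1, PRR975 of D2)."""
    lower_combo, _ = prr_ci(*combo)
    _, upper_d1 = prr_ci(*d1)
    _, upper_d2 = prr_ci(*d2)
    return lower_combo / max(upper_d1, upper_d2)


# Hypothetical 2x2 counts (a, b, c, d) for each single drug.
d1 = (40, 1960, 100, 9800)
d2 = (30, 2970, 100, 9800)

# Same combined-use PRR point estimate, different numbers of reports.
css_few = css((5, 95, 100, 9800), d1, d2)     # wide interval
css_many = css((50, 950, 100, 9800), d1, d2)  # narrow interval

print(f"CSS with 5 reports:  {css_few:.3f}, signal: {css_few > 1}")
print(f"CSS with 50 reports: {css_many:.3f}, signal: {css_many > 1}")
```

Both hypothetical scenarios share the same PRR point estimate for concomitant use, so a criterion based only on point estimates would treat them identically; under the assumed CSS > 1 criterion, only the scenario with the narrower interval is flagged.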

Model (Ω Shrinkage Measure) and Criteria to Be Compared
In this study, we selected the Ω shrinkage measure [18] as the model for comparison. In our previous study [13], the Ω shrinkage measure showed the most conservative detection trends among the five algorithms based on frequentist statistical models for drug-drug interactions.
The calculation is shown in Equations (6) and (7). Here, E111 is the expected value of adverse events (AEs) caused by the combination of two drugs. Ω025 > 0 is used as the threshold to screen for signals under the combination of two drugs.
The ROC curve is normally used to judge model performance. To properly analyze imbalanced data, which include a higher proportion of negative data than positive data, evaluation using the PR curve [19,20] in addition to the ROC curve is necessary. Therefore, the PR curve was also used in this study.
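The indicators reported in this study (accuracy, Youden's index, and F-measure) can all be derived from a single confusion matrix at a given cutoff. The sketch below uses their standard definitions; the function name and the counts are hypothetical and are not taken from the study data.

```python
def evaluate(tp, fp, fn, tn):
    """Standard indicators from a 2x2 confusion matrix.

    tp, fn: "true" signals that were detected / missed
    fp, tn: non-signals that were detected / correctly not detected
    """
    sensitivity = tp / (tp + fn)          # recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "youden": sensitivity + specificity - 1,
        "f_measure": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical, imbalanced example: 120 positives among 1000 combinations.
metrics = evaluate(tp=80, fp=20, fn=40, tn=860)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced data such as these, accuracy is dominated by the many true negatives, whereas the F-measure depends only on precision and recall; this is why the PR curve is examined alongside the ROC curve.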

Cohen's Kappa Coefficient
The commonality of the signals detected by each statistical model was evaluated using Cohen's kappa coefficient (κ), proportionate agreement for positive rating (P positive), and proportionate agreement for negative rating (P negative), as reported in previous studies [12][13][14]. In this study, we investigated the similarity of the previously proposed model (CRR) and the newly proposed model (CSS) to the Ω shrinkage measure.
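These agreement measures can be sketched as follows for two detection methods applied to the same combinations. The code uses the standard definitions of Cohen's κ and of the proportionate agreement for positive and negative ratings; the function name and the counts are hypothetical.

```python
def agreement(both_pos, only_a, only_b, both_neg):
    """Cohen's kappa, P positive, and P negative for two detection methods.

    both_pos: combinations flagged by both methods
    only_a:   flagged by method A only
    only_b:   flagged by method B only
    both_neg: flagged by neither method
    """
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n
    # Agreement expected by chance from the marginal totals.
    expected = (
        (both_pos + only_a) * (both_pos + only_b)
        + (both_neg + only_b) * (both_neg + only_a)
    ) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    p_positive = 2 * both_pos / (2 * both_pos + only_a + only_b)
    p_negative = 2 * both_neg / (2 * both_neg + only_a + only_b)
    return kappa, p_positive, p_negative


# Hypothetical counts for illustration only.
kappa, p_positive, p_negative = agreement(40, 10, 20, 130)
print(f"kappa = {kappa:.3f}, "
      f"P positive = {p_positive:.3f}, P negative = {p_negative:.3f}")
```

Reporting P positive and P negative alongside κ helps show whether a given κ value reflects agreement on detected signals, on non-signals, or on both.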

Analysis Software
The analyses in this study were performed using Visual Mining Studio version 8.4 (NTT DATA Mathematical Systems Inc., Shinjuku-ku, Tokyo, Japan), Microsoft Excel 2019 (Microsoft Corp., Redmond, WA, USA), and R version 4.0.0 (R Core Team, https://www.R-project.org/) with the PRROC package version 1.3.1.

Conclusions
Polypharmacy has become a contemporary social problem, which has necessitated safety signal screening, not only for single drugs but also for drug-drug interactions. A convenient method is sought because most methods for detecting drug interaction signals involve complicated calculations.
The CRR proposed by Susuta et al. is based on the PRR, which not only facilitates the detection of drug-drug interaction signals, but also makes it easy to understand, in terms of signal intensity, how the signal for concomitant use changes relative to that for a single drug [10]. However, unlike the INTSS, the CRR does not consider the overlap between the lower limit of the 95%CI of PRR drug D1 ∩ drug D2 (=PRR025 drug D1 ∩ drug D2) and the upper limit of the 95%CI of PRR drug D1 or PRR drug D2 (=PRR975 drug D1 or PRR975 drug D2).
In this study, we proposed the CSS, with improved detection criteria, with reference to the INTSS, to overcome the issues associated with the CRR. Our proposed model significantly improved the accuracy of signal detection for drug-drug interactions using the PRR.

Conflicts of Interest:
The authors declare no conflict of interest.