An Open-Access Dataset of Thorough QT Studies Results

Along with the current interest in changes of cardiovascular risk assessment strategy and the inclusion of in silico modelling into the applicable paradigm, the need for data has increased, both for model generation and testing. Data collection is often a time-consuming but inevitable step in the modelling process, requiring extensive literature searches and the identification of alternative resources providing complementary results. The next step, namely data extraction, can also be challenging. Here we present a collection of thorough QT/QTc (TQT) study results with detailed descriptions of study design, pharmacokinetics, and pharmacodynamic endpoints. The presented dataset provides information that can be further utilized to assess the predictive performance of different preclinical biomarkers for QT prolongation effects with the use of various modelling approaches. As the exposure levels and population description are included, the study design and characteristics of the study population can be recovered precisely in the simulation. Another possible application of the TQT dataset is the analysis of the drug characteristic/QT prolongation/TdP (torsade de pointes) relationship after the integration of the provided information with other databases and tools. These include drug cardiac safety classifications (e.g., CredibleMeds), the Comprehensive in vitro Proarrhythmia Assay (CiPA) compound classification, as well as resources containing information on physico-chemical properties or absorption, distribution, metabolism, excretion (ADME) data, like PubChem or DrugBank.


Summary
Cardiovascular toxicity has been one of the leading causes of concern throughout the drug development process as well as a major contributor to drug withdrawals [1]. Among the most frequent cardiac safety liabilities responsible for cardiotoxicity is proarrhythmia [2], although after implementation of the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) guidelines, the number of drug withdrawals due to proarrhythmic effects decreased substantially [3]. On the other hand, the approach proposed by the guidelines is oversensitive and may result in a high rate of false-positive results, which eventually leads to the withdrawal of potentially useful compounds [4].
In E14 ICH Guidelines, the Food and Drug Administration (FDA) provides recommendations to sponsors for evaluating non-antiarrhythmic drug effects on cardiac repolarization. According to the document, almost all new drugs, or those for which a new dose or route of administration is under development, have to be tested in a trial dedicated to the assessment of their potential to delay cardiac repolarization, called a "Thorough QT/QTc (TQT) study". The clinical endpoint corresponding to the changes in the time of ventricular repolarization is QT interval prolongation on the surface electrocardiogram (ECG). Because QT interval length is related to the heart rate, it is recommended that applications should contain, apart from the raw QT data, RR interval data and QT interval data corrected for the heart rate (QTc). Among the correction methods, Bazett's and Fridericia's formulas are the most popular, although others are also used [5]. TQT studies are usually run at early stages of drug clinical development in healthy volunteers with an additional positive control group enrolled for assay sensitivity establishment. A TQT study is said to be positive if the upper boundary of the 95% confidence interval around the mean drug effect on QTc exceeds 10 ms, resulting in an expanded ECG safety evaluation during the whole drug development process [6].
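The two heart-rate corrections named above, and the positivity criterion as stated in the text, can be illustrated with a minimal Python sketch. The function names are illustrative only and are not part of the dataset or of any guideline; the RR interval is assumed to be expressed in seconds and the QT interval in milliseconds, and the positivity check simply compares a reported upper confidence bound with the 10 ms threshold.

```python
import math


def qtc_bazett(qt_ms: float, rr_s: float) -> float:
    """Bazett's correction: QTc = QT / RR^(1/2), with RR in seconds."""
    return qt_ms / math.sqrt(rr_s)


def qtc_fridericia(qt_ms: float, rr_s: float) -> float:
    """Fridericia's correction: QTc = QT / RR^(1/3), with RR in seconds."""
    return qt_ms / rr_s ** (1.0 / 3.0)


def is_positive_tqt(upper_ci_bound_ms: float, threshold_ms: float = 10.0) -> bool:
    """Simplified reading of the criterion in the text: the study is positive
    when the upper confidence bound of the mean QTc effect exceeds 10 ms."""
    return upper_ci_bound_ms > threshold_ms


# Illustrative values only: QT = 400 ms at 75 beats/min, so RR = 60 / 75 = 0.8 s.
rr = 60.0 / 75.0
qtcb = qtc_bazett(400.0, rr)
qtcf = qtc_fridericia(400.0, rr)
print(round(qtcb, 1), round(qtcf, 1))
```

At heart rates above 60 beats/min (RR below 1 s), both formulas return a QTc longer than the measured QT, and Bazett's correction adjusts more strongly than Fridericia's, which is one reason the correction method is recorded alongside each result.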
TQT studies together with the assessment of the drug-dependent in vitro hERG (human ether-à-go-go-related gene) channel blockade (ICH S7) form a sensitive approach for drug proarrhythmic potency evaluation, which, however, suffers from limited specificity. In 2013, the Comprehensive in vitro Proarrhythmia Assay (CiPA) initiative was launched as a new paradigm in drug cardiac safety [7]. The CiPA consists of four elements: (1) evaluation of the drug effects on multiple cardiac ion currents (to avoid an hERG-centric assessment and, therefore, potential bias), (2) in silico models of the ventricular action potential (to help interpret previously generated in vitro inhibition data for multiple currents), (3) drug effect evaluation in human stem cell-derived ventricular myocytes, and (4) early clinical ECG measurements of unanticipated effects (to eventually replace TQT trials with more thorough ECG assessment from the First-In-Human trials) [4]. The CiPA initiative intends to move toward human-based approaches and multiple ion channel assays coupled to in silico models of human cardiomyocyte electrophysiology with the ultimate aim of predicting the risk of TdP (torsade de pointes) rather than QT prolongation alone.
The aim of the current endeavor was to gather available results of the TQT trials and present them in a searchable form. The presented dataset provides information that can be further utilized to assess the predictive performance of different preclinical biomarkers for QT prolongation effects with the use of various modelling approaches. As the exposure levels and population description are included, the study design and characteristics of the study population can be recovered precisely in the simulation. Another possible application of the TQT dataset is the analysis of the drug characteristic/QT prolongation/TdP relationship after the integration of provided information with other databases and tools. This includes drug cardiac safety classifications (e.g., CredibleMeds [8]), CiPA compound classification (TdP risk level), as well as those containing information on physicochemical properties or ADME (absorption, distribution, metabolism, excretion) data like PubChem (substance structure files and physico-chemical properties) or DrugBank.

Data Description
Ultimately, 120 studies with published results were retrieved from literature sources, and information on a further 34 studies was obtained from clinical trial registers and other resources. In all, the identified studies assessed 159 unique compounds. The results of the identified studies were reported between 2005 and 2019. The dataset contains results for 141 compounds tested as single-drug therapy, for 18 compounds given as a two-drug combination, and for 2 compounds given in combination with more than 2 drugs (3 and 4). TQT study results were identified for two compounds from the CiPA compounds list (ondansetron and domperidone). In all the studies for which a positive control was reported in the data source document, moxifloxacin was used to demonstrate assay sensitivity. Results of the majority of retrieved studies were classified as "negative" (123 compounds); 16 compounds tested "positive" in a TQT study, and the study results for the rest of the compounds or combinations were either inconclusive or not stated in the original information source. Most of the retrieved TQT studies were conducted in healthy volunteers; patients participated in only 6 studies. The patients were HIV (human immunodeficiency virus) or HCV (hepatitis C virus) infected, were diagnosed with schizophrenia or schizoaffective disorders or Parkinson's disease, or had a solid tumor. In the majority of studies, both men and women were included; however, in 12 studies, female-only (2 studies) or male-only (10 studies) groups were investigated. The number of subjects involved in a study or study arm varied considerably, from 13 to 304, with 1 to 135 females and 7 to 272 males. For 108 out of 154 TQT studies, pharmacokinetic data were reported. In most cases, values of Cmax, Tmax, and AUC (area under the concentration-time curve) were known.
Pharmacodynamic results were reported as QTc change, and a range of QT corrections for heart rate was utilized, including Fridericia, Bazett, individual study-specific, and individual population-based methods.

Data Source Identification
The data collection process was undertaken between November 2018 and January 2019. PubMed and Google Scholar bibliographic databases, clinical trial registers (ClinicalTrials.gov, the World Health Organization (WHO) International Clinical Trials Registry, and the European Union Clinical Trials Register), and the Internet, via the Google search engine, were searched to identify published results of thorough QT (TQT) studies. To identify as many reports as possible, the search criteria were deliberately broad; that is, general search terms were applied in all the fields without any filtering options. Thorough (all fields) AND QT (all fields) was the single search query used for the title, abstract, and keyword fields in the first run of the search in the Medline database. The search yielded 479 papers. This result was augmented by the search results for the TQT (all fields) and QT/QTc (all fields) queries and totaled 921 publications. No time limits were applied in any of the searches. The search in the Google Scholar database was performed using the "thorough QT" or TQT query, and 880 results were returned. First, the titles of the identified articles were examined to exclude clearly irrelevant results. Then, the remaining results were screened by abstract. Finally, the full text of the remaining papers was manually reviewed for inclusion eligibility. Further studies were retrieved from clinical trial registers. ClinicalTrials.gov (https://clinicaltrials.gov/), the WHO International Clinical Trials Registry (http://apps.who.int/trialsearch/), and the European Union Clinical Trials Register (www.clinicaltrialsregister.eu) were queried for the "TQT" and "thorough QT" keywords. Forty-one, 38, and 3 registered studies were identified in the respective databases, totaling 60 unique studies. However, the study results were posted for only 10 of them.
Furthermore, searches using the Google search engine were conducted to identify TQT study results published in European public assessment reports (EPAR documents), product patient package inserts, conference abstracts, or posters.

Extracting the Data
All the papers, as well as study reports, were examined separately with the aim of extracting the data of interest, including study design, study population, pharmacokinetic (PK) and pharmacodynamic (PD) results, and the study result classification (positive/negative). Each data record in the dataset was described with up to 40 parameters, defined in Table 1. All available study parameters were recorded. The PubMed ID number or ClinicalTrials.gov identifier allows the interested reader to identify the primary data source.
Table 1. Parameters describing a data record.
Study population: Proportion of males and females; Race; Weight (mean and/or SD and/or range); Height (mean and/or SD and/or range); BMI (mean and/or SD and/or range); Health status (healthy/condition)
PK: Cmax and SD or CI or coefficient of variation; Tmax; AUC and SD or CI or coefficient of variation
PD: Baseline QT interval length and SD or CI or coefficient of variation; QT interval length after drug administration (SD); QT assessment time after administration; Baseline-corrected QT change and/or CI and/or SD and/or SE; Placebo-corrected QT change and/or CI and/or SD and/or SE; QT correction method
Record: Data source (first author name and year of publication); PMID
All the compounds were defined by their common and IUPAC names, as well as PubChem's compound ID number (CID), which allows convenient browsing and definitive identification of a compound. For some of the compounds, the code name from the discovery and development program is given in the comments when it was used in the related TQT study report. SMILES notation of a compound structure allows browsing for compounds with defined structural fragments. SMILES strings can also be imported by many molecule editing tools and converted into two-dimensional structures (or in some cases can be used directly), enabling calculation of the molecular descriptors for QSAR (Quantitative Structure-Activity Relationship) modelling approaches.
The detailed study protocol, including study design, dosing schedules, and population description, enables close recovery of the clinical settings in a virtual clinical study. The classification of study results as positive or negative is useful, for example, for comparisons with available TdP risk classifications. Finally, the first author of the published report, together with the PubMed ID number or, in some cases, a link to a PDF file or webpage, is given so that the data can be verified and additional details not accounted for in the dataset can be acquired.
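As an illustration of the comparison with TdP risk classifications mentioned above, the following Python sketch cross-tabulates TQT study outcomes against an external risk category. Everything in it is hypothetical: the record fields, the compound names, and the risk labels are placeholders chosen for the example and do not reproduce the dataset's actual column names or any real classification.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TQTRecord:
    """Hypothetical, reduced view of one dataset record."""
    compound: str
    result: Optional[str]  # "positive", "negative", or None when not stated


def cross_tabulate(
    records: List[TQTRecord], risk_by_compound: Dict[str, str]
) -> Dict[Tuple[str, str], int]:
    """Count records per (TQT result, external risk category) pair."""
    table: Dict[Tuple[str, str], int] = {}
    for rec in records:
        # Compounds missing from the external list are kept, not dropped.
        risk = risk_by_compound.get(rec.compound.lower(), "unclassified")
        key = (rec.result or "not stated", risk)
        table[key] = table.get(key, 0) + 1
    return table


# Placeholder records and risk labels, for illustration only.
records = [
    TQTRecord("compound_a", "positive"),
    TQTRecord("compound_b", "negative"),
    TQTRecord("compound_c", None),
]
risk = {"compound_a": "risk category 1", "compound_b": "risk category 2"}
counts = cross_tabulate(records, risk)
for (result, category), n in sorted(counts.items()):
    print(result, category, n)
```

In practice, matching on a stable identifier such as the PubChem CID rather than on the compound name is likely to be more robust, since names and development codes vary between sources.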

User Notes
The current report presents a freely available set of data describing the results of TQT trials from publicly available sources. To the best of our knowledge, this is the only dataset of this kind. The developed dataset is freely available via the Tox-Portal platform (http://tox-portal.net) or in the Mendeley repository (DOI: 10.17632/47nknnw666.1).