Emidec: A Database Usable for the Automatic Evaluation of Myocardial Infarction from Delayed-Enhancement Cardiac MRI

Abstract: One crucial parameter to evaluate the state of the heart after myocardial infarction (MI) is the viability of the myocardial segment, i.e., whether the segment recovers its functionality upon revascularization. MRI performed several minutes after the injection of a contrast agent (delayed enhancement-MRI or DE-MRI) is a method of choice to evaluate the extent of MI, and by extension, to assess viable tissues after an injury. The Emidec dataset is composed of a series of exams with DE-MR images in short-axis orientation covering the left ventricle, from normal cases or patients with myocardial infarction, with the contouring of the myocardium and diseased areas (if present) by experts in the domain. Moreover, the clinical parameters classically available when the patient is managed by an emergency department are provided for each case. To the best of our knowledge, the Emidec dataset is the first one where annotated DE-MRI are combined with clinical characteristics of the patient, allowing the development of methodologies for exam classification as well as for exam quantification.
Dataset: The database is accessible at http://emidec.com/ under a CC-BY-NC-SA agreement.

to restore perfusion as soon as possible. However, when this revascularization fails, extensive damage in the form of an obstruction called persistent microvascular obstruction (MVO), also known as the no-reflow phenomenon, can appear [1][2][3]. Among the patients with MI, determining which of them will benefit from revascularization is of obvious clinical significance. One crucial parameter is the viability of the myocardial segments, i.e., whether a myocardial segment with suspected myocardial infarction can recover its functionality upon revascularization. MRI performed several minutes after the injection of a contrast agent (defined as delayed enhancement-MRI or DE-MRI) is a method of choice to evaluate the extent of MI, and by extension, to assess viable tissues after an injury (in conjunction with the thickening of the muscle evaluated from cine-MRI) [4][5][6]. In addition, in current practice, the medical doctor has access not only to the imaging data but also to clinical information about the patient, and the report is therefore also based on this information.
The proposed dataset is composed of 150 clinical exams, each comprising a series of short-axis MRI with ground truths (segmentation of the myocardium and of the diseased area, if present) and the associated clinical characteristics currently used in medical practice. This dataset was made available as part of the Emidec challenge (automatic Evaluation of Myocardial Infarction from Delayed-Enhancement Cardiac MRI), organized in conjunction with the STACOM workshop during the MICCAI conference in 2020 [7], and this challenge is also registered on the Zenodo website [8].
This dataset could be used to design methodologies for the automatic detection of the different relevant areas (the myocardial contours, the infarcted area, and the persistent microvascular obstruction area) from a series of short-axis DE-MRI covering the left ventricle, and then to quantify the MI, in absolute value (mm³) or as a percentage of the myocardium [9]. This dataset could also be used to design classification approaches that detect whether the exam is normal or pathological from the clinical information only, or from the combination of clinical information and DE-MRI. Other datasets with DE-MRI have already been published, but they differ from ours. The dataset provided by Karim et al. [9] included only fifteen human cases. More recent challenges during the MICCAI conference provided DE-MRI with expert annotations, such as the MS-CMRSeg challenge in 2019 [10] and the MyoPS challenge in 2020 [11], but the number of cases was lower than in our dataset. Other important features that make our dataset unique are the presence of normal cases and the availability of the clinical characteristics associated with each exam.
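As an illustration of the quantification described above, the following minimal sketch computes the diseased volume in mm³ and as a percentage of the myocardium from a label volume. The label values, the function name `quantify_infarct`, the choice to count MVO as part of the diseased area, and the spacing values in the toy example are assumptions made for this example and are not taken from the dataset documentation.

```python
from itertools import chain

# Hypothetical label values (assumption): the actual encoding must be
# checked against the dataset documentation.
MYOCARDIUM, INFARCT, MVO = 1, 2, 3


def quantify_infarct(label_volume, pixel_spacing_mm, slice_distance_mm):
    """Return (diseased volume in mm^3, diseased area as % of the myocardium).

    label_volume: list of slices, each slice a list of rows of integer labels.
    pixel_spacing_mm: (row spacing, column spacing) in mm.
    slice_distance_mm: distance between consecutive slices in mm.
    """
    voxel_mm3 = pixel_spacing_mm[0] * pixel_spacing_mm[1] * slice_distance_mm
    counts = {MYOCARDIUM: 0, INFARCT: 0, MVO: 0}
    # Flatten slices -> rows -> voxels and count each label of interest.
    for value in chain.from_iterable(chain.from_iterable(label_volume)):
        if value in counts:
            counts[value] += 1
    # Assumption: MVO voxels are counted as diseased, and the whole
    # myocardium is healthy myocardium + infarct + MVO.
    diseased = counts[INFARCT] + counts[MVO]
    myocardium_total = counts[MYOCARDIUM] + diseased
    percent = 100.0 * diseased / myocardium_total if myocardium_total else 0.0
    return diseased * voxel_mm3, percent


# Toy example: a single 3x4 slice with placeholder spacing values.
toy = [[[0, 1, 1, 1],
        [1, 2, 2, 1],
        [0, 2, 3, 1]]]
volume_mm3, percent = quantify_infarct(toy, (1.25, 1.25), 10.0)
```

In practice, the spacing would be read from the image header rather than hard-coded, and the distance between slices (not the slice thickness) is the relevant dimension when slices are not contiguous.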

Data Description
The Emidec dataset is a new publicly available DE-MRI database with associated clinical information consisting of 150 exams. The data of each clinical exam are divided into two parts:
- MR images composed of a series of DE-MRI in short-axis orientation that cover the left ventricle of the heart, with the corresponding ground truths;
- A text file with the clinical information.
There is an unbalanced distribution between normal (1/3) and pathological (2/3) cases, corresponding roughly to real life in an MRI department. The targeted cohort is any patient admitted to a cardiac emergency department with symptoms of a heart attack. Each group was clearly defined according to physiological parameters and the presence or absence of a diseased area on DE-MRI. All pathological cases are patients with acute MI, and the MRI exam was done within one month of the angioplasty procedure. Patients with multiple pathologies were discarded.

MRI Exams
The use of gadolinium extracellular contrast agents with MRI using late post-gadolinium myocardial enhancement sequences has further improved our ability to accurately and precisely analyze myocardial tissue composition, especially myocardial fibrosis content [12]. In particular, the increase in gadolinium concentration within fibrotic tissue causes T1 shortening, which appears as bright signal intensity in the T1-weighted cardiac DE-MRI. The physiological basis of the gadolinium enhancement of myocardial fibrosis after several minutes is the combination of an increased volume of distribution for the contrast agent and a prolonged washout related to the decreased capillary density within the myocardial fibrotic tissue [13,14]. The fibrotic area appears bright in T1-weighted MRI when the acquisition is obtained roughly 10 min after the injection of a gadolinium-based contrast agent, whereas normal tissue appears dark. Note the difference between early enhancement, which can be studied one or two minutes after injection of the contrast agent, and late enhancement, as in our case, with acquisition roughly 10 min after the injection. The amount of contrast agent in the myocardium and the corresponding grey level on the images can be influenced by many factors, such as the renal clearance, the type of contrast media, the contrast dose, the delay between the injection of the contrast agent and the image acquisition, and the sequence parameters, among others. In our dataset, the contrast media and the sequence parameters are always the same; however, the variation of the other factors leads to differences in the image signal. Persistent MVO is due to a persistent perfusion defect and causes irreversible damage to the intracellular zone, which finally leads to tissue death [15,16].
This latter area appears black on T1-weighted MRI regardless of the time delay between the injection of the contrast agent and the image acquisition, and is always surrounded by a bright area (corresponding to myocardial infarction, as shown in Figure 1c). In the presence of a scar with a transmurality of 50% or in the significant presence of MVO, late gadolinium enhancement is sufficient to predict functional recovery. Fifty-one percent of pathological cases have persistent MVO. For each image, the contours of the myocardium, as well as the contours of the infarcted area and the MVO areas, if present, are considered as the ground truths. From the contours, specific labels are assigned to each voxel depending on its location: in the background, in the myocardium, in the myocardial infarction area, and in the MVO area, respectively. These masks are provided with the dataset. All input images are provided in NIfTI format, i.e., one file containing all the images covering the left ventricle for one case, plus one NIfTI file with the associated ground truths.
In detail, the body mass index (BMI) is calculated as:

$$\mathrm{BMI} = \frac{\text{weight (kg)}}{\text{height (m)}^{2}}$$

Under the family history of coronary artery disease, we also included all previous acute cardiac events of the patient. The study of the electrocardiogram (ECG) allows classifying the heart attack as STEMI type or not. STEMI stands for ST-elevation myocardial infarction. The ST segment refers to the flat section of an ECG reading and represents the interval between jagged heartbeats. When a person has a heart attack, this segment will no longer be flat but will appear abnormally elevated [17]. STEMI is the most serious type of heart attack, which is characterized by a long interruption of blood supply. A troponin test measures the levels of troponin T or troponin I proteins in the blood. These proteins are released when the heart muscle has been damaged, such as during a myocardial infarction. A value less than 0.1 is considered normal, and a value higher than 0.4 is generally considered pathological [17]. The Killip score, between 1 and 4, is a classification proposed in 1967 based on the physical examination of patients with possible acute myocardial infarction [18]. The different classes of the Killip score are detailed in Table 1.

Table 1. Classes of the Killip score.
Class 1: No clinical signs of heart failure
Class 2: Rales in the lungs, third heart sound (S3), and elevated jugular venous pressure
Class 3: Acute pulmonary edema
Class 4: Cardiogenic shock or arterial hypotension, and evidence of peripheral vasoconstriction

The Killip max corresponds to the maximum score recorded during the management of the patient in the emergency department. The left ventricular ejection fraction (LVEF) is calculated from conventional echocardiography during admission of the patient to the emergency department according to the following equation:

$$\mathrm{LVEF} = \frac{\text{diastolic volume} - \text{systolic volume}}{\text{diastolic volume}} \times 100$$

The diastolic volume (or the systolic volume) corresponds to the maximum (respectively, minimum) volume of the left ventricular cavity.
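The two derived quantities described above, BMI and LVEF, follow standard definitions. A minimal sketch, assuming weight in kilograms, height in meters, and both ventricular volumes in the same unit (the function names are illustrative and not part of the dataset):

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by the square of height (m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def lvef_percent(diastolic_volume, systolic_volume):
    """Left ventricular ejection fraction, in percent.

    diastolic_volume: maximum volume of the left ventricular cavity.
    systolic_volume: minimum volume of the left ventricular cavity.
    Both volumes must be expressed in the same unit.
    """
    if diastolic_volume <= 0:
        raise ValueError("diastolic volume must be positive")
    return 100.0 * (diastolic_volume - systolic_volume) / diastolic_volume


# Illustrative values only, not taken from the dataset.
example_bmi = bmi(80.0, 2.0)
example_lvef = lvef_percent(100.0, 40.0)
```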
Cine-MRI is the gold standard to evaluate LVEF, but we instead provide LVEF from echocardiography because, like the other clinical information, it is available before the MRI exam.
NT-pro-brain natriuretic peptide (NT-proBNP) is measured in venous blood, and it is an indicator for the diagnosis of heart failure [19]. Natriuretic peptides are hormones with vasodilator effects, mainly secreted in the left ventricle as a mechanism to compensate for pressure overload. A value less than 135 is considered normal.
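The reference values quoted above for troponin and NT-proBNP can be turned into simple flags, for instance as a first feature-engineering step for a classifier. This is only a sketch: the function names and the category labels "indeterminate" and "above reference" are assumptions of this example, and the inputs are assumed to be expressed in the same units as the thresholds given in the text.

```python
def troponin_status(value):
    """Flag a troponin value using the thresholds quoted in the text.

    Below 0.1 is considered normal and above 0.4 is generally considered
    pathological; the label for values in between is an assumption.
    """
    if value < 0.1:
        return "normal"
    if value > 0.4:
        return "pathological"
    return "indeterminate"


def nt_probnp_status(value):
    """Flag an NT-proBNP value: below 135 is considered normal.

    The text gives no pathological threshold, so anything else is only
    reported as being above the reference value (assumption).
    """
    return "normal" if value < 135 else "above reference"


# Illustrative values only, not taken from the dataset.
flags = (troponin_status(0.05), troponin_status(0.2), nt_probnp_status(500))
```

Such flags should not be read in isolation: a single abnormal parameter does not establish that an exam is pathological.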
Some cases can be ambiguous, making the classification less obvious. Indeed, patients arriving at an emergency department could have other diseases, leading to a normal DE-MRI but abnormal clinical information. For example, myocarditis could produce abnormal values for some clinical parameters together with a normal DE-MRI. However, even if one parameter is ambiguous, considering all the provided clinical parameters together should avoid any major ambiguity.

Ethics Approval
The overall dataset was created from real clinical exams acquired at the University Hospital of Dijon (France). Acquired data were fully anonymized and handled within the regulations set by the local ethical committee. As the data were collected retrospectively and are completely untraceable (because, with the NIfTI format, all the administrative information included in the header is discarded), no ethics committee approval was required under French law and according to the staff of the ethical committee of the University Hospital of Dijon. The Ethical Committee of the University Hospital of Dijon checked that the created dataset complies with the law. The Python code that converts the series of DICOM images of one exam into one NIfTI file is available upon request.

DE-MRI Acquisition
For the acquisition of the MR images, there was no specific protocol, since a conventional cardiovascular exam was used. This protocol included cine-MRI and DE-MRI. The creation of the Emidec dataset is a retrospective study where we extracted only the short-axis slices of the DE-MRI from the exam.
Regarding the DE-MRI, acquisition was performed on 1.5 T and 3 T magnets (Siemens Medical Solution, Erlangen, Germany) with a phased thoracic coil. All acquisitions were ECG-gated, performed under breath-hold, and obtained 10 min after the injection of a gadolinium-based contrast agent (Gd-DTPA; Magnevist, Schering-AG, Berlin, Germany) at a dose between 0.1 and 0.2 mmol/kg. A T1-weighted phase-sensitive inversion recovery (PSIR) sequence was used (TR = 3.5 ms, TE = 1.42 ms, TI = 400 ms, flip angle = 20°). The resulting MR images consist of a stack of short-axis slices from base to apex of the left ventricle with the following features: pixel spacing between 1.25 × 1.25 mm² and 2 × 2 mm², slice thickness of 8 mm, and distance between slices of 10 mm (i.e., one image every 10 mm). The variation of these parameters across acquisitions on 1.5 T or 3 T magnets provides images with different signal-to-noise ratios. Currently, the scanner field strength is not indicated, but in an updated version of the dataset, this information will be available in a separate text file. Each exam contains 5 to 10 slices covering the left ventricle. Only the slices in which the myocardium is visible are provided; consequently, the outflow tract is sometimes present on the most basal slice. At the level of the apex, the blood pool is always visible on the most apical slice. Only the phase-sensitive images are provided.

Annotation of the MR Images
Concerning the MR exams, the contours were manually drawn for all cases with the QIR software (CASIS, Quetigny, France). The gold standard for contours was obtained through a manual segmentation carried out by two experts. The left ventricular endocardial and epicardial borders, as well as the infarcted area and the MVO areas, if present, were first outlined by the first expert, an experienced user (a cardiologist with 10 years of experience in cardiology and MRI). Then, the second expert (a well-trained biophysicist with 20 years of experience) went through every outline and made changes when necessary. During the contouring of the endocardial border, the papillary muscles were included in the cavity. For the contouring of the epicardial border, the expert only considered the muscle; in particular, the fat was excluded, and the contour was extrapolated at the level of the junction of the two ventricles. As the persistent MVO area is defined as a black area surrounded by a bright area, the experts were requested to draw these black areas as persistent MVO areas while discarding black signal due to noise or artifacts. Examples are shown in Figure 2. Using manual contouring instead of a semi-automatic approach ensures that questionable contours (due to the presence of artifacts, for example) result only from the choice of the experts and do not depend on algorithm settings. We studied the intra- and inter-observer variations on 34 other cases outside this dataset. Using the metric calculation method available on the Emidec website, we found a Dice index for the myocardium and for the MI of 0.84 and 0.76, respectively, for the intra-observer study, and of 0.83 and 0.69, respectively, for the inter-observer study.
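For reference, the Dice index reported above measures the overlap between two segmentations of the same structure. The sketch below implements the standard definition, 2|A ∩ B| / (|A| + |B|), on flattened binary masks; it is not the metric code distributed on the Emidec website, and the function name and the convention for two empty masks are assumptions of this example.

```python
def dice_index(mask_a, mask_b):
    """Dice index between two binary masks given as flat sequences of 0/1."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / (size_a + size_b)


# Toy example: two observers agree on 2 of the 3 voxels each has marked.
observer_1 = [0, 1, 1, 1, 0, 0]
observer_2 = [0, 0, 1, 1, 1, 0]
score = dice_index(observer_1, observer_2)
```

To evaluate one structure from a multi-label mask, each mask would first be binarized for the label of interest (for example, infarct versus everything else) before calling the function.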
The contours are then transcribed into the label field image. From the contours, specific labels are assigned to each voxel depending on its location: in the background, in the myocardium, in the myocardial infarction area, and in the MVO area, respectively. To compensate for the displacement of the heart between slices caused by different breath-holds, the slices are realigned according to the center of gravity of the area defined by the epicardial contour of the left ventricle.
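The realignment step described above can be sketched as follows. This is not the authors' implementation: the text does not state the common reference point or how sub-pixel shifts are handled, so this example assumes that every slice is translated by a whole number of pixels so that the center of gravity of its epicardial area lands on that of the first slice. All function names are illustrative.

```python
def centroid(mask):
    """Center of gravity (row, col) of the nonzero pixels of a 2D mask."""
    points = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    if not points:
        raise ValueError("empty mask")
    n = len(points)
    return sum(r for r, _ in points) / n, sum(c for _, c in points) / n


def shift_slice(image, d_row, d_col, fill=0):
    """Translate a 2D image by whole pixels, padding uncovered pixels."""
    rows, cols = len(image), len(image[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + d_row, c + d_col
            if 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = image[r][c]
    return out


def realign(slices, epicardial_masks, target=None):
    """Shift each slice so that its epicardial centroid lands on `target`.

    Assumption: `target` defaults to the centroid of the first slice, and
    shifts are rounded to whole pixels.
    """
    if target is None:
        target = centroid(epicardial_masks[0])
    aligned = []
    for image, mask in zip(slices, epicardial_masks):
        c_row, c_col = centroid(mask)
        aligned.append(shift_slice(image,
                                   round(target[0] - c_row),
                                   round(target[1] - c_col)))
    return aligned


# Toy example: the second slice is the first one displaced by (1, 1).
mask_0 = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
mask_1 = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
aligned = realign([mask_0, mask_1], [mask_0, mask_1])
```

The same shift must be applied to the image and to its label mask so that the ground truth stays registered with the intensities.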

Recording of the Clinical Information
For each exam, the associated clinical information is provided in a text file with a simple layout. The organization of the text file is detailed in Table 2.
Note 1 to Table 2: the letter N means "normal" and corresponds to normal cases; the letter P is associated with "pathological" cases.
In our opinion, one of the main advantages of our database is the combination of images and clinical data, simulating the classic workflow in emergency services. Indeed, additional clinical information available during the management of the patient in a clinical emergency department should not only reinforce the classification task, but also the segmentation one.