Lipid Profiles of Human Brain Tumors Obtained by High-Resolution Negative Mode Ambient Mass Spectrometry

Abstract: Alterations in cell metabolism, including changes in lipid composition occurring during malignancy, are well characterized for various tumor types. However, a significant share of the studies that deal with brain tumors have been performed using cell cultures and animal models. Here, we present a dataset of 124 high-resolution negative ionization mode lipid profiles of human brain tumors resected during neurosurgery. The dataset is supplemented with 38 non-tumor pathological brain tissue samples resected during elective surgery. The altered lipid composition of brain tumors makes it possible to discriminate between malignant and healthy tissues using ambient mass spectrometry. In addition, the collection of clinical samples allows the metabolic alteration patterns seen in animal or in vitro models to be compared with those of natural tumor samples ex vivo. The presented dataset is intended to serve as sample data for bioinformaticians testing data analysis techniques on ambient mass spectrometry profiles, or as a source of clinically relevant data for lipidomic research in oncology. Dataset: available in the MetaboLights repository. The package was prepared with the ISACreator software and is accessible at https://www.ebi.ac.uk/metabolights/MTBLS1558/. Dataset License: CC-BY.


Summary
Altered energy metabolism is a well-known hallmark of cancer that leads to substantial changes in cell lipid composition [1]. Numerous lipid species become dysregulated in various cancer types [2]. However, at present, only some generic trends, such as the upregulation of mono- and diunsaturated phosphatidylcholines, are observed across various diagnoses, in particular in glioblastoma multiforme [2,3], which motivates further investigation of the lipid composition alterations that occur during malignancy. In brain tumors, as in many other proliferating cells, anaerobic glycolysis becomes the major pathway of glucose metabolism, which is called the Warburg effect [4]. The high rate of proliferation specific to malignant tissues requires a considerable amount of biomass components to support the growth and formation of new cells, and therefore promotes de novo lipogenesis, especially the synthesis of phospholipids to build cell membranes and of triglycerides required for energy storage and production [3,5,6]. The lipid composition of cancer cells differs from that of healthy cells due to many factors. The Warburg effect leads to abnormally high levels of NADH, which promotes de novo fatty acid synthesis [7,8]. On the other hand, an inhibition of aerobic glycolysis, caused by cancer tissue hypovascularity, triggers the beta-oxidation pathway of long-chain fatty acids [9]. During this process, two-carbon units are cleaved from the aliphatic chain, yielding acetyl-CoA, which is utilized to produce the ATP required for cell metabolism. Eventually, in combination with de novo fatty acid synthesis, beta-oxidation leads to an increased ratio of saturated to unsaturated fatty acid residues in cancer cells, affecting total lipid composition.
The notable change in the lipid composition of cancer tissues compared to healthy ones is of interest not only for investigating carcinogenesis, but also for discriminating pathological from healthy tissues in the clinic, which is especially important in neuro-oncology [10,11]. The accuracy of tumor border determination is crucial, as the volume of tumor resection determines the operation outcome, but excessive resection of healthy brain tissue is unacceptable. Mass spectrometric identification of tumor tissues based on their lipid composition is an emerging technique among the variety of navigation techniques in neurosurgery [12][13][14]. The potential for intraoperative application imposes some limitations on the mass spectrometry methods used, the most important being the time required for analysis. Ambient ionization mass spectrometry, which is intended to analyze samples without any sample preparation or preliminary separation, substantially reduces the duration of an individual analysis, so many efforts are being made to implement it in neurosurgery [15][16][17][18][19][20]. Because no separation is performed, hundreds of compounds are represented in the mass spectra simultaneously, creating a molecular profile of the tissue. The analysis of molecular profiles is challenged by the complexity of the data, the matrix effect, and possible signal instability. To overcome such complications, it is usually suggested to apply dedicated algorithms for data evaluation, preprocessing, and further analysis using machine learning [21][22][23][24][25][26][27][28].

Data Description
The dataset contains 162 high-resolution mass spectra obtained in negative mode. In this assay, samples were annotated with regard to three factors: patient gender, year of birth, and disease diagnosis. Specifically, the samples come from 36 women and 34 men. The oldest patient was born in 1942, while the youngest one was born in 2010 (mean age 48.8, median age 54). The assay data are arranged in 166 files, as described in Table 1.
Every tissue sample obtained during neurosurgery was divided into two parts so that a histological annotation could be obtained for each sample. The resulting histochemical conclusions showed alterations typical of different oncological diseases in some samples and no such alterations in the remaining samples, which were considered non-tumor pathology. The results of the histochemical evaluation are included in the dataset together with the relevant patient data. The distribution of samples and patients over diagnoses is shown in Table 2. This dataset was used to show that tissues with different types of pathology can be reliably distinguished by the analysis of their mass spectrometric profiles (Figure 1), which supports the development of algorithms for tumor boundary detection.
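Because the class sizes are uneven (124 tumor and 38 non-tumor profiles, as stated in the abstract), classifiers trained on these profiles usually need some form of reweighting. The following is a minimal sketch of inverse-frequency class weights, using the same formula as the "balanced" heuristic found in common machine learning libraries. The label strings and the function name are illustrative assumptions and are not part of the dataset.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}

# Class sizes taken from the abstract; the label names are hypothetical.
labels = ["tumor"] * 124 + ["non_tumor"] * 38
weights = balanced_class_weights(labels)

for cls, w in weights.items():
    print(f"{cls}: weight {w:.3f}")
```

With these counts, the minority non-tumor class receives roughly three times the weight of the tumor class, so that both classes contribute equally to a weighted loss.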
There are 4 metadata fields in CDF-files filled as below:

Samples
The samples were provided by the N.N. Burdenko National Scientific and Practical Center for Neurosurgery (NSPCN) and analyzed under an approved N.N. Burdenko NSPCN Institutional Review Board protocol in accordance with the Helsinki Declaration as revised in 2013 (Order 40 of 12 April 2016 as amended by Order 131 of 17 July 2018). Brain tumor tissues were resected during elective surgery. All tissue samples in this dataset are related to primary tumors resected during the first course of surgical treatment. Patients with any other tumor types in their anamnesis were excluded from the study. Non-tumor pathological tissues were resected in the course of surgery for drug-resistant epilepsy. Signed informed consent, explicitly noting that all removed tissues could be used for further research, was obtained from all patients. Every dissected tissue was anonymized and split into two parts. A professional pathologist examined the first part, and the second one was placed in normal saline, frozen, and stored at −80 °C until analysis.

Mass Spectrometry
The samples were analyzed using a spray-from-tissue ambient ionization mass spectrometry approach [29], which provides lipid profiles of the analyzed tissue similar to those of other ambient ionization techniques (ICE, DESI, PESI, etc.) [10,15,16,30]. A freshly thawed tissue sample was cut into pieces of approximately 2 mm³, which were placed on the tip of a 30 × 0.6 mm injection needle. High voltage (6 ± 1 kV) and solvent flow (4 ± 1 µL/min) were then applied through the needle to obtain a stable ion current. HPLC grade methanol supplemented with 0.1% formic acid (optionally supplemented with 30% HPLC grade chloroform, see detailed data description) was used as the extraction solvent. Solvents and formic acid were obtained from Merck (Merck KGaA, Darmstadt, Germany).
Mass spectra were acquired on a Thermo Finnigan LTQ FT Ultra mass spectrometer equipped with a 7 T superconducting magnet and on a Thermo LTQ XL Orbitrap ETD mass spectrometer (Thermo Fisher Scientific, San Jose, CA, USA). Samples were analyzed in negative mode in the ranges of m/z 100-1300 (resolution 150,000 FWHM at m/z 400) and m/z 120-2000 (resolution 30,000 FWHM at m/z 400) on the LTQ FT Ultra and the LTQ XL Orbitrap, respectively.
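The resolutions quoted at m/z 400 translate into peak widths through R = (m/z) / FWHM. The sketch below estimates the peak width at other m/z values, assuming the textbook scaling of resolving power at a fixed transient length (inversely proportional to m/z for FT-ICR, inversely proportional to the square root of m/z for Orbitrap analyzers). The scaling laws and function names are assumptions made for illustration and are not taken from the dataset description.

```python
import math

REF_MZ = 400.0  # m/z at which both resolutions are quoted in the text

def resolving_power(mz, r_ref, analyzer):
    """Estimate resolving power at `mz` from the value quoted at m/z 400.

    Assumed scaling at fixed transient length:
    FT-ICR ~ 1/(m/z), Orbitrap ~ 1/sqrt(m/z).
    """
    if analyzer == "fticr":
        return r_ref * REF_MZ / mz
    if analyzer == "orbitrap":
        return r_ref * math.sqrt(REF_MZ / mz)
    raise ValueError(f"unknown analyzer: {analyzer}")

def peak_fwhm(mz, r_ref, analyzer):
    """Peak width (FWHM, in m/z units) implied by R = (m/z) / FWHM."""
    return mz / resolving_power(mz, r_ref, analyzer)

if __name__ == "__main__":
    for mz in (400.0, 800.0):
        ft = peak_fwhm(mz, 150_000, "fticr")
        orbi = peak_fwhm(mz, 30_000, "orbitrap")
        print(f"m/z {mz:.0f}: FT-ICR FWHM {ft:.4f}, Orbitrap FWHM {orbi:.4f}")
```

Such estimates can help in choosing a bin width or matching tolerance when spectra from the two instruments are compared on a common m/z axis.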

Data Transformation
During dataset preparation, the mass spectrometric source files were converted from Thermo RAW format to NetCDF format using software developed in our laboratory. The validity of this conversion was verified by reading the spectra with the MALDIquant ver.1.19.3 [31], ncdf4 ver.1.17, and RNetCDF ver.2.1-1 [32] R packages and with the ncread function in Matlab. We have committed a code patch to MZmine 2 to support our version of the CDF file. That code is available on GitHub for MZmine 2 [33] and for the upcoming release of MZmine 3.
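NetCDF mass spectrometry files that follow the common ANDI-MS convention store all scans as flat arrays (`mass_values`, `intensity_values`) together with a `scan_index` array holding the starting offset of each scan. Since the files here are described as a custom CDF variant, whether they use exactly these variable names is an assumption. The sketch below shows how such flat arrays, once read with any NetCDF reader, could be split into per-scan spectra; the toy values are invented for illustration.

```python
def split_scans(mass_values, intensity_values, scan_index):
    """Split flat ANDI-MS style arrays into a list of per-scan spectra.

    scan_index[i] is the offset of the first point of scan i; the last
    scan runs to the end of the arrays. The variable names follow the
    ANDI-MS convention and are an assumption for this dataset.
    """
    if len(mass_values) != len(intensity_values):
        raise ValueError("mass and intensity arrays differ in length")
    bounds = list(scan_index) + [len(mass_values)]
    scans = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        scans.append(list(zip(mass_values[start:stop],
                              intensity_values[start:stop])))
    return scans

# Toy example: two scans with three and two points (invented values).
mz = [255.23, 281.25, 885.55, 281.25, 885.55]
inten = [1.0e4, 3.5e4, 8.0e4, 3.1e4, 7.6e4]
scans = split_scans(mz, inten, scan_index=[0, 3])

for i, scan in enumerate(scans):
    print(f"scan {i}: {len(scan)} points")
```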

User Notes
The presented dataset was used to develop a novel, simple feature selection algorithm for molecular profiles [24], which extracts stably detectable ions from the molecular profiles of glial tumors and selects features that agree with the results of other experimental techniques. The molecular signatures determined from the presented dataset were used to demonstrate the possibility of discriminating and identifying various pathological tissue types obtained during elective surgery. It was shown that the molecular profiles of unmodified and damaged brain tissue are separable: various necrotized (necrotized tumor, necrotic tissue with necrotized vessels, necrotic tissue with tumor strain) and tumor (histologically pure tumor, tumor with necrosis, tumor lesions) tissues could be differentiated from each other as well as from the tumor boundary tissues [25]. The same data were further used to create classifiers for the rapid identification of various tumors (glioblastoma, astrocytoma, meningioma) based on ambient mass spectrometry [34][35][36], which has become the basis for developing new ambient ionization techniques designed for clinical application [10]. In addition, the presented dataset, as an example of actual data obtained in a clinic, was used as a model for developing an interactive and automated tool for evaluating the stability and reproducibility of mass spectra [21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36][37], and for the unification of representations of high- and low-resolution mass spectra for further clinical implementation [23].
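The notion of "stably detectable ions" can be illustrated with a simple presence filter: peaks are assigned to fixed-width m/z bins, and a bin is kept only if it is detected in at least a given fraction of scans. This is a generic sketch and not the algorithm of [24]; the bin width, the threshold, and the example peak lists are arbitrary illustrative values.

```python
from collections import Counter

def stable_ions(scans, bin_width=0.01, min_fraction=0.8):
    """Return sorted m/z bin centers present in >= min_fraction of scans.

    `scans` is a list of scans, each a list of peak m/z values.
    Peaks are assigned to fixed-width bins; a bin counts once per scan.
    """
    if not scans:
        return []
    presence = Counter()
    for peaks in scans:
        bins = {round(mz / bin_width) for mz in peaks}
        presence.update(bins)
    needed = min_fraction * len(scans)
    return sorted(b * bin_width for b, n in presence.items() if n >= needed)

# Invented peak lists for three scans of one sample.
example_scans = [
    [281.248, 885.551, 600.100],
    [281.252, 885.549],
    [281.251, 885.550, 700.300],
]
for mz in stable_ions(example_scans):
    print(f"stable ion near m/z {mz:.2f}")
```

Fixed bins are the simplest choice but can split a peak that lies close to a bin edge; tolerance-based clustering of peak positions is a common alternative.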
The dataset is strongly unbalanced towards malignant tissue samples because it reflects the real clinical situation: unmodified brain tissue samples are very difficult to obtain, and a controlled experiment is not possible because the work depends on the patients available at the time. On the other hand, it is an ideal case from the machine learning point of view because it combines the difficulties connected with the analysis of real clinical data, as it presents typical intergroup variability for groups defined by patient, diagnosis, sample, or tissue type. The dataset can be very useful for tailoring anomaly detection and unsupervised learning methods to brain tumor clinical applications. Another problem that mass spectrometry data as a whole, and this dataset in particular, present for the application of machine learning techniques is the wide geometry of the dataset, in which the number of features per sample considerably exceeds the number of samples.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are openly available in the MetaboLights repository at https://www.ebi.ac.uk/metabolights/MTBLS1558/ (accessed on 9 December 2021). The software developed for this study is available from the corresponding author upon reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.