1. Summary
Melittin, the principal bioactive peptide of bee venom, has been widely reported to exert robust cytotoxic and anticancer effects against various malignancies, including triple-negative breast cancer (TNBC), an aggressive breast cancer subtype with limited targeted therapeutic options [1,2,3]. While melittin is characterized as a membrane-active peptide, the vast majority of existing studies have focused on its protein-centric molecular mechanisms, and its impact on the global lipidome and lipid metabolism of cancer cells remains largely uncharacterized [4]. Lipid metabolism reprogramming is a well-established hallmark of cancer, and lipidomic profiling can provide critical insights into tumor progression, therapeutic resistance, and the mechanism of action of anticancer agents [5,6,7,8]. To date, no untargeted lipidomic dataset has been published to characterize melittin-induced lipidome remodeling in TNBC cells, creating a critical gap in understanding the full mechanism of melittin’s anticancer activity.
To address this gap, we generated a comprehensive untargeted lipidomic dataset from the human TNBC cell line MDA-MB-231, with and without melittin exposure. For sample collection, log-phase MDA-MB-231 cells were divided into two groups with 5 biological replicates per group: the treatment group was exposed to 4 μg/mL melittin for 15 min, and the control group was treated with vehicle under the same incubation conditions. We selected 4 μg/mL melittin treatment for 15 min as the experimental condition, which was validated to induce significant early lipid remodeling without massive cell membrane rupture and non-specific lipid leakage, providing a reliable biological boundary for interpreting the lipidomic data. Total lipids were extracted from cell samples using methyl tert-butyl ether-methanol, and quality control (QC) samples were prepared by pooling equal volumes from all experimental samples to monitor instrument stability and data reproducibility throughout the analysis process.
Untargeted lipidomic profiling was performed on a UPLC-Q Exactive HF high-resolution mass spectrometry system, equipped with an ACQUITY UPLC CSH C18 column for chromatographic separation of lipid metabolites. Mass spectrometric data were acquired in both positive and negative electrospray ionization (ESI) modes, with a full scan range of m/z 200–2000 at 70,000 resolution, followed by data-dependent MS/MS acquisition for metabolite structural identification. Raw mass spectrometry data were preprocessed using the XCMS and metaX toolboxes in R software, including peak picking, retention time correction, peak alignment, and normalization via the probabilistic quotient normalization (PQN) method. Lipid metabolites were annotated by matching accurate mass and MS/MS spectra against the KEGG, HMDB, and in-house lipid databases, with a mass tolerance of 10 ppm for MS1. Multivariate statistical analyses, including principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), differential lipid screening (defined as p < 0.05, variable importance in projection (VIP) ≥ 1), and KEGG pathway enrichment analysis, were performed to systematically characterize melittin-induced lipidomic alterations.
The final dataset includes all raw LC-MS/MS data files organized by ionization mode, corresponding MD5 checksum files for data integrity validation, and QC sample data. All raw data have been deposited in the National Genomics Data Center (NGDC) under BioProject accession number PRJCA048975. The full raw dataset can be accessed directly via the permanent NGDC BioProject link:
https://ngdc.cncb.ac.cn/bioproject/browse/PRJCA048975 (accessed on 21 May 2026). No login or access permission is required.
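As an illustration of how the deposited MD5 checksum files might be used after download, the following minimal Python sketch recomputes the digest of each raw file and compares it with the recorded value. It assumes that each checksum file lists one `<md5>  <filename>` pair per line (the layout written by the common `md5sum` tool); the actual layout of the deposited files should be checked before use, and the function names and paths here are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(checksum_file, data_dir):
    """Check every '<md5>  <filename>' line against the files in data_dir.

    Assumption: md5sum-style layout. Returns {filename: True/False},
    where False means the file is missing or its digest differs.
    """
    results = {}
    for line in Path(checksum_file).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # md5sum marks binary mode with '*'
        target = Path(data_dir) / name
        results[name] = target.is_file() and md5_of_file(target) == expected.lower()
    return results
```

Calling `verify_checksums("md5.txt", "raw_data/")` with the downloaded checksum file and directory (both names are placeholders) returns a per-file pass or fail flag, so that corrupted or incomplete downloads can be repeated before reanalysis.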
This dataset will serve as the core data foundation for our ongoing and future research projects focused on elucidating the lipid-mediated anticancer mechanisms of melittin, and for the preclinical development of lipid-targeted therapeutic strategies against TNBC. To date, no peer-reviewed research articles based on this full dataset have been published.
This dataset is the first publicly available lipidomic resource profiling early melittin treatment in TNBC cells. It complements the existing long-term transcriptional regulation studies by providing a comprehensive map of direct membrane lipid alterations, filling the knowledge gap of early lipid-level mechanisms underlying melittin’s anticancer effects. The high-resolution data can be reused to explore novel mechanisms of membrane-active peptides and support the development of lipid-targeted therapeutic strategies for TNBC.
Public release and description of this dataset provide multiple key benefits to the broader scientific community. First, it is the first publicly available lipidomic resource profiling melittin-treated TNBC cells, filling a critical knowledge gap and enabling further in-depth investigation into the lipid-level mechanisms of melittin’s anticancer effects beyond traditional protein-centric views. Second, the high-resolution raw MS data allow independent reanalysis, result validation, and integration with other omics datasets by researchers worldwide, supporting open and reproducible scientific research. Third, the detailed annotated differential lipid profiles can be reused to generate novel hypotheses about lipid pathways involved in cancer cell membrane disruption, ferroptosis regulation, and therapeutic response to melittin. Finally, this dataset can serve as a standardized reference for future lipidomic studies of other natural bioactive peptides or anticancer agents, facilitating cross-study comparisons and accelerating the development of new cancer therapeutics.
3. Methods
3.1. Cell Culture and Sample Preparation
The human triple-negative breast cancer cell line MDA-MB-231 was obtained from Wuhan Punose Life Science Technology Co., Ltd. (Wuhan, China), and its identity was confirmed through STR profiling to ensure the absence of cross-contamination. Dulbecco’s Modified Eagle Medium (DMEM), trypsin, PBS, fetal bovine serum (FBS), and penicillin-streptomycin solution were purchased from Gibco (Waltham, MA, USA). Melittin was acquired from Selleck Chemicals (Houston, TX, USA), while RIPA lysis buffer and protease inhibitors were sourced from Shanghai Yamei Biopharmaceutical Technology Co., Ltd. (Shanghai, China). Cells were cultured in high-glucose DMEM supplemented with 10% FBS and 1% penicillin-streptomycin in a sterile incubator at 37 °C with 5% CO2. When cell confluence reached 80%, cells were washed twice with PBS to remove residual culture medium. Subsequently, 0.25% trypsin was added for digestion, and cells were incubated at 37 °C under continuous observation. Once cells had rounded up and begun to detach, serum-containing complete medium was added to terminate digestion. Cells were gently pipetted to form a single-cell suspension and passaged at a 1:3 ratio into new culture dishes for continued cultivation. Viable cells in exponential growth were harvested for experiments. Melittin powder was dissolved in sterile water and diluted to 4 μg/mL in DMEM before use. Log-phase MDA-MB-231 cells were trypsinized with 0.25% trypsin and seeded at a density of 5 × 10⁵ cells per well in 6-well plates, then incubated at 37 °C with 5% CO2 for 24 h until cell adhesion reached 70–80%. The experiment comprised two groups: (1) the melittin treatment group received 1 mL of melittin diluted to 4 μg/mL in DMEM, while (2) the control group received an equal volume of DMEM. Both groups were then incubated under the same conditions for 15 min.
The treatment condition of 4 μg/mL melittin for 15 min was determined based on systematic pre-experiments. First, dose-gradient experiments (0, 1, 2, 4, 8 μg/mL) were performed to evaluate membrane integrity via Hoechst/PI double staining (Figure 1). The results showed that 4 μg/mL melittin induced significant membrane permeability changes (PI-positive rate ~25%) but did not cause massive cell lysis and necrosis, which could avoid non-specific lipid release from broken cells. Second, time-gradient pre-experiments (5, 15, 30, 60 min) confirmed that 15 min was the earliest time point when melittin induced detectable lipidome remodeling, while longer treatment (≥30 min) would lead to secondary lipid changes caused by cell death. This condition ensures that the detected lipidomic alterations reflect the direct and early regulatory effects of melittin rather than non-specific consequences of cell necrosis.
After treatment, the culture medium was discarded, and cells were washed with PBS. Next, plates were placed on ice, 1 mL of pre-chilled PBS was added, and cells were quickly scraped to one side using a cell scraper. The cell suspension was collected by tilting the plate and centrifuged at 1000 × g for 10 min; the supernatant was removed, and the pellet was retained as the sample. A total of 50 mg (±5 mg) of sample was combined with 200 μL of 50% ice-cold methanol and homogenized using a grinding machine. Subsequently, 600 μL of tert-butyl methyl ether was added, and the mixture was mixed and centrifuged at 3000 × g for 15 min. The supernatant was transferred to a new EP tube and freeze-dried. Then, 200 μL of dichloromethane-methanol (v:v = 1:1) solution was added for reconstitution. Finally, after centrifugation at 3000 × g for 15 min, the supernatant was collected for analysis, and an equal volume from each sample was mixed to form a QC sample.
3.2. Untargeted Lipid Metabolomics Methodology
Chromatographic-Mass Spectrometric Conditions
Chromatographic separation was performed using an ACQUITY UPLC CSH C18 column (100 mm × 2.1 mm, 1.7 µm, Waters). The mobile phase A consisted of acetonitrile-water (v:v = 6:4) containing 10 mmol/L ammonium formate and 0.1% formic acid, while mobile phase B was isopropanol-acetonitrile (v:v = 9:1) with the same additives. The gradient elution program was set as follows: 0–0.4 min, 30% B; 0.4–1.0 min, 30–45% B; 1.0–3.5 min, 45–60% B; 3.5–5.0 min, 60–75% B; 5.0–7.0 min, 75–90% B; 7.0–8.5 min, 90–100% B; 8.5–8.6 min, 100% B; 8.6–8.61 min, 100–30% B; 8.61–10.0 min, 30% B. The flow rate was set to 0.3 mL/min, with an injection volume of 4 μL and a column temperature of 45 °C.
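For readers who wish to relate retention times in the raw files to mobile phase composition, the gradient program can be encoded as a table of breakpoints. The Python sketch below is illustrative only: it assumes linear ramps between the stated time points (the usual behavior of a gradient pump, but not stated explicitly above), and the names `GRADIENT` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in min, % mobile phase B) breakpoints taken from the gradient program.
GRADIENT = [
    (0.0, 30.0), (0.4, 30.0), (1.0, 45.0), (3.5, 60.0), (5.0, 75.0),
    (7.0, 90.0), (8.5, 100.0), (8.6, 100.0), (8.61, 30.0), (10.0, 30.0),
]


def percent_b(t_min):
    """Return % B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-10 min run")
    times = [t for t, _ in GRADIENT]
    # Index of the breakpoint that closes the segment containing t_min.
    i = min(bisect_right(times, t_min), len(GRADIENT) - 1)
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

For example, `percent_b(6.0)` falls on the 5.0–7.0 min ramp from 75% to 90% B and evaluates to 82.5% B under the linear-ramp assumption.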
Mass spectrometry was performed with an electrospray ionization source, with data collected in both positive and negative ion modes. The ion source temperature was set to 350 °C, with a capillary voltage of +3.8 kV in positive mode and −3.4 kV in negative mode. The sheath gas flow rate was set at 50 Arb, with auxiliary gas at 15 Arb and sweep gas at 0 Arb. Data were acquired in full scan and data-dependent acquisition (DDA) modes. In a single acquisition cycle, the full scan range was m/z 200 to 2000, with a resolution of 70,000, automatic gain control (AGC) set to 3 × 10⁶, and a maximum ion injection time (Maximum IT) of 100 ms. The five most intense ions with a response intensity above 100,000 in the full scan were selected for MS/MS scanning, with a resolution of 17,500 and a maximum ion injection time of 50 ms. The dynamic exclusion time was set to 6 s.
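The DDA selection rule described above (top five precursors above an intensity threshold, with 6 s dynamic exclusion) can be sketched in a few lines of Python to help interpret which ions receive MS/MS spectra. This is a simplified model, not the instrument's firmware logic: the 10 ppm window used to match excluded m/z values, the function name, and the data layout are all assumptions introduced for illustration.

```python
def select_precursors(peaks, now_s, exclusion, top_n=5,
                      min_intensity=1e5, exclusion_s=6.0, tol_ppm=10.0):
    """Pick up to top_n precursors from one full scan.

    peaks:     list of (mz, intensity) from the full scan
    now_s:     acquisition time of the scan in seconds
    exclusion: list of (mz, expiry_s), updated in place
    tol_ppm:   assumed m/z window for matching excluded ions
    """
    # Forget exclusion entries whose 6 s window has elapsed.
    exclusion[:] = [(mz, exp) for mz, exp in exclusion if exp > now_s]
    chosen = []
    # Walk the peaks from most to least intense.
    for mz, intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        if len(chosen) == top_n or intensity <= min_intensity:
            break
        if any(abs(mz - ex) / ex * 1e6 <= tol_ppm for ex, _ in exclusion):
            continue  # recently fragmented, skip
        chosen.append(mz)
        exclusion.append((mz, now_s + exclusion_s))
    return chosen
```

In this model an abundant ion is fragmented once and then ignored for 6 s, which allows less intense co-eluting ions to be selected in the following cycles.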
3.3. Information Analysis Workflow
The LC-MS raw data files were converted to mzXML format and processed in R using the XCMS and metaX toolboxes. XCMS was used for peak picking, grouping, retention time correction, secondary peak grouping, isotopic and adduct annotation, and other preprocessing steps. Ions were identified based on retention time (RT) and m/z data. The intensity of each peak was recorded, generating a three-dimensional matrix containing specified peak indices (RT-m/z pairs), sample names (observations), and ion intensity information (variables). Metabolites were annotated using the KEGG and HMDB databases by matching the accurate mass (m/z) of each ion against the database values. If the mass deviation between the observed and database values was less than 10 ppm, the metabolite was annotated, and the molecular formula was further identified and validated through isotopic distribution measurements. Additionally, an internal database was used to validate the metabolite identifications. Statistical analysis was primarily performed in R (version 4.0), following these steps: data filtering (more than 80% missing data in samples or more than 50% missing data in QC samples), data imputation (default KNN method), and data normalization (default PQN method). Clustering heatmaps were generated using the pheatmap package in R, while PCA and significant differential metabolite analysis were conducted using the metaX package. PLS-DA was carried out using the ropls package, which was also used to calculate variable importance in projection (VIP) values. Correlation analysis was based on the Pearson correlation coefficient computed in R. Significant differential metabolites were defined by VIP ≥ 1 from the PLS-DA model and a t-test p value < 0.05.
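Two steps of this workflow, the 10 ppm mass-tolerance match and PQN normalization, reduce to short calculations. The Python sketch below illustrates them outside the R environment used by the authors. It is not the metaX implementation: the PQN variant shown takes the feature-wise median across all samples as the reference profile and omits any preliminary total-intensity scaling, both of which are assumptions, and the database entries in the usage note are placeholders rather than real annotations.

```python
from statistics import median


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_candidates(observed_mz, database, tol_ppm=10.0):
    """Return database entries whose m/z lies within tol_ppm of the ion.

    database: {name: theoretical m/z} (hypothetical layout)
    """
    return [name for name, mz in database.items()
            if abs(ppm_error(observed_mz, mz)) < tol_ppm]


def pqn_normalize(matrix):
    """Probabilistic quotient normalization of a samples x features matrix.

    Assumptions: all intensities are positive (imputed beforehand) and the
    reference profile is the feature-wise median over all samples.
    """
    n_features = len(matrix[0])
    reference = [median(row[j] for row in matrix) for j in range(n_features)]
    normalized = []
    for row in matrix:
        # The median quotient estimates the sample's overall dilution factor.
        quotient = median(x / r for x, r in zip(row, reference))
        normalized.append([x / quotient for x in row])
    return normalized
```

With a placeholder database `{"candidate_A": 760.5851, "candidate_B": 760.6000}`, an observed ion at m/z 760.5880 deviates by about 3.8 ppm from the first entry and about −15.8 ppm from the second, so only `candidate_A` is returned at the 10 ppm threshold.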
Differential pathway enrichment analysis of KEGG pathways was conducted based on hypergeometric testing, with functional entries having a p-value < 0.05 considered significantly enriched among differential metabolites. Metabolite set enrichment analysis was performed with GSEA software (v4.1.0), where KEGG pathways that met the criteria |NES| > 1, NOM p value < 0.05, and FDR < 0.25 were considered significantly different between the two groups. Network diagrams were constructed based on the pathways in which the metabolites were involved.
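The hypergeometric test behind the pathway enrichment step can be written out explicitly. The Python sketch below computes the upper-tail probability of observing at least k differential metabolites in a pathway; the function and argument names are illustrative, and the choice of background set (all annotated metabolites) is an assumption, since the background is not specified above.

```python
from math import comb


def hypergeom_pvalue(k, K, n, N):
    """Upper-tail hypergeometric p value, P(X >= k).

    N: annotated metabolites in the background (assumed definition)
    K: background metabolites that belong to the pathway
    n: differential metabolites
    k: differential metabolites that belong to the pathway
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, upper + 1))
    return favourable / comb(N, n)
```

As a small check, with N = 10, K = 4, n = 3, and k = 2, the probability is (36 + 4)/120 = 1/3, well above the 0.05 threshold, so such a pathway would not be called enriched.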
3.4. XCMS Main Parameters
Peak extraction was primarily achieved using the open-source software XCMS (version 3.22.0), which includes peak alignment, extraction, normalization, deconvolution, and compound identification steps. The main parameter settings for peak extraction and identification are detailed in Table 2.
3.5. metaX Main Parameters
Post-processing of extracted peaks primarily involved primary identification and quantification analysis of metabolites. The parameters for primary identification in metaX are presented in Table 3, and the secondary identification parameters are shown in Table 4.