1. Introduction
Bovine mastitis remains one of the most prevalent and economically damaging diseases affecting dairy herds worldwide. The condition exacts a heavy toll on production efficiency, primarily through reduced milk yield, impaired milk quality, extended calving intervals, compromised fertility, and shortened productive lifespans. It has been estimated that daily milk yield declines by 1.0–2.5 kg as early as two weeks before clinical onset, culminating in total lactation losses of 110–552 kg—deficits that are both substantial and irreversible [
1]. Somatic cell counts (SCCs) show a strong genetic correlation with mastitis incidence, with reported estimates ranging from 0.59 to 0.85 [
2]. Consequently, SCC is widely adopted not only as a reliable proxy for udder health and mastitis risk, but also as a key selection trait in milk quality monitoring and genetic improvement programmes [
3,
4]. In practice, SCC < 500,000 cells mL⁻¹ is generally considered indicative of a healthy mammary gland, SCC > 1,000,000 cells mL⁻¹ signals clinical mastitis, and intermediate values denote subclinical intramammary infection [
5]. The intramammary ecosystem harbours a diverse microbial community that, under physiological conditions, maintains a balanced state; this microbiota—particularly the commensal populations—contributes to the maintenance of local immune homeostasis [
6]. Disruption of this equilibrium by invading pathogens triggers an inflammatory response, which is accompanied by a marked influx of somatic cells into mammary tissue. The resulting elevation in SCC therefore reflects the convergence of local mammary immunity and systemic immune status, representing a key host defence mechanism against microbial challenge. Understanding how mammary immune regulation differs across SCC strata at the molecular level is of fundamental importance for improving udder health and enhancing production performance. Although mammary tissue represents an ideal material for studying udder health traits, its large-scale acquisition under practical production conditions remains challenging. In contrast, blood samples are readily accessible and enable reproducible analyses, with their transcriptomic information partially reflecting the systemic metabolic, immune, and endocrine status. Moreover, circulating blood leukocytes play a critical role in the onset, progression, and resolution of bovine mastitis [
7], offering a feasible avenue for investigating the regulatory mechanisms underlying milk composition and health traits.
The Xinjiang Brown cow is a Chinese dual-purpose breed valued for its adaptation to cold climates, tolerance of coarse feedstuffs, and high-quality meat and milk. Our previous investigations in this population have identified several candidate genes associated with mastitis resistance. For instance, Zhou et al. [
8] employed a genome-wide association study (GWAS) to detect three single-nucleotide polymorphisms (SNPs) significantly associated with somatic cell score (SCS) in Xinjiang Brown cattle, located in the vicinity of
LOC104969301,
FHIT and
DYRK2. Subsequent pyrosequencing analysis revealed that differential methylation of the
FHIT promoter modulates gene expression, thereby influencing mastitis susceptibility; specifically, decreased methylation levels in the
FHIT promoter region were associated with enhanced resistance to mastitis in this breed [
9]. In parallel, Wang et al. [
10] proposed that
TRAPPC9 and
CD4 are negatively regulated in relation to mastitis occurrence and identified these genes as two novel DNA methylation targets implicated in mastitis susceptibility in Xinjiang Brown cattle. Collectively, these findings have contributed candidate molecular markers for genetic improvement programmes aimed at enhancing mastitis resistance in this breed.
RNA sequencing (RNA-Seq) based on high-throughput platforms has been widely applied to investigate economically important traits in livestock and poultry. However, second-generation transcriptome sequencing technologies are not always entirely accurate during transcript assembly, and their short read lengths typically preclude the recovery of full-length transcripts. To overcome these limitations, full-length transcriptome sequencing based on single-molecule real-time sequencing technologies has emerged [
11]. Compared with second-generation approaches, this technology offers longer reads and higher throughput; it eliminates the need for RNA fragmentation during library preparation and circumvents transcript assembly during data analysis, thereby enabling the capture of complete, full-length transcript isoforms [
12]. These features confer substantial advantages in resolving transcript diversity, identifying novel transcripts, and deciphering the architecture of complex immune-related genes. In the present study, we employed Oxford Nanopore Technologies (ONT) long-read transcriptome sequencing to systematically compare gene expression profiles in the peripheral blood of Xinjiang Brown cattle stratified by SCC. Our objective was to identify key candidate genes associated with differential SCC levels, thereby providing a theoretical foundation and data resource for elucidating the molecular mechanisms underlying mammary health traits and for advancing health-oriented breeding strategies in Xinjiang Brown cattle (
Figure 1).
2. Materials and Methods
2.1. Sample Collection
All Xinjiang Brown cattle used in this study were obtained from the Xinjiang Yili Brown Cattle Breeding Farm in Xinjiang, China. The selected individuals were lactating multiparous cows in mid-to-late lactation and were maintained under uniform feeding and management conditions. Blood samples were collected in 10 mL EDTA tubes, and milk samples were collected in 50 mL centrifuge tubes. Following collection, blood samples were centrifuged at 3500 rpm for 15 min. The buffy coat was immediately transferred to new tubes, to which 1 mL of Trizol (Thermo Fisher, Waltham, MA, USA) was added. The samples were then promptly transported on dry ice to the laboratory and stored at −80 °C for subsequent RNA extraction. Milk samples were collected from all four mammary quarters, thoroughly mixed, and then subjected to compositional analysis and SCC determination. Based on SCC values, individuals were classified into two groups: low-SCC (≤200,000 cells mL⁻¹) and high-SCC (≥1,000,000 cells mL⁻¹). Six cows were selected in total, with three animals per group.
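The grouping rule described above can be illustrated with a short sketch. The function name `classify_scc` is hypothetical, and returning `None` for animals that fall between the two cut-offs is an assumption, since the text does not say how intermediate values were handled.

```python
from typing import Optional

LOW_SCC_MAX = 200_000      # cells per mL, upper bound of the low-SCC group
HIGH_SCC_MIN = 1_000_000   # cells per mL, lower bound of the high-SCC group


def classify_scc(scc_cells_per_ml: float) -> Optional[str]:
    """Return 'low', 'high', or None for values between the two cut-offs."""
    if scc_cells_per_ml < 0:
        raise ValueError("SCC cannot be negative")
    if scc_cells_per_ml <= LOW_SCC_MAX:
        return "low"
    if scc_cells_per_ml >= HIGH_SCC_MIN:
        return "high"
    # Assumption: intermediate animals are not assigned to either group.
    return None
```

Both cut-offs are inclusive, matching the "≤" and "≥" signs in the group definitions.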
2.2. Library Construction and Sequencing
Total RNA was extracted from peripheral blood leukocytes and assessed for quality prior to library preparation. Full-length cDNA was synthesized using a high-fidelity reverse transcriptase, followed by 14 cycles of PCR amplification with LongAmp Taq (NEB, Ipswich, MA, USA) and barcoded primers containing specific adapters (cDNA-PCR Sequencing Kit, SQK-LSK110 and EXP-PCB096; Oxford Nanopore Technologies, Oxford, UK). The resulting amplicons were ligated to Oxford Nanopore sequencing adapters using T4 DNA ligase (NEB) and subsequently purified with Agencourt XP beads (Beckman Coulter, Brea, CA, USA). The final purified library was loaded onto FLO-MIN109 flow cells and sequenced on the PromethION platform (Oxford Nanopore Technologies).
2.3. Data Quality Control and Read Mapping
Raw sequencing data generated by ONT were subjected to quality control procedures. Reads with an average quality score below 6 or a length shorter than 350 bp were discarded. Ribosomal RNA (rRNA) reads were identified and removed by aligning all reads against an rRNA database. Following initial quality filtering, full-length non-chimeric (FLNC) transcripts were identified based on the presence of adapter sequences at both ends of each read. Reads containing both the 5′ primer (TTTCTGTTGGTGCTGATATTGC) and the 3′ primer (GAAGATAGAGCGACAGGCAAGT) at their termini were classified as full-length transcripts. During full-length transcript sequencing, the stable polyA structure at the 3′ end ensures relatively high integrity of this region; however, degradation may occur at the 5′ end, leading to variations in 5′ termini among different copies of the same transcript. This heterogeneity can result in the assignment of these copies to distinct clusters rather than a single consensus sequence, thereby generating redundancy in the transcript dataset. To address this redundancy, all FLNC reads were aligned to the bovine reference genome (
Bos taurus, GCF_002263795.3_ARS_UCD2.0) using minimap2 (version 2.16) [
13]. The resulting alignments were processed to remove redundant sequences using the cDNA_Cupcake package (version 29.0.0). Sequences with identity <0.9 or coverage <0.85 were filtered out to ensure the accuracy and reliability of downstream analyses.
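The three filters in this subsection (read-level quality control, full-length classification by primer presence, and alignment-level identity and coverage cut-offs) can be sketched as follows. This is a simplified illustration, not the pipeline actually used: the function names, the 100-base search window, the exact-match primer search, and the assumption that a primer at the far end of a read appears as its reverse complement are all hypothetical choices made for the example.

```python
FIVE_PRIME = "TTTCTGTTGGTGCTGATATTGC"    # 5' primer quoted in the text
THREE_PRIME = "GAAGATAGAGCGACAGGCAAGT"   # 3' primer quoted in the text

MIN_MEAN_QUALITY = 6    # reads below this mean quality are discarded
MIN_LENGTH = 350        # reads shorter than this (bp) are discarded
MIN_IDENTITY = 0.9      # alignments below this identity are filtered out
MIN_COVERAGE = 0.85     # alignments below this coverage are filtered out
WINDOW = 100            # hypothetical: bases searched at each read end

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def passes_read_qc(mean_quality: float, length: int) -> bool:
    """Keep reads with mean quality >= 6 and length >= 350 bp."""
    return mean_quality >= MIN_MEAN_QUALITY and length >= MIN_LENGTH


def is_full_length(seq: str, window: int = WINDOW) -> bool:
    """True if both primers are found, one near each end, on either strand.

    Exact matching is a simplification; real tools tolerate mismatches.
    """
    head, tail = seq[:window], seq[-window:]
    forward = FIVE_PRIME in head and reverse_complement(THREE_PRIME) in tail
    reverse = THREE_PRIME in head and reverse_complement(FIVE_PRIME) in tail
    return forward or reverse


def passes_alignment_filter(identity: float, coverage: float) -> bool:
    """Keep alignments with identity >= 0.9 and coverage >= 0.85."""
    return identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE
```

A read would be retained for downstream analysis only if all three checks pass; the order in which they are applied here simply follows the order in the text.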
2.4. Identification of Alternative Splicing
Alternative splicing (AS) events were identified using Astalavista software (version 3.2-0) [
14]. The analysis encompassed five major types of splicing events: alternative 3′ splice sites (A3SSs), alternative 5′ splice sites (A5SSs), exon skipping (SE), intron retention (RI) and mutually exclusive exons (MXEs). Differential AS events were defined by thresholds of |ΔPSI| > 0.1 and p < 0.01, where PSI (percent spliced in) represents the inclusion level of a given exon or splice junction. Genes harbouring such differential splicing events were designated as differentially spliced genes (DSGs).
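As a minimal sketch of the thresholding step, PSI can be computed as the fraction of reads supporting inclusion, and an event flagged when both cut-offs are met. The read-count definition of PSI and the function names are assumptions for illustration; the text does not specify how PSI or the p-value was computed.

```python
def psi(inclusion_reads: int, exclusion_reads: int) -> float:
    """Percent spliced in, expressed as a fraction between 0 and 1."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads support either isoform")
    return inclusion_reads / total


def is_differential_event(psi_high: float, psi_low: float, p_value: float,
                          min_delta: float = 0.1, max_p: float = 0.01) -> bool:
    """Apply the |delta PSI| > 0.1 and p < 0.01 thresholds from the text."""
    return abs(psi_high - psi_low) > min_delta and p_value < max_p
```

For example, an event with PSI of 0.5 in one group and 0.3 in the other (ΔPSI = 0.2) and p = 0.005 would be flagged, whereas the same ΔPSI with p = 0.05 would not.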
2.5. Differential Expression Analysis of Genes and Transcripts
Full-length reads were quantified by alignment to the bovine reference genome. Following alignment, read counts for genes and transcripts were obtained and subsequently normalized for downstream visualization. Differential expression analysis was performed using the negative binomial regression algorithm implemented in the DESeq2 package (version 1.40.2). Genes and transcripts with p < 0.01 and a fold change ≥ 1.5 were considered significantly differentially expressed.
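The significance criteria can be expressed as a small filter applied to a table of results. The sketch below assumes that the fold-change threshold is applied symmetrically (up- or downregulation) and that fold changes are supplied on the log2 scale, as in the `log2FoldChange` column reported by DESeq2; the function name is hypothetical.

```python
import math

MIN_FOLD_CHANGE = 1.5   # fold-change threshold from the text
MAX_P = 0.01            # p-value threshold from the text


def is_differentially_expressed(log2_fold_change: float, p_value: float) -> bool:
    """True if |fold change| >= 1.5 (in either direction) and p < 0.01."""
    passes_fold = abs(log2_fold_change) >= math.log2(MIN_FOLD_CHANGE)
    return passes_fold and p_value < MAX_P
```

On the log2 scale a fold change of 1.5 corresponds to roughly 0.585, so a gene with a log2 fold change of 0.3 would be excluded regardless of its p-value.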
2.6. Functional Annotation and Enrichment Analysis of DEGs and DETs
Functional annotation and pathway enrichment analysis were performed on the sets of differentially expressed genes (DEGs) and differentially expressed transcripts (DETs) identified from the comparison of high- versus low-SCC groups in Xinjiang Brown cattle. Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted using the DAVID Bioinformatics Resources (
https://davidbioinformatics.nih.gov/tools.jsp, accessed on 16 March 2026). To investigate protein–protein interaction (PPI) networks, the DEGs were mapped to the STRING database (
https://cn.string-db.org/, accessed on 16 March 2026) using the BLASTx (version 2.15.0) algorithm to identify interactions with relevant orthologous species. The resulting interaction data were then imported into Cytoscape (version 3.9.1) for visualization and network analysis. Hub genes within the network were identified based on the Degree algorithm, enabling the prioritization of key nodes and providing insights into the complex relationships and potential biological roles of the candidate genes.
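Degree-based hub ranking amounts to counting, for each node, the number of distinct interaction partners. The sketch below illustrates the idea on an undirected edge list; the function name and the placeholder gene labels in the usage note are hypothetical and do not come from the study's network.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def rank_by_degree(edges: Iterable[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Return (node, degree) pairs sorted by decreasing degree, then name."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    ranked = ((node, len(partners)) for node, partners in neighbours.items())
    return sorted(ranked, key=lambda item: (-item[1], item[0]))
```

With the toy edges `[("geneA", "geneB"), ("geneA", "geneC"), ("geneA", "geneD"), ("geneB", "geneC")]`, `geneA` ranks first with a degree of 3. Using sets means that duplicated edges are counted only once.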
2.7. Cell Culture
Mac-T cells used in this study were obtained from Qingqi Biotechnology Development Co., Ltd. (Shanghai, China). For revival, cryovials containing the cells were retrieved from liquid-nitrogen storage and immediately immersed in a 37 °C water bath with gentle agitation until completely thawed. All subsequent steps were performed in a laminar flow hood. The thawed cell suspension was transferred to a centrifuge tube, diluted with at least five volumes of culture medium and mixed thoroughly. Cells were pelleted by centrifugation at 1000 rpm for 10 min, after which the supernatant was discarded. The cell pellet was resuspended in fresh culture medium and seeded into culture dishes at a split ratio of 1:10–1:15, achieving a final cell density of approximately 1–5 × 10⁵ cells mL⁻¹. Cells were maintained at 37 °C in a humidified incubator with 5% CO2.
2.8. Establishment of an In Vitro Cellular Inflammation Model
Cellular inflammation was induced following an established protocol developed by our research group. Briefly, phosphate-buffered saline (PBS) was sterilized and used to dilute lipopolysaccharide (LPS) to a final concentration of 10 ng/μL. The LPS working solution was then applied to Mac-T cells cultured in 12-well plates at 70–80% confluence. Cells were incubated with LPS for 3 h at 37 °C in a humidified atmosphere containing 5% CO2 prior to subsequent analyses.
2.9. RNA Extraction from Cultured Cells
Following LPS treatment for 3 h, the culture medium was removed and cells were washed three times with PBS. Subsequently, 1 mL of Trizol reagent was added to each well, and cells were lysed by repeated pipetting until the cell lysate appeared homogeneous. The resulting lysate was transferred to a 2 mL centrifuge tube. Total RNA was then extracted according to the manufacturer’s instructions.
2.10. RT-qPCR
To investigate the inflammatory expression pattern of
CXCL2, its transcript levels were examined both in peripheral blood of Xinjiang Brown cattle with high- and low-SCC and in an LPS-induced cellular inflammation model using RT-qPCR (
Table 1). Total RNA was reverse transcribed into cDNA using the PrimeScript™ RT Reagent Kit with gDNA Eraser (Perfect Real Time, TaKaRa, Kusatsu, Japan) according to the manufacturer’s instructions. Quantitative PCR was subsequently performed using the TB Green® Premix Ex Taq™ II (Tli RNaseH Plus) kit (TaKaRa) in a 25 μL reaction volume. All reactions were carried out in accordance with the manufacturer’s protocols.
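The text does not state how relative expression was calculated from the RT-qPCR data. One commonly used approach is the 2^−ΔΔCt method, sketched below purely as an illustration; it assumes a single reference gene and amplification efficiencies close to 100%, and the function and parameter names are hypothetical.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of the target gene by the 2^-ddCt method (assumed here)."""
    # Normalize the target gene to the reference gene within each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated condition with the control condition.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)
```

A target gene that crosses the threshold three cycles earlier in the treated sample, with an unchanged reference gene, yields a fold change of 8.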
2.11. ELISA for CXCL2 Protein Quantification
To quantify CXCL2 protein levels, cell culture supernatants were collected from Mac-T cells following 3 h of LPS treatment. The supernatants were centrifuged at 3000 rpm for 20 min at 4 °C to remove cellular debris and particulate matter. The clarified supernatants were then subjected to enzyme-linked immunosorbent assay (ELISA) using a Bovine CXC Chemokine Ligand 2 (CXCL2) ELISA kit (Shanghai Enlink Biotechnology Co., Ltd., Shanghai, China), following the manufacturer’s instructions. CXCL2 concentrations were determined based on a standard curve generated from provided standards.
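Reading a concentration off a standard curve can be sketched with piecewise-linear interpolation between adjacent standards. This is a simplification chosen for clarity: the curve-fitting model used with the kit is not stated in the text (many ELISA kits recommend a four-parameter logistic fit), and the function name and the standard values in the usage note are hypothetical.

```python
from bisect import bisect_left
from typing import List, Tuple


def interpolate_concentration(od: float,
                              standards: List[Tuple[float, float]]) -> float:
    """Interpolate a concentration from (concentration, optical density) pairs.

    Assumes optical density increases with concentration across the standards.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    ods = [pair[1] for pair in points]
    if not ods[0] <= od <= ods[-1]:
        raise ValueError("optical density lies outside the standard curve")
    i = bisect_left(ods, od)
    if ods[i] == od:
        return points[i][0]
    (c0, o0), (c1, o1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (od - o0) / (o1 - o0)
```

With hypothetical standards `[(0, 0.0), (100, 0.5), (200, 1.0)]`, an optical density of 0.75 maps to a concentration of 150 in the units of the standards. Values outside the range of the standards raise an error rather than being extrapolated.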
4. Discussion
Bovine mastitis imposes substantial economic burdens on the dairy industry worldwide, primarily through reduced milk yield, increased culling rates, and associated management costs [
16]. To date, extensive research has elucidated the pathogenic mechanisms, preventive strategies, and predictive models for mastitis, predominantly in Holstein cattle [
17]. However, compared with the Holstein breed, investigations into mastitis susceptibility and resistance in Xinjiang Brown cattle—a locally adapted dual-purpose breed in China—remain in their early stages. Elucidating the molecular basis of mastitis resistance in this understudied breed is therefore of considerable scientific and practical importance.
In this study, we employed Oxford Nanopore full-length transcriptome sequencing to profile six Xinjiang Brown cattle, comprising three individuals with high SCC and three with low SCC. Principal component analysis (PCA) revealed that samples from the low-SCC group clustered tightly together, indicating good reproducibility among healthy individuals. By contrast, although the high-SCC samples were clearly separable from the low-SCC group, they exhibited greater within-group dispersion. This variability may reflect heterogeneity in the underlying pathogenic pathways or in the specific mastitis-causing pathogens affecting individual animals. To date, more than 150 bacterial species have been associated with bovine mastitis [
18], including
Staphylococcus aureus as a major contagious pathogen [
19] and various other species [
20]. The observed transcriptional variability within the high-SCC group may therefore reflect differential host responses to distinct microbial challenges, underscoring the complexity of mammary gland immune regulation under infection conditions.
Comparative transcriptomic analysis between Xinjiang Brown cattle with high and low SCC identified 226 DEGs and 441 DETs. Hierarchical clustering of DEGs and DETs revealed two distinct expression modules: Group-1, characterized by elevated expression in the low-SCC group, and Group-2, comprising genes and transcripts upregulated in the high-SCC group. GO enrichment analysis of Group-2 genes revealed significant associations with immune response, chemokine signalling pathways, antimicrobial humoral immune responses mediated by antimicrobial peptides, and inflammatory responses. These functional annotations link the transcriptional profile of the high-SCC group to mastitis-related biological processes. Because the profiles were obtained from peripheral blood, the enrichment of immune-related pathways in Group-2 suggests that elevated SCC is accompanied by an activated systemic immune state, consistent with the host response to intramammary infection.
To identify key regulators of mastitis resistance in Xinjiang Brown cattle, we performed PPI network analysis using the Degree algorithm. This approach identified several hub genes—including
CXCL2,
IL1B,
IL10 and
GRO1—that were also recognized as high-priority candidates in our previous short-read sequencing dataset, demonstrating cross-platform consistency in the transcriptomic signatures associated with SCC. Among these hub genes, the cytokines
IL1B and
IL10 exhibited marked upregulation under inflammatory conditions [
5], and
GRO1 has been shown to be significantly induced in mammary epithelial cells following inflammatory challenge [
21].
CCL4 (also known as
MIP-1β), a specific ligand for
CCR5 and member of the macrophage inflammatory protein family, plays a critical role in recruiting pro-inflammatory cells to sites of injury or infection, thereby coordinating both acute and chronic inflammatory host responses [
22]. Within the chemokine family,
CXCR1 (the receptor for
IL-8) has been extensively studied in the context of mastitis susceptibility; single-nucleotide polymorphisms in
CXCR1 are associated with SCS [
23] and influence mastitis incidence in dairy cattle [
24,
25].
PTPRC (also known as
CD45) participates in cytokine signalling and modulates multiple receptors, thereby influencing the production and release of cytokines and regulating inflammatory responses. In addition,
PTPRC indirectly modulates inflammation by affecting the activation status and function of immune cells such as T cells and B cells [
26]. Collectively, these findings reinforce the central role of immune-related genes—particularly those involved in chemokine signalling and cytokine regulation—in determining mastitis resistance in Xinjiang Brown cattle and provide a panel of candidate markers for further functional validation and genetic improvement programmes.
AS analysis revealed a striking discrepancy between sequencing platforms: only 15 DSGs were identified using ONT full-length transcriptome sequencing, whereas short-read sequencing yielded 389 DSGs—approximately 26-fold more. Although short-read sequencing remains widely used, its inherent limitation in capturing full-length RNA sequences can lead to fragmented transcript assembly and erroneous isoform annotation [
27]. By contrast, long-read transcriptome technologies overcome the constraints of short read lengths, circumvent amplification biases and capture complete, full-length transcript isoforms [
11]. The substantially lower number of DSGs detected by ONT in this study may therefore reflect the higher specificity and accuracy of long-read sequencing in resolving genuine splicing events, which filters out artefacts introduced by short-read assembly. The three DSGs identified by both platforms (C5H12orf75, SKA2 and MOB3A) have been primarily associated with regulatory functions in tumorigenesis and other diseases.
C5H12orf75 (also known as
OCC-1) was initially reported as an upregulated gene in colon cancer; its three alternatively spliced isoforms differentially regulate Wnt signalling [
28]. In breast cancer, all four isoforms of
C5H12orf75 show elevated expression compared with non-tumour tissues, and increased expression of this gene is correlated with tumorigenesis [
29].
SKA2, located on chromosome 19, encodes a protein involved in regulating the biological functions of tumour cells. It forms a gene pair with
PRR11 that modulates tumour progression, and altered
SKA2 expression accompanies oncogenesis. Elevated
SKA2 expression has been documented in multiple cancer types, including lung [
30], breast [
31] and oesophageal [
32] cancers.
MOB3A has been implicated in the pathogenesis of Alzheimer’s disease [
33], suggesting broader roles for these shared DSGs beyond mammary gland physiology.
Integration of DEGs and DSGs identified from second-generation (short-read) and Oxford Nanopore full-length transcriptome sequencing yielded 35 high-priority DEGs. Among these, members of the chemokine family—including
CXCL2,
GRO1,
CXCL3,
CCL4 and
CXCR1—and cytokine genes such as
IL10 and
IL1B were prominently represented. This enrichment is biologically compelling. During mammary gland infection, chemokines and cytokines are rapidly released to mediate the recruitment of leukocytes to sites of microbial invasion [
34]. Consequently, these molecular families are intimately involved in orchestrating inflammatory responses and immune defence mechanisms and are likely to play critical roles in the initiation and progression of mastitis. Chemokines constitute a specialized class of cytokines that direct the targeted migration of leukocytes. They are fundamental to immune system development, inflammatory response, and both innate and adaptive immunity. Owing to their pivotal role in recruiting leukocytes during inflammation, chemokines are frequently classified as inflammatory mediators [
35]. Based on integration of differential expression levels and pathway enrichment patterns, we focused on
CXCL2 as a potential molecular marker for mastitis resistance.
CXCL2 is a key member of the CXC chemokine family and exerts its biological functions primarily through binding to its G protein-coupled receptor,
CXCR2. The CXCL2–CXCR2 axis plays a central role in orchestrating inflammatory responses. Beyond promoting the chemotaxis and recruitment of neutrophils and other immune cells to infection sites, this signalling axis activates multiple downstream pathways—including NF-κB and MAPK cascades—thereby regulating the production of inflammatory mediators and influencing diverse cellular processes such as proliferation, differentiation, apoptosis, migration and adhesion [
36]. These multifaceted functions position
CXCL2 as a compelling candidate for modulating mastitis susceptibility and shaping the host response to intramammary infection in dairy cattle.
Mammary epithelial cells constitute the first line of defence against invading pathogens, functioning as a physical barrier while also playing critical roles in immune recognition and response during intramammary infection. The immune system of the mammary gland comprises both innate and adaptive arms, which cooperate to maintain tissue homeostasis and preserve normal lactation function [
37,
38]. Infection of mammary tissue with Escherichia coli frequently triggers acute mastitis in dairy cattle. Lipopolysaccharide (LPS), the primary toxic component of the
E. coli cell wall, plays a pivotal role in the initiation and progression of inflammatory responses. Prolonged exposure of cells to LPS stimulation induces a robust inflammatory reaction; consequently, LPS challenge is widely employed to establish in vitro models of mastitis for investigating inflammatory responses [
38]. In the present study, we established an LPS-induced inflammatory model in bovine mammary epithelial cells (Mac-T) to examine
CXCL2 expression under inflammatory versus normal conditions. Consistent with our in vivo findings in high-SCC cattle,
CXCL2 transcript levels were significantly upregulated following LPS stimulation. These results suggest that
CXCL2 may serve as a potential molecular indicator of inflammatory status in the mammary gland.
LPS, as the major cell wall component of Gram-negative bacteria, is recognized by Toll-like receptors (TLRs) expressed on bovine mammary epithelial cells. Ligand binding activates multiple intracellular signalling cascades—including calcium signalling pathways and NF-κB—thereby promoting the synthesis and release of pro-inflammatory cytokines and chemokines. These mediators direct the migration of immune cells to sites of infection and amplify the inflammatory response, leading to physiological changes in mammary epithelial cells such as cellular swelling and increased permeability [
39]. The observed upregulation of
CXCL2 following LPS treatment in our study is consistent with this established mechanism. Notably, lipoteichoic acid (LTA), a major cell wall component of Gram-positive bacteria, is also used to establish inflammatory models of mastitis. Studies have demonstrated that LTA induces inflammatory responses in mammary tissue in vivo and significantly promotes the release of chemokines—including
CXCL1,
CXCL2 and
CXCL3—thereby mediating the recruitment of neutrophils and other immune cells [
40]. These findings indicate that, regardless of whether inflammation is triggered by Gram-negative (LPS) or Gram-positive (LTA) bacterial components, chemokines such as
CXCL2 are consistently upregulated and actively participate in regulating the inflammatory response. Collectively, our results from both in vivo and in vitro models demonstrate that
CXCL2 expression is robustly induced under inflammatory conditions, supporting its potential utility as a molecular marker for mastitis susceptibility and as a candidate target for genetic improvement programmes aimed at enhancing mammary health in dairy cattle. These findings suggest that
CXCL2 could be incorporated into marker-assisted or genomic selection strategies for improving mastitis resistance in Xinjiang Brown cattle. Future studies should focus on validating its predictive power across breeds and assessing possible pleiotropic effects.
This study systematically constructed a full-length peripheral blood transcriptome atlas of Xinjiang Brown cattle with varying somatic cell counts and identified CXCL2 as a potential molecular marker for mastitis resistance. However, several limitations should be acknowledged. First, although animals in the high-SCC group exceeded the threshold commonly used for clinical mastitis, clinical symptoms were not systematically recorded. Thus, it was not possible to definitively distinguish between clinical and subclinical mastitis. Second, the sample size was relatively small (n = 3 per group, total n = 6), which may limit statistical power. Third, the study was restricted to Xinjiang Brown cattle; therefore, the generalizability of the findings, especially regarding CXCL2, to other breeds requires further validation.