Article

EMGP-Net: A Hybrid Deep Learning Architecture for Breast Cancer Gene Expression Prediction

by Oumeima Thâalbi and Moulay A. Akhloufi *
Perception, Robotics, and Intelligent Machines (PRIME), Department of Computer Science, Université de Moncton, Moncton, NB E1A 3E9, Canada
* Author to whom correspondence should be addressed.
Computers 2025, 14(7), 253; https://doi.org/10.3390/computers14070253
Submission received: 20 May 2025 / Revised: 24 June 2025 / Accepted: 25 June 2025 / Published: 26 June 2025
(This article belongs to the Special Issue AI in Its Ecosystem)

Abstract

Background: The accurate prediction of gene expression is essential in breast cancer research, but spatial transcriptomics technologies, which measure it directly, remain expensive. Recent studies have therefore combined whole-slide images with spatial transcriptomics data to predict breast cancer gene expression. To this end, we present EMGP-Net, a novel hybrid deep learning architecture that combines two state-of-the-art models, MambaVision and EfficientFormer. Method: EMGP-Net mixes features from both models and uses attention mechanisms followed by fully connected layers. It was first trained on the HER2+ dataset, which contains data from eight patients, using a leave-one-patient-out approach. To assess generalizability, we conducted external validation, training EMGP-Net on the HER2+ dataset and testing it on the STNet dataset, which contains data from 23 patients, and vice versa. We evaluated EMGP-Net’s ability to predict the expression of 250 selected genes. Results: Our model outperformed both EfficientFormer and MambaVision trained separately on the HER2+ dataset, achieving the highest PCC of 0.7903 for the PTMA gene, with the top 14 genes having PCCs greater than 0.7, including other important breast cancer biomarkers such as GNAS and B2M. External validation showed that it also outperformed models retrained with our approach. Conclusions: EMGP-Net’s results surpassed those of existing models, showing that combining advanced models is an effective strategy to improve performance in this task.

1. Introduction

Breast cancer remains one of the most common and deadliest forms of cancer, with hundreds of thousands of deaths and millions of women diagnosed each year, as of 2018 [1]. Despite important advances in early detection, the molecular complexity and heterogeneity of breast cancer pose serious challenges to accurate diagnosis. Histopathology, particularly the use of hematoxylin and eosin (H&E)-stained tissue slides, has been one of the main bases for diagnosing breast cancer. These images provide detailed visual information about tissue structure, making it easier for pathologists to determine the type, grade, and stage of cancer. Histopathological images were initially employed to determine cancer types [2], and several studies have utilized datasets with histopathological samples, such as the BreakHis dataset, the ICIAR 2018 dataset, the BreCaHAD dataset, and others. These studies have contributed to classifying tumors into different subtypes. For example, Al-Jabbar et al. [3], Obayya et al. [4], and Clement et al. [5] developed various approaches to distinguish benign from malignant classes. Other studies went further by identifying additional subclasses of breast cancer, such as those proposed by Bagchi et al. [6], Bhausaheb and Kashyap [7], and Yu et al. [8]. However, these methods primarily focused on morphology and did not capture the molecular details that could better inform treatment decisions, such as gene expression levels, which are critical to understanding tumor behavior.
Spatial transcriptomics (ST) technologies have recently emerged as powerful tools that profile gene expression while preserving the spatial context of the tissue, allowing high-resolution mapping of gene expression across tissue sections and providing valuable insights into the molecular architecture of tumors. However, the high cost of these technologies limits their large-scale use, particularly in clinical settings. This challenge has led to the exploration of alternative approaches, such as predicting gene expression directly from histopathological images, especially whole-slide images (WSIs). Past studies have shown that WSIs contain biologically rich information that can be leveraged to predict gene expression, and many methods have successfully used deep learning models to predict gene expression patterns from WSIs. ST-Net [9], which uses DenseNet-121 to predict 250 genes, was one of the first transfer learning-based approaches to make marked progress in breast cancer gene expression prediction. Afterwards, BrST-Net [10] adapted convolutional neural network (CNN) models, such as EfficientNet-b0, and incorporated an auxiliary network to predict the expression of 250 genes. GeNetFormer [11], our previous approach, integrated several transformer models, including EfficientFormer, to predict 250 genes. HisToGene [12], on the other hand, used attention mechanisms and ViT to predict the expression of 785 genes. Other models, such as SEPAL [13], Hist2ST [14], and THItoGene [15], used graph neural networks (GNNs) and showed interesting results; they not only capture spatial information from individual spots but also use neighboring spots to gain a global perspective. Similarly, PH2ST [16] and HGGEP [17] incorporate hypergraphs, allowing both local and global integration of spatial context.
In the studies reviewed, methods such as ST-Net [9], BrST-Net [10], GeNetFormer [11], EGN [18], and EGGN [19] focus primarily on local prediction, whereas HisToGene [12] adopts a global prediction approach; methods such as SEPAL [13] and THItoGene [15] combine both local and global predictions. Although transformers were initially designed for natural language processing, their ability to model global dependencies has made them highly valuable in computer vision [20]. CNNs are good at detecting local patterns but have trouble with long-range interactions, while transformers use attention mechanisms to capture relationships across the whole image [20]. This makes transformers effective at analyzing complex spatial structures in histopathological slides, where fine-grained detail and large-scale tissue architecture are both important for gene expression prediction. Recently, state-space models (SSMs) have emerged as a valuable alternative to attention mechanisms, offering competitive modeling capabilities at a lower computational cost. MambaVision [21], built on this principle, combines the efficiency of SSMs with the global awareness of attention. It delivered strong performance on large image datasets while being much faster and more memory-efficient than traditional transformers. These properties make MambaVision well suited for high-resolution tasks, such as predicting gene expression from histopathological images.
The goal of this study is to explore advanced hybrid architectures that combine MambaVision and EfficientFormer to predict breast cancer gene expression from WSIs. Building on advances in deep learning-based gene expression prediction from histopathological images, and to address the limitations of earlier state-of-the-art (SOTA) approaches, we propose a novel approach that combines these two SOTA architectures. MambaVision, one of the most recent models in the field, has demonstrated important potential in visual recognition tasks; by leveraging its advanced feature extraction capabilities, our approach aims to capture the complex patterns in histopathological images that can contribute to more accurate gene expression prediction. EfficientFormer, a transformer known for its efficiency and its performance in gene expression prediction tasks, complements MambaVision with strong spatial dependency processing capabilities. Our hybrid model, called EMGP-Net, integrates these two architectures to improve gene expression prediction from WSIs.
For this work, we used two widely used datasets: the HER2+ dataset, which consists of 36 sections from 8 patients, and the STNet dataset, which contains 68 sections from 23 patients. Both datasets have associated ST data. To ensure the generalizability of our model, we employed two validation strategies. First, we used the HER2+ dataset for internal validation. Then, for external validation, we evaluated the model in two ways: one test performed on the HER2+ dataset and another on the STNet dataset, with the model trained on the other dataset in each case.
All tests aimed to predict the expression of 250 genes, as in other methods [9,10,11]. The potential of this hybrid architecture motivated us to further explore the combination of these two SOTA models for gene expression prediction. Our contributions are as follows:
  • Proposing EMGP-Net: We propose a hybrid model combining MambaVision and EfficientFormer to predict gene expression more effectively from WSIs.
  • Performing exhaustive validation: We perform internal and external validation on the HER2+ and STNet datasets to ensure model robustness and generalizability.
  • Demonstrating benefits of hybrid deep learning: We demonstrate the benefits of combining the latest powerful SOTA models and evaluate them on medical tasks, contributing to advances in breast cancer research, particularly in gene expression prediction for diagnosis.

2. Related Work

2.1. CNN-Based Approaches

Recently, several studies have focused on gene expression prediction from WSIs with associated ST data to mitigate the high cost of ST technologies by exploiting histopathological images, which are widely available. For instance, ST-Net [9] was the first approach to combine ST data and WSIs of breast cancer tissues to predict gene expression; it is a CNN-based method built on DenseNet-121. It uses H&E tissue images from the STNet dataset, first introduced in [9], and employs cross-validation. The model achieved the highest median PCC of 0.3400 for the GNAS gene. ST-Net was trained to predict 250 genes and was tested on the external 10x Genomics dataset, as well as on The Cancer Genome Atlas (TCGA) dataset. BrST-Net [10], also a CNN-based approach, integrates a primary network and an auxiliary network (AuxNet). It was evaluated using 10 SOTA models, and the best performing was EfficientNet-b0 with AuxNet. This model was trained and tested on the STNet dataset and achieved the highest PCC of 0.6325 for the B2M gene. The TRIPLEX approach [22], designed to predict the expression of 250 genes, consists of three parts: a target encoder, a global encoder, and a neighbor encoder. The target encoder processes the specific region of interest (target spot) in the tissue using ResNet18 and outputs predictions through a predictor. The global encoder, which uses multiple layers of transformer blocks and an atypical position encoding generator (APEG), captures a broader context by analyzing the entire tissue, while the neighbor encoder focuses on the regions around the target spot, also using ResNet18, with a final fusion layer integrating attention modules. The model was validated on three internal datasets (STNet, HER2+, and cSCC) and externally validated on the Visium dataset of breast cancer patients. On the HER2+ dataset, the model achieved a mean PCC of 0.3140 for all genes and 0.4970 for highly predictive genes. On the STNet dataset, it achieved a mean PCC of 0.3520 for all genes and 0.2060 for highly predictive genes.

2.2. Transformer-Based Approaches

Other methods apply transformer-based architectures, such as the GeNetFormer framework [11], which evaluated eight advanced transformer models: EfficientFormer, FasterViT, BEiT v2, Swin Transformer v2, PyramidViT v2, MobileViT v2, MobileViT, and EfficientViT. These models were applied to predict 250 genes using the STNet dataset. The framework was trained with different image resolutions (224 × 224, 256 × 256) and loss functions (MSELoss, SL1Loss), integrating the models into a comprehensive pipeline. The highest PCC was obtained with the configuration using the MSELoss function, 224 × 224 resolution, and EfficientFormer, which accounted for 9 of the top 10 genes with the highest PCC values. HisToGene [12] focuses on modeling the spatial dependencies of gene expression using multi-head attention and a modified ViT architecture that allows the model to handle heterogeneity. HisToGene’s predictions on the HER2+ dataset achieved the highest mean R of 0.3200 for the GNAS gene, and the model was also evaluated on the human cutaneous squamous cell carcinoma (cSCC) dataset.

2.3. Hybrid Transformer and GNN Approaches

Zeng et al. [14] proposed Hist2ST, based on transformer and GNN architectures, which starts by dividing the images around each spot into patches. These patches are then processed by three components: the ConvMixer extracts internal visual features within the image patches, the transformer captures global spatial dependencies between spots, and the GNN captures the local relationships between neighboring spots. The model was trained on the HER2+ and cSCC datasets using leave-one-out cross-validation and achieved an average PCC of 0.3900 for the top gene FN1. SEPAL [13] predicts gene expression by analyzing tissue images in two stages: local learning and spatial learning. First, it processes each image patch to extract visual features and predict 256 genes. In the second stage, a graph is constructed for each patch and its neighbors, allowing the model to learn from both local features and spatial context; the GNN refines the initial predictions by incorporating spatial relationships between patches. It was trained on the STNet dataset and the 10x Genomics dataset. The THItoGene [15] architecture consists of three parts. First, the image segmentation and position embedding step divides the image into patches corresponding to the location of each spot, embedding the spatial coordinates and aggregating the data for processing. Second, feature extraction is performed by a dynamic convolution module that captures deep molecular features from the patches, with an efficient capsule network improving the model by using self-attention mechanisms to adjust the convolution kernels based on the spatial context. Finally, global modeling is performed by the ViT module, which integrates the position embeddings and image features, and a graph attention network (GAT) module learns the relationships between adjacent spots. The HER2+ dataset, with 785 selected genes, along with the cSCC dataset, was used for analysis, and leave-one-out cross-validation was applied. On the HER2+ dataset, PCC values of 0.7470, 0.7110, 0.6720, and 0.4520 were achieved for the genes FN1, SCD, IGKC, and FASN, respectively.

2.4. Graph-Based and Relational Modeling Approaches

Approaches such as ErwaNet [23] consist of two modules: the edge relational module (ERM) and the window attention module (WAM). The ERM captures local information by constructing a heterogeneous graph in which each window in the slide image is treated as a node and the relationships between windows are represented by three different types of edges: K-nearest neighbor (KN) edges, percent similarity (PS) edges, and K-nearest similarity (KS) edges. The WAM, on the other hand, captures global information using an attention mechanism that aggregates feature representations from the entire tissue slide. The model was validated on two datasets, the STNet dataset and the 10x Genomics dataset, using cross-validation, with 250 targeted genes selected for prediction. ErwaNet achieved PCC@F, PCC@S, and PCC@M values of 2.3600, 3.5200, and 3.3300 on the STNet dataset and 8.2800, 8.6900, and 8.3600 on the 10x Genomics dataset, respectively. Other approaches are based on hypergraphs. PH2ST [16] is a prompt-based framework designed to handle multi-scale histological features by integrating dual-scale hypergraphs and ViT for both global and local features; it uses a set of known ST values to guide prediction at unmeasured spots. For feature extraction, PH2ST uses UNI, a universal histology image encoder pre-trained on a large corpus of WSIs. It captures both local and neighboring spatial context through dual-scale hypergraph-based spot representations, where each spot is connected to neighboring regions via hypergraph convolution, and a cross-attention mechanism then refines these representations for the final prediction. The model was applied to the HER2+ dataset, with 785 genes, and to the cSCC datasets. The highest PCC values were obtained for the genes TMSB10, CISD3, CD74, and COL6A2 from the HER2+ dataset: 0.5095, 0.2911, 0.5603, and 0.4091, respectively. The HGGEP architecture [17] consists of a gradient enhancement module (GEM) that enhances the gradients and captures cell morphological information, a ShuffleNet V2 backbone that extracts latent features from histology images, and a convolutional block attention module (CBAM) and ViT to refine the extracted features. The model includes a hypergraph association module (HAM) that captures spatial relationships between different regions in the tissue and uses long short-term memory (LSTM) to model dependencies between features. The HER2+ dataset with 785 genes and the cSCC datasets were used. HGGEP achieved PCC values of 0.637 for GNAS, 0.564 for FASN, 0.652 for MYL12B, and 0.649 for SCD.

2.5. Exemplar-Guided Approaches

Other approaches employ exemplar guidance learning, such as EGN [18] and its enhanced version EGGN [19]. EGN uses an exemplar bridging (EB) block and ViT as a backbone. The framework first retrieves the nearest exemplars for each tissue image window and constructs a graph to model spatial relations. It then updates window features using information from the exemplars to predict genes. EGGN constructs visual similarity graphs with the exemplars, which are then processed by a GraphSAGE-based backbone. Both models are based on the idea that images with similar visual features will have similar gene expression patterns, regardless of their location within the tissue.

3. Materials and Methods

3.1. Dataset

Two datasets were used in this work:
  • HER2+ is the HER2 (human epidermal growth factor receptor 2)-positive breast cancer dataset investigated in [24]. It was collected from eight patients (A-H) and comprises a total of 36 sections. Each patient is represented by three or six replicates (sections from the same patient), stained with H&E. Each sample is in JPG format and comes with associated ST data. The dataset covers various tissue types, including invasive cancer, breast glands, immune infiltrate, cancer in situ, connective tissue, and adipose tissue.
  • STNet is the fifth edition of the human breast cancer in situ capturing transcriptomics dataset, referred to as the STNet dataset, presented in [9]. It was obtained from 23 patients and contains a total of 68 sections, with 3 sections per patient (except for 1 patient with 2 sections). The images are also stained with H&E, and each sample is in JPG format with corresponding ST data. The subtypes represented in the STNet dataset are luminal A, luminal B, triple negative, HER2 luminal, and non-luminal HER2.
Both datasets include files with the spot coordinates, the count matrices, and the gene names or symbols. Some examples of the datasets are shown in Figure 1.

3.2. Data Pre-Processing and Augmentation

Before the data were used by the model, we prepared them by applying various pre-processing techniques from our earlier work [11] to both datasets. First, we filtered out spots with a total count of less than 1000, following the same approach as in [10,11]. This resulted in a final number of 28,792 spots for the STNet dataset and 11,666 for the HER2+ dataset. We then normalized the gene expression counts, adding 1 as a pseudo-count to avoid zero issues, and applied a log1p transformation (log(1 + x)) to the normalized counts to stabilize variance, reduce skewness, and ensure consistency across studies. Patches were generated according to the coordinates of each spot and, to match the input dimensions expected by the model, were fixed at 224 × 224 × 3. For the STNet dataset, the list of genes with Ensembl identifiers (IDs) was converted to gene symbols using the HUGO Gene Nomenclature Committee (HGNC) database to ensure the use of unique gene symbols, simplifying electronic data retrieval and minimizing ambiguity. Classical data augmentation was also used to diversify the training data, helping the model generalize and avoid overfitting: we applied random horizontal flipping, random vertical flipping, and random 90-degree rotation to each image patch. These techniques encourage the model to learn features independently of the orientation of the images. During testing, we averaged predictions over the eight symmetries of each patch obtained from rotations and reflections.
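To make the pipeline concrete, the following is a minimal sketch of the pre-processing and augmentation steps described above. It assumes a hypothetical (n_spots, n_genes) count matrix and per-spot pixel coordinates; the exact normalization used in our pipeline may differ in detail.

```python
import numpy as np
import torch
from torchvision import transforms

def filter_and_normalize(counts: np.ndarray, min_total: int = 1000) -> np.ndarray:
    """Keep spots with a total count of at least `min_total`, then log-normalize.

    `counts` is a hypothetical (n_spots, n_genes) raw count matrix; dividing by
    per-spot totals before log1p is an assumption of this sketch.
    """
    kept = counts[counts.sum(axis=1) >= min_total]
    normalized = kept / kept.sum(axis=1, keepdims=True)
    return np.log1p(normalized)  # log(1 + x): pseudo-count of 1, stabilizes variance

def extract_patch(wsi: np.ndarray, x: int, y: int, size: int = 224) -> np.ndarray:
    """Crop a size-by-size RGB patch centered on the spot coordinate (x, y)."""
    half = size // 2
    return wsi[y - half:y + half, x - half:x + half, :]

class RandomRot90:
    """Rotate a CHW tensor by a random multiple of 90 degrees."""
    def __call__(self, img: torch.Tensor) -> torch.Tensor:
        k = int(torch.randint(0, 4, (1,)))
        return torch.rot90(img, k, dims=(1, 2))

# Classical augmentation: random horizontal/vertical flips and 90-degree rotations.
train_augment = transforms.Compose([
    transforms.ToTensor(),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    RandomRot90(),
])
```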

3.3. Proposed Approach

3.3.1. Overview of the EMGP-Net Architecture

In this study, we introduce EMGP-Net, a hybrid deep learning architecture named after its components, EfficientFormer (E) and MambaVision (M), and its purpose, gene expression prediction (GP). The aim is to improve gene expression prediction from WSIs by using recent and robust SOTA models and to improve the feature representations by exploiting the capabilities of both. We followed three steps: the preparation and pre-processing of the data, the training of the model, and, finally, the evaluation. We extracted patches from the WSIs, centered on the spots according to the positions provided in the corresponding ST data, to standardize the inputs before feeding them to the model for training.
EMGP-Net uses MambaVision, an SOTA model known for its significant capacity in image tasks, especially when compared with other SOTA models [21]. In parallel, it integrates EfficientFormer, an efficient vision transformer that has shown good performance in gene expression prediction tasks and achieved the best scores among the transformer models evaluated in the GeNetFormer framework introduced in [11]. We first adapted both models to produce 1024-dimensional feature vectors by replacing their original classification heads with custom linear layers. After extracting the features from both branches, they are stacked and passed to a multi-head attention mechanism. The attention layer takes these stacked vectors as input and learns how to weigh them based on their importance, allowing the model to focus on the most important features of each branch and to summarize the information appropriately. Unlike simple concatenation, which increases dimensionality, or basic averaging without attention, this approach lets the attention mechanism shape the final combination, improving the fusion of information by capturing relevant relationships between the outputs of MambaVision and EfficientFormer. The fused features are then passed through a layer normalization (LN) layer, which helps to stabilize training and improve the generalizability of the model. We then apply the Gaussian error linear unit (GeLU) activation function, which introduces non-linearity and improves learning capacity compared to the traditional ReLU. Finally, the normalized and activated features are passed through two fully connected linear layers: the first reduces the dimensionality from 1024 to 512, and the second outputs predictions for the 250 targeted genes. The final output is a 250-dimensional vector of predicted gene expression values.
To assess the generalizability of our model, training was first conducted with the leave-one-patient-out approach on the HER2+ dataset. We then performed a first external validation on the STNet dataset with the model trained on the whole HER2+ dataset, and a second external validation on the HER2+ dataset with the model trained on the whole STNet dataset. More details are given below, and the EMGP-Net architecture is shown in Figure 2.
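To make the fusion concrete, below is a minimal PyTorch sketch of the EMGP-Net head. The two backbone arguments stand in for the pre-trained MambaVision and EfficientFormer feature extractors, each already adapted (via a custom linear head) to output 1024-dimensional vectors; pooling the two attended tokens by averaging is an assumption of this sketch, not a detail stated above.

```python
import torch
import torch.nn as nn

class EMGPNet(nn.Module):
    """Sketch of the EMGP-Net fusion head (backbones are assumed given)."""
    def __init__(self, mamba_backbone: nn.Module, efficientformer_backbone: nn.Module,
                 feat_dim: int = 1024, n_genes: int = 250, n_heads: int = 8):
        super().__init__()
        self.mamba = mamba_backbone                 # pre-trained MambaVision, adapted to 1024-d output
        self.effformer = efficientformer_backbone  # pre-trained EfficientFormer, adapted to 1024-d output
        self.attn = nn.MultiheadAttention(feat_dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(feat_dim)
        self.act = nn.GELU()
        self.fc1 = nn.Linear(feat_dim, 512)   # 1024 -> 512
        self.fc2 = nn.Linear(512, n_genes)    # 512 -> 250 gene predictions

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Stack the two 1024-d branch features as a length-2 "sequence" so the
        # attention layer can weigh the branches against each other.
        feats = torch.stack([self.mamba(x), self.effformer(x)], dim=1)  # (B, 2, 1024)
        fused, _ = self.attn(feats, feats, feats)
        fused = self.act(self.norm(fused.mean(dim=1)))  # pool tokens (assumption), then LN + GeLU
        return self.fc2(self.fc1(fused))                # (B, 250)
```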

3.3.2. MambaVision

MambaVision: Introduced in [21], MambaVision is a hybrid architecture combining Mamba [25] and transformers, designed to improve feature learning, and represents a new SOTA approach trained on the ImageNet dataset. MambaVision combines Mamba blocks with transformer and self-attention layers. The network is divided into four stages; the first two are based on CNN layers that integrate GeLU activation and batch normalization (BN) as follows:
$\hat{z} = \mathrm{GeLU}(\mathrm{BN}(\mathrm{Conv}_{3 \times 3}(z))), \quad z = \mathrm{BN}(\mathrm{Conv}_{3 \times 3}(\hat{z})) + z.$
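A minimal PyTorch sketch of this residual convolutional stage, following the two equations above:

```python
import torch.nn as nn

class ConvStageBlock(nn.Module):
    """Sketch of the residual CNN block in MambaVision's first two stages."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.act = nn.GELU()

    def forward(self, z):
        z_hat = self.act(self.bn1(self.conv1(z)))  # z_hat = GeLU(BN(Conv3x3(z)))
        return self.bn2(self.conv2(z_hat)) + z     # z = BN(Conv3x3(z_hat)) + z
```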
The remaining two stages use MambaVision and transformer blocks. This approach demonstrated its capacity to capture the global context and long-range spatial dependencies and to handle high-resolution images with faster training and inference than standard transformers. MambaVision provided a better accuracy-throughput balance than other SOTA models such as FasterViT and Swin Transformer v2, the two models also evaluated for gene expression prediction in [11]. It also outperformed other SOTA models such as NextViT, VMamba, FastViT, and Vim on the ImageNet dataset, achieving higher accuracy than both transformer and Mamba models. Figure 3 shows the architecture of the MambaVision block.

3.3.3. EfficientFormer

EfficientFormer: Presented in [26], EfficientFormer is designed for low-latency applications and is based on a pure transformer architecture. Like many other SOTA models, it was trained on the ImageNet dataset. Within each of the four network stages, a number of meta transformer blocks (MB) are implemented to avoid the use of MobileNet components. The following equation shows the relationship between the two components that make up the meta transformer block, the token mixer (TokenMixer) and the multi-layer perceptron (MLP):
$X_{i+1} = \mathrm{MB}_i(X_i) = \mathrm{MLP}(\mathrm{TokenMixer}(X_i)).$
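The following is a minimal sketch of one meta transformer block implementing the equation literally; the average-pooling token mixer is a stand-in, and EfficientFormer's actual blocks also include residual connections and other mixer variants.

```python
import torch.nn as nn

class MetaBlock(nn.Module):
    """Sketch of one meta transformer block: X_{i+1} = MLP(TokenMixer(X_i))."""
    def __init__(self, dim: int, mlp_ratio: int = 4):
        super().__init__()
        # Pooling along the token axis stands in for the paper's token mixers.
        self.token_mixer = nn.AvgPool1d(kernel_size=3, stride=1, padding=1)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio), nn.GELU(), nn.Linear(dim * mlp_ratio, dim))

    def forward(self, x):  # x: (B, tokens, dim)
        mixed = self.token_mixer(x.transpose(1, 2)).transpose(1, 2)  # mix along token axis
        return self.mlp(mixed)
```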
EfficientFormer achieved the best results compared to other SOTA models [26], outperforming several ViT-based architectures, such as DeiT-Small, LeViT-256, and PoolFormer-S24, in terms of accuracy and latency on the ImageNet dataset. Furthermore, in the GeNetFormer gene expression prediction framework, EfficientFormer achieved the best scores, outperforming the seven other evaluated models: FasterViT, BEiT v2, Swin Transformer v2, PyramidViT v2, MobileViT v2, MobileViT, and EfficientViT. Figure 4 shows the architecture of the meta transformer block.

3.3.4. Multi-Head Attention Mechanism

Multi-head attention: As shown in Figure 5, instead of using a single attention function with full-size queries, keys, and values, multi-head attention runs several attention functions in parallel. Each function works with smaller projections of the queries, keys, and values; after each function processes its projection, the results are combined, allowing the model to look at the input from different perspectives at the same time. In the original transformer model, there are eight such attention functions, called heads, each looking at the input in a different way [27]:
$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h) W^O, \quad \text{where } \mathrm{head}_i = \mathrm{Attention}(Q W_i^Q, K W_i^K, V W_i^V).$
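A direct implementation of this equation, with per-head projection matrices passed in explicitly (all shapes are illustrative assumptions):

```python
import torch

def multi_head_attention(Q, K, V, Wq, Wk, Wv, Wo, h: int = 8):
    """Literal implementation of the multi-head attention equation above.

    Wq, Wk, Wv are lists of h per-head projection matrices of shape
    (d_model, d_k); Wo has shape (h * d_k, d_model).
    """
    heads = []
    for i in range(h):
        q, k, v = Q @ Wq[i], K @ Wk[i], V @ Wv[i]                # Q W_i^Q, K W_i^K, V W_i^V
        scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)  # scaled dot-product
        heads.append(torch.softmax(scores, dim=-1) @ v)          # head_i
    return torch.cat(heads, dim=-1) @ Wo                         # Concat(head_1..head_h) W^O
```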
The implementation of our proposed approach is based on the two SOTA models, MambaVision and EfficientFormer, in their versions pre-trained on the ImageNet dataset, to which we added the set of fusion layers described above to improve feature learning.

3.4. Evaluation Metrics

To evaluate the performance of the proposed EMGP-Net architecture, we used three metrics commonly applied in regression tasks. These metrics are described as follows (a computational sketch follows the list):
  • MAE (Mean Absolute Error): Measures the average of the absolute differences between the observed and predicted values:
    $MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$
  • RMSE (Root Mean Squared Error): Measures the square root of the average of the squared differences between the observed and predicted values:
    $RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$
  • PCC (Pearson Correlation Coefficient): Measures the linear correlation between the observed and predicted values:
    $PCC = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2} \, \sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}$
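These three metrics can be computed per gene as in the following sketch:

```python
import numpy as np

def regression_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Compute MAE, RMSE, and PCC for one gene's observed/predicted values."""
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    yc, pc = y_true - y_true.mean(), y_pred - y_pred.mean()  # centered series
    pcc = (yc @ pc) / np.sqrt((yc ** 2).sum() * (pc ** 2).sum())
    return {"MAE": mae, "RMSE": rmse, "PCC": pcc}
```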

4. Experimental Results

The EMGP-Net architecture proposed in this research was designed to predict the expression of 250 genes, selected based on their highest mean expression in the dataset. (The lists of the 250 genes from the HER2+ and STNet datasets can be found in Appendix A.) In this study, we focus on presenting the top 14 predicted genes among the 250. First, we evaluated our model using a leave-one-patient-out approach on the HER2+ dataset. To assess the model’s ability to generalize, we performed external validation in two ways: in the first test, the model was trained on the entire HER2+ dataset and tested on the entire STNet dataset; in the second, the model was trained on the entire STNet dataset and tested on the entire HER2+ dataset. Each evaluation was repeated for all 8 patients in the HER2+ dataset and all 23 patients in the STNet dataset, and we then selected the highest PCC value across all patients for each gene. Image dimensions were fixed at 224 × 224 pixels as input to the model, and we used the MSELoss function as the training objective, monitoring the MAE and RMSE metrics. The equation for the MSELoss is as follows:
$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2.$
The training was performed on eight NVIDIA GeForce RTX 3090 GPUs with 24 GB of memory each (https://www.nvidia.com/, accessed on 20 May 2025). We selected a batch size of 768, and the entire script was implemented using the PyTorch library. The learning rate and weight decay were both set to $10^{-4}$. In the following section, we present the results of our EMGP-Net model.
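A training-loop sketch with these hyperparameters is shown below; the optimizer choice (AdamW), the epoch count, and the dataset and backbone objects are assumptions, and `EMGPNet` refers to the fusion-head sketch given earlier.

```python
import torch
from torch.utils.data import DataLoader

# Reported settings: batch size 768, learning rate and weight decay both 1e-4,
# MSELoss, PyTorch. The optimizer (AdamW), the epoch count, and the
# `train_dataset` / backbone objects are assumptions of this sketch.
model = EMGPNet(mamba_backbone, efficientformer_backbone).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)
criterion = torch.nn.MSELoss()
loader = DataLoader(train_dataset, batch_size=768, shuffle=True)

for epoch in range(50):  # epoch count not reported; placeholder
    for patches, expression in loader:  # (B, 3, 224, 224) patches, (B, 250) targets
        optimizer.zero_grad()
        loss = criterion(model(patches.cuda()), expression.cuda())
        loss.backward()
        optimizer.step()
```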
We applied the Wilcoxon signed-rank test to evaluate whether the observed performance improvements of EMGP-Net over the other models were reliable. This non-parametric test is well suited to our study because it handles paired comparisons without assuming that the differences follow a normal distribution. The Wilcoxon signed-rank test statistic $W$ is computed as follows:
$W = \sum_{i=1}^{n} R_i \cdot \operatorname{sign}(d_i),$
where $d_i = x_i - y_i$ is the difference in PCC between EMGP-Net and the other model for gene $i$, $R_i$ is the rank of $|d_i|$ among all non-zero differences, and $\operatorname{sign}(d_i)$ indicates the direction of the difference.
The p-values resulting from this analysis are reported in the last rows of Table 1, Table 2 and Table 3. Values below the common threshold of 0.05 were considered statistically significant, indicating that EMGP-Net achieved better gene prediction scores for most of the genes tested.
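In practice, the test can be run on the paired per-gene PCC vectors, for example with SciPy; the file names below are hypothetical, and note that SciPy reports the standard rank-sum form of the statistic.

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical files holding one PCC per predicted gene (length-250 arrays)
# for EMGP-Net and a baseline model on the same genes.
pcc_emgp = np.load("pcc_emgp_net.npy")
pcc_base = np.load("pcc_baseline.npy")

stat, p_value = wilcoxon(pcc_emgp, pcc_base)  # paired, non-parametric
print(f"W = {stat:.1f}, p = {p_value:.4g}")   # p < 0.05 -> improvement unlikely due to chance
```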

4.1. Model Trained on the HER2+ Dataset

First, we present the results obtained using the leave-one-patient-out approach with the HER2+ dataset. Table 1 shows the top 14 genes with the highest PCC. We compared four models: EfficientFormer, MambaVision, and two versions of our proposed model, EMGP-Net, which includes an attention mechanism, and EMGP-Net-noAttn, which does not. Both EMGP-Net variants combine EfficientFormer and MambaVision along with the set of added layers described previously. EMGP-Net gave better results in most cases.
Comparing EMGP-Net-noAttn with EMGP-Net demonstrated the impact of the attention mechanism. Although they share the same backbone combination, EMGP-Net uses attention to weigh and combine features, while EMGP-Net-noAttn simply concatenates them and applies fully connected layers. EMGP-Net outperformed EMGP-Net-noAttn on nearly all genes, confirming that attention-based fusion enhances the model’s ability to focus on the most informative representations, thus improving prediction. A sketch of the concatenation-only variant is shown below.
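For comparison with the attention-based fusion sketched earlier, the concatenation-only variant can be sketched as follows; the exact sizes of its fully connected head are assumptions.

```python
import torch
import torch.nn as nn

class EMGPNetNoAttn(nn.Module):
    """Sketch of the ablation: concatenate the two 1024-d branch features and
    apply fully connected layers, with no attention-based fusion."""
    def __init__(self, mamba_backbone: nn.Module, efficientformer_backbone: nn.Module,
                 feat_dim: int = 1024, n_genes: int = 250):
        super().__init__()
        self.mamba = mamba_backbone
        self.effformer = efficientformer_backbone
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 512), nn.GELU(), nn.Linear(512, n_genes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([self.mamba(x), self.effformer(x)], dim=1)  # (B, 2048)
        return self.head(feats)
```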

4.1.1. Comparison of Architectural Components by PCC for Top-Ranked Genes

EMGP-Net predicted 13 out of 14 genes with PCC scores greater than 0.7, higher than the corresponding scores of MambaVision, EfficientFormer, and EMGP-Net-noAttn. The top 14 genes predicted by EMGP-Net included PTMA, GNAS, B2M, HNRNPA2B1, and TPT1, with PCC values of 0.7903, 0.7843, 0.7777, 0.7532, and 0.7360, respectively. MambaVision, by contrast, achieved the highest score for only 1 of the 14 genes, B2M, with a PCC of 0.8049, and scored 0.7763, 0.7674, 0.7363, and 0.7198 for GNAS, PTMA, TPT1, and HNRNPA2B1, respectively. For EfficientFormer, all PCCs were lower than those of EMGP-Net, with values of 0.7777, 0.7746, 0.7661, 0.7266, and 0.7245 for PTMA, B2M, GNAS, HNRNPA2B1, and TPT1, respectively. Overall, EMGP-Net outperformed EfficientFormer on all 14 genes, MambaVision on 13 of the 14 genes, and EMGP-Net-noAttn on all 14 genes. These results demonstrate that combining features from both backbone models and refining them with the attention mechanism greatly improved the PCC scores.

4.1.2. Comparison of Architectural Components by PCC for Common Genes

The common genes among the selected top 14 genes evaluated in all three models included PTMA, GNAS, B2M, HNRNPA2B1, TPT1, XBP1, ACTG1, HLA-B, HLA-DRA, and ACTB. As shown in Table 4, the results indicated that EMGP-Net outperformed EMGP-Net-noAttn, MambaVision, and EfficientFormer in predicting gene expression for the majority of common genes. Specifically, EMGP-Net achieved higher PCC values than MambaVision for 7 out of the 10 common genes. The genes where MambaVision outperformed EMGP-Net were B2M, TPT1, and HLA-DRA, with MambaVision achieving higher PCC scores of 0.8049, 0.7363, and 0.7089, respectively, compared to the 0.7777, 0.7360, and 0.7056 of EMGP-Net. However, when compared to EMGP-Net-noAttn and EfficientFormer, EMGP-Net showed better performance across all genes. Figure 6 shows the visualization of the predictions of six genes with the highest PCCs.

4.2. Quantitative Analysis of the Results

Our method aims to predict the expression of 250 breast cancer-related genes, selected based on their highest mean expression in the dataset.

4.2.1. Analysis of EMGP-Net Results

To evaluate the effectiveness of our model, we categorized the per-gene PCC values into ranges. The results showed that 14 genes had PCC values between 0.7 and 0.8, while 56 genes had PCC values between 0.6 and 0.7. Additionally, 75 genes were in the range of 0.5 to 0.6, 54 genes had PCC values between 0.4 and 0.5, and 32 genes were in the range of 0.3 to 0.4. Furthermore, 15 genes had PCC values between 0.2 and 0.3, and 4 genes had values between 0.1 and 0.2. Notably, all genes were predicted with positive PCC values, indicating that our model successfully predicted gene expression for all selected genes.

4.2.2. Analysis of MambaVision Results

The MambaVision results show the following distribution: 1 gene with a PCC value higher than 0.8, 8 genes with a score between 0.7 and 0.8, and 39 genes with PCC values between 0.6 and 0.7. In addition, 65 genes were in the range of 0.5 to 0.6, 73 genes had PCC values between 0.4 and 0.5, and 39 genes were in the range of 0.3 to 0.4. Finally, 20 genes had PCC values between 0.2 and 0.3, 4 genes had PCC values between 0.1 and 0.2, and 1 gene had a PCC value between 0.0 and 0.1.

4.2.3. Analysis of EfficientFormer Results

The EfficientFormer PCC values were distributed as follows: 6 genes with PCC values between 0.7 and 0.8, and 45 genes with PCC values between 0.6 and 0.7. In addition, 68 genes were in the range of 0.5 to 0.6, 70 genes had PCC values between 0.4 and 0.5, and 35 genes were in the range of 0.3 to 0.4. Finally, 18 genes had PCC values between 0.2 and 0.3, 7 genes had values between 0.1 and 0.2, and 1 gene had a PCC value between 0.0 and 0.1.

4.2.4. Analysis of EMGP-Net-noAttn Results

The EMGP-Net-noAttn variant, which eliminated the attention mechanism and employed simple concatenation, achieved positive PCC values for all genes. Of the genes, 8 had PCC values between 0.7 and 0.8, 49 had a PCC value between 0.6 and 0.7, and 72 had a PCC value between 0.5 and 0.6. Additionally, 66 genes had a PCC that fell between 0.4 and 0.5, 32 had a value between 0.3 and 0.4, 15 had a value between 0.2 and 0.3, and 8 had a value between 0.1 and 0.2. Despite performing slightly worse than the full EMGP-Net, this model still showed a clear improvement over the individual backbone models.
EMGP-Net predicted a total of 145 genes with a PCC value greater than 0.5, outperforming EMGP-Net-noAttn, which predicted 129; MambaVision, which predicted 113; and EfficientFormer, which predicted 119. In this context, EMGP-Net had the highest number of predicted genes in the PCC intervals of 0.7 to 0.8, 0.6 to 0.7, and 0.5 to 0.6, indicating its improved performance in these ranges compared to the other models, as shown in Figure 7.
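The binning used in this analysis can be reproduced with a few lines of NumPy; the input array of per-gene PCCs is hypothetical.

```python
import numpy as np

def pcc_distribution(pcc_values: np.ndarray) -> dict:
    """Bin per-gene PCC values into the 0.1-wide ranges used above.
    `pcc_values` is a hypothetical length-250 array of per-gene PCCs."""
    edges = np.arange(0.0, 0.9, 0.1)  # bin edges 0.0, 0.1, ..., 0.8
    counts, _ = np.histogram(pcc_values, bins=edges)
    labels = [f"{a:.1f}-{a + 0.1:.1f}" for a in edges[:-1]]
    return {
        "bin_counts": dict(zip(labels, counts.tolist())),
        "genes_above_0.5": int((pcc_values > 0.5).sum()),  # e.g., 145 for EMGP-Net
    }
```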

4.3. External Validation

In this section, we discuss the external validation process, where we evaluated our model’s generalizability on a dataset different from the one used for training. This approach allowed us to evaluate the performance of the model on a new set of patients and validate its ability to make accurate predictions on different datasets. Our approach was compared to GeNetFormer [11] and ST-Net [9] after they were retrained.

4.3.1. Model Evaluation on the STNet Dataset

This part presents the results obtained when the model was trained on the entire HER2+ dataset and tested on all patients from the STNet dataset, selecting the best PCC for each gene across all patients. Table 2 shows that EMGP-Net had the best PCC values for the set of the top 14 genes predicted compared to GeNetFormer and ST-Net. The set of the top 14 genes predicted by EMGP-Net included ERBB2, ACTG1, CALR, RPL23, and GNAS, with PCC values of 0.7145, 0.7051, 0.7047, 0.6973, and 0.6962, respectively. For GeNetFormer, the genes were DDX5, ACTG1, CPB1, PTMA, and RPL23, with PCC values of 0.7069, 0.6510, 0.6384, 0.6235, and 0.6130, respectively. For ST-Net, the genes were GNAS, RPL23, PTPRF, ACTG1, and DDX5, with PCC values of 0.6708, 0.6592, 0.6503, 0.6460, and 0.6406, respectively.
We present a comparison of the performance of the EMGP-Net, GeNetFormer, and ST-Net models based on common genes. Table 5 shows the PCC values for the 9 common genes out of the top 14 genes for all three models. The results show that EMGP-Net outperformed GeNetFormer and ST-Net. We can see that EMGP-Net predicted eight out of nine common genes with higher PCC values compared to GeNetFormer and nine out of nine genes with higher PCC values compared to ST-Net. Specifically, EMGP-Net achieved the highest PCC scores for genes such as ACTG1, CALR, RPL23, GNAS, and PTPRF, with PCC values of 0.7051, 0.7047, 0.6973, 0.6962, and 0.6867, respectively, while GeNetFormer showed the best PCC in predicting only DDX5, with a PCC value of 0.7069.

4.3.2. Model Evaluation on the HER2+ Dataset

This part presents the results obtained when the model was trained on the entire STNet dataset and tested on all patients from the HER2+ dataset, selecting the best PCC for each gene across all patients. Table 3 shows that EMGP-Net had the best PCC values for the set of the top 14 genes predicted compared to GeNetFormer and ST-Net. The set of the top 14 genes predicted by EMGP-Net included ERBB2, S100A11, ATP5E, HSP90B1, and LGALS3, with PCC values of 0.7285, 0.6686, 0.6650, 0.6404, and 0.6347, respectively. For GeNetFormer, the top 14 genes included ATP5E, S100A11, ERBB2, PTPRF, and HSP90B1, with PCC values of 0.6746, 0.6434, 0.6141, 0.6115, and 0.5986, respectively. For ST-Net, the top 14 genes included ATP5E, ERBB2, S100A11, PTPRF, and LGALS3, with PCC values of 0.6719, 0.6620, 0.6374, 0.6227, and 0.5918, respectively.
We compared the performance of EMGP-Net, GeNetFormer, and ST-Net on the set of the common genes among the top 14 predicted genes. Table 6 shows the PCC values for ERBB2, S100A11, ATP5E, HSP90B1, LGALS3, PTPRF, and PSMB4. The results show that EMGP-Net outperformed both GeNetFormer and ST-Net for most of these genes. It is clear that EMGP-Net showed the best performance, especially for ERBB2, where it achieved a PCC value of 0.7285, compared to the 0.6620 of ST-Net and the 0.6141 of GeNetFormer. For S100A11, EMGP-Net again outperformed the other models, with a PCC of 0.6686, higher than the 0.6374 of ST-Net and the 0.6434 of GeNetFormer. Similarly, for HSP90B1, EMGP-Net achieved a PCC of 0.6404, higher than both the 0.5903 of ST-Net and the 0.5986 of GeNetFormer. GeNetFormer performed better on one gene, ATP5E, with a PCC value of 0.6746, and ST-Net performed better on one gene, PTPRF, with a PCC value of 0.6227.

5. Discussion

The increasing interest in gene expression prediction using WSIs combined with ST data shows the importance of improving predictive models in cancer research, especially for breast cancer research, diagnosis, and treatment, and this study contributes towards addressing the limitation posed by the high cost of ST technologies.
Several studies have proposed different architectures. He et al. [9] used DenseNet-121. BrST-Net [10] is a CNN-based framework with EfficientNet-b0 as the best-performing model. GeNetFormer [11] is a transformer-based framework, with EfficientFormer performing best among the eight transformer models compared. TRIPLEX [22] adopts another method and uses a three-encoder approach based on ResNet18, in which the global encoder also integrates transformer blocks. HisToGene [12] integrates multi-head attention and vision transformers. GNNs have been widely used; for example, Zeng et al. [14] proposed an architecture based on a GNN and transformer, as did the architecture introduced in [13], while Jia et al. [15] used a GAT to introduce THItoGene. ErwaNet [23] follows a different strategy, integrating modules such as an ERM and a WAM and edges such as K-nearest neighbor, K-nearest similarity, and percent similarity edges. Hypergraphs were also used in several studies, such as PH2ST [16], which uses ViT and a universal histology image encoder, and the HGGEP architecture [17], which includes a hypergraph association module, a gradient enhancement module, a ShuffleNet V2 backbone, a convolutional block attention module, and ViT. EGN [18] and EGGN [19] are based on exemplar guidance learning. These studies followed different strategies and training approaches: some used a leave-one-patient-out approach, others used cross-validation, and others simply divided the datasets into training, validation, and test sets. Different datasets have also been used across studies, such as the HER2+, STNet, and 10x Genomics datasets, with different numbers of targeted genes. In this context, we introduced EMGP-Net, a hybrid deep learning model that combines MambaVision and EfficientFormer, designed to predict 250 genes. By leveraging features from both models and utilizing an attention-based fusion layer, we aimed to improve the prediction of breast cancer gene expression. We employed the leave-one-patient-out approach for internal validation and cross-dataset training and testing for external validation.
Our results showed that EMGP-Net outperformed both individual models, MambaVision and EfficientFormer, in internal validation on the HER2+ dataset. It had the highest PCC scores, ranging from 0.7002 (CD24) to 0.7903 (PTMA) for the top 14 genes, compared to EfficientFormer, which generally lagged behind our model with scores ranging from 0.6834 (S100A11) to 0.7777 (PTMA). It also outperformed MambaVision on 13 out of 14 genes, with MambaVision’s top PCC score being 0.8049 (B2M). Of the 14 genes, 10 were common to all models, and we compared their PCCs: our model outperformed EfficientFormer on all of them, including PTMA, GNAS, B2M, HNRNPA2B1, and TPT1, while MambaVision outperformed EMGP-Net on only three genes, namely, B2M, TPT1, and HLA-DRA.
External validation was used to evaluate the performance of EMGP-Net on two datasets: the HER2+ and STNet, representing different breast cancer subtypes. It was trained on one dataset and tested on the other. The model achieved the highest PCC for most of the top 14 genes and showed the best external validation results compared to other retrained models from other studies.
  • When trained on the HER2+ dataset and tested on the STNet dataset, our model outperformed GeNetFormer and ST-Net on nearly all of the top 14 genes, with PCC values ranging from 0.6563 (KRT19) to 0.7145 (ERBB2), compared to GeNetFormer’s values, which ranged from 0.5250 (KRT19) to 0.7069 (DDX5), and ST-Net’s values, which ranged from 0.5749 (HLA-DRA) to 0.6708 (GNAS). On the 9 common genes among the top 14, our model outperformed ST-Net on all of them, including ACTG1, CALR, RPL23, GNAS, and PTPRF, while GeNetFormer outperformed our model on only 1 gene, DDX5.
  • When trained on the STNet dataset and tested on the HER2+ dataset, our model outperformed GeNetFormer and ST-Net on most of the top 14 genes, with PCC scores ranging from 0.5465 (GNAS) to 0.7285 (ERBB2), compared to GeNetFormer’s values, which ranged from 0.5185 (COL1A2) to 0.6746 (ATP5E), and ST-Net’s values, which ranged from 0.5287 (LASP1) to 0.6719 (ATP5E). Of the 7 common genes among the top 14, our model outperformed GeNetFormer on 6, namely, ERBB2, S100A11, HSP90B1, LGALS3, PTPRF, and PSMB4, with only ATP5E better predicted by GeNetFormer; it also outperformed ST-Net on 6, namely, ERBB2, S100A11, ATP5E, HSP90B1, LGALS3, and PSMB4, with only PTPRF better predicted by ST-Net.
When compared to other studies that applied different approaches, EMGP-Net achieved the highest PCC value of 0.7903 (PTMA) using the HER2+ dataset: THItoGene reached a top PCC value of 0.7470 (FN1), HGGEP achieved a top PCC value of 0.6520 (MYL12B), and Hist2ST achieved a top PCC value of 0.7310 (FN1) on the same dataset. Using the STNet dataset, BrST-Net reached a top PCC value of 0.6325 (B2M), while SEPAL had a top PCC value of 0.6390 (ENSG00000145824). EMGP-Net, tested on the STNet dataset as an external validation, achieved a top PCC value of 0.7145 (ERBB2).
Notably, EMGP-Net performed well in predicting the expression of genes such as PTMA, GNAS, B2M, HNRNPA2B1, TPT1, and XBP1 out of 250 targeted genes, which are important biomarkers in breast cancer prognosis. When compared to the results of MambaVision and EfficientFormer, EMGP-Net had the best PCC values for most of the genes evaluated. The model’s ability to integrate features from both MambaVision and EfficientFormer using a multi-head attention mechanism contributed to its improved performance.
Although our results are encouraging, this study has some limitations. First, the HER2+ and STNet datasets are smaller than the datasets commonly used in computer vision, which may limit how well the model transfers to other patient groups. Differences in how tissue slides are stained, the type of scanner used, and the unique characteristics of each patient could affect the model’s performance on new data. While we tested the model on both the HER2+ and STNet datasets to assess its effectiveness in various cases, larger and more diverse datasets would give more reliable results. Additionally, the datasets used in this study are specific to particular types of breast cancer, which could make it difficult for the model to perform well on other cancer types; for instance, what the model learned from the HER2+ samples may not be directly applicable to triple-negative or luminal A subtypes. We may need to create specialized versions of the model for different subtypes or develop a more general model. Another limitation is that the model is not easily interpretable from a biological point of view: even though EMGP-Net makes accurate predictions, it does not explain which features drive its decisions. In the future, techniques such as Grad-CAM could help us better understand how the model works and help clinicians trust its predictions.

6. Conclusions

In this study, we introduced EMGP-Net, a hybrid deep learning architecture that combines MambaVision and EfficientFormer to predict gene expression from breast cancer WSIs. By leveraging the capabilities of both models and incorporating a multi-head attention mechanism, EMGP-Net improved gene expression predictions and outperformed competing models in internal and external validation tests. Our results showed that EMGP-Net achieved higher PCC values than MambaVision and EfficientFormer and improved the prediction of the 250 genes, surpassing both models in most cases. External validation on the HER2+ and STNet datasets demonstrated the reliability of EMGP-Net, which outperformed both GeNetFormer and ST-Net in most cases. Despite its high performance, our study has some limitations. Future work will focus on increasing the number of predicted genes and exploring the integration of explainable AI techniques to improve interpretability; synthetic data generated by generative models may also enhance training diversity and performance. EMGP-Net provides a solid foundation for advancing gene expression prediction from histopathological images in breast cancer research.

Author Contributions

Conceptualization, O.T. and M.A.A.; methodology, O.T. and M.A.A.; validation, O.T. and M.A.A.; formal analysis, O.T. and M.A.A.; writing—original draft preparation, O.T.; writing—review and editing, M.A.A.; funding acquisition, M.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was enabled in part by support provided by the Natural Sciences and Engineering Research Council of Canada (NSERC), funding reference number RGPIN-2024-05287, and by the AI in Health Research Chair at the Université de Moncton.

Institutional Review Board Statement

This research did not require Institutional Review Board (IRB) approval as it exclusively utilized publicly available data.

Informed Consent Statement

Not applicable.

Data Availability Statement

This work uses two public datasets: 1. HER2+ dataset (https://www.synapse.org/Synapse:syn52503858/files/, accessed on 20 May 2025). 2. STNet dataset (https://data.mendeley.com/datasets/29ntw7sh4r/5, accessed on 20 May 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
APEG: Atypical Position Encoding Generator
AuxNet: Auxiliary Network
BN: Batch Normalization
CBAM: Convolutional Block Attention Module
CNN: Convolutional Neural Network
EB: Exemplar Bridging
EMGP-Net: EfficientFormer (E), MambaVision (M), gene expression prediction (GP) network (Net)
EMGP-Net-noAttn: EfficientFormer (E), MambaVision (M), gene expression prediction (GP) network (Net), no attention (noAttn)
ERM: Edge Relational Module
GAT: Graph Attention Network
GeLU: Gaussian Error Linear Unit
GEM: Gradient Enhancement Module
GNN: Graph Neural Network
HAM: Hypergraph Association Module
H&E: Hematoxylin and Eosin
LN: Layer Normalization
MAE: Mean Absolute Error
PCC: Pearson Correlation Coefficient
RMSE: Root Mean Squared Error
SOTA: State Of The Art
SSM: State-Space Model
ST: Spatial Transcriptomics
ViT: Vision Transformer
WAM: Window Attention Module
WSIs: Whole-Slide Images

Appendix A. Lists of the 250 Genes Included in This Study Across the Two Datasets: HER2+ and STNet

This appendix provides complete lists of the 250 genes used in this study. Appendix A.1 presents the genes selected from the HER2+ dataset, and Appendix A.2 presents the genes selected from the STNet dataset.

Appendix A.1. List of the 250 Genes from the HER2+ Dataset

This list contains the 250 genes selected from the HER2+ dataset that were used to train and evaluate the gene expression prediction models described in this study.
Table A1. List of the 250 genes from the HER2+ dataset. The last row uses the asterisk (*) to fill empty cells and complete the seven-column layout.
PTMA, GNAS, B2M, HNRNPA2B1, TPT1, XBP1, ACTG1
HLA-B, TMSB10, DDX5, HLA-DRA, ACTB, S100A11, CD24
HSP90B1, PSMB4, COX6C, TUBA1B, EIF4G2, PRDX1, HLA-C
HLA-A, LAPTM4A, VMP1, HSP90AA1, UBC, ATP5E, CALM2
SCGB2A2, NACA, FTH1, COX7C, CALR, CCT3, FASN
PEBP1, HSPB1, PSAP, SPINT2, BEST1, PFN1, PLXNB2
ATP5B, SERF2, LGALS3, P4HB, MYH9, CRIP2, CHCHD2
ATP1A1, ERBB2, KRT19, CD74, FN1, GAPDH, HSP90AB1
HSPA8, PTPRF, FTL, LSM4, KDELR1, CFL1, VCP
MIDN, PPP1CA, SLC9A3R1, PABPC1, APOE, GRB7, RACK1
EEF2, TUBB, JTB, SH3BGRL3, TXNIP, SCD, OAZ1
LASP1, ATG10, SPDEF, SEPW1, VIM, MDK, CTSB
SEC61A1, GRINA, IDH2, UBE2M, COPS9, MMACHC, MZT2B
JUP, UBA52, PSMD8, SLC2A4RG, MLLT6, SSR2, DBI
TAPBP, CIB1, PPDPF, CST3, TSPO, CD63, COL1A1
PTBP1, AES, TAGLN2, ATP5G2, MYL6, NUCKS1, GNAI2
PLD3, GNB2, LMAN2, HM13, RALY, SNRPB, SDC1
ENO1, COPE, PHB, GRN, HLA-E, STARD10, COL1A2
A2M, ALDOA, NUPR1, LAPTM5, EIF3B, EDF1, MAPKAPK2
SERINC2, FLNA, MIEN1, SYNGR2, MUC1, COX4I1, EIF4G1
C3, PERP, H1FX, GPX4, C1QB, APOC1, DHCR24
PRSS8, COX6B1, IGLC2, KRT18, ERGIC1, GUK1, PGAP3
IGLC3, IGHG3, FAU, UQCRQ, UQCR11, ZYX, CLDN4
CD81, CD99, NDUFA3, CISD3, RRBP1, COX5B, S100A6
LGALS3BP, PCGF2, TYMP, TIMP1, NDUFB9, ATP6V0B, AP2S1
COX8A, FNBP1L, COL3A1, STARD3, PTMS, IFI27, KRT7
PFKL, CTSD, RABAC1, PSMB3, PSMD3, LMNA, H2AFJ
ARHGDIA, SPARC, EEF1D, SLC25A6, INTS1, ACTN4, IGHA1
CHPF, ELOVL1, SSR4, ATP6AP1, CYBA, TAGLN, C1QA
PRRC2A, RHOC, IGHG1, MMP14, PPP1R1B, CALML5, BSG
CLDN3, AEBP1, LY6E, TRAF4, IGKC, BGN, NBL1
FKBP2, AP000769.1, ROMO1, COL6A2, IGHM, C12orf57, MYL9
BCAP31, SCAND1, TCEB2, PFDN5, BST2, KIAA0100, NDUFB7
MUCL1, LGALS1, POSTN, TFF3, MGP, COL18A1, NDUFA11
IGFBP2, KRT81, SUPT6H, ORMDL3, S100A9, MUC6, AZGP1
S100A14, S100A8, IGHG4, ADAM15, ISG15, *, *

Appendix A.2. List of the 250 Genes from the STNet Dataset

This list contains the 250 genes selected from the STNet dataset that were used to train and evaluate the gene expression prediction models described in this study.
Table A2. List of the 250 genes from the STNet dataset. Gene names marked as N/A indicate names that were originally ambiguous. The last row uses the asterisk (*) to fill empty cells and complete the seven-column layout.
Gene NameGene NameGene NameGene NameGene NameGene NameGene Name
ERBB2ACTG1CALRRPL23GNASPSMD3PTPRF
TMSB10GAPDHTAGLN2DDX5HSPB1PTMAKRT19
P4HBPRDX1PFN1HLA-CS100A11RPL28ENSG00000203812
B2MHLA-DRACPB1NHERF1RPLP0S100A9RPL19
HLA-BC4BCALML5ACTBS100A8RPLP2TMSB4X
APOEGRINAENO1RPL35MGPTIMP1HLA-A
RPS11IGLL5PRSS8ENSG00000272196COX6CATP1A1CYBA
RPS19RPLP1RPS28RPS18JUPRPS2UBA52
TUBA1BSELENOWIFI27ELF3FTLN/ARPL13
RPL9ATP5F1EN/ARPL10CST3RPS4XRPL38
TAPBPSYNGR2RPS20CD74SERF2FASNC1QA
CLDN3N/ASPDEFRACK1UBCBCAP31PABPC1
RPS6N/AFLNARPS13H1-10SDC1EIF4G1
FTH1RPS9CRIP2RPS27AAEBP1CLUS100A6
RPL8FN1SEC61A1MYL6RPL15RPS17PPP1CA
GPX4RPS7BGNRPL13AATP6V0BBSGTPT1
A2MBST2PPDPFMYL9VIMRPS15AXBP1
COL1A1RPS14STARD10RPS12RPS3ISG15RPS15
ENSG00000169100MZT2AHSP90AB1CD81LY6EIFITM3MZT2B
EIF4A1PFDN5RPS8COX8AUBBLGALS3BPRPL23A
EEF2RPL29N/ATAGLNEVLN/ARPL3
MUC1SPARCN/AAPOC1H3-3BRPS23N/A
KRT8RPS21UQCR11TSPORPL27UQCRQGNB2
RPL34ARHGDIALAPTM5SNHG25RPL5N/AN/A
RHOCTUFMRPL35ARPL14EDF1N/ACFL1
RPL18AHLA-ESSR2FXYD3H2AJFAUAZGP1
BEST1COL1A2LMNARPL12GUK1COX4I1OAZ1
RPL37APLXNB2ELOBGAS5N/AGRNMALAT1
RPS24IGFBP2COX6B1CTSBTFF3RPL24ALDOA
RPL32RPS16PRDX2EEF1DRPL4RPL31CCND1
NDUFA13RPL7ARPL11RPL36NBEAL1EIF5APLD3
RPL27ACD63SH3BGRL3ATP6AP1PSAPZNF90TLE5
RPS29RPL7RPS25KRT18RPS5NDUFA11CTSD
NDUFB9SSR4C3RPS27N/AENSG00000279274RPL37
RPS3AENSG00000255823POLR2LIFI6ENSG00000269028RPS10RPL30
ENSG00000279483C12orf57GNAI2TFF1RPL18**
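When reproducing experiments with these panels, one practical step is restricting a spot-by-gene expression matrix to the listed genes while skipping entries that are unavailable (for example, the N/A placeholders above). Below is a minimal pandas sketch; the column names and the shortened stand-in panel are hypothetical, not a prescribed pipeline.

```python
import numpy as np
import pandas as pd

# Hypothetical spot-by-gene expression matrix (rows: spots, cols: gene symbols).
expr = pd.DataFrame(np.random.rand(4, 5),
                    columns=["ERBB2", "GNAS", "PTMA", "XIST", "B2M"])

# The full 250-gene panel would be read from Table A1/A2; short stand-in here.
panel = ["ERBB2", "GNAS", "PTMA", "B2M", "TPT1"]

# Keep only panel genes actually present (skips N/A or missing identifiers).
present = [g for g in panel if g in expr.columns]
targets = expr[present]
print(f"kept {len(present)}/{len(panel)} genes:", list(targets.columns))
```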

Figure 1. Representative whole-slide images from the HER2+ and STNet datasets: (a) Sample WSIs from the HER2+ dataset. (b) Sample WSIs from the STNet dataset.
Figure 2. Overview of the EMGP-Net architecture. The workflow comprises multiple stages: extracting patches from WSIs (the red frame shows an extracted patch and its location in the original image), extracting features using MambaVision and EfficientFormer backbones, and utilizing multi-head attention followed by activation functions (GeLU), layer normalization, and fully connected layers to output the final gene expression predictions.
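To make the fusion stage of Figure 2 concrete, the following PyTorch sketch shows one plausible way to mix the two backbones' feature vectors with multi-head attention before a fully connected regression head. The class name FusionHead, the feature widths (640 and 448), the shared width of 512, and the two-token fusion scheme are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Illustrative fusion head: mixes two backbone feature vectors with
    multi-head attention, then regresses 250 gene expression values."""
    def __init__(self, dim_mamba=640, dim_eff=448, dim=512, n_heads=8, n_genes=250):
        super().__init__()
        # Project each backbone's features to a shared width (assumed sizes).
        self.proj_mamba = nn.Linear(dim_mamba, dim)
        self.proj_eff = nn.Linear(dim_eff, dim)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.GELU(),
            nn.LayerNorm(dim),
            nn.Linear(dim, n_genes),
        )

    def forward(self, f_mamba, f_eff):
        # Treat the two projected feature vectors as a 2-token sequence.
        tokens = torch.stack(
            [self.proj_mamba(f_mamba), self.proj_eff(f_eff)], dim=1)
        mixed, _ = self.attn(tokens, tokens, tokens)
        mixed = self.norm(mixed + tokens)      # residual + layer norm
        return self.head(mixed.flatten(1))     # (batch, 250) predictions

# Smoke test with random "backbone features" for a batch of 4 patches.
if __name__ == "__main__":
    head = FusionHead()
    out = head(torch.randn(4, 640), torch.randn(4, 448))
    print(out.shape)  # torch.Size([4, 250])
```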
Figure 3. Architecture of the MambaVision block. The block integrates two parallel branches. One branch applies a linear layer, a 1D convolutional layer, and a selective SSM; the other branch applies a linear layer followed by only a 1D convolution. The two outputs are then combined and passed through a final linear layer.
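Below is a minimal sketch of this two-branch mixer, assuming a toy token width and substituting an identity stand-in for the selective SSM; it traces the data flow of Figure 3 rather than reproducing the published MambaVision code.

```python
import torch
import torch.nn as nn

class MambaVisionBlockSketch(nn.Module):
    """Simplified sketch of the two-branch mixer in Figure 3. The real
    selective SSM (Mamba) is replaced by an identity stand-in here."""
    def __init__(self, dim=256, conv_kernel=3):
        super().__init__()
        half = dim // 2
        self.in_ssm = nn.Linear(dim, half)     # branch 1: linear ...
        self.conv_ssm = nn.Conv1d(half, half, conv_kernel, padding="same")
        self.ssm = nn.Identity()               # ... selective SSM placeholder
        self.in_skip = nn.Linear(dim, half)    # branch 2: linear ...
        self.conv_skip = nn.Conv1d(half, half, conv_kernel, padding="same")
        self.out = nn.Linear(dim, dim)         # final linear after concat

    def forward(self, x):                      # x: (batch, tokens, dim)
        a = self.in_ssm(x).transpose(1, 2)     # to (batch, half, tokens)
        a = self.ssm(self.conv_ssm(a)).transpose(1, 2)
        b = self.in_skip(x).transpose(1, 2)
        b = self.conv_skip(b).transpose(1, 2)
        return self.out(torch.cat([a, b], dim=-1))

if __name__ == "__main__":
    blk = MambaVisionBlockSketch()
    print(blk(torch.randn(2, 49, 256)).shape)  # torch.Size([2, 49, 256])
```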
Figure 4. Architecture of the meta transformer block, including both MB4D and MB3D variants. The MB4D variant uses pooling followed by two convolutional layers with 1 × 1 kernels, batch normalization, and a GeLU activation function. The MB3D variant instead uses a sequence of linear layers alternating with layer normalization and GeLU, and projects the input into a query (Q), key (K), and value (V) before final integration through linear transformations.
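As an illustration of the MB4D pathway, here is a minimal PyTorch sketch; the channel width of 96 and the MLP expansion ratio of 4 are assumed for demonstration, not taken from the paper's configuration.

```python
import torch
import torch.nn as nn

class MB4DSketch(nn.Module):
    """Sketch of the 4D meta transformer block (MB4D) from Figure 4:
    a pooling-based token mixer followed by a 1x1-conv MLP."""
    def __init__(self, dim=96, mlp_ratio=4):
        super().__init__()
        # Pooling token mixer; subtracting the input keeps only the "mixing".
        self.pool = nn.AvgPool2d(3, stride=1, padding=1, count_include_pad=False)
        hidden = dim * mlp_ratio
        self.mlp = nn.Sequential(
            nn.Conv2d(dim, hidden, 1), nn.BatchNorm2d(hidden), nn.GELU(),
            nn.Conv2d(hidden, dim, 1), nn.BatchNorm2d(dim),
        )

    def forward(self, x):                 # x: (batch, dim, H, W)
        x = x + (self.pool(x) - x)        # token mixing, residual form
        return x + self.mlp(x)            # channel MLP with residual

if __name__ == "__main__":
    print(MB4DSketch()(torch.randn(2, 96, 14, 14)).shape)
```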
Figure 5. Architecture of the multi-head attention mechanism. The inputs are first passed through linear layers, then processed in parallel by several attention heads (denoted h). The outputs of all heads are concatenated and passed through a final linear layer.
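The computation drawn in Figure 5 can be written compactly. The sketch below implements standard scaled dot-product multi-head attention from scratch; the random weight matrices are stand-ins, and dropout and masking are omitted.

```python
import math
import torch

def multi_head_attention(x, w_q, w_k, w_v, w_o, n_heads):
    """Minimal multi-head attention as drawn in Figure 5."""
    B, T, D = x.shape
    d_head = D // n_heads
    # Linear projections, then split the channel dim into heads.
    q, k, v = (x @ w for w in (w_q, w_k, w_v))
    q, k, v = (t.view(B, T, n_heads, d_head).transpose(1, 2) for t in (q, k, v))
    # Scaled dot-product attention, computed in parallel over heads.
    scores = (q @ k.transpose(-2, -1)) / math.sqrt(d_head)
    out = scores.softmax(dim=-1) @ v
    # Concatenate heads and apply the final linear layer.
    return out.transpose(1, 2).reshape(B, T, D) @ w_o

if __name__ == "__main__":
    D, H = 512, 8
    ws = [torch.randn(D, D) / math.sqrt(D) for _ in range(4)]
    print(multi_head_attention(torch.randn(2, 2, D), *ws, n_heads=H).shape)
```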
Figure 6. Visualization of the top 6 genes predicted by EMGP-Net using the HER2+ dataset with a leave-one-patient-out approach. Each pair of images shows the ground truth on the left and the corresponding prediction on the right for one gene. The color indicates gene expression levels as standard deviations from the mean. The corresponding PCC values are shown for each gene.
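For reference, the per-gene PCC reported alongside these visualizations can be computed as below. The toy data and the pooling of all spots into a single array are illustrative assumptions; the paper's exact aggregation across patients may differ.

```python
import numpy as np
from scipy.stats import pearsonr

def per_gene_pcc(y_true, y_pred):
    """PCC between measured and predicted expression, one value per gene.
    y_true, y_pred: arrays of shape (n_spots, n_genes)."""
    return np.array([pearsonr(y_true[:, g], y_pred[:, g])[0]
                     for g in range(y_true.shape[1])])

# Toy example: 100 spots x 250 genes of random data.
rng = np.random.default_rng(0)
truth = rng.normal(size=(100, 250))
pred = truth + rng.normal(scale=0.5, size=(100, 250))  # noisy "predictions"
pcc = per_gene_pcc(truth, pred)
top = np.argsort(pcc)[::-1][:6]
print("top-6 gene indices:", top, "PCCs:", np.round(pcc[top], 3))
```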
Figure 7. Distribution of PCC values for gene expression predictions from each model: (a) PCC distribution for EMGP-Net with attention mechanism. (b) PCC distribution for MambaVision. (c) PCC distribution for EfficientFormer. (d) PCC distribution for EMGP-Net without attention mechanism. Each histogram shows the number of genes that fell within specific PCC value intervals. The x-axis represents the PCC ranges, and the y-axis shows the number of genes in each range.
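The histograms in Figure 7 simply count genes per PCC interval. A minimal sketch of that binning, using random stand-in PCC values and an assumed bin width of 0.1:

```python
import numpy as np

# Bin per-gene PCCs the way Figure 7 does (counts per PCC interval).
rng = np.random.default_rng(2)
pcc = rng.uniform(-0.1, 0.8, size=250)          # stand-in per-gene PCCs
counts, edges = np.histogram(pcc, bins=np.arange(-0.2, 0.9, 0.1))
for lo, hi, c in zip(edges[:-1], edges[1:], counts):
    print(f"[{lo:+.1f}, {hi:+.1f}): {c} genes")
```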
Table 1. Overview of performance comparison between EMGP-Net, EMGP-Net-noAttn, EfficientFormer, and MambaVision. The highest PCC values across the different models for the top 14 predicted genes are in bold and underlined. The p-values at the bottom show the statistical differences between each model and EMGP-Net. Values < 0.05 indicate significance.
Genes | EfficientFormer | MambaVision | EMGP-Net-noAttn | EMGP-Net
Gene 1 | 0.7777 (PTMA) | 0.8049 (B2M) | 0.7791 (PTMA) | 0.7903 (PTMA)
Gene 2 | 0.7746 (B2M) | 0.7763 (GNAS) | 0.7768 (B2M) | 0.7843 (GNAS)
Gene 3 | 0.7661 (GNAS) | 0.7674 (PTMA) | 0.7700 (GNAS) | 0.7777 (B2M)
Gene 4 | 0.7266 (HNRNPA2B1) | 0.7363 (TPT1) | 0.7356 (TPT1) | 0.7532 (HNRNPA2B1)
Gene 5 | 0.7245 (TPT1) | 0.7198 (HNRNPA2B1) | 0.7331 (HNRNPA2B1) | 0.7360 (TPT1)
Gene 6 | 0.7075 (ACTG1) | 0.7089 (HLA-DRA) | 0.7271 (ACTG1) | 0.7339 (XBP1)
Gene 7 | 0.6965 (XBP1) | 0.7042 (ACTG1) | 0.7237 (XBP1) | 0.7318 (ACTG1)
Gene 8 | 0.6964 (HLA-DRA) | 0.7032 (HLA-B) | 0.7005 (HLA-B) | 0.7228 (HLA-B)
Gene 9 | 0.6938 (CD24) | 0.7010 (XBP1) | 0.6959 (HLA-DRA) | 0.7122 (TMSB10)
Gene 10 | 0.6929 (HLA-B) | 0.6921 (COX6C) | 0.6951 (ACTB) | 0.7085 (DDX5)
Gene 11 | 0.6868 (TMSB10) | 0.6832 (VMP1) | 0.6873 (TMSB10) | 0.7056 (HLA-DRA)
Gene 12 | 0.6859 (DDX5) | 0.6809 (ACTB) | 0.6826 (TUBA1B) | 0.7020 (ACTB)
Gene 13 | 0.6839 (ACTB) | 0.6789 (PSMB4) | 0.6826 (COX6C) | 0.7016 (S100A11)
Gene 14 | 0.6834 (S100A11) | 0.6780 (NACA) | 0.6799 (DDX5) | 0.7002 (CD24)
p-value | 0.0001 (<0.05) | 0.0009 (<0.05) | 0.0001 (<0.05) | N/A
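As a hedged illustration of how such p-values can be obtained, the sketch below compares two models' per-gene PCC vectors with a paired Wilcoxon signed-rank test on synthetic values; both the choice of test and the data are assumptions for demonstration, not the paper's stated procedure.

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical per-gene PCC vectors for two models (250 genes each).
rng = np.random.default_rng(1)
pcc_emgp = rng.uniform(0.2, 0.8, size=250)
pcc_other = pcc_emgp - rng.uniform(0.0, 0.05, size=250)  # slightly worse

# Paired test on the per-gene differences (test choice is an assumption).
stat, p = wilcoxon(pcc_emgp, pcc_other)
print(f"p-value = {p:.4g}  ->  significant at 0.05: {p < 0.05}")
```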
Table 2. Overview of EMGP-Net performance vs. ST-Net and GeNetFormer performance when trained on the HER2+ dataset and tested on the STNet dataset. The highest PCC values across the different models for the top 14 predicted genes are in bold and underlined. The p-values at the bottom show the statistical differences between each model and EMGP-Net. Values < 0.05 indicate significance.
Genes | ST-Net | GeNetFormer | EMGP-Net
Gene 1 | 0.6708 (GNAS) | 0.7069 (DDX5) | 0.7145 (ERBB2)
Gene 2 | 0.6592 (RPL23) | 0.6510 (ACTG1) | 0.7051 (ACTG1)
Gene 3 | 0.6503 (PTPRF) | 0.6384 (CPB1) | 0.7047 (CALR)
Gene 4 | 0.6460 (ACTG1) | 0.6235 (PTMA) | 0.6973 (RPL23)
Gene 5 | 0.6406 (DDX5) | 0.6130 (RPL23) | 0.6962 (GNAS)
Gene 6 | 0.6325 (PRDX1) | 0.5974 (PTPRF) | 0.6894 (PSMD3)
Gene 7 | 0.6274 (TAGLN2) | 0.5943 (GNAS) | 0.6867 (PTPRF)
Gene 8 | 0.6273 (CALR) | 0.5864 (CALR) | 0.6842 (TMSB10)
Gene 9 | 0.6235 (HSPB1) | 0.5840 (HSPB1) | 0.6835 (GAPDH)
Gene 10 | 0.6201 (PTMA) | 0.5701 (TMSB10) | 0.6814 (TAGLN2)
Gene 11 | 0.6144 (CPB1) | 0.5638 (TAGLN2) | 0.6724 (DDX5)
Gene 12 | 0.6027 (NHERF1) | 0.5344 (P4HB) | 0.6645 (HSPB1)
Gene 13 | 0.5908 (ENSG00000203812) | 0.5307 (PRDX1) | 0.6588 (PTMA)
Gene 14 | 0.5749 (HLA-DRA) | 0.5250 (KRT19) | 0.6563 (KRT19)
p-value | 0.0001 (<0.05) | 0.0001 (<0.05) | N/A
Table 3. Overview of EMGP-Net performance vs. ST-Net and GeNetFormer performance when trained on the STNet dataset and tested on the HER2+ dataset. The highest PCC values across the different models for the top 14 predicted genes are in bold and underlined. The p-values at the bottom show the statistical differences between each model and EMGP-Net. Values < 0.05 indicate significance.
Genes | ST-Net | GeNetFormer | EMGP-Net
Gene 1 | 0.6719 (ATP5E) | 0.6746 (ATP5E) | 0.7285 (ERBB2)
Gene 2 | 0.6620 (ERBB2) | 0.6434 (S100A11) | 0.6686 (S100A11)
Gene 3 | 0.6374 (S100A11) | 0.6141 (ERBB2) | 0.6650 (ATP5E)
Gene 4 | 0.6227 (PTPRF) | 0.6115 (PTPRF) | 0.6404 (HSP90B1)
Gene 5 | 0.5918 (LGALS3) | 0.5986 (HSP90B1) | 0.6347 (LGALS3)
Gene 6 | 0.5903 (HSP90B1) | 0.5967 (CST3) | 0.6262 (CD24)
Gene 7 | 0.5880 (CST3) | 0.5572 (ACTG1) | 0.6049 (PTPRF)
Gene 8 | 0.5812 (KRT19) | 0.5449 (MYH9) | 0.5927 (FN1)
Gene 9 | 0.5750 (PSMB4) | 0.5400 (PSMB4) | 0.5905 (PTMA)
Gene 10 | 0.5662 (GNAS) | 0.5393 (LGALS3) | 0.5832 (FTH1)
Gene 11 | 0.5353 (EEF2) | 0.5384 (KRT19) | 0.5763 (PSMB4)
Gene 12 | 0.5317 (ACTG1) | 0.5310 (CD24) | 0.5726 (ACTB)
Gene 13 | 0.5301 (IGLC2) | 0.5293 (FTH1) | 0.5642 (MYH9)
Gene 14 | 0.5287 (LASP1) | 0.5185 (COL1A2) | 0.5465 (GNAS)
p-value | 0.0001 (<0.05) | 0.0001 (<0.05) | N/A
Table 4. Overview of performance comparison between EMGP-Net, EMGP-Net-noAttn, EfficientFormer, and MambaVision. The highest PCC values for the common genes among the top 14 genes predicted by the different models are in bold and underlined.
Gene | EfficientFormer | MambaVision | EMGP-Net-noAttn | EMGP-Net
PTMA | 0.7777 | 0.7674 | 0.7791 | 0.7903
GNAS | 0.7661 | 0.7763 | 0.7700 | 0.7843
B2M | 0.7746 | 0.8049 | 0.7768 | 0.7777
HNRNPA2B1 | 0.7266 | 0.7198 | 0.7331 | 0.7532
TPT1 | 0.7245 | 0.7363 | 0.7356 | 0.7360
XBP1 | 0.6965 | 0.7010 | 0.7237 | 0.7339
ACTG1 | 0.7075 | 0.7042 | 0.7271 | 0.7318
HLA-B | 0.6929 | 0.7032 | 0.7005 | 0.7228
HLA-DRA | 0.6964 | 0.7089 | 0.6959 | 0.7056
ACTB | 0.6839 | 0.6809 | 0.6951 | 0.7020
Table 5. Overview of EMGP-Net performance vs. ST-Net and GeNetFormer performance when trained on the HER2+ dataset and tested on the STNet dataset. The highest PCC values for common genes among the top 14 genes predicted by the different models are in bold and underlined.
Gene | ST-Net | GeNetFormer | EMGP-Net
ACTG1 | 0.6460 | 0.6510 | 0.7051
CALR | 0.6273 | 0.5864 | 0.7047
RPL23 | 0.6592 | 0.6130 | 0.6973
GNAS | 0.6708 | 0.5943 | 0.6962
PTPRF | 0.6503 | 0.5974 | 0.6867
TAGLN2 | 0.6274 | 0.5638 | 0.6814
DDX5 | 0.6406 | 0.7069 | 0.6724
HSPB1 | 0.6235 | 0.5840 | 0.6645
PTMA | 0.6201 | 0.6235 | 0.6588
Table 6. Overview of EMGP-Net performance vs. ST-Net and GeNetFormer performance when trained on the STNet dataset and tested on the HER2+ dataset. The highest PCC values for common genes among the top 14 genes predicted by the different models are in bold and underlined.
Gene | ST-Net | GeNetFormer | EMGP-Net
ERBB2 | 0.6620 | 0.6141 | 0.7285
S100A11 | 0.6374 | 0.6434 | 0.6686
ATP5E | 0.6719 | 0.6746 | 0.6650
HSP90B1 | 0.5903 | 0.5986 | 0.6404
LGALS3 | 0.5918 | 0.5393 | 0.6347
PTPRF | 0.6227 | 0.6115 | 0.6049
PSMB4 | 0.5750 | 0.5400 | 0.5763
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
