This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

TruMPET: A New Method for Protein Secondary Structure Prediction Using Neural Networks Trained on Multiple Pre-Selected Physicochemical and Structural Features

by Yury V. Milchevskiy *, Galina I. Kravatskaya and Yury V. Kravatsky
Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Str., 32, 119991 Moscow, Russia
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2025, 26(23), 11284; https://doi.org/10.3390/ijms262311284
Submission received: 9 October 2025 / Revised: 18 November 2025 / Accepted: 20 November 2025 / Published: 21 November 2025
(This article belongs to the Special Issue Recent Research of Protein Structure Prediction and Design)

Abstract

Protein structure prediction continues to pose multiple challenges despite the progress made by machine learning (ML). Although recent deep learning models have achieved strong performance using embeddings from protein language models, they often ignore non-canonical amino acids and rely heavily on sequence alignments or evolutionary profiles. Here, we present an improvement to this approach for predicting protein secondary structure, expressed as DSSP classes, solely from amino acid sequences. We suggest that ML feature sets should be generated from statistically significant, mutually uncorrelated descriptors. The selection of statistically assessed descriptors, including the prediction of physicochemical parameters for non-canonical amino acids, is a key component of the proposed method. A two-step Linear Discriminant Analysis was used to assess the statistical significance of each suggested descriptor and its impact on model accuracy. We used the set of the 109 most influential, statistically significant descriptors to train a two-layer Bi-LSTM network combined with ESMFold2 embeddings. Our method, TruMPET (Training upon Multiple Pre-selected Elements Technique), outperformed all other methods reported in the literature on the non-redundant datasets (CB513: DSSP Q3 = 91.36% and Q8 = 85.41%; TEST2018: DSSP Q3 = 90.64% and Q8 = 84.17%).
Keywords: protein secondary structure; PSSP (protein secondary structure prediction); DSSP (Dictionary of Secondary Structure in Proteins); machine learning (ML); LDA (Linear Discriminant Analysis); ncAA (non-canonical Amino Acid)
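
The Q3 and Q8 figures quoted in the abstract are per-residue accuracies over three and eight secondary structure states, respectively. The following minimal sketch illustrates how such scores are commonly computed; it is not taken from the TruMPET code. The 8-to-3 state reduction (H, G, I to helix; E, B to strand; everything else to coil) is a widely used convention and is an assumption here, as are the function names and the toy sequences.

```python
# Assumed 8-to-3 state reduction; the abstract does not say which mapping is used.
# "L" and "-" are included because some datasets use them for coil/loop.
Q8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strands and bridges
    "T": "C", "S": "C", "C": "C", "L": "C", "-": "C",  # turns, bends, coil
}


def reduce_to_q3(states: str) -> str:
    """Map a string of 8-state labels to 3-state labels (hypothetical helper)."""
    return "".join(Q8_TO_Q3[s] for s in states)


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Percentage of positions where the predicted label equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy example (made-up labels, not data from the article).
observed = "CHHHHGGTEEEC"
predicted = "CHHHHHHTEEBC"

q8 = per_residue_accuracy(predicted, observed)
q3 = per_residue_accuracy(reduce_to_q3(predicted), reduce_to_q3(observed))
print(f"Q8 = {q8:.2f}%, Q3 = {q3:.2f}%")
```

In this toy case the three Q8 errors (G predicted as H twice, E predicted as B) fall within the same reduced class, which is why a Q3 score is never lower than the Q8 score obtained from the same prediction under this mapping.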
