A Novel Machine-Learning Based Method for Resolving Secondary Structure Topology in Medium-Resolution Cryo-EM Density Maps
Abstract
1. Introduction
2. Results
2.1. Benchmark Dataset
2.2. Performance Measurements
2.3. Overall Correspondence Performance Across Classifiers
2.4. Helix Correspondence Accuracy
2.5. Strand Correspondence Accuracy
2.6. Runtime and Computational Efficiency
3. Discussion
4. Materials and Methods
4.1. Sequence-to-Structure Modelling
4.2. Voxel Extraction from 3D Model and Cryo-EM Density Map
4.3. Supervised Classification Formulation for SSE Correspondence
| Algorithm 1. Learning-based SSE correspondence via supervised classification. The pseudocode outlines the training and inference workflow for mapping model-derived SSEs (SSEs-S) to map-derived SSEs (SSEs-V): $S$ and $V$ are first split by SSE type (helix/strand), type-specific classifiers are trained (with GridSearchCV hyperparameter tuning for SVM/RF or a distance metric for Voronoi), and each map-derived SSE $v$ is classified to retrieve its matched model-derived SSE $s$, yielding the correspondence set $M$. | |
| Input: Model SSE set $S$ (SSEs-S), map SSE set $V$ (SSEs-V), classifier type $C$. Output: Correspondence set $M$. | |
| 1. | Split $S$ and $V$ by SSE type into $S_t$ and $V_t$, $t \in \{\text{helix}, \text{strand}\}$. |
| 2. | For each type $t$ do |
| 3. | Construct labelled training set $D_t$ from $S_t$ (each model SSE $s_i$ assigned label $y_i$). |
| 4. | If $C \in \{\text{SVM}, \text{RF}\}$ then |
| 5. | Tune hyperparameters with GridSearchCV. |
| 6. | Else (Voronoi) |
| 7. | Set distance metric (e.g., Euclidean). |
| 8. | End if |
| 9. | Train classifier $f_t$ on $D_t$. |
| 10. | End for |
| 11. | Initialize $M \leftarrow \emptyset$. |
| 12. | For each map-derived SSE $v \in V$ do |
| 13. | Let $t \leftarrow$ type of $v$ (helix/strand); select classifier $f_t$. |
| 14. | $\hat{y} \leftarrow f_t(v)$ |
| 15. | Retrieve $s \in S_t$ with label $\hat{y}$; add $(s, v)$ to $M$. |
| 16. | End for |
| 17. | Return $M$. |
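The workflow of Algorithm 1 can be illustrated for the Voronoi case with a minimal Python sketch. It assumes each SSE is summarised by a fixed-length feature vector (the `features` field, the `SSE` class, and the function names are hypothetical and not taken from the paper) and that the distance metric is Euclidean. Storing the labelled model SSEs per type plays the role of training, and assigning a map SSE to its nearest stored site is equivalent to locating its Voronoi cell. The SVM/RF branch with GridSearchCV is not shown.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SSE:
    name: str                    # hypothetical identifier, e.g. "H1"
    kind: str                    # "helix" or "strand"
    features: Tuple[float, ...]  # hypothetical feature vector (assumption)


def train_voronoi(model_sses: List[SSE]) -> Dict[str, List[SSE]]:
    """Group labelled model SSEs by type; these act as the Voronoi sites."""
    sites: Dict[str, List[SSE]] = {}
    for s in model_sses:
        sites.setdefault(s.kind, []).append(s)
    return sites


def correspond(model_sses: List[SSE], map_sses: List[SSE]) -> List[Tuple[str, str]]:
    """Return (model SSE, map SSE) name pairs, one per classifiable map SSE."""
    sites = train_voronoi(model_sses)
    matches: List[Tuple[str, str]] = []
    for v in map_sses:
        candidates = sites.get(v.kind, [])
        if not candidates:
            continue  # no model SSE of this type to match against
        # Nearest site under the Euclidean metric = Voronoi cell membership.
        s = min(candidates, key=lambda c: dist(c.features, v.features))
        matches.append((s.name, v.name))
    return matches


# Toy coordinates, made up purely for illustration.
model = [
    SSE("H1", "helix", (0.0, 0.0, 0.0)),
    SSE("H2", "helix", (10.0, 0.0, 0.0)),
    SSE("E1", "strand", (0.0, 10.0, 0.0)),
]
density = [
    SSE("V1", "helix", (9.0, 1.0, 0.0)),
    SSE("V2", "strand", (1.0, 9.0, 0.0)),
    SSE("V3", "helix", (1.0, 0.0, 1.0)),
]
matches = correspond(model, density)
print(matches)
```

Note that this sketch classifies each map SSE independently, so two map SSEs may be assigned to the same model SSE; whether and how the published method enforces a one-to-one assignment is not specified here.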
4.4. Orientation (Direction) Resolution Using DTW
| Algorithm 2. Direction detection using DTW | |
| Input: model SSE coordinates, map stick coordinates. Output: orientation (+1 forward, −1 reverse). | |
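As an illustration of how DTW can resolve direction from the inputs and output listed for Algorithm 2, the following Python sketch compares the DTW cost of the map stick points in their given order against the cost in reversed order. This decision rule, the function names, and the assumption that both point sets are already expressed in the same coordinate frame are illustrative choices and are not taken from the paper.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def dtw_cost(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Classic dynamic time warping cost with Euclidean point distance."""
    n, m = len(a), len(b)
    inf = float("inf")
    # table[i][j] = minimal cost of aligning a[:i] with b[:j]
    table: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            table[i][j] = step + min(
                table[i - 1][j],      # advance in a only
                table[i][j - 1],      # advance in b only
                table[i - 1][j - 1],  # advance in both
            )
    return table[n][m]


def detect_direction(model_coords: Sequence[Point], stick_coords: Sequence[Point]) -> int:
    """Return +1 if the stick follows the model order, -1 if it runs in reverse."""
    forward = dtw_cost(model_coords, stick_coords)
    reverse = dtw_cost(model_coords, list(reversed(stick_coords)))
    return 1 if forward <= reverse else -1


# Toy coordinates, made up purely for illustration.
model_axis = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
stick = [(0.0, 0.1, 0.0), (1.5, 0.1, 0.0), (3.0, 0.1, 0.0)]
print(detect_direction(model_axis, stick))
print(detect_direction(model_axis, list(reversed(stick))))
```

DTW tolerates the two point sequences having different lengths and sampling densities, which is why it suits comparing a residue-level model trace with a coarser stick traced from the density map.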
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

| Row | PDB ID (EMDB ID) | AA Count | α/α-β Protein | Helices | Strands | Total SSEs | Helices in Map | Strands in Map |
|---|---|---|---|---|---|---|---|---|
| 1 | 1A7D | 118 | α protein | 6 | - | 6 | 4 | - |
| 2 | 1BZ4 | 144 | α protein | 5 | - | 5 | 5 | - |
| 3 | 1FLP | 142 | α protein | 7 | - | 7 | 6 | - |
| 4 | 1HG5 | 289 | α protein | 11 | - | 11 | 9 | - |
| 5 | 1HZ4 | 373 | α protein | 21 | - | 21 | 19 | - |
| 6 | 1ICX | 155 | α-β protein | 6 | 7 | 13 | 4 | 7 |
| 7 | 1LWB | 122 | α protein | 6 | - | 9 | 6 | - |
| 8 | 1NG6 | 148 | α protein | 9 | - | 6 | 7 | - |
| 9 | 1OZ9 | 150 | α-β protein | 5 | 5 | 10 | 5 | 4 |
| 10 | 1P5X | 245 | α protein | 19 | - | 19 | 13 | - |
| 11 | 1XQO | 256 | α protein | 14 | - | 14 | 14 | - |
| 12 | 1YD0 | 96 | α-β protein | 5 | 3 | 8 | 4 | 3 |
| 13 | 1Z1L | 345 | α protein | 23 | - | 23 | 14 | - |
| 14 | 2OVJ | 201 | α protein | 12 | - | 12 | 8 | - |
| 15 | 2XB5 | 207 | α protein | 13 | - | 13 | 9 | - |
| 16 | 2XVV | 585 | α protein | 33 | - | 33 | 19 | - |
| 17 | 2Y4Z | 140 | α-β protein | 6 | 2 | 8 | 6 | 2 |
| 18 | 3ACW | 293 | α protein | 17 | - | 17 | 14 | - |
| 19 | 3HJL | 329 | α protein | 20 | - | 20 | 20 | - |
| 20 | 3IEE | 270 | α protein | 9 | - | 9 | 8 | - |
| 21 | 3IXV | 222 | α protein | 14 | - | 14 | 10 | - |
| 22 | 3LTJ | 201 | α protein | 16 | - | 16 | 12 | - |
| 23 | 3ODS | 415 | α protein | 21 | - | 21 | 16 | - |
| 24 | 4OXW | 119 | α-β protein | 5 | 3 | 8 | 4 | 3 |
| 25 | 4R9A | 144 | β protein | - | 11 | 11 | 22 | 10 |
| 26 | 4UE4 | 102 | α protein | 6 | - | 6 | 5 | - |
| 27 | 4YOK | 204 | β protein | 16 | - | 16 | 14 | - |
| 28 | 3C91 (EMD-1733) | 233 | α-β protein | 8 | 10 | 18 | 6 | 9 |
| 29 | 3FIN (EMD-5030) | 117 | α protein | 4 | - | 4 | 4 | - |
| 30 | 4CHV (EMD-2526) | 361 | α-β protein | 15 | 8 | 23 | 15 | 7 |
| 31 | 5I1M (EMD-8070) | 458 | α protein | 19 | - | 19 | 17 | - |
| 32 | 5KBU (EMD-8231) | 1034 | α-β protein | 33 | 32 | 65 | 28 | 26 |
| 33 | 5M50 (EMD-4154) | 439 | α-β protein | 15 | 12 | 27 | 14 | 12 |
| 34 | 5O8O (EMD-3761) | 349 | α-β protein | 3 | 21 | 24 | 3 | 19 |
| 35 | 5UZB (EMD-8625) | 177 | α-β protein | 9 | 4 | 13 | 6 | 3 |
| 36 | 6EM3 (EMD-3888) | 291 | α-β protein | 3 | 8 | 11 | 3 | 6 |
| 37 | 6F36 (EMD-4176) | 327 | α protein | 13 | - | 13 | 11 | - |
| 38 | 6UXW (EMD-20934) | 1703 | α-β protein | 33 | 10 | 43 | 27 | 8 |

| Method A | Method B | p-Value | Significant |
|---|---|---|---|
| Voronoi | SVM-linear | 0.109 | No |
| Voronoi | Random Forest | 0.241 | No |
| Voronoi | SVM-RBF | 0.183 | No |
| Voronoi | LPTD | <0.001 | Yes |
| SVM-linear | Random Forest | 0.943 | No |
| SVM-linear | SVM-RBF | 0.695 | No |
| SVM-linear | LPTD | 0.001 | Yes |
| Random Forest | SVM-RBF | 0.601 | No |
| Random Forest | LPTD | 0.001 | Yes |
| SVM-RBF | LPTD | 0.002 | Yes |
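
The statistical test behind these pairwise p-values is not stated here. As one possible way to obtain such a comparison, the sketch below runs a paired sign-flip permutation test on per-protein accuracies of two methods. The choice of test, the function name, and the toy scores are all assumptions for illustration and do not reproduce the reported values.

```python
import random
from typing import Sequence


def paired_sign_flip_pvalue(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    n_resamples: int = 2000,
    seed: int = 0,
) -> float:
    """Two-sided Monte Carlo p-value for paired scores via random sign flips."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired test needs equally long score lists")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis each paired difference is equally
        # likely to be positive or negative, so flip signs at random.
        stat = abs(sum(d if rng.random() < 0.5 else -d for d in diffs))
        if stat >= observed - 1e-12:
            hits += 1
    # The +1 correction prevents reporting a p-value of exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Toy per-protein accuracies, made up purely for illustration.
method_a = [0.9] * 10
method_b = [0.5] * 10
p_value = paired_sign_flip_pvalue(method_a, method_b)
print(p_value < 0.05)
```

Because the estimate is bounded below by 1/(n_resamples + 1), very small p-values are better reported as an inequality such as <0.001 than as 0.000.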
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Behkamal, B.; Etemadheravi, M.P.; Mahmoodjanloo, A.; Mansoori, A.; Naghibzadeh, M.; Al Nasr, K.; Saberi, M.R. A Novel Machine-Learning Based Method for Resolving Secondary Structure Topology in Medium-Resolution Cryo-EM Density Maps. Int. J. Mol. Sci. 2026, 27, 4388. https://doi.org/10.3390/ijms27104388
