Large Language Models for High-Entropy Alloys: Literature Mining, Design Orchestration, and Evaluation Standards
Abstract
1. Introduction
2. Current Status and Core Challenges in High-Entropy Alloy Research
2.1. Condition Dependence and Fragmentation of HEA Evidence
2.2. Performance Optimization and Mechanistic Understanding
2.3. Advanced Fabrication and Processing Techniques
2.4. Summary of Key Challenges
2.5. Positioning Relative to Adjacent Paradigms
3. Core Capabilities of Large Language Models and Alignment with HEA Research
3.1. Large-Scale Text Comprehension, Knowledge Extraction, and Natural-Language Reasoning
3.2. Code Generation and Workflow Automation
3.3. Cross-Modal and Logical Reasoning
3.4. LLMs vs. Traditional Machine Learning: Comparative Advantages and a Synergistic Paradigm
3.5. Emerging Landscape: Recent LLM Applications in HEA Research (2024–2025)
4. Key Application Scenarios of LLMs in HEA Research
4.1. Intelligent Literature Mining and Knowledge-Graph Construction
4.2. Data-Driven Composition and Process Design Assistant
4.2.1. Typical Workflow for ML-Assisted HEA Design
4.2.2. LLM-Augmented Composition and Process Design
4.3. Multiscale Simulation Interface and Integrator
4.4. Experimental Data Analysis and Mechanism Hypothesis Generation
4.5. Toward Executable, Trustworthy LLM Agent Architectures
5. Challenges in Applying LLMs to HEA Research
5.1. Case Study: LLM-Augmented Discovery of a Corrosion-Resistant HEA
- Phase 1: Evidence-grounded literature mining and hypothesis formulation.
- Phase 2: Tool-using screening with explicit constraints and validators.
- Phase 3: Experimental validation and structured knowledge update.
- What this case study demonstrates.
5.2. Technical Challenges
5.2.1. Data Quality and Scarcity
5.2.2. Model Accuracy, Hallucination and Reliability
5.2.3. Integration of Multimodal Data and Context
5.3. Systemic and Pragmatic Challenges
5.3.1. Computational Resources and Domain Adaptation
5.3.2. Open-Source and Collaborative Infrastructure
5.3.3. Strategic Model Selection: Open vs. Closed Source
5.3.4. Ethical Considerations, Bias, and the Challenge of Trustworthy AI
5.3.5. Towards a Human-in-the-Loop Collaborative Model
6. Future Directions
6.1. Domain-Specialized Foundation Models
6.2. Closed-Loop, Autonomous Research Pipelines
6.3. Enhanced Scientific Discovery and Reasoning
6.3.1. LLM-Assisted Hypothesis Generation and Pattern Discovery
6.3.2. LLM-Centered Meta-Optimization and “Process Engineering”
6.4. Open and Collaborative Ecosystems
6.5. Benchmarks and Evaluation for HEA-Focused LLMs
7. Benchmarks, Protocols, and Considerations for Reproducibility
7.1. Foundational Resources for Development and Benchmarking
7.2. Evaluation Metrics
- 1. Literature Mining and Knowledge Extraction
- 2. Code and Workflow Generation
- 3. Multimodal Scientific Understanding
- 4. Constrained Design and Hypothesis Generation
7.3. Special Considerations for Calibrating Scientific Numeric and Symbolic Outputs
8. Summary and Key Findings
9. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Yeh, J.-W.; Chen, S.K.; Lin, S.-J.; Gan, J.-Y.; Chin, T.-S.; Shun, T.-T.; Tsau, C.-H.; Chang, S.-Y. Nanostructured high-entropy alloys with multiple principal elements: Novel alloy design concepts and outcomes. Adv. Eng. Mater. 2004, 6, 299–303. [Google Scholar] [CrossRef]
- Fisher, D. High-Entropy Alloys—Microstructures and Properties; Trans Tech Publications Ltd.: Zürich, Switzerland, 2015. [Google Scholar]
- Nartita, R.; Ionita, D.; Demetrescu, I.A. A modern approach to HEAs: From structure to properties and potential applications. Crystals 2024, 14, 451. [Google Scholar] [CrossRef]
- Xiong, W.; Guo, A.X.Y.; Zhan, S.; Liu, C.-T.; Cao, S.C. Refractory high-entropy alloys: A focused review of preparation methods and properties. J. Mater. Sci. Technol. 2023, 142, 196–215. [Google Scholar] [CrossRef]
- Atli, K.C.; Karaman, I. A short review on the ultra-high temperature mechanical properties of refractory high entropy alloys. Front. Met. Alloys 2023, 2, 1135826. [Google Scholar] [CrossRef]
- Cui, J.-M.; Nong, Z.-S.; Cui, X.; Xu, Q.-G.; Zhang, H.-L.; Leng, Y.; Xu, R.-Z.; Arzikulov, E. Effect of carbon addition on microstructures and mechanical properties of laser cladding AlCoCrFeNi2.1 alloy coatings. Mater. Today Commun. 2025, 42, 111534. [Google Scholar] [CrossRef]
- Hamdi, H.; Abedi, H.R.; Zhang, Y. A review study on thermal stability of high entropy alloys: Normal/abnormal resistance of grain growth. J. Alloys Compd. 2023, 960, 170826. [Google Scholar] [CrossRef]
- Li, D.; Liaw, P.K.; Xie, L.; Zhang, Y.; Wang, W. Advanced high-entropy alloys breaking the property limits of current materials. J. Mater. Sci. Technol. 2024, 186, 219–230. [Google Scholar] [CrossRef]
- Tsai, K.Y.; Tsai, M.H.; Yeh, J.W. Sluggish diffusion in Co–Cr–Fe–Mn–Ni high-entropy alloys. Acta Mater. 2013, 61, 4887–4897. [Google Scholar] [CrossRef]
- Dąbrowa, J.; Zajusz, M.; Kucza, W.; Cieślak, G.; Berent, K.; Czeppe, T.; Danielewski, M. Demystifying the sluggish diffusion effect in high entropy alloys. J. Alloys Compd. 2019, 783, 193–207. [Google Scholar] [CrossRef]
- Arun, S.; Radhika, N.; Saleh, B. Effect of additional alloying elements on microstructure and properties of AlCoCrFeNi high-entropy alloy system: A comprehensive review. Met. Mater. Int. 2025, 31, 285–324. [Google Scholar] [CrossRef]
- Himanen, L.; Geurts, A.; Foster, A.S.; Rinke, P. Data-driven materials science: Status, challenges, and perspectives. Adv. Sci. 2019, 6, 1900808. [Google Scholar] [CrossRef]
- Liu, Z.-K. First-principles calculations and calphad modeling of thermodynamics. In Zentropy; Jenny Stanford Publishing: Singapore, 2024; pp. 3–50. [Google Scholar]
- Hambarde, K.A.; Proenca, H. Information retrieval: Recent advances and beyond. IEEE Access 2023, 11, 76581–76604. [Google Scholar] [CrossRef]
- Han, S.; Wang, M.; Zhang, J.; Li, D.; Duan, J. A review of large language models: Fundamental architectures, key technological evolutions, interdisciplinary technologies integration, optimization and compression techniques, applications, and challenges. Electronics 2024, 13, 5040. [Google Scholar] [CrossRef]
- Li, Z.; Pradeep, K.G.; Deng, Y.; Raabe, D.; Tasan, C.C. Metastable high-entropy dual-phase alloys overcome the strength–ductility trade-off. Nature 2016, 534, 227–230. [Google Scholar] [CrossRef]
- Lei, Z.; Liu, X.; Yuan, W.; Wang, H.; Jiang, S.; Wang, S.; Hui, X.; Wu, Y.; Gault, B.; Kontis, P.; et al. Enhanced strength and ductility in a high-entropy alloy via ordered oxygen complexes. Nature 2018, 563, 546–550. [Google Scholar] [CrossRef] [PubMed]
- Yang, Y.; Chen, T.; Tan, L.; Poplawsky, J.D.; An, K.; Wang, Y.; Samolyuk, G.D.; Littrell, K.; Lupini, A.R.; Borisevich, A.; et al. Bifunctional nanoprecipitates strengthen and ductilize a medium-entropy alloy. Nature 2021, 595, 245–249. [Google Scholar] [CrossRef]
- Yang, T.; Zhao, Y.L.; Tong, Y.; Jiao, Z.B.; Wei, J.; Cai, J.X.; Han, X.D.; Chen, D.; Hu, A.; Kai, J.J.; et al. Multicomponent intermetallic nanoparticles and superb mechanical behaviors of complex alloys. Science 2018, 362, 933–937. [Google Scholar] [CrossRef] [PubMed]
- Huo, W.; Fang, F.; Zhou, H.; Xie, Z.; Shang, J.; Jiang, J. Remarkable strength of CoCrFeNi high-entropy alloy wires at cryogenic and elevated temperatures. Scr. Mater. 2017, 141, 125–128. [Google Scholar] [CrossRef]
- Liu, X.; Zhang, J.; Pei, Z. Machine learning for high-entropy alloys: Progress, challenges and opportunities. Prog. Mater. Sci. 2023, 131, 101018. [Google Scholar] [CrossRef]
- Elkatatny, S.; Abd-Elaziem, W.; Sebaey, T.A.; Darwish, M.A.; Hamada, A. Machine-learning synergy in high-entropy alloys: A review. J. Mater. Res. Technol. 2024, 33, 3976–3997. [Google Scholar] [CrossRef]
- Golbabaei, M.H.; Zohrevand, M.; Zhang, N. Applications of machine learning in high-entropy alloys: A review of recent advances in design, discovery, and characterization. Nanoscale 2025, 17, 20548–20605. [Google Scholar] [CrossRef]
- Sun, Y.; Ni, J. Machine learning advances in high-entropy alloys: A mini-review. Entropy 2024, 26, 1119. [Google Scholar] [CrossRef]
- Bagdasaryan, A.; Pshyk, A.; Coy, L.; Kempiński, M.; Pogrebnjak, A.; Beresnev, V.; Jurga, S. Structural and mechanical characterization of (TiZrNbHfTa)N/WN multilayered nitride coatings. Mater. Lett. 2018, 229, 364–367. [Google Scholar] [CrossRef]
- Wang, W.; Wang, J.; Yi, H.; Qi, W.; Peng, Q. Effect of molybdenum additives on corrosion behavior of (CoCrFeNi)100−xMox high-entropy alloys. Entropy 2018, 20, 908. [Google Scholar] [CrossRef]
- Tshitoyan, V.; Dagdelen, J.; Weston, L.; Dunn, A.; Rong, Z.; Kononova, O.; Persson, K.A.; Ceder, G.; Jain, A. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 2019, 571, 95–98. [Google Scholar] [CrossRef]
- Yurchenko, N.; Stepanov, N.; Salishchev, G. Laves-phase formation criterion for high-entropy alloys. Mater. Sci. Technol. 2017, 33, 17–22. [Google Scholar] [CrossRef]
- Zhang, J.; Chen, X.; Ye, X.; Yang, Y.; Ai, B. Large language model in materials science: Roles, challenges, and strategic outlook. Adv. Intell. Discov. 2025, 202500085. [Google Scholar] [CrossRef]
- Miret, S.; Krishnan, N.A. Enabling large language models for real-world materials discovery. Nat. Mach. Intell. 2025, 7, 991–998. [Google Scholar] [CrossRef]
- Lv, T.; Zou, W.; He, J.; Ju, X.; Zheng, C. Study on the microstructure and properties of FeCoNiCrAl high-entropy alloy coating prepared by laser cladding–remelting. Coatings 2023, 14, 49. [Google Scholar] [CrossRef]
- Cui, M.; Zhang, Y.; Xu, B.; Xu, F.; Chen, J.; Zhang, S.; Chen, C.; Luo, Z. High-entropy alloy nanomaterials for electrocatalysis. Chem. Commun. 2024, 60, 12615–12632. [Google Scholar] [CrossRef]
- Yue, W.; Zhang, Y.; Zheng, Z.; Lai, Y. Hybrid laser additive manufacturing of metals: A review. Coatings 2024, 14, 315. [Google Scholar] [CrossRef]
- Zheng, H.; Fu, J.; Wang, Y.; Dong, Y. Controlling microstructural gradients in laser-clad AlCoCrFeNi2.1 EHEAs. Surf. Coat. Technol. 2025, 518, 132885. [Google Scholar] [CrossRef]
- Hashemi, S.M.; Parvizi, S.; Baghbanijavid, H.; Tan, A.T.L.; Nematollahi, M.; Ramazani, A.; Fang, N.X.; Elahinia, M. Computational modelling of process–structure–property–performance relationships in metal additive manufacturing: A review. Int. Mater. Rev. 2022, 67, 1–46. [Google Scholar] [CrossRef]
- Zhang, J.; Cai, C.; Kim, G.; Wang, Y.; Chen, W. Composition design of high-entropy alloys with deep sets learning. npj Comput. Mater. 2022, 8, 89. [Google Scholar] [CrossRef]
- Gorsse, S.; Couzinié, J.-P.; Miracle, D.B. Database on the mechanical properties of high entropy alloys and complex concentrated alloys. Data Brief 2018, 21, 2664–2678. [Google Scholar] [CrossRef]
- Swain, M.C.; Cole, J.M. Chemdataextractor: A toolkit for automated extraction of chemical information from the scientific literature. J. Chem. Inf. Model. 2016, 56, 1894–1904. [Google Scholar] [CrossRef]
- Otis, R.; Liu, Z.-K. Pycalphad: Calphad-based computational thermodynamics in python. J. Open Res. Softw. 2017, 5, 13. [Google Scholar] [CrossRef]
- Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; McGrew, B. Gpt-4 technical report. arXiv 2023, arXiv:2303.08774. [Google Scholar] [CrossRef]
- Hu, S.; Ouyang, M.; Gao, D.; Shou, M.Z. The dawn of gui agent: A preliminary case study with claude 3.5 computer use. arXiv 2024, arXiv:2411.10323. [Google Scholar] [CrossRef]
- Grattafiori, A.; Dubey, A.; Jauhri, A.; Pandey, A.; Kadian, A.; Al-Dahle, A.; Letman, A.; Mathur, A.; Schelten, A.; Vaughan, A.; et al. The llama 3 herd of models. arXiv 2024, arXiv:2407.21783. [Google Scholar] [CrossRef]
- Xu, P.; Ding, Y.; Fan, W. ChartAdapter: Large Vision-Language Model for Chart Summarization. arXiv 2024, arXiv:2412.20715. [Google Scholar]
- Yao, S.; Zhao, J.; Yu, D.; Du, N.; Shafran, I.; Narasimhan, K.; Cao, Y. React: Synergizing reasoning and acting in language models. arXiv 2022, arXiv:2210.03629v3. [Google Scholar]
- Venugopal, V.; Olivetti, E. Matkg: An autonomously generated knowledge graph in materials science. Sci. Data 2024, 11, 217. [Google Scholar] [CrossRef]
- Lála, J.; O’Donoghue, O.; Shtedritski, A.; Cox, S.; Rodriques, S.G.; White, A.D. Paperqa: Retrieval-augmented generative agent for scientific research. arXiv 2023, arXiv:2312.07559. [Google Scholar]
- Lewis, P.; Perez, E.; Piktus, A.; Petroni, F.; Karpukhin, V.; Goyal, N.; Küttler, H.; Lewis, M.; Yih, W.-T.; Rocktäschel, T.; et al. Retrieval-augmented generation for knowledge-intensive nlp tasks. Adv. Neural Inf. Process. Syst. 2020, 33, 9459–9474. [Google Scholar]
- Huang, S.; Cole, J.M. A database of battery materials auto-generated using chemdataextractor. Sci. Data 2020, 7, 260. [Google Scholar] [CrossRef] [PubMed]
- Huang, S.; Cole, J.M. Batterybert: A pretrained language model for battery database enhancement. J. Chem. Inf. Model. 2022, 62, 6365–6377. [Google Scholar] [CrossRef] [PubMed]
- Schick, T.; Dwivedi-Yu, J.; Dessì, R.; Raileanu, R.; Lomeli, M.; Zettlemoyer, L.; Cancedda, N.; Scialom, T. Toolformer: Language models can teach themselves to use tools. arXiv 2023, arXiv:2302.04761. [Google Scholar]
- Jain, A.; Ong, S.P.; Chen, W.; Medasani, B.; Qu, X.; Kocher, M.; Brafman, M.; Petretto, G.; Rignanese, G.-M.; Hautier, G.; et al. Fireworks: A dynamic workflow system designed for high-throughput applications. Concurr. Comput. Pract. Exp. 2015, 27, 5037–5059. [Google Scholar] [CrossRef]
- Weng, Y.; Gao, L.; Zhu, L.; Huang, J. Matqna: A benchmark dataset for multimodal large language models in materials characterization and analysis. arXiv 2025, arXiv:2509.11335. [Google Scholar]
- Zipoli, F.; Viterbo, V.; Schilter, O.; Kahle, L.; Laino, T. Prediction of phase diagrams and associated phase structural properties. Ind. Eng. Chem. Res. 2022, 61, 8378–8389. [Google Scholar] [CrossRef]
- Hao, Y.; Duo, L.; He, J. Autonomous materials synthesis laboratories: Integrating artificial intelligence with advanced robotics for accelerated discovery. ChemRxiv 2025. [Google Scholar] [CrossRef]
- Maqsood, A.; Chen, C.; Jacobsson, T.J. The future of material scientists in an age of artificial intelligence. Adv. Sci. 2024, 11, 2401401. [Google Scholar] [CrossRef]
- Kamnis, S. Introducing pre-trained transformers for high entropy alloy informatics. Mater. Lett. 2024, 358, 135871. [Google Scholar] [CrossRef]
- Kamnis, S.; Delibasis, K. High entropy alloy property predictions using a transformer-based language model. Sci. Rep. 2025, 15, 11861. [Google Scholar] [CrossRef] [PubMed]
- Zhen, S.; Zhang, L. AI-Driven Design of High-Entropy Alloys for Efficient Hydrogen Electrocatalysis. ChemRxiv 2025. [Google Scholar] [CrossRef]
- Luo, M.; Xie, Z.; Li, H.; Zhang, B.; Cao, J.; Huang, Y.; Qu, H.; Zhu, Q.; Chen, L.; Jiang, J.; et al. Physics-informed, dual-objective optimization of high-entropy-alloy nanozymes by a robotic AI chemist. Matter 2025, 8. [Google Scholar] [CrossRef]
- Choudhary, K. Atomgpt: Atomistic generative pretrained transformer for forward and inverse materials design. J. Phys. Chem. Lett. 2024, 15, 6909–6917. [Google Scholar] [CrossRef]
- Kumar, S.; Sourav, A.; Yebaji, S.; Chauhan, L.; Babu, A.; Chelvane, A. Effect of heat treatment on the oxidation behavior of an AlCoCrFeNi2 near-eutectic high entropy alloy. Corros. Sci. 2023, 221, 111298. [Google Scholar] [CrossRef]
- Kanyane, L.R.; Malatji, N.; Shongwe, M.B. Hot corrosion, phase stability and compressive strength of AlCrFeNiCu-Nb high entropy alloy fabricated via additive manufacturing. Solid State Phenom. 2025, 378, 45–51. [Google Scholar] [CrossRef]
- Zhao, Y.M.; Zhang, J.Y.; Liaw, P.K.; Yang, T. Machine learning-based computational design methods for high-entropy alloys. High Entropy Alloys Mater. 2025, 3, 41–100. [Google Scholar] [CrossRef]
- Jiang, D.; Xie, L.; Wang, L. Current application status of multiscale simulation and machine learning in research on high entropy alloys. J. Mater. Res. Technol. 2023, 26, 1341–1374. [Google Scholar] [CrossRef]
- Zhao, S.; Jiang, B.; Song, K.; Liu, X.; Wang, W.; Si, D.; Zhang, J.; Chen, X.; Zhou, C.; Liu, P.; et al. Machine learning assisted design of high-entropy alloys with ultra-high microhardness and unexpected low density. Mater. Des. 2024, 238, 112634. [Google Scholar] [CrossRef]
- He, J.; Li, Z.; Zhao, P.; Zhang, H.; Zhang, F.; Wang, L.; Cheng, X. Machine learning-assisted design of high-entropy alloys with superior mechanical properties. J. Mater. Res. Technol. 2024, 33, 260–286. [Google Scholar] [CrossRef]
- Raman, L.; Debnath, A.; Furton, E.; Lin, S.; Krajewski, A.; Ghosh, S.; Liu, N.; Ahn, M.; Poudel, B.; Shang, S.; et al. Data-driven inverse design of MoNbTiVWZr refractory multicomponent alloys: Microstructure and mechanical properties. Mater. Sci. Eng. A 2024, 918, 147475. [Google Scholar] [CrossRef]
- Yang, C.; Ren, C.; Jia, Y.; Wang, G.; Li, M.; Lu, W. A machine learning-based alloy design system to facilitate the rational design of high entropy alloys with enhanced hardness. Acta Mater. 2022, 222, 117431. [Google Scholar] [CrossRef]
- Xie, E.; Yang, C. Ai design for high entropy alloys: Progress, challenges and future prospects. Metals 2025, 15, 1012. [Google Scholar] [CrossRef]
- Zhang, Y.; Wen, C.; Dang, P.; Jiang, X.; Xue, D.; Su, Y. Elemental numerical descriptions to enhance classification and regression model performance for high-entropy alloys. npj Comput. Mater. 2025, 11, 75. [Google Scholar] [CrossRef]
- Shen, F.; Yu, L.; Fu, T.; Zhang, Y.; Wang, H.; Cui, K.; Wang, J.; Hussain, S.; Akhtar, N. Effect of the Al, Cr and B elements on the mechanical properties and oxidation resistance of Nb-Si based alloys: A review. Appl. Phys. A 2021, 127, 852. [Google Scholar] [CrossRef]
- Ni, B.; Glaser, B.; Taheri-Mousavi, S.M. End-to-end prediction and design of additively manufacturable alloys using a generative AlloyGPT model. npj Comput. Mater. 2025, 11, 294. [Google Scholar] [CrossRef]
- Walker, N.; Trewartha, A.; Huo, H.; Lee, S.; Cruse, K.; Dagdelen, J.; Dunn, A.; Persson, K.; Ceder, G.; Jain, A. The impact of domain-specific pre-training on ner in materials science. SSRN 2021, 3950755. [Google Scholar] [CrossRef]
- Gupta, T.; Zaki, M.; Krishnan, N.A.; Mausam. Matscibert: A materials domain language model for text mining and information extraction. npj Comput. Mater. 2022, 8, 102. [Google Scholar] [CrossRef]
- Gupta, T.; Zaki, M.; Khatsuriya, D.; Hira, K.; Krishnan, N.M.A.; Mausam. Discomat: Distantly supervised composition extraction from tables. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023; pp. 13465–13483. [Google Scholar]
- Miret, S.; Krishnan, N.M.A. Are llms ready for real-world materials discovery? arXiv 2024, arXiv:2402.05200. [Google Scholar] [CrossRef]
- Uhrin, L.; Huber, S.P.; Yu, J.; Marzari, N.; Pizzi, G.; Talirz, L. Workflows in AiiDA: Engineering a high-throughput, event-based engine for robust and modular computational workflows. Comput. Mater. Sci. 2021, 187, 110086. [Google Scholar] [CrossRef]
- Katz, D.S.; Babuji, Y.N.; Woodard, A.; Li, Z.; Clifford, B.; Kumar, R.; Lacinski, L.; Chard, R.; Wozniak, J.M.; Foster, I.; et al. Parsl: Pervasive parallel programming in Python. In Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing (HPDC ’19), Phoenix, AZ, USA, 22–29 June 2019; ACM, 2019; pp. 25–36. [Google Scholar] [CrossRef]
- Lee, J.W.; Park, W.B.; Lee, J.H.; Singh, S.P.; Sohn, K.S. A deep-learning technique for phase identification in multiphase inorganic compounds using synthetic XRD powder patterns. Nat. Commun. 2020, 11, 86. [Google Scholar] [CrossRef]
- Szymanski, N.J.; Fu, S.; Persson, E.; Ceder, G. Integrated analysis of X-ray diffraction patterns and pair distribution functions for machine-learned phase identification. npj Comput. Mater. 2024, 10, 45. [Google Scholar] [CrossRef]
- Vosoughi, A.; Shahnazari, A.; Xi, Y.; Zhang, Z.; Hess, G.; Xu, C.; Abdolrahim, N. OPENXRD: A Comprehensive Benchmark and Enhancement Framework for LLM/MLLM XRD Question Answering. arXiv 2025, arXiv:2507.09155. [Google Scholar] [CrossRef]
- Park, W.B.; Chung, J.; Jung, J.; Sohn, K.; Singh, S.P.; Pyo, M.; Shin, N.; Sohn, K.S. Classification of crystal structure using a convolutional neural network. IUCrJ 2017, 4, 486–494. [Google Scholar] [CrossRef]
- Samantaray, D.; Mondal, S.; Mishra, A. Nanoscale ordering in PtCu3 nanowires: Low-temperature synthesis and structural characterization of the L12 phase. Appl. Phys. A 2025, 131, 785. [Google Scholar] [CrossRef]
- Schober, M.; Schnitzer, R.; Leitner, H. Precipitation evolution in a Ti-free and Ti-containing stainless maraging steel. Ultramicroscopy 2009, 109, 553–562. [Google Scholar] [CrossRef]
- Keerthipalli, T.; Aepuru, R.; Biswas, A. Review on precipitation, intermetallic and strengthening of aluminum alloys. Proc. Inst. Mech. Eng. Part B J. Eng. Manuf. 2023, 237, 833–850. [Google Scholar] [CrossRef]
- Li, Y.; Li, T.; Tang, L.; Ma, S.; Wu, Q.; Gupta, P.; Bauchy, M. Convfeatnet ensemble: Integrating microstructure and pre-defined features for enhanced prediction of porous material properties. Mater. Sci. Eng. A 2025, 931, 148173. [Google Scholar] [CrossRef]
- Salgado, J.E.; Lerman, S.; Du, Z.; Xu, C.; Abdolrahim, N. Automated classification of big X-ray diffraction data using deep learning models. npj Comput. Mater. 2023, 9, 214. [Google Scholar] [CrossRef]
- Karpas, E.; Abend, O.; Belinkov, Y.; Lenz, B.; Lieber, O.; Ratner, N.; Shoham, Y.; Bata, H.; Levine, Y.; Leyton-Brown, K.; et al. MRKL systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning. arXiv 2022, arXiv:2205.00445. [Google Scholar] [CrossRef]
- Foppiano, L.; Lambard, G.; Amagasa, T.; Ishii, M. Mining experimental data from materials science literature with large language models: An evaluation study. Sci. Technol. Adv. Mater. Methods 2024, 4, 2356506. [Google Scholar] [CrossRef]
- Walczak, M.; Nowak, W.J.; Okuniewski, W.; Chocyk, D. Effect of adding molybdenum on the microstructure and corrosion resistance of AlCoCrFeNiMo0.25 high-entropy alloy. Materials 2025, 18, 4566. [Google Scholar] [CrossRef] [PubMed]
- Huang, Z.-Q.; Chu, S.; Guo, Y.; Ouyang, J.; Ge, X.-W.; Zhang, Z.-J.; Wu, Y.-Y.; Li, C. Corrosion resistance prediction in high-entropy alloys and its application via a cpsp framework with mat-nrkg. npj Mater. Degrad. 2025, 9, 81. [Google Scholar] [CrossRef]
- Feng, R.; Zhang, C.; Gao, M.C.; Pei, Z.; Zhang, F.; Chen, Y.; Ma, D.; An, K.; Poplawsky, J.D.; Ouyang, L. High-throughput design of high-performance lightweight high-entropy alloys. Nat. Commun. 2021, 12, 4329. [Google Scholar] [CrossRef]
- Singh, M.; Barr, E.; Aidhy, D. Consolidated database of high entropy materials (COD’HEM): An open online database of high entropy materials. Comput. Mater. Sci. 2025, 248, 113588. [Google Scholar] [CrossRef]
- Pei, Z.; Yin, J.; Zhang, J. Language models for materials discovery and sustainability: Progress, challenges, and opportunities. Prog. Mater. Sci. 2025, 154, 101495. [Google Scholar] [CrossRef]
- Ji, Z.; Lee, N.; Frieske, R.; Yu, T.; Su, D.; Xu, Y.; Fung, P. Survey of hallucination in natural language generation. ACM Comput. Surv. 2023, 55, 1–38. [Google Scholar] [CrossRef]
- Chelli, M.; Descamps, J.; Lavoué, V.; Trojani, C.; Azar, M.; Deckert, M.; Ruetsch-Chelli, C. Hallucination rates and reference accuracy of ChatGPT and bard for systematic reviews: Comparative analysis. J. Med. Internet Res. 2024, 26, e53164. [Google Scholar] [CrossRef]
- Walters, W.H.; Wilder, E.I. Fabrication and errors in the bibliographic citations generated by ChatGPT. Sci. Rep. 2023, 13, 14045. [Google Scholar] [CrossRef]
- Orduña-Malea, E.; Cabezas-Clavijo, Á. ChatGPT and the potential growing of ghost bibliographic references. Scientometrics 2023, 128, 5351–5355. [Google Scholar] [CrossRef]
- Li, X.; Li, Z.; Chen, C.; Ren, Z.; Wang, C.; Liu, X.; Zhang, Q.; Chen, S. Calphad as a powerful technique for design and fabrication of thermoelectric materials. J. Mater. Chem. A 2021, 9, 6634–6649. [Google Scholar] [CrossRef]
- Emsley, R. Chatgpt: These are not hallucinations—they’re fabrications and falsifications. Schizophrenia 2023, 9, 52. [Google Scholar] [CrossRef]
- Huang, L.; Yu, W.; Ma, W.; Zhong, W.; Feng, Z.; Wang, H.; Chen, Q.; Peng, W.; Feng, X.; Qin, B.; et al. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Trans. Inf. Syst. 2025, 43, 1–55. [Google Scholar] [CrossRef]
- Zheng, D.; Lapata, M.; Pan, J.Z. Large language models as reliable knowledge bases? arXiv 2024, arXiv:2407.13578. [Google Scholar] [CrossRef]
- Manakul, P.; Liusie, A.; Gales, M. Selfcheckgpt: Zero-resource black-box hallucination detection for generative large language models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–10 December 2023; pp. 9004–9017. [Google Scholar]
- Zhang, T.; Qiu, L.; Guo, Q.; Deng, C.; Zhang, Y.; Zhang, Z.; Zhou, C.; Wang, X.; Fu, L. Enhancing uncertainty-based hallucination detection with stronger focus. arXiv 2023, arXiv:2311.13230. [Google Scholar] [CrossRef]
- Chen, C.; Liu, K.; Chen, Z.; Gu, Y.; Wu, Y.; Tao, M.; Fu, Z.; Ye, J. Inside: Llms’ internal states retain the power of hallucination detection. arXiv 2024, arXiv:2402.03744. [Google Scholar] [CrossRef]
- Su, W.; Wang, C.; Ai, Q.; Hu, Y.; Wu, Z.; Zhou, Y.; Liu, Y. Unsupervised real-time hallucination detection based on the internal states of large language models. arXiv 2024, arXiv:2403.06448. [Google Scholar] [CrossRef]
- Liang, T.; He, Z.; Jiao, W.; Wang, X.; Wang, Y.; Wang, R.; Yang, Y.; Shi, S.; Tu, Z. Encouraging divergent thinking in large language models through multiagent debate. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, USA, 12–16 November 2024; pp. 17889–17904. [Google Scholar]
- Moro, V.; Loh, C.; Dangovski, R.; Ghorashi, A.; Ma, A.; Chen, Z.; Kim, S.; Lu, P.Y.; Christensen, T.; Soljačić, M. Multimodal learning for materials. arXiv 2023, arXiv:2312.00111. [Google Scholar]
- Katzer, B.; Steffen, K.; Katrin, S. Towards an automated workflow in materials science for combining multi-modal simulation and experimental information using data mining and large language models. Mater. Today Commun. 2025, 45, 112186. [Google Scholar] [CrossRef]
- Li, J.; Li, D.; Savarese, S.; Hoi, S. Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. In Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; Volume 202, pp. 19730–19742. Available online: https://proceedings.mlr.press/v202/li23q.html (accessed on 21 January 2026).
- Irwin, R.; Dimitriadis, S.; He, J.; Bjerrum, E.J. Chemformer: A pre-trained transformer for computational chemistry. Mach. Learn. Sci. Technol. 2022, 3, 015022. [Google Scholar] [CrossRef]
- Patterson, D.; Gonzalez, J.; Le, Q.; Liang, C.; Munguia, L.M.; Rothchild, D.; Dean, J. Carbon emissions and large neural network training. arXiv 2021, arXiv:2104.10350. [Google Scholar] [CrossRef]
- Brandon, W.; Mishra, M.; Nrusimha, A.; Panda, R.; Ragan-Kelley, J. Reducing transformer key-value cache size with cross-layer attention. Adv. Neural Inf. Process. Syst. 2024, 37, 86927–86957. [Google Scholar]
- Canty, R.B.; Bennett, J.A.; Brown, K.A.; Buonassisi, T.; Kalinin, S.V.; Kitchin, J.R.; Maruyama, B.; Moore, R.G.; Schrier, J.; Seifrid, M.; et al. Science acceleration and accessibility with self-driving labs. Nat. Commun. 2025, 16, 3856. [Google Scholar] [CrossRef] [PubMed]
- Bender, E.M.; Gebru, T.; McMillan-Major, A.; Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’21), Virtual Event Canada, 3–10 March 2021; pp. 610–623. [Google Scholar] [CrossRef]
- Mosqueira-Rey, E.; Hernández-Pereira, E.; Alonso-Ríos, D.; Bobes-Bascarán, J.; Fernández-Leal, Á. Human-in-the-loop machine learning: A state of the art. Artif. Intell. Rev. 2023, 56, 3005–3054. [Google Scholar] [CrossRef]
- Zaki, M.; Jayadeva; Mausam; Krishnan, N.M.A. Mascqa: Investigating materials science knowledge of large language models. Digit. Discov. 2024, 3, 313–327. [Google Scholar] [CrossRef]
- Song, Y.; Miret, S.; Liu, B. Matsci-nlp: Evaluating scientific language models on materials science language tasks using text-to-schema modeling. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023. [Google Scholar]
- Häse, F.; Roch, L.M.; Aspuru-Guzik, A. Next-generation experimentation with self-driving laboratories. Trends Chem. 2019, 1, 282–291. [Google Scholar] [CrossRef]
- Boiko, D.A.; MacKnight, R.; Kline, B.; Gomes, G. Autonomous chemical research with large language models. Nature 2023, 624, 570–578. [Google Scholar] [CrossRef] [PubMed]
- Bennett, J.A.; Abolhasani, M. Autonomous chemical science and engineering enabled by self-driving laboratories. Curr. Opin. Chem. Eng. 2022, 36, 100831. [Google Scholar] [CrossRef]
- Davis, E. Using a Large Language Model to Generate Program Mutations for a Genetic Algorithm to Search for Solutions to Combinatorial Problems: Review of (Romera-Paredes et al.). 2023. Available online: https://cs.nyu.edu/ (accessed on 21 January 2026).
- Ridnik, T.; Kredo, D.; Friedman, I. Code generation with alphacodium: From prompt engineering to flow engineering. arXiv 2024, arXiv:2401.08500. [Google Scholar] [CrossRef]
- Coja-Oghlan, A.; Loick, P.; Mezei, B.F.; Sorkin, G.B. The ising antiferromagnet and max cut on random regular graphs. arXiv 2020, arXiv:2009.10483. [Google Scholar] [CrossRef]
- Völker, C.; Rug, T.; Jablonka, K.M.; Kruschwitz, S. Llms Can Design Sustainable Concrete—A Systematic Benchmark (Resubmitted Version). 2024. Available online: https://www.researchgate.net/publication/377722231_LLMs_can_Design_Sustainable_Concrete_-a_Systematic_Benchmark_re-submitted_version (accessed on 21 January 2026).
- Zhao, S.; Chen, S.; Zhou, J.; Li, C.; Tang, T.; Harris, S.J.; Liu, Y.; Wan, J.; Li, X. Potential to transform words to watts with large language models in battery research. Cell Rep. Phys. Sci. 2024, 5, 101844. [Google Scholar] [CrossRef]
- Lei, G.; Docherty, R.; Cooper, S.J. Materials science in the era of large language models: A perspective. Digit. Discov. 2024, 3, 1257–1272. [Google Scholar] [CrossRef]
- Romera-Paredes, B.; Barekatain, M.; Novikov, A.; Balog, M.; Kumar, M.P.; Dupont, E.; Ruiz, F.J.R.; Ellenberg, J.S.; Wang, P.; Kohli, P.; et al. Mathematical discoveries from program search with large language models. Nature 2024, 625, 468–475. [Google Scholar] [CrossRef]
- Blaiszik, B.; Ward, L.; Schwarting, M.; Gaff, J.; Chard, R.; Pike, D.; Chard, K.; Foster, I. The materials data facility: Data services to advance materials science research. JOM 2016, 68, 2045–2052. [Google Scholar] [CrossRef]
- Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.-W.; Silva Santos, L.B.; Bourne, P.E.; et al. The fair guiding principles for scientific data management and stewardship. Sci. Data 2016, 3, 160018. [Google Scholar] [CrossRef]
- Dunn, A.; Wang, Q.; Ganose, A.; Dopp, D.; Jain, A. Benchmarking materials property prediction methods: The matbench test set and the automatminer reference algorithm. npj Comput. Mater. 2020, 6, 138. [Google Scholar] [CrossRef]
- Alampara, N.; Schilling-Wilhelmi, M.; Ríos-García, M.; Mandal, I.; Khetarpal, P.; Grover, H.S.; Krishnan, N.M.A.; Jablonka, K.M. Probing the limitations of multimodal language models for chemistry and materials research. Nat. Comput. Sci. 2025, 5, 952–961. [Google Scholar] [CrossRef]
- Peters, U.; Chin-Yee, B. Generalization bias in large language model summarization of scientific research. R. Soc. Open Sci. 2025, 12, 241776. [Google Scholar] [CrossRef]
- Jain, A.; Ong, S.P.; Hautier, G.; Chen, W.; Richards, W.D.; Dacek, S.; Cholia, S.; Gunter, D.; Skinner, D.; Ceder, G.; et al. Commentary: The materials project: A materials genome approach to accelerating materials innovation. APL Mater. 2013, 1, 011002. [Google Scholar] [CrossRef]
- Miret, S.; Lee, K.L.K.; Gonzales, C.; Nassar, M.; Spellings, M. The Open MatSci ML Toolkit: A flexible framework for machine learning in materials science. arXiv 2022, arXiv:2210.17484. [Google Scholar] [CrossRef]
- Frey, N.C.; Soklaski, R.; Axelrod, S.; Samsi, S.; Gómez-Bombarelli, R.; Coley, C.W.; Gadepally, V. Neural scaling of deep chemical models. Nat. Mach. Intell. 2023, 5, 1297–1305. [Google Scholar] [CrossRef]
- Kim, E.; Huang, K.; Saunders, A.; McCallum, A.; Ceder, G.; Olivetti, E. Materials synthesis insights from scientific literature via text extraction and machine learning. Chem. Mater. 2017, 29, 9436–9444. [Google Scholar] [CrossRef]
- Beltagy, I.; Lo, K.; Cohan, A. SciBERT: A pretrained language model for scientific text. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 3615–3620. [Google Scholar]
- Chen, M.; et al. Evaluating large language models trained on code. arXiv 2021, arXiv:2107.03374. [Google Scholar] [CrossRef]
- Roziere, B.; Gehring, J.; Gloeckle, F.; Sootla, S.; Gat, I.; Tan, X.E.; Adi, Y.; Liu, J.; Sauvestre, R.; Remez, T.; et al. Code Llama: Open foundation models for code. arXiv 2023, arXiv:2308.12950. [Google Scholar]
- Masry, A.; Long, D.X.; Tan, J.Q.; Joty, S.; Hoque, E. ChartQA: A benchmark for question answering about charts with visual and logical reasoning. In Proceedings of the Findings of ACL 2022; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 2263–2279. [Google Scholar]
- Li, S.; Tajbakhsh, N. SciGraphQA: A large-scale synthetic multi-turn question answering dataset for scientific graphs. arXiv 2023, arXiv:2308.03349. [Google Scholar]
- Abu-Odeh, A.; Galvan, E.; Kirk, T.; Mao, H.; Chen, Q.; Mason, P.; Malak, R.; Arroyave, R. Exploration of the high entropy alloy space as a constraint satisfaction problem. Acta Mater. 2018, 164, 1–11. [Google Scholar]
- Lookman, T.; Balachandran, P.V.; Xue, D.; Yuan, R. Active learning in materials science with emphasis on adaptive sampling using uncertainties for targeted design. npj Comput. Mater. 2019, 5, 21. [Google Scholar] [CrossRef]
- Guo, C.; Pleiss, G.; Sun, Y.; Weinberger, K.Q. On calibration of modern neural networks. In Proceedings of the 34th International Conference on Machine Learning 2017, Sydney, NSW, Australia, 6–11 August 2017; Volume 70, pp. 1321–1330, PMLR (2017). [Google Scholar]
- Khajondetchairit, P.; Somdee, S.; Saelee, T.; Ektarawong, A.; Alling, B.; Praserthdam, P.; Rittiruam, M.; Praserthdam, S. Machine learning-accelerated density functional theory optimization of PtPd-based high-entropy alloys for hydrogen evolution catalysis. Int. J. Miner. Metall. Mater. 2025, 32, 2777–2785. [Google Scholar] [CrossRef]
- Fei, Y.; Rendy, B.; Kumar, R.; Dartsi, O.; Sahasrabuddhe, H.P.; McDermott, M.J.; Wang, Z.; Szymanski, N.J.; Walters, L.N.; Milsted, D.; et al. AlabOS: A Python-based reconfigurable workflow management framework for autonomous laboratories. Digit. Discov. 2024, 3, 2275–2289. [Google Scholar] [CrossRef]

| Dimension | ML | LLMs | Synergy |
|---|---|---|---|
| Data | Structured/numerical | Unstructured/multimodal | Text → structured data |
| Capability | Pattern recognition, regression | Semantic reasoning, generation | Feature optimization, workflow automation |
| Knowledge | Explicit patterns in data | Implicit logic in the literature | Prior knowledge for physical plausibility |
| Interaction | Code, parameters | Natural language | Natural-language invocation and configuration of ML models |
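The "text → structured data" synergy in the table above can be sketched in miniature. The snippet below is an illustrative, regex-based stand-in for the extraction step an LLM would perform on a literature sentence; the field names (`alloy`, `anneal_C`, `ys_MPa`), the pattern, and the example sentence are all hypothetical, not part of any published pipeline.

```python
import re

# Illustrative pattern: alloy formula, annealing temperature, yield strength.
RECORD_RE = re.compile(
    r"(?P<alloy>(?:[A-Z][a-z]?\d*\.?\d*){2,}) alloy.*?"
    r"annealed at (?P<anneal_C>\d+)\s*°?C.*?"
    r"yield strength (?:of )?(?P<ys_MPa>\d+(?:\.\d+)?)\s*MPa"
)

def text_to_record(sentence: str) -> dict:
    """Turn one unstructured sentence into a structured ML-ready record."""
    m = RECORD_RE.search(sentence)
    if m is None:
        raise ValueError("no structured record found")
    return {
        "alloy": m.group("alloy"),
        "anneal_C": float(m.group("anneal_C")),
        "ys_MPa": float(m.group("ys_MPa")),
    }

rec = text_to_record(
    "The CoCrFeNi alloy, annealed at 1100 °C for 2 h, "
    "showed a yield strength of 310 MPa."
)
```

The resulting records can then populate the structured/numerical feature tables that conventional ML models in the left column of the table consume.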
| Year | Work | Task/Scope | Data | Originality/HEA Relevance | Scientific Integrity/ Reproducibility | Key Limitations |
|---|---|---|---|---|---|---|
| 2024 | Kamnis, S.—“Introducing pre-trained transformers for high entropy alloy informatics” [56] | Transfer-learning property prediction (theory → experiment) | Thermodynamic unlabeled pretrain; experimental HEA fine-tune | HEA-oriented pretrain–fine-tune framing; transfer across data sources | Needs leakage-safe splits; OOD generalization reporting; artifact release | Supervised only (non-agentic); limited failure-mode/condition-mismatch analysis |
| 2025 | Kamnis, S.—“HEA property predictions using a transformer-based language model” (journal) [57] | Peer-reviewed transfer learning; + interpretability | Synthetic pretrain; HEA fine-tune; k-fold | Journal-validated extension; broader analysis | Requires explicit uncertainty quantification; stricter out‑of‑distribution testing; full artifact release for reproducibility | Non-tool-using; sensitive to dataset bias and metadata gaps |
| 2025 | Zhen, S.—“AI-Driven Design of HEAs for H₂ electrocatalysis” (ChemRxiv) [58] | LLM-assisted literature curation + screening workflow | Literature-mined database + screening pipeline | End-to-end LLM support for curation-driven HEA workflow | Requires prompts + corpus snapshot; auditable inclusion/exclusion; pipeline artifacts | LLM mainly curation/organization; susceptible to literature-selection bias |
| 2024 | Luo, M.—“Robotic AI chemist for HEA nanozymes” (ChemRxiv) [59] | Closed-loop autonomy; LLM-in-the-loop (GPT-4) + BO | Robotic synthesis/testing + optimization loop | Agentic LLM integrated into autonomous HEA discovery | Needs prompt/guardrail transparency; logged decision traces; failure-case reporting; transfer/holdout tests | Preprint; task/assay-specific; limited evidence of cross-system generalization |
| 2024 | Choudhary, K.—“AtomGPT” (arXiv) [60] | General materials GPT (predict + generate) | Text/structure/property; DFT validation | Foundation-model direction; potentially adaptable to HEA tasks | For HEA: needs HEA benchmarks; clear splits/baselines; reproducible fine-tuning in-domain vs. out-of-domain performance | Not HEA-specific; HEA performance uncertain without domain adaptation |
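The "leakage-safe splits" flagged for entries [56,57] can be made concrete with a minimal sketch. Because HEA datasets often contain repeated measurements of the same composition, records should be grouped by alloy so that no composition straddles the train/test boundary; a random row-wise split would leak near-duplicates into the test set. Function and field names below are illustrative.

```python
def grouped_split(records, group_key, test_groups):
    """Leakage-safe split: every record of a given alloy composition
    lands entirely in train or entirely in test, never both."""
    train, test = [], []
    for rec in records:
        (test if rec[group_key] in test_groups else train).append(rec)
    return train, test

# Illustrative records: the same composition measured twice.
records = [
    {"alloy": "CoCrFeNi",   "ys_MPa": 310},
    {"alloy": "CoCrFeNi",   "ys_MPa": 295},  # repeat measurement
    {"alloy": "AlCoCrFeNi", "ys_MPa": 1250},
    {"alloy": "CoCrFeMnNi", "ys_MPa": 350},
]
train, test = grouped_split(records, "alloy", {"AlCoCrFeNi"})
```

Reporting out-of-distribution generalization then amounts to choosing `test_groups` that are deliberately dissimilar (e.g., a held-out alloy family) rather than random.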
| Core Dimension | Key Challenges/Status Quo | LLM-Enabled Opportunities/Core Capabilities | Priority Actions/Considerations |
|---|---|---|---|
| Knowledge Management | Experimental data and theoretical insights are fragmented across a rapidly growing, unstructured literature. | Automated mining and synthesis of text to extract structured facts and build queryable knowledge graphs, integrating disparate findings. | Develop community-shared, high-quality HEA knowledge bases with standardized ontologies and robust entity-linking tools. |
| Design Paradigm | The composition-process-property relationship is high-dimensional, non-linear, and expensive to explore experimentally. | Data-driven prediction and generative inverse design, leveraging transfer learning and sequence modeling to propose plausible candidates. | Mandate rigorous out-of-distribution evaluation, uncertainty quantification, and hard physical-constraint enforcement for all generated suggestions. |
| Research Workflow | Multiscale simulation and experimental loops are often manual, siloed, and lack reproducibility. | Intelligent orchestration and agency, automating workflows from literature to simulation to experiment within a unified interface. | Create reproducible, human-in-the-loop LLM-agent frameworks with open toolchains and explicit audit trails for full transparency. |
| Evaluation Standard | A lack of trusted, domain-specific benchmarks for assessing LLM utility in materials discovery. | A unified interface for multi-ability evaluation (extraction, reasoning, generation, explanation) under scientifically meaningful metrics. | Establish community-adopted benchmarks that test multimodal understanding, causal reasoning, and real-world experimental validation hit rates. |
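The "hard physical-constraint enforcement" called for in the Design Paradigm row can be sketched as a validator applied to every LLM-generated candidate before it reaches screening. The element whitelist, the 5–35 at.% per-element window (a common HEA definitional convention), and the four-element minimum below are illustrative choices for the sketch, not a definitive rule set.

```python
# Illustrative whitelist of candidate elements (assumption, not exhaustive).
ALLOWED = {"Al", "Co", "Cr", "Cu", "Fe", "Mn", "Ni", "Ti", "V"}

def validate_composition(comp: dict, tol: float = 1e-6) -> list:
    """Return a list of violated hard constraints (empty list = valid).
    comp maps element symbol -> atomic fraction."""
    errors = []
    for el in comp:
        if el not in ALLOWED:
            errors.append(f"unknown element {el}")
    total = sum(comp.values())
    if abs(total - 1.0) > tol:
        errors.append(f"fractions sum to {total:.3f}, not 1")
    for el, x in comp.items():
        if not 0.05 <= x <= 0.35:
            errors.append(f"{el} = {x:.2f} outside 5-35 at.% window")
    if len(comp) < 4:
        errors.append("fewer than 4 principal elements")
    return errors

ok = validate_composition({"Co": 0.25, "Cr": 0.25, "Fe": 0.25, "Ni": 0.25})
bad = validate_composition({"Co": 0.70, "Cr": 0.10, "Fe": 0.10, "Ni": 0.10})
```

Returning the full violation list, rather than a pass/fail flag, gives the orchestrating agent an auditable rejection reason for each discarded candidate, which supports the explicit audit trails advocated in the Research Workflow row.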
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Guo, Y.; Yang, C. Large Language Models for High-Entropy Alloys: Literature Mining, Design Orchestration, and Evaluation Standards. Metals 2026, 16, 162. https://doi.org/10.3390/met16020162