Editorial

The Specialization of Intelligence in AI Horizons: Present Status and Visions for the Next Era

by Antonio Pagliaro 1,2,3,* and Pierluca Sangiorgi 1,*
1 INAF IASF Palermo, Via Ugo La Malfa 153, I-90146 Palermo, Italy
2 Istituto Nazionale di Fisica Nucleare Sezione di Catania, Via Santa Sofia, 64, I-95123 Catania, Italy
3 ICSC—Centro Nazionale di Ricerca in HPC, Big Data e Quantum Computing, I-40121 Bologna, Italy
* Authors to whom correspondence should be addressed.
Appl. Sci. 2025, 15(23), 12474; https://doi.org/10.3390/app152312474
Submission received: 15 September 2025 / Accepted: 21 November 2025 / Published: 25 November 2025

1. Introduction: The Expanding Horizon of Applied Artificial Intelligence

The current era of artificial intelligence is often portrayed through the lens of large general-purpose models. A closer look, however, reveals an even more consequential dynamic: the deep specialization of AI across scientific and industrial domains. Progress is increasingly measured not by proximity to a hypothetical monolithic intelligence but by the extent to which AI systems are co-designed with domain knowledge to solve concrete high-stakes problems. In practice, AI is becoming a toolkit of bespoke instruments—models, data pipelines, and evaluation protocols—purpose-built to augment human expertise and to open new frontiers of discovery [1].
This pattern is visible across the natural sciences and engineering, where domain-shaped architectures and training regimes are advancing the state of the art. In the physical sciences, machine learning has delivered skillful medium-range weather forecasts and probabilistic nowcasts and has accelerated materials discovery by coupling representation learning with physical priors and structured databases [2,3,4,5]. In healthcare, foundation models are being adapted to multimodal clinical data, catalyzing progress in clinical reasoning, decision support, and population-scale analyses—while also foregrounding questions of safety, equity, and external validity [6,7,8,9].
As these applications mature, the center of gravity shifts toward data-centric development: curating, validating, and transforming data have become as decisive as model choice. Robust pipelines for data quality, documentation, and governance—together with methods that transform or synthesize training data—are now indispensable for dependable performance and reproducibility [10,11,12,13]. In parallel, methodological work on explainability is moving upstream from post hoc rationalizations to ante hoc constraint-aware designs that make interpretability a first-class objective, aligning model structure with the semantics and invariances of the target domain [14,15,16].
The rise of increasingly capable language and multimodal models also renews long-standing questions about the reasoning, reliability, and nature of machine cognition. Recent surveys chart advances in tool use, planning, and verification; at the same time, critical examinations caution against conflating performance on benchmarks with genuine reasoning and propose cognitive–theoretic perspectives—such as dual-process accounts—to diagnose strengths and limitations [17,18,19,20]. These debates intersect with practical imperatives from governance and risk management, where emerging standards define processes for mapping risks, measuring controls, and mitigating harms across the AI lifecycle [21].
Taken together, these trajectories suggest a grounded vision for the next phase of AI: systems that are deeply integrated with domain practice; that privilege data stewardship and measurement; that embed interpretability by design; and that are evaluated not only by aggregate metrics but by their faithfulness to domain constraints, their reliability under distribution shifts, and their conformance to evolving norms and standards. This perspective reframes “progress” as a multi-objective optimization—advancing accuracy, robustness, and usefulness in tandem with transparency, safety, and social value.

2. The Diversification of AI in Practice: A Synthesis of Novel Applications

A representative cross section of recent literature paints a vivid picture of AI’s diversification, showcasing its customization to create new knowledge and solve practical problems across diverse fields. This synthesis is organized around three key areas of application: the physical and engineering sciences, healthcare and precision medicine, and human-centric systems for industry and society.

2.1. AI in the Physical and Engineering Sciences

In the physical and engineering sciences, AI is becoming an indispensable instrument for modeling complex systems where purely analytical models or classical numerical schemes reach their limits. The defining trend is the tight integration of domain knowledge—constraints, symmetries, measurement processes—into model design, data pipelines, and evaluation.
A concrete illustration comes from Travincas et al. [22], who predict the open porosity of industrial mortar: beyond intrinsic mortar properties, the model gains accuracy by encoding substrate characteristics (e.g., bulk density, porosity), reflecting a systems-level view in which AI captures interactions within composite materials. In high-energy astrophysics, La Parola et al. [23] address gamma/hadron discrimination in Cherenkov data by introducing temporal features from shower images, a paradigmatic case of domain-specific feature engineering that leverages the distinct time evolution of particle showers to improve performance in the low-energy regime.
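To make this kind of domain-shaped feature engineering concrete, the sketch below appends simple timing statistics derived from per-pixel arrival times to classical morphological descriptors before training a gradient-boosted classifier. It is a minimal illustration on synthetic data, not the pipeline of La Parola et al. [23]; the feature definitions, camera size, and model choice are assumptions.
```python
# Illustrative sketch only: combine classical morphological descriptors with
# simple temporal features before gamma/hadron classification. Feature names,
# synthetic data, and the model choice are assumptions, not the published pipeline.
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_events = 2000

# Classical (Hillas-style) morphological descriptors -- synthetic placeholders.
morph = rng.normal(size=(n_events, 4))          # e.g., length, width, size, alpha

# Temporal features derived from per-pixel arrival times of each shower image.
arrival_times = rng.normal(size=(n_events, 64)) # 64 pixels, hypothetical camera
temporal = np.column_stack([
    arrival_times.std(axis=1),                                  # time spread
    np.ptp(arrival_times, axis=1),                              # full time range
    np.median(np.diff(np.sort(arrival_times, axis=1), axis=1), axis=1),
])

X = np.hstack([morph, temporal])                # fused feature set
y = rng.integers(0, 2, size=n_events)           # 1 = gamma, 0 = hadron (synthetic)

clf = HistGradientBoostingClassifier(max_iter=200)
# On random labels the AUC is ~0.5; the point is the shape of the pipeline.
print(cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())
```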
These examples align with broader advances driven less by novel learning algorithms and more by domain-shaped data and targets. In geoscience, data-driven weather models now deliver skillful medium-range forecasts and calibrated probabilistic predictions by exploiting spatiotemporal structure and physical priors [2,3]. In materials discovery, large-scale campaigns couple representation learning with crystal-structure databases and physically motivated stability filters to identify viable compounds at unprecedented scale [4,5].
Taken together, this shift—from “AI for Science” to “Science-informed AI”—foregrounds data-centric development (curation, transformation, documentation) and ante hoc inductive biases aligned with the semantics and invariances of the system under study. The result is not only higher accuracy but also improved robustness, interpretability, and transferability [10,14].

2.2. AI in Healthcare and Precision Medicine

In healthcare, AI is catalyzing a fundamental transition from generalized treatments to precision medicine. This transition increasingly relies on multimodal foundation models that can integrate heterogeneous data types—including structured clinical data, unstructured textual notes, and physiological measurements—as well as imaging and genomics, to support risk prediction, triage, and decision support at the point of care [6,7,8]. Alongside technical advances, scholars have underscored the need to evaluate external validity, equity, and clinical safety from the outset [9].
A compelling demonstration of this is the work by Rhazzafe et al. [24], which predicts the Length of Stay (LOS) for patients in the Intensive Care Unit (ICU). Their approach creates a hybrid summarization of unstructured Electronic Health Record (EHR) notes, fusing this textual information with structured data such as vital signs and laboratory results. This kind of text–tabular fusion—now common in clinical foundation-model pipelines—illustrates how representation learning over messy EHR artifacts can recover clinically meaningful signals [6,8]. Complementing this predictive approach, the review by Trezza et al. [25] highlights the critical role of unsupervised learning in advancing precision medicine. By analyzing vast unlabeled datasets of genomic and medical information, unsupervised algorithms can uncover novel disease subtypes and identify new biomarkers. These data-driven strata can then be linked to therapeutic targets and trial enrichment strategies, tightening the loop between discovery and care [7].
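A minimal sketch of the text–tabular fusion idea follows: TF-IDF features from a condensed note are concatenated with structured vitals and labs before a simple regression on length of stay. The column names, toy records, and model are invented for illustration and do not reproduce the system of Rhazzafe et al. [24].
```python
# Minimal sketch of text-tabular fusion for LOS prediction. The column names,
# toy records, and model are illustrative assumptions, not the published system.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline

records = pd.DataFrame({
    "note_summary": ["septic shock, vasopressors weaned",      # condensed EHR note
                     "elective CABG, extubated day 1",
                     "COPD exacerbation, BiPAP overnight"],
    "heart_rate":   [112, 84, 96],                              # structured vitals
    "lactate":      [3.8, 1.1, 1.6],                            # structured labs
    "los_days":     [9.0, 5.0, 4.0],                            # target
})

fusion = ColumnTransformer([
    ("text", TfidfVectorizer(), "note_summary"),                # text branch
    ("tabular", "passthrough", ["heart_rate", "lactate"]),      # numeric branch
])

model = Pipeline([("features", fusion), ("reg", Ridge())])
model.fit(records, records["los_days"])
print(model.predict(records))
```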
The evolution of AI in medicine, as illustrated by these contributions, is moving beyond simple classification tasks. The challenge is no longer just prediction (e.g., will a patient be readmitted?) but also sophisticated knowledge extraction and synthesis. The work by Rhazzafe et al. [24] is not merely predicting a number; it is first creating a new condensed representation of the patient’s state through summarization, a crucial step in managing information overload for clinicians. Likewise, the emphasis on unsupervised methods by Trezza et al. [25] points to AI’s growing role as a hypothesis-generation tool, helping researchers and doctors ask new questions by revealing patterns that were previously invisible in the data. Viewed through this lens, clinical AI becomes a knowledge-discovery engine that must balance performance with transparency, robustness across populations, and responsible deployment within healthcare systems [7,9].

2.3. Human-Centric AI for Industry and Society

AI is increasingly deployed to enhance human experience, well-being, accessibility, and learning—an orientation that resonates with human-centered innovation agendas and with emerging norms for trustworthy risk-aware deployment [21]. Rather than optimizing exclusively for technical or business metrics, recent work foregrounds outcomes such as worker health, inclusion, and cognitive load, and emphasizes transparency and human oversight as design primitives [14].
A representative example is the framework by Rosca and Stancu [26], which fuses physiological signals from wearables with sentiment analysis of workplace communications to construct a Well-Being Index (WBI) for proactive organizational care. In accessibility, Kawulok and Maćkowski [27] leverage YOLO-based object detection to adapt mathematical graphs for blind learners, automating a previously labor-intensive process and lowering barriers to STEM participation. In education, Elahi, Morato and Iglesias [28] apply NLP to assess the topical relevance of instructional videos, aiming to streamline web readability and improve learner engagement by aligning multimedia with textual content.
Across these domains, effective systems share two methodological pillars. First, data-centric practice—from targeted curation to transformation and documentation—proves decisive for robustness and fairness in human-facing tasks [10,11]. Second, ante hoc interpretability—embedding constraints, human-meaningful features, or process transparency directly into models—supports accountability and adoption in sensitive organizational and educational settings [14]. Taken together, these directions recast AI as a sociotechnical instrument: one that augments human capabilities while meeting explicit requirements for safety, inclusivity, and governance [21].
Beyond public services and workplace settings, financial markets offer a demanding testbed for domain-informed AI. Recent evidence shows that tree-based ensembles and hybrid pipelines can beat single classifiers on directional accuracy—sometimes markedly—yet statistical gains often shrink once implementation frictions (transaction costs, market impact, latency) are accounted for, and performance varies across market regimes [29,30]. A critical reassessment argues for multi-dimensional evaluation that complements accuracy with risk-adjusted returns (e.g., the Sharpe ratio), drawdowns, and compute budgets, together with robust validation across market phases and assets [29]. These studies reinforce our broader thesis: progress hinges on data-centric practice, interpretable structure, and resource-rational design, so that models deliver sustained economic value rather than merely attractive backtests [29,30].
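To make this multi-dimensional evaluation concrete, the sketch below scores a directional strategy not only by hit rate but also by cost-adjusted Sharpe ratio and maximum drawdown; the signal series, return series, and cost level are synthetic placeholders rather than results from [29,30].
```python
# Illustrative scoring of a directional strategy beyond raw accuracy:
# cost-adjusted returns, annualized Sharpe ratio, and maximum drawdown.
# Signals, returns, and the cost assumption are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(1)
daily_returns = rng.normal(0.0003, 0.01, size=756)   # ~3 years of market returns
signals = rng.choice([-1, 1], size=756)              # model's long/short calls

cost_per_trade = 0.0005                              # assumed round-trip friction
trades = np.abs(np.diff(signals, prepend=signals[0])) > 0
strategy = signals * daily_returns - trades * cost_per_trade

hit_rate = np.mean(np.sign(signals) == np.sign(daily_returns))
sharpe = np.sqrt(252) * strategy.mean() / strategy.std()

equity = np.cumprod(1 + strategy)
max_drawdown = np.max(1 - equity / np.maximum.accumulate(equity))

print(f"hit rate {hit_rate:.2%}, Sharpe {sharpe:.2f}, max drawdown {max_drawdown:.2%}")
```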

3. Advancing the AI Toolkit: Methodological Innovations and Data-Centric Approaches

Shifting focus from applications to the methods that enable them, recent work converges on a clear message: the most impactful results arise from a deep engagement with the entire data–model–evaluation pipeline. Rather than “deploying a model,” researchers increasingly practice a rigorous scientific workflow that designs datasets, inductive biases, targets, and measurements in concert [10,11,13].
A foundational element of this shift is the centrality of data. The contribution by Carrilho, Hambarde and Proença [31] on fabric defect detection is exemplary: the creation of the Lusitano dataset—collected under operational conditions—addresses the domain mismatch and sparsity that limit the external validity of models trained on legacy corpora. This data-centric stance resonates with broader evidence that curation, transformation, and documentation of training data frequently constrain downstream performance more than algorithmic novelty [10,12].
Alongside data work, composite modeling strategies are becoming standard practice. The approach of Rhazzafe et al. [24]—combining extractive sentence selection with abstractive summarization for EHRs—illustrates how hybrid pipelines can distill heterogeneous information into clinically meaningful representations. Building on and refining prior feature-engineering strategies from Pagliaro et al. [32], La Parola et al. [23] achieve superior event discrimination in high-energy astrophysics by ensembling classical morphological descriptors with novel temporal features, showing that carefully composed feature sets and model combinations can outperform single-method baselines in complex regimes.
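As a rough sketch of the extractive stage in such a hybrid pipeline, sentences can be ranked by cosine similarity to the document centroid in TF-IDF space and the most central ones passed on to an abstractive model (omitted here). The toy note and the cutoff k are invented; this illustrates the pattern, not the published method.
```python
# Sketch of an extractive first stage for a hybrid summarizer: rank sentences by
# cosine similarity to the note's TF-IDF centroid, keep the top-k, and hand the
# result to an abstractive model (omitted). Toy note and k are assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

note = ("Patient admitted with septic shock. Started on broad-spectrum antibiotics. "
        "Family updated at bedside. Lactate trending down after fluid resuscitation. "
        "Plan to wean vasopressors tomorrow.")
sentences = [s.strip() for s in note.split(". ") if s.strip()]

tfidf = TfidfVectorizer().fit_transform(sentences)
centroid = np.asarray(tfidf.mean(axis=0))            # document-level centroid
scores = cosine_similarity(tfidf, centroid).ravel()

k = 3                                                # keep the k most central sentences
selected = [sentences[i] for i in np.argsort(scores)[::-1][:k]]
print(" ".join(selected))                            # input for the abstractive stage
```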
Architectural scrutiny and task-aligned inductive biases are equally prominent. The comparative analysis of multiple YOLO variants by Kawulok and Maćkowski [27] is not a superficial benchmark but a targeted study of speed–accuracy–suitability trade-offs for accessibility-oriented graph element detection. More broadly, ante hoc interpretability—embedding constraints, human-meaningful structure, or process transparency directly into models—is moving upstream in the design loop, improving robustness and trust without relying solely on post hoc explanations [14].
Taken together, these methodological trends suggest that modern AI research increasingly mirrors scientific experimentation: careful hypothesis formation (e.g., “temporal dynamics will improve discrimination”), meticulous and domain-faithful data collection (e.g., Lusitano), bespoke “instruments” via hybrid/ensemble models (EHR summarization, astrophysical feature fusion), and rigorous comparative studies of architectures and training regimes (YOLO evaluations). As the field matures, progress depends on orchestrating data management, model design, and measurement—advancing accuracy and utility alongside transparency, reliability, and reproducibility [10,11,14].

4. The Next Era: Towards Transparent, Robust, and Cognitively Aware AI

The most forward-looking strands of recent research outline the fundamental challenges and directions likely to shape the next era of AI: building systems that are not only more capable but also transparent by design, robust under distribution shift, and grounded in a clearer account of machine reasoning. These priorities align with emerging norms for trustworthy deployment, where transparency, risk management, and evaluability are treated as first-class operational requirements [21].

4.1. The Imperative of Transparency: From Post Hoc to Pre Hoc Explainability

As AI systems move deeper into high-stakes domains, trust and transparency are not optional add-ons but design constraints. This motivates a shift from predominantly post hoc explanations toward ante/pre hoc approaches that embed interpretability directly into objectives, architectures, and data pipelines [14,15,16]. Post hoc methods remain useful for auditing and debugging, but their instability and potential misalignment with model internals limit their sufficiency when accountability is paramount.
A representative step in this direction is the contribution of Acun and Nasraoui [33], which formalizes “pre hoc” and “co hoc” explainability. By coupling a transparent white-box model (e.g., sparse logistic regression) to regularize or co-train with a higher-capacity black-box, they obtain predictors that maintain accuracy while inheriting interpretable structure. This strategy exemplifies “explainability by design”: interpretability is not retrofitted but optimized jointly with predictive performance. In practice, this complements data-centric measures—curation, documentation, and feature semantics—that anchor explanations to domain-meaningful variables and constraints [10].
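A toy reading of this “explainability by design” pattern is sketched below, under assumptions of our own rather than the exact formulation of Acun and Nasraoui [33]: a black-box network is trained against the labels while a fidelity penalty keeps its predictions close to those of a sparse logistic regression fitted on the same data; the penalty weight and data are placeholders.
```python
# Toy "co hoc" sketch (an assumption-laden reading of the idea, not the paper's
# exact method): a black-box MLP is trained on the task while a fidelity term
# pulls its outputs toward a sparse, interpretable logistic regression.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)).astype(np.float32)
y = (X[:, 0] - 0.5 * X[:, 3] + 0.1 * rng.normal(size=500) > 0).astype(np.float32)

# White-box partner: sparse (L1) logistic regression trained on the same data.
white_box = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
wb_probs = torch.tensor(white_box.predict_proba(X)[:, 1], dtype=torch.float32)

X_t, y_t = torch.tensor(X), torch.tensor(y)
black_box = torch.nn.Sequential(
    torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1)
)
opt = torch.optim.Adam(black_box.parameters(), lr=1e-2)
lam = 0.5                                      # fidelity weight (a placeholder)

for _ in range(200):
    opt.zero_grad()
    probs = torch.sigmoid(black_box(X_t)).squeeze(1).clamp(1e-6, 1 - 1e-6)
    task_loss = torch.nn.functional.binary_cross_entropy(probs, y_t)
    fidelity_loss = torch.mean((probs - wb_probs) ** 2)   # stay near the white box
    (task_loss + lam * fidelity_loss).backward()
    opt.step()

print(task_loss.item(), fidelity_loss.item())
```
The single weight lam trades predictive accuracy against agreement with the interpretable partner, mirroring the faithfulness–accuracy trade-off discussed above.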
Looking ahead, the research agenda broadens from explaining models to engineering explainable systems: selecting targets that reflect domain semantics; using priors and invariances to shape representations; adopting validation protocols that test explanation faithfulness and usefulness; and aligning artifacts with governance frameworks for risk mapping and control evaluation [21]. In this view, transparency is inseparable from robustness and reliability: models that expose structure and rationale are easier to calibrate, monitor, and adapt—key properties for sustainable deployment at scale.

4.2. Deconstructing the ‘Illusion of Thinking’: A Critical Debate on LRM Reasoning

A growing body of work is converging on a more scientific account of what current Large Reasoning Models (LRMs) can and cannot do, moving beyond polarizing slogans toward testable claims about representations, inference, and failure modes [17,18]. The debate is framed by a quartet of interconnected perspectives. The provocation comes from Shojaee et al. [19], who articulate an “illusion of thinking”: LRMs may surpass standard LLMs on medium-complexity problems, yet exhibit sharp performance and efficiency collapse as complexity rises, suggesting brittle heuristics rather than durable reasoning. This claim is tempered by methodological critiques from Lawsen and Opus [34] and Dellibarda Varela et al. [35], who argue that parts of the observed collapse are artifacts of evaluation design—e.g., unsolvable instances, hidden confounders, and token/compute ceilings that cap search and verification—implying that some “illusion” may be benchmark-induced rather than model-intrinsic.
Zooming out, recent surveys systematize LRM ingredients—tool use, search, program synthesis, and self-verification—and map how data, prompts, and training signals shape reasoning behavior [17,18]. Complementary cognitive–theoretic views propose dual-process perspectives in which fast heuristic pathways interact with slower deliberative routines, offering a principled lens on when and why models succeed or fail [20]. Together, these strands motivate a shift from headline metrics to diagnostic evaluation: (i) complexity-controlled testbeds; (ii) perturbation and counterfactual analyses to probe shortcutting; (iii) faithfulness tests for intermediate rationales; and (iv) resource-aware protocols that separate algorithmic limits from imposed budget constraints [17,18].
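One way to operationalize such diagnostics is sketched below with an invented toy task and budget rule: instances of increasing complexity are solved under an explicit step budget, and both accuracy and the effort actually spent are recorded, so that intrinsic failure can be distinguished from budget exhaustion. A real study would substitute an LRM call and count its tokens.
```python
# Sketch of a complexity-controlled, budget-aware evaluation harness. The task
# generator, stub solver, and budget rule are invented for illustration; a real
# study would plug in an LRM call and count tokens instead of "steps".
import random

def make_instance(complexity: int):
    """Toy task: sum a chain of `complexity` signed integers."""
    terms = [random.randint(-9, 9) for _ in range(complexity)]
    return terms, sum(terms)

def stub_solver(terms, step_budget: int):
    """Pretend reasoner: one 'step' per term; fails if the budget runs out."""
    steps_needed = len(terms)
    if steps_needed > step_budget:
        return None, step_budget          # budget exhausted, no answer
    return sum(terms), steps_needed

random.seed(0)
for complexity in (2, 4, 8, 16, 32):
    correct, spent = 0, 0
    for _ in range(100):
        terms, answer = make_instance(complexity)
        pred, steps = stub_solver(terms, step_budget=20)
        correct += int(pred == answer)
        spent += steps
    print(f"complexity {complexity:>2}: accuracy {correct/100:.2f}, "
          f"mean steps {spent/100:.1f}")
```
In this toy setting accuracy collapses exactly where the imposed budget is exceeded while measured effort plateaus, which is the signature the critiques above attribute to evaluation design rather than to the model itself.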
In practical terms, the emerging consensus is neither triumphalist nor dismissive. LRMs can emulate aspects of structured reasoning when scaffolded with appropriate tools, search, and verification; yet their reliability hinges on data curation, transparency about intermediate steps, and guardrails that detect confabulation and specious chains of thought [10,14]. For high-stakes use, this debate underscores the need to pair capability research with governance and risk management practices that stress test models under shifting distributions and clearly delimit safe operating conditions.

4.3. A New Lens for AI Limitations: Bounded Rationality and Dual-Process Theory

A unifying perspective in this debate is offered by Gorelik [20], who reframes LRM limitations not as bugs or mere illusions but as manifestations of bounded rationality: systems reason under finite computational budgets and partial information. Through the lens of dual-process theory, LRM behavior can be viewed as the interaction between a fast heuristic “System 1” (pattern-guided token generation) and a slower deliberative “System 2” (scaffolded reasoning such as Chain-of-Thought, tool use, or program synthesis). From this standpoint, the observed “performance collapse” at higher complexities is an expected phase transition when capacity ceilings are reached, rather than a categorical failure [19].
A key insight in Gorelik [20] is the cross-disciplinary analogy to the seminal 1966 study by Kahneman and Beatty [36], where human cognitive effort (indexed by pupil dilation) follows an inverted-U: it grows with task demand until capacity is met, then plateaus or recedes as the system disengages. LRMs display an analogous profile when computational effort (e.g., tokens or search steps) is plotted against problem complexity—suggesting that reliability depends as much on resource allocation as on raw competence. Recent surveys sharpen this view by decomposing “System 2” into ingredients—external tools, search, program induction, and self-verification—and by showing how prompts, data, and training signals modulate when models switch from fast heuristics to slower deliberation [17,18].
This reframing redirects the improvement agenda. Rather than “fixing” collapse by unbounded scaling, it motivates resource-rational design: meta-controllers that allocate compute adaptively; curricula and data that teach models when to deliberate; and evaluation that scores not only final accuracy but also efficiency, calibration, and prudent abstention. Concretely, this implies (i) complexity-controlled testbeds and budget-aware protocols that separate intrinsic limits from imposed ceilings; (ii) cognitively inspired architectures that couple lightweight System 1 modules with targeted System 2 routines under meta-cognitive control; and (iii) workflows for human–AI collaboration that route hard or uncertain cases to human experts when the model operates outside its reliable regime [10,14]. In short, bounded rationality and dual-process theory provide principled guardrails for building LRMs that reason well enough under constraints—and for deploying them safely in the wild.
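A schematic sketch of such resource-rational routing follows, with the cost model, confidence threshold, and both “systems” as stand-ins rather than components of any real LRM stack: a meta-controller answers cheaply when the fast pathway is confident, escalates to slower deliberation while budget remains, and abstains otherwise.
```python
# Schematic meta-controller: answer with the fast heuristic when it is confident,
# escalate to slow deliberation while the compute budget allows, abstain otherwise.
# The cost model, threshold, and both "systems" are stand-ins, not a real LRM stack.
from dataclasses import dataclass

@dataclass
class Answer:
    value: str
    route: str      # "system1", "system2", or "abstain"
    cost: int       # compute units actually spent

def system1(query: str) -> tuple[str, float]:
    """Fast pattern-matching guess with a self-reported confidence."""
    guess = "cached answer" if "capital" in query else "unsure"
    return guess, 0.9 if guess != "unsure" else 0.3

def system2(query: str) -> tuple[str, int]:
    """Slow deliberation (tools, search, verification); returns answer and cost."""
    return f"derived answer for: {query}", 40

def route(query: str, budget: int, threshold: float = 0.8) -> Answer:
    guess, confidence = system1(query)
    if confidence >= threshold:
        return Answer(guess, "system1", cost=1)
    if budget >= 40:                                    # enough budget to deliberate
        value, cost = system2(query)
        return Answer(value, "system2", cost=cost)
    return Answer("I don't know", "abstain", cost=1)    # prudent abstention

print(route("capital of Italy?", budget=100))
print(route("prove this conjecture", budget=10))
```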

5. Conclusions: Charting Future Research Directions

Recent advances depict a field in transition: from broad model-centric deployment to science-informed data-centric practice and toward systems that foreground transparency, robustness, and a clearer account of machine reasoning [10,14,17,18]. The synthesis presented here suggests that embracing bounded rationality and drawing on insights from cognitive science can provide principled guardrails for next-generation AI—especially where governance and risk management are operational requirements [21]. From this perspective, several directions emerge:
  • Foster Cross-Domain Methodological Transfer. The domain-shaped feature engineering of La Parola et al. [23] and the hybrid pipelines of Rhazzafe et al. [24] offer reusable patterns. Future work should crystallize such patterns into transferable frameworks (data schemas, evaluation suites, and reference pipelines) to accelerate progress across disciplines—including geoscience and materials discovery, where structured priors and curated corpora have proven decisive [2,4,5].
  • Develop Benchmarks for Bounded Rationality. Accuracy-only leaderboards are insufficient for advanced reasoning systems. Building on Gorelik [20], Lawsen and Opus [34], and related analyses [19], benchmarks should incorporate metrics of computational effort (token budgets, inference time, search steps), calibration and abstention, and complexity-controlled testbeds that disentangle intrinsic limits from imposed resource ceilings—aligned with emerging taxonomies of reasoning capabilities [17,18].
  • Launch an Integrated Research Program in AI and Cognitive Science. The analogy between LRM compute profiles and human cognitive load in Gorelik [20] warrants formal joint experiments. A shared task suite could co-measure human effort (e.g., pupillometry, EEG) and LRM computational effort on matched problems to test dual-process predictions and resource-rational control, taking inspiration from classic evidence such as Kahneman and Beatty [36].
  • Prioritize the Engineering of Inherently Trustworthy Systems. Extending the “explainability by design” agenda of Acun and Nasraoui [33], research should target ante/pre hoc approaches with guarantees on faithfulness–accuracy trade-offs, supported by rigorous data stewardship and documentation [10]. Deployment in high-stakes settings should align with lifecycle-oriented governance frameworks that operationalize risk mapping, control testing, and monitoring [21] and with design principles for human-meaningful transparency [14].
Pursuing these directions can consolidate today’s applied successes into reliable, explainable, and resource-rational AI—systems that perform well under constraints, generalize across domains, and earn trust through transparent design and disciplined evaluation.

Acknowledgments

The authors acknowledge the supercomputing resources and support from ICSC—Centro Nazionale di Ricerca in High Performance Computing, Big Data and Quantum Computing—and its hosting entity, funded by the European Union—NextGenerationEU.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Pagliaro, A.; Sangiorgi, P. AI in Experiments: Present Status and Future Prospects. Appl. Sci. 2023, 13, 10415. [Google Scholar] [CrossRef]
  2. Lam, R.; Sanchez-Gonzalez, A.; Willson, M.; Wirnsberger, P.; Fortunato, M.; Alet, F.; Ravuri, S.; Ewalds, T.; Eaton-Rosen, Z.; Hu, W.; et al. Learning Skillful Medium-Range Global Weather Forecasting. Science 2023, 382, eadi2336. [Google Scholar] [CrossRef]
  3. Price, I.; Sanchez-Gonzalez, A.; Alet, F.; Andersson, T.R.; El-Kadi, A.; Masters, D.; Ewalds, T.; Stott, J.; Mohamed, S.; Battaglia, P.; et al. Probabilistic Weather Forecasting with Machine Learning. Nature 2025, 637, 84–90. [Google Scholar] [CrossRef] [PubMed]
  4. Merchant, A.; Batzner, S.; Schoenholz, S.S.; Aykol, M.; Cheon, G.; Cubuk, E.D. Scaling Deep Learning for Materials Discovery. Nature 2023, 624, 80–85. [Google Scholar] [CrossRef]
  5. Pyzer-Knapp, E.O.; Manica, M.; Staar, P. Foundation Models for Materials Discovery—Current State and Future Directions. Npj Comput. Mater. 2025, 11, 61. [Google Scholar] [CrossRef]
  6. Moor, M.; Banerjee, O.; Abad, Z.S.H.; Krumholz, H.M.; Leskovec, J.; Topol, E.J.; Rajpurkar, P. Foundation Models for Generalist Medical Artificial Intelligence. Nature 2023, 616, 259–265. [Google Scholar] [CrossRef] [PubMed]
  7. Timilsina, M.; Buosi, S.; Asif Razzaq, M.; Haque, R.; Judge, C.; Curry, E. Harmonizing Foundation Models in Healthcare: A Comprehensive Review. Comput. Biol. Med. 2025, 189, 109925. [Google Scholar] [CrossRef]
  8. Khan, W.; Leem, S.; See, K.B.; Wong, J.K.; Zhang, S.; Fang, R. A Comprehensive Survey of Foundation Models in Medicine. arXiv 2024, arXiv:2406.10729. [Google Scholar] [CrossRef] [PubMed]
  9. Ranisch, R.; Haltaufderheide, J. Foundation Models in Medicine Are a Social Experiment: Time for an Ethical Framework. Npj Digit. Med. 2025, 8, 1924. [Google Scholar] [CrossRef]
  10. Zha, D.; Bhat, Z.P.; Lai, K.H.; Yang, F.; Jiang, Z.; Zhong, S.; Hu, X. Data-centric Artificial Intelligence: A Survey. ACM Comput. Surv. 2024, 56, 1–42. [Google Scholar] [CrossRef]
  11. Priestley, M.; O’Donnell, F.; Simperl, E. A Survey of Data Quality Requirements that Matter in ML Development Pipelines. ACM J. Data Inf. Qual. 2023, 15, 1–39. [Google Scholar] [CrossRef]
  12. Wang, D.; Huang, Y.; Ying, W.; Bai, H.; Gong, N.; Wang, X.; Dong, S.; Zhe, T.; Liu, K.; Xiao, M.; et al. Towards Data-Centric AI: A Comprehensive Survey of Traditional, Reinforcement, and Generative Approaches for Tabular Data Transformation. arXiv 2025, arXiv:2501.10555. [Google Scholar] [CrossRef]
  13. Jakubik, J.; Vössing, M.; Kühl, N.; Walk, J.; Satzger, G. Data-Centric Artificial Intelligence. arXiv 2023, arXiv:2212.11854. [Google Scholar] [CrossRef]
  14. Di Marino, A.; Bevilacqua, V.; Ciaramella, A.; De Falco, I.; Sannino, G. Ante-Hoc Methods for Interpretable Deep Models: A Survey. ACM Comput. Surv. 2025, 57, 1–36. [Google Scholar] [CrossRef]
  15. Zhao, H.; Chen, H.; Yang, F.; Liu, N.; Deng, H.; Cai, H.; Wang, S.; Yin, D.; Du, M. Explainability for Large Language Models: A Survey. ACM Comput. Surv. 2024, 56, 1–38. [Google Scholar] [CrossRef]
  16. Palikhe, A.; Yu, Z.; Wang, Z.; Zhang, W. A Survey on Explainable Large Language Models. arXiv 2025, arXiv:2506.21812. [Google Scholar] [CrossRef]
  17. Li, Z.-Z.; Zhang, D.; Zhang, M.-L.; Zhang, J.; Liu, Z.; Yao, Y.; Xu, H.; Zheng, J.; Wang, P.-J.; Chen, X.; et al. A Survey of Reasoning Large Language Models. arXiv 2025, arXiv:2502.17419. [Google Scholar] [CrossRef]
  18. Xu, F.; Hao, Q.; Zong, Z.; Wang, J.; Zhang, Y.; Wang, J.; Lan, X.; Gong, J.; Ouyang, T.; Meng, F.; et al. Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models. arXiv 2025, arXiv:2501.09686. [Google Scholar] [CrossRef]
  19. Shojaee, P.; Mirzadeh, I.; Alizadeh, K.; Horton, M.; Bengio, S.; Farajtabar, M. The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity. arXiv 2025, arXiv:2506.06941. [Google Scholar] [CrossRef]
  20. Gorelik, B. Not an Illusion but a Manifestation: Understanding Large Language Model Reasoning Limitations Through Dual-Process Theory. Appl. Sci. 2025, 15, 8469. [Google Scholar] [CrossRef]
  21. NIST AI 100-1; Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology (NIST): Gaithersburg, MD, USA, 2023. [CrossRef]
  22. Travincas, R.; Mendes, M.P.; Torres, I.; Flores-Colen, I. Predicting the Open Porosity of Industrial Mortar Applied on Different Substrates: A Machine Learning Approach. Appl. Sci. 2024, 14, 10780. [Google Scholar] [CrossRef]
  23. La Parola, V.; Cusumano, G.; Lombardi, S.; Compagnino, A.A.; La Barbera, A.; Tutone, A.; Pagliaro, A. Machine Learning-Enhanced Discrimination of Gamma-Ray and Hadron Events Using Temporal Features: An ASTRI Mini-Array Analysis. Appl. Sci. 2025, 15, 3879. [Google Scholar] [CrossRef]
  24. Rhazzafe, S.; Caraffini, F.; Colreavy-Donnelly, S.; Dhassi, Y.; Kuhn, S.; Nikolov, N.S. Hybrid Summarization of Medical Records for Predicting Length of Stay in the Intensive Care Unit. Appl. Sci. 2024, 14, 5809. [Google Scholar] [CrossRef]
  25. Trezza, A.; Visibelli, A.; Roncaglia, B.; Spiga, O.; Santucci, A. Unsupervised Learning in Precision Medicine: Unlocking Personalized Healthcare through AI. Appl. Sci. 2024, 14, 9305. [Google Scholar] [CrossRef]
  26. Rosca, C.-M.; Stancu, A. Fusing Machine Learning and AI to Create a Framework for Employee Well-Being in the Era of Industry 5.0. Appl. Sci. 2024, 14, 10835. [Google Scholar] [CrossRef]
  27. Kawulok, M.; Maćkowski, M. YOLO-Type Neural Networks in the Process of Adapting Mathematical Graphs to the Needs of the Blind. Appl. Sci. 2024, 14, 11829. [Google Scholar] [CrossRef]
  28. Elahi, E.; Morato, J.; Iglesias, A. Improving Web Readability Using Video Content: A Relevance-Based Approach. Appl. Sci. 2024, 14, 11055. [Google Scholar] [CrossRef]
  29. Pagliaro, A. Artificial Intelligence vs. Efficient Markets: A Critical Reassessment of Predictive Models in the Big Data Era. Electronics 2025, 14, 1721. [Google Scholar] [CrossRef]
  30. Pagliaro, A. Forecasting Significant Stock Market Price Changes Using Machine Learning: Extra Trees Classifier Leads. Electronics 2023, 12, 4551. [Google Scholar] [CrossRef]
  31. Carrilho, R.; Hambarde, K.A.; Proença, H. A Novel Dataset for Fabric Defect Detection: Bridging Gaps in Anomaly Detection. Appl. Sci. 2024, 14, 5298. [Google Scholar] [CrossRef]
  32. Pagliaro, A.; Cusumano, G.; La Barbera, A.; La Parola, V.; Lombardi, S. Application of Machine Learning Ensemble Methods to ASTRI Mini-Array Cherenkov Event Reconstruction. Appl. Sci. 2023, 13, 8172. [Google Scholar] [CrossRef]
  33. Acun, C.; Nasraoui, O. Pre Hoc and Co Hoc Explainability: Frameworks for Integrating Interpretability into Machine Learning Training for Enhanced Transparency and Performance. Appl. Sci. 2025, 15, 7544. [Google Scholar] [CrossRef]
  34. Lawsen, A.; Opus, C. The Illusion of the Illusion of Thinking: A Comment on Shojaee et al. (2025). arXiv 2025, arXiv:2506.09250. [Google Scholar] [CrossRef]
  35. Dellibarda Varela, I.; Romero-Sorozabal, P.; Rocon, E.; Cebrian, M. Rethinking the Illusion of Thinking. arXiv 2025, arXiv:2507.01231. [Google Scholar] [CrossRef]
  36. Kahneman, D.; Beatty, J. Pupil Diameter and Load on Memory. Science 1966, 154, 1583–1585. [Google Scholar] [CrossRef] [PubMed]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
