ChatMicroscopy: A Perspective Review of Large Language Models for Next-Generation Optical Microscopy

Sancataldo, Giuseppe

doi:10.3390/app16052502

Open Access Review

by

Giuseppe Sancataldo

Department of Physics and Chemistry—Emilio Segrè, University of Palermo, Viale delle Scienze, 18, 90128 Palermo, Italy

Appl. Sci. 2026, 16(5), 2502; https://doi.org/10.3390/app16052502

Submission received: 19 January 2026 / Revised: 26 February 2026 / Accepted: 4 March 2026 / Published: 5 March 2026

(This article belongs to the Special Issue Biomedical Optics and Imaging: Latest Advances and Prospects)

Featured Application

This perspective review outlines how Large Language Models (LLMs) can be deployed as intelligent interfaces and orchestration layers for advanced optical microscopy platforms. A representative future application is the development of conversational LLM-driven microscope assistants capable of translating high-level experimental goals, such as optimizing live-cell imaging conditions or autonomously exploring heterogeneous samples, into validated acquisition workflows. By integrating instrument control, real-time analysis, and facility-level data management, such systems have the potential to lower barriers to advanced microscopy, improve reproducibility, and enable adaptive, closed-loop experiments in both research laboratories and shared imaging facilities.

Abstract

Optical microscopy is a fundamental tool in the physical, chemical, and life sciences, enabling direct investigation of structure, dynamics, and function across multiple spatial and temporal scales. Advances in optical design, detectors, and computational techniques have greatly enhanced performance, but have also increased the complexity of modern microscopes, which are now software-driven and embedded in data-intensive workflows. Artificial intelligence has become an important component of this landscape, particularly through task-specific machine learning approaches for image analysis, optimization, and limited instrument control. While effective, these solutions are often fragmented and lack the ability to integrate experimental intent, contextual knowledge, and multi-step reasoning. Recent progress in large language models (LLMs) offers a new paradigm for intelligent microscopy. As foundation models trained on large-scale text and code, LLMs exhibit emergent capabilities in reasoning, abstraction, and tool coordination, allowing them to act as natural interfaces between users and complex experimental systems. This perspective highlights how LLMs can function as cognitive and orchestration layers that connect experiment design, instrument control, data analysis, and knowledge integration. Emerging applications include conversational microscope control, workflow supervision, and scientific assistance for data exploration and hypothesis generation, alongside important technical, ethical, and governance challenges.

Keywords:

advanced optical microscopy; artificial intelligence; large language model; AI-driven instrument

1. Introduction

Optical microscopy has undergone a continuous technological evolution over the past several decades, establishing itself as one of the most fundamental and widely used research tools across the physical, chemical, and life sciences [1,2]. Its enduring impact stems from a unique ability to provide direct, non-invasive access to structure, dynamics, and function across a wide range of spatial and temporal scales [3]. As a result, microscopy underpins a broad spectrum of applications, ranging from live-cell and developmental biology to materials characterization, nanofabrication, soft-matter physics, and chemical dynamics, serving as a unifying observational framework for both hypothesis-driven and exploratory research [4,5,6,7]. Sustaining this central role has required continuous research and technological development. Advances in optical design, contrast mechanisms, detectors, and computational methods have repeatedly expanded what can be observed and quantified at the microscale. Early microscopy platforms relied almost entirely on manual operation, demanding substantial expertise to align optical components, adjust illumination conditions, and optimize acquisition parameters. The progressive digitization of microscopy, driven by sensitive detectors, motorized stages, and computer-controlled hardware, enabled the automation of routine tasks and facilitated the acquisition of increasingly complex datasets, including three-dimensional, time-resolved, spectral, and multimodal measurements [8,9,10]. Profound innovations in optical architecture and sample preparation have considerably broadened the bounds of microscopy. 
Techniques such as super-resolution imaging [11], which overcomes the diffraction limit, correlation- and fluorescence lifetime-based measurements [12,13], which provide access to molecular dynamics, interactions, and local nano-environmental properties, and light-sheet and lattice light-sheet microscopy [14], which enable fast, gentle, and high-contrast volumetric imaging, have opened access to spatial and temporal regimes that were previously inaccessible with conventional optical microscopy. These developments have enabled systematic investigations of molecular organization, cellular dynamics, and mesoscale processes in living and complex systems. At the same time, they have substantially increased experimental complexity. Modern microscopes now integrate multiple illumination modalities, detection pathways, adaptive optical elements, and computational pipelines, often exposing hundreds of configurable parameters. Optimal imaging conditions, therefore, emerge from delicate, context-dependent trade-offs among resolution, acquisition speed, phototoxicity, photobleaching, and information content, making experimental optimization increasingly nontrivial and expertise-intensive. Despite growing levels of automation, microscopy remains fundamentally guided by human reasoning, as experimental design, hypothesis formulation, and interpretation require contextual understanding that extends beyond numerical optimization.

Alongside these developments in microscopy, the past decade has been characterized by rapid and sustained growth in artificial intelligence (AI) [15,16]. Progress in machine learning (ML), deep learning (DL), and, more recently, large-scale foundation models has progressively expanded the scope of what can be automated, optimized, and assisted across complex scientific workflows [17]. In particular, large language models (LLMs) have demonstrated unprecedented capabilities in natural language understanding, reasoning, code generation, and tool orchestration [18]. LLMs have rapidly entered everyday life, where they are widely used both in professional settings to support productivity, communication, and decision-making, and in leisure contexts for creativity, entertainment, and interactive experiences [19]. By enabling flexible interaction with complex software environments and translating high-level human intent into executable actions, these models have shifted artificial intelligence from narrow, task-specific solutions toward more general-purpose, context-aware systems capable of integrating knowledge, reasoning, and decision-making [20].

To date, ML- and DL-based approaches have been widely adopted for microscopy image analysis tasks such as denoising, segmentation, object detection, tracking, and super-resolution reconstruction, directly addressing the challenges posed by large, high-dimensional datasets [21]. More recently, these methods have moved beyond post-processing to selected control and optimization tasks, including autofocus, adaptive optics, and adaptive sampling, enabling partial automation of operations that previously relied heavily on expert intervention [22,23]. Despite these advances, current AI-driven microscopy systems remain largely task-specific and fragmented, with limited integration across data acquisition, analysis, and experimental decision-making. In this context, LLMs offer a complementary and transformative approach by providing a higher-level cognitive layer capable of integrating experimental intent, contextual knowledge, and multi-step reasoning. Rather than replacing existing ML components, LLMs can orchestrate heterogeneous algorithms, coordinate instrument control and data analysis, and serve as natural language interfaces between users and complex microscopy platforms, thereby enabling more coherent, adaptive, and human-centered experimental workflows.

In this rapidly evolving landscape, the convergence of optical microscopy and large language models points toward a new conceptual framework for how imaging experiments may be conceived and conducted in the future. In this perspective Review, ChatMicroscopy is introduced as a conceptual paradigm in which optical microscopy platforms are augmented by large language models acting as a cognitive interface between users, instruments, and data (Figure 1). To clarify the maturity of the approaches discussed in this Review, three tiers of evidence are distinguished. The first tier comprises validated machine learning and deep learning applications for microscopy image analysis and selected instrument optimization tasks, which are experimentally demonstrated and widely adopted. The second comprises early proof-of-concept implementations of large language models applied to instrument control or experiment design, currently limited to prototype systems and laboratory-scale demonstrations. The third comprises forward-looking, conceptual scenarios involving facility-level orchestration, FAIR-integrated infrastructures, and fully closed-loop experimental ecosystems, which remain prospective and require substantial further validation.

2. State of the Art: Current AI-Driven Approaches in Optical Microscopy

Artificial intelligence is emerging as a central enabling technology in optical microscopy, driven by the rapid growth in data volume, dimensionality, and experimental complexity that characterizes modern imaging workflows. Machine learning, and deep learning in particular, enables the extraction of meaningful statistical and structural information from complex, high-dimensional imaging data, addressing long-standing challenges such as low signal-to-noise ratio (SNR), diffraction-limited resolution, optical aberrations, and limited photon budgets [24,25,26]. Convolutional neural networks (CNNs) form the backbone of most AI-based microscopy applications, owing to their ability to learn hierarchical spatial features and exploit local correlations intrinsic to optical images. CNN-based architectures are widely adopted for image restoration, classification, and reconstruction tasks across fluorescence, phase-contrast, and brightfield microscopy modalities [27]. Encoder–decoder architectures such as U-Net further enhance performance by combining contextual and spatial information through skip connections, enabling accurate segmentation and pixel-wise reconstruction even in low-contrast or noisy imaging conditions [28,29]. Residual and attention-based architectures, including ResNet, ResU-Net, and RCAN, improve training stability and selectively emphasize informative features, resulting in improved denoising and resolution enhancement [30,31]. Beyond supervised learning, self-supervised and unsupervised strategies play an increasingly important role in optical microscopy by reducing reliance on large, manually annotated datasets, which are often difficult or impractical to obtain.
Content-aware restoration (CARE) and self-supervised approaches such as Noise2Noise and Noise2Void enable effective restoration of low-photon and low-SNR images, the latter two without requiring clean ground-truth targets. This allows imaging at lower light doses, which mitigates phototoxicity and photobleaching while preserving quantitative fidelity [32,33]. Generative models further expand the methodological landscape. Generative adversarial networks (GANs) have been extensively applied to super-resolution, deblurring, phase retrieval, and modality transfer tasks, including virtual staining and label-free-to-fluorescence image translation [34,35,36]. In parallel, hybrid architectures that combine CNNs with transformer-based self-attention mechanisms enable the modeling of long-range spatial and temporal dependencies, thereby improving reconstruction fidelity and robustness in volumetric and time-lapse microscopy data [37,38,39]. More recently, machine learning has progressed beyond post-acquisition analysis to directly influence image formation and microscope operation. Learned autofocus metrics and ML-driven adaptive optics strategies have been reported to outperform traditional contrast-based or iterative model-based approaches under low-signal and aberrated conditions, particularly in deep-tissue and multiphoton fluorescence microscopy [40]. Reinforcement learning and active learning frameworks further enable adaptive acquisition and closed-loop optimization of imaging parameters, illumination strategies, and sampling density, maximizing information content while minimizing phototoxicity and acquisition time in volumetric and super-resolution microscopy [41,42]. Together, these approaches constitute a coherent computational toolbox that complements optical hardware and enables more efficient, quantitative, and intelligent microscopy workflows.
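
As a point of reference for the traditional contrast-based autofocus baseline mentioned above, the following minimal sketch scores image sharpness with the variance of a discrete Laplacian and selects the sharpest plane of a focal stack. It is illustrative only: the function names, the list-of-lists image representation, and the toy images are assumptions made for this example and are not taken from any cited work. A learned autofocus metric would replace `laplacian_variance` with a trained model.

```python
from statistics import pvariance


def laplacian_variance(image):
    """Sharpness score: variance of the 4-neighbour Laplacian over interior pixels."""
    rows, cols = len(image), len(image[0])
    responses = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            lap = (image[r - 1][c] + image[r + 1][c]
                   + image[r][c - 1] + image[r][c + 1]
                   - 4 * image[r][c])
            responses.append(lap)
    return pvariance(responses)


def best_focus_index(stack):
    """Return the index of the plane with the highest sharpness score."""
    scores = [laplacian_variance(plane) for plane in stack]
    return max(range(len(stack)), key=scores.__getitem__)


# Toy focal stack (hypothetical data): a featureless plane, a defocused spot,
# and an in-focus point-like spot.
flat = [[5] * 5 for _ in range(5)]
blurred = [[0, 0, 0, 0, 0],
           [0, 10, 10, 10, 0],
           [0, 10, 10, 10, 0],
           [0, 10, 10, 10, 0],
           [0, 0, 0, 0, 0]]
sharp = [[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 90, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]
focal_stack = [flat, blurred, sharp]
```

Calling `best_focus_index(focal_stack)` selects the plane containing the point-like spot, because energy concentrated in a single pixel produces the largest Laplacian response. Hand-crafted metrics of this kind degrade when noise dominates the Laplacian, which is the regime where learned metrics are reported to help.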

Despite substantial progress, current AI-driven methods remain largely task-specific and weakly integrated across acquisition, analysis, and experimental decision-making. This fragmentation motivates the exploration of more general-purpose AI paradigms capable of coordinating heterogeneous tools, incorporating prior knowledge, and supporting adaptive, goal-directed microscopy [43]. The AI approaches currently adopted in optical microscopy, including convolutional neural networks, generative models, and reinforcement learning frameworks, are primarily designed to solve well-defined, task-specific problems. They operate at the level of numerical image representations and are optimized for objectives such as denoising, segmentation, reconstruction, or parameter tuning. Even in adaptive or closed-loop settings, their scope typically remains confined to localized optimization within predefined state and reward structures.

From a broader system-level perspective, two alternative trajectories can be envisioned for AI-enabled microscopy. One consists of progressively expanding collections of specialized tools, in which deep learning, reinforcement learning, and multimodal models address individual tasks within the imaging pipeline. The other envisions fully autonomous platforms capable of planning and executing experiments with minimal human intervention, following the paradigm of self-driving laboratories. ChatMicroscopy is conceptually positioned as an intermediate pathway between these two extremes. Rather than functioning as a set of isolated computational modules or as a fully autonomous experimental controller, it is proposed as a collaborative and conversational layer that supports human researchers. By leveraging the semantic and reasoning capabilities of large language models, such systems can interpret experimental intent expressed in natural language, propose strategies, retrieve relevant data and literature, coordinate existing AI components, and assist in optimizing experimental workflows. In this sense, LLMs are not introduced as replacements for established AI methods or for human expertise, but as integrative agents that enhance contextual reasoning, accessibility, and coherence across complex microscopy processes.

3. A Focus on Large Language Models (LLMs)

Large language models (LLMs) are a class of artificial intelligence systems based on deep neural network architectures, most commonly transformer models, trained on large-scale corpora of text and code using self-supervised learning objectives [44,45,46]. This training paradigm leverages statistical regularities in language to learn high-dimensional representations that encode syntactic structure, semantic relationships, and long-range contextual dependencies [47]. As a consequence, LLMs exhibit emergent capabilities that extend beyond text generation, including multi-step reasoning, abstraction, summarization, translation across representational domains, and the generation of executable code. A key aspect that distinguishes large language models (LLMs) from earlier machine learning approaches is their strong scale dependence. Empirical evidence has shown that systematic increases in model size, training data volume and diversity, and available computational resources lead not only to incremental improvements, but also to qualitative gains in performance [48]. These gains include the emergence of advanced capabilities such as abstraction, reasoning, and robust generalization to tasks that were not explicitly encountered during training. This characteristic scaling behavior underlies the concept of LLMs as foundation models. Rather than being optimized for a single, narrowly defined application, LLMs provide a general-purpose representational substrate that captures broad patterns in language, code, and domain knowledge [49]. This substrate can be readily adapted and steered through prompting or enhanced by integrating external tools and domain-specific models. In this way, it can support a wide range of downstream tasks with little to no additional training or fine-tuning, reducing both development time and computational costs [50,51]. 
As a result, LLMs can be deployed across heterogeneous domains and applications while maintaining a coherent internal representation of knowledge, context, and user intent, making them particularly well suited to complex, multi-stage scientific workflows. From an architectural perspective, transformer-based attention mechanisms enable LLMs to flexibly integrate information across long sequences, which supports tasks that require contextual reasoning, dependency tracking, and the integration of heterogeneous information sources. When coupled with external tools, such as databases, simulation engines, or hardware control interfaces, LLMs can act as coordinating agents that select actions, invoke tools, and reason over intermediate results [52]. This capability places LLMs at the intersection of symbolic reasoning and data-driven learning, helping to bridge the long-standing divide between statistical pattern recognition and higher-level cognitive functions.
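
The tool-coordination pattern described above can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not a description of any published system: the language model is replaced by a stub (`stub_llm_plan`) that returns a fixed JSON plan, and the tool names `set_exposure` and `acquire_image` are hypothetical placeholders for real instrument or analysis functions.

```python
import json


# Hypothetical tools; a real system would wrap instrument or analysis APIs here.
def set_exposure(ms):
    return {"exposure_ms": ms}


def acquire_image(channel):
    return {"channel": channel, "status": "acquired"}


TOOLS = {"set_exposure": set_exposure, "acquire_image": acquire_image}


def stub_llm_plan(goal):
    """Stand-in for a language model: returns a fixed JSON plan for illustration."""
    return json.dumps([
        {"tool": "set_exposure", "args": {"ms": 50}},
        {"tool": "acquire_image", "args": {"channel": "GFP"}},
    ])


def run_plan(plan_json, tools):
    """Execute each planned step, refusing any tool that is not registered."""
    results = []
    for step in json.loads(plan_json):
        name = step["tool"]
        if name not in tools:
            raise KeyError(f"unknown tool: {name}")
        results.append(tools[name](**step["args"]))
    return results


results = run_plan(stub_llm_plan("image GFP with a low light dose"), TOOLS)
```

Restricting execution to an explicit registry means the model can only request actions that the system designer has exposed, and the intermediate `results` list can be fed back to the model for the next reasoning step.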

In scientific contexts, these properties allow LLMs to operate not only as pattern-recognition tools but also as mediators between human intent and technical systems. Experimental goals, constraints, hypotheses, and protocols can be expressed in natural language, enabling LLMs to reason over scientific intent rather than purely numerical inputs. By translating high-level conceptual descriptions into structured representations and executable actions, LLMs support a new mode of human–machine interaction that is iterative, contextual, and adaptive [53]. In the context of advanced optical microscopy, this shift has important implications. As microscope platforms grow increasingly complex and software-defined, LLMs can serve as cognitive interface layers that connect experimental reasoning with instrument control, data analysis pipelines, and workflow orchestration [54]. Instead of replacing existing machine learning methods, LLMs complement them by acting as an integrative layer that coordinates heterogeneous tools, incorporates prior knowledge, and preserves coherence throughout multi-step experimental workflows. This broader conceptual role distinguishes LLMs from task-specific AI components and supports their recognition as enabling technologies for next-generation intelligent microscopy systems. Within this framework, ChatMicroscopy is conceived as a functional paradigm in which large language models act as an interpretative and coordinating layer within the microscopy ecosystem. Rather than referring to a specific instrument, it designates a mode of interaction where natural language becomes the primary interface for expressing experimental intent, constraints, and objectives. Through this interface, LLMs can support tasks such as experimental planning, imaging modality selection, parameter optimization, workflow orchestration, and contextual interpretation of results. 
By maintaining a persistent representation of experimental context, ChatMicroscopy may enable iterative, dialog-driven refinement of imaging strategies and adaptive responses to outcomes, while reasoning transparently about trade-offs among resolution, speed, phototoxicity, and information content. In this way, it serves as a bridge between human scientific reasoning and the increasingly software-defined, data-intensive nature of modern optical microscopy.

4. LLMs as Interfaces for Conversational Microscope Control and Experiment Design

To date, demonstrated applications of large language models in optical microscopy remain limited to proof-of-concept and early-stage implementations. Within this emerging landscape, one of the most promising directions is their use as natural-language interfaces for instrument control and experiment design [55,56]. Modern microscopes increasingly expose their functionality through application programming interfaces (APIs), enabling programmatic access to hardware components, acquisition routines, and analysis pipelines, as exemplified by open microscopy frameworks and control layers [57]. However, these interfaces remain fragmented, weakly standardized across vendors, and largely accessible only to expert users with substantial programming experience. As a result, a persistent gap exists between high-level experimental intent and low-level instrument configuration. LLMs can help bridge this gap by translating high-level experimental descriptions expressed in natural language into executable code or structured commands for instrument control. This paradigm aligns with recent developments in autonomous and self-driving laboratory systems in chemistry and materials science, where AI models combined with robotic platforms have demonstrated goal-driven orchestration and adaptive experimentation [58,59]. These studies provide important conceptual precedents; however, they were conducted in chemical and material synthesis contexts, and their direct extension to optical microscopy platforms remains to be experimentally demonstrated. In materials science, LLM-driven autonomous agents have demonstrated the ability to control microscopes, perform image segmentation with vision foundation models, and reason about experimental outcomes without task-specific training, while closely related approaches are emerging in scanning probe microscopy, where such agents orchestrate multi-step experiments and coordinate integrated acquisition and analysis pipelines [60,61,62]. 
Reported examples include conversational interfaces that generate microscope control scripts on demand, LLM-guided selection of imaging modalities based on sample properties and experimental constraints, and adaptive acquisition strategies in which parameters are iteratively refined based on intermediate results [63]. In atomic force microscopy (AFM), LLM agents for experiment automation have been evaluated using dedicated metrics that systematically assess their performance, reliability, and robustness [64]. In autonomous or semi-autonomous settings, LLMs can coordinate multiple stages of an experiment, including calibration, acquisition, quality assessment, and downstream analysis, while maintaining a human-readable representation of experimental logic and decision pathways. Natural language-based control paradigms can also enhance accessibility, training, and reproducibility. By lowering the barrier to complex microscope operation, LLMs may enable non-experts to conduct sophisticated experiments and allow experts to rapidly prototype advanced acquisition strategies. In addition, recording experimental intent and decisions in text form improves transparency, reproducibility, and knowledge transfer, in line with FAIR data principles [65]. However, deploying LLMs as control interfaces introduces critical challenges. Current models may lack detailed domain-specific knowledge and can generate incorrect, ambiguous, or unsafe commands if unconstrained, particularly in safety-critical experimental contexts. Robust implementations therefore require structured prompting, formal representations of instrument capabilities, sandboxed execution environments, and validation layers that enforce physical, optical, and experimental constraints. Human-in-the-loop designs remain essential to ensure safety, reliability, and scientific accountability as conversational control paradigms continue to mature.
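
A validation layer of the kind described above can be as simple as a range check against a formal description of instrument capabilities. The following sketch is hypothetical: the parameter names, limits, and units are invented for illustration and do not describe any real microscope. It shows how an LLM-proposed command could be screened before execution and flagged for human review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterLimit:
    minimum: float
    maximum: float
    unit: str


# Hypothetical capability description; real limits would come from the instrument.
INSTRUMENT_LIMITS = {
    "laser_power_percent": ParameterLimit(0.0, 20.0, "%"),
    "exposure_ms": ParameterLimit(1.0, 500.0, "ms"),
    "z_step_um": ParameterLimit(0.1, 10.0, "um"),
}


def validate_command(command, limits):
    """Return a list of human-readable violations; an empty list means the command passed."""
    violations = []
    for name, value in command.items():
        limit = limits.get(name)
        if limit is None:
            violations.append(f"{name}: not a known parameter")
        elif isinstance(value, bool) or not isinstance(value, (int, float)):
            violations.append(f"{name}: expected a number, got {type(value).__name__}")
        elif not limit.minimum <= value <= limit.maximum:
            violations.append(
                f"{name}={value} {limit.unit} outside "
                f"[{limit.minimum}, {limit.maximum}] {limit.unit}"
            )
    return violations


def requires_human_review(command, limits):
    """Any violation routes the command to a human operator instead of the hardware."""
    return bool(validate_command(command, limits))


# Example: one plausible and one unsafe proposal (both hypothetical).
safe_command = {"laser_power_percent": 5.0, "exposure_ms": 100}
unsafe_command = {"laser_power_percent": 80.0, "shutter_override": 1}
```

Here `validate_command(safe_command, INSTRUMENT_LIMITS)` returns an empty list, whereas the unsafe proposal yields two violations (an out-of-range laser power and an unknown parameter). Keeping the check independent of the model means it still holds when the model output is wrong.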

5. Organization and Scientific Interaction of Data, Images, and Knowledge

Microscopy experiments generate rich, heterogeneous datasets that extend far beyond raw images, encompassing metadata, experimental parameters, annotations, analysis outputs, and contextual knowledge derived from protocols and prior studies. As imaging modalities become increasingly multimodal and high-throughput, extracting meaning from these complex data ecosystems poses a major challenge. LLMs can interact with image-derived features and representations produced by computer vision and foundation models, enabling higher-level reasoning over combined visual, numerical, and textual information [66]. Recent work on vision–language models in microscopy has shown that zero-shot or weakly supervised tasks, such as image classification, segmentation guidance, and visual question answering, are increasingly feasible across biological and materials datasets [67]. While pure LLMs operate only on text and cannot directly process raw images, vision–language models combine visual encoders with language components to enable multimodal reasoning. Within this ecosystem, LLMs are best understood as a high-level coordination layer: they support workflow planning, code generation, experimental interpretation, and integration of heterogeneous analytical tools rather than replacing dedicated vision models. When combined with vision components, they enable interactive data exploration, allowing users to query image collections in natural language, obtain explanations of detected structures, and guide downstream analyses without extensive scripting.
For example, LLM-assisted interfaces can be used to navigate large microscopy datasets, retrieve representative images based on semantic descriptions, and suggest candidate regions of interest for further inspection or quantitative analysis. Beyond direct image analysis, LLMs can function as knowledge organization and integration layers. By linking microscopy data to experimental metadata, laboratory notebooks, protocols, and relevant literature, LLM-based systems can facilitate semantic annotation, automated report generation, and contextualization of results within broader scientific frameworks. Early case studies in analytical chemistry and spectroscopy demonstrate how LLMs can assist in interpreting complex spectra, suggesting experimental adjustments, and managing knowledge across experiments [68]. Analogous approaches in microscopy could enable automated cross-referencing of imaging results with prior experiments, theoretical models, or external databases, supporting cumulative and reproducible research practices. LLMs also open new possibilities for hypothesis generation and exploratory reasoning in microscopy-driven studies. By synthesizing information across datasets, experimental conditions, and prior knowledge, LLM-based assistants can propose testable hypotheses, highlight unexpected correlations, or suggest follow-up experiments. In collaborative settings, such systems may act as scientific companions that support iterative sense-making, bridging quantitative image analysis with qualitative interpretation. The integration of LLMs with bioimaging foundation models therefore represents a key step toward multimodal scientific intelligence, enabling microscopy platforms that not only acquire and process data, but also actively support reasoning, discovery, and knowledge creation.

6. Management of Complex Microscopy Workflows: From Single Instruments to Integrated Facilities

As microscopy infrastructures continue to evolve from single-instrument laboratories toward shared facilities and highly integrated imaging platforms, the management of experimental workflows has become a critical scientific, technical, and organizational challenge. Modern imaging facilities routinely operate heterogeneous collections of instruments spanning widefield, confocal, super-resolution, light-sheet, and multimodal microscopy, while supporting diverse user communities ranging from expert microscopists to occasional users [69,70,71]. These environments generate massive, high-dimensional datasets whose value depends not only on acquisition quality, but also on coordinated analysis pipelines, standardized metadata capture, long-term storage, and reproducible data access [72,73]. As a result, facility-scale microscopy increasingly resembles a distributed physical system rather than a collection of independent instruments. Within this landscape, automation and intelligent coordination across instruments, enabled by conversational LLMs, can function as a unifying layer connecting users, experimental platforms, and data streams (Figure 2). By mediating interactions between humans and complex instrumentation, conversational LLMs can simplify experimental design, orchestrate adaptive workflows in real time, and support data-driven decision-making. This, in turn, enhances efficiency, reproducibility, and insight generation across the entire experimental pipeline. Large language models are particularly well-suited to this role, as they operate at a semantic level capable of integrating experimental intent, procedural knowledge, and contextual information. Rather than controlling individual low-level tasks, an LLM can act as a supervisory layer that supports experiment planning, scheduling of shared resources, coordination of multimodal and multi-instrument acquisitions, and continuous tracking of metadata and provenance. 
Recent demonstrations of goal-driven orchestration and autonomous laboratory coordination have been reported primarily in chemistry and materials science contexts [74,75]. In these autonomous and self-driving laboratory systems, AI models combined with domain-specific controllers and perception models enable zero-shot or few-shot execution of complex experimental workflows [76,77,78]. Although microscopy-specific demonstrations remain limited, the underlying concepts of goal-driven orchestration, adaptive decision-making, and tool integration are, in principle, transferable to microscopy-intensive facilities.

A key enabling factor for facility-level orchestration is integration with Laboratory Information Management Systems (LIMS) and scientific data infrastructures. By interfacing with structured repositories of samples, protocols, instrument logs, calibration records, and historical experiments, LLM-driven systems can support experiment reuse, automated documentation, provenance tracking, and compliance with FAIR data principles [65]. In microscopy, standardized data formats and metadata models further facilitate interoperability across instruments and analysis pipelines, providing a substrate on which LLMs can reason about datasets and workflows at scale [79,80]. At the facility level, these capabilities enable new modes of operation, including automated experiment logging, intelligent user support, adaptive scheduling based on experimental priorities, and continuous optimization of resource utilization [81]. They also open the possibility of longitudinal knowledge accumulation, where insights from prior experiments inform future acquisitions and analysis strategies. LLM integration with FAIR-compliant microscopy data management systems (OMERO, LIMS) remains a promising future research direction. However, deploying LLMs in shared research infrastructures raises important challenges related to robustness, accountability, transparency, and user trust. Decisions affecting experimental outcomes, data integrity, or instrument access must remain interpretable and auditable, particularly in multi-user environments. These considerations strongly motivate human-in-the-loop designs, in which LLMs augment expert oversight rather than replace it, ensuring that automation enhances efficiency and reproducibility while preserving scientific responsibility and user agency [82].
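
Provenance tracking and auditability, as discussed above, can be illustrated with a small record structure. The sketch below is an assumption-laden example rather than an OMERO or LIMS interface: the field names, the model identifier, and the operator name are hypothetical. Each record stores the user request, the proposed settings, the model version, and the approving operator, together with a SHA-256 digest so that later modification of the entry can be detected.

```python
import hashlib
import json


def _canonical(body):
    """Serialize with sorted keys so the same content always yields the same bytes."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def make_provenance_record(user_request, proposed_settings, model_id, approved_by, timestamp):
    """Build an audit-log entry for one LLM-mediated decision (field names are illustrative)."""
    record = {
        "user_request": user_request,
        "proposed_settings": proposed_settings,
        "model_id": model_id,
        "approved_by": approved_by,
        "timestamp": timestamp,
    }
    record["sha256"] = hashlib.sha256(_canonical(record)).hexdigest()
    return record


def verify_record(record):
    """Recompute the digest over every field except the stored digest itself."""
    body = {key: value for key, value in record.items() if key != "sha256"}
    return hashlib.sha256(_canonical(body)).hexdigest() == record["sha256"]


# Hypothetical log entry.
entry = make_provenance_record(
    user_request="Time-lapse of mitochondria, minimise photobleaching",
    proposed_settings={"exposure_ms": 40, "interval_s": 30},
    model_id="example-llm-2026-01",
    approved_by="facility_operator_01",
    timestamp="2026-03-05T10:15:00Z",
)
```

`verify_record(entry)` returns `True` for the untouched record and `False` if any field is altered afterwards. Storing the approving operator alongside the model identifier keeps the human decision point visible in the audit trail.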

7. Challenges and Ethical Considerations

The integration of large language models into microscopy presents several technical, methodological, and ethical challenges that must be carefully addressed. Hallucinations, brittle reasoning, and limited robustness remain major concerns, particularly in experimental settings where incorrect decisions can lead to wasted resources, damaged samples, or misleading scientific conclusions [83,84,85,86]. Strong performance in scientific question answering or code generation does not necessarily translate into reliable operation in real laboratory environments. A central challenge, therefore, lies in validation and interpretability: model outputs must be systematically assessed against experimental ground truth, and their reasoning processes must be transparent enough to allow human experts to evaluate assumptions, detect potential biases, and identify conditions under which the model’s recommendations may break down. Without robust validation frameworks and interpretable decision pathways, high-performing models risk remaining powerful yet fragile tools, unsuitable for autonomous or safety-critical laboratory applications. While LLMs are capable of generating fluent explanations, these explanations do not always reflect the internal decision-making processes of the model. In microscopy, where experimental choices often involve subtle trade-offs and safety constraints, it is essential to develop validation frameworks that go beyond output plausibility and explicitly test correctness, consistency, and physical feasibility. Hybrid human–AI control paradigms, in which LLMs provide recommendations or orchestrate workflows under human supervision, represent a pragmatic intermediate step toward greater autonomy.

Large language models are intrinsically probabilistic systems whose outputs reflect learned statistical patterns rather than deterministic physical laws. This fundamental characteristic introduces a structural tension with the epistemic foundations of experimental science, which rely on reproducibility, falsifiability, and physically grounded validation. Human-in-the-loop designs mitigate, but do not fully eliminate, this tension. For LLM-supported microscopy to remain scientifically rigorous, probabilistic suggestions must be embedded within formal validation layers, explicit physical and instrumental constraints, traceable logging mechanisms, and reproducible execution pipelines. In this framework, LLM outputs can be treated initially as hypothesis-generating or strategy-suggesting proposals rather than authoritative decisions. Scientific rigor is therefore preserved not by removing probabilistic reasoning, but by ensuring that all AI-mediated recommendations are subject to transparent verification, experimental confirmation, and accountable human oversight.

In this context, the integration of LLMs into microscopy workflows requires systematic validation to ensure scientific reliability. Because LLM-based systems may influence experiment design, instrument configuration, and data interpretation, benchmarking must operate at multiple levels: (i) correctness of LLM-generated workflows compared with expert-defined protocols, (ii) quantitative evaluation of resulting image quality, and (iii) robustness of intermediate reasoning steps. Reproducibility protocols must account for the stochastic nature of LLMs: run-to-run variability can be reduced through temperature-zero inference, random seed control, model and version locking, and fixed prompt templates. Safety validation is equally critical when LLMs generate instrument commands, requiring independent constraint layers for parameter range checking, sequence validation, collision avoidance, and sandboxed execution. Domain shift across microscope vendors, laboratories, and protocols further necessitates cross-site validation. Meaningful baselines should include comparisons against human expert performance and traditional automation pipelines. Emerging multimodal benchmarks such as MicroVQA [87], MicroVQA++ [88], and Micro-Bench [89] provide structured evaluation environments for vision–language reasoning, although dedicated benchmarks for hardware-level LLM orchestration remain to be developed.
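As an illustration of the independent constraint layer mentioned above, the following minimal Python sketch checks an LLM-proposed command list for parameter ranges and command ordering before anything reaches the hardware. All names and numbers here (`LIMITS`, `REQUIRES`, `Command`, the parameter names and their bounds) are hypothetical placeholders chosen for this example; real limits would have to come from the instrument vendor and a facility-specific safety assessment, and collision avoidance and sandboxed execution are not covered.

```python
from dataclasses import dataclass, field

# Hypothetical limits for illustration only (not taken from any real instrument).
LIMITS = {
    "laser_power_mw": (0.0, 5.0),
    "exposure_ms": (1.0, 500.0),
    "stage_z_um": (0.0, 200.0),
}

# Hypothetical ordering rule: an action may require an earlier action.
REQUIRES = {"acquire": "open_shutter"}


@dataclass(frozen=True)
class Command:
    action: str
    params: dict = field(default_factory=dict)


def validate(commands):
    """Return a list of violations; an empty list means the plan passed."""
    errors = []
    seen_actions = set()
    for i, cmd in enumerate(commands):
        # Parameter range checking: reject unknown names and out-of-range values.
        for name, value in cmd.params.items():
            if name not in LIMITS:
                errors.append(f"step {i}: unknown parameter '{name}'")
                continue
            low, high = LIMITS[name]
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not (low <= value <= high):
                errors.append(
                    f"step {i}: {name}={value!r} outside [{low}, {high}]"
                )
        # Sequence validation: the prerequisite action must have occurred earlier.
        prerequisite = REQUIRES.get(cmd.action)
        if prerequisite is not None and prerequisite not in seen_actions:
            errors.append(
                f"step {i}: '{cmd.action}' requires an earlier '{prerequisite}'"
            )
        seen_actions.add(cmd.action)
    return errors


if __name__ == "__main__":
    plan = [
        Command("set", {"laser_power_mw": 2.0}),
        Command("open_shutter"),
        Command("acquire", {"exposure_ms": 50}),
    ]
    print(validate(plan) or "plan accepted")
```

Because the checker is a plain function that is separate from the language model, a plan that fails validation can be returned to the user or to the model for revision without ever being executed.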

It is important to recognize that, in microscopy-specific settings, hallucination risks may translate into concrete experimental and safety concerns. LLMs can generate non-existent protocols or fabricated references, suggest physically invalid or unsafe parameter combinations (e.g., laser powers exceeding photodamage thresholds), conflate imaging modalities, or misuse domain-specific terminology such as “resolution” in its optical versus digital meaning. Such errors may compromise reproducibility, sample integrity, and experimental validity. Moreover, LLM outputs are sensitive to prompt formulation: equivalent experimental requests phrased differently may yield divergent acquisition strategies, posing a potential threat to reproducibility. Mitigation strategies include standardized prompt templates, few-shot domain exemplars, structured input schemas, and constrained generation limited to validated instrument capabilities and parameter ranges. Reproducible integration into microscopy workflows further requires explicit control of stochastic generation parameters (e.g., temperature), deterministic settings when appropriate, and comprehensive logging infrastructures. Model versioning, prompt archiving, parameter tracking, and decision provenance documentation are essential to ensure auditability.
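The logging requirements listed above (model versioning, prompt archiving, parameter tracking, and decision provenance) can be sketched as a small append-only audit log. This is a minimal sketch using only the Python standard library; the function names (`provenance_record`, `append_to_log`) and the record fields are assumptions made for illustration, not an established schema.

```python
import hashlib
import json
from datetime import datetime, timezone


def _sha256(text):
    """Hex digest used to detect later modification of archived text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def provenance_record(model, model_version, prompt, temperature, seed, output):
    """Build a JSON-serialisable audit entry for one LLM call (hypothetical schema)."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "model_version": model_version,
        "temperature": temperature,
        "seed": seed,
        "prompt": prompt,
        "prompt_sha256": _sha256(prompt),
        "output": output,
        "output_sha256": _sha256(output),
    }


def append_to_log(path, record):
    """Append one record per line (JSON Lines) so earlier entries are never rewritten."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
```

Storing the full prompt alongside its hash lets an auditor both re-run the request with the same settings and confirm that the archived text has not been altered since it was logged.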

Finally, in biomedical and clinical microscopy, data protection regulations (e.g., GDPR) impose strict constraints on the handling of patient-derived imaging data. The use of cloud-based LLM services may expose sensitive or proprietary information. Local or on-premise model deployment, federated learning strategies, and privacy-preserving techniques such as differential privacy represent viable mitigation pathways in regulated environments.

Ethical and governance considerations are particularly salient in biomedical and clinical microscopy. Issues of data privacy, intellectual property, and regulatory compliance are amplified when cloud-based or third-party LLMs are used to process sensitive imaging data or experimental metadata [90,91]. These concerns are driving growing interest in the local deployment of open-source models and the adoption of privacy-preserving training strategies within secure, institutionally controlled infrastructures. In parallel, clear guidelines will be required to define responsibility and accountability when AI-assisted systems influence experimental decisions or interpretations.

From a technological perspective, several research directions are likely to shape the future of LLM-enabled microscopy. These include the integration of domain-specific knowledge bases and physical constraints into LLM reasoning, the coupling of language models with multimodal perception systems capable of operating directly on microscopy data, and the development of agent-based architectures that can plan, execute, and revise multi-step experiments. Early demonstrations of LLM-agent-based autonomous microscopy frameworks suggest the feasibility of closed-loop experimentation in controlled research settings [92]. However, these systems remain in an early validation phase, and their robustness, scalability, and safety in routine microscopy practice have yet to be systematically established.

8. Future Perspectives

From a developmental perspective, the integration of large language models into advanced microscopy should be framed as a staged and progressive trajectory rather than as an immediately deployable solution. In the near term, achievable steps include LLM-assisted generation of control scripts for existing microscope APIs, human-in-the-loop conversational support for experiment planning, structured prompt templates to reduce ambiguity, and sandboxed validation layers that constrain parameter ranges according to physical and instrumental limits. These implementations can be layered onto current infrastructures without requiring full experimental autonomy. Mid-term developments may involve constraint-aware LLM agents capable of incorporating domain-specific rules and physical priors into their reasoning processes, closer integration with vision models to support adaptive acquisition strategies, standardized logging and audit mechanisms to ensure traceability, and supervised closed-loop refinement of workflows. Longer-term research directions include multi-instrument coordination at the facility scale, deep integration with FAIR-compliant data infrastructures and laboratory information systems, and the development of robust semi-autonomous experimental platforms operating under formal validation and accountability frameworks. Such a roadmap underscores the importance of incremental validation, transparency, and human oversight while progressively expanding the capabilities of LLM-supported microscopy systems.

In this scenario, the technological maturity of LLM-assisted microscopy must be realistically assessed. Targeted patent searches conducted in major international databases did not identify recent dedicated patents explicitly combining large language models with optical microscopy control or automation. This contrasts with the well-established patent landscape of conventional AI-driven microscopy, where numerous intellectual property filings exist for convolutional neural network-based autofocus, segmentation, and acquisition optimization. While the absence of patents does not invalidate the conceptual soundness of LLM-assisted approaches, it indicates that this field remains at an early stage of technological readiness, currently closer to exploratory research and proof-of-concept demonstrations than to standardized industrial deployment. At present, LLM-enabled microscopy should therefore be regarded as a pre-commercial and research-driven paradigm, whose translation into robust, validated, and vendor-integrated systems will require further interdisciplinary development and technological consolidation.

Looking forward, the long-term vision is not one of fully replacing human scientists, but of redefining human–machine collaboration in experimental research. LLMs may increasingly act as co-scientists that augment human creativity, support exploratory reasoning, and reduce the cognitive burden associated with complex instrumentation and data-intensive workflows. If developed responsibly, such systems have the potential to accelerate discovery, improve reproducibility, and broaden access to advanced microscopy techniques, ultimately reshaping how experiments are designed, executed, and interpreted across the sciences.

9. Conclusions

This Perspective Review outlines how large language models can serve as enabling technologies for a new generation of intelligent optical microscopy systems. By positioning LLMs as cognitive interface and orchestration layers, rather than as replacements for established machine learning methods or human expertise, their unique capacity to bridge experimental intent, instrument control, data analysis, and knowledge integration is highlighted. This shift moves artificial intelligence in microscopy beyond task-specific optimization toward systems capable of contextual reasoning and adaptive decision-making. The convergence of advanced microscopy platforms, foundation models, and agent-based AI architectures opens the door to experimental workflows that are increasingly adaptive, interpretable, and autonomous. At the same time, realizing this vision will require sustained interdisciplinary effort. Progress will depend not only on advances in model architectures and computational efficiency, but also on the development of robust validation strategies, transparent human–AI interaction paradigms, and shared standards for data, metadata, and instrument interfaces. By addressing these challenges alongside technological innovation, the microscopy community can harness the transformative potential of LLMs while preserving scientific rigor, safety, and reproducibility. Ultimately, the integration of language-based AI into microscopy promises not only more efficient experiments, but a qualitative shift in how scientists interact with complex instruments, reason about data, and explore the microscopic world.

Funding

This work was funded by the European Union—NextGenerationEU—MUR funds D.M. 737/2021—BrightEYES research project and University of Palermo, FFR Unipa 2024.

Data Availability Statement

Not applicable.

Acknowledgments

The author thanks the Molecular Biophysics and Nanotechnology Group of the University of Palermo for valuable discussions. During the preparation of this manuscript, the author used ChatGPT (version 5.2) for grammatical refinement of non-native English and for image optimization using engineered prompts. The author has critically reviewed and edited the generated content and takes full responsibility for the content of this publication.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Mertz, J. Introduction to Optical Microscopy; Cambridge University Press: Cambridge, UK, 2019.
2. Mondal, P.P.; Diaspro, A. Fundamentals of Fluorescence Microscopy: Exploring Life with Light; Springer: Dordrecht, The Netherlands, 2014; pp. 1–218.
3. Balasubramanian, H.; Hobson, C.M.; Chew, T.L.; Aaron, J.S. Imagining the Future of Optical Microscopy: Everything, Everywhere, All at Once. Commun. Biol. 2023, 6, 1096.
4. Hsieh, H.C.; Han, Q.; Brenes, D.; Bishop, K.W.; Wang, R.; Wang, Y.; Poudel, C.; Glaser, A.K.; Freedman, B.S.; Vaughan, J.C.; et al. Imaging 3D Cell Cultures with Optical Microscopy. Nat. Methods 2025, 22, 1167–1190.
5. Wang, Y.; Zhang, X.; Xu, J.; Sun, X.; Zhao, X.; Li, H.; Liu, Y.; Tian, J.; Hao, X.; Kong, X.; et al. The Development of Microscopic Imaging Technology and Its Application in Micro- and Nanotechnology. Front. Chem. 2022, 10, 931169.
6. Gupta, P.; Rai, N.; Verma, A.; Gautam, V. Microscopy Based Methods for Characterization, Drug Delivery, and Understanding the Dynamics of Nanoparticles. Med. Res. Rev. 2024, 44, 138–168.
7. Read, I.A. Modern Applications in Multiphoton Microscopy. In Proceedings of the Multiphoton Microscopy in the Biomedical Sciences XXV, San Francisco, CA, USA, 25–31 January 2025; p. 31.
8. Schmolze, D.B.; Standley, C.; Fogarty, K.E.; Fischer, A.H. Advances in Microscopy Techniques. Arch. Pathol. Lab. Med. 2011, 135, 255–263.
9. Suar, M.; Misra, N.; Bhavesh, N.S. Biomedical Imaging Instrumentation: Applications in Tissue, Cellular and Molecular Diagnostics; Academic Press: Cambridge, MA, USA, 2022.
10. Herman, B.; Lemasters, J.J. Optical Microscopy: Emerging Methods and Applications; Academic Press: Cambridge, MA, USA, 2012; p. 462.
11. Leung, B.O.; Chou, K.C. Review of Super-Resolution Fluorescence Microscopy for Biology. Appl. Spectrosc. 2011, 65, 967–980.
12. Datta, R.; Heaster, T.M.; Sharick, J.T.; Gillette, A.A.; Skala, M.C. Fluorescence Lifetime Imaging Microscopy: Fundamentals and Advances in Instrumentation, Analysis, and Applications. J. Biomed. Opt. 2020, 25, 071203.
13. Yu, L.; Lei, Y.; Ma, Y.; Liu, M.; Zheng, J.; Dan, D.; Gao, P. A Comprehensive Review of Fluorescence Correlation Spectroscopy. Front. Phys. 2021, 9, 644450.
14. Stelzer, E.H.K.; Strobl, F.; Chang, B.J.; Preusser, F.; Preibisch, S.; McDole, K.; Fiolka, R. Light Sheet Fluorescence Microscopy. Nat. Rev. Methods Primers 2021, 1, 73.
15. Andrews, J.C. The Future of AI and Emerging Trends. Int. Sci. J. Eng. Manag. 2025, 4, 1–7.
16. Tibrewala, A. Advancements in Artificial Intelligence: Breakthroughs, Challenges and the Road Ahead. In Proceedings of the 2025 IEEE Conference on Artificial Intelligence, CAI, Santa Clara, CA, USA, 5–7 May 2025; pp. 1394–1402.
17. Choi, R.Y.; Coyner, A.S.; Kalpathy-Cramer, J.; Chiang, M.F.; Peter Campbell, J. Introduction to Machine Learning, Neural Networks, and Deep Learning. Transl. Vis. Sci. Technol. 2020, 9, 14. Available online: https://tvst.arvojournals.org/article.aspx?articleid=2762344 (accessed on 1 March 2026).
18. Naveed, H.; Khan, A.U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; Mian, A. A Comprehensive Overview of Large Language Models. ACM Trans. Intell. Syst. Technol. 2025, 16, 106.
19. Civitarese, G.; Fiori, M.; Choudhary, P.; Bettini, C. Large Language Models Are Zero-Shot Recognizers for Activities of Daily Living. ACM Trans. Intell. Syst. Technol. 2025, 16, 78.
20. Huang, J.; Xu, Y.; Wang, Q.; Wang, Q.; Liang, X.; Wang, F.; Zhang, Z.; Wei, W.; Zhang, B.; Huang, L.; et al. Foundation Models and Intelligent Decision-Making: Progress, Challenges, and Perspectives. Innovation 2025, 6, 100948.
21. Melanthota, S.K.; Gopal, D.; Chakrabarti, S.; Kashyap, A.A.; Radhakrishnan, R.; Mazumder, N. Deep Learning-Based Image Processing in Optical Microscopy. Biophys. Rev. 2022, 14, 463–481.
22. Morgado, L.; Gómez-de-Mariscal, E.; Heil, H.S.; Henriques, R. The Rise of Data-Driven Microscopy Powered by Machine Learning. J. Microsc. 2024, 295, 85–92.
23. Pinkard, H.; Waller, L.; Phillips, Z.; Babakhani, A.; Fletcher, D.A. Deep Learning for Single-Shot Autofocus Microscopy. Optica 2019, 6, 794–797.
24. Amin, A.A.; Sajid Iqbal, M.; Hamza Shahbaz, M. Development of Intelligent Fault-Tolerant Control Systems with Machine Learning, Deep Learning, and Transfer Learning Algorithms: A Review. Expert Syst. Appl. 2024, 238, 121956.
25. Moen, E.; Bannon, D.; Kudo, T.; Graf, W.; Covert, M.; Van Valen, D. Deep Learning for Cellular Image Analysis. Nat. Methods 2019, 16, 1233–1246.
26. Bayoudh, K. A Survey of Multimodal Hybrid Deep Learning for Computer Vision: Architectures, Applications, Trends, and Challenges. Inf. Fusion 2024, 105, 102217.
27. Zhao, X.; Wang, L.; Zhang, Y.; Han, X.; Deveci, M.; Parmar, M. A Review of Convolutional Neural Networks in Computer Vision. Artif. Intell. Rev. 2024, 57, 99.
28. Pan, P.; Zhang, C.; Sun, J.; Guo, L. Multi-Scale Conv-Attention U-Net for Medical Image Segmentation. Sci. Rep. 2025, 15, 12041.
29. Zuo, C.; Qian, J.; Feng, S.; Yin, W.; Li, Y.; Fan, P.; Han, J.; Qian, K.; Chen, Q. Deep Learning in Optical Metrology: A Review. Light Sci. Appl. 2022, 11, 39.
30. Guo, M.; Wu, Y.; Hobson, C.M.; Su, Y.; Qian, S.; Krueger, E.; Christensen, R.; Kroeschell, G.; Bui, J.; Chaw, M.; et al. Deep Learning-Based Aberration Compensation Improves Contrast and Resolution in Fluorescence Microscopy. Nat. Commun. 2025, 16, 313.
31. Chen, Y.; Hu, X.; Zhu, Y.; Liu, X.; Yi, B. Real-Time Non-Invasive Hemoglobin Prediction Using Deep Learning-Enabled Smartphone Imaging. BMC Med. Inform. Decis. Mak. 2024, 24, 187.
32. Boothe, T.; Ivanković, M.; Grohme, M.A.; Markus, M.A.; Dullin, C.; Xu, X.; Rink, J.C. Content Aware Image Restoration Improves Spatiotemporal Resolution in Luminescence Imaging. Commun. Biol. 2023, 6, 518.
33. Weigert, M.; Schmidt, U.; Boothe, T.; Müller, A.; Dibrov, A.; Jain, A.; Wilhelm, B.; Schmidt, D.; Broaddus, C.; Culley, S.; et al. Content-Aware Image Restoration: Pushing the Limits of Fluorescence Microscopy. Nat. Methods 2018, 15, 1090–1097.
34. Fanous, M.J.; Popescu, G. GANscan: Continuous Scanning Microscopy Using Deep Learning Deblurring. Light Sci. Appl. 2022, 11, 265.
35. Luo, Z.; Xu, X.; Lin, D.; Qu, J.; Lin, F.; Li, J. Removing Non-Resonant Background of CARS Signal with Generative Adversarial Network. Appl. Phys. Lett. 2024, 124, 264101.
36. Vernuccio, F.; Broggio, E.; Sorrentino, S.; Bresci, A.; Junjuri, R.; Ventura, M.; Vanna, R.; Bocklitz, T.; Bregonzio, M.; Cerullo, G.; et al. Non-Resonant Background Removal in Broadband CARS Microscopy Using Deep-Learning Algorithms. Sci. Rep. 2024, 14, 23903.
37. Rehman, A.; Zhovmer, A.; Sato, R.; Mukouyama, Y.S.; Chen, J.; Rissone, A.; Puertollano, R.; Liu, J.; Vishwasrao, H.D.; Shroff, H.; et al. Convolutional Neural Network Transformer (CNNT) for Fluorescence Microscopy Image Denoising with Improved Generalization and Fast Adaptation. Sci. Rep. 2024, 14, 18184.
38. Yuan, F.; Zhang, Z.; Fang, Z. An Effective CNN and Transformer Complementary Network for Medical Image Segmentation. Pattern Recognit. 2023, 136, 109228.
39. Bai, B.; Yang, X.; Li, Y.; Zhang, Y.; Pillar, N.; Ozcan, A. Deep Learning-Enabled Virtual Histological Staining of Biological Samples. Light Sci. Appl. 2023, 12, 57.
40. Chen, J.; Sasaki, H.; Lai, H.; Su, Y.; Liu, J.; Wu, Y.; Zhovmer, A.; Combs, C.A.; Rey-Suarez, I.; Chang, H.Y.; et al. Three-Dimensional Residual Channel Attention Networks Denoise and Sharpen Fluorescence Microscopy Image Volumes. Nat. Methods 2021, 18, 678–687.
41. Daetwyler, S.; Fiolka, R.P. Light-Sheets and Smart Microscopy, an Exciting Future Is Dawning. Commun. Biol. 2023, 6, 502.
42. Volpe, G.; Wählby, C.; Tian, L.; Hecht, M.; Yakimovich, A.; Monakhova, K.; Waller, L.; Sbalzarini, I.F.; Metzler, C.A.; Xie, M.; et al. Roadmap on Deep Learning for Microscopy. J. Phys. Photonics 2025, 8, 012501.
43. Lahari, P.V.; Dutta, S.; Deeksha, H.; Patel, S.A.; Dehury, B.; Mazumder, N. Deep Learning Integration in Optical Microscopy: Advancements and Applications. Microsc. Res. Tech. 2026.
44. Khurana, D.; Koli, A.; Khatter, K.; Singh, S. Natural Language Processing: State of the Art, Current Trends and Challenges. Multimed. Tools Appl. 2023, 82, 3713–3744.
45. Raza, M.; Jahangir, Z.; Riaz, M.B.; Saeed, M.J.; Sattar, M.A. Industrial Applications of Large Language Models. Sci. Rep. 2025, 15, 13755.
46. Shao, M.; Basit, A.; Karri, R.; Shafique, M. Survey of Different Large Language Model Architectures: Trends, Benchmarks, and Challenges. IEEE Access 2024, 12, 188664–188706.
47. Wang, X.; Chen, Z.; Wang, H.; Hou, U.L.; Li, Z.; Guo, W. Large Language Model Enhanced Knowledge Representation Learning: A Survey. Data Sci. Eng. 2025, 10, 315–338.
48. Lu, W.; Luu, R.K.; Buehler, M.J. Fine-Tuning Large Language Models for Domain Adaptation: Exploration of Training Strategies, Scaling, Model Merging and Synergistic Capabilities. npj Comput. Mater. 2025, 11, 84.
49. Budnikov, M.; Bykova, A.; Yamshchikov, I.P. Generalization Potential of Large Language Models. Neural Comput. Appl. 2024, 37, 1973–1997.
50. Wu, X.-K.; Chen, M.; Li, W.; Wang, R.; Lu, L.; Liu, J.; Hwang, K.; Hao, Y.; Pan, Y.; Meng, Q.; et al. LLM Fine-Tuning: Concepts, Opportunities, and Challenges. Big Data Cogn. Comput. 2025, 9, 87.
51. Anisuzzaman, D.M.; Malins, J.G.; Friedman, P.A.; Attia, Z.I. Fine-Tuning Large Language Models for Specialized Use Cases. Mayo Clin. Proc. Digit. Health 2025, 3, 100184.
52. Shen, Z. LLM with Tools: A Survey. arXiv 2024, arXiv:2409.18807.
53. Fui-Hoon Nah, F.; Zheng, R.; Cai, J.; Siau, K.; Chen, L. Generative AI and ChatGPT: Applications, Challenges, and AI-Human Collaboration. J. Inf. Technol. Case Appl. Res. 2023, 25, 277–304.
54. Zhao, Z.; Lee, W.S.; Hsu, D. Large Language Models as Commonsense Knowledge for Large-Scale Task Planning. Adv. Neural Inf. Process. Syst. 2023, 36, 31967–31987.
55. Xie, Y.; He, K.; Castellanos-Gomez, A. Toward Full Autonomous Laboratory Instrumentation Control with Large Language Models. Small Struct. 2025, 6, 2500173.
56. Liu, Y.; Proksch, R.; Bemis, J.; Pratiush, U.; Dubey, A.; Ahmadi, M.; Emery, R.; Rack, P.D.; Liu, Y.C.; Yang, J.C.; et al. Machine Learning-Based Reward-Driven Tuning of Scanning Probe Microscopy: Toward Fully Automated Microscopy. ACS Nano 2025, 19, 19659–19669.
57. Cho, B.H.; Cao-Berg, I.; Bakal, J.A.; Murphy, R.F. OMERO.Searcher: Content-Based Image Search for Microscope Images. Nat. Methods 2012, 9, 633–634.
58. Dai, T.; Vijayakrishnan, S.; Szczypiński, F.T.; Ayme, J.F.; Simaei, E.; Fellowes, T.; Clowes, R.; Kotopanov, L.; Shields, C.E.; Zhou, Z.; et al. Autonomous Mobile Robots for Exploratory Synthetic Chemistry. Nature 2024, 635, 890–897.
59. Szymanski, N.J.; Rendy, B.; Fei, Y.; Kumar, R.E.; He, T.; Milsted, D.; McDermott, M.J.; Gallant, M.; Cubuk, E.D.; Merchant, A.; et al. An Autonomous Laboratory for the Accelerated Synthesis of Novel Materials. Nature 2023, 624, 86–91.
60. Boiko, D.A.; MacKnight, R.; Kline, B.; Gomes, G. Autonomous Chemical Research with Large Language Models. Nature 2023, 624, 570–578.
61. Kalinin, S.V.; Ziatdinov, M.; Hinkle, J.; Jesse, S.; Ghosh, A.; Kelley, K.P.; Lupini, A.R.; Sumpter, B.G.; Vasudevan, R.K. Automated and Autonomous Experiments in Electron and Scanning Probe Microscopy. ACS Nano 2021, 15, 12604–12627.
62. Krull, A.; Hirsch, P.; Rother, C.; Schiffrin, A.; Krull, C. Artificial-Intelligence-Driven Scanning Probe Microscopy. Commun. Phys. 2020, 3, 54.
63. Liu, Y.; Checa, M.; Vasudevan, R.K. Synergizing Human Expertise and AI Efficiency with Language Model for Microscopy Operation and Automated Experiment Design. Mach. Learn. Sci. Technol. 2024, 5, 02LT01.
64. Mandal, I.; Soni, J.; Zaki, M.; Smedskjaer, M.M.; Wondraczek, K.; Wondraczek, L.; Gosvami, N.N.; Krishnan, N.M.A. Evaluating Large Language Model Agents for Automation of Atomic Force Microscopy. Nat. Commun. 2025, 16, 9104.
65. Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.W.; da Silva Santos, L.B.; Bourne, P.E.; et al. The FAIR Guiding Principles for Scientific Data Management and Stewardship. Sci. Data 2016, 3, 160018.
66. Umeike, R.; Getty, N.; Xia, F.; Stevens, R. Scaling Large Vision-Language Models for Enhanced Multimodal Comprehension in Biomedical Image Analysis. In Proceedings of the International Symposium on Biomedical Imaging, Houston, TX, USA, 14–17 April 2025.
67. Verma, P.; Van, M.H.; Wu, X. Beyond Human Vision: The Role of Large Vision Language Models in Microscope Image Analysis. In Proceedings of the 2024 IEEE International Conference on Big Data (BigData), Washington, DC, USA, 15–18 December 2024; pp. 1700–1705.
68. Duponchel, L.; de Oliveira, R.R.; Motto-Ros, V. Large Language Models (Such as ChatGPT) as Tools for Machine Learning-Based Data Insights in Analytical Chemistry. Anal. Chem. 2025, 97, 6956–6961.
69. Zimmermann, T. Setting up a Light Microscopy Core Facility: Facility Design. J. Microsc. 2024, 294, 255–267.
70. Ferrando-May, E.; Hartmann, H.; Reymann, J.; Ansari, N.; Utz, N.; Fried, H.U.; Kukat, C.; Peychl, J.; Liebig, C.; Terjung, S.; et al. Advanced Light Microscopy Core Facilities: Balancing Service, Science and Career. Microsc. Res. Tech. 2016, 79, 463–479.
71. Cartwright, H.N.; Hobson, C.M.; Chew, T.L.; Reiche, M.A.; Aaron, J.S. The Challenges and Opportunities of Open-Access Microscopy Facilities. J. Microsc. 2024, 294, 386–396.
72. Ouyang, W.; Zimmer, C. The Imaging Tsunami: Computational Opportunities and Challenges. Curr. Opin. Syst. Biol. 2017, 4, 105–113.
73. Andreev, A.; Koo, D.E.S. Practical Guide to Storage of Large Amounts of Microscopy Data. Micros. Today 2020, 28, 42–45.
74. Chiarello, F.; Giordano, V.; Spada, I.; Barandoni, S.; Fantoni, G. Future Applications of Generative Large Language Models: A Data-Driven Case Study on ChatGPT. Technovation 2024, 133, 103002.
75. Kumar, Y.; Cardan, R.A.; Chang, H.H.; Heinzman, K.A.; Gultekin, K.; Goss, A.; McDonald, A.; Murdaugh, D.; McConathy, J.; Rothenberg, S.; et al. Demonstrating an Academic Core Facility for Automated Medical Image Processing and Analysis: Workflow Design and Practical Applications. Diagnostics 2025, 15, 803.
76. Martin, H.G.; Radivojevic, T.; Zucker, J.; Bouchard, K.; Sustarich, J.; Peisert, S.; Arnold, D.; Hillson, N.; Babnigg, G.; Marti, J.M.; et al. Perspectives for Self-Driving Labs in Synthetic Biology. Curr. Opin. Biotechnol. 2023, 79, 102881.
77. Li, T.; Song, W.; Chen, N.; Wang, Q.; Gao, F.; Xing, Y.; Wu, S.; Song, C.; Li, J.; Liu, Y.; et al. The Artificial Intelligence-Driven Intelligent Laboratory for Organic Chemistry Synthesis. Appl. Sci. 2025, 15, 7387.
78. Soldatov, M.A.; Butova, V.V.; Pashkov, D.; Butakova, M.A.; Medvedev, P.V.; Chernov, A.V.; Soldatov, A.V. Self-driving Laboratories for Development of New Functional Materials and Optimizing Known Reactions. Nanomaterials 2021, 11, 619.
79. Allan, C.; Burel, J.M.; Moore, J.; Blackburn, C.; Linkert, M.; Loynton, S.; MacDonald, D.; Moore, W.J.; Neves, C.; Patterson, A.; et al. OMERO: Flexible, Model-Driven Data Management for Experimental Biology. Nat. Methods 2012, 9, 245–253.
80. OMERO for Microscopy Research Data Management - 2022 - Wiley Analytical Science. Available online: https://analyticalscience.wiley.com/content/article-do/omero-microscopy-research-data-management (accessed on 8 January 2026).
81. Morone, D.; D’Antuono, R. AI-Based Hardware and Software Tools in Microscopy to Boost Research in Immunology and Virology. Front. Immunol. 2025, 16, 1610345.
82. Amershi, S.; Weld, D.; Vorvoreanu, M.; Fourney, A.; Nushi, B.; Collisson, P.; Suh, J.; Iqbal, S.; Bennett, P.N.; Inkpen, K.; et al. Guidelines for Human-AI Interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19), Glasgow, UK, 4–9 May 2019; p. 13.
83. Ji, Z.; Lee, N.; Frieske, R.; Yu, T.; Su, D.; Xu, Y.; Ishii, E.; Bang, Y.J.; Madotto, A.; Fung, P. Survey of Hallucination in Natural Language Generation. ACM Comput. Surv. 2023, 55, 248.
84. Farquhar, S.; Kossen, J.; Kuhn, L.; Gal, Y. Detecting Hallucinations in Large Language Models Using Semantic Entropy. Nature 2024, 630, 625–630.
85. Parupudi, V.S.R. Systematic Diagnosis of Brittle Reasoning in Large Language Models. arXiv 2025, arXiv:2510.08595.
86. Moradi, M.; Yan, K.; Colwell, D.; Samwald, M.; Asgari, R. A Critical Review of Methods and Challenges in Large Language Models. Comput. Mater. Contin. 2025, 82, 1681–1698.
87. Burgess, J.; Nirschl, J.J.; Bravo-Sánchez, L.; Lozano, A.; Gupte, S.R.; Galaz-Montoya, J.G.; Zhang, Y.; Su, Y.; Bhowmik, D.; Coman, Z.; et al. MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research. arXiv 2025, arXiv:2503.13399v1.
88. Li, M.; He, R.; Ma, C.; Tan, W.; Yan, B. MicroVQA++: High-Quality Microscopy Reasoning Dataset with Weakly Supervised Graphs for Multimodal Large Language Model. arXiv 2025, arXiv:2511.11407.
89. Burgess, J.; Gupte, S.; Lozano, A.; Nirschl, J.; Unell, A.; Yeung-Levy, S.; Zhang, Y. Micro-Bench: A Microscopy Benchmark for Vision-Language Understanding. In Proceedings of the Advances in Neural Information Processing Systems 37, Vancouver, BC, Canada, 10–15 December 2024; pp. 30670–30685.
90. Haltaufderheide, J.; Ranisch, R. The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs). npj Digit. Med. 2024, 7, 183.
91. Price, W.N.; Cohen, I.G. Privacy in the Age of Medical Big Data. Nat. Med. 2019, 25, 37–43.
92. Mandal, I.; Soni, J.; Zaki, M.; Smedskjaer, M.M.; Wondraczek, K.; Wondraczek, L.; Gosvami, N.N.; Krishnan, N.M.A. Autonomous Microscopy Experiments through Large Language Model Agents. arXiv 2024, arXiv:2501.10385.

Figure 1. A conversational AI framework for intelligent microscopy management, where researchers interact through natural language (LLMs) with an AI system that translates dialogue into adaptive experimental control, enabling optimized acquisition, feedback, and decision-making (ChatMicroscopy).

Figure 2. Conceptual overview of a microscopy facility supported by LLMs. Multiple microscopes generate heterogeneous imaging data that are centrally stored in a shared database. Researchers jointly explore and interpret these data through a computer interface, where an LLM retrieves relevant experimental evidence and contextual information to support grounded reasoning, data integration, and collaborative scientific decision-making.
