Article

Lost Data in Electron Microscopy

by Nina M. Ivanova, Alexey S. Kashin and Valentine P. Ananikov *
N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky Prospect 47, Moscow 119991, Russia
* Author to whom correspondence should be addressed.
Chemistry 2025, 7(5), 160; https://doi.org/10.3390/chemistry7050160
Submission received: 29 August 2025 / Revised: 17 September 2025 / Accepted: 23 September 2025 / Published: 1 October 2025

Abstract

The goal of this study is to estimate the amount of lost data in electron microscopy and to analyze the extent to which experimentally acquired images are utilized in peer-reviewed scientific publications. Analysis of the number of images taken on electron microscopes at a core user facility and the number of images subsequently included in peer-reviewed scientific journals revealed low efficiency of data utilization. Up to around 90% of electron microscopy data generated during routine instrument operation can remain unused. Of the more than 150,000 electron microscopy images evaluated in this study, only approximately 3500 (just over 2%) were made available in publications. For the analyzed dataset, the amount of lost data in electron microscopy can be estimated as >90% (in terms of data being recorded but not being published in peer-reviewed literature). On the one hand, these results highlight a shortcoming in the optimal use of microscopy images; on the other hand, they indicate the existence of a large pool of electron microscopy data that can facilitate research in data science and the development of AI-based projects. The considerations important to unlock the potential of lost data are discussed in the present article.

1. Introduction

Modern progress in automated data processing, including the use of computer algorithms based on neural networks, has greatly facilitated the solution of research tasks and accelerated data analysis in chemistry, life science, nanotechnology and many other areas. Machine learning techniques are widely used to solve problems in synthetic and computational chemistry [1,2,3,4,5,6,7], materials science [8,9,10] and catalysis [11,12,13,14]. Computer-aided data analysis is frequently employed in a variety of practical physicochemical applications, including the development of diagnostic tools [15,16,17]. The increasing use of machine learning approaches has made it possible to rapidly analyze large amounts of experimental data in different scientific fields. However, the issues of appropriate sharing [18,19,20,21] and storage [22,23,24] of scientific data, as well as realizing the potential for data reuse and rethinking [25,26,27,28], have become rather challenging. This is closely related to the questions of statistical significance and reproducibility of the results obtained, the efficiency of using expensive and busy equipment, and the lack of comprehensive information in the scientific literature on negative results. Furthermore, this is a crucial aspect of the larger issue of creating new publishing models for the digital era [29,30,31].
A vivid example of a vast body of scientific information is the data obtained from electron microscopy, a direct observation method now used to study the microstructure and nanostructure of materials [32,33,34]. Complex morphology, possible dynamic behavior and variations in micro- and nanostructures result in different images and a large overall number of microphotographs of a single sample. Each of the images taken may differ significantly from the others and is a separate source of scientifically valuable information. The use of computer-aided processing of electron microscopy data [35,36,37,38,39,40,41], especially for dynamic systems [42,43,44,45,46,47], in some cases allows comprehensive structural information to be obtained. However, despite the wealth of data available from an electron microscopy experiment on a single sample, it seems that often only one or a few images confirming a particular hypothesis are used to illustrate a publication. There is a risk that the majority of the images taken in the experiment may remain unpublished. In view of this, the question of increasing the efficiency of the use of the results of microscopic studies and the problem of lost data in electron microscopy are of great importance. The issue is becoming even more urgent as electron microscopy is gradually expanding its presence in novel areas of chemical research, for example, in the rapidly developing field of utilizing micro- and nanostructural arrangement of reaction media to regulate chemical transformations in solid matter [48,49,50] and in liquid phase [51,52,53,54].
In this article, we present an analysis of an array of electron microscopy data obtained during >10 years of operation at the center for structural analysis and the core user facility, and the fraction of electron microscopy images published in peer-reviewed journals to date. The consolidation and systematization of the scanning and transmission electron microscopy (SEM and TEM) data resulted in an array of approximately 152k initial images, 142k sets of parameters and 3577 images published in 292 articles. The results showed that more than 97% of the scientifically significant electron microscopy data were actually not published (lost for the development of science), demonstrating the critically low efficiency of data utilization and the need to revise and rationalize approaches to the use of microphotographs in scientific research. As a note, we have tried to focus on electron microscopy studies in chemical nanoscience, synthetic chemistry and catalysis, which are different from biological and environmental research. However, some common principles have also been mentioned briefly to give a more comprehensive picture of the problem and possible solutions.
To the best of our knowledge, this is the first systematic study to quantitatively assess the extent of lost data in electron microscopy. By consolidating and analyzing over a decade’s worth of real-world microscopy output, this work offers a data-driven perspective on how extensively experimental results are underutilized. This analysis not only reveals a substantial inefficiency in the dissemination of valuable scientific information but also opens up important discussions on research transparency, data reuse, and the cognitive biases that shape what gets published. The findings are broadly relevant across scientific domains where data-intensive methodologies are employed, and they underscore the need to rethink current practices in data management, publication standards, and the design of AI-driven discovery pipelines. Through this lens, the study contributes to a growing conversation on the cultural and infrastructural shifts required to enable more complete, equitable, and intelligent use of experimental data.

2. Results and Discussion

2.1. General Remarks and Scope of the Analysis

The first problem to be solved was to find a sufficiently comprehensive source of raw electron microscopy data for further analysis. Arrays collected directly on the microscope are a good source of data as they contain all the original micrographs taken during the measurements. Access to such third-party image repositories is usually limited, so we relied on the data to which we had access, which were available in statistically significant quantities. This range of data included images obtained on scanning and transmission electron microscopes, while additional information, such as X-ray microanalysis and diffraction data (selected area electron diffraction, SAED, or electron backscatter diffraction, EBSD), was excluded from consideration because of its relevance to only a limited range of samples and tasks. It should be noted that all the microphotographs included in the analysis were free from significant artifacts, were not the result of imaging with incorrect or non-optimized parameters and were of good quality, allowing each of them to be used as a reliable source of structural information. A representative example of the data images is described in the Supporting Information as an illustration.

2.2. Data Collection and Preparation

In the primary step of the study, the initial data of the scanning and transmission electron microscopes installed in the center for structural analysis were collected and statistically analyzed. A total of 152,097 images (403 GB of data) were prepared for analysis. The total number of SEM and TEM images was 119,557 (143 GB of data) and 32,540 (260 GB of data), respectively. Data processing scripts were written to allow analysis based on a number of different parameters (Figure 1).
The attributes of the files deposited in the data storage were chosen as the basic source of information for the analysis, and a file search was employed for array processing. The range of basic parameters relevant to the analysis included the file type and the date of its last modification, which made it possible to tentatively sort the electron microscopy images. A more complete analysis was possible using metadata with microscope characteristics and imaging parameters used or corresponding information stored in separate log files. It should be noted that the storage of the imaging parameters in text form actually proved to be less reliable and led to the loss of some data. A total of 141,681 log files and image files with metadata were found in the archive. The range of stored imaging parameters available for analysis was greater than the number of significant file attributes and included, for example, microscope model, detector type, magnification and other characteristics (Figure 1A). In this case, the analysis was performed by a string search using regular expressions. The search was carried out using two algorithms. The first consisted of sequential filtering of the array by several parameters and then counting the number of remaining files (Figure 1B). This approach was used, for example, to determine the number of images taken in a given year on a microscope of a specific type using a particular detector (P1 = <year> & P2 = <microscope> & P3 = <detector>). The second approach to analysis was to sort the whole array into different values of the same parameter and then count the number of files in each subarray (Figure 1C). This algorithm was used, for example, to sort microphotographs by the magnification value (P1 = <magnification-1, magnification-2, magnification-3, …>).
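The two search strategies described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the metadata are assumed to be plain-text `Key = Value` lines, and the key names (`Year`, `Microscope`, `Detector`, `Magnification`) as well as the helper functions are hypothetical and do not reproduce the scripts used in the study.

```python
import re
from collections import Counter

def get_param(text, key):
    """Return the value of a 'Key = Value' line found by regex search, or None."""
    match = re.search(rf"^{re.escape(key)}\s*=\s*(.+?)\s*$", text, flags=re.MULTILINE)
    return match.group(1) if match else None

def count_filtered(records, **criteria):
    """First algorithm: filter sequentially by several parameters, then count."""
    remaining = list(records)
    for key, wanted in criteria.items():
        remaining = [r for r in remaining if get_param(r, key) == wanted]
    return len(remaining)

def count_by_value(records, key):
    """Second algorithm: split the array by the values of one parameter and count each part."""
    return Counter(get_param(r, key) for r in records)

# Synthetic metadata records (hypothetical layout, for illustration only).
records = [
    "Year = 2019\nMicroscope = SEM\nDetector = SE\nMagnification = 10000",
    "Year = 2019\nMicroscope = SEM\nDetector = STEM\nMagnification = 50000",
    "Year = 2020\nMicroscope = TEM\nDetector = CCD\nMagnification = 50000",
]

n_selected = count_filtered(records, Year="2019", Microscope="SEM", Detector="SE")
by_magnification = count_by_value(records, "Magnification")
```

Here `count_filtered` corresponds to the P1 & P2 & P3 query, and `count_by_value` corresponds to sorting the array by the values of a single parameter.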
The analysis period was limited to images acquired up to 2023, since the time to publication may be up to 1–2 years (about 1 year on average, but 2 years is not uncommon). For images obtained in 2024 and 2025, it is not yet possible to clearly distinguish published from unpublished status. Thus, the images obtained in 2024 and 2025 were excluded from the analysis.
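A tentative sorting by file attributes with the 2023 cutoff might look like the sketch below. The set of image file extensions and the function name are assumptions made for illustration, and the date of last modification is only a proxy for the acquisition date.

```python
import os
import tempfile
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path

IMAGE_SUFFIXES = {".tif", ".tiff"}  # assumed image file types (hypothetical)
LAST_YEAR = 2023                     # cutoff year stated in the text

def images_per_year(root, last_year=LAST_YEAR):
    """Count image files per year of last modification, skipping later years."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        if modified.year <= last_year:
            counts[modified.year] += 1
    return counts

# Demo on a throwaway directory with synthetic modification dates.
with tempfile.TemporaryDirectory() as tmp:
    for name, year in [("a.tif", 2019), ("b.tif", 2019), ("c.tif", 2024), ("notes.txt", 2019)]:
        path = Path(tmp) / name
        path.write_bytes(b"")
        stamp = datetime(year, 6, 1, tzinfo=timezone.utc).timestamp()
        os.utime(path, (stamp, stamp))
    demo_counts = images_per_year(tmp)
```

In the demo, the 2024 image and the text file are skipped, so only the two 2019 images are counted.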

2.3. Analysis of Stored Electron Microscopy Data

To obtain information about the quantitative composition of the entire archive of images, a string search was performed using three parameters: (1) year of image acquisition from 2011 to 2023, (2) microscope type (SEM or TEM), and (3) detector type (used to separate the STEM images). Analysis of the data revealed a fairly steady increase in the number of images captured from year to year (Figure 2A). There is some variation in instrument use depending on the number of ongoing projects, the intensity of electron microscopy usage in each project, and hardware (re)configuration and repair, among other factors. Taking all these factors into account, the data represent a real load on a core user facility involved in a sufficiently large number of projects. On average, just over 10,000 images are taken annually, of which 74% are SEM, 23% are TEM and 3% are STEM images. To some extent, this proportion of sample surface studies using SEM reflects the specificity of the tasks carried out at the core user facility, but it is of interest to obtain a clearer numerical expression of the nature of the objects studied. To solve this problem, we used a simple approach based on the analysis of image acquisition parameters, namely, on the analysis of the range of scale bar sizes used (Figure 2B).
For this purpose, all images were grouped according to the magnification parameter. The magnification values were then sequentially converted to field of view (FOV) and scale bar length, which was conventionally assumed to be 10% of the FOV. As more than 80 different values of scale bar size were obtained, they were grouped according to the order of magnitude for clarity. The analysis showed that scanning electron microscopy, which operates in the micrometer range, and transmission electron microscopy, which focuses mainly on nanoscale objects, together cover 6 orders of magnitude of characteristic sizes. It can therefore be concluded that the typical electron microscopy dataset contains a wealth of information on the morphology of a wide range of objects, from single nanoparticles to submillimeter assemblies and devices.
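The conversion from magnification to scale bar size and the grouping by order of magnitude can be sketched as follows. The reference image width used to relate magnification to FOV is instrument-specific and is not given in the text; the value of 0.127 m below is purely an assumption, as are the function names.

```python
import math
from collections import Counter

REFERENCE_WIDTH_M = 0.127   # assumed reference image width in meters (hypothetical)
SCALE_BAR_FRACTION = 0.10   # scale bar taken as 10% of the FOV, as stated in the text

def scale_bar_m(magnification):
    """Convert magnification to FOV, then to scale bar length in meters."""
    fov_m = REFERENCE_WIDTH_M / magnification
    return SCALE_BAR_FRACTION * fov_m

def order_of_magnitude(length_m):
    """Decimal order of magnitude of a length expressed in meters."""
    return math.floor(math.log10(length_m))

def scale_bar_histogram(magnifications):
    """Count images per order of magnitude of the scale bar size."""
    return Counter(order_of_magnitude(scale_bar_m(m)) for m in magnifications)

# Synthetic magnification values, for illustration only.
hist = scale_bar_histogram([1_000, 10_000, 20_000, 100_000])
```

With these assumptions, a magnification of 10,000 gives a scale bar of about 1.27 µm (order of magnitude -6 in meters), while 20,000 and 100,000 both fall into the hundreds-of-nanometers bin (-7).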
To check whether the content of the original electron microscopy dataset was fully reflected in the published articles, we analyzed the articles in peer-reviewed journals that included the data obtained at the core user facility.

2.4. Classification and Analysis of Published Images

Published articles mentioning electron microscopes installed in the user facility were analyzed in this study, with 292 publications containing electron microscopy images either in the main text or in the Supplementary Information. Expert analysis of the nature of the systems imaged allowed the articles to be tentatively grouped into 5 broad categories: materials, catalysis, organic chemistry, ionic liquids and microscopic control (Figure 3A).
The most voluminous category, “Materials”, is represented by 10 subcategories. The most extensive subcategory includes membranes and porous materials: micro- and mesoporous carbon matrices of various origins, metal–organic frameworks, sorbents, molecular sieves, porous polymeric materials, etc. (references to specific articles are omitted here and below to avoid redundancy). In addition, separate subcategories have been identified that include articles devoted to the preparation and characterization of composite and hybrid materials, soft materials, biomaterials, polymer materials, minerals and ceramics. Two subcategories of carbon-based materials have also been identified. The first subcategory includes pure carbon materials such as carbon quantum dots, carbon nanotubes, granular activated carbon, etc. The second subcategory includes metallic particles on carbon supports, as these materials are of particular importance as catalysts for many chemical transformations. The last subcategory is nanoscale particles, which are divided into metallic and nonmetallic subtypes. However, the boundaries between the selected subcategories may overlap because the same publication can cover both types of nanoparticles.
The “Catalysis” topic is subdivided into heterogeneous catalysis and homogeneous catalysis plus nanocatalysis. Heterogeneous catalysis includes systems in which a solid material plays the role of a support and the transfer of catalyst particles between different phases is insignificant or absent. All other systems were attributed to homogeneous catalysis. This category included both classical catalysts based on soluble metal complexes and catalysts based on metal nanoparticles in dynamic equilibrium with the molecular phase.
The analysis also identified a large group of articles that required the creation of specific subcategories: microscopic control, organic chemistry and ionic liquids. The first of these takes into account those cases where it is not a specific substance/material or its morphology that needs to be studied but rather a set of phenomena that occur under external stimuli. The increasing involvement of electron microscopy methods in new areas of research has led to a significant number of articles devoted to the study of organic chemical systems outside the classical fields of organic materials science, such as soft matter or polymer chemistry. In this regard, two distinct categories were identified: organic chemistry—electron microscopy observations related to solving problems in synthetic organic chemistry, and ionic liquids—examples of direct studies of liquid phase samples based on this class of liquid organic salts used as solvents.
During the whole period analyzed, 3577 microscopy images were published (Figure 3B), which averages over 12 images per article. It should be noted that this number is significantly overestimated because of occasional spikes in the number of images presented in papers or Supplementary Materials, associated with special cases of publication of large datasets. After only 8 articles containing more than 50 images each were removed, the total number of published microphotographs decreased to 1853 (for 284 articles), which corresponds to an average of 6.5 images per paper. As in the case of the array of experimental images, the published microphotographs were analyzed according to the order of magnitude of the scale bar. To construct the corresponding histograms, the field of view (FOV) values were extracted from each published image and processed in accordance with the abovementioned method. The results show that, in the case of SEM, the most valuable micrographs are those with a scale bar size corresponding to hundreds of nanometers (Figure 3C), which is an order of magnitude smaller than the typical value obtained by processing the entire dataset (Figure 2B). In the case of TEM, there is no difference between the acquired and published data (cf. Figure 2B and Figure 3D), and the most common scale bar size is on the order of 10 nm. Thus, no significant inconsistencies were found, which indicates that a significant portion of the acquired image array is meaningful to researchers. Therefore, the small number of published images is not due to a lack of information provided by electron microscopy analysis but to the common practice of using microphotographs as illustrative material rather than as a self-sufficient result of structural research.
It is worth noting that in many cases, this problem is solved by publishing the results of the statistical processing of a large number of images, but even this presentation of data results in the loss of significant morphological information compared to the original images.
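The per-article averages quoted in this section follow directly from the reported totals and can be checked with a few lines of Python. The variable and function names are illustrative only.

```python
def mean_images_per_article(total_images, n_articles):
    """Average number of published images per article."""
    return total_images / n_articles

# Totals reported in the text.
ALL_IMAGES, ALL_ARTICLES = 3577, 292
TRIMMED_IMAGES, TRIMMED_ARTICLES = 1853, 284  # after removing 8 articles with >50 images each

mean_all = mean_images_per_article(ALL_IMAGES, ALL_ARTICLES)              # 12.25
mean_trimmed = mean_images_per_article(TRIMMED_IMAGES, TRIMMED_ARTICLES)  # about 6.5

# Images contributed by the 8 removed articles and their average per article.
removed_images = ALL_IMAGES - TRIMMED_IMAGES
removed_articles = ALL_ARTICLES - TRIMMED_ARTICLES
mean_removed = mean_images_per_article(removed_images, removed_articles)
```

The 8 removed articles account for 1724 images (about 215 per article), which illustrates how strongly a few large datasets shift the simple mean.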

2.5. Estimation of the Data Loss

The final step in the analysis was to compare the number of acquired and published electron microscopy images (Table 1) and to estimate the average number of electron microscopy images used in the publications as well as the amount of lost data. This part of the study is mainly based on data from the core user facility, whose collection and analysis methods are described in the previous sections of the paper. Also, as a reference, some publications containing electron microscopy images from third-party facilities were included in the analysis.
The results in Table 1 show that SEM images (as well as STEM images, which fall entirely into this category, as they were recorded on a SEM instrument) taken at the core user facility appear in 238 publications, with the number of published microphotographs representing only 2.55% of the total SEM dataset. For TEM, this value is even lower and amounts to 1.61%. It should be noted that 40 articles included published images from both scanning and transmission electron microscopes, so the total number of articles was less than the simple sum of the number of articles for SEM and TEM separately. However, regardless of the type of instrument and the total number of images taken, the amount of data lost is on the order of 97–98%, and on average, only 2–3% of microphotographs are published.
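As a rough cross-check, the overall lost fraction can be recomputed from the totals reported in Sections 2.2 and 2.4 (152,097 acquired and 3577 published images). The sketch below assumes these two totals refer to the same set of images; the function name is illustrative.

```python
def lost_fraction(acquired, published):
    """Fraction of acquired images that were not published."""
    if acquired <= 0:
        raise ValueError("acquired must be positive")
    return 1 - published / acquired

# Overall estimate from the reported totals.
overall_lost = lost_fraction(152_097, 3_577)

# Per-instrument estimates from the published percentages in the text.
sem_lost = 1 - 0.0255   # SEM (including STEM): 2.55% published
tem_lost = 1 - 0.0161   # TEM: 1.61% published
```

The overall value is about 97.6%, which lies between the SEM and TEM estimates and matches the 97–98% range stated above.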
The average number of images taken within the core user facility and published in a single article was compared with the same parameter for third-party images, derived from the content of 200 publications in peer-reviewed scientific journals on the topics of materials science, catalysis and nanotechnology. The value obtained for the core user facility (12.25, based on both SEM and TEM images) correlated well with the general trend (13.83, based on SEM, STEM and TEM images). Therefore, the results of the statistical analysis described above can be considered reliable.
The results obtained demonstrate the extremely low efficiency of the use of electron microscopy data and, at the same time, indicate the great potential of using previously obtained experimental SEM and TEM results for further processing, for example, using modern AI technologies.

2.6. Solving the Data Loss Problem

A detailed statistical analysis of the use of electron microscopy data described above revealed the existence of a data loss problem. At the same time, the identified features of the acquisition, storage and publication of electron microscopy data allowed us to formulate a number of recommendations for the rational use of microphotographs in scientific research (Figure 4, see Supporting Information for the detailed description).
All the proposed solutions can be divided into several categories according to the stages of electron microscopy-aided research: image acquisition, storage, analysis and publication in scientific journals. In addition to direct approaches that make large amounts of data available, ways to optimize the handling of raw data, to obtain more useful information from the same number of electron microscopy images, and to optimize the time spent by researchers are also presented. It is worth mentioning that the human time factor is of particular importance, as it is often a lack of time that leaves raw electron microscopy data unprocessed and therefore unsuitable for publication in peer-reviewed journals.
The standardization of image acquisition conditions, the most complete possible archiving of the conditions of electron microscopy experiments, and the careful cataloging of images at the acquisition stage will lay the foundation for the creation of universal databases that can be used by researchers in different fields without the need to reproduce experiments on readily available or familiar equipment. The automation of these processes will significantly increase the efficiency of the researchers’ work, which will contribute to a faster filling of the databases.
Electron microscopy data storage systems can be made publicly available. The capacity of modern web servers allows large amounts of data to be stored at minimal cost, making it possible to host micrographs on data sharing platforms. In addition, it is worth noting that open databases can also become discussion platforms to improve the quality of the data and respond to the current needs of researchers. Impressive efforts are currently being made to collect thousands or even millions of microphotographs, including electron microscope images, and to share them between researchers in different scientific fields [55]. In particular, the introduction of a number of databases, e.g., the Image Data Resource (IDR, more than 14M images) [56], the Electron Microscopy Public Image Archive (EMPIAR, about 2k entries) [57] and the Australian Antarctic Data Centre electron microscopy database 1995–2007 (about 17k images) [58] should be mentioned. The agreement on the format and common standards for data storage [59,60] has facilitated the handling and reuse of microscopy data in the bioimaging community.
Undoubtedly, high-quality and complete processing of the acquired images will improve the efficiency of data use. Modern image analysis systems, including those based on machine learning algorithms, will greatly simplify the analysis and can provide an opportunity for rapid data processing. Combining electron microscopy images with additional data for the same samples obtained by alternative methods, such as spectral techniques, can be a good way of extracting high-quality scientific information from the original microphotographs. Collaboration between researchers from different disciplines in data analysis will greatly streamline the process and increase the proportion of scientific information generated that is suitable for widespread use.
Of course, there is also the need to improve the quality and amount of data published in traditional peer-reviewed scientific journals. The volume of data published can be significantly increased by making full use of the ability to attach Supplementary Materials, and the way in which these materials are presented can also be improved in terms of faster access to the data and better visualization. The quality of the electron microscopy images presented will depend directly on the data publication policy, data presentation standards and the existence of review and data check procedures.
We believe that considering these factors will improve the policy of electron microscopy data application in scientific research and allow more researchers to use this powerful and convenient technique in a rational way.

2.7. Importance of Core User Facility Policy and Impact on Data Loss

The core user facility analyzed in this study operates under a multiple-user access policy that provides researchers with direct and independent access to the electron microscopy equipment. In this model, researchers interested in using the microscopes undergo basic training, pass an equipment proficiency test, and are subsequently granted access to operate the equipment autonomously. Once certified, users are free to acquire microscopy data without limitations on the number of recorded images or experimental iterations. This flexible policy encourages exploration, trial-and-error optimization, and in-depth characterization of samples at the discretion of individual users and research groups.
The multiple-user access policy described here may be common in research centers and corresponds to one of the typical modes of operation adopted by shared instrumentation facilities.
Such a policy is particularly advantageous for broadening access to advanced instrumentation and accelerating experimental throughput. However, it also results in the generation of large volumes of data with a substantial proportion remaining unpublished. In practice, many of the acquired micrographs are exploratory in nature, and although they may contain scientifically valuable information, they are often excluded from publications due to redundancy, selectivity, or time constraints during data processing and manuscript preparation.
It is important to note that the amount of lost data is likely influenced by the specific operational policies of user facilities. Facilities with different access models—such as centralized acquisition by trained staff, project-based scheduling, or pre-reviewed experiment planning—may produce significantly smaller or more curated datasets, potentially leading to a lower proportion of unused images. Therefore, the results and statistics reported in this work are closely tied to the policy of multiple-user access with unrestricted data generation, and should be interpreted in this context. Further comparative studies across various operational models may help to elucidate how user policies affect data retention and publication efficiency in scientific research environments.
In the per-project analysis, a much higher data usage rate was observed for some unique samples with only a few areas of interest available. When only a few images are recorded in total (e.g., due to the unique morphology of a certain single area, the high cost of a sample with a small area, or other possible reasons), the data usage/loss ratio may be approximately 1:1. Thus, the amount of lost data may be reduced to around 50%. Overall, the number of such projects was rather small, so it did not change the overall lost data value calculated for the entire dataset. As discussed above, the number of such projects may increase with different operational policies of user facilities.

3. Conclusions

This study presents the first systematic, data-driven quantification of lost data in electron microscopy, based on more than 150,000 micrographs acquired over a decade at a core user facility. A detailed comparison between the number of experimentally acquired images and those published in peer-reviewed scientific journals revealed that over 97% of recorded electron microscopy data remain unpublished and, therefore, largely unused in the scientific record. Specifically, only 2.55% of SEM and 1.61% of TEM images were found to be used in journal articles, indicating that the majority of image data generated by routine microscopy experiments are effectively lost.
One should not treat the specific value of 97% lost data as a universal metric, as the actual amount may vary depending on equipment usage policies, research practices, and institutional workflows; rather, the key conclusion is that the proportion of unused data can be very high and merits serious attention.
Given the variety of user policies and projects, it is reasonable to assume that the loss of 50–90% of data is not uncommon in electron microscopy. Even an estimate as low as 50% represents a significant amount of lost data, which has a strong impact on scientific research, equipment usage, and project economics.
It should be emphasized that the amount of lost data depends on the cost of the equipment and operation. For equipment/operations with a relatively small/regular cost, the amount of lost data may be larger since many images are taken and the time needed to record one image is rather small, as well as because scanning of large sample areas may take place. For high-cost equipment/operations, the amount of lost data may be significantly smaller due to limited availability, higher efficiency of use, or more time needed to record an image.
The analysis made in this study encompassed a wide range of research topics and imaging conditions and showed that the phenomenon of lost data is not due to poor data quality, but rather reflects a systemic inefficiency in data utilization. The findings provide the first quantitative evidence that vast amounts of high-quality scientific information are routinely discarded, highlighting a critical disconnect between data generation and its dissemination. The analysis of publications suggests that this pattern aligns with standard research practices and reflects a broader trend in scientific publishing and experimental workflows.
This work introduces a new perspective on the concept of “lost data” in experimental science and reveals an untapped resource with immense potential for data analysis, artificial intelligence training, and data-intensive research. It also emphasizes that the data loss rate may be strongly influenced by the operational model of user facilities, including policies on researcher access and data ownership.
The conclusions drawn here are not only related to microscopy but are broadly relevant to the design of institutional data policies, open science practices, and the strategic development of scientific infrastructure. Recognizing, quantifying, and addressing the problem of lost data is a necessary step toward improving research efficiency, enhancing reproducibility, and maximizing the return on investment in scientific instrumentation.
This study lays the groundwork for future efforts aimed at capturing, organizing, and repurposing unused microscopy data, and provides a model for similar assessments in other domains of experimental science.
To minimize the amount of data loss, the following key points should be considered:
(1) Employ reliable data acquisition protocols and use flexible, easily accessible storage solutions to facilitate data reporting, sharing, and reuse.
(2) Include all meaningful microscopy data in scientific publications, such as detailed imaging parameters and the results of seemingly “unsuccessful” experiments.
(3) Use automated data analysis to extract hidden structural information that could be useful for future research.
(4) Treat all high-quality images as sources of information for the future development of science.
(5) Use AI/ML tools to analyze all microscopy images obtained in the project and include the results in the published data domain.
Despite the large dataset analyzed and the comprehensive statistical approach, several limitations of this study should be acknowledged, together with directions for future research. The analysis was based on a single core user facility and primarily focused on chemical research applications of electron microscopy; therefore, extrapolation to other disciplines should be made with caution. The study also did not directly assess the scientific value of unpublished images or the possible reasons for their exclusion on a per-project basis. Future work should explore qualitative aspects of unused data, user behavior in data selection, and facility-specific publication practices. Expanding this approach across multiple institutions and disciplines would help validate the generality of the findings and support the development of unified strategies for microscopy data retention, reuse, and sharing. Integration of AI tools for automated quality assessment and metadata enrichment may further transform how unused microscopy images are evaluated and incorporated into new research workflows. Analyzing the financial aspects of the data loss problem is also important. The rational use of equipment can reduce the workload of microscopy devices, saving on maintenance and the purchase of consumables. Further research on this issue may focus on detailed economic studies and the sustainability of research practices.

4. Methods and Experimental Data Processing

4.1. Experimental Details Typical for the Shared Facilities Operation

Electron microscopy imaging was carried out using a set of high-performance microscopes: scanning electron microscopes (Hitachi SU8000, Hitachi High-Tech, Tokyo, Japan and Hitachi SU8230/Regulus8230, Hitachi High-Tech, Tokyo, Japan) and a transmission electron microscope (Hitachi HT7700, Hitachi High-Tech, Tokyo, Japan). These instruments provided capabilities for imaging in SEM (Scanning Electron Microscopy), TEM (Transmission Electron Microscopy), and STEM (Scanning Transmission Electron Microscopy) modes, depending on the nature and resolution requirements of the samples.

4.2. Scanning Electron Microscopy (SEM)

SEM imaging was performed primarily using the Hitachi SU8000 and SU8230 field-emission microscopes. Both instruments support high-resolution surface imaging and typically operate at accelerating voltages from 0.5 to 30 kV. The working distance was adjusted between 3 and 15 mm depending on the desired depth of field and signal strength. Images were acquired using multiple detectors, including secondary electron (SE) detectors for topographic contrast and a backscattered electron (BSE) detector for compositional imaging. Low-kV imaging (1–5 kV accelerating voltage) and the beam deceleration technique (0.01–2 kV landing voltage) were frequently used for surface-sensitive analysis of non-conductive and beam-sensitive samples without the need for a conductive coating. The resolution under high-voltage conditions reached sub-nanometer scales (<1.0 nm at 15 kV).

4.3. Transmission Electron Microscopy (TEM)

TEM imaging was performed on the Hitachi HT7700, an electron microscope with a thermionic electron source (a LaB6 cathode was used; a tungsten cathode can be installed as an option), optimized for materials characterization at relatively low accelerating voltages of up to 120 kV. Samples were prepared as fine powders, thin films, or ultramicrotomed sections with thicknesses generally below 100 nm to ensure sufficient electron transmission. TEM micrographs were usually recorded in bright-field (BF-TEM) mode. Selected area electron diffraction (SAED) was used in specific cases to assess crystallinity and lattice spacing. The instrument allowed magnifications from 1000× to 800,000×, with typical imaging resolutions down to ~0.2 nm.

4.4. Scanning Transmission Electron Microscopy (STEM)

STEM imaging was conducted using the SU8000 and SU8230 scanning electron microscopes equipped with STEM detection systems, allowing high-resolution analysis in transmitted electron mode with nanometer and sub-nanometer resolution. The STEM mode was employed particularly for high-magnification imaging of nanoparticles, interfaces, and fine structural details. Bright-field (BF-STEM) and dark-field (DF-STEM) signals were collected depending on the contrast requirements. The pixel size, dwell time, and scan rate were optimized to balance resolution against beam damage, particularly for sensitive organic and hybrid materials.

4.5. Imaging Conditions and Calibration

Across all instruments, imaging conditions such as accelerating voltage, beam current, aperture size, spot (pixel) size, magnification, and detector mode were optimized for each specific sample type. The microscopes were routinely calibrated using certified standard specimens to ensure dimensional accuracy and the desired spatial resolution. Digital images were recorded in high-resolution formats (typically TIFF or JPEG) and were saved with the corresponding metadata when supported by the software.

4.6. Collection of the Array

Electron microscopy images were obtained using equipment installed at the center for structural analysis and the core user facility for both scanning electron microscopy and transmission electron microscopy. The total amount of data collected was 403 GB, comprising 152,097 images and 109,141 text files with parameters.

4.7. Image Filtering

No manual selection or curation of image quality or project relevance was applied before statistical analysis. All micrographs meeting basic technical integrity (free of critical imaging artifacts) were included, thereby preserving the unbiased nature of the experimental dataset.

4.8. Image Acquisition Parameters

Metadata files include acquisition parameters such as accelerating voltage, working distance, detector mode, and magnification, which were used to reconstruct the imaging context and field of view for each micrograph.
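As an illustration of how such parameter files can be turned into structured records, the following minimal Python sketch parses a plain-text metadata file and estimates the horizontal field of view. It assumes a simple Key=Value layout and the key names PixelSize (nm per pixel) and DataSize (width x height in pixels); these names and the sample values are hypothetical and would need to be adapted to the files actually written by the microscope software. It is not the procedure used in this study, which relied on string searches.

```python
def parse_metadata(text):
    """Parse 'Key=Value' lines into a dict; section headers and blank lines are skipped."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("[") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


def field_of_view_um(params):
    """Horizontal field of view in micrometers from pixel size (nm) and image width (px)."""
    pixel_size_nm = float(params["PixelSize"])                 # hypothetical key name
    width_px = int(params["DataSize"].lower().split("x")[0])   # hypothetical key name
    return pixel_size_nm * width_px / 1000.0


# Hypothetical content of a parameter file (values are illustrative only)
sample = """[ImageFile]
AcceleratingVoltage=5000 Volt
WorkingDistance=8000 um
Magnification=50000
PixelSize=1.984375
DataSize=1280x960
"""

params = parse_metadata(sample)
fov = field_of_view_um(params)
print(params["Magnification"], round(fov, 2))
```

With the illustrative values above, a pixel size of about 1.98 nm and an image width of 1280 pixels correspond to a field of view of roughly 2.5 µm.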

4.9. Analysis of the Array

The data were analyzed using file search and string search tools (a string search was employed in the case of text files with parameters or images with metadata). The analysis was automated by using batch files. A summary of the search parameters and tasks is given in Table 2.
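For readers who do not work with Windows batch files, the same two kinds of query (a file search by type and modification year, and a string search in the parameter files) can be sketched in Python. This is an illustrative analogue under stated assumptions, not the authors' scripts: the function names, the folder layout, and the Date=MM/DD/YYYY line in the sample log are hypothetical.

```python
import os
import tempfile
import time
from collections import Counter

IMAGE_EXTENSIONS = {".png", ".jpg", ".tif"}  # file types listed in Table 2


def count_images_by_year(root):
    """Walk the folder tree and count image files by last-modified year."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                mtime = os.path.getmtime(os.path.join(dirpath, name))
                counts[time.localtime(mtime).tm_year] += 1
    return counts


def count_files_containing(root, needle, suffix=".txt"):
    """Count text files under root whose content contains the given string."""
    hits = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(suffix):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    if needle in handle.read():
                        hits += 1
    return hits


# Small self-contained demonstration on a temporary folder tree
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sample_A"))
    image = os.path.join(root, "sample_A", "img001.tif")
    log = os.path.join(root, "sample_A", "img001.txt")
    open(image, "wb").close()
    with open(log, "w", encoding="utf-8") as handle:
        handle.write("Date=03/15/2019\n")  # hypothetical metadata line
    stamp = time.mktime((2019, 3, 15, 12, 0, 0, 0, 0, -1))
    os.utime(image, (stamp, stamp))  # set the last-modified time to 2019
    images_by_year = count_images_by_year(root)
    logs_2019 = count_files_containing(root, "/2019")

print(dict(images_by_year), logs_2019)
```

Searching for the year with a leading slash mirrors the MM/DD/YYYY string search in Table 2 and reduces accidental matches with other four-digit numbers in the parameter files.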

4.10. Analysis of the Published Images

Peer-reviewed publications containing images from the abovementioned dataset were found in the Scopus, Web of Science, and Google Scholar databases for the period from 2011 to 2023 inclusive. Full-text searches were performed using the microscope model, author affiliation, and core user facility name. In addition, a detailed manual search was performed. Review articles and other types of publications with reused images (e.g., article translations) were excluded from the final set. In total, 292 publications were selected for further analysis. The published electron microscopy images were processed using ImageJ software (version 1.53e).
The analysis of published data from third-party sources was based on issues published up to 2023 (the most recent issues within the analysis period selected in this article) of several journals in the fields of catalysis (ACS Catalysis, American Chemical Society, Washington, DC, USA), materials science (Advanced Materials Interfaces, Wiley-VCH, Weinheim, Germany), nanotechnology (ACS Nano, American Chemical Society, Washington, DC, USA) and microsciences (Small, Wiley-VCH, Weinheim, Germany). A total of 16 issues containing 474 articles were analyzed. Of these, 200 articles were selected that contained at least one electron microscopy image in the main text or in the Supplementary Materials.

4.11. Experimental Workflow Overview

The overall workflow comprised the following steps: data collection → storage audit → metadata parsing → statistical analysis → cross-referencing with publications. Details on each workflow stage are described above in the text.

4.12. Automated Processing

All batch processing and string searches were performed using automated scripts to minimize human intervention and facilitate accurate processing.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/chemistry7050160/s1, Examples of “lost” TEM (Figures S1–S20) and SEM/STEM (Figures S21–S100) data for different samples; Figure S1: Palladium particles on graphite; Figure S2: Palladium particles on graphite; Figure S3: Nickel thiolate [Ni(Sm-FC6H4)2]n; Figure S4: Nickel thiolate [Ni(Sm-FC6H4)2]n; Figure S5: Nickel thiolate [Ni(So-NH2C6H4)2]n; Figure S6: Nickel thiolate [Ni(So-NH2C6H4)2]n; Figure S7: Nickel thiolate [Ni(Sp-CH3C6H4)2]n; Figure S8: Nickel thiolate [Ni(SC6H5)2]n; Figure S9: Pd/C-catalyst in an ionic liquid [C4mim][BF4]; Figure S10: Pd/ TiO2 catalyst in an ionic liquid [C4mim][BF4]; Figure S11: Palladium particles on graphite; Figure S12: Keyhole limpet hemocyanin; Figure S13: Keyhole limpet hemocyanin; Figure S14: TiO2 particles; Figure S15: TiO2 particles; Figure S16: γ−Al2O3 support for catalysts; Figure S17: γ−Al2O3 support for catalysts; Figure S18: Al2O3 support for catalysts; Figure S19: Al2O3 support for catalysts; Figure S20: TiO2 support for catalysts; Figure S21: Nickel thiolate [Ni(Sp-BrC6H4)2]n; Figure S22: Nickel thiolate [Ni(Sp-BrC6H4)2]n; Figure S23: Crystals of 4-nitrothiophenol (p-NO2C6H4SH); Figure S24: Crystals of 4-nitrothiophenol (p-NO2C6H4SH); Figure S25: Organosilicone matrix, annealed at 550 °C; Figure S26: Organosilicone matrix, annealed at 550 °C; Figure S27: Organosilicone matrix, annealed at 550 °C; Figure S28: Pd precipitate obtained from Pd(OAc)2 in CH3OH; Figure S29: Pd precipitate obtained from Pd(OAc)2 in CH3OH; Figure S30: Pd precipitate obtained from Pd(OAc)2 in CH3CN; Figure S31: Freeze-dried colloidal solution of Pd in CH3OH; Figure S32: Freeze-dried colloidal solution of Pd in CH3OH; Figure S33: Freeze-dried colloidal solution of Pd in CH3OH; Figure S34: Freeze-dried colloidal solution of Pd in dioxane; Figure S35: Freeze-dried colloidal solution of Pd in dioxane; Figure S36: Crystalline Pd(OAc)2 obtained from dioxane 
solution; Figure S37: Palladium-containing particles obtained from Pd(OAc)2 in the presence of C6H5I in dioxane; Figure S38: Palladium-containing particles obtained from Pd(OAc)2 in the presence of C6H5I in dioxane; Figure S39: Solid phase particles isolated from the Suzuki reaction; Figure S40: Nickel sulfide particles obtained by precipitation from solution; Figure S41: NaCl crystals in a [C4mim][OTf] ionic liquid environment, deposited from aqueous solution; Figure S42: CsCl crystals in a [C4mim][OTf] ionic liquid environment, deposited from aqueous solution; Figure S43: Cellulose solution in [C4mim][OTf] ionic liquid; Figure S44: Cellulose solution in [C4mim][OTf] ionic liquid; Figure S45: Cellulose crystallized from solution in a [C4mim][OAc] ionic liquid; Figure S46: Cellulose crystallized from solution in a [C4mim][OAc] ionic liquid; Figure S47: Cellulose crystallized from solution in a [C4mim][OAc] ionic liquid; Figure S48: Gold particles; Figure S49: Gold particles; Figure S50: Gold particles; Figure S51: Gold particles; Figure S52: Polyethylene furanoate powder; Figure S53: Polyethylene furanoate powder; Figure S54: Polyethylene furanoate powder; Figure S55: Composite material for 3D printing filled with carbon fiber; Figure S56: Composite material for 3D printing filled with carbon fiber; Figure S57: Composite material for 3D printing filled with carbon fiber; Figure S58: Composite material for 3D printing filled with carbon fiber; Figure S59: Nickel sulfide NiS crystallized from melt; Figure S60: Nickel sulfide NiS crystallized from melt; Figure S61: Nickel sulfide NiS crystallized from melt; Figure S62: NaCl crystals; Figure S63: Crystallized tetrabutylammonium salt; Figure S64: Crystallized tetrabutylammonium salt; Figure S65: Multi-walled carbon nanotubes; Figure S66: Multi-walled carbon nanotubes; Figure S67: Multi-walled carbon nanotubes; Figure S68: Diatomite for laboratory needs; Figure S69: Diatomite for laboratory needs; Figure S70: Oligomers 
obtained by resinification of 5-(hydroxymethyl)furfural; Figure S71: Oligomers obtained by resinification of 5-(hydroxymethyl)furfural; Figure S72: Silver chloride with metallic silver particles on the surface; Figure S73: Silver chloride with metallic silver particles on the surface; Figure S74: Sorbent from desiccant (molecular sieves); Figure S75: Sorbent from desiccant (molecular sieves); Figure S76: Freeze-dried coffee; Figure S77: Freeze-dried coffee; Figure S78: Freeze-dried coffee; Figure S79: Fine titanium alloy powder for 3D-printing; Figure S80: Fine titanium alloy powder for 3D-printing; Figure S81: Palladium particles on a lacey-type formvar support grid obtained from the Pd2(dba)3 complex in chloroform; Figure S82: Palladium particles on a lacey-type formvar support grid obtained from the Pd2(dba)3 complex in chloroform; Figure S83: Nickel thiolate [Ni(Sp-FC6H4)2]n; Figure S84: Nickel thiolate [Ni(Sp-FC6H4)2]n; Figure S85: Cobalt thiolate [Co(SC6H5)2]n; Figure S86: Cobalt thiolate [Co(SC6H5)2]n; Figure S87: Pd particles in [C4mim][BF4] ionic liquid; Figure S88: Pd particles in [C4mim][BF4] ionic liquid; Figure S89: Acinetobacter baumannii B-3190 microorganisms; Figure S90: Acinetobacter baumannii B-3190 microorganisms; Figure S91: Arthrobacter halodurans JSM 078085 microorganisms; Figure S92: Brevundimonas vesicularis NBRC 12165 microorganisms; Figure S93: Paracoccus yeei VKM B-3302 microorganisms; Figure S94: Rhodococcus fascians LMG 3623 microorganisms; Figure S95: Rhodococcus fascians LMG 3623 microorganisms; Figure S96: Pd particles on Paracoccus yeei VKM B-3302 cells; Figure S97: Paracoccus yeei VKM B-3302 microorganisms; Figure S98: Section of Paracoccus yeei VKM B-3302 microorganisms with palladium nanoparticles; Figure S99: Section of Paracoccus yeei VKM B-3302 microorganisms with palladium nanoparticles; Figure S100: Section of Paracoccus yeei VKM B-3302 microorganisms with palladium nanoparticles.

Author Contributions

Conceptualization, N.M.I., A.S.K., and V.P.A.; methodology, A.S.K.; investigation, N.M.I.; data curation, N.M.I.; writing—original draft preparation, N.M.I. and A.S.K.; writing—review and editing, A.S.K. and V.P.A.; visualization, N.M.I.; supervision, V.P.A.; funding acquisition, V.P.A. All authors have read and agreed to the published version of the manuscript.

Funding

The study was supported by the Ministry of Science and Higher Education of the Russian Federation (agreement no. 075-15-2024-531).

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author(s).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Butler, K.T.; Davies, D.W.; Cartwright, H.; Isayev, O.; Walsh, A. Machine Learning for Molecular and Materials Science. Nature 2018, 559, 547–555. [Google Scholar] [CrossRef]
  2. Artrith, N.; Butler, K.T.; Coudert, F.-X.; Han, S.; Isayev, O.; Jain, A.; Walsh, A. Best Practices in Machine Learning for Chemistry. Nat. Chem. 2021, 13, 505–508. [Google Scholar] [CrossRef]
  3. Keith, J.A.; Vassilev-Galindo, V.; Cheng, B.; Chmiela, S.; Gastegger, M.; Müller, K.-R.; Tkatchenko, A. Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems. Chem. Rev. 2021, 121, 9816–9872. [Google Scholar] [CrossRef] [PubMed]
  4. Meuwly, M. Machine Learning for Chemical Reactions. Chem. Rev. 2021, 121, 10218–10239. [Google Scholar] [CrossRef]
  5. Coley, C.W.; Green, W.H.; Jensen, K.F. Machine Learning in Computer-Aided Synthesis Planning. Acc. Chem. Res. 2018, 51, 1281–1289. [Google Scholar] [CrossRef]
  6. Coley, C.W.; Barzilay, R.; Jaakkola, T.S.; Green, W.H.; Jensen, K.F. Prediction of Organic Reaction Outcomes Using Machine Learning. ACS Cent. Sci. 2017, 3, 434–443. [Google Scholar] [CrossRef] [PubMed]
  7. Park, S.; Han, H.; Kim, H.; Choi, S. Machine Learning Applications for Chemical Reactions. Chem.—Asian J. 2022, 17, e202200203. [Google Scholar] [CrossRef] [PubMed]
  8. Schrier, J.; Norquist, A.J.; Buonassisi, T.; Brgoch, J. In Pursuit of the Exceptional: Research Directions for Machine Learning in Chemical and Materials Science. J. Am. Chem. Soc. 2023, 145, 21699–21716. [Google Scholar] [CrossRef]
  9. Oviedo, F.; Ferres, J.L.; Buonassisi, T.; Butler, K.T. Interpretable and Explainable Machine Learning for Materials Science and Chemistry. Acc. Mater. Res. 2022, 3, 597–607. [Google Scholar] [CrossRef]
  10. Kim, J.; Kang, D.; Kim, S.; Jang, H.W. Catalyze Materials Science with Machine Learning. ACS Mater. Lett. 2021, 3, 1151–1171. [Google Scholar] [CrossRef]
  11. Zahrt, A.F.; Henle, J.J.; Rose, B.T.; Wang, Y.; Darrow, W.T.; Denmark, S.E. Prediction of Higher-Selectivity Catalysts by Computer-Driven Workflow and Machine Learning. Science 2019, 363, eaau5631. [Google Scholar] [CrossRef]
  12. Kitchin, J.R. Machine Learning in Catalysis. Nat. Catal. 2018, 1, 230–232. [Google Scholar] [CrossRef]
  13. Toyao, T.; Maeno, Z.; Takakusagi, S.; Kamachi, T.; Takigawa, I.; Shimizu, K. Machine Learning for Catalysis Informatics: Recent Applications and Prospects. ACS Catal. 2020, 10, 2260–2297. [Google Scholar] [CrossRef]
  14. Schlexer Lamoureux, P.; Winther, K.T.; Garrido Torres, J.A.; Streibel, V.; Zhao, M.; Bajdich, M.; Abild-Pedersen, F.; Bligaard, T. Machine Learning for Computational Heterogeneous Catalysis. ChemCatChem 2019, 11, 3581–3601. [Google Scholar] [CrossRef]
  15. Aliev, T.A.; Belyaev, V.E.; Pomytkina, A.V.; Nesterov, P.V.; Shityakov, S.; Sadovnichii, R.V.; Novikov, A.S.; Orlova, O.Y.; Masalovich, M.S.; Skorb, E.V. Electrochemical Sensor to Detect Antibiotics in Milk Based on Machine Learning Algorithms. ACS Appl. Mater. Interfaces 2023, 15, 52010–52020. [Google Scholar] [CrossRef]
  16. Aliev, T.A.; Lavrentev, F.V.; Dyakonov, A.V.; Diveev, D.A.; Shilovskikh, V.V.; Skorb, E.V. Electrochemical Platform for Detecting Escherichia Coli Bacteria Using Machine Learning Methods. Biosens. Bioelectron. 2024, 259, 116377. [Google Scholar] [CrossRef]
  17. Aliev, T.; Korolev, I.; Yasnov, M.; Nosonovsky, M.; Skorb, E.V. Rosé or White, Glass or Plastic: Computer Vision and Machine Learning Study of Cavitation Bubbles in Sparkling Wines. RSC Adv. 2025, 15, 5151–5158. [Google Scholar] [CrossRef] [PubMed]
  18. Bird, C.L.; Frey, J.G. Chemical Information Matters: An e-Research Perspective on Information and Data Sharing in the Chemical Sciences. Chem. Soc. Rev. 2013, 42, 6754–6776. [Google Scholar] [CrossRef] [PubMed]
  19. Baker, K.S.; Duerr, R.E.; Parsons, M.A. Scientific Knowledge Mobilization: Co-Evolution of Data Products and Designated Communities. Int. J. Digit. Curation 2015, 10, 110–135. [Google Scholar] [CrossRef]
  20. Tedersoo, L.; Küngas, R.; Oras, E.; Köster, K.; Eenmaa, H.; Leijen, Ä.; Pedaste, M.; Raju, M.; Astapova, A.; Lukner, H.; et al. Data Sharing Practices and Data Availability upon Request Differ across Scientific Disciplines. Sci. Data 2021, 8, 192. [Google Scholar] [CrossRef] [PubMed]
  21. Hardwicke, T.E.; Mathur, M.B.; MacDonald, K.; Nilsonne, G.; Banks, G.C.; Kidwell, M.C.; Hofelich Mohr, A.; Clayton, E.; Yoon, E.J.; Henry Tessler, M.; et al. Data Availability, Reusability, and Analytic Reproducibility: Evaluating the Impact of a Mandatory Open Data Policy at the Journal Cognition. R. Soc. Open Sci. 2018, 5, 180448. [Google Scholar] [CrossRef]
  22. Kindling, M.; Strecker, D. Data Quality Assurance at Research Data Repositories. Data Sci. J. 2022, 21, 1–17. [Google Scholar] [CrossRef]
  23. Hart, E.M.; Barmby, P.; LeBauer, D.; Michonneau, F.; Mount, S.; Mulrooney, P.; Poisot, T.; Woo, K.H.; Zimmerman, N.B.; Hollister, J.W. Ten Simple Rules for Digital Data Storage. PLoS Comput. Biol. 2016, 12, e1005097. [Google Scholar] [CrossRef]
  24. Roche, D.G.; Lanfear, R.; Binning, S.A.; Haff, T.M.; Schwanz, L.E.; Cain, K.E.; Kokko, H.; Jennions, M.D.; Kruuk, L.E.B. Troubleshooting Public Data Archiving: Suggestions to Increase Participation. PLoS Biol. 2014, 12, e1001779. [Google Scholar] [CrossRef] [PubMed]
  25. Palmer, C.L.; Weber, N.M.; Cragin, M.H. The Analytic Potential of Scientific Data: Understanding Re-Use Value. Proc. Am. Soc. Inf. Sci. Technol. 2011, 48, 1–10. [Google Scholar] [CrossRef]
  26. Strieth-Kalthoff, F.; Sandfort, F.; Kühnemund, M.; Schäfer, F.R.; Kuchen, H.; Glorius, F. Machine Learning for Chemical Reactivity: The Importance of Failed Experiments. Angew. Chem. Int. Ed. 2022, 61, e202204647. [Google Scholar] [CrossRef] [PubMed]
  27. Vines, T.H.; Albert, A.Y.K.; Andrew, R.L.; Débarre, F.; Bock, D.G.; Franklin, M.T.; Gilbert, K.J.; Moore, J.-S.; Renaut, S.; Rennison, D.J. The Availability of Research Data Declines Rapidly with Article Age. Curr. Biol. 2014, 24, 94–97. [Google Scholar] [CrossRef]
  28. Goodman, A.; Pepe, A.; Blocker, A.W.; Borgman, C.L.; Cranmer, K.; Crosas, M.; Di Stefano, R.; Gil, Y.; Groth, P.; Hedstrom, M.; et al. Ten Simple Rules for the Care and Feeding of Scientific Data. PLoS Comput. Biol. 2014, 10, e1003542. [Google Scholar] [CrossRef]
  29. Pagliaro, M. Publishing Scientific Articles in the Digital Era. Open Sci. J. 2020, 5, 1–12. [Google Scholar] [CrossRef]
  30. Pagliaro, M. “Highly Read and Poorly Cited?” A Critical Perspective on Academic Social Networks. J. Data Sci. Inf. Cit. Stud. 2024, 3, 155–160. [Google Scholar] [CrossRef]
  31. Ciriminna, R.; Li Petri, G.; Angellotti, G.; Luque, R.; Pagliaro, M. Open and Impactful Academic Publishing. Front. Res. Metrics Anal. 2025, 10, 1544965. [Google Scholar] [CrossRef]
  32. Science of Microscopy; Hawkes, P.W., Spence, J.C.H., Eds.; Springer: New York, NY, USA, 2007; ISBN 978-0-387-25296-4. [Google Scholar]
  33. Modern Electron Microscopy in Physical and Life Sciences; Janecek, M., Kral, R., Eds.; IN TECH d.o.o.: Rijeka, Croatia, 2016; ISBN 978-953-51-2252-4. [Google Scholar]
  34. Liquid Cell Electron Microscopy; Ross, F.M., Ed.; Cambridge University Press: New York, NY, USA, 2017; ISBN 978-1-107-11657-3. [Google Scholar]
  35. Kalinin, S.V.; Ophus, C.; Voyles, P.M.; Erni, R.; Kepaptsoglou, D.; Grillo, V.; Lupini, A.R.; Oxley, M.P.; Schwenker, E.; Chan, M.K.Y.; et al. Machine Learning in Scanning Transmission Electron Microscopy. Nat. Rev. Methods Prim. 2022, 2, 11. [Google Scholar] [CrossRef]
  36. Muto, S.; Shiga, M. Application of Machine Learning Techniques to Electron Microscopic/Spectroscopic Image Data Analysis. Microscopy 2020, 69, 110–122. [Google Scholar] [CrossRef] [PubMed]
  37. Groschner, C.K.; Choi, C.; Scott, M.C. Machine Learning Pipeline for Segmentation and Defect Identification from High-Resolution Transmission Electron Microscopy Data. Microsc. Microanal. 2021, 27, 549–556. [Google Scholar] [CrossRef]
  38. Botifoll, M.; Pinto-Huguet, I.; Arbiol, J. Machine Learning in Electron Microscopy for Advanced Nanocharacterization: Current Developments, Available Tools and Future Outlook. Nanoscale Horiz. 2022, 7, 1427–1477. [Google Scholar] [CrossRef]
  39. Eremin, D.B.; Galushko, A.S.; Boiko, D.A.; Pentsak, E.O.; Chistyakov, I.V.; Ananikov, V.P. Toward Totally Defined Nanocatalysis: Deep Learning Reveals the Extraordinary Activity of Single Pd/C Particles. J. Am. Chem. Soc. 2022, 144, 6071–6079. [Google Scholar] [CrossRef] [PubMed]
  40. Galushko, A.S.; Boiko, D.A.; Pentsak, E.O.; Eremin, D.B.; Ananikov, V.P. Time-Resolved Formation and Operation Maps of Pd Catalysts Suggest a Key Role of Single Atom Centers in Cross-Coupling. J. Am. Chem. Soc. 2023, 145, 9092–9103. [Google Scholar] [CrossRef]
  41. Ho, B.; Zhao, J.; Liu, J.; Tang, L.; Guan, Z.; Li, X.; Li, M.; Howard, E.; Wheeler, R.; Bae, J. SEMPro: A Data-Driven Pipeline To Learn Structure–Property Insights from Scanning Electron Microscopy Images. ACS Mater. Lett. 2023, 5, 3117–3125. [Google Scholar] [CrossRef]
  42. Zheng, H.; Lu, X.; He, K. In Situ Transmission Electron Microscopy and Artificial Intelligence Enabled Data Analytics for Energy Materials. J. Energy Chem. 2022, 68, 454–493. [Google Scholar] [CrossRef]
  43. Faraz, K.; Grenier, T.; Ducottet, C.; Epicier, T. Deep Learning Detection of Nanoparticles and Multiple Object Tracking of Their Dynamic Evolution during in Situ ETEM Studies. Sci. Rep. 2022, 12, 2484. [Google Scholar] [CrossRef]
  44. Kang, S.; Kim, J.-H.; Lee, M.; Yu, J.W.; Kim, J.; Kang, D.; Baek, H.; Bae, Y.; Kim, B.H.; Kang, S.; et al. Real-Space Imaging of Nanoparticle Transport and Interaction Dynamics by Graphene Liquid Cell TEM. Sci. Adv. 2021, 7, eabi5419. [Google Scholar] [CrossRef]
  45. Yao, L.; Ou, Z.; Luo, B.; Xu, C.; Chen, Q. Machine Learning to Reveal Nanoparticle Dynamics from Liquid-Phase TEM Videos. ACS Cent. Sci. 2020, 6, 1421–1430. [Google Scholar] [CrossRef]
  46. Cheng, B.; Ye, E.; Sun, H.; Wang, H. Deep Learning-Assisted Analysis of Single Molecule Dynamics from Liquid-Phase Electron Microscopy. Chem. Commun. 2023, 59, 1701–1704. [Google Scholar] [CrossRef]
  47. Kashin, A.S.; Boiko, D.A.; Ananikov, V.P. Neural Network Analysis of Electron Microscopy Video Data Reveals the Temperature-Driven Microphase Dynamics in the Ions/Water System. Small 2021, 17, 2007726. [Google Scholar] [CrossRef]
  48. Jordan, J.W.; Chernov, A.I.; Rance, G.A.; Stephen Davies, E.; Lanterna, A.E.; Alves Fernandes, J.; Grüneis, A.; Ramasse, Q.; Newton, G.N.; Khlobystov, A.N. Host–Guest Chemistry in Boron Nitride Nanotubes: Interactions with Polyoxometalates and Mechanism of Encapsulation. J. Am. Chem. Soc. 2023, 145, 1206–1215. [Google Scholar] [CrossRef]
  49. Fung, K.L.Y.; Skowron, S.T.; Hayter, R.; Mason, S.E.; Weare, B.L.; Besley, N.A.; Ramasse, Q.M.; Allen, C.S.; Khlobystov, A.N. Direct Measurement of Single-Molecule Dynamics and Reaction Kinetics in Confinement Using Time-Resolved Transmission Electron Microscopy. Phys. Chem. Chem. Phys. 2023, 25, 9092–9103. [Google Scholar] [CrossRef]
  50. Yan, P.; Zhang, D.; Zhang, W.; Sun, K.; Jin, M.; Chamberlain, T.W.; Khlobystov, A.N.; Kaiser, U.; Hu, Y.; Cao, K. Atomic-Scale Imaging of Transformation of Nickel Nanocrystals to Nickel Carbides in Real Time. ACS Nano 2025, 19, 23306–23314. [Google Scholar] [CrossRef] [PubMed]
  51. Greco, R.; Lloret, V.; Rivero-Crespo, M.Á.; Hirsch, A.; Doménech-Carbó, A.; Abellán, G.; Leyva-Pérez, A. Acid Catalysis with Alkane/Water Microdroplets in Ionic Liquids. JACS Au 2021, 1, 786–794. [Google Scholar] [CrossRef] [PubMed]
  52. Leyva-Pérez, A.; Bilanin, C.; Bacic, M.; Greco, R. Acid and Base Water Coexists in a Micro-Structured Ionic Liquid and Catalyzes Organic Reactions in One-Pot. ChemCatChem 2022, 14, e202200557. [Google Scholar] [CrossRef]
  53. Fedorets, A.A.; Koltsov, S.; Muravev, A.A.; Fotin, A.; Zun, P.; Orekhov, N.; Nosonovsky, M.; Skorb, E.V. Observation of a Chemical Reaction in a Levitating Microdroplet Cluster and Droplet-Generated Music. Chem. Sci. 2024, 15, 12067–12076. [Google Scholar] [CrossRef] [PubMed]
  54. Kashin, A.S.; Ananikov, V.P. Nanoscale Advancement Continues—From Catalysts and Reagents to Restructuring of Reaction Media. Angew. Chem. Int. Ed. 2021, 60, 18926–18928. [Google Scholar] [CrossRef] [PubMed]
  55. Ede, J.M. Deep Learning in Electron Microscopy. Mach. Learn. Sci. Technol. 2021, 2, 011004. [Google Scholar] [CrossRef]
  56. Williams, E.; Moore, J.; Li, S.W.; Rustici, G.; Tarkowska, A.; Chessel, A.; Leo, S.; Antal, B.; Ferguson, R.K.; Sarkans, U.; et al. Image Data Resource: A Bioimage Data Integration and Publication Platform. Nat. Methods 2017, 14, 775–781. [Google Scholar] [CrossRef]
  57. Iudin, A.; Korir, P.K.; Somasundharam, S.; Weyand, S.; Cattavitello, C.; Fonseca, N.; Salih, O.; Kleywegt, G.J.; Patwardhan, A. EMPIAR: The Electron Microscopy Public Image Archive. Nucleic Acids Res. 2023, 51, D1503–D1511. [Google Scholar] [CrossRef]
  58. van den Enden, D. Electron Microscope Images—1995–2007, Ver. 1; Australian Antarctic Data Centre: Kingston, Tasmania, Australia, 2023. [CrossRef]
  59. Sarkans, U.; Chiu, W.; Collinson, L.; Darrow, M.C.; Ellenberg, J.; Grunwald, D.; Hériché, J.-K.; Iudin, A.; Martins, G.G.; Meehan, T.; et al. REMBI: Recommended Metadata for Biological Images—Enabling Reuse of Microscopy Data in Biology. Nat. Methods 2021, 18, 1418–1422. [Google Scholar] [CrossRef]
  60. Moore, J.; Basurto-Lozada, D.; Besson, S.; Bogovic, J.; Bragantini, J.; Brown, E.M.; Burel, J.-M.; Casas Moreno, X.; de Medeiros, G.; Diel, E.E.; et al. OME-Zarr: A Cloud-Optimized Bioimaging File Format with International Community Support. Histochem. Cell Biol. 2023, 160, 223–251. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Different types of available electron microscopy data and corresponding strategies for their analysis (A). Two main algorithms used to analyze the array: sequential filtering of the array by several parameters (B) and sorting the whole array into different values of the same parameter (C). P1, P2 and so on refer to the search parameters. The different colors on each of the panels (B,C) highlight subarrays that match specific values of the search parameters.
Figure 2. Summary of the composition of the electron microscopy image archive. Distribution of SEM and TEM images by year (A) and conditional scale bar size (B). Some numerical values for overlapping points in panel (A) have been removed for clarity.
Figure 3. Summary of the published electron microscopy data. Percentage distribution of published articles by selected categories and subcategories (A) as well as distribution of the number of published images by article topic and year of publication (B). The characteristic sizes of the objects studied by S(T)EM (C) or TEM (D) in terms of the conditional scale bar size. The points for the microscopic control and ionic liquids categories for 2017 and 2020, respectively, are not shown in panel (B) because they overlap with the points for the organic chemistry category.
Figure 4. Schematic summary of suggestions for improving electron microscopy data management, analysis and reporting to avoid the problem of data loss (see Supporting Information, “Section S3. Solving the Data Loss Problem”, for further details).
Table 1. Estimation of the amount of electron microscopy data used and lost.
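| Data source, data type | Core user facility, SEM | Core user facility, TEM | Core user facility, TOTAL | Third-party facilities, SEM | Third-party facilities, STEM | Third-party facilities, TEM | Third-party facilities, TOTAL |
|---|---|---|---|---|---|---|---|
| Number of acquired images | 119,557 | 32,540 | 152,097 | No data | No data | No data | No data |
| Number of published articles | 238 | 94 | 292 | 158 | 38 | 122 | 200 |
| Number of published images | 3054 | 523 | 3577 | 1624 | 274 | 818 | 2766 |
| Average number of images per article | 12.83 | 5.56 | 12.25 | 10.28 | 7.21 | 6.70 | 13.83 |
| Published data, % | 2.55 | 1.61 | 2.35 | | | | |
| Lost data, % | 97.45 | 98.39 | 97.65 | | | | |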
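The percentages in Table 1 follow from the raw counts for the core user facility. The short Python check below reproduces them, assuming that "published data" is the ratio of published to acquired images and that "lost data" is its complement. This reading is consistent with the tabulated values, but the authors do not state the formula explicitly here.

```python
# Counts copied from Table 1 (core user facility only).
acquired = {"SEM": 119_557, "TEM": 32_540, "TOTAL": 152_097}
published_images = {"SEM": 3054, "TEM": 523, "TOTAL": 3577}
published_articles = {"SEM": 238, "TEM": 94, "TOTAL": 292}


def summarize(kind):
    """Return (published %, lost %, images per article), rounded to 2 places."""
    published_pct = 100 * published_images[kind] / acquired[kind]
    lost_pct = 100 - published_pct  # assumed definition: complement of published
    per_article = published_images[kind] / published_articles[kind]
    return round(published_pct, 2), round(lost_pct, 2), round(per_article, 2)


summary = {kind: summarize(kind) for kind in acquired}
for kind, (pub, lost, avg) in summary.items():
    print(f"{kind}: published {pub:.2f}%, lost {lost:.2f}%, {avg:.2f} images/article")
```

The same calculation cannot be carried out for the third-party facilities, because the number of acquired images is not available for them.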
Table 2. The list of parameters used to analyze the electron microscopy image dataset. Refer to the Windows command-line documentation for additional information on the syntax used.
| Parameter | Search type | Formulated task/request | Command | Values |
|---|---|---|---|---|
| Type of the instrument | File search | Search for the image files (`*.png`, `*.jpg`, `*.tif`) in the tree folder transferred from a specific instrument’s workstation | | |
| Year of image acquisition | File search, string search | File attribute search, date search in DD.MM.YYYY format | `dir/t:w/s *.%Filetype% > <List>` | Filetype = png, jpg, tif |
| | | | `find/c/i "%Year%" <List>` | Year = 2011, 2012, …, 2023 |
| | | Search in parameter files (in metadata), date search in MM/DD/YYYY format | `findstr/s/m "/%Year%" <Log file> > <List>` | Year = 2011, 2012, …, 2023 |
| Type of the detector | String search | Search in parameter files (in metadata), search for a specific value | `findstr/s/m "SignalName=%Detector%" <Log file> > <List>` | Detector = SE, LA-BSE, HA-BSE, PDBSE, TE, BFSTEM, DFSTEM |
| Magnification value | String search | Search in parameter files (in metadata), search for a specific value | `findstr/s/m "Magnification=%Mag%" <Log file> > <List>` | Mag = 20, 25, 30,…, 800,000 |
| | | Search in parameter files (in metadata), extraction of all values | `findstr/s/n "Magnification=" <Log file> > <List>` | |
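For readers working outside the Windows command line, the last request in Table 2 (extracting all magnification values from the metadata files) could be approximated in Python as sketched below. This is not the authors' procedure. It assumes that each metadata file is plain text with one `Key=Value` entry per line and that the files share a common suffix; the `.txt` suffix and the function name `collect_magnifications` are placeholders. Unlike `findstr`, which matches the string anywhere in a line, the pattern here only matches lines that start with `Magnification=`.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: one "Key=Value" entry per line, integer magnification values.
MAG_LINE = re.compile(r"^Magnification=(\d+)\s*$", re.MULTILINE)


def collect_magnifications(root, suffix=".txt"):
    """Recursively read metadata files under root and tally magnification values."""
    counts = Counter()
    for path in Path(root).rglob(f"*{suffix}"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for value in MAG_LINE.findall(text):
            counts[int(value)] += 1
    return counts
```

Calling `collect_magnifications` on the root of an archive returns a `Counter` that maps each magnification to the number of metadata files reporting it, which is the kind of tally needed for a distribution by magnification.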
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ivanova, N.M.; Kashin, A.S.; Ananikov, V.P. Lost Data in Electron Microscopy. Chemistry 2025, 7, 160. https://doi.org/10.3390/chemistry7050160

