Article

Evaluation of Neural Networks for Improved Computational Cost in Carbon Nanotubes Geometric Optimization

by Luis Josimar Vences-Reynoso 1, Daniel Villanueva-Vasquez 1, Roberto Alejo-Eleuterio 1, Federico Del Razo-López 1, Sonia Mireya Martínez-Gallegos 1 and Everardo Efrén Granda-Gutiérrez 2,*
1 División de Estudios de Posgrado e Investigación, Tecnológico Nacional de Mexico/Instituto Tecnológico de Toluca, Av. Tecnológico s/n, Col. Agrícola, Metepec 52149, Mexico
2 Centro Universitario UAEM Atlacomulco, Universidad Autónoma del Estado de México, Km. 60 Carretera Toluca-Atlacomulco, Atlacomulco 50450, Mexico
* Author to whom correspondence should be addressed.
Modelling 2025, 6(2), 36; https://doi.org/10.3390/modelling6020036
Submission received: 20 March 2025 / Revised: 20 April 2025 / Accepted: 30 April 2025 / Published: 2 May 2025

Abstract

Geometric optimization of carbon nanotubes (CNTs) is a fundamental step in computational simulations, enabling precise studies of their properties for various applications. However, this process becomes computationally expensive as the molecular structure grows in complexity and size. To address this challenge, this study utilized three deep-learning-based neural network architectures: Multi-Layer Perceptron (MLP), Bidirectional Long Short-Term Memory (BiLSTM), and 1D Convolutional Neural Networks (1D-CNNs). Simulations were performed using the CASTEP module in Material Studio to generate datasets for training the neural networks. While the final geometric optimization calculations were completed within Material Studio, the neural networks effectively generated preoptimized CNT structures that served as starting points, significantly reducing computational time. The results showed that the 1D-CNN architecture performed best for CNTs with 28, 52, 76, and 156 atoms, while the MLP outperformed others for CNTs with 84, 124, 148, and 196 atoms. Across all cases, computational time was reduced by 39.68% to 90.62%. Although the BiLSTM also achieved reductions, its performance was less effective than the other two architectures. This work highlights the potential of integrating deep learning techniques into materials science; it also offers a transformative approach to reducing computational costs in optimizing CNTs and presents a way for accelerated research in molecular systems.

Graphical Abstract

1. Introduction

In materials science, geometric optimization is important in advancing the development of materials. Researchers can use computational simulations to explore a broad spectrum of geometric configurations and material properties with remarkable efficiency [1]. The primary goal of this process is to identify the most stable and energetically favorable configuration of a molecular structure [2]. Geometric optimization involves systematically adjusting atomic positions to minimize the system’s total energy, which is critical for conducting simulations that explain the fundamental properties and behaviors of materials.
Carbon nanotubes (CNTs) have emerged as a cornerstone in nanotechnology research due to their extraordinary physical and chemical properties. Geometric optimization of CNTs is particularly useful for understanding and enhancing their potential applications [3,4,5,6,7]. One of the most relevant representatives is hydrogen storage, which has direct implications for developing advanced energy storage technologies [8,9,10,11,12]. In this field, integrating computational simulation tools has proven to be a powerful strategy for innovation in the design and study of nanostructured materials [13,14,15].
However, the geometric optimization of CNTs is computationally intensive, with the processing time increasing substantially as the number of atoms in the system grows [16]. This computational challenge arises from the necessity to locate the global minimum energy state amidst numerous local minima [17]. For CNTs, this complexity can impede the feasibility of large-scale simulations, especially in scenarios involving intricate molecular structures. Thus, exploring innovative methods to accelerate geometric optimization without compromising accuracy is imperative.
To address these challenges, this study investigates the potential of deep neural networks (DNNs) to reduce the computational cost associated with the geometric optimization of CNTs. The research focuses on a hybrid approach where DNNs generate preoptimized (suboptimal) configurations of CNTs, which serve as starting points for further optimization using the software BIOVIA Materials Studio®, version 2024, CASTEP Module [18].
In this work, BIOVIA Materials Studio® was used to construct and optimize the geometry of CNTs. This software is widely employed in materials science for its powerful molecular modeling tools and its support for universal force fields, including Lennard-Jones potentials [19]. The licensed availability of this tool at our institution facilitated its use. The generated CNT structures were used for data extraction and validating our artificial neural network (ANN) predictions.
In this sense, the methodology proposed in this work comprises three phases:
1. Initial CNT configurations are generated using BIOVIA Materials Studio®; these are then used to train the DNN models.
2. Suboptimal configurations of CNTs are produced using the trained DNNs. These configurations are subsequently refined through optimization in BIOVIA Materials Studio®.
3. The effectiveness of using DNNs as a precursor step to optimization in BIOVIA Materials Studio® is evaluated.
The proposed approach contributes significantly to the geometric optimization of complex systems by introducing deep neural networks as a preprocessing step, thereby reducing the computational time required by BIOVIA Materials Studio®. The reported time savings, ranging from 39.68% to 90.62%, are based on benchmark experiments carried out in this study, comparing conventional optimization procedures with the ANN-assisted approach across various CNT configurations.
Thus, this study demonstrates the viability of integrating artificial intelligence into materials science workflows, providing a more efficient exploration of CNT properties. The research aims to accelerate innovation in designing and applying advanced materials by addressing computational bottlenecks using deep learning tools.

2. Related Works

In recent years, advancements in artificial intelligence, particularly in artificial neural networks (ANNs), have revolutionized various research domains, including materials science [20,21]. In this field, ANNs have successfully contributed to developing new materials, such as carbon nanotubes, frequently studied through computational approaches to uncover their potential and utility in diverse applications. This section reviews notable studies highlighting the application of ANNs for improving and discovering new materials.
Vivanco et al. [22] examined machine learning techniques for modeling carbon nanotubes, emphasizing the use of ANNs, support vector machines (SVMs), and random forests to analyze these nanostructures. Valentina et al. [23] utilized multilayer perceptrons (MLPs) and one-dimensional convolutional neural networks (1D-CNNs) to predict stress–strain curves in CNTs, achieving high accuracy in their predictions.
Similarly, Fakhrabadi et al. [24] applied an MLP neural network to predict fundamental vibrational frequencies in CNTs, while Kaushal et al. [25] developed an ANN-based model to forecast the yield and diameter of single-walled carbon nanotubes (SWCNTs) with over 90% accuracy. Akbari et al. [26] explored the impact of gas interactions on the conductivity of CNTs by employing ANNs and SVMs to model current–voltage (I-V) characteristics, finding that SVMs delivered superior predictions.
Marko [27] developed an ANN model to predict mechanical properties based on data derived from molecular dynamics simulations, showcasing the precision of deep learning techniques. Anderson et al. [28] integrated molecular simulations with ANN methods to analyze hydrogen storage in porous crystals, enabling rapid preassessment of storage capacities.
In predicting nanomaterial properties, Salah et al. [29] utilized an MLP system to achieve a remarkable 99.7997% accuracy in forecasting electromagnetic absorption in polycarbonate and CNT films. Nguyen et al. [30] reported a significant enhancement in the thermal conductivity of hybrid nanofluids, emphasizing their relevance in heat transfer applications and applied neural networks for precise modeling.
Despite these advancements, the computational time required for developing carbon structures has received limited attention. A notable exception is the study by Aci and Avci [31], which employed feedforward neural networks (FFNN) and generalized regression neural networks (GRNN) for the geometric optimization of structures. Their approach achieved an 85% reduction in the number of iterations required for calculations, underscoring the efficiency of these techniques in computational optimization and highlighting their potential for similar applications.
The successful application of deep learning neural networks in predicting properties and optimizing materials demonstrates their potential to significantly reduce simulation times. These advancements open new opportunities for accelerating research and development in advanced materials while enabling efficient exploration of nanostructures such as CNTs and other innovative materials.

3. Theoretical Foundations

3.1. Carbon Nanotubes (CNTs)

Carbon nanotubes (CNTs) are one-dimensional materials composed of graphite planes rolled into cylindrical shapes with diameters at the nanometric scale. They can be classified as single-walled carbon nanotubes (SWCNTs) or multi-walled carbon nanotubes (MWCNTs). SWCNTs consist of a single graphene layer seamlessly rolled into a cylinder, while MWCNTs comprise multiple concentric cylindrical graphene layers [32].
Graphene sheets are planes of carbon (C) atoms arranged in a hexagonal lattice. Each carbon atom forms a covalent C-C bond with its nearest neighbors in this structure. The rolling of a graphene sheet to form a CNT is defined by a chiral vector $\mathbf{C}_h$ that determines the direction along which the graphene sheet is rolled into a tubular structure, as illustrated in Figure 1 [33]. This chiral vector is mathematically represented in Equation (1):
$\mathbf{C}_h = n\,\mathbf{a}_1 + m\,\mathbf{a}_2$ (1)
where $n$ and $m$ are integers referred to as the chiral indices and $\mathbf{a}_1$ and $\mathbf{a}_2$ are the unit vectors of the graphene lattice. The pair $(n, m)$ uniquely describes an SWCNT and governs its geometry, primarily its diameter $D$ and chiral angle $\theta$, as well as its physicochemical properties [33,34].
Due to the hexagonal symmetry of the graphene lattice and the chiral symmetry of the $(n, m)$ nanotubes, the chiral angle of a CNT is limited to the range $0° \le \theta \le 30°$. Depending on the value of $\theta$, CNTs are categorized into specific configurations: (1) zigzag nanotubes ($m = 0$, $\theta = 0°$), resulting in a straight-line pattern along the tube circumference; (2) armchair nanotubes ($n = m$, $\theta = 30°$), forming a symmetric “chair-like” pattern; and (3) chiral nanotubes ($0° < \theta < 30°$), which are asymmetrical configurations and represent the majority of possible nanotube structures [35].
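As an illustration of these relations, the diameter and chiral angle implied by a pair of chiral indices can be computed directly. The following Python sketch (the helper names are ours, not from the paper) assumes the 1.42 Å C-C bond length adopted later in Section 4.1:

```python
from math import sqrt, atan, degrees, pi

A_CC = 1.42                  # C-C bond length in Angstrom (paper's default)
A_LATTICE = A_CC * sqrt(3)   # graphene lattice constant |a1| = |a2|

def diameter(n, m):
    """Tube diameter D = |Ch| / pi in Angstrom."""
    return A_LATTICE * sqrt(n * n + n * m + m * m) / pi

def chiral_angle(n, m):
    """Chiral angle theta in degrees, within 0 <= theta <= 30."""
    return degrees(atan(sqrt(3) * m / (2 * n + m)))

def classify(n, m):
    """Zigzag (m = 0), armchair (n = m), or chiral otherwise."""
    if m == 0:
        return "zigzag"
    if n == m:
        return "armchair"
    return "chiral"
```

For the (2,1) tube studied in Section 4.1, `diameter(2, 1)` gives roughly 2.07 Å, consistent with the sub-0.5 nm diameters discussed in Section 4.1.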

3.2. Artificial Neural Networks

In this section, we explain the neural network architectures employed in this research to outline their fundamental characteristics, clarify their operational principles, and justify their selection in the context of the study.

3.2.1. Multilayer Perceptron

A Multilayer Perceptron (MLP) is a feedforward neural network composed of an input layer, one or more hidden layers, and an output layer. Each neuron computes a weighted sum of its inputs and applies a nonlinear activation function $\sigma$, enabling the network to model complex relationships and approximate continuous functions [36]. The output $y_j$ of a hidden neuron $j$ is computed as:
$s_j = \sum_{i=1}^{n} w_{ji} x_i + b_j, \qquad y_j = \sigma(s_j)$ (2)
where $w_{ji}$ are the synaptic weights, $x_i$ are the inputs from the previous layer, $b_j$ is the bias, and $\sigma$ can be ReLU, sigmoid, or tanh.
The output layer neuron $k$ computes:
$y_k = \sigma\!\left( \sum_{j=1}^{m} w_{kj} y_j + b_k \right)$ (3)
MLPs are trained via supervised learning to minimize a loss function $E$ using backpropagation and gradient-based optimization (e.g., SGD, Adam) [37]. The update rule for the weights, given a learning rate $\eta$, is expressed as follows:
$\Delta w_{ji} = -\eta \dfrac{\partial E}{\partial w_{ji}}$ (4)
In this work, MLPs are selected for their proven ability to approximate complex functions, manage high-dimensional input data, and predict material properties efficiently. Their structure facilitates hierarchical feature extraction, making them suitable for the geometric optimization of carbon nanotubes.
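To make Equations (2)-(4) concrete, the following NumPy sketch implements a toy single-hidden-layer MLP regressor with a manual gradient step. The layer sizes, ReLU activation, and squared-error loss are illustrative choices, not the hyperparameters reported in the paper (those appear in Table 4):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(s):
    return np.maximum(0.0, s)

# Toy MLP: 3 inputs (x, y, z) -> 8 hidden units -> 3 outputs (x', y', z')
W1, b1 = rng.normal(size=(8, 3)) * 0.1, np.zeros(8)
W2, b2 = rng.normal(size=(3, 8)) * 0.1, np.zeros(3)

def forward(x):
    s = W1 @ x + b1            # Eq. (2): weighted sum s_j
    h = relu(s)                # nonlinear activation y_j = sigma(s_j)
    return W2 @ h + b2, h, s   # Eq. (3): linear output layer for regression

def sgd_step(x, target, eta=0.01):
    """One update Delta w = -eta * dE/dw (Eq. 4), E = 0.5 * ||y - t||^2."""
    global W1, b1, W2, b2
    y, h, s = forward(x)
    dy = y - target                       # dE/dy for squared error
    dW2, db2 = np.outer(dy, h), dy
    ds = (W2.T @ dy) * (s > 0)            # backprop through ReLU
    dW1, db1 = np.outer(ds, x), ds
    W2 -= eta * dW2; b2 -= eta * db2
    W1 -= eta * dW1; b1 -= eta * db1
    return 0.5 * float(dy @ dy)
```

Repeated calls of `sgd_step` on a fixed coordinate/target pair drive the loss down, mirroring the supervised training loop described in Section 4.2.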

3.2.2. Bidirectional Long Short-Term Memory Networks

Bidirectional Long Short-Term Memory Networks (BiLSTMs) are a powerful extension of Recurrent Neural Networks (RNNs) that process sequential data in both forward and backward directions, improving the modeling of long-range dependencies [38]. This dual context is particularly valuable when the output at each step depends on both past and future elements in the sequence.
A standard Long Short-Term Memory (LSTM) unit incorporates gating mechanisms to control the flow of information. At each time step $t$, the input gate $i_t$, forget gate $f_t$, output gate $o_t$, and cell state $C_t$ are described as follows:
$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$ (5)
$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$ (6)
$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$ (7)
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$ (8)
where the candidate cell state $\tilde{C}_t$ is computed as:
$\tilde{C}_t = \tanh(W_C x_t + U_C h_{t-1} + b_C)$ (9)
and the hidden state $h_t$ encodes the output of the LSTM at each time step, and it is expressed as:
$h_t = o_t \odot \tanh(C_t)$ (10)
where $\sigma$ is the sigmoid activation function, $\odot$ denotes element-wise multiplication, and $x_t$ and $h_t$ are the input and hidden state at time $t$, respectively. In a BiLSTM, two separate LSTMs traverse the input sequence:
$\overrightarrow{h}_t = \mathrm{LSTM}_{\mathrm{fwd}}(x_t, \overrightarrow{h}_{t-1}), \qquad \overleftarrow{h}_t = \mathrm{LSTM}_{\mathrm{bwd}}(x_t, \overleftarrow{h}_{t+1})$ (11)
and their outputs are concatenated:
$h_t = [\overrightarrow{h}_t; \overleftarrow{h}_t]$ (12)
This structure enables the network to capture comprehensive contextual information at each point in the sequence. The forward and backward LSTMs operate independently and do not share weights, increasing model flexibility and representational capacity [39,40].
In this study, BiLSTMs are employed to analyze CNT structures by capturing dependencies within their atomic configurations. This is particularly advantageous when processing chirality-encoded vectors, as structural features often rely on both preceding and succeeding elements in the CNT sequence.
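A minimal NumPy sketch of Equations (5)-(12) follows, with separate (non-shared) parameter sets for the two directions, as described above. The dimensions are toy values chosen for illustration, not the network sizes used in the study:

```python
import numpy as np

d_in, d_h = 3, 4  # toy input and hidden sizes

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_params(seed):
    """One (W, U, b) triple per gate: input i, forget f, output o, candidate c."""
    rng = np.random.default_rng(seed)
    return {g: (rng.normal(size=(d_h, d_in)) * 0.1,
                rng.normal(size=(d_h, d_h)) * 0.1,
                np.zeros(d_h)) for g in "ifoc"}

def lstm_step(p, x_t, h_prev, c_prev):
    """Eqs. (5)-(10): gate activations, cell update, hidden state."""
    z = {g: W @ x_t + U @ h_prev + b for g, (W, U, b) in p.items()}
    i, f, o = sigmoid(z["i"]), sigmoid(z["f"]), sigmoid(z["o"])
    c_t = f * c_prev + i * np.tanh(z["c"])   # element-wise products (Eq. 8)
    return o * np.tanh(c_t), c_t             # hidden state (Eq. 10)

def bilstm(xs, p_fwd, p_bwd):
    """Independent forward/backward passes, concatenated per step (Eq. 12)."""
    h = c = np.zeros(d_h); fwd = []
    for x in xs:
        h, c = lstm_step(p_fwd, x, h, c); fwd.append(h)
    h = c = np.zeros(d_h); bwd = []
    for x in reversed(xs):
        h, c = lstm_step(p_bwd, x, h, c); bwd.append(h)
    return [np.concatenate(hh) for hh in zip(fwd, reversed(bwd))]
```

Each output vector has dimension $2 d_h$, reflecting the concatenation of the forward and backward hidden states at that position in the atomic sequence.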

3.2.3. Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are a class of neural networks well-suited for identifying patterns in structured data such as images or sequences. Their architecture generally includes convolutional layers for feature extraction, followed by fully connected layers for classification or regression tasks [41,42,43].
In this study, CNNs are adapted to process vector-based representations of CNTs using one-dimensional (1D) convolutions. Each convolutional layer applies a set of filters $K^k$ across the input sequence $x$, with $m$ being the offset (or index traversing the input), generating feature maps $f_{ij}^{k}$ as follows:
$f_{ij}^{k} = (K^k * x)_{ij} = \sum_{m} K_{m}^{k}\, x_{(i+m)}$ (13)
Unlike traditional CNNs, pooling layers are omitted to preserve input dimensionality and retain all geometric and physical information during convolution.
The output of the convolutional layers is passed to fully connected layers, where each neuron $z_j$ computes:
$z_j = \sum_{i} w_{ji} a_i + b_j$ (14)
where $a_i$ denotes the activations from the previous layer, $w_{ji}$ the synaptic weights, and $b_j$ the bias term.
The use of CNNs in this context enables efficient processing of CNT data by using local spatial correlations in the atomic structure. Their ability to automatically learn hierarchical representations makes them particularly effective for modeling the structural properties of CNTs. Moreover, by using 1D convolutions and excluding pooling layers, the network retains all relevant geometric and physical information during feature extraction, ensuring that critical details are preserved.
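Without pooling, the operation in Equation (13) reduces to a "valid" cross-correlation whose output shrinks only by the filter width. A minimal sketch (the function name is ours) for a single 1D filter:

```python
import numpy as np

def conv1d_valid(x, k):
    """Eq. (13): f_i = sum_m k[m] * x[i + m], no pooling.

    Output length is len(x) - len(k) + 1, so positions are never discarded
    the way a pooling layer would discard them."""
    n_out = len(x) - len(k) + 1
    return np.array([np.dot(k, x[i:i + len(k)]) for i in range(n_out)])
```

For example, applying the edge-detecting filter `[1, 0, -1]` to `[1, 2, 3, 4, 5]` yields `[-2, -2, -2]`, matching NumPy's `np.convolve` with a flipped kernel (convolution versus cross-correlation differ only in kernel orientation).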

4. Experimental Design

The methodology of this study is divided into three phases to ensure a systematic analysis of the computational efficiency of the proposed ANN-based approach for CNT geometric optimization.
Phase I: Involves generating the dataset comprising initial and optimized CNT molecules. These structures are obtained using BIOVIA Materials Studio® (CASTEP module), which performs first-principles simulations to optimize the geometric properties of the CNTs.
Phase II: Suboptimized CNT structures are generated using selected ANN architectures. Subsequently, these suboptimized structures are used as the starting point for geometric optimization via the CASTEP module.
Phase III: The computational times required for the geometric optimization of CNTs are analyzed. Two approaches are evaluated: one using the structures directly optimized in the CASTEP module and the other using the suboptimized CNTs generated by the ANN as the initial input.

4.1. Phase I: Dataset Generation

The dataset for training the selected ANN models was generated by first constructing CNTs in BIOVIA Materials Studio®, where key parameters such as atomic type (carbon) and bond length (1.42 Å as the default value) were specified. These initial structures were then subjected to geometric optimization using the CASTEP module, with the electronic energy tolerance parameter set to $1 \times 10^{-5}$ eV per atom to ensure high precision in the computational results.
All CNT structures in this study were optimized using energy minimization protocols at 0 K, as implemented by default in BIOVIA Materials Studio®. This approach allows for the determination of equilibrium geometries without thermal noise.
While temperature effects are crucial in atomistic simulations involving molecular dynamics or thermal stability assessments, they were not considered here, as the objective was to evaluate the structural prediction accuracy of the ANN under idealized conditions. Moreover, since many CNT synthesis techniques aim for processes compatible with room temperature or mild conditions, the resulting geometries remain relevant for experimental applications [44].
As a reference, Figure 2 illustrates a CNT with chirality $n = 2$ and $m = 1$. Chirality parameters define how the graphene sheet is rolled to form the nanotube. The specific CNT depicted contains 28 atoms, and its spatial coordinates $(x, y, z)$ were used as inputs for the CASTEP module. After optimization, the corresponding coordinates $(x', y', z')$ served as the output targets.
The number of atoms corresponds to the number of carbon atoms within the translational unit cell of the chiral CNT (2,1). This value results from the construction of the nanotube based on its chiral vector $\mathbf{C}_h = 2\mathbf{a}_1 + \mathbf{a}_2$ and the corresponding translational vector along the tube axis. Thus, the unit cell formed is the smallest repeating structure that fully captures the atomic arrangement of this chirality.
The number of atoms in the unit cell of a chiral carbon nanotube is given by Equation (15) [45]:
$N = \dfrac{4(n^2 + nm + m^2)}{d_R}$ (15)
where $d_R$ is the greatest common divisor of $2n + m$ and $2m + n$. The (2,1) chirality then yields $N = 28$ atoms, corresponding to the minimal translational unit cell along the nanotube axis.
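Assuming the standard expression $N = 4(n^2 + nm + m^2)/d_R$ with $d_R = \gcd(2n + m,\, 2m + n)$, a short check reproduces the atom counts stated in the text for the (2,1) and (5,3) tubes:

```python
from math import gcd

def atoms_in_unit_cell(n, m):
    """Eq. (15): carbon atoms in the translational unit cell of an (n, m) CNT."""
    d_r = gcd(2 * n + m, 2 * m + n)
    return 4 * (n * n + n * m + m * m) // d_r
```

For instance, `atoms_in_unit_cell(2, 1)` returns 28 and `atoms_in_unit_cell(5, 3)` returns 196, matching the counts used throughout Section 5; the armchair (2,2) tube mentioned in connection with Zhao et al. [46] has 8 atoms per cell.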
Table 1 details the spatial coordinates of the 28 atoms comprising the CNT. The first three columns list the unoptimized CNT's initial coordinates $(x, y, z)$, while the last three columns list the optimized coordinates $(x', y', z')$ after processing with the Materials Studio software. Eight CNTs with different chiralities were then constructed, resulting in structures with varying numbers of atoms, as shown in Table 2.
It is important to note that the carbon nanotubes analyzed in Table 2, especially those with chiral indices (2,1), (3,1), etc., correspond to structures with very small diameters (i.e., diameter < 0.5 nm). While valid from a theoretical standpoint, these geometries are known to present high curvature-induced strain, which can lead to reduced thermodynamic stability under ambient conditions.
Nonetheless, such ultra-small CNTs have been predicted and, in some cases, observed experimentally, particularly in confined environments or when synthesized using specialized techniques [45]. Zhao et al. [46] observed, using high-resolution transmission electron microscopy, that a stable 3 Å carbon nanotube can be grown inside a multi-walled carbon nanotube (MWCNT); density functional calculations indicated that this 3 Å CNT is the armchair (2,2) CNT. While the present study does not directly address that observation, it provides evidence that very small carbon nanotubes can be stable and observable under certain conditions.
The study of low-chirality CNTs remains relevant for understanding the limits of CNT structural stability and for validating computational models of nanotube behavior. Moreover, other studies suggest that carbon nanotubes with low chiralities warrant study because they exhibit unique electronic, optical, and mechanical properties, show a strong preference in growth processes, and are crucial for high-purity separation and specific applications [34,47,48].
The datasets created under conditions specified in Table 1 and Table 2 are grouped into three categories, as follows:
  • CNT-BCO: Contains both the initial (random) CNT structures and those optimized by the CASTEP module of the BIOVIA Materials Studio® software.
  • CNT-BCO2: Includes the initial CNTs and those optimized twice using the CASTEP module to augment the data volume.
  • CNT-BCO-ALL: A comprehensive dataset combining the first two datasets.
The three categories were generated based on the process in Table 3. First, we used the base CNT and the BIOVIA Materials Studio® software to optimize the atom coordinates, producing the optimized CNT dataset. Afterward, we processed the base CNT in BIOVIA Materials Studio® twice, thus creating a new augmented dataset (CNT-BCO2). CNT-BCO-ALL contains the datasets from the two previous stages.
These datasets were the foundation for ANN training, enabling the models to learn from the initial and optimized CNT geometries. By incorporating diverse examples, the datasets ensured the robustness of the ANN predictions, ultimately enhancing the geometric optimization process of CNTs.

4.2. Phase II: Construction of Models for Generating Suboptimized CNTs

To generate suboptimized CNTs, the three ANN architectures explained in Section 3.2 were employed: a Multilayer Perceptron (MLP), a Bidirectional Long Short-Term Memory (BiLSTM) network, and a 1D Convolutional Neural Network (1D-CNN). These architectures were configured with the hyperparameters detailed in Table 4.
The datasets, previously presented in Table 3, were split into three subsets to ensure proper training, evaluation, and validation of the models: (a) 60% for training, used to adjust the model’s parameters during the learning phase, (b) 30% for testing, utilized to evaluate the model’s performance and prevent over-fitting, and (c) 10% for validation, applied during training to monitor generalization and fine-tune hyperparameters.
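The 60/30/10 partition can be sketched as follows. This is a generic shuffled split; the paper does not specify the exact splitting procedure or random seed, so both are illustrative assumptions:

```python
import numpy as np

def split_dataset(X, y, seed=0):
    """Shuffle and split a dataset 60/30/10 into train/test/validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_tr, n_te = int(0.6 * len(X)), int(0.3 * len(X))
    tr, te, va = idx[:n_tr], idx[n_tr:n_tr + n_te], idx[n_tr + n_te:]
    return (X[tr], y[tr]), (X[te], y[te]), (X[va], y[va])
```

Shuffling before splitting helps each subset cover the range of initial and optimized geometries, supporting the generalization monitoring described below.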
Each ANN was trained using a supervised learning approach. The Early Stopping criterion was implemented to optimize computational efficiency and ensure convergence. This mechanism halts training if the validation loss does not improve for ten consecutive epochs, thereby preventing over-fitting and excessive computation.
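The Early Stopping rule just described (halt after ten consecutive epochs without validation improvement) can be expressed as a small loop; `step_fn` and `val_loss_fn` are hypothetical callbacks standing in for one training epoch and a validation pass, respectively:

```python
def train_with_early_stopping(step_fn, val_loss_fn, max_epochs=500, patience=10):
    """Stop when validation loss fails to improve for `patience` epochs.

    Returns the best validation loss seen and the number of epochs run."""
    best, since_best = float("inf"), 0
    for epoch in range(max_epochs):
        step_fn()                 # one training epoch
        loss = val_loss_fn()
        if loss < best:
            best, since_best = loss, 0
        else:
            since_best += 1
            if since_best >= patience:
                break             # patience exhausted: halt training
    return best, epoch + 1
```

With a validation loss that improves for three epochs and then plateaus, training halts after exactly thirteen epochs: three improving ones plus the ten-epoch patience window.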
The primary goal of this phase was to generate suboptimized CNT structures that serve as intermediate geometries between the unoptimized and fully optimized states. These were later used in Phase III, allowing for a detailed comparison of computational efficiency in contrast with the original processed datasets (see Section 4.1).
This systematic approach ensured that each ANN architecture was rigorously tested and could reliably produce suboptimized CNTs, thus reducing the processing time in the final optimization process with Materials Studio software.

5. Results and Discussion

The results focus on Phase III, which evaluates the effectiveness of the neural network models as a preoptimization step for CNTs. After training the ANN models with the CNTs detailed in Table 2, suboptimized structures were generated. These were then used as initial inputs for geometric optimization via the CASTEP module (BIOVIA Materials Studio®), aiming to assess the reduction in computational time.
Table 5 summarizes the results obtained using the BiLSTM network. It demonstrated a notable reduction in computation time, ranging from 3.69% for the CNT with 196 atoms to 65.23% for the CNT with 84 atoms. This trend indicates that the reduction percentage decreases as the number of atoms in the CNTs increases. Similar behavior was observed in the other neural network models, albeit to a lesser extent.
In Table 6, results from six trials conducted on a CNT with chirality $n = 5$, $m = 3$ (196 atoms) are summarized. Identical hyperparameters were configured across tests; the ANN used was the 1D-CNN, and the dataset for this experiment was CNT-BCO2. Five iterations were performed on the initial CNT dataset. The reference time for estimating time savings was 6342.31 s, which was the CASTEP module processing time without ANN preprocessing. Each trial varies the number of iterations on the suboptimized CNT, as depicted in the first column.
It must be noted in Table 6 that performance varied notably among the trials. For instance, in trial 2 (second row), the network achieved a modest 1.11% reduction in computational time, indicating limited learning effectiveness. In contrast, trials 3 and 6 (highlighted in bold) yielded substantial improvements, with up to a 52.11% reduction in computational time. Note also that the suboptimized CNT with 7 iterations (first row) increased the computational time.
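For reference, the percentages in Tables 5-7 presumably follow the usual relative-savings formula; a negative value then corresponds to a slowdown such as the one seen in the first row of Table 6 (the helper name is ours):

```python
def time_reduction(t_baseline, t_with_ann):
    """Percent time saved relative to the CASTEP-only baseline.

    Negative values mean the ANN-assisted run was slower than the baseline."""
    return 100.0 * (t_baseline - t_with_ann) / t_baseline
```

Against the 6342.31 s baseline, a run finishing in about 3037 s corresponds to the reported 52.11% reduction.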
Following this variability analysis, the most favorable results in terms of time reduction were selected for CNTs with varying chiralities. These are summarized in Table 7, where the suboptimization with 1D-CNN and MLP architectures was also tested. Datasets CNT-BCO and CNT-BCO2 were used for this analysis.
For smaller CNTs (28, 52, and 76 atoms), the 1D-CNN architecture was the most effective overall. A notable exception was the CNT with chirality n = 4 and m = 1 , where the MLP network outperformed others, achieving a remarkable 68.27% reduction in computational time. Notably, for the CNT with 28 atoms ( n = 2 , m = 1 ), the 1D-CNN achieved an outstanding 90.62% reduction, marking the highest efficiency in time cost improvement across all tested cases.
On the other hand, for larger CNTs (148, 124, and 196 atoms), the MLP architecture demonstrated superior efficiency in most scenarios. However, for the CNT with chirality n = 5 , m = 2 , the 1D-CNN surpassed the MLP, achieving a 55.93% reduction in computational time. From a general point of view, in many cases, computational time reductions exceeded 50%, demonstrating a significant enhancement in the optimization process’s efficiency in terms of saving time.
The results confirm that deep ANNs, particularly 1D-CNNs, and MLPs significantly reduce the computational time for geometric optimization by generating suboptimized CNT structures that serve as effective starting points for CASTEP. These reductions, reaching up to 90.62% (CNT with 28 atoms), represent a breakthrough in computational efficiency for material simulations. The demonstrated effectiveness of these ANN-based methods highlights their potential for accelerating similar processes in nanotechnology and materials science.
Nonetheless, as shown in Table 5, Table 6 and Table 7, the relative time savings achieved through neural network preprocessing decrease as the chirality indices increase (and thus, the number of atoms). This effect becomes more pronounced starting from chirality (5,3), where the computational overhead of the neural network inference and the subsequent geometric optimization in BIOVIA Materials Studio® begins to surpass the initial speedup. This behavior is anticipated, as the scaling of the energy minimization process is highly sensitive to system size.
Despite this, exploring CNTs with a wide range of chiralities, including small-diameter variants, remains relevant. Small CNTs such as (2,2), (3,3), and (4,2), though less frequently synthesized, have been shown to exhibit mechanical stability under specific conditions. For example, Zhao et al. [46] reported that such structures can be stable despite higher strain energy than larger tubes. In other work, Peng et al. [49] found that CNTs with diameters as small as 0.33 nm can remain stable at temperatures exceeding 1000 °C. Including these cases in the analysis allows us to identify the threshold at which the preprocessing method begins to lose efficiency, providing valuable information for future scalability studies.
Therefore, while larger and more stable CNTs hold clear experimental and technological interest, the present work offers a broader perspective on how neural networks operate across various structural configurations, aiding in defining practical boundaries for their application in computational materials design.

6. Conclusions

This study has demonstrated that deep-learning-based artificial neural networks can reduce the computational time for geometric optimization of CNTs by up to 90.62%, significantly reducing the time consumed by the CASTEP module of Materials Studio. This methodology could be extended to a broader range of carbon nanotube chiralities and possibly to other molecular systems. With further refinement of ANN architectures and weight optimization, achieving even greater reductions in computation time is plausible, particularly for more complex molecular structures. Moreover, once trained, these ANNs do not necessarily require retraining for similar tasks, further enhancing computational efficiency.
Unlike previous studies, such as the work of Aci and Avci [31], which focused on reducing the number of iterations using ANNs, this work implemented deep-learning-based ANNs, including MLP, BiLSTM, and 1D-CNN architectures. Experimental results indicate that these models demonstrate a more accurate learning ability for the numerical patterns that define the energy minimum of molecular structures. This improvement directly leads to a significant decrease in computational time related to geometric optimization.
One notable challenge identified is the dependence on proprietary software, specifically the Material Studio platform and its CASTEP module, for generating the dataset used in ANN training. This reliance presents limitations in scalability and accessibility for broader applications. Future research could explore alternative strategies, such as using unsupervised or reinforcement learning approaches. These methods could potentially eliminate dependency on specific simulation tools while maintaining or even improving the computational efficiency and accuracy of the optimization process.
To successfully implement these advanced methodologies, it would be important to consider the numerical methods and algorithms CASTEP utilizes during geometric optimization. By integrating these principles into the training process, future models could further enhance their ability to predict optimal configurations while reducing computational overhead.

Author Contributions

Conceptualization, L.J.V.-R. and R.A.-E.; methodology, F.D.R.-L. and D.V.-V.; software, L.J.V.-R. and D.V.-V.; validation, S.M.M.-G., R.A.-E. and E.E.G.-G.; formal analysis, D.V.-V. and R.A.-E.; investigation, L.J.V.-R. and S.M.M.-G.; writing—original draft preparation, L.J.V.-R.; writing—review and editing, E.E.G.-G. and R.A.-E. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors would like to extend their sincere appreciation to the Tecnológico Nacional de México and the Instituto Tecnológico de Toluca for the support provided. The first author acknowledges SECIHTI for the support through grant 845048.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Reveles, J.U.; Köster, A.M. Geometry optimization in density functional methods. J. Comput. Chem. 2004, 25, 1109–1116. [Google Scholar] [CrossRef] [PubMed]
  2. Clark, S.J.; Segall, M.D.; Pickard, C.J.; Hasnip, P.J.; Probert, M.I.J.; Refson, K.; Payne, M.C. First principles methods using CASTEP. Z. Krist. Cryst. Mater. 2005, 220, 567–570. [Google Scholar] [CrossRef]
  3. Sui, Y.; Sun, M.; Wang, Y.; Xu, Z.; Yan, J.; Liu, H. DFT study of B substitution on the hydrogen storage properties of pt-modified conical cup-stacked carbon nanotube. Comput. Theor. Chem. 2024, 1231, 114402. [Google Scholar] [CrossRef]
  4. Karki, S.; Chakraborty, S.N. Hydrogen adsorption in nanotube and cylindrical pore: A grand canonical Monte Carlo simulation study. Int. J. Hydrogen Energy 2023, 48, 2731–2741. [Google Scholar] [CrossRef]
  5. Prasad, A.; Gupta, A.; Kumar, N. Design and validation of Clathrate-CNT systems for solid state hydrogen storage. Int. J. Hydrogen Energy 2023, 48, 7814–7827. [Google Scholar] [CrossRef]
  6. Demir, S.; Fellah, M.F. Carbon nanotubes doped with Ni, Pd and Pt: A density functional theory study of adsorption and sensing NO. Surf. Sci. 2020, 701, 121689. [Google Scholar] [CrossRef]
  7. Liu, Y.; Peng, Y.; An, B.; Li, L.; Liu, Y. Effect of molecular structure on the adsorption affinity of sulfonamides onto CNTs: Batch experiments and DFT calculations. Chemosphere 2020, 246, 125778. [Google Scholar] [CrossRef] [PubMed]
  8. Aleem, A.; Perveen, F. Hydrogen production and storage through adsorption and dissociation of H2O on pristine and functionalized SWCNT: A DFT approach. J. Mol. Model. 2023, 29, 305. [Google Scholar] [CrossRef]
  9. Yang, Z.; Guo, Z.; Yuan, C.; Bai, X. Tribological behaviors of composites reinforced by different functionalized carbon nanotube using molecular dynamic simulation. Wear 2021, 476, 203669. [Google Scholar] [CrossRef]
  10. Lyu, J.; Kudiiarov, V.; Lider, A. An Overview of the Recent Progress in Modifications of Carbon Nanotubes for Hydrogen Adsorption. Nanomaterials 2020, 10, 255. [Google Scholar] [CrossRef]
  11. Sdanghi, G.; Canevesi, R.L.S.; Celzard, A.; Thommes, M.; Fierro, V. Characterization of Carbon Materials for Hydrogen Storage and Compression. C 2020, 6, 46. [Google Scholar] [CrossRef]
  12. Bi, L.; Yin, J.; Huang, X.; Wang, Y.; Yang, Z. Graphene pillared with hybrid fullerene and nanotube as a novel 3D framework for hydrogen storage: A DFT and GCMC study. Int. J. Hydrogen Energy 2020, 45, 17637–17648. [Google Scholar] [CrossRef]
  13. Li, Q.; Lu, Y.; Luo, Q.; Yang, X.; Yang, Y.; Tan, J.; Dong, Z.; Dang, J.; Li, J.; Chen, Y.; et al. Thermodynamics and kinetics of hydriding and dehydriding reactions in Mg-based hydrogen storage materials. J. Magnes. Alloys 2021, 9, 1922–1941. [Google Scholar] [CrossRef]
  14. Shi, M.; Bi, L.; Huang, X.; Meng, Z.; Wang, Y.; Yang, Z. Design of three-dimensional nanotube-fullerene-interconnected framework for hydrogen storage. Appl. Surf. Sci. 2020, 534, 147606. [Google Scholar] [CrossRef]
  15. Bi, L.; Ding, J.; Zou, J.; Nie, M.; Xu, Y.; Yin, J.; Huang, X.; Yang, Z.; Wang, Y. DFT study of hydrogen sorption on light metal (Li, Be, and Na) decorated novel fullerene-CNTs networks. Appl. Surf. Sci. 2021, 569, 151000. [Google Scholar] [CrossRef]
  16. Kan, B.; Tian, Y.; Xie, D.; Wu, Y.; Fan, Y.; Shang, H. Solving the Electronic Schrödinger Equation by Pairing Tensor-Network State with Neural Network Quantum State. Mathematics 2024, 12, 433. [Google Scholar] [CrossRef]
  17. Schlegel, H.B. Geometry optimization. Comput. Mol. Sci. 2011, 1, 790–809. [Google Scholar] [CrossRef]
  18. BIOVIA. Materials Studio (Version 2024, CASTEP Module) [Software]. Dassault Systèmes. 2024. Available online: https://www.3ds.com/products-services/biovia/products/materials-studio/ (accessed on 29 April 2025).
  19. Pérez-Álvarez, M.; Sánchez-Ruíz, F.J.; Domínguez, H.; Vicente-Hinestroza, L.; Illescas, J.; Martínez-Gallegos, S. Molecular dynamics model quantum field for prediction of the interaction between chitosan–silver nanoparticles. Mol. Simul. 2024, 50, 1220–1232. [Google Scholar] [CrossRef]
  20. Ragone, M.; Shahabazian-Yassar, R.; Mashayek, F.; Yurkiv, V. Deep learning modeling in microscopy imaging: A review of materials science applications. Prog. Mater. Sci. 2023, 138, 101165. [Google Scholar] [CrossRef]
  21. Thakkar, P.; Khatri, S.; Dobariya, D.; Patel, D.; Dey, B.; Singh, A.K. Advances in materials and machine learning techniques for energy storage devices: A comprehensive review. J. Energy Storage 2024, 81, 110452. [Google Scholar] [CrossRef]
  22. Vivanco-Benavides, L.E.; Martínez-González, C.L.; Mercado-Zúñiga, C.; Torres-Torres, C. Machine learning and materials informatics approaches in the analysis of physical properties of carbon nanotubes: A review. Comput. Mater. Sci. 2022, 201, 110939. [Google Scholar] [CrossRef]
  23. Košmerl, V.; Štajduhar, I.; Čanađija, M. Predicting stress–strain behavior of carbon nanotubes using neural networks. Neural Comput. Appl. 2022, 34, 17821–17836. [Google Scholar] [CrossRef]
  24. Fakhrabadi, M.M.S.; Samadzadeh, M.; Rastgoo, A.; Yazdi, M.H.; Mashhadi, M.M. Vibrational analysis of carbon nanotubes using molecular mechanics and artificial neural network. Phys. E Low-Dimens. Syst. Nanostruct. 2011, 44, 565–578. [Google Scholar] [CrossRef]
  25. Kaushal, A.; Alexander, R.; Rao, P.; Prakash, J.; Dasgupta, K. Artificial neural network, Pareto optimization, and Taguchi analysis for the synthesis of single-walled carbon nanotubes. Carbon Trends 2021, 2, 100016. [Google Scholar] [CrossRef]
  26. Akbari, E.; Buntat, Z.; Enzevaee, A.; Ebrahimi, M.; Yazdavar, A.H.; Yusof, R. Analytical modeling and simulation of I–V characteristics in carbon nanotube based gas sensors using ANN and SVR methods. Chemom. Intell. Lab. Syst. 2014, 137, 173–180. [Google Scholar] [CrossRef]
  27. Čanađija, M. Deep learning framework for carbon nanotubes: Mechanical properties and modeling strategies. Carbon 2021, 184, 891–901. [Google Scholar] [CrossRef]
  28. Anderson, G.; Schweitzer, B.; Anderson, R.; Gómez-Gualdrón, D.A. Attainable Volumetric Targets for Adsorption-Based Hydrogen Storage in Porous Crystals: Molecular Simulation and Machine Learning. J. Phys. Chem. C 2018, 123, 120–130. [Google Scholar] [CrossRef]
  29. Salah, L.S.; Chouai, M.; Danlée, Y.; Huynen, I.; Ouslimani, N. Simulation and Optimization of Electromagnetic Absorption of Polycarbonate/CNT Composites Using Machine Learning. Micromachines 2020, 11, 778. [Google Scholar] [CrossRef]
  30. Nguyen, Q.; Rizvandi, R.; Karimipour, A.; Malekahmadi, O.; Bach, Q.V. A Novel Correlation to Calculate Thermal Conductivity of Aqueous Hybrid Graphene Oxide/Silicon Dioxide Nanofluid: Synthesis, Characterizations, Preparation, and Artificial Neural Network Modeling. Arab. J. Sci. Eng. 2020, 45, 9747–9758. [Google Scholar] [CrossRef]
  31. Aci, M.; Avci, M. Artificial neural network approach for atomic coordinate prediction of carbon nanotubes. Appl. Phys. A 2016, 122, 631. [Google Scholar] [CrossRef]
  32. Sanders, D.F.; Smith, Z.P.; Guo, R.; Robeson, L.M.; McGrath, J.E.; Paul, D.R.; Freeman, B.D. Energy-efficient polymeric gas separation membranes for a sustainable future: A review. Polymer 2013, 54, 4729–4761. [Google Scholar] [CrossRef]
  33. Yang, F.; Wang, M.; Zhang, D.; Yang, J.; Zheng, M.; Li, Y. Chirality Pure Carbon Nanotubes: Growth, Sorting, and Characterization. Chem. Rev. 2020, 120, 2693–2758. [Google Scholar] [CrossRef] [PubMed]
  34. Artyukhov, V.I.; Penev, E.S.; Yakobson, B.I. Why nanotubes grow chiral. Nat. Commun. 2014, 5, 4892. [Google Scholar] [CrossRef]
  35. Gogotsi, Y. Nanomaterials Handbook; CRC Press: Boca Raton, FL, USA, 2006. [Google Scholar] [CrossRef]
  36. Lin, R.; Zhou, Z.; You, S.; Rao, R.; Kuo, C.C.J. Geometrical Interpretation and Design of Multilayer Perceptrons. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 2545–2559. [Google Scholar] [CrossRef] [PubMed]
  37. Werbos, P. Backpropagation through time: What it does and how to do it. Proc. IEEE 1990, 78, 1550–1560. [Google Scholar] [CrossRef]
  38. Graves, A.; Schmidhuber, J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. 2005, 18, 602–610. [Google Scholar] [CrossRef]
  39. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  40. Schuster, M.; Paliwal, K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681. [Google Scholar] [CrossRef]
  41. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
  42. Chen, L.; Li, S.; Bai, Q.; Yang, J.; Jiang, S.; Miao, Y. Review of Image Classification Algorithms Based on Convolutional Neural Networks. Remote Sens. 2021, 13, 4712. [Google Scholar] [CrossRef]
  43. Tian, Y. Artificial Intelligence Image Recognition Method Based on Convolutional Neural Network Algorithm. IEEE Access 2020, 8, 125731–125744. [Google Scholar] [CrossRef]
  44. Szabó, A.; Perri, C.; Csató, A.; Giordano, G.; Vuono, D.; Nagy, J.B. Synthesis Methods of Carbon Nanotubes and Related Materials. Materials 2010, 3, 3092–3140. [Google Scholar] [CrossRef]
  45. Dresselhaus, M.S.; Dresselhaus, G.; Eklund, P.C.; Rao, A.M. Carbon Nanotubes. In The Physics of Fullerene-Based and Fullerene-Related Materials; Physics and Chemistry of Materials with Low-Dimensional Structures; Lévy, F., Mooser, E., Andreoni, W., Eds.; Springer: Dordrecht, The Netherlands, 2000; Volume 23, pp. 331–379. [Google Scholar] [CrossRef]
  46. Zhao, X.; Liu, Y.; Inoue, S.; Suzuki, T.; Jones, R.O.; Ando, Y. Smallest Carbon Nanotube Is 3 Å in Diameter. Phys. Rev. Lett. 2004, 92, 125502. [Google Scholar] [CrossRef]
  47. Sanchez-Valencia, J.R.; Dienel, T.; Gröning, O.; Shorubalko, I.; Mueller, A.; Jansen, M.; Amsharov, K.; Ruffieux, P.; Fasel, R. Controlled synthesis of single-chirality carbon nanotubes. Nature 2014, 512, 61–64. [Google Scholar] [CrossRef] [PubMed]
  48. Han, F.; Li, L.; Qian, L.; Gao, Y.; Wu, Q.; Wang, Z.; Liu, H.; Zhang, J.; He, M. High-Temperature Growth of Chirality-Enriched, Highly Crystalline Carbon Nanotubes for Efficient Single-Chirality Separation. Adv. Funct. Mater. 2025, 35, 2419702. [Google Scholar] [CrossRef]
  49. Peng, L.M.; Zhang, Z.L.; Xue, Z.Q.; Wu, Q.D.; Gu, Z.N.; Pettifor, D.G. Stability of Carbon Nanotubes: How Small Can They Be? Phys. Rev. Lett. 2000, 85, 3249–3252. [Google Scholar] [CrossRef]
Figure 1. Representation of the basic chiral vector and how it defines the CNT geometry on the hexagonal carbon lattice (in black). In this example, the vector (3,6) describes a chiral nanotube. The rectangle represents the boundary of the carbon atoms used to form the nanotube (shown in blue).
Figure 2. Graphical representation of the 28-atom CNT. The red circle denotes a single carbon atom.
Table 1. Spatial coordinates of the atoms that make up the 28-atom CNT, with base (x, y, z) and optimized (x′, y′, z′) coordinates.
x      y      z      x′     y′     z′
0.716  0.646  0.114  0.755  0.671  0.113
0.601  0.720  0.209  0.624  0.756  0.209
0.286  0.440  0.066  0.253  0.432  0.067
0.331  0.538  0.185  0.306  0.538  0.184
0.284  0.354  0.280  0.255  0.332  0.281
0.676  0.703  0.328  0.710  0.744  0.328
0.505  0.694  0.423  0.517  0.733  0.423
0.601  0.720  0.542  0.615  0.761  0.542
0.409  0.629  0.637  0.393  0.655  0.637
0.716  0.646  0.780  0.745  0.668  0.781
0.676  0.703  0.995  0.705  0.728  0.995
0.399  0.280  0.042  0.385  0.240  0.043
0.591  0.371  0.137  0.607  0.345  0.137
0.495  0.306  0.256  0.505  0.280  0.256
0.669  0.462  0.352  0.701  0.456  0.353
0.286  0.440  0.399  0.255  0.434  0.398
0.324  0.297  0.471  0.295  0.273  0.471
0.284  0.354  0.614  0.245  0.329  0.613
0.399  0.280  0.709  0.376  0.244  0.709
0.505  0.694  0.756  0.495  0.720  0.756
0.331  0.538  0.544  0.299  0.544  0.544
0.409  0.629  0.645  0.393  0.655  0.637
0.591  0.371  0.971  0.743  0.932  0.970
0.495  0.306  0.789  0.557  0.893  0.889
0.669  0.462  0.873  0.702  0.891  0.868
0.286  0.440  0.345  0.257  0.246  0.314
0.324  0.297  0.475  0.245  0.288  0.445
0.714  0.560  0.830  0.718  0.758  0.898
Table 2. Summary of the base CNT dataset used in this research.
Chirality (m, n)   Number of Atoms   Data Amount
(2, 1)             28                168
(3, 1)             52                312
(3, 2)             76                456
(4, 1)             84                504
(4, 3)             148               888
(5, 1)             124               744
(5, 2)             156               936
(5, 3)             196               1176
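The Data Amount column in Table 2 scales linearly with the atom count: each entry equals six values per atom, which matches the three base coordinates plus the three optimized coordinates per atom listed in Table 1. This six-values-per-atom reading is our inference from the numbers, not stated explicitly; a short Python check makes it concrete:

```python
# Table 2 chiralities mapped to atom counts (values taken from the table).
cnts = {(2, 1): 28, (3, 1): 52, (3, 2): 76, (4, 1): 84,
        (4, 3): 148, (5, 1): 124, (5, 2): 156, (5, 3): 196}

# Assumed relation: 6 data values per atom (x, y, z and x', y', z').
data_amount = {chirality: atoms * 6 for chirality, atoms in cnts.items()}

print(data_amount[(2, 1)], data_amount[(5, 3)])  # 168 1176, as in Table 2
```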
Table 3. Composition of the three datasets generated after single and double processing in Materials Studio software.
Dataset       Input      Output
CNT-BCO       Base CNT   CNT optimized
CNT-BCO2      Base CNT   CNT optimized 2 times
CNT-BCO-ALL   Base CNT   CNT optimized and CNT optimized 2 times
Table 4. ANN parameter configuration summary.
Parameter             MLP      BiLSTM   1D-CNN
Layers                2        1        1
Neurons               50       25       200
Activation function   GELU     Swish    ReLU
Optimizer             Adamax   Adamax   Adamax
Epochs                1000     1000     1000
Batch                 32       32       32
η (learning rate)     0.001    0.001    0.001
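The MLP configuration in Table 4 (two hidden layers of 50 GELU neurons) can be illustrated with a minimal NumPy sketch of the forward pass. This is not the trained model: the input dimensionality (three fractional coordinates per atom, flattened, for the 28-atom CNT), the output layer, and the random weights are illustrative assumptions.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation used in the MLP (Table 4)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp_forward(coords, weights, biases):
    """Forward pass of a 2-hidden-layer MLP mapping base CNT coordinates
    to pre-optimized coordinates (architecture per Table 4; weights random)."""
    h = coords
    for W, b in zip(weights[:-1], biases[:-1]):
        h = gelu(h @ W + b)          # hidden layers with GELU
    return h @ weights[-1] + biases[-1]  # linear output layer

rng = np.random.default_rng(0)
n_in = 28 * 3                         # 28-atom CNT, 3 coordinates per atom
sizes = [n_in, 50, 50, n_in]          # input, 2 x 50 hidden, output
weights = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

base = rng.random(n_in)               # stand-in for a base CNT geometry
pred = mlp_forward(base, weights, biases)
print(pred.shape)                     # (84,)
```

In the study, such a network is trained (Adamax, η = 0.001, batch 32, 1000 epochs) so that `pred` approximates the CASTEP-optimized coordinates and serves as the starting geometry.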
Table 5. Best results in computational time savings in CASTEP geometric optimization of CNTs after BiLSTM suboptimization.
Chirality (m, n)   Atoms   Initial CASTEP Time (s)   Iterations Initial CNT   Iterations Suboptimized CNT   Computing Time (s)/Dataset   % Improved Time   Saved Time (s)
(2, 1)   28    473.11    17   2   195.27/CNT-BCO2    58.72%   277.84
(3, 1)   52    486.69    5    3   249.86/CNT-BCO     48.66%   236.83
(3, 2)   76    479.55    5    3   348.19/CNT-BCO2    27.39%   131.36
(4, 1)   84    1011.17   5    4   351.55/CNT-BCO2    65.23%   513.58
(4, 3)   148   2525.23   5    4   2001.80/CNT-BCO2   20.72%   523.43
(5, 1)   124   1946.78   5    4   1270.92/CNT-BCO2   34.71%   675.86
(5, 2)   156   3573.41   5    3   2651.84/CNT-BCO2   25.78%   921.57
(5, 3)   196   6342.31   5    3   6108.05/CNT-BCO2   3.69%    234.26
Table 6. Computational time savings in CASTEP geometric optimization of CNTs after 1D-CNN suboptimization. Chirality (5,3) was set for all cases.
Iterations Suboptimized CNT   Final CASTEP Time (s)   % Improved Time   Saved Time (s)
7   7116.17   -        -
6   6271.56   1.11%    70.75
3   3036.84   52.11%   3305.47
6   6322.44   0.31%    19.87
5   5070.62   20.05%   1271.69
3   3043.31   52.01%   3299
Table 7. Best results in computational time savings in CASTEP geometric optimization of CNTs after 1D-CNN and MLP suboptimization.
Chirality (m, n)/ANN   Atoms   Initial CASTEP Time (s)   Iterations Initial CNT   Iterations Suboptimized CNT   Computing Time (s)/Dataset   % Improved Time   Saved Time (s)
(2, 1)/1D-CNN   28    473.11    17   2   44.36/CNT-BCO      90.62%   428.75
(3, 1)/1D-CNN   52    486.69    5    3   153.02/CNT-BCO2    68.55%   337.67
(3, 2)/1D-CNN   76    479.55    5    3   221.14/CNT-BCO2    53.8%    254.41
(4, 1)/MLP      84    1011.17   5    4   320.81/CNT-BCO     68.27%   690.36
(4, 3)/MLP      148   2525.23   5    4   1523.19/CNT-BCO2   39.68%   1002.04
(5, 1)/MLP      124   1946.78   5    4   827.48/CNT-BCO2    57.49%   1119.3
(5, 2)/1D-CNN   156   3573.41   5    3   1574.78/CNT-BCO2   55.93%   1998.63
(5, 3)/MLP      196   6342.31   5    3   3036.84/CNT-BCO2   52.11%   3305.47
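The last two columns of Tables 5-7 follow arithmetically from the initial and suboptimized CASTEP times. A short Python helper reproduces them; the formulas (saved = initial − final, % improved = saved/initial) are our reading of the tables, consistent with the reported rows:

```python
def time_savings(initial_s, final_s):
    """Saved time (s) and percentage improvement, as reported in Tables 5-7:
    saved = initial - final, improvement = 100 * saved / initial."""
    saved = initial_s - final_s
    return saved, 100.0 * saved / initial_s

# Example: the (2,1), 28-atom CNT after 1D-CNN suboptimization (Table 7)
saved, pct = time_savings(473.11, 44.36)
print(f"{saved:.2f} s saved ({pct:.2f}%)")  # 428.75 s saved (90.62%)
```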
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
