Artificial Intelligence Applications for MEMS-Based Sensors and Manufacturing Process Optimization

Abstract: Micro-electromechanical systems (MEMS) technology-based sensors have found diverse fields of application due to advances in semiconductor manufacturing technology, which produces sensitive, low-cost, and powerful sensors. Because different electrical and mechanical components are fabricated on a single chip through complex process steps, MEMS sensors are prone to deterministic and random errors. Thus, testing, calibration, and quality control have become obligatory to maintain the quality and reliability of the sensors. This is where Artificial Intelligence (AI) can provide significant benefits, such as handling complex data, performing root cause analysis, efficient feature estimation, process optimization, product improvement, time-saving, automation, fault diagnosis and detection, drift compensation, and signal de-noising. Despite these benefits, the adoption of AI poses multiple challenges. This review paper provides a systematic, in-depth analysis of AI applications in the MEMS-based sensors field for both product-level and system-level adaptability by analyzing more than 100 articles. It summarizes the state of the art and current trends of AI applications in MEMS sensors and outlines the challenges of AI incorporation in an industrial setting to improve manufacturing processes. Finally, we reflect upon all the findings based on the three proposed research questions to identify the future research scope.


Introduction
Micro-electromechanical systems (MEMS)-based technologies have been on the market for decades, and the rapid integration of MEMS sensors has revolutionized market segments such as consumer electronics, automotive, healthcare, Industry 4.0, and the internet-of-things. Figure 1 shows that the demand for MEMS sensors is ever-increasing and is expected to keep rising. The growth is due to the highly scalable and efficient manufacturing technology available to mass-produce the sensors at a low cost, mostly in the consumer electronics and automotive industries. The most common and heavily used MEMS inertial sensors are gyroscopes and accelerometers, which are used in consumer devices and automotive applications [1,2]. The advancement in silicon MEMS/CMOS technology at both chip and device levels has enabled further improvement of MEMS sensors in terms of miniaturization, accuracy, quality, and performance at low cost. MEMS design and manufacturing processes are becoming more complex and diverse due to the incorporation of newer technologies and their multi-domain structure. As a result, sensors are prone to various kinds of deterministic and random errors, such as misalignment, white noise, random walk, and quantization noise. If these errors are not handled promptly, they can accumulate and degrade the behavior of the sensors [3][4][5][6][7]. This makes quality checks even more challenging, resulting in increased testing costs. Testing cost consists of wafer-level testing and packaging-related testing costs, both of which contribute to the overall device manufacturing cost [8][9][10]. Thus, there is an ongoing demand to reduce these costs and improve the production process in both the hardware and software domains. Artificial Intelligence (AI) has been around for decades, and its recent applications are more intensive than ever due to the latest boom in data availability, system computational power, and storage capacity.
AI branches out into multiple sub-disciplines, such as deep learning (DL), machine learning (ML), natural language processing (NLP), computer vision, and robotics. Figure 2 depicts a high-level overview of different components, types, and sub-fields of AI. Machine learning, which emerged from a statistical background, has shown its strength and convenience in tasks ranging from understanding complex data to handling multidimensional data. AI applications are countless and include autonomous vehicles, predictive maintenance, supply chain optimization, resource optimization, manufacturing process optimization, banking, finance, surveillance, recommendation systems, healthcare, marketing, quality inspection, and education [12][13][14][15]. Current trends show that AI algorithms such as tree-based algorithms, deep neural networks, and reinforcement learning are the ones mainly used for industrial domain applications.
Hence, it is worth looking into the possible combinations of these two strong domains, MEMS and AI, to benefit both. This paper surveys AI applications in MEMS-based sensors and related processes. We try to identify research opportunities in the MEMS sensor manufacturing domain that have not been well explored. The rest of the paper is structured as follows. A brief overview of MEMS sensor types, applications, and the MEMS manufacturing process is presented in Section 2. In Section 3, the research methodology and materials used for this review paper are discussed. Section 4 provides a detailed analysis of the current trends of AI applications in MEMS-based sensors, design, and the manufacturing process. A review summary of the papers is presented in tabular format in Section 5. Section 6 reflects on the three research questions raised in Section 3 and provides an overview of AI implementation benefits and challenges in industrial settings. Finally, we conclude the paper in Section 7.

MEMS Background
In this section, we briefly introduce MEMS-based systems. MEMS devices are rugged, small, silicon-based devices fabricated in the micrometer range with advanced technologies, such as semiconductor manufacturing. MEMS involve electrical and mechanical components fabricated using integrated circuit (IC) batch processing at the micrometer scale [17]. Often, multiple devices are fabricated on a single chip, so cross-signal interaction and interference are unavoidable. This introduces new failure modes that must be handled efficiently. Figure 3 depicts a simplified MEMS manufacturing process. After the design, process simulation, layout, and mask generation, the micro-fabrication process starts with a silicon substrate. The material deposition, pattern transfer, and excess material removal steps are repeated over multiple cycles until the desired result is achieved. Compared to traditional IC fabrication techniques, thicker films and deeper etching lead to fewer cycles to achieve the desired result. Special probing and sectioning techniques are used to protect the parts. Testing and calibration are crucial in the MEMS manufacturing process to retain high quality and reliability. The complexity comes from the spatial distribution of the wafers, the process-induced effects, and the combination of both. Moreover, complexity plays such an important role because the useful signal from the MEMS and the fluctuations induced by the process (temperature, pressure, mechanical stress, gas concentration, etc.) are much closer in magnitude than in the case of a macroscopic product. At the same time, each product is expected to meet the same specification.
In comparison to traditional sensors, MEMS-based sensors provide significant advantages, such as (i) low production cost, (ii) lower power consumption, (iii) improved sensing in terms of accuracy and sensitivity, (iv) light weight, (v) smaller size, (vi) high and straightforward integration, (vii) parallelism, (viii) resilience to shock, vibration, and radiation, and (ix) scalability. MEMS-based gyroscopes and accelerometers were first developed around the 1990s for automotive applications, which later triggered high demand in consumer electronics. The mid-2000s marked the boom of MEMS sensors with their incorporation into Nintendo's Wii. Later, a high adoption rate could be seen in other areas, such as smartphones, microphones, motion detection, image stabilization, home security, wearable devices (e.g., fitness watches), the automotive industry, the military, unmanned aerial vehicles (UAVs), and aircraft. With growing interest and research in autonomous vehicles, MEMS sensing has become more critical than ever.
The MEMS market was valued at 14.32 billion USD in 2022 and is projected to reach 75 billion USD by 2032, with a global growth rate of 18.01% from 2022 to 2032 [18]. Advancements in new technologies, such as sensor fusion, big data, AI, and the industrial internet of things, have created opportunities for new application areas, such as smart homes, connected cars, and autonomous vehicles.
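The market projection can be checked with the standard compound-growth formula. The short sketch below assumes that the quoted 18.01% is an annually compounded rate over the ten years from 2022 to 2032; this reading is our assumption, and the function names are illustrative only.

```python
def project_value(start_value, annual_rate, years):
    """Compound start_value at a fixed annual_rate for the given number of years."""
    return start_value * (1.0 + annual_rate) ** years


def implied_annual_rate(start_value, end_value, years):
    """Annual compound rate that takes start_value to end_value in `years` years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the text (billion USD); assumed horizon: 2022 to 2032.
projected_2032 = project_value(14.32, 0.1801, 10)
rate = implied_annual_rate(14.32, 75.0, 10)

print(f"Projected 2032 market size: {projected_2032:.1f} billion USD")
print(f"Implied annual growth rate: {rate:.2%}")
```

Under this assumption, the start value, end value, and growth rate quoted above agree with each other to within rounding.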

MEMS Components
As can be seen from Figure 4, MEMS are not only sensors; MEMS components can be subdivided into four categories, (1) microelectronics, (2) microactuators, (3) microsensors, and (4) microstructures, which are integrated into a single chip. Microsensors sense physical parameters and detect changes in stimuli, such as pressure, temperature, motion, mass, and light. The microelectronics then process and analyze the collected information and send a signal to the microactuators. Upon receiving the signal, the microactuators respond and produce an output in the form of a change in the environment. MEMS components are packaged together with an application-specific integrated circuit (ASIC), which acts as an electrical interface among the components to send and receive analog and digital information. This makes the calibration and packaging processes quite complicated. As discussed so far, MEMS technology and related components are diverse; the discussion in this paper is limited to MEMS-based sensors.

Types of MEMS Sensors and Applications
Based on the received responses and the measured quantity, the microsensors used in MEMS can be broadly divided into different types, such as physical, chemical, and biological. Table 1 shows a high-level classification of the microsensors used in MEMS-based systems [19] based on the received signal. MEMS sensors can be classified in many other ways, such as by application area, adoption rate, popularity, or MEMS structure, but this is outside the scope of this paper. For simplicity, only the most common applications are discussed. The most common MEMS inertial sensors are accelerometers and gyroscopes. As will be seen in Section 4, the contributions made in MEMS inertial sensors are substantial. We briefly discuss the working principles of these two most important types, which helps in understanding the complexity of the problem and the contributions made by researchers using AI.
In general, accelerometers can be divided into two categories based on the response type: (1) AC-response and (2) DC-response. AC-response accelerometers use piezoelectric elements for sensing and are therefore also known as piezoelectric accelerometers. The piezoelectric element "displaces" a charge when the accelerometer experiences acceleration, resulting in an electrical output proportional to the acceleration. DC-response accelerometers can be of two types: piezoresistive (mainly used for low-range devices) and capacitive (high accuracy and sensitivity). They make use of MEMS fabrication technology, which scales to high-volume applications and lowers the cost of production. The advantages of a MEMS-based accelerometer over a piezoelectric accelerometer include (i) an active self-test, (ii) the ability to measure both dynamic and static movement, (iii) precise velocity and displacement information, and (iv) excellent bias stability and minimal noise.
The most commonly used MEMS accelerometer is the capacitive type, which is the cheapest and smallest. The accelerometer can be single-axis, where only acceleration is measured, or multi-axis, where the orientation with respect to gravity is measured as well. The accelerometer measures acceleration in terms of movement, shock, or vibration. The basic working principle can be explained with a mass suspended on a spring attached to a fixed frame, i.e., a comb capacitor structure. In the presence of an external force, the mass moves, and the distance between the fixed plate and the seismic mass changes. This in turn changes the capacitance between the fixed and the movable plate [20,21]. The challenge of this design is to provide DC accuracy over temperature and to reduce temperature drift and bias drift as much as possible. Drift compensation using AI algorithms has become an attractive research topic. From an application perspective, MEMS accelerometers are used in smartphones, car airbags, camera anti-blur systems, and real-time applications, such as in the military.
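The spring-mass and capacitance relationship described above can be illustrated with an idealized parallel-plate model. The sketch below is a minimal illustration under stated assumptions: a static (DC) acceleration, a linear spring, a differential pair of plates, and no fringing fields. All numerical values (proof mass, stiffness, plate area, rest gap) are hypothetical and are not taken from any device discussed in this review.

```python
EPSILON_0 = 8.854e-12  # vacuum permittivity in F/m


def static_displacement(mass_kg, stiffness_n_per_m, acceleration_m_s2):
    """Steady-state proof-mass displacement from Hooke's law: x = m * a / k."""
    return mass_kg * acceleration_m_s2 / stiffness_n_per_m


def plate_capacitance(area_m2, gap_m):
    """Ideal parallel-plate capacitance C = eps0 * A / d (fringing ignored)."""
    if gap_m <= 0.0:
        raise ValueError("gap must be positive (plates would touch)")
    return EPSILON_0 * area_m2 / gap_m


def differential_capacitance(area_m2, rest_gap_m, displacement_m):
    """Capacitance difference between the closing gap and the opening gap."""
    c_closing = plate_capacitance(area_m2, rest_gap_m - displacement_m)
    c_opening = plate_capacitance(area_m2, rest_gap_m + displacement_m)
    return c_closing - c_opening


# Hypothetical device parameters, chosen only for illustration.
MASS_KG = 1e-9            # proof mass
STIFFNESS_N_PER_M = 1.0   # suspension spring constant
AREA_M2 = 1e-7            # total overlapping plate area
REST_GAP_M = 2e-6         # plate gap at zero acceleration

x = static_displacement(MASS_KG, STIFFNESS_N_PER_M, 9.81)  # response to 1 g
delta_c = differential_capacitance(AREA_M2, REST_GAP_M, x)
print(f"displacement: {x:.3e} m, capacitance change: {delta_c:.3e} F")
```

For displacements much smaller than the rest gap, the capacitance change is close to linear in the acceleration, which is why the read-out electronics can map it to an acceleration value. Temperature affects the stiffness and the gap, which is one way to see where the drift discussed above comes from.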
MEMS vibratory gyroscopes measure the angular rate of rotation or the angular displacement using the Coriolis force. Based on the transduction type, a MEMS gyroscope can be of different types, such as silicon tuning fork, quartz tuning fork, or vibratory ring. A vibratory gyroscope works by measuring the Coriolis acceleration acting on its oscillating mechanical sensing components. The mechanical structure is kept in active resonance, and the Coriolis acceleration produces a small displacement in response [22]. Managing this quadrature signal requires a clever gyroscope design so that the small signal is detectable. This is why many electronic compensation methods exist to de-noise the signal. Thus, noise modeling, random drift modeling, temperature drift compensation, and fault detection and diagnosis of the gyroscope have become some of the most attractive fields of interest for data scientists. The main application areas are stability control in the automotive industry, short-range navigation (such as missile navigation), image stabilization in industrial applications, submarines, UAVs, and aeronautical navigation.
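To give a feel for the size of the Coriolis signal, the following sketch evaluates the textbook relation a_c = -2 (Ω × v) for a proof mass driven sinusoidally along the x-axis while the sensor rotates about the z-axis. The drive amplitude, drive frequency, and rotation rate are hypothetical values chosen for illustration, and the helper names are ours.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def coriolis_acceleration(angular_rate, velocity):
    """Coriolis acceleration in the rotating sensor frame: -2 * (Omega x v)."""
    return tuple(-2.0 * component for component in cross(angular_rate, velocity))


def peak_drive_velocity(amplitude_m, frequency_hz):
    """Peak velocity of a sinusoidal drive motion x(t) = A * sin(2*pi*f*t)."""
    return 2.0 * math.pi * frequency_hz * amplitude_m


# Hypothetical operating point, for illustration only.
DRIVE_AMPLITUDE_M = 5e-6              # drive-mode amplitude
DRIVE_FREQUENCY_HZ = 10e3             # drive-mode resonance frequency
ANGULAR_RATE_RAD_S = (0.0, 0.0, 1.0)  # rotation about the z-axis

v_peak = peak_drive_velocity(DRIVE_AMPLITUDE_M, DRIVE_FREQUENCY_HZ)
a_c = coriolis_acceleration(ANGULAR_RATE_RAD_S, (v_peak, 0.0, 0.0))
print(f"peak drive velocity: {v_peak:.3f} m/s")
print(f"peak Coriolis acceleration along y: {a_c[1]:.3f} m/s^2")
```

The Coriolis acceleration appears on the axis orthogonal to both the drive motion and the rotation axis and scales linearly with the angular rate, so any noise or drift on that sense axis maps directly into a rate error.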

MEMS Manufacturing Process
The MEMS manufacturing process is an extensive and time-consuming endeavor that requires numerous quality checks before the entire process is complete. The design requires simultaneously considering devices from the electrical, mechanical, and electronic domains. In addition, the design should consider and analyze cross-domain effects. MEMS fabrication is based on chemical etching and photo-lithography. It consists of bulk micro-machining, surface micro-machining, and high-aspect-ratio micro-machining using techniques such as LIGA (Lithographie, Galvanoformung, Abformung; i.e., lithography, electroplating, and molding) [23]. New micro-manufacturing methods, including micromechanical cutting, micro-electrical discharge machining, micro-electrochemical machining, micro-forming, laser technology, laser-assisted forming, replication techniques, and deposition methods, have been emerging to create hybrid processes. Multiple companies, such as MEMSCAP Inc., IntelliSense Software Corp, ABAQUS, Inc., and Coventor Inc., and research bodies, such as U.C. Berkeley, Carnegie Mellon University, and TU Chemnitz, are continuously working on providing improved solutions for MEMS design [24]. Figure 5 shows a high-level overview of the MEMS design structure flow [25,26]. Given the numerous steps involved, several challenges occur at different levels, such as design issues, fabrication issues, bulk micro-machining issues, dicing issues, packaging issues, and technical issues [27,28].
As stated in article [29], manufacturing defects can be categorized into the following three categories.
• Type A defects: These defects are uniformly random with a stable mean density. There is no repeated occurrence or visible systematic pattern, i.e., the probability of a die being good or bad is equal. Thus, the root-cause analysis of such defects is not straightforward. Only an accurate and stable process can help reduce these kinds of defects.
• Type B defects: These defects are repeatable and follow a systematic pattern from wafer to wafer. They can originate from anomalies in the process or machine, such as mask-induced defects or variation during film application.
• Type C defects: The most common type of defect seen in semiconductor manufacturing is the Type C defect, a combination of Type A and Type B defects. It is essential to eliminate random defects in order to recognize and eradicate systematic flaws. There can be multiple causes, such as structural defects coming from the raw material, asymmetrical presence of contaminants, and irregular presence of defect-generating particles.
All these defects can be visible or invisible. Visible wafer defects can be classified as local, random, center, and scrape based on the visual pattern type. Currently, manual inspections are performed to detect visible surface anomalies, and these inspections are prone to error. The lack of a standardized framework to resolve the aforementioned issues obstructs the manufacturing process. However, such a framework cannot be achieved easily, as the expected solutions depend on the nature and complexity of the problems. A solution requires in-depth domain and process knowledge and can be cumbersome to develop. Given the rising market demand for cheaper, smaller, and high-performance MEMS sensors, organizations and researchers are constantly looking for new means to improve the MEMS sensor production process. This is where AI can provide a massive benefit for process optimization. In the following sections, we look into several problem areas of MEMS-based manufacturing where AI-based solutions are beneficial to incorporate.

Research Methodology
The motivation of this study is to find the current trends, challenges, and prospects of AI applications in MEMS-based sensors and their manufacturing process. This study identifies the most common MEMS sensor application domains of AI-based solutions. Special attention was given to the papers that provided an overall AI implementation process in sensor production. The review process was carried out systematically and methodically using Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) [30].

Research Questions
The following research questions were formulated to establish the scope and address the paper's main objective.

Search Strategy
The search strategy helps to identify and analyze the questions raised in Section 3.1. The databases selected for this study are well-established and recognized in the scientific community.
As shown in Table 2, eight database resources were used. To obtain effective and precise results, the following conditions were included. For data collection, the following search string was used: ("artificial intelligence" AND "MEMS") OR ("machine learning" AND "MEMS") OR ("neural network" AND "MEMS") OR ("artificial intelligence" AND "MEMS sensors") OR ("machine learning" AND "MEMS sensors") OR ("artificial intelligence" AND "sensor manufacturing") OR ("machine learning" AND "sensor manufacturing") OR ("artificial intelligence" AND "MEMS manufacturing") OR ("machine learning" AND "MEMS manufacturing"). Index terms such as "MEMS", "MEMS manufacturing", "artificial intelligence", and "machine learning" were also used. Patent applications were checked as well to gauge the interest in the field of MEMS and AI together.
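For reproducibility, a Boolean search string of this kind can be generated from term lists rather than typed by hand. The sketch below rebuilds the nine clauses listed above; the grouping into two term sets and the function name `build_query` are our own illustration, and individual databases may require a different query syntax.

```python
# AI-related terms combined with the generic topic "MEMS".
AI_TERMS_BROAD = ["artificial intelligence", "machine learning", "neural network"]
# AI-related terms combined with the narrower topics.
AI_TERMS_NARROW = ["artificial intelligence", "machine learning"]

TERM_GROUPS = [
    (AI_TERMS_BROAD, ["MEMS"]),
    (AI_TERMS_NARROW, ["MEMS sensors", "sensor manufacturing", "MEMS manufacturing"]),
]


def build_query(term_groups):
    """Join every (AI term AND topic) pair of each group with OR."""
    clauses = []
    for ai_terms, topics in term_groups:
        for topic in topics:
            for ai_term in ai_terms:
                clauses.append(f'("{ai_term}" AND "{topic}")')
    return " OR ".join(clauses)


query = build_query(TERM_GROUPS)
print(query)
```

Keeping the term lists in one place makes it easier to rerun the same search later or to extend it with additional terms.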

Search Result
The literature selection criteria are shown in Figure 6 using the PRISMA flow diagram. Eight databases were searched, yielding 427 articles. After thorough checking, 131 duplicate records were found and excluded from the study, leaving 296 articles for the screening process. After applying the inclusion and exclusion criteria mentioned in Section 3.2, 103 articles remained for the study. Figure 7 depicts the categorization of AI and MEMS sensors in different fields. The data were extracted from the Web of Science for the last ten years for the top 25 application domains with the most publications. The filter was applied to "All Fields" with the condition (MEMS) AND (machine learning OR deep learning OR artificial intelligence OR neural network), a date range of 2012-01-01 to 2022-10-30, and English as the article language.
Patent information was retrieved from the World Intellectual Property Organization website [31] for the last ten years with the keywords (MEMS sensors OR inertial sensors) AND (machine learning OR Artificial Intelligence). This was done to check the patent scope and interest trend. Further patent information is not included in the study, but it provided a powerful technology-watch background for our research and validated the motivation of this article.

Artificial Intelligence Application in MEMS System: Current Trends
Due to infrastructure advancement and data availability, disciplines such as data science and data analytics have gained significant attention from an application perspective in different domains, including the MEMS sensor manufacturing process. Sampaio et al. [32], Cinar et al. [33], Podder et al. [34], Tariq et al. [35], Gupta et al. [36], Li et al. [37,38], and Shen et al. [39] have provided extensive overviews of ML applications for predictive maintenance in the manufacturing domain of Industry 4.0. This provides a strong motivation for exploring AI benefits in the MEMS domain. AI algorithms can efficiently and effectively analyze massive amounts of data and identify trends and patterns that humans cannot see, enabling process automation and reducing the need for human intervention. Real data contain noise, complex patterns, multi-modal distributions, variance, etc., which makes the analysis computationally expensive and difficult using traditional methods. AI algorithms are adept at managing multidimensional and multivariate data in dynamic or uncertain environments. They are not predisposed toward particular datasets; thus, learning from data can aid in minimizing bias in corporate decisions. Data scarcity and confidentiality can make it challenging to find the optimal solution in certain domains. This issue can be handled by different RNN variants or DL-based synthetic data generation. The benefits of the AI algorithms are discussed in detail in Section 6.2.
In this section, we survey the current trends of AI applications in different parts of the process for MEMS-based sensors. MEMS gyroscopes and accelerometers have been hot research topics due to their diverse applications in the consumer, automotive, industrial, and medical fields. Thus, more attention is given to improving and optimizing the hardware and software systems for these two sensor types. A brief overview of AI algorithm implementation is given in Section 4.1. In Section 4.2, we investigate different application areas of AI in MEMS-based sensors. In Section 4.3, AI application for the overall MEMS manufacturing and design is covered.

AI Implementation Workflow
A high-level overview of the AI algorithm implementation workflow is presented in Figure 8, where the overall goal is to build an efficient model using the collected data to achieve a certain objective. Acquiring the data and getting them ready for analysis is the first stage in any data science workflow. Typically, data are combined from many sources and come in various formats. Data pre-processing follows the data retrieval step and is the most resource- and time-intensive stage.
Data cleaning is crucial to prevent errors from propagating to the data exploration phase, which could lead to incorrect conclusions being drawn from the data. This step includes missing value imputation, format specification, noise removal, etc. The next step, the data exploration stage, uncovers the complex relationships and hidden patterns in the data. The data preparation and transformation stage allows the data to be transformed according to the model requirements. It includes feature engineering, data labeling, and data splitting. These three stages are iterative. The data are then divided into training and test sets and sometimes into a third set called the validation set. The data used to fit the model are called the training set. A validation set is the subset of data used to assess a model fitted on the training set while adjusting model hyperparameters. A test set is a held-out sample of data used to objectively assess how well the final model, fitted on the training set, performs on unseen data. Model preparation and training is the stage in which the training data are used to train a model; it also comprises hyperparameter adjustment. Before the ML model is delivered to the end user in production, the trained model must be validated to ensure it satisfies the originally stated objectives. This is called model evaluation. After this, the model is ready to be deployed in production, followed by a monitoring stage that checks the deployed model's performance on real-time, unseen data as it makes predictions or provides recommendations, which are logged. The articles analyzed in this review follow the aforementioned process. Depending on the objective, more importance is sometimes given to specific steps, such as data pre-processing or model preparation.
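The data-splitting step can be sketched in a few lines. The example below is a minimal illustration using only the Python standard library; the 70/15/15 ratio, the fixed seed, and the function name `split_dataset` are assumptions made for the example rather than values taken from the reviewed articles.

```python
import random


def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle samples reproducibly and split them into train/validation/test lists."""
    if not (0.0 < train_frac < 1.0 and 0.0 <= val_frac < 1.0):
        raise ValueError("fractions must lie in [0, 1)")
    if train_frac + val_frac >= 1.0:
        raise ValueError("train_frac + val_frac must leave room for a test set")

    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable

    n_train = round(len(samples) * train_frac)
    n_val = round(len(samples) * val_frac)

    train = [samples[i] for i in indices[:n_train]]
    val = [samples[i] for i in indices[n_train:n_train + n_val]]
    test = [samples[i] for i in indices[n_train + n_val:]]  # remainder is held out
    return train, val, test


samples = list(range(100))  # stand-in for 100 labeled sensor records
train_set, val_set, test_set = split_dataset(samples)
print(len(train_set), len(val_set), len(test_set))
```

Random shuffling is appropriate when samples are independent. For time-ordered sensor recordings, a chronological split (earlier data for training, later data for testing) is often the safer choice, because shuffling can leak information from neighboring samples into the test set.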

AI Application in MEMS-Based Sensors
A MEMS inertial measurement unit (IMU) consists of a multi-axis gyroscope and accelerometer, which are used for positioning and navigation systems when integrated with the global positioning system (GPS). A MEMS IMU system is preferred over traditional solutions due to its higher accuracy, small size, and lower cost. Compared to fiber or laser-based IMUs, however, MEMS-based IMUs are susceptible to more deterministic and random errors, such as measurement and quantization noise, alignment errors, and bias. Such uncompensated errors accumulate over time and adversely impact the precision and sensitivity of the sensors. Mathematical and statistical model-based calibration methods exist for error compensation. However, random errors (which contain high-frequency and low-frequency components) introduce drift and bias into the Inertial Navigation System (INS). Thus, adequate signal de-noising schemes are required to remove random errors. Allan Variance, Auto-Regressive, Moving Average, and Wavelet De-noising methods exist for error modeling, but they still require an accurate model to ensure accuracy.
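As an illustration of one of the classical error-modeling tools named above, the sketch below computes the non-overlapping Allan variance of a sampled rate signal for a given cluster size. It is a simplified, standard-library version intended for illustration only; the synthetic white-noise input and the function name are our assumptions, and practical analyses usually use the overlapping estimator over many cluster sizes.

```python
import random


def allan_variance(rate_samples, cluster_size):
    """Non-overlapping Allan variance for clusters of `cluster_size` samples.

    The averaging time is tau = cluster_size * sampling_interval.
    """
    n_clusters = len(rate_samples) // cluster_size
    if cluster_size < 1 or n_clusters < 2:
        raise ValueError("need at least two full clusters")

    # Average the signal inside each consecutive cluster.
    means = [
        sum(rate_samples[k * cluster_size:(k + 1) * cluster_size]) / cluster_size
        for k in range(n_clusters)
    ]
    # Half the mean squared difference of successive cluster averages.
    squared_diffs = [(means[k + 1] - means[k]) ** 2 for k in range(n_clusters - 1)]
    return sum(squared_diffs) / (2 * (n_clusters - 1))


# Synthetic gyroscope output: unit-variance white noise (illustrative only).
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]

avar_1 = allan_variance(white_noise, 1)
avar_10 = allan_variance(white_noise, 10)
print(f"Allan variance, m=1:  {avar_1:.4f}")
print(f"Allan variance, m=10: {avar_10:.4f}")
```

For white noise, the Allan variance falls roughly in proportion to the cluster size, whereas slowly varying drift makes it rise again at long averaging times. This is how the method separates the different random error terms, but it only characterizes the errors and does not remove them.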
The following papers [40][41][42][43][44][45][46][47][48][49][50][51][52] show that DL-based error compensation models can learn more characteristics of the reference signals, such as identifying the particular accelerations and angular velocities, in comparison to traditional calibration processes, such as the six-position static test and rate test. It is crucial to note that the same deep learning networks may eliminate different error sources from IMUs of any performance grade.
Jiang et al. [40] proposed a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) for MEMS IMU (MSI3200) de-noising, which outperformed the Auto-Regressive and Moving Average (ARMA) method. This paper was limited to a static dataset and a small number of LSTM-RNN layers and thus needed more experimentation. Jiang et al. [41] proposed a heterogeneous deep learning recurrent neural network (RNN) design to suppress MEMS gyroscope signal noise. Here, two deep-layered RNN variants, Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM), were used individually and in combination for the experimentation. The analysis showed promising results, with significant noise reduction of the MEMS gyroscope signal using the LSTM-GRU network.
Zhu et al. [42] described a method using a three-axis Neural Architecture Search Recurrent Neural Network (NAS-RNN) for noise modeling and compensation of the MEMS gyroscope in the STIM300 MEMS IMU. The limitation of this approach was that the experiments were performed only on the STIM300 unit and used only RNN models. Nevertheless, this research paper demonstrates that incorporating DL in a MEMS gyroscope can improve accuracy significantly by reducing noise.
Jiang et al. [43] proposed an RNN variant called Simple Recurrent Unit (SRU-RNN) for de-noising MEMS IMU MSI3200 gyroscope signals. The authors claimed that SRU-RNN provided significant advantages over the conventional approaches using SVM, LSTM, or LSTM-RNN methods. This research was limited to a single-layer SRU-RNN and fixed parameters.
Thermal calibration of the MEMS gyroscope is required to compensate for error drift, which can affect the accuracy of the gyroscopes. Traditionally, polynomial fitting is performed to overcome this. The authors of [44][45][46][47] have shown that Artificial Neural Networks (ANN) have superior performance compared to the traditional method.
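For reference, the traditional polynomial-fitting baseline can be sketched as an ordinary least-squares fit of the bias against temperature, followed by subtraction of the fitted curve. The example below is a standard-library illustration of that baseline only, not of the ANN methods in [44][45][46][47]. The quadratic bias model, its coefficients, and the temperature scaling are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first) via normal equations."""
    size = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear_system(ata, aty)


def evaluate_polynomial(coeffs, x):
    """Evaluate the polynomial with Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


# Hypothetical bias-versus-temperature data; temperature scaled to T/100 for conditioning.
temps_c = list(range(-40, 90, 5))
xs = [t / 100.0 for t in temps_c]
TRUE_COEFFS = [0.5, 2.0, -3.0]  # illustrative quadratic bias model
raw_bias = [evaluate_polynomial(TRUE_COEFFS, x) for x in xs]

coeffs = fit_polynomial(xs, raw_bias, 2)
residual = [b - evaluate_polynomial(coeffs, x) for b, x in zip(raw_bias, xs)]
print("fitted coefficients:", [round(c, 6) for c in coeffs])
```

A low-order polynomial of this kind captures only a smooth, static temperature dependence. Hysteresis and dependence on the rate of temperature change are not represented, which is one motivation for the learning-based approaches discussed here.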
Fei et al. [48] proposed a radial basis function (RBF) neural network-based control scheme to compensate for non-linearities arising from fabrication, such as drift, and to improve the robustness of the MEMS gyroscope. However, the proposed method might be prone to over-fitting, which is why Xing et al. [49] proposed a fusion algorithm consisting of least squares support vector machine (LSSVM) [53] and chaotic particle swarm optimization (CPSO). This fusion algorithm outperformed the backpropagation artificial neural network (BP-ANN) in reducing the random drift of the MEMS gyroscope. The authors treated the problem as a chaotic time series problem and performed signal de-noising. Using phase space reconstruction (PSR) along with the C-C method helped with dimension and complexity reduction.
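Phase space reconstruction is commonly implemented as a time-delay embedding of the scalar drift signal. The sketch below shows only this embedding step and how one-step-ahead training pairs can be formed from it. The embedding dimension and delay are set by hand here, whereas in [49] they are estimated with the C-C method; the function names are ours.

```python
def delay_embed(series, dimension, delay):
    """Return delay vectors [x[i], x[i + delay], ..., x[i + (dimension - 1) * delay]]."""
    n_vectors = len(series) - (dimension - 1) * delay
    if dimension < 1 or delay < 1 or n_vectors <= 0:
        raise ValueError("series too short for this dimension and delay")
    return [
        [series[i + k * delay] for k in range(dimension)]
        for i in range(n_vectors)
    ]


def make_training_pairs(series, dimension, delay):
    """Pair each delay vector with the sample that follows its last element."""
    span = (dimension - 1) * delay
    pairs = []
    for i, vector in enumerate(delay_embed(series, dimension, delay)):
        target_index = i + span + 1
        if target_index < len(series):  # drop vectors that have no next sample
            pairs.append((vector, series[target_index]))
    return pairs


# Toy series standing in for a gyroscope drift signal.
toy_series = [0, 1, 2, 3, 4, 5]
print(delay_embed(toy_series, dimension=3, delay=2))
print(make_training_pairs(toy_series, dimension=3, delay=2))
```

Each delay vector can then serve as the input of a regressor such as an LSSVM, with the following sample as the prediction target.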
Yang et al. [50] proposed a method that comprised a genetic algorithm (GA) and a very well-known and intensively used ML model, support vector machine (SVM), to solve the temperature drift in MEMS gyroscope.
Interestingly, S. Wang et al. [51] also used GA with an optimized backpropagation neural network (BPNN) for temperature drift compensation. GA helped the BPNN to avoid local minima through search optimization. The achieved result was 173 times better than that of traditional polynomial fitting.
Ma et al. [52] targeted a similar issue of temperature drift by proposing a parallel approach. The presented method was more complete than the earlier approaches discussed, as the analysis included signal analysis along with temperature compensation. They decomposed the signal using Variational Mode Decomposition (VMD), with the VMD parameters optimized by Immune-Based Particle Swarm Optimization (IPSO). After signal decomposition, the obtained intrinsic mode functions were classified into the noise term (which was removed), the mixed term (consisting of both noise and useful information), and the feature term. The feature term contained the temperature drift, which was compensated using Backpropagation (BP)-Adaboost prediction; the result was then combined with the SG-filtered mixed term to obtain the de-noised, temperature-compensated signal. The papers discussed above demonstrate that combining techniques from different disciplines, such as GA and PSO, with AI can improve both data analysis and model performance.
Effective ground vibration monitoring is crucial to mitigate the impact of geological disasters, such as earthquakes. Kang et al. [54] proposed a CNN-based monitoring scheme on MEMS-sensed data. The challenges were related to the collection and cleaning of the data, as the data contained noise, bias, offset errors, and different structures. Nonetheless, the authors reached an overall accuracy of 98.82% with synthetic data and an accuracy of 81.64% on a real dataset.
As mentioned, data scarcity is an issue in seismological research using low-cost MEMS sensors. Moreover, the data quality is degraded by the inherent noise in accelerometers. There have been many attempts [55][56][57][58] to capture the seismic sequence using AI models. Still, these supervised approaches suffered from incomplete reconstruction of the seismic waveform, over-fitting, and small datasets that were unable to capture the diversity of the waves. To overcome these challenges, Wu et al. [59] proposed the Earthquake Generative Adversarial Network (EQGAN), an unsupervised technique to automatically capture and generate stable seismic waveforms using the frameworks of GAN, LSTM, and NN.
MEMS sensors have numerous applications in consumer electronics, such as pedestrian navigation systems, and AI algorithms are intensively used for real-time path tracking, activity recognition, and posture recognition. Models such as tree-based algorithms (decision tree, random forest, extreme gradient boosting), SVM, CNN, NN, and LSTM have found quite popular usages in such cases [60][61][62][63][64].
Gao et al. [65] presented a multi-scale Convolutional Neural Network (CNN) with adaptive learning for fault detection in MEMS inertial sensors in UAVs. The authors claimed that the proposed method could handle the temperature drift of inertial sensors and achieved high fault detection accuracy compared to the strategies presented in papers [66][67][68][69][70][71]. Amini et al. [72] developed an automated defect recognition system based on the Faster R-CNN Inception V2 COCO model by using a plenoptic image of the wafers for surface defect detection. The proposed method can be used for early fault detection at wafer and component levels of MEMS, reducing the design time significantly.
Thus far, all the discussed papers are primarily related to industrial, automotive, or consumer-related use cases, which show promising results for AI in MEMS-based sensor applications. AI application for MEMS (BioMEMS) in healthcare is another potentially colossal market [73]. Examples of general applications of MEMS in healthcare include pedometers, hearing aids, body gateways, lab-on-a-chip devices, blood pressure monitoring, etc. [74]. Vashistha et al. [75] and Yadav et al. [76] have discussed AI applications to diagnose several diseases using MEMS sensor-based smart diagnostic devices built on mechanobiology, which uses biosensors. MEMS are used here for finding external stimuli using force spectroscopy. NN can detect environmental factors, such as viruses or bacteria, alongside the traditional method. To summarize, in most applications, different RNN and CNN models performed better than the classical statistical models.

AI Applications in the MEMS Manufacturing and Design Process
MEMS reliability and fault mechanism research is still in its early stages. Failure mechanisms differ significantly from those found in traditional microelectronics. Faults or defects arise due to surface interaction energy and intermolecular forces. Fault detection and maintenance are necessary for the MEMS fabrication process for reliability.
Keeping this in mind, Asgary et al. [77] proposed a fusion of a CNN and a Robust Heteroscedastic Probabilistic Neural Network (RHPNN) for fault detection. The aim was to provide a built-in self-test mechanism to implement in the final testing round.
Quality Control is a significant step in generating a superior category of sensors, which is why defect detection is ubiquitous for MEMS manufacturing. Deng et al. [78] proposed a CNN-based defect detection scheme using image processing techniques for pressure sensor chip packaging. The proposed accurate-detection CNN (ADCNN) algorithm was used to detect defects, such as chip damage, chip scratch, wrinkles on the glass surface, the broken bond of the gold and aluminum wires, etc., with a mean average precision of 92.39%. Heringhaus et al. [79] used transfer learning to identify, evaluate, and extract necessary parameters for manufacturing defect detection in a short time.
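The mean average precision (mAP) quoted above for ADCNN is a detection metric averaged over defect classes. As a hedged illustration of how such a number is formed, the sketch below computes the non-interpolated average precision per class and then the mean. The exact AP variant used by Deng et al. [78] is not stated here, so this definition, the class names, and the ranked detections are assumptions.

```python
def average_precision(ranked_hits, num_positives):
    """Non-interpolated AP: mean of precision@k over the ranks k that are hits.

    ranked_hits: booleans for detections sorted by descending confidence,
                 True where the detection matches a ground-truth defect.
    num_positives: number of ground-truth defects of this class.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_positives if num_positives else 0.0


def mean_average_precision(per_class):
    """per_class maps a class name to (ranked_hits, num_positives)."""
    aps = [average_precision(hits, n) for hits, n in per_class.values()]
    return sum(aps) / len(aps)


# Hypothetical ranked detections for two defect classes.
detections = {
    "chip_scratch": ([True, False, True], 2),  # AP = (1/1 + 2/3) / 2
    "chip_damage": ([True, True], 2),          # AP = 1.0
}
m_ap = mean_average_precision(detections)
```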
To prevent losses caused by tool wear or tool damage, Tool Condition Monitoring (TCM) adopts appropriate sensor signal processing techniques to monitor and predict the cutter state. An effective TCM system may boost output and ensure product quality, significantly impacting machining effectiveness. TCM is, therefore, quite significant in the manufacturing sector. Bajaj et al. [80] and Patange et al. [81] demonstrated that incorporating ML-based approaches, such as tree-based models or the Bayesian optimization approach, can reduce maintenance time.
In CAD tools, MEMS simulation and modeling are represented by non-linear partial differential equations (PDEs). Traditional methods, such as finite-element methods (FEM) or finite-difference methods (FDM), are computationally heavy, creating a bottleneck for many simulations. To tackle this issue, Liang et al. [82] proposed an NN-based method for dynamic simulation and analysis of non-linearity in a MEMS-based system. The proposed dimensionality reduction method combines the Generalized Hebbian Algorithm (GHA) [83], a neural network-based form of principal component analysis (PCA), with the Galerkin procedure. The authors claimed that the proposed model could replicate PDE-based design and simulation and handle a large number of simulations in less time, using less memory. This can be very helpful to MEMS system designers for optimizing the design process.
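The Generalized Hebbian Algorithm mentioned above learns principal directions directly from data samples, without forming a correlation matrix. The following is a minimal pure-Python sketch of the standard update (Sanger's rule), not the implementation of Liang et al. [82]; the Galerkin projection step is omitted, and the learning rate, epoch count, and synthetic "snapshot" data are assumptions chosen for illustration.

```python
import random


def gha_fit(samples, n_components, lr=0.01, epochs=5, seed=0):
    """Sanger's rule: w_i += lr * y_i * (x - sum_{j<=i} y_j * w_j)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.gauss(0.0, 0.1) for _ in range(dim)]
               for _ in range(n_components)]
    for _ in range(epochs):
        for x in samples:
            # Outputs of all components, computed with the current weights.
            outputs = [sum(w_k * x_k for w_k, x_k in zip(w, x)) for w in weights]
            residual = list(x)
            for i, w in enumerate(weights):
                # Deflate: remove what components 0..i already reconstruct.
                residual = [r - outputs[i] * w_k for r, w_k in zip(residual, w)]
                weights[i] = [w_k + lr * outputs[i] * r
                              for w_k, r in zip(w, residual)]
    return weights


# Synthetic 3-D samples whose dominant direction is d1 (hypothetical data).
rng = random.Random(1)
d1 = (0.6, 0.8, 0.0)
samples = []
for _ in range(2000):
    s1, s2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 0.3)
    samples.append([
        s1 * d1[0] + rng.gauss(0.0, 0.02),
        s1 * d1[1] + rng.gauss(0.0, 0.02),
        s2 + rng.gauss(0.0, 0.02),
    ])

weights = gha_fit(samples, n_components=2)
# |cos| of the angle between the first learned basis vector and d1.
alignment = abs(sum(a * b for a, b in zip(weights[0], d1)))
```

Because each update touches only one sample at a time, memory use stays proportional to the number of retained basis vectors, which matches the advantage reported for large snapshot sets.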
In the technical report [84], author J. Perera provided a comparative analysis for reliability estimation and prediction using NN in the MEMS device design phase using the component attributes. The reliability of MEMS development has been a crucial factor and is worth looking into. The author proposed a framework for MEMS reliability modeling, a novel addition to MEMS design. Guo et al. [85] proposed a data-driven deep neural network-based approach to replace the conventional FEA in the MEMS design cycle. The authors designed a non-parameterized NN-based model and trained it using geometric patterns to estimate the MEMS structure accurately, indicating the possibility of defect prediction in the microfabrication process.
Due to the increasing complexity of the MEMS manufacturing process, the number of surface defects tends to increase. It is crucial to detect these defects and identify the root cause for yield improvement and overall process optimization. Chien et al. [86] used a faster R-CNN, which they retrained multiple times to detect wafer defects with an accuracy of 98%. Raveendran et al. [87] also used a CNN model to inspect wafer maps visually. Tello et al. [88] proposed a methodical three-step approach to identify single and mixed defects using a deep ML-based method. In the first step, a spatial filter was used to remove random noise. Next, a splitter was used to separate the single and mixed patterns. Finally, a Deep Structured Convolutional Network (DSCN) model was used to find the composite pattern. A shallow Randomized General Regression Network (RGRN) was used to find the single pattern defects with an overall accuracy of 86.17%.
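The first step of the three-step approach described above, spatial filtering to remove random noise, can be illustrated with a small sketch. The exact filter used by Tello et al. [88] is not specified here, so a 3x3 majority vote on a binary wafer map is assumed as a stand-in; the wafer map itself is hypothetical.

```python
def majority_filter(wafer_map):
    """3x3 majority vote on a binary wafer map (1 = failing die, 0 = passing die)."""
    rows, cols = len(wafer_map), len(wafer_map[0])
    filtered = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Neighbourhood is clipped at the wafer-map border.
            neighbours = [
                wafer_map[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            # Mark a die as failing only if most of its neighbourhood fails too.
            filtered[r][c] = 1 if 2 * sum(neighbours) > len(neighbours) else 0
    return filtered


wafer = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],  # isolated failing die: treated as random noise
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],  # two-row band: systematic defect pattern
    [1, 1, 1, 1, 1],
]
cleaned = majority_filter(wafer)
```

The isolated die is suppressed while the band survives, so the subsequent pattern classifiers see only spatially coherent defects.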
Hoppensteadt and Izhikevich [89] proposed a theoretical framework for a MEMS oscillatory neurocomputer for pattern recognition using auto-correlative associative memory. In this paper, the authors used MEMS as an analog information unit. They further claimed that this method could help build an information processing unit by eliminating micro-controllers in the process flow.
Liu et al. [90] proposed a PC-based Expert System, 'EASYMEMS', for MEMS design and manufacturing. EASYMEMS covers three main aspects of MEMS-based systems: materials, design, and manufacturing. It contains a knowledge-based engine to provide expert consultation and perform dynamic and static analysis. Although other software packages exist, such as ACS and IntelliSuite, the authors claimed that their proposed system could provide a more comprehensive and user-friendly framework based on the AI expert system.
Guo et al. [91] proposed an ML-based approach in place of the standard FEA to reduce the design time as FEA is computationally heavy. The authors claimed their system is almost 4000 times faster in detecting vibration modes for disk resonators. Although this is an exciting result, the research needs further comparative analysis using other AI approaches.

Review Summary
This section provides an overview of the articles analyzed during this study; the key aspects are presented in Table 3. The articles were categorized based on the following attributes:

Discussion
In this study, we systematically reviewed the articles and presented the application of AI in MEMS-based sensors. To conduct the study methodically, we have proposed three research questions in Section 3. We summarized our results and provided a comprehensive overview based on these questions in the following subsections.
Q1: What Are the Most Researched Areas of AI Implementation for MEMS Sensors?
Figure 9 illustrates a time-based heat-map analysis of the MEMS application problem areas where the authors implemented AI-based solutions. As seen from the heat map, there are 12 problem areas where the application is more popular. The problem types are subcategories under MEMS sensors and MEMS manufacturing and design; e.g., "Thermal Calibration" relates to MEMS sensors, and "Predictive Maintenance" to MEMS manufacturing. Almost 20% of the articles deal with problems related to "Fault Detection and Diagnosis". MEMS design accounts for almost 14% of the articles. "BioMEMS", "Human Activity Recognition", and "Surface Defect" cover 10% of the study each. The rest of the problem types cover 36% of the study.
The MEMS sensor types used for data analysis differ in these articles. Gyroscope and accelerometer data were the most used, appearing in 50% of the articles, whereas MEMS manufacturing-related articles made up only 25%. Figure 10 shows the various purposes of using AI algorithms. As can be seen, classification and regression tasks are most intensively used, which explains the high percentage of articles related to fault detection, thermal calibration, and production issues.

Q2: What Are the Advantages of AI-Based Solutions Compared to State-of-the-Art Solutions?
With the advancement of graphics processing units (GPUs), data abundance, and shorter processing times [92,93], AI-based solutions have been integrated into more research areas. It can be seen in Figure 9 that since 2017, there has been a growth in the number of published articles; the most used algorithm type is clearly ANN. Among all the papers used for the review, 78% of the authors used ANN-based solutions. ANN is useful for estimating complex and unknown functions with incomplete a priori knowledge. It offers high fault tolerance, an auto-correction mechanism, and parallel processing with easy application [94]. Looking deeper into the subcategories of the ANN algorithms, variants of the CNN type were used the most (31% of the total articles). Besides neural network-based models, tree-based models, such as RF, AdaBoost, DT, and XGBoost, and statistical models, such as SVM, were also used. Tree-based models are relatively simple and easily explainable. Statistical models, such as SVM, NB, and PCA, are physically realizable and easily visualized. The following part summarizes the benefits obtained by using AI-based solutions.
Advantages of Using AI:
• Hidden patterns in the data, such as cross-correlations, that are impossible to find with traditional methods can be discovered and visualized.
• Feature estimation can be enhanced in combination with traditional methods.
• Automated monitoring systems can be built for FDD and predictive maintenance without any human in the loop. This reduces manual error and improves process accuracy.
• Elimination of prior knowledge dependency.
• MEMS manufacturing process improvement by eliminating iterative calibration steps with AI predictions.
• In the case of MEMS manufacturing, the sensing element fabrication process is subject to different kinds of failures. AI methods can accurately perform image classification, which has proven to be useful for fault detection and recognizing substrate defects.
• Transfer learning helps apply previously gained knowledge to identify and detect features related to defects before the final test.
• AI-based models can replace conventional design tools, such as FEA, for predicting MEMS structural design steps with higher accuracy in less time.
• Overall, time-saving and process optimization can be achieved by incorporating AI.

Table 4 provides an in-depth analysis of the data features, data availability, model performance metrics used, and proposed AI algorithm advantages compared to the state-of-the-art solution used in the articles analyzed in Table 3. The Data Features and Attributes column discusses the type, complexity, and limitation of the data used. Whether the data are available publicly is mentioned in the Data Availability column. The type of assessment criterion used to validate the model or the proposed solution is mentioned in the Performance Metrics column. In the last column, Implemented AI Algorithm Advantage, key benefits of the proposed algorithm presented in the articles are highlighted.

Table 4. Detailed analysis of the data and model performance.

| Ref. | Data Features and Attributes | Data Availability | Performance Metrics | Implemented AI Algorithm Advantage |
|---|---|---|---|---|
| [32] | Raw data were obtained from an Akasa AK-FN059 12 cm Viper cooling fan and an MMA8452Q accelerometer. Different vibration measurements were obtained with 3 weight distributions and 17 rotation speeds at a sampling interval of 20 ms for 1 min. | Public | RMSE | ANN-based MLP algorithms could efficiently model complex systems consisting of non-linear data. The benefits of MLP are easy implementation for large-scale problems, good generalizability, and efficient computation. |
| [34] | Data used were highly imbalanced. They contained information on yaw rate sensors, inertial sensors, process measurements, temperature measurements, and infrastructure measurements. | Confidential | ROC curve, Precision, Recall, F1 Score | The XGBoost classifier could handle data imbalance efficiently. Tree-based feature importance helped with root-cause analysis, and production implementation was easy. |
| [35] | Five triple-axis MEMS-based accelerometers (model AX3D with a sensitivity of ±2 g) were used for collecting information on leak and no-leak in different pipe types. A highly time-synchronized data-collection system from the manufacturer "Beanair" was utilized. SMOTE was used to balance the data. | Confidential | Accuracy, Precision, Recall, F1 Score, ROC | Ensemble-based algorithms performed well on large data and are resilient to outliers, providing an easy interpretation of the results obtained. KNN is less prone to overfitting. |
| [36] | Monitored humidity data were collected from Intel indoor sensor data [95] for four different sensors. | Public | Accuracy | GNB achieved the highest accuracy of 90%. The naive Bayes-based algorithm is fast, does not require much training data, and is insensitive to irrelevant features. |
| [40] | Data were collected from a MEMS IMU (MSI3200) manufactured by MT Microsystems Company, Shijiazhuang, China [96]. Raw static data contained three-axis gyroscope information (pitch, roll, yaw) and were noisy. | Public | Accuracy | LSTM-RNN performance was superior due to its effectiveness for time-series-related problems and better generalization ability. |
| [41] | Three-axis gyroscope noisy data were collected from an MSI3200 MEMS IMU [97] containing pitch, roll, and yaw information. Training data were limited, and only static data were used. | Public | Attitude errors | Mixed deep recurrent neural networks outperformed two-layer long short-term memory recurrent neural networks and two-layer gated recurrent units with the benefits of faster convergence and a quicker training procedure. |
| [42] | Three-axis gyroscope noisy data were collected from a MEMS IMU STIM300 to detect yaw, roll, and pitch error. The data availability was limited. | Public | Attitude errors | The advantages of NAS-RNN include superior sequence data processing, noise suppression, and efficient application-specific neural architecture. |
| [43] | The IMU data were composed of three orthogonal gyroscopes and three orthogonal accelerometers collected from a MEMS IMU MSI3200 manufactured by MT Microsystems Company. The length of the training data and the de-noising performance were traded off. The training was performed with a fixed learning rate and batch size. | Public | Attitude errors | DL has a better learning capacity than SVM or other NN. RNN always has better performance for time-series problems. |
| [44,45,47] | MEMS IMU consists of a triaxial accelerometer sensor, a triaxial gyroscope sensor, a triaxial magnetometer, and a temperature sensor. The data points were obtained at different temperature ranges for gyroscopes. | Confidential | Authors' defined performance factor | BP-NN provided improved and adaptive polynomial fitting for detecting abrupt bias changes in small temperature change windows. |
| [48] | The aim was to have the proof mass follow the intended reference trajectory while estimating and compensating for unknown parameter errors and outside disturbances using the fully tuned RBF network. | Confidential | Authors' defined error tracking | To account for the impact of external disturbances and model errors, an adaptive, stable, and fully tuned RBF neural network controller was used, as it provides non-linear approximation and adaptive nominal control. Further, it enhanced the MEMS gyroscope's dynamic properties and robustness. |
| [49] | Three-axis MEMS IMU was used, and the X-axis gyroscope was analyzed. Wavelet filtering was used to remove the noise. The data type was chaotic time series. The one-dimensional time series was embedded into an auxiliary phase space using PSR. | Confidential | MAE, RMSE, ARE | Combining the LSSVM model with CSPO provided advantages, such as faster computation, parameter optimization, suitability for parallel computing, and chaos mapping. |
| [50] | Creating a temperature compensation model that fits the function is the key challenge for MEMS gyroscope temperature correction due to the non-linear characteristics. The three-axis gyroscope data were collected within a temperature range of −30 °C to +70 °C, with seven temperature points. | Confidential | Variance, Maximum error | SVM provided the following advantages: good generalization ability, easy training, the ability to fit non-linear temperature changes, and a globally optimal solution. The issue of parameter optimization was solved by using GA. |
| [51] | Six-axis MEMS accelerometer, high-precision rotary table, a thermostat, a resonant accelerometer, and the testing circuit were used to build the thermal calibration system. The sensor chip was mounted on a side-brazed ceramic package through silver conductive epoxy adhesive to collect the data. The model's initial temperature was 293.15 K, and it was assumed that the structure had no internal tension at this temperature. The materials expanded or contracted as the temperature changed, putting the six DETFs under uneven thermal stress. | Confidential | Maximum percentage error | The polynomial fitting method still lacks accuracy when used with MEMS accelerometers and produces a systematic error because it does not consider the high-order non-linearities in the sensor errors. NN can handle these issues efficiently. Combining GA with the BP-NN network helped find the global optimum with a faster convergence rate and low error. |
| [52] | Temperature signal data were collected from a MEMS gyroscope. IPSO-VMD decomposed the gyro signal and obtained the ideal VMD parameters. Using sample entropy (SE), the sequence complexity was calculated and divided into three categories: noise, mixed, and feature. | Confidential | Sample entropy, Allan variance | The fusion algorithm helped with strong learning, better model building, efficient global search ability, and faster convergence speed. |
| [54] | The authors used both synthetic [98] and real ground vibration data [99], labeled with peak acceleration and earthquake magnitude, respectively. For the synthetic data, artificial noise was introduced in a controlled manner. | Mixed | Accuracy | CNN models can handle non-linear, erroneous, and non-convex issues. |
| [55] | The earthquake dataset used was highly imbalanced and heavily noisy. It was collected from the National Research Institute of Earth Science and Disaster Prevention (NIED) and USGS (United States Geological Survey). The time-series non-earthquake-related data were captured using a mobile device for several hours. The final dataset contained earthquake, noise, and walk-and-wait data. | Public | Accuracy, ROC curve, Precision, Recall, F1 score | The imbalanced, noisy dataset can be efficiently handled by ANN with very few false predictions. |
| [56] | Earthquake data containing P and S wave picks were considered. They were collected from the Southern California Seismic Network (SCSN). The data were continuous with a 4-second window with noise present. A high-pass filter was used to remove the noise. | Public | Precision, Recall | DL is efficient for object recognition tasks due to its robust generalization representation. Learning from millions of samples without explicit characteristic recognition helps detect objects better and build a reliable earthquake warning system. |
| [57] | Time series signals were used to detect events. The first difficulty of the data was that the length of an earthquake occurrence varied greatly; the second was that the generated proposals were temporally correlated. In addition, not all the positive events were annotated correctly, which increased the noise in the data. | Confidential | AP | Using CC-RCNN helped find an optimized multi-scale temporal correlation of time series data to detect events of various lengths. The deep neural network has considerably improved object detection in 2D picture data. |
| [59] | Data were obtained from the National Research Institute for Earth Science and Disaster Resilience (NIED) and the United States Geological Survey (USGS), along with the authors' data. Noise data and human activity data were collected by using low-cost MEMS sensors. | Mixed | MSE, MAPE, WD error | EQGAN efficiently analyzed complex high-dimensional, time-series data structures to generate seismic sequences that were satisfactory in both quality and quantity. |
| [62] | Data were obtained from the UCI website with six pedestrian motion mode recognition activities. The testers recorded three axial linear accelerations and three axial angular velocities at a constant rate of 50 Hz using the smartphone's integrated accelerometer and gyroscope. The data contained noise. The Butterworth low-pass filter was used to separate acceleration signal components. | Public | Accuracy, Recall, F-measure, Precision | CNN is highly effective due to the utilization of convolutional filter hierarchies, which sequentially extracted feature representations of increasing complexity from raw sensor measurements. The unique internal structure of LSTM models provided a memory with a forget function to efficiently and selectively focus on those sensory data that were important to the recognition process. |
| [64] | The dataset contained whole-body movement and hand gesture information. The whole-body movement data were retrieved from 10 modality sensors with 18 classes. Hand gesture data contained 12 classes and were collected from body sensors containing a three-axis accelerometer and a two-axis gyroscope. | Public | Accuracy, Average F-measure, Normalized F-measure | Obtaining useful features for activity identification was difficult, as they were mostly handcrafted. This was avoided by utilizing the task-dependent feature extraction property of the CNN model. The advantages were: extracted features had a stronger ability to distinguish across different categories of human activity, and feature extraction and classification were unified. |
| [65] | One-dimensional MEMS inertial data were used and converted into 2D gray images after feature extraction with a sliding window. The sampling was performed non-uniformly with different temperature points for MEMS gyroscopes and accelerometers. | Confidential | Accuracy, Confusion matrix | CNN improved fault diagnosis in UAV manufacturing by accounting for the sensor temperature. The convergence rate of the proposed algorithm was faster, which helped to train the model with a low amount of data available. |
| [66] | Data were obtained from an LVDT sensor, and control feedback signals were accessible with a limited number of precise measurements. An additional 2% white noise was introduced in the data. Three parallel blocks with new data fusion integrated with ANN and discrete wavelet transform methods were used to detect three major faults: null bias current, actuator leakage coefficient, and internal leakage. Data fusion is important to increase the accuracy of the decision-making process. | Confidential | MSE | Discrete wavelet transform can retrieve information from both frequency and location signals, making it an efficient fault detection tool. Low data availability does not impact the performance of ANN for estimating complicated non-linear functions. |
| [67] | Seeded fault data were obtained from Case Western Reserve University; the test setup consisted of a two-horsepower (hp) electric motor, a dynamometer, and a torque transducer. The dataset includes vibration acceleration signals for bearings with no flaws and bearings with faults in the inner raceway, outer raceway, and rolling element. The dataset comprises signals recorded for bearings with three fault severity levels at four different shaft loads for each fault condition. | Public | Accuracy | To counteract the non-stationary behavior of the signals brought on by various crack sizes, the hybrid feature pool extracted more discriminating information from the raw vibration signals. More discriminating data enabled the next classifier to divide data into appropriate groupings. |
| [68] | Sensor data contained information related to high and low-pressure turbine speed, compressor outlet temperature and pressure, and low-pressure turbine outlet temperature and pressure. | Confidential | MSE | LS-SVM could implement the structural risk minimization principle with a new learning method with low risk and good generalization ability for unseen samples. |
| [69] | Gyroscope and accelerometer data were synthetically generated with two label classes: faulty and nominal flight conditions. The training was performed offline, but the prediction was made online. | Confidential | Posterior probabilities | SVM offers good generalization without the risk of over-fitting and avoids local minima. It is useful for high-dimensional, non-linear systems. |
| [72] | To address the limited availability of training data, the authors employed transfer learning to train their model on top of the already learned Common Objects in Context (COCO) model, which was previously trained and made publicly available. The captured images contained information related to MEMS wafer surface defects. | Confidential | Confusion matrix | Faster R-CNN was able to perform detection, classification, and localization altogether. |
| [77] | A MEMS simulator called EM3DS6.2.14 was used to simulate faults using RF MEMS and an Opamp. Faults related to stiction, curvature, fatigue and brittleness, etch variation, and contamination were generated. | Confidential | Confusion matrix | LVQ was used to find the optimum kernel number automatically with a faster learning speed, overcoming the drawback of the RHPNN algorithm in finding the optimum kernel number in the second layer and improving the performance. |
| [78] | The defects obtained from the MEMS pressure-sensor chips were of different sizes and scales, and the amount of data was limited, containing an unequal distribution. The defect images were annotated manually. In terms of defects, chip scratch, chip damage, gold-wire bonding, glue-surface wrinkles, and aluminum-wire bonding were considered targets. | Confidential | AP, MAP | ADCNN can detect small changes in the MEMS pressure-sensor chip-packaging process from noisy image data by using random-data augmentation and defect classifiers, which is not possible using the traditional RCNN. |
| [79] | The data were generated containing non-linear relationships among the damping factor, resonance frequency, Brownian noise, mass, and epitaxial layer thickness. Training samples were generated using Monte Carlo simulations that contained varying epitaxy edge loss, epitaxy thickness, offset, and cavity pressure. | Confidential | RMSE | DNN is useful for detecting relationships among non-linear data. Using simulated data for offline training of DNN helped with accurate parameter extraction and was less time-consuming than ML methods. A single time-efficient forward pass was able to identify different system parameters. |
| [82] | Snapshots of the microbeam were obtained at fixed intervals with two different step-voltages, from which eigenvectors were obtained. | Confidential | Author defined error | GHA did not require computing the input correlation matrix. The method posed potential advantages when the measured data were huge because it only had to determine a small number of necessary basis functions, which could be learned directly from the input data. |
| [84] | MEMS micro-engine data containing attributes, such as humidity, operating frequency, resonant frequency, spring quotient, and force component, were collected. Data were limited and contained both bi-modal and uni-modal distributions. | Confidential | MAE, SD, R-squared, MSE, RMSE | Trained NN could accurately estimate the reliability by mapping the attributes to a reference value and minimizing the error. As a result, it could optimize the process by sensitivity analysis of the process parameters. |
| [85] | Simulated MEMS resonator image data contained several unique resonator patterns. | Confidential | Authors' defined RA | DL model could effectively create non-linear combinations of the target structure voxel by voxel faster and automatically without any constraints. |
| [86] | Raw visible surface defect images were collected [100] containing eight defect types, without a uniform definition for defects. Only a few wafers contained multiple defects, whereas a single defect type was present in others. | Public | Accuracy, Precision, Recall, F-measure, AUC | Transfer learning was used for efficient parameter estimation for faster training of the CNN model to detect wafer defects on a training subset. |
| [88] | The data contained multiple defect patterns and noise. The data used contained both real and synthetic data, and the splitter segregated single and mixed defects. | Confidential | Accuracy, SD, ROC, RMSE | The advantage of the proposed method was that it could generate high-level features using low-level features with mixed patterns, which is impossible using shallow ML models. RGRN was used to detect single defects, and DSCN was used to detect mixed defect patterns. |
| [91] | Circular disk resonator two-dimensional images of 100 pixels by 100 pixels were used where the void represented the resonator body and the black-and-white part contained structural information. Four vibrational modes were the topic of interest: two torsional, one flexural, and one in-plane rotational mode. | Confidential | Accuracy, Run-time | The ResNet model could learn from the complex physical, structural patterns, which cannot be represented explicitly. The model provided an accurate and much faster analysis than the traditional FEA. |

AI-based solutions are primarily data-driven, which is why, after the growth of Big Data in Industry 4.0 at the beginning of the 2000s, the adoption of such approaches was more welcomed in different industrial applications. This also poses challenges, such as data insufficiency in certain use cases, data labeling, and data security issues. Table 4 shows that 63% of the data were confidential, whereas only 37% were publicly available. Data imbalance is another significant issue faced by many researchers, as highlighted in the articles [34,35,54,55,78,86]. It was reported in the article [35] that data collection was challenging and time-consuming, and not all the traditional features, such as spread, level, and frequency centroid, were used, which could have further improved the model accuracy. In article [40], it was mentioned that the collected raw data were noisy, and the computing power of the system used was limited. Moreover, the limited layers of the proposed model were not optimal for generalized prediction. As can be seen in the articles [40][41][42][43]49,50], limited computational power is a great challenge faced by many. Sometimes, the model was trained with a fixed learning rate and batch size, which can be further optimized and tuned to improve the model performance [43]. Moreover, even if the data were available, the amount or quality of data was not satisfactory, as reported in [40][41][42][43]48,49,55,56,62,64,78,88], or sometimes the data collection process was too exhaustive [72,77,85]. This is why synthetic data generation [101] is gaining more and more attention for robust model creation.
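The data imbalance issue raised above also explains why many of the reviewed articles report precision, recall, and F1 score rather than accuracy alone. The following minimal sketch, using a hypothetical test lot and a deliberately trivial classifier (both assumptions for illustration), shows how accuracy can look good while the faulty class is never detected.

```python
def classification_report(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive (faulty) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical lot: 95 good sensors (0) and 5 faulty sensors (1).
y_true = [0] * 95 + [1] * 5
always_good = [0] * 100  # trivial classifier that never flags a fault
report = classification_report(y_true, always_good)
```

Here the accuracy is 0.95 although recall and F1 for the faulty class are both zero, which is why resampling techniques such as SMOTE or imbalance-aware metrics are used in the reviewed studies.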
Other challenges exist from an adoption and implementation perspective as well. The following list summarizes the limitations of incorporating AI-based solutions in the MEMS field.
Challenges of AI implementation in MEMS manufacturing:
• Most AI-based algorithms are data-driven. A larger amount of data provides a better-trained and more robust model. Insufficient and unreliable data can degrade model performance during implementation.
• Data preprocessing is a crucial step to ensure data quality, i.e., the availability of labeled, noise-free data can ensure good training of the AI algorithm for higher accuracy. This can be a time-consuming and challenging task.
• More computational power is required compared to the traditional methods.
• Some algorithms suffer from over-fitting and bias.
• Explainability does not always exist.
• The plethora of available algorithms can make finding a fitting solution challenging and cumbersome.
• Cross-domain knowledge is required.
• Infrastructure compatibility poses one of the biggest challenges, as even small implementations may sometimes need large changes in the production process, which can be time-consuming.
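As an illustration of how over-fitting is commonly checked when data are scarce, the following is a minimal sketch of k-fold cross-validation index splitting. It is a generic technique, not the procedure of any reviewed article. The helper name `k_fold_indices` and the sample count are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for k contiguous folds.

    Each sample appears in exactly one validation fold. In practice the
    data would usually be shuffled beforehand; that step is omitted here.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        val = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, val
        start = stop


# Hypothetical example: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train, val in folds:
    print(f"train={train} val={val}")
```

A model is then trained on each training split and scored on the matching validation split. A large gap between training and validation scores across folds is a typical sign of over-fitting.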
Currently, there exist many data mining process models, such as CRISP-DM (Cross Industry Standard Process for Data Mining) [102], SEMMA (Sample, Explore, Modify, Model, and Assess) [103], and KDD (Knowledge Discovery in Databases) [104], where KDD describes the process and CRISP-DM and SEMMA describe the implementation. Although these are used intensively in small to large corporations, they suffer from a lack of problem clarity and contextualization, multiple reworks, iteration failures, etc. This is why MLOps (Machine Learning Operations) is now one of the most suitable and industry-oriented end-to-end solutions for the incorporation of AI [105][106][107]. This field is relatively new and thus needs more research work to be standardized.

Conclusions
This review paper summarizes current evolutionary research approaches for MEMS manufacturing and design optimization, from the device to the process. AI applications are used extensively in different areas of MEMS systems, as they offer several benefits, such as automation of processes, reduction of human error, predictive maintenance, efficient information handling, knowledge reproducibility, etc. In the reviewed studies, AI-enabled approaches outperformed traditional optimization methods. These new methods are capable of multi-objective design optimization in MEMS.
However, AI implementation has certain limitations when applied in industrial settings. Considerable effort and time must be invested in the data pre-processing stage. Data collection can be very demanding due to missing information, real data are highly unstructured and noisy, and unclear business requirement specifications make it more challenging to find the right solution. As many algorithms and solutions are available for AI, it becomes strenuous to choose the right one, which can further increase effort, time, and cost. Training models vary from case to case, which hinders the formulation of a generalized solution. Infrastructure limitations can also increase model training time. It has been noticed during this literature review that the focus of AI implementation is mostly on fault detection in production.
Although some papers have attempted to improve the MEMS design process using AI, the field still lacks sufficient data and a standardized framework for efficiently combining these two disciplines at the system-level design. This needs further discussion and research using an interdisciplinary approach. Given the complexity of the process flow, this is not an easy task and requires further analysis from a domain perspective. It is worth mentioning that infrastructure availability, platform compatibility, and data confidentiality pose additional challenges. Nonetheless, integrating AI into MEMS manufacturing is expected to provide significant benefits and is likely to shape the future of the field.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: