A Deep Reinforcement Learning Approach to Droplet Routing for Erroneous Digital Microfluidic Biochips

Kawakami, Tomohisa; Shiro, Chiharu; Nishikawa, Hiroki; Kong, Xiangbo; Tomiyama, Hiroyuki; Yamashita, Shigeru

doi:10.3390/s23218924

A Deep Reinforcement Learning Approach to Droplet Routing for Erroneous Digital Microfluidic Biochips †

by Tomohisa Kawakami ¹,*, Chiharu Shiro ¹, Hiroki Nishikawa ², Xiangbo Kong ³, Hiroyuki Tomiyama ¹ and Shigeru Yamashita ⁴

¹ Graduate School of Science and Engineering, Ritsumeikan University, Kusatsu 525-8577, Japan

² Graduate School of Information Science and Technology, Osaka University, Osaka 565-0871, Japan

³ Department of Intelligent Robotics, Faculty of Engineering, Toyama Prefectural University, Imizu 939-0398, Japan

⁴ College of Information Science and Engineering, Ritsumeikan University, Kusatsu 525-8577, Japan

* Author to whom correspondence should be addressed.

† This paper is an extended version of our paper published in Kawakami, T.; Shiro, C.; Nishikawa, H.; Kong, X.; Tomiyama, H.; Yamashita, S. A Deep Reinforcement Learning-based Routing Algorithm for Unknown Erroneous Cells in DMFBs. In Proceedings of the IEEE Interregional NEWCAS Conference (NEWCAS), Edinburgh, UK, 26–28 June 2023.

Sensors 2023, 23(21), 8924; https://doi.org/10.3390/s23218924

Submission received: 25 September 2023 / Revised: 27 October 2023 / Accepted: 31 October 2023 / Published: 2 November 2023

(This article belongs to the Section Biosensors)


Abstract

Digital microfluidic biochips (DMFBs), which are used in various fields like DNA analysis, clinical diagnosis, and PCR testing, have made biochemical experiments more compact, efficient, and user-friendly than the previous methods. However, their reliability is often compromised by their inability to adapt to all kinds of errors. Errors in biochips can be categorized into two types: known errors, and unknown errors. Known errors are detectable before the start of the routing process using sensors or cameras. Unknown errors, in contrast, only become apparent during the routing process and remain undetected by sensors or cameras, which can unexpectedly stop the routing process and diminish the reliability of biochips. This paper introduces a deep reinforcement learning-based routing algorithm, designed to manage not only known errors but also unknown errors. Our experiments demonstrated that our algorithm outperformed the previous ones in terms of the success rate of the routing, in the scenarios including both known errors and unknown errors. Additionally, our algorithm contributed to detecting unknown errors during the routing process, identifying the most efficient routing path with a high probability.

Keywords:

biochips; digital microfluidic biochips; deep reinforcement learning; optimization

1. Introduction

1.1. Digital Microfluidic Biochips (DMFBs)

Digital microfluidic biochips (DMFBs), a subtype of biochips, have transformed the biochemical process industry through their capability to automatically execute operations at a miniature scale. These diminutive, efficient, and user-friendly devices, often referred to as a “lab-on-chip”, mark a significant improvement over the traditional methodologies [1,2]. DMFBs serve a broad spectrum of applications, including DNA analysis, clinical diagnosis, and polymerase chain reaction (PCR) testing [3,4,5]. Figure 1 provides an overview of the wide range of DMF-based biomedical applications in the past decade. It illustrates the progression made in DMF-based immunoassays, molecular diagnosis, blood processing, and microbe detection on automated DMF platforms [6]. Additionally, the COVID-19 pandemic propelled DMFBs into the spotlight, due to their ability to deliver rapid and reliable diagnostic results. For instance, the National Institutes of Health (NIH), a leading U.S. medical research agency, instituted the Rapid Acceleration of Diagnostics (RADx) initiative. RADx’s primary objective was to expedite the development, validation, and commercialization of innovative diagnostic technologies for COVID-19, with biochip technologies playing a pivotal role [7].

Research on DMFBs is advancing on several fronts beyond error detection methods and droplet routing algorithms. For instance, biochip fabrication methodologies are constantly advancing [8], and their integration with emergent technologies such as 5G communication, the Internet-of-medical-things (IoMT), artificial intelligence (AI), and cloud computing is being explored [9]. This convergence is aiding progress towards developing the concept of a hospital-on-chip (HOC).

As various types of biochips are being developed, DMFBs distinguish themselves from their predecessors by allowing the manipulation of minuscule discrete droplets on a device [10,11,12]. They have the capability to generate droplets from a reservoir, split droplets, mix different droplets, and move multiple droplets simultaneously [13]. As illustrated in Figure 2, a droplet and hydrophobic insulation are placed between a ground electrode and a set of controllable electrodes. By varying the voltage across these controllable electrodes, DMFBs can manipulate droplets and perform a variety of movements. The electrowetting-on-dielectric (EWOD) technique, which manipulates the interfacial tension between a conductive fluid and a solid electrode via an applied electric field, is fundamental to the operation of DMFBs [14].

However, despite these advancements, DMFB reliability is still a significant concern. Previous research has identified several potential error sources within DMFBs, classifiable as known or unknown errors [15,16,17]. Known errors, such as cell degradation, droplet residues due to an unexpected surface tension, and obstacles impeding droplet movement, can be identified prior to the routing process, often being detected by sensors or cameras. While previous droplet routing algorithms have addressed these errors [18,19], unknown errors such as electrode breakdown, unexpected electrode shorts with neighboring electrodes, or fluctuations in temperature and heat that are undetectable by human or mechanical observers are more elusive and challenging.

Certain unknown errors only surface during the routing process, and they pose significant challenges, due to their unpredictability and difficulty of detection. Figure 3 illustrates a practical scenario where the conventional methods fall short in identifying and adapting to these unknown errors, which ultimately results in a failed routing. The traditional methods rely heavily on data provided by sensors or cameras prior to the routing, limiting their error detection capabilities to known issues like those depicted in Figure 3a. They consequently generate a routing path similar to what is shown in Figure 3b. However, when an unknown error is present in the predetermined routing path, the droplet becomes stuck in a single state, because these methods continue to apply voltage to the unidentified erroneous cell, as portrayed in Figure 3c. This issue stems from the fact that these methods establish a routing path before the routing begins, utilizing information from sensors or cameras. Unknown errors evade this detection system and only become evident during the routing process.

In actual biochip experiments, it is crucial to adapt accurately, in real time, not only to known errors but also to unknown errors, since one of the most important factors in biochemical experiments is reliability. Considering this issue, we propose an algorithm that considers both known and unknown errors, reflecting a more authentic biochip environment. By including not only known errors but also unknown ones in the training phase, our model can efficiently handle all types of errors, as shown in Figure 3d.

Not considering all kinds of errors may lead to system failures or accidents. A case in point was Illumina’s NeoPrep device. Introduced in 2015, this $40 K instrument utilized DMFBs to automate DNA sequencing sample preparation [20]. Despite its initial success in infectious disease testing and newborn disease detection, Illumina had to discontinue sales of NeoPrep in 2017 due to significant reliability issues, which indicates the importance of error handling in biochip experiments.

1.2. Deep Reinforcement Learning (DRL)

Deep reinforcement learning (DRL) is the combination of two methodologies: deep learning (DL) and reinforcement learning (RL). The main purpose of RL is to solve problems using intelligent agents that perform actions and aim to maximize the total reward within a given environment [21]. In addition to the RL approach, the development of DL has significantly contributed to solving more complex tasks. For instance, in 2013, a deep Q-learning algorithm was proposed as the first DRL algorithm [22]. In 2015, with the addition of experience replay and epsilon-greedy strategies, a DRL model outperformed human experts in some Atari games [23], which was a significant milestone in the development of DRL. Furthermore, the same year also witnessed the historic event in which AlphaGo, a DRL model, defeated a professional human Go player with a resounding score of 5-0 [24]. This accomplishment demonstrated the impressive potential of DRL on a global scale, as Go is a game known for its complexity and strategic depth. Today, the capabilities of DRL extend beyond game playing and into various complicated and high-stakes domains. The technology has been demonstrated in StarCraft at the grandmaster level [25], natural language processing [26], predictions of 3D models of protein structures [27], and so on.

1.3. Related Works

Since 2004, the synthesis process of digital microfluidic biochips (DMFBs) has involved several distinct steps. Initially, biologists developed a bioassay protocol [28]. This protocol was subsequently mapped onto designated electrode areas, also known as fluidic modules, which facilitated the execution of fluidic operations [29].

The transportation of droplets from one module to the next, known as droplet routing, is a critical part of the synthesis process and has been developed over the past few decades. For instance, Huang et al. introduced a fast routing method and performance-driven approach [30]. They defined an entropy-based routing technique for a better routing method but were unable to efficiently manage delayed delivery times. To address blockages in biochips, Keszocze et al. introduced an exact routing method, which guaranteed optimal solutions to routing paths [31]. Additionally, other strategies have been implemented, such as that of Pan et al., who calculated the Manhattan distance between the source and target to reduce the pin count [32]. Even though these innovative approaches brought significant improvements, they came with a trade-off, specifically affecting the resource efficiency of the system. Many other routing methods have also been proposed [33,34,35]. These methods, however, are static and ignore the fact that biochips have various kinds of errors, including known errors and unknown errors [15,16,17].

In order to address the degradation issue, which is a type of known error, an adaptive routing algorithm has been proposed [19]. This algorithm’s primary function is to identify the health conditions of the electrodes, thereby facilitating reliable fluidic operations using a deep reinforcement learning (DRL) algorithm. However, this approach exhibits a limitation in its adaptability to unknown errors. This limitation stems from the fact that the method obtains information about each electrode from a charge-coupled device (CCD) camera [36,37] before initiating the routing process. Consequently, this method falls short in handling the unknown errors that may occur during the routing process or errors that remain undetectable by the CCD camera.

Another notable work in error management involved the proposal of the design of fault-tolerant and dynamically reconfigurable microfluidic biochips [38]. The objective of this approach is to dynamically assign specific modules during the routing process, while taking into account the fault tolerance of these modules. This strategy proves effective in detecting, not just known errors, but also unknown errors, while efficiently utilizing module placement. Despite these advantages, the method exhibits limitations in adapting to the comprehensive range of errors. For example, should an electrode fail during the routing process, a droplet will be immobilized, leading to the failure of the routing process. Additionally, the reassignment of modules to other cells reduces the number of cells available for other droplets to use in parallel.

1.4. Paper Contributions

This paper is an extended version of [39], offering a more in-depth analysis of the algorithm’s performance and a broader range of experiments. The contributions of this paper are outlined below.

This paper presents a new deep reinforcement learning-based routing algorithm for digital microfluidic biochips (DMFBs);
It contributes to the field by addressing the crucial issue of error management in DMFBs, specifically both known and unknown errors. It proposes and tests an algorithm that can effectively handle different types of errors, potentially boosting the reliability and efficiency of biochips;
In addition to proposing a new algorithm, this paper conducted extensive experiments to compare the performance of this algorithm against existing ones. The comprehensive results demonstrated the superior performance of the proposed algorithm in terms of accuracy, optimality of the routing path, and error detection capability.

In this paper, we first introduce a new deep reinforcement learning-based routing algorithm for digital microfluidic biochips (DMFBs) in Section 2. We then provide a detailed description of the proposed framework in Section 2.1, delving into the particulars of the environment in Section 2.2 and details of the agent in Section 2.3. In the subsequent part of the paper, we focus on the verification of our proposition. Section 3 contains a comprehensive account of the experiments conducted. The experimental setup is detailed in Section 3.1, while Section 3.2 elucidates the process of agent training. Subsequent to these, Section 3.3 presents the results of the experiments. We finally draw the paper to a close in Section 4 with the conclusions, providing a summary of the key research findings and their implications for the relevant field.

2. Proposed DRL-Based Routing Algorithm

2.1. Framework Description

The framework we propose revolves around the interactive dynamics between an intelligent agent and a digital microfluidic biochip (DMFB) environment. As illustrated in Figure 4, the primary objective of this framework is to maximize the cumulative reward the agent obtains from the DMFB environment.

2.2. Environment

An integral part of the proposed framework is the creation of a digital microfluidic biochip (DMFB) environment. Initially, the DMFB environment designates the upper-left state as the starting point and the lower-right state as the goal. It also introduces both known and unknown errors at random positions, to consider real DMFB environments that have all kinds of errors. To ensure navigability, the DMFB environment employs the breadth-first search (BFS) algorithm. If the algorithm is unable to find a route from the start state to the goal state within the initially configured DMFB environment, the environment reinitializes itself. This process is repeated until at least one valid path from the starting state to the goal state is established.
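The initialization loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the (x, y) coordinate convention, the conversion of an error rate into a cell count by rounding, and the exclusion of the start and goal cells from the error positions are all assumptions made for the example.

```python
import random
from collections import deque


def has_path(blocked, w, h):
    """Breadth-first search from the upper-left cell to the lower-right cell."""
    start, goal = (0, 0), (w - 1, h - 1)
    if start in blocked or goal in blocked:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        if (x, y) == goal:
            return True
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            inside = 0 <= nxt[0] < w and 0 <= nxt[1] < h
            if inside and nxt not in blocked and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def init_environment(w, h, known_rate, unknown_rate, rng):
    """Re-sample error positions until the start and goal are connected."""
    # Assumption: the start and goal cells are never chosen as error cells.
    cells = [(x, y) for x in range(w) for y in range(h)
             if (x, y) not in ((0, 0), (w - 1, h - 1))]
    # Assumption: a rate is turned into a cell count by rounding.
    n_known = round(known_rate * w * h)
    n_unknown = round(unknown_rate * w * h)
    while True:
        picked = rng.sample(cells, n_known + n_unknown)
        known, unknown = set(picked[:n_known]), set(picked[n_known:])
        if has_path(known | unknown, w, h):
            return known, unknown
```

For example, `init_environment(10, 10, 0.05, 0.05, random.Random(0))` would correspond to a 10 × 10 chip with the (5, 5) error setting used later in the experiments.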

Upon successful initialization, the DMFB environment begins its interaction with the agent. It accepts an action input from the agent and, in response, provides the corresponding reward and state updates. Given the nature of DMFB operations, a droplet has four possible movements at any given moment: up, down, right, and left. Therefore, the DMFB environment also offers these four choices of action. The current state provided by the DMFB environment includes the droplet’s status, the goal state, and the presence of any known errors, which are provided from a CCD camera in real experiments [36,37]. Furthermore, if a droplet exhibits irregular movements, the DMFB environment incorporates unknown error information into the state. This ability to add unknown error information enhances the proposed algorithm’s adaptability to handling unexpected errors. The rewards provided by the DMFB environment to the agent are detailed in Table 1. The maximum number of steps, as mentioned in the table, is calculated as 2 × (w + h), where “w” represents the width and “h” denotes the height of the biochip. This framework allows the algorithm to navigate efficiently in a DMFB environment, making it an integral part of the system’s adaptive learning process.
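A minimal sketch of one environment step may help to make the state update concrete. It is an illustration under stated assumptions rather than the authors' code: the function and variable names are hypothetical, the droplet is assumed to stay in place when a move is blocked, and rewards are omitted because the values of Table 1 are not reproduced here.

```python
# Assumption: (x, y) coordinates with (0, 0) at the upper-left cell.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def max_steps(w, h):
    """Step limit stated in the text: 2 x (w + h)."""
    return 2 * (w + h)


def step(droplet, action, w, h, known, unknown):
    """Apply one action and return (new_droplet, known_errors_after_step)."""
    dx, dy = ACTIONS[action]
    target = (droplet[0] + dx, droplet[1] + dy)
    inside = 0 <= target[0] < w and 0 <= target[1] < h
    if not inside or target in known:
        # Assumption: a move off the chip or onto a known error leaves
        # the droplet where it is.
        return droplet, known
    if target in unknown:
        # The droplet fails to move, so the cell is revealed as erroneous
        # and added to the error information carried in the state.
        return droplet, known | {target}
    return target, known
```

Once an unknown error has been revealed in this way, it appears in the known-error information of every later state, so the agent can route around it.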

2.3. Agent

In our proposed framework, the agent is represented by a deep neural network (DNN), leveraging a convolutional neural network (CNN) [40] architecture. The fundamental purpose of the learning process is to find the optimal configuration of parameters that best fits the model. This process dynamically interacts with the digital microfluidic biochip (DMFB) environment, determining the agent’s behavior and its learning trajectory.

The CNN agent utilizes a 3D array input, which contains critical information pertaining to the current state of the droplet, the desired goal state, and the presence of any known errors. If we denote the size of the biochip (w × h), the size of the input array is then configured as (w, h, 3). This structure guarantees a comprehensive representation of the DMFB environment, enabling the agent to make informed and effective decisions.
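One possible way to build such an input is sketched below using plain nested lists. The channel order (droplet, goal, known errors) and the 0/1 encoding are assumptions made for illustration; the text specifies only the shape (w, h, 3) and the three kinds of information.

```python
def encode_state(w, h, droplet, goal, known_errors):
    """Build a (w, h, 3) nested list describing the current state.

    Assumed channel layout: 0 = droplet, 1 = goal, 2 = known errors.
    """
    state = [[[0.0, 0.0, 0.0] for _ in range(h)] for _ in range(w)]
    state[droplet[0]][droplet[1]][0] = 1.0
    state[goal[0]][goal[1]][1] = 1.0
    for x, y in known_errors:
        state[x][y][2] = 1.0
    return state
```

In this layout, an unknown error that is discovered during routing would simply be added to `known_errors` before the next state is encoded.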

The detailed architecture of the CNN can be seen in Table 2. It follows a layered structure, beginning with convolutional layers, which are primarily tasked with feature extraction. The first layer utilizes 32 filters, while the second and third layers use 64 filters each. All three layers employ the ReLU (rectified linear unit) activation function, introduced to add non-linearity into the model. Following the convolutional layers, two linear layers are deployed. The first possesses 256 nodes and also applies the ReLU activation function. The second linear layer, depending on its function, may have four nodes for the actor network or a single node for the critic network. The Softmax activation function is applied in this layer, enabling it to generate a probability distribution for the agent’s potential actions.
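To illustrate how the Softmax activation turns the four actor outputs into a probability distribution over actions, the following sketch applies it to a set of made-up output values. The numbers and the action order are hypothetical, and the sketch covers only the four-node actor output, not the convolutional layers of Table 2.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw network outputs."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Assumption: action order and logit values are illustrative only.
ACTIONS = ("up", "down", "right", "left")
probs = softmax([2.0, 1.0, 0.5, -1.0])
best = ACTIONS[probs.index(max(probs))]
```

The resulting probabilities are positive and sum to one, so they can be used directly to sample the next move during training or to pick the most likely move at evaluation time.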

As the agent continually interacts with and adjusts to the DMFB environment, it refines its understanding of the system. This leads to the optimization of its actions, thereby maximizing the cumulative rewards. As a result, the agent’s performance in navigating the DMFB environment improves.

3. Simulation Experiments

3.1. Simulation Setup

The simulation experiments conducted in this study involve a rigorous exploration of the performance and response of varying sizes of biochip. To ensure a thorough understanding of the system behavior, a total of nine different biochip sizes were evaluated. The specific sizes selected for this study included: 10 × 10, 10 × 15, 10 × 20, 15 × 15, 15 × 20, 15 × 25, 20 × 20, 20 × 25, and 25 × 25. Here, the size refers to the number of cells in the biochips.

In order to better understand the system’s robustness against errors and to simulate real-world conditions, error rates were systematically incorporated in the test cases. These error rates ranged from 0% to 10% for each biochip size. Specific combinations of known and unknown error rates were used, such as (0, 5), (5, 5), and (0, 10). Each tuple represents the known error rate first, followed by the unknown error rate. The simulation experiments were carried out on a machine with the following specifications:

GPU: GeForce RTX 3060 LHR
CPU: Core i7-12700F
RAM: 80 GB

3.2. Agent Training

In the training process, we leveraged the proximal policy optimization (PPO) algorithm [41]. The PPO algorithm is an advanced form of policy gradient method that incorporates principles of the advantage actor–critic (A2C) method [42]. It is an evolutionary development from the trust region policy optimization (TRPO) algorithm [43] and is favored for its balance between algorithmic complexity and performance outcomes. Our training process is structured into epochs, with each epoch encompassing 10,000 games. Following the conclusion of each epoch, we undertake a comprehensive evaluation of the model against a set of 100 diverse test cases. The model’s performance is considered satisfactory if it can successfully discover a routing path in every single test case.
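For readers unfamiliar with PPO, its central idea is a clipped surrogate objective that limits how far a single update can move the policy. The sketch below computes that objective for a small batch in plain Python. The clipping parameter of 0.2 is the commonly used default and an assumption here, since the hyperparameters of this study are not listed; the function name is likewise hypothetical.

```python
def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    ratios: new-policy probability divided by old-policy probability
            for each sampled action.
    advantages: advantage estimate for each sampled action.
    epsilon: clipping range (assumed value, not taken from the paper).
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - epsilon, min(r, 1.0 + epsilon))
        # Taking the minimum removes the incentive to push the ratio
        # outside [1 - epsilon, 1 + epsilon].
        total += min(r * a, clipped * a)
    return total / len(ratios)
```

With a positive advantage, a ratio of 1.5 contributes only as much as a ratio of 1.2 would, which is what keeps each policy update conservative.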

In Figure 5 and Figure 6, we provide a thorough depiction of the training trajectories, illustrating the accumulated rewards from the environment, the number of steps taken by the agent throughout the training process, and the losses of the actor and critic networks. These trajectories are shown for two different biochip sizes: 10 × 10 and 20 × 20.

Figure 5 provides a detailed visual breakdown of the trajectories for the 10 × 10 biochip size, considering an error rate of (5, 5), that is, a known error rate of 5% and an unknown error rate of 5%. The agent’s rewards, depicted in Figure 5a, gradually increase, suggesting the agent’s success in refining its policy to maximize rewards from the environment. Concurrently, the number of steps, displayed in Figure 5b, decreases, indicating that the agent learned to find the routing path more efficiently over the training period. The actor loss, exhibited in Figure 5c, corresponds to the loss of the actor network, which is responsible for guiding the agent’s policy. In contrast, the critic loss, displayed in Figure 5d, pertains to the loss of the critic network, which evaluates the agent’s value function. As shown in Figure 5, both the total and critic losses decreased as the training process advanced, while the actor loss hovered around zero. This trend indicated a continuous improvement in the agent’s performance, as it learned to balance exploration and exploitation. Remarkably, for this 10 × 10 biochip with an error rate of (5, 5), the agent required only approximately 22 min to satisfactorily accomplish the routing process for all test cases.

In comparison, Figure 6 traces the trajectories for a larger 20 × 20 biochip size, considering the same error rate denoted by (5, 5). In this case, the agent required 16 h to satisfactorily complete the routing process across all test cases. Despite the difference in training times for the two biochip sizes, our reinforcement learning agent demonstrated promising training in handling both small and larger problem instances.

3.3. Simulation Results

In Table 3, we provide data on the performance of the routing algorithms in the test cases. The table presents several simulated results that compared the performance of the proposed routing method versus an existing method [19] across various chip sizes and error rates. The table includes six columns, each representing a different factor. “Chip size” represents the physical dimensions of the chip being tested. “Error rate” is the frequency of known and unknown errors occurring, represented as a pair of rates (known errors, unknown errors). “Routing success” shows the success rate of the droplet in reaching its target; this is divided into the proposed method and the existing one. “Optimal path rate” is the rate at which the droplet uses the shortest routing path, computed using the breadth-first search (BFS) algorithm. “Unknown error detection” measures how many unknown errors were detected during the routing process. We will now dive deeper into interpreting these results and evaluating their implications.
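As an illustration of how an optimal path rate could be computed, the sketch below compares the length of each executed route with the shortest route length found by BFS. The function names and the data layout are hypothetical, and which error cells are treated as blocked when computing the reference shortest path is an assumption of this example.

```python
from collections import deque


def shortest_length(w, h, blocked, start, goal):
    """Number of moves on a shortest path, or None if the goal is unreachable."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            return dist[cell]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dx, cell[1] + dy)
            inside = 0 <= nxt[0] < w and 0 <= nxt[1] < h
            if inside and nxt not in blocked and nxt not in dist:
                dist[nxt] = dist[cell] + 1
                queue.append(nxt)
    return None


def optimal_path_rate(cases):
    """Fraction of cases whose route is as short as the BFS shortest path.

    cases: list of (w, h, blocked, route), where route is the list of cells
    visited by the droplet from start to goal (assumed layout).
    """
    optimal = 0
    for w, h, blocked, route in cases:
        best = shortest_length(w, h, blocked, route[0], route[-1])
        if best is not None and len(route) - 1 == best:
            optimal += 1
    return optimal / len(cases)
```

A route that detours around a newly discovered error is longer than the BFS reference and therefore counts as non-optimal under this definition.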

The experimental results showcase a significant comparison between the proposed routing algorithm and the existing methods across different chip sizes and error rates. The results make it evident that the proposed algorithm outperformed the existing methods in almost all aspects and conditions.

Starting with the chip size of 10 × 10, for all error rates, the proposed algorithm showcased a perfect routing success rate of 100%. This was substantially higher than the routing success rates of the existing methods, which varied from 58% to 62%. Such results illustrate the robustness of the proposed algorithm, even in the presence of errors. This trend was consistent across all chip sizes, where the proposed algorithm maintained a 100% success rate, while the existing methods showed a decreased success rate with larger chip sizes and higher error rates. For instance, for the 20 × 25 chip size with an error rate of (0, 10), the existing method had a significantly lower success rate of 6%, compared to the 100% success rate of the proposed algorithm. Figure 7a provides a visual comparison of the routing success between the existing method and the proposed method across different chip sizes.

Furthermore, the “optimal path rate” showed how efficiently the routing was performed. A higher rate indicated that the routing was carried out using the minimum number of cells, which signified a more efficient and optimal path. For all chip sizes and error rates, the proposed algorithm consistently outperformed the existing method, often achieving a 100% rate. This signified that, not only was the proposed algorithm more successful in routing, but it also did so in the most optimal way. However, it is important to understand why the optimal path rate is not always 100%. The routing process is a dynamic one, and during this process, if an unknown error is detected in a state near the droplet, the proposed algorithm avoids this error. This detour results in a path that sometimes deviates from the originally calculated optimal path, which, in turn, lowers the optimal path rate. This mechanism allows for a safer and more reliable routing process, even if this means straying from the optimal path. Figure 7b illustrates the optimal path rate for the proposed method across different chip sizes.

Lastly, “unknown error detection” captures the number of unknown errors detected during the routing process. A higher value is desirable, as this signifies a better error detection capability. This is not just about managing current routing tasks. Detecting unknown errors plays a critical role in facilitating smoother routing in the future. As an error is detected, it becomes a known factor that the routing algorithm can account for in subsequent computations. This enables the algorithm to make more informed routing decisions, effectively avoiding previously detected error zones. It is worth noting that, as the chip size and error rates increased, the proposed algorithm tended to detect more unknown errors, ranging from 0.34 to 2.55. This robust error detection capability, therefore, ensures not just the success of the current routing process, but it also significantly enhances the efficiency and reliability of future routing operations. Figure 7c shows the unknown error detection for the proposed method across the different chip sizes.

Overall, the experimental results provide compelling evidence of the superior performance of the proposed algorithm over existing methods. This superiority was observed across various chip sizes and error rates, highlighting the proposed algorithm’s robustness, efficiency, and improved error detection capabilities. These attributes make it a highly promising alternative for routing in biochips, potentially leading to more accurate results and fewer process interruptions due to errors.

4. Conclusions

In conclusion, this research introduced a deep reinforcement learning-based routing algorithm for digital microfluidic biochips (DMFBs). The significance of this work lies in the fact that it not only accounts for known errors but also effectively manages unknown errors, which have been a major reliability issue for biochips.

Our experimental results provide strong evidence of the algorithm’s superior performance in three key areas. First, it demonstrated an impressive success rate in routing, consistently outperforming the previous methods across all tested chip sizes and error rates. Second, the proposed algorithm showed exceptional proficiency in identifying the most efficient routing paths. Even in scenarios where unknown errors were detected during the routing process, the algorithm smartly adapted to the new circumstances, ensuring a successful routing outcome, even if this meant diverging from the initially computed optimal path. Finally, the algorithm’s capability to automatically detect unknown errors during the routing process was a critical asset. This feature did not just enhance the current routing process, it also improved the efficiency and reliability of future routing tasks. By turning unknown errors into known factors, the algorithm evolved to become more informed and adaptive, further strengthening its robustness. In terms of accuracy, the optimality of the routing path, and the number of detected unknown errors during the routing process, the proposed algorithm outperformed the existing ones, thus offering a substantial improvement for DMFB routing.

Given the critical role of DMFBs in various fields, including DNA analysis, clinical diagnosis, and PCR testing, the proposed algorithm’s superior performance and error management capability present a promising advance towards more reliable and efficient biochip operations. As we move forward, we aim to continue refining our algorithm and expanding its application, to further enhance DMFB reliability and efficiency.

Author Contributions

Conceptualization, T.K.; Supervision, C.S., H.N., X.K. and H.T.; Funding acquisition, S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partly supported by KAKENHI 20H04160 and 20H00590.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Azizipour, N.; Avazpour, R.; Rosenzweig, D.H.; Sawan, M.; Ajji, A. Evolution of biochip technology: A review from lab-on-a-chip to organ-on-a-chip. Micromachines 2020, 11, 599.
2. Su, F.; Chakrabarty, K. High-level synthesis of digital microfluidic biochips. ACM J. Emerg. Technol. Comput. Syst. (JETC) 2008, 3, 1–32.
3. Sista, R.S.; Ng, R.; Nuffer, M.; Basmajian, M.; Coyne, J.; Elderbroom, J.; Hull, D.; Kay, K.; Krishnamurthy, M.; Roberts, C.; et al. Digital microfluidic platform to maximize diagnostic tests with low sample volumes from newborns and pediatric patients. Diagnostics 2020, 10, 21.
4. Huang, S.; Connolly, J.; Khlystov, A.; Fair, R.B. Digital microfluidics for the detection of selected inorganic ions in aerosols. Sensors 2020, 20, 1281.
5. Ganguli, A.; Mostafa, A.; Berger, J.; Aydin, M.Y.; Sun, F.; Ramirez, S.A.S.d.; Valera, E.; Cunningham, B.T.; King, W.P.; Bashir, R. Rapid isothermal amplification and portable detection system for SARS-CoV-2. Proc. Natl. Acad. Sci. USA 2020, 117, 22727–22735.
6. Yang, C.; Gan, X.; Zeng, Y.; Xu, Z.; Xu, L.; Hu, C.; Ma, H.; Chai, B.; Hu, S.; Chai, Y. Advanced design and applications of digital microfluidics in biomedical fields: An update of recent progress. Biosens. Bioelectron. 2023, 242, 115723.
7. Schachter, S.C.; Dunlap, D.R.; Lam, W.A.; Manabe, Y.C.; Martin, G.S.; McFall, S.M. Future potential of Rapid Acceleration of Diagnostics (RADx Tech) in molecular diagnostics. Expert Rev. Mol. Diagn. 2021, 21, 251–253.
8. Dkhar, D.S.; Kumari, R.; Malode, S.J.; Shetti, N.P.; Chandra, P. Integrated lab-on-a-chip devices: Fabrication methodologies, transduction system for sensing purposes. J. Pharm. Biomed. Anal. 2023, 223, 115120.
9. Chaudhary, V.; Khanna, V.; Awan, H.T.A.; Singh, K.; Khalid, M.; Mishra, Y.K.; Bhansali, S.; Li, C.Z.; Kaushik, A. Towards hospital-on-chip supported by 2D MXenes-based 5th generation intelligent biosensors. Biosens. Bioelectron. 2023, 220, 114847.
10. Thorsen, T.; Maerkl, S.J.; Quake, S.R. Microfluidic large-scale integration. Science 2002, 298, 580–584.
11. Verpoorte, E.; De Rooij, N.F. Microfluidics meets MEMS. Proc. IEEE 2003, 91, 930–953.
12. Pollack, M.G. Electrowetting-Based Microactuation of Droplets for Digital Microfluidics; The Duke University: Durham, NC, USA, 2001.
13. Cho, S.K.; Moon, H.; Kim, C.J. Creating, transporting, cutting, and merging liquid droplets by electrowetting-based actuation for digital microfluidic circuits. J. Microelectromech. Syst. 2003, 12, 70–80.
14. Pollack, M.G.; Fair, R.B.; Shenderov, A.D. Electrowetting-based actuation of liquid droplets for microfluidic applications. Appl. Phys. Lett. 2000, 77, 1725–1726.
15. Verheijen, H.; Prins, M. Reversible electrowetting and trapping of charge: Model and experiments. Langmuir 1999, 15, 6616–6620.
16. Welch, E.R.F.; Lin, Y.Y.; Madison, A.; Fair, R.B. Picoliter DNA sequencing chemistry on an electrowetting-based digital microfluidic platform. Biotechnol. J. 2011, 6, 165–176.
17. Su, F.; Chakrabarty, K.; Fair, R.B. Microfluidics-based biochips: Technology issues, implementation platforms, and design-automation challenges. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2006, 25, 211–223.
18. Zhao, Y.; Chakrabarty, K. Cross-contamination avoidance for droplet routing in digital microfluidic biochips. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2012, 31, 817–830.
19. Liang, T.C.; Zhong, Z. Adaptive droplet routing in digital microfluidic biochips using deep reinforcement learning. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, Online, 13–18 July 2020.
20. Li, J.; Kim, C.-J. Current commercialization status of electrowetting-on-dielectric (EWOD) digital microfluidics. Lab Chip 2020, 20, 1705–1712.
21. Sutton, R.S. Dyna, an integrated architecture for learning, planning, and reacting. ACM Sigart Bull. 1991, 2, 160–163.
22. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; Riedmiller, M. Playing Atari with deep reinforcement learning. arXiv 2013, arXiv:1312.5602.
23. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
24. Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; Van Den Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M.; et al. Mastering the game of Go with deep neural networks and tree search. Nature 2016, 529, 484–489.
25. Vinyals, O.; Babuschkin, I.; Czarnecki, W.M.; Mathieu, M.; Dudzik, A.; Chung, J.; Choi, D.H.; Powell, R.; Ewalds, T.; Georgiev, P.; et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 2019, 575, 350–354.
26. Ouyang, L.; Wu, J.; Jiang, X.; Almeida, D.; Wainwright, C.; Mishkin, P.; Zhang, C.; Agarwal, S.; Slama, K.; Ray, A.; et al. Training language models to follow instructions with human feedback. Adv. Neural Inf. Process. Syst. 2022, 35, 27730–27744.
27. Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583–589.
28. Su, F.; Chakrabarty, K. Architectural-level synthesis of digital microfluidics-based biochips. In Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2004. ICCAD-2004, San Jose, CA, USA, 7–11 November 2004; pp. 223–228.
29. Chakrabarty, K.; Fair, R.B.; Zeng, J. Design tools for digital microfluidic biochips: Toward functional diversification and More than Moore. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2010, 29, 1001–1017.
30. Huang, T.W.; Ho, T.Y. A fast routability-and performance-driven droplet routing algorithm for digital microfluidic biochips. In Proceedings of the 2009 IEEE International Conference on Computer Design, Lake Tahoe, CA, USA, 4–7 October 2009; pp. 445–450.
31. Keszocze, O.; Wille, R.; Drechsler, R. Exact routing for digital microfluidic biochips with temporary blockages. In Proceedings of the 2014 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Jose, CA, USA, 2–6 November 2014; pp. 405–410.
32. Pan, I.; Samanta, T. Weighted optimization of various parameters for droplet routing in digital microfluidic biochips. In Recent Advances in Intelligent Informatics, Proceedings of the Second International Symposium on Intelligent Informatics (ISI’13), Mysore, India, 23–24 August 2013; Springer: Cham, Switzerland, 2013; pp. 131–139.
33. Su, F.; Chakrabarty, K. Yield enhancement of reconfigurable microfluidics-based biochips using interstitial redundancy. ACM J. Emerg. Technol. Comput. Syst. (JETC) 2006, 2, 104–128.
34. Xu, T.; Chakrabarty, K. Integrated droplet routing in the synthesis of microfluidic biochips. In Proceedings of the DAC07: The 44th Annual Design Automation Conference 2007, San Diego, CA, USA, 4–8 June 2007; pp. 948–953.
35. Zhao, Y.; Chakrabarty, K. Simultaneous optimization of droplet routing and control-pin mapping to electrodes in digital microfluidic biochips. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2012, 31, 242–254.
36. Luo, Y.; Chakrabarty, K.; Ho, T.Y. Error recovery in cyberphysical digital microfluidic biochips. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2012, 32, 59–72.
37. Willsey, M.; Stephenson, A.P.; Takahashi, C.; Vaid, P.; Nguyen, B.H.; Piszczek, M.; Betts, C.; Newman, S.; Joshi, S.; Strauss, K.; et al. Puddle: A dynamic, error-correcting, full-stack microfluidics platform. In Proceedings of the ASPLOS’19: Architectural Support for Programming Languages and Operating Systems, Providence, RI, USA, 13–17 April 2019; pp. 183–197.
38. Su, F.; Chakrabarty, K. Design of fault-tolerant and dynamically-reconfigurable microfluidic biochips. In Proceedings of the Design, Automation and Test in Europe, Munich, Germany, 7–11 March 2005; pp. 1202–1207.
39. Kawakami, T.; Shiro, C.; Nishikawa, H.; Kong, X.; Tomiyama, H.; Yamashita, S. A Deep Reinforcement Learning-based Routing Algorithm for Unknown Erroneous Cells in DMFBs. In Proceedings of the 2023 21st IEEE Interregional NEWCAS Conference (NEWCAS), Edinburgh, UK, 26–28 June 2023; pp. 1–5.
40. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25.
41. Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal policy optimization algorithms. arXiv 2017, arXiv:1707.06347.
42. Mnih, V.; Badia, A.P.; Mirza, M.; Graves, A.; Lillicrap, T.; Harley, T.; Silver, D.; Kavukcuoglu, K. Asynchronous methods for deep reinforcement learning. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; pp. 1928–1937.
43. Schulman, J.; Levine, S.; Abbeel, P.; Jordan, M.; Moritz, P. Trust region policy optimization. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 7–9 July 2015; pp. 1889–1897.

Figure 1. Overview of DMF-based biomedical applications [6].

Figure 2. Cross-sectional view of a DMFB [12].

Figure 3. Difference between the previous methods and our method in a situation with unknown errors. (a) An initial DMFB environment. (b) A routing path of the previous methods. (c) A failure case of the previous methods. (d) A success case of our method.

Figure 4. The proposed framework for the interaction between the agent and the DMFB environment. The left image shows the agent, implemented as a deep neural network (DNN), while the right image shows the initial state of the DMFB environment.

Figure 5. The training trajectory of the size $10 \times 10$ biochip. (a) The trajectory of the reward value. (b) The trajectory of the number of steps. (c) The trajectory of the actor loss. (d) The trajectory of the critic loss.

Figure 6. The training trajectory of the size $20 \times 20$ biochip. (a) The trajectory of the reward value. (b) The trajectory of the number of steps. (c) The trajectory of the actor loss. (d) The trajectory of the critic loss.

Figure 7. Comparison of experimental results across different chip sizes. (a) Routing success for the existing method and the proposed method. (b) Optimal path rate for the proposed method. (c) Unknown error detection for the proposed method.

Table 1. Reward function for the DMFB Environment.

State	Reward
Reach the goal state	0
Reach the maximum step number	−1.0
Any other state	−0.1
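
The reward rule in Table 1 can be written as a small function. The following is a minimal sketch, not the authors' implementation: the names `reward` and `episode_return`, the argument names, and the choice to check the goal before the step limit are all assumptions made for illustration.

```python
from typing import Optional

GOAL_REWARD = 0.0      # Table 1: reach the goal state
TIMEOUT_REWARD = -1.0  # Table 1: reach the maximum step number
STEP_REWARD = -0.1     # Table 1: any other state


def reward(reached_goal: bool, step: int, max_steps: int) -> float:
    """Reward for the state reached after `step` moves (hypothetical helper)."""
    if reached_goal:
        # Assumption: the goal check takes priority over the step limit.
        return GOAL_REWARD
    if step >= max_steps:
        return TIMEOUT_REWARD
    return STEP_REWARD


def episode_return(steps_to_goal: Optional[int], max_steps: int) -> float:
    """Undiscounted sum of rewards for one episode (hypothetical helper).

    `steps_to_goal` is the step at which the droplet reaches the goal,
    or None if it never does.
    """
    total = 0.0
    for step in range(1, max_steps + 1):
        reached = steps_to_goal is not None and step == steps_to_goal
        total += reward(reached, step, max_steps)
        if reached:
            break
    return total
```

Under these assumptions, every extra move costs 0.1 and running out of steps costs a further 1.0, so a shorter route to the goal always yields a higher return than a longer one or a timeout.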

Table 2. Convolutional Neural Network Structure.

Type	Depth	Activation	Kernel	Padding
Convolution	32	ReLU	3	1
Convolution	64	ReLU	3	1
Convolution	64	ReLU	3	0
Linear	256	ReLU	N/A	N/A
Linear	4 (1) ^a	Softmax	N/A	N/A

^a For the actor network, the depth is 4; for the critic network, the depth is 1.
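
To make the layer sizes in Table 2 concrete, the sketch below computes how many features the last convolution hands to the first linear layer for a given chip size. It is illustrative only and rests on assumptions the table does not state: a stride of 1, "Depth" read as the number of output channels, and the full chip grid used as the input. The helper names `conv_out` and `flattened_features` are hypothetical.

```python
# (output channels, kernel size, padding) for the three convolutions in Table 2
CONV_LAYERS = [(32, 3, 1), (64, 3, 1), (64, 3, 0)]


def conv_out(size: int, kernel: int, padding: int, stride: int = 1) -> int:
    """Standard output length of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(height: int, width: int) -> int:
    """Number of values fed to the first linear layer (assumes stride 1)."""
    channels = 0
    for channels, kernel, padding in CONV_LAYERS:
        height = conv_out(height, kernel, padding)
        width = conv_out(width, kernel, padding)
    return channels * height * width


if __name__ == "__main__":
    for rows, cols in [(10, 10), (20, 20), (25, 25)]:
        print(f"{rows} x {cols}: {flattened_features(rows, cols)} features")
```

With padding 1 and kernel 3, the first two convolutions preserve the grid size, and the unpadded third one shrinks each dimension by 2. The 256-unit linear layer therefore receives an input whose length grows with the chip area.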

Table 3. Experimental Results.

Chip Size	Error Rate	Routing Success (Existing Method)	Routing Success (Proposed Method)	Optimal Path Rate (Proposed Method)	Unknown Error Detection (Proposed Method)
$10 \times 10$	(0, 5)	62	100	100	0.34
	(5, 5)	58	100	99	0.42
	(0, 10)	58	100	99	0.74
$10 \times 15$	(0, 5)	74	100	100	0.35
	(5, 5)	69	100	100	0.30
	(0, 10)	32	100	99	0.82
$10 \times 20$	(0, 5)	52	100	100	0.54
	(5, 5)	55	100	93	0.51
	(0, 10)	2	100	94	1.25
$15 \times 15$	(0, 5)	53	100	100	0.49
	(5, 5)	53	100	98	0.60
	(0, 10)	22	100	94	1.27
$15 \times 20$	(0, 5)	62	100	99	0.79
	(5, 5)	42	100	97	0.76
	(0, 10)	16	100	95	1.36
$15 \times 25$	(0, 5)	46	100	100	0.74
	(5, 5)	43	100	93	0.87
	(0, 10)	9	100	95	1.36
$20 \times 20$	(0, 5)	38	100	98	0.98
	(5, 5)	30	100	91	0.91
	(0, 10)	17	100	89	0.89
$20 \times 25$	(0, 5)	31	100	98	1.18
	(5, 5)	34	100	91	1.01
	(0, 10)	6	100	92	2.06
$25 \times 25$	(0, 5)	27	100	93	0.93
	(5, 5)	33	100	90	1.30
	(0, 10)	8	100	84	2.55

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
