Coupling Bulk Phase Separation of Disordered Proteins to Membrane Domain Formation in Molecular Simulations on a Bespoke Compute Fabric

Phospholipid membranes surround the cell and its internal organelles, and their multicomponent nature allows the formation of domains that are important in cellular signalling, the immune system, and bacterial infection. Cytoplasmic compartments are also created by the phase separation of intrinsically disordered proteins into biomolecular condensates. The ubiquity of lipid membranes and protein condensates raises the question of how three-dimensional droplets might interact with two-dimensional domains, and whether this coupling has physiological or pathological importance. Here, we explore the equilibrium morphologies of a dilute phase of a model disordered protein interacting with an ideal-mixing, two-component lipid membrane using coarse-grained molecular simulations. We find that the proteins can wet the membrane with and without domain formation, and form phase separated droplets bound to membrane domains. Results from much larger simulations performed on a novel non-von-Neumann compute architecture called POETS, which greatly accelerates their execution compared to conventional hardware, confirm the observations. Reducing the wall clock time for such simulations requires new architectures and computational techniques. We demonstrate here an inter-disciplinary approach that uses real-world biophysical questions to drive the development of new computing hardware and simulation algorithms.


Introduction
Soft surfaces are ubiquitous in living cells and are typified by phospholipid membranes that bound the cell and its internal organelles [1]. Membranes allow compositionallydistinct environments for incompatible biochemistry and provide specialized surfaces to mediate material and information transport. Although viewed originally as merely structural boundaries, they are now recognized as active participants in cellular dynamics [2,3]. They undergo morphological transitions between distinct shapes in response to weak external influences [4], and in turn exert forces on membrane-associated particles or materials [5]. Their multicomponent nature allows regulated demixing of a subset of their component lipids and proteins to assemble into localized domains to carry out transient functions [3,6]. Pathological domain formation occurs when bacteria such as E. Coli and Shigelle release nm-sized rigid toxin particles in order to infect the cell [7]. Toxin particles adsorb to the When both interactions are weak, the membrane is well mixed and the IDPs are dispersed in the solvent (a, IDPs not shown). In the absence of an attraction of the IDPs for the minority phase lipids, we expected the membrane to be well mixed, and the IDPs dispersed in the bulk solvent if their self-attraction is low (as in a), or phase separated, if their self-attraction is high (b). Increasing the attraction of the IDPs for the lipid headgroups leads to a morphology that is controlled by their self-attraction: if this is low, they may form a thin wetting layer (c), while a strong self-attraction may lead to a phase separated droplet adhering to a minority phase lipid domain (d). We observed all these results in the simulations, indicating that the equilibrium states of the membrane and IDPs are accessible to DPD simulations. Theoretical calculations have predicted a range of wetting states resulting from the adsorption of polymeric solute molecules on a membrane surface. In particular, it was found that surface phase transitions can occur under conditions in which bulk phase separation is impossible at any solute concentration [27]. Monte Carlo simulations have been used to study the cooperativity of receptor-ligand binding and domain formation in membrane binding [45]. But a molecular-scale picture of the structural organization of membrane-associated BCs is still missing.
Although our system contains only three molecular species (not including the solvent), a large number of parameters must be specified to define its state. Additionally, the simulated system is relatively small (~10-100 nanometres in linear dimension), and we may legitimately ask: What are the prospects of improving its biological fidelity? In considering this question, we encounter the barrier of the rapidly increasing computational cost of simulating a system of increasing size while retaining near-molecular resolution.
Biological membranes evolve on multiple timescales that are hard to span even with coarse-grained simulation techniques [46][47][48][49]. The resources needed to simulate a membrane of linear dimension L on a single compute core scale at least as L 5 , allowing for the spatial increase and time for diffusive processes to propagate, so it is challenging to extend such simulations to near-cell scale problems (~microns). The Folding@home consortium recently harnessed more than one million devices across the world to explore conformational transitions of the SARS-CoV-2 spike protein with an effective performance of 1.01 exaflops [50]. Even this extraordinary power is not sufficient to simulate the interaction of, for example, a complete virus particle with the host cell membrane [49]. This is not a new problem. Conventional parallel programming paradigms using, e.g., the message passing interface (MPI) system, increase the simulated volume by dedicating many compute cores to one simulation [51]. However, they are limited by the requirement that the compute cost per core must exceed the messaging cost, which necessitates a large number of particles to be assigned to each core. This limitation is not removed Although our system contains only three molecular species (not including the solvent), a large number of parameters must be specified to define its state. Additionally, the simulated system is relatively small (~10-100 nanometres in linear dimension), and we may legitimately ask: What are the prospects of improving its biological fidelity? In considering this question, we encounter the barrier of the rapidly increasing computational cost of simulating a system of increasing size while retaining near-molecular resolution.
Biological membranes evolve on multiple timescales that are hard to span even with coarse-grained simulation techniques [46][47][48][49]. The resources needed to simulate a membrane of linear dimension L on a single compute core scale at least as L 5 , allowing for the spatial increase and time for diffusive processes to propagate, so it is challenging to extend such simulations to near-cell scale problems (~microns). The Folding@home consortium recently harnessed more than one million devices across the world to explore conformational transitions of the SARS-CoV-2 spike protein with an effective performance of 1.01 exaflops [50]. Even this extraordinary power is not sufficient to simulate the interaction of, for example, a complete virus particle with the host cell membrane [49]. This is not a new problem. Conventional parallel programming paradigms using, e.g., the message passing interface (MPI) system, increase the simulated volume by dedicating many compute cores to one simulation [51]. However, they are limited by the requirement that the compute cost per core must exceed the messaging cost, which necessitates a large number of particles to be assigned to each core. This limitation is not removed by adding more cores, but it can be eliminated by the elegant scheme we describe below, and which we have used to carry out large-scale simulations to validate our main results.
Since 2016, the Engineering and Physical Sciences Research Council of the UK has funded a project called POETS-Partially-Ordered Event-Triggered Systems (POETS)-that applies the non-von-Neumann computer architecture called event-based computing to large-scale computational problems [52]. POETS is not a general-purpose computer: Each application must be transformed into a graph in order to run on POETS, and reimplemented in the POETS domain specific language. We took the open source DPD code Osprey-DPD [53], converted it to run in POETS, and used it to explore the larger system size presented here. POETS-DPD allowed us to simulate larger systems at a reduced costboth in terms of wall clock time and electricity consumed-compared to conventional CPU/GPU hardware [54]. In POETS-DPD, the simulation volume is divided into a grid of cells similar to that used in conventional MPI-based parallel schemes. Crucially, in contrast to MPI-based simulations, each grid cell contains only a few particles and is assigned to its own compute core, so each grid cell's computing step is very fast. Messaging between cores is executed in hardware and is also fast. Because the limitation of compute cost >> messaging cost has been removed, the system as a whole can be scaled to any volume by simply adding more cores, and it exhibits almost perfect weak scaling.
We showcase in this special issue results obtained with conventional DPD and POETS-DPD, and demonstrate that event-based computing opens up exciting pathways to near cell-scale computational exploration. We use an inter-disciplinary approach that combines the development of new hardware and simulation algorithms on the bespoke computational hardware with the real-world biophysical problem of predicting the influence of protein phase separation on membrane organization. The POETS-DPD hardware and software is still under development, but the initial version used here provides functionally correct answers at speeds which are substantially faster than existing systems, and allows us to validate the dynamical phenomena we observe between small and large systems over long timescales.
The principal contributions of this paper are two-fold:

1.
We present a simulation-based exploration of the morphological outcomes resulting from the 3D phase separation of intrinsically-disordered proteins near a 2D lipid membrane; 2.
We describe the novel POETS compute architecture and the algorithm used in POETS-DPD, and demonstrate the power of this approach by simulating selected systems at a fraction of the wall clock time: a 14 cpu-day conventional DPD simulation is completed by POETS-DPD in 8 h, a speedup of more than a factor of 40.
The remainder of the paper is organized as follows. In Section 2, we briefly describe the dissipative particle dynamics simulation technique and the event-based hardware and POETS-DPD implementation, referring the reader to the literature for more details. Section 3 presents our main results on the equilibrium morphologies observed in the simulations. Section 4 discusses the powerful advantage of POETS-DPD by highlighting much larger systems. Finally, in Section 5, we illuminate the path ahead for using massively parallel POETS-DPD to explore near-cell scale dynamic phenomena.

Dissipative Particle Dynamics Simulations
Dissipative particle dynamics (DPD) is a coarse-grained, explicit-solvent form of molecular dynamics originally designed to simulate the hydrodynamic behavior of complex fluids [38][39][40]. In DPD, atoms and atomic groups are grouped into beads, of mass m, which interact via three non-bonded, short-ranged (vanish beyond a cut-off distance d 0 ), soft, momentum-conserving forces. A conservative force between each bead type pair (characterized by a strength parameter a ij for beads of type i and j) gives beads a chemical identity, such as hydrophobic oil particles or hydrophilic amino acid residues. Additionally, a dissipative force (characterized by the parameter γ ij ) and random force (whose strength σ ij is related to the dissipative force parameter by the fluctuation-dissipation theorem σ 2 ij = 2γ ij k B T) provide a thermostat that keeps the system temperature k B T constant. More details of the force field are given in previous work [55]. Beads are connected into molecules with Hookean springs, and a bending stiffness may be applied via an additional bond angle-dependent potential. Once the forces are specified for all bead and molecule types in a simulation, the Newtonian equations of motion for the beads are integrated using a modified Velocity Verlet scheme. Because the non-bonded forces are soft, the integration scheme is able to use a larger time-step than is possible in atomistic molecular dynamics, and the grouping of several atomic groups into each bead reduces the number of degrees of freedom needing to be integrated. These two factors make DPD several orders of magnitude faster than traditional atomistic and coarse-grained molecular dynamics. DPD has been extensively used to simulate lipid membranes [41,56], domain formation in two-component membranes [57,58], nanoparticles interacting with membranes [59,60], and many other soft matter systems [43]. We adopt here the model of Grafmüller et al. for the lipid molecules [61], and their molecular shape is shown in Figure 2. Briefly, each lipid has a headgroup composed of four hydrophilic beads (H A for the major species, H B for the minor species), and two tails each containing four hydrophobic beads (T A for the major species, T B for the minor species). The tails are connected to the final two adjacent head beads as shown in Figure 2. The IDPs are modeled as linear, semi-flexible polymers (backbone beads of type B) with self-associating endcaps containing four beads (bead type E) [55]. Water was represented by a single bead W. Both backbone and endcap beads in the IDPs are hydrophilic, and it is only the self-attraction of their endcaps that drives their phase separation. The full set of DPD conservative force parameters is given in Table 1. The two parameters a EE and a EM set the attraction between the IDP endcaps and themselves and the minority lipid headgroups, respectively, and are varied systematically in Section 3.1. The parameter a ES sets the attraction of the IDP endcaps and the sticky domains in the linker attached to the minority phase lipid in Section 3.4. The dissipative force parameter γ ij for all beads was 4.5 (in units of mk B T/d 2 0 ). Beads were bonded into molecules using Hookean springs with potential parameters of 128 and 0.5 (in units of k B T/d 2 0 and d 0 respectively). A three-body bending potential of the form k 3 (1 − cos(ϕ − ϕ 0 )) was applied to adjacent triples of backbone beads (B) in the IDPs, and the linker beads L and S when they were present, with parameters k 3 = 5 k B T and ϕ 0 = 0. A similar potential was applied to the lipid tails with parameters k 3 = 15 k B T and ϕ 0 = 0,. Further details of the simulation technique are given in the literature [40,55,61].   Table 1. Note that the head beads and tail beads of the two lipid species are colored differently for clarity but are otherwise the same in the simulations. The number of linker elements can be varied, but is fixed at m = 8 for simplicity. The number of backbone beads in the IDPs is shown as six, but is varied as part of the investigation.
Results in the 40 × 40 × 48 (d0) 3 simulation box are generated using the open-source, single-thread DPD code OSPREY-DPD [53]. Simulations in the larger box 100 × 100 × 48 (d0) 3 are performed using POETS-DPD as described next.   Table 1. Note that the head beads and tail beads of the two lipid species are colored differently for clarity but are otherwise the same in the simulations. The number of linker elements can be varied, but is fixed at m = 8 for simplicity. The number of backbone beads in the IDPs is shown as six, but is varied as part of the investigation. All simulations took place in a box of size 40 × 40 × 48 (d 0 ) 3 unless otherwise stated and periodic boundary conditions were applied in all three dimensions. A single planar membrane was pre-assembled with its normal along the long axis, and the minority phase lipids were randomly distributed in the upper monolayer only to localize IDP/membrane interactions to one side. A given number of IDPs were dispersed randomly throughout the remaining space and the simulation box was filled with solvent beads to the average bead density ρd 3 0 = 3. The DPD length scale is set to d 0 = 1 nm from the size of the lipid headgroups [41]. We set the reduced system temperature to k B T = 1 [40]. Each simulation was run for 3 million time-steps unless otherwise stated using an integration step size of 0.02 τ, where τ = md 2 0 /k B T is the DPD timescale. Because we are interested in equilibrium properties, we did not attempt to fix the simulation timescale more precisely.

Massively Parallel Event-Based Simulations
The larger simulations in this work were performed on a new computer technology called POETS-Partially Ordered Event-triggered Systems [52]. POETS represents computational problems as a graph: vertices in the graph are independent computational threads; and edges between vertices represent channels that transport messages between threads. The abstract, arbitrary computational graph is mapped onto a fixed hardware network of RISC-V processors [62], which provides tens of thousands of hardware threads in the space and power envelope that would usually support only a few hundred conventional x86 threads. POETS is an ongoing research project (https://www.poets-project.org; accessed on 18 December 2021) but preliminary results show that it is highly suited to DPD simulations [54]. We used it both as an efficient way to produce the large-scale results presented here, and to evaluate its potential for production use in mesoscale research and simulation.
POETS is not a general-purpose computer. Each application must be transformed into a graph in order to run on POETS, and reimplemented in the POETS domain specific language. For our case of DPD simulations, we follow the same discretization of space approach used to parallelize DPD for the MPI framework. The simulation volume is spatially subdivided into a cubic array of unit cells for which the beads in each cell are managed by a single computational thread. Each cell becomes a vertex in the graph, and each vertex has edges to its 26 neighboring cells to represent information flows within each simulated time-step. A key point of POETS messaging is that it is very fast and consumes few cycles: sending a message is just one machine instruction, with message latencies ranging from 1-10 micro-seconds, depending on the physical distance between cores. This contrasts with MPI messages that are typically very slow compared to a local function call, requiring hundreds of instructions to prepare, and with latencies ranging from 10-1000 micro-seconds, depending on the message size.
A big algorithmic difference between DPD on POETS and MPI is that the volume of space managed by a cell in the POETS implementation is comparable to the DPD nonbonded cut-off distance (d 0 ), so each cell contains only a few particles. In comparison, an MPI-enabled DPD must use much larger cells containing thousands of beads to ensure that the messaging cost is a small fraction of the total runtime. The difference between these two approaches is shown graphically in Figure 3. Effectively, because messaging in POETS is so cheap, we are able to use very large numbers of light-weight CPUs with each managing only a few beads.
language. For our case of DPD simulations, we follow the same discretization of space approach used to parallelize DPD for the MPI framework. The simulation volume is spatially subdivided into a cubic array of unit cells for which the beads in each cell are managed by a single computational thread. Each cell becomes a vertex in the graph, and each vertex has edges to its 26 neighboring cells to represent information flows within each simulated time-step. A key point of POETS messaging is that it is very fast and consumes few cycles: sending a message is just one machine instruction, with message latencies ranging from 1-10 micro-seconds, depending on the physical distance between cores. This contrasts with MPI messages that are typically very slow compared to a local function call, requiring hundreds of instructions to prepare, and with latencies ranging from 10-1000 micro-seconds, depending on the message size.
A big algorithmic difference between DPD on POETS and MPI is that the volume of space managed by a cell in the POETS implementation is comparable to the DPD nonbonded cut-off distance (d0), so each cell contains only a few particles. In comparison, an MPI-enabled DPD must use much larger cells containing thousands of beads to ensure that the messaging cost is a small fraction of the total runtime. The difference between these two approaches is shown graphically in Figure 3. Effectively, because messaging in POETS is so cheap, we are able to use very large numbers of light-weight CPUs with each managing only a few beads. Once ownership of all the unit cells in the simulation space has been assigned to physical POETS CPUs, the DPD algorithm executes in a highly concurrent fashion, with each thread operating independently and synchronized only by the messages containing beads. We only sketch the main ideas here, in order to give a sense of how it works and refer the read to the literature for more details [54]. POET-DPD executes in in two phases: Sharing: Broadcast each resident bead position and velocity to neighbors; Once ownership of all the unit cells in the simulation space has been assigned to physical POETS CPUs, the DPD algorithm executes in a highly concurrent fashion, with each thread operating independently and synchronized only by the messages containing beads. We only sketch the main ideas here, in order to give a sense of how it works and refer the read to the literature for more details [54]. POET-DPD executes in in two phases:

1.
Force calculation a Sharing: Broadcast each resident bead position and velocity to neighbors; b Integration: For each received bead, calculate non-bonded DPD and Hookean bond interactions with resident cells; c AngleForces: Once head and tail beads for an angle bond are received by middle bead, calculate angle forces and broadcast to neighbors (which must include the owner of the angle bond's head and tail; d AngleAddition: If an angle force is received and this cell contains the related head or tail bead, apply angle-bond forces to that head/tail.

2.
Bead movement a Movement: Apply equations of motion to all beads resident in the cell; b BeadExit: If any bead leaves the cell, broadcast it to neighbors and remove from resident set; c BeadEntrance: If a bead is received from a neighbor and its new position is in the current cell, add it to the resident set.

3.
Go back to step 1 for next step of simulation This is a very high-level overview, but it highlights some of the important computational characteristics: Concurrent: Each thread is running independently, so this approach is concurrent, where lots of threads do different things with loose synchronization, rather than parallel, where lots of threads to the same thing with tight synchronization.
Fine-grained: The simulation is decomposed into one computational thread per unitvolume cell, with each thread potentially only handling 3 or 4 beads, and resulting in thousands or millions of concurrent compute threads.
Event-triggered: Messages trigger computation, rather than computation being used to schedule messages. Within each phase, the sending and receiving of messages can be interleaved, overlapping compute and communication.
Broadcasts: Rather than aiming messages only at the correct receiver, we broadcast messages to all possible receivers. This might seem counter-intuitive, but due to the tight binding between processors and network in POETS this is very efficient.
Single algorithm: All calculations, including Hookean and angle bonds, are integrated into the same algorithm on the same compute thread, and handled concurrently with DPD force calculation.
There is obviously a lot more computational detail in precisely how this is implemented, not least because the experimental goals of the paper were used to drive development of the algorithms and methods. This meant that we had to handle the edge-cases needed in practical simulations, rather than just assuming-away the difficult parts in the pursuit of headline performance figures that do not translate to real-world performance.
For example, the need to handle angle bonds completely changed the way that we handled bonds within the algorithm. Our original algorithmic approach used a faster method with fewer algorithmic steps, but ultimately could only handle Hookean bonds efficiently. Though it was very fast, this was for a problem which is less useful in practice for end-users. By having computer scientists, hardware engineers, and bio-physics endusers working together we tried to avoid that trap, and tackle the more difficult but useful problems head on.

Influence of the IDP-Membrane Affinity on the Equilibrium Morphology
We were interested in how compositionally-distinct domains in a multicomponent membrane may be induced by the nearby bulk phase separation of intrinsically-disordered proteins. As stated, the problem has too many parameters to be efficiently presented, and it is unfeasible to exhaustively explore its high-dimensional parameter space. We therefore studied the simplest system that may exhibit interesting behavior-a two-component membrane and a single type of IDP-and qualitatively enumerate its equilibrium states. The membrane contains a major lipid species and a smaller fraction of a minor lipid species, and we fixed their number fractions to give 1989 and 397 major and minor lipid molecules, respectively, corresponding to a minor fraction of 0.166. This ensures that a domain (if formed) is much larger than the lipid molecules but smaller than the membrane area, thereby reducing the effects of the boundary conditions. The lipid molecules possess identical structures and interactions and are well mixed in the absence of any influence of the IDPs. We initially fixed the backbone length of the IDPs to 6 beads and refer to them by their backbone length in an obvious notation as B 6 . We started with 397 B 6 molecules in the bulk phase. Note that the IDP backbone and endcap beads are hydrophilic, and only their endcap self-attraction drives their phase separation. We varied the IDP self-attraction and their affinity to the minority phase lipid headgroups by changing the conservative DPD interaction parameters a EE , a EM respectively. We defined dimensionless parameters to quantify the IDP endcap-endcap attraction (ε EE ), and the endcap-minority lipid headgroup (ε EM ) attraction by: where a WE is the DPD conservative interaction parameter for the water/IDP-endcap interaction. We explored the range 0 ≤ ε ij ≤ 1, for which 0 corresponds to no net attraction between bead types i, j (compared to their interaction with the solvent beads) and 1 to a very strong attraction. When there was no attraction of the IDPs for themselves nor the membrane, ε EE = ε EM = 0, they remained dispersed in the solvent and the membrane was well mixed (Figure 4, Supplementary Video S1). In the following figures, the left panel shows the whole system while the right panel shows just the membrane for clarity.
where is the DPD conservative interaction parameter for the water/IDP-endcap interaction. We explored the range 0 ≤ ≤ 1, for which 0 corresponds to no net attraction between bead types , (compared to their interaction with the solvent beads) and 1 to a very strong attraction. When there was no attraction of the IDPs for themselves nor the membrane, = = 0, they remained dispersed in the solvent and the membrane was well mixed (Figure 4, Supplementary Video SM1). In the following figures, the left panel shows the whole system while the right panel shows just the membrane for clarity.
Increasing while keeping the zero led to bulk phase separation of the IDPs without influencing the membrane, which remained well mixed ( Figure 5 and Supplementary Video SM2). The value of = 0.68 is not very strong, leading to the droplet being in equilibrium with the dilute phase of IDPs dispersed in the solvent. Increasing ε EE while keeping the ε EM zero led to bulk phase separation of the IDPs without influencing the membrane, which remained well mixed ( Figure 5 and Supplementary Video S2). The value of ε EE = 0.68 is not very strong, leading to the droplet being in equilibrium with the dilute phase of IDPs dispersed in the solvent.
Increasing ε EM = 0.8 while keeping ε EE zero caused the IDPs to wet the membrane, forming a thin layer as their endcaps bind to the lipid headgroups ( Figure 6 and Supplementary Video S3). For the chosen fraction of minority phase lipid, the IDPs were able to saturate the domain leaving some excess IDPs in the solvent. It is interesting to note that the wetting domain formed as the result of the adsorption of the IDPs to the minority lipid headgroups because the IDPs have no tendency to phase separate. The short B 6 IDPs keep the lipids adsorbed to their endcaps in close proximity. When two or more such adsorbed IDPs approach by diffusion, the thermal fluctuations of the IDPs lead to their mixing between the lipids. This caused the IDPs to effectively connect all the minority lipids into a large domain instead of independent small clusters. However, the wetted domain was non-circular, indicating it has a low line tension, and small patches of lipid occasionally detach, taking a few IDPs with them before reattaching. We note that although the minor lipid species comprises only 16% of the total number of lipids, they are all in the upper monolayer and hence can occupy an area up to 32% of the monolayer's area.   Increasing = 0.8 while keeping zero caused the IDPs to wet the membrane, forming a thin layer as their endcaps bind to the lipid headgroups ( Figure 6 and Supplementary Video SM3). For the chosen fraction of minority phase lipid, the IDPs were able to saturate the domain leaving some excess IDPs in the solvent. It is interesting to note that the wetting domain formed as the result of the adsorption of the IDPs to the minority lipid headgroups because the IDPs have no tendency to phase separate. The short B6 IDPs keep the lipids adsorbed to their endcaps in close proximity. When two or more such adsorbed IDPs approach by diffusion, the thermal fluctuations of the IDPs lead to their mixing between the lipids. This caused the IDPs to effectively connect all the minority lipids into a large domain instead of independent small clusters. However, the wetted domain was non-circular, indicating it has a low line tension, and small patches of lipid occasionally detach, taking a few IDPs with them before reattaching. We note that although the minor lipid species comprises only 16% of the total number of lipids, they are all in the upper monolayer and hence can occupy an area up to 32% of the monolayer's area. Finally, increasing both affinities to ε EE = ε EM = 0.8 led to the formation of a bulk droplet of IDP and wetting of a single domain in the membrane. After diffusing for some time, the droplet adsorbed to the domain, but at the selected concentration, it protruded from the membrane, forming a membrane-associated dense phase. (Figure 7 and Supplementary Video S4). The domain fluctuated around a circular shape, indicating the existence of a fairly high line tension. Small numbers of majority phase lipid sometimes diffused through the domain as seen in the movie Supplementary Video S4.
Our results for the equilibrium states of the system are summarized in Figure 8 as a twodimensional morphology diagram with the endcap self-attraction ε EE along the abscissa and the endcap-lipid headgroup ε EM along the ordinate. We reemphasize that there was no attraction between the majority and minority phase lipids, and the observed domains were created only by the influence of the IDPs. At the origin, there was no attraction between the IDPs and the membrane nor between themselves, and the membrane was well mixed with the IDPs dispersed in bulk solvent (invisible for clarity). Increasing the IDP self-attraction ε EE while keeping their attraction to the membrane ε EM low (bottom right) led to their bulk phase separation while the membrane is well mixed. Increasing the IDP attraction for the membrane ε EM while it has no self-attraction (top left) led to wetting of the membrane until the minority phase lipids are covered and excess IDPs remain in the bulk. When both interactions were high (top right), the IDPs drove domain formation in the membrane and adsorbed to it forming a droplet that projects into the solvent.  Our results for the equilibrium states of the system are summarized in Figure 8 as a two-dimensional morphology diagram with the endcap self-attraction along the abscissa and the endcap-lipid headgroup along the ordinate. We reemphasize that there was no attraction between the majority and minority phase lipids, and the observed domains were created only by the influence of the IDPs. At the origin, there was no attraction between the IDPs and the membrane nor between themselves, and the membrane was well mixed with the IDPs dispersed in bulk solvent (invisible for clarity). Increasing the IDP self-attraction while keeping their attraction to the membrane low (bot- Figure 7. When both IDP self-attraction and the IDP-membrane attraction are strong, the proteins undergo phase separation into a dense droplet that adsorbs to the minority phase lipids driving domain formation but still protrudes into bulk solvent.

Reversible Coupling of Phase Separation and Domain Formation
Membrane domains frequently form transiently in response to cellular signals [32,35]. We observed a reversible transition between a phase separated droplet in the bulk solvent and membrane wetting that was induced by turning on and off the attraction of the IDPs to the minority lipids. Figure 9 shows snapshots from a simulation of 494 IDPs and 395/1978 minority and majority phase lipids, respectively. The timing of the snapshots is given as a fraction of the total simulation time of Tmax = 3 × 10 6 time steps. The IDPs were initially neutral towards the minority phase lipids ( = 0), but had a self-attraction ( = 0.68) that caused them to phase separate into a droplet in the bulk (see Supplementary Video SM5). At time 0.2 Tmax, a strong attraction ( = 0.8) of the IDP endcaps to the minority lipids was turned on. As the IDPs adsorbed to the minority lipids and drove them into a single domain, the droplet dissolved because the bulk IDP concentration fell below that at which it is in equilibrium with its dilute phase. The IDP/lipid attraction was removed at time 0.85 Tmax and the IDPs left the membrane and reassembled in the bulk while the membrane rapidly returned to being well mixed. The time required for the membrane to return to a well-mixed state and the IDPs to reassemble in the bulk was much shorter than that for the wetting domain to form.

Reversible Coupling of Phase Separation and Domain Formation
Membrane domains frequently form transiently in response to cellular signals [32,35]. We observed a reversible transition between a phase separated droplet in the bulk solvent and membrane wetting that was induced by turning on and off the attraction of the IDPs to the minority lipids. Figure 9 shows snapshots from a simulation of 494 IDPs and 395/1978 minority and majority phase lipids, respectively. The timing of the snapshots is given as a fraction of the total simulation time of T max = 3 × 10 6 time steps. The IDPs were initially neutral towards the minority phase lipids (ε EM = 0), but had a selfattraction (ε EE = 0.68) that caused them to phase separate into a droplet in the bulk (see Supplementary Video S5). At time 0.2 T max , a strong attraction (ε EM = 0.8) of the IDP endcaps to the minority lipids was turned on. As the IDPs adsorbed to the minority lipids and drove them into a single domain, the droplet dissolved because the bulk IDP concentration fell below that at which it is in equilibrium with its dilute phase. The IDP/lipid attraction was removed at time 0.85 T max and the IDPs left the membrane and reassembled in the bulk while the membrane rapidly returned to being well mixed. The time required for the membrane to return to a well-mixed state and the IDPs to reassemble in the bulk was much shorter than that for the wetting domain to form.

Reversible Coupling of Phase Separation and Domain Formation
Membrane domains frequently form transiently in response to cellular signals [32,35]. We observed a reversible transition between a phase separated droplet in the bulk solvent and membrane wetting that was induced by turning on and off the attraction of the IDPs to the minority lipids. Figure 9 shows snapshots from a simulation of 494 IDPs and 395/1978 minority and majority phase lipids, respectively. The timing of the snapshots is given as a fraction of the total simulation time of Tmax = 3 × 10 6 time steps. The IDPs were initially neutral towards the minority phase lipids ( = 0), but had a self-attraction ( = 0.68) that caused them to phase separate into a droplet in the bulk (see Supplementary Video SM5). At time 0.2 Tmax, a strong attraction ( = 0.8) of the IDP endcaps to the minority lipids was turned on. As the IDPs adsorbed to the minority lipids and drove them into a single domain, the droplet dissolved because the bulk IDP concentration fell below that at which it is in equilibrium with its dilute phase. The IDP/lipid attraction was removed at time 0.85 Tmax and the IDPs left the membrane and reassembled in the bulk while the membrane rapidly returned to being well mixed. The time required for the membrane to return to a well-mixed state and the IDPs to reassemble in the bulk was much shorter than that for the wetting domain to form.

Increasing IDP Length Results in Patchy Domains
We next explored how the backbone length of the IDPs affected the domain morphology. We recall that IDPs with a short backbone B 6 and high affinities (ε EE = ε EM = 0.8) adsorb to the membrane forming a circular domain that shows small shape fluctuations, indicating a significant line tension (see Figure 7 and Supplementary Video S4). The upper panel of Figure 10 (and Supplementary Video S6) shows that the equilibrium state of a system containing 396 IDPs of the type B 8 and 396/1982 minority and majority phase lipids, respectively, also contained a single domain, but it contained small patches of the majority lipid and its non-circular shape indicated a lower line tension. On further increasing the IDP backbone length to B 16 (lower panel of Figure 10 and Supplementary Video S7), we found that the conformationally-fluctuating IDPs bridged multiple, small lipid clusters, but no domain was formed. This simulation contained 296 IDPs and 395/1975 minority and majority phase lipids, respectively, to reduce the crowding effect of the longer IDPs. We hypothesize that the domain formed by short IDPs is broken up by the conformational entropy of the longer IDP backbones. Previous work has shown that B 16 IDPs assemble into a structured network when they phase separate in the bulk solvent with a spatial structure whose length scale is selected by the length of the IDPs [55]. When these associated IDPs adsorb to the minority lipid headgroups, they maintain this structure which keeps the adsorbed minority lipids farther apart and unable to form a single domain.

Increasing IDP Length Results in Patchy Domains
We next explored how the backbone length of the IDPs affected the domain morphology. We recall that IDPs with a short backbone B6 and high affinities ( = = 0.8) adsorb to the membrane forming a circular domain that shows small shape fluctuations, indicating a significant line tension (see Figure 7 and Supplementary Video SM4). The upper panel of Figure 10 (and Supplementary Video SM6) shows that the equilibrium state of a system containing 396 IDPs of the type B8 and 396/1982 minority and majority phase lipids, respectively, also contained a single domain, but it contained small patches of the majority lipid and its non-circular shape indicated a lower line tension. On further increasing the IDP backbone length to B16 (lower panel of Figure 10 and Supplementary Video SM7), we found that the conformationally-fluctuating IDPs bridged multiple, small lipid clusters, but no domain was formed. This simulation contained 296 IDPs and 395/1975 minority and majority phase lipids, respectively, to reduce the crowding effect of the longer IDPs. We hypothesize that the domain formed by short IDPs is broken up by the conformational entropy of the longer IDP backbones. Previous work has shown that B16 IDPs assemble into a structured network when they phase separate in the bulk solvent with a spatial structure whose length scale is selected by the length of the IDPs [55]. When these associated IDPs adsorb to the minority lipid headgroups, they maintain this structure which keeps the adsorbed minority lipids farther apart and unable to form a single domain.

Clustering of Minority Lipids by IDP-Linker Attraction
IDPs are over-represented in cellular signaling and regulatory pathways [63], and are involved in clustering of transmembrane receptors [32]. The cytoplasmic domain of transmembrane receptors plays an important role in their phase separation. One example is the activation of T cell receptors in the immune system, which has been reconstituted in model membranes and leads to formation of a droplet of IDPs associating with the membrane-bound receptors [34]. In addition, the phase separation of IDPs in neurons localizes receptors in the postsynaptic density and vesicles in the pre-synaptic zone [64,65].
To explore the ability of phase-separating IDPs to drive lipid domain formation via binding to a flexible linker, we attached a hydrophilic linker to the minority lipid headgroup and placed several sticky domains along it that are attracted to the IDP endcaps. The addition of the linkers introduces (at least) an additional three parameters: the number and affinity of the sticky domains, and their separation along the linker. For simplicity, we used a linear sequence of four sticky domains (of four S beads each) separated by inert regions (of four L beads each). The inert and sticky linker domains are hydrophilic and have no net attraction. We defined the dimensionless strength of the attraction between the IDP endcaps and the sticky domains (ε ES ) similarly to the endcaps, but used the linker-linker sticky domain conservative parameter as the baseline: ε ES = (a SS − a ES )/a SS . To avoid artifacts of the linkers prematurely clustering due to their initial configuration, we set the endcap-linker interaction to ε ES = 0 (a SS = a ES = 30), and equilibrated the system for 250,000 time steps allowing the minority lipids to diffuse freely across the membrane. Then the endcap self-attraction and endcap-linker attraction were set to a strong value, ε EE = ε ES = 0.83. Figure 11 shows the effects of 9 linker-lipids present in the membrane. As the IDPs phase separated, they bound to the linkers and slowly merged into a large droplet that drew the minority lipid/linkers into a domain ( Figure 11, top panel, and Supplementary Video S8.

Discussion
The increasing number of biomolecular condensates discovered in cells and the ubiquity of membrane-bound organelles suggests that interactions between them are likely to be important for cellular physiology. They also provide an interesting conceptual system for exploring how the two-dimensional nature of a membrane interacts with three-dimensional droplets. An example is provided by the phase separation of multivalent IDPs cou- The above results were generated using the OSPREY-DPD code in a simulation box of size 40 × 40 × 48 (d 0 ) 3 . In Figure 11 (and Supplementary Video S9), we compare the case of LLPS-driven domain formation from Figure 7 with those generated using the POETS-DPD code in the larger simulation box 100 × 100 × 48 (d 0 ) 3 . The upper panel shows the equilibrium state in the small simulation box. We found that as few as 7 linker-lipids can bind a droplet of 395 IDPs whose size is much larger than the clustered linkers. Two linkers flip-flopped from the upper leaflet as a consequence of the initial state and were able to bind sufficient IDPs to form a second domain visible in the lower left (and top left as it is connected across the periodic boundaries). The middle and lower panels of Figure 11 show two views of the equivalent simulation executed with POETS-DPD that contains a membrane that is 100 nm in linear dimension and contains 1260 IDPs and 2520 linker-lipids. The same clustering of the larger number of IDPs to the larger number of linker-lipids results in a similar droplet bound to the membrane (cp. Supplementary Videos S8 and S9).

Discussion
The increasing number of biomolecular condensates discovered in cells and the ubiquity of membrane-bound organelles suggests that interactions between them are likely to be important for cellular physiology. They also provide an interesting conceptual system for exploring how the two-dimensional nature of a membrane interacts with three-dimensional droplets. An example is provided by the phase separation of multivalent IDPs coupled to transmembrane signaling proteins in immune cell signaling [11,35]. We used coarsegrained molecular simulations to explore the morphologies adopted by a dilute phase of IDPs near a two-component lipid membrane. The high computational cost of such simulations drove this collaboration between biophysicists and the developers of novel computer hardware to explore how coarse-grained simulations on much larger scales can be accelerated and yet consume less electricity.
Our biological results are summarized in the morphology diagram in Figure 8, where we explored the relationship between the molecular properties of the constituent molecules and their equilibrium morphological phase behavior. Two non-trivial outcomes were observed. First, if the IDP endcaps were more strongly attracted to the minority phase lipid headgroups than to each other, they wet the membrane. If the length of the IDPs is short, this drove the minority lipids into a single large domain, while longer IDPs adsorbed via multiple, small, discrete clusters. When the IDP self-interaction and their interaction with the minority lipid headgroup (or with sticky linkers attached to the headgroup) were both sufficiently large, bulk LLPS occurred and drove formation of a single membrane domain.
Most of the preceding results were obtained from small systems with around 2400 lipids in the membrane and several hundred IDPs, comprising 230,000 particles in total. Each simulation required 14 cpu-days on a single core of a Ryzen Threadripper 3970X 4.5 GHz desktop machine. It is tedious and unfeasible to carry out hundreds of these simulations to explore exhaustively the parameter space of such complex models. In addition, thẽ 70 simulations we performed consumed an environmentally-unsustainable amount of electricity. We therefore performed some larger simulations in a box size 100 × 100 × 48 (d 0 ) 3 on a novel computer hardware called POETS that drastically reduces the wall clock time of our DPD simulations. The larger system contains 14,600 lipids and 1260 IDPs, comprising 1,400,000 particles in total, but requires only 3 days to run the same number of time-steps as the smaller case.
The research goal of our collaboration of biophysicists and computer engineers is to ensure that the new POETS algorithms and hardware agree with the established DPD codebase (for which many published results exist); that they can handle real-world problems that practitioners want to solve; and that they are highly scalable. Initially, we focused on matching the performance of single-node systems such as shared-memory multi-core CPUs and individual GPUs.
Providing direct comparisons between disparate hardware/software systems is very difficult. For example, it is often unclear how much systems actually cost (buying 1 unit is different from buying 1000), what their power consumption is (should cooling costs be included?), and whether computations are equivalent (is single-precision the same as double-precision?). Table 2 shows indicative metrics for how different systems behave in practice, based on published figures and (where possible) our own experiments. We characterized them in terms of one compute-node, where a node is a multi-core CPU, GPU plus support CPU, or an FPGA card for POETS, with power consumption quoted per node. The performance was measured in total bead-steps per second across all nodes, which is simply the total number of beads multiplied by the total number of simulation time steps, divided by the execution time. CPU figures are based on our own measurements, while GPU figures are courtesy of Seaton [66]. The figures for "POETS Gen1" are the performance attained in this work, which is the measured performance of the hardware performing the large-scale simulations described in this paper. We clearly exceeded the performance of the single-core Osprey-DPD simulator used for the parameter exploration, which is our baseline comparison. It is also slightly faster than a high-performance 32-core machine in our HPC cluster when using multi-core LAMMPS, and about the same performance as a single high-performance GPU. However, we can see that the performance per watt was relatively weak-similar to a single-threaded implementation running on a CPU. This is a consequence of two effects: First, the hardware we used is the first generation, which is based on old FPGAs from 2011; second, we have so far focused on correctness and useability, and have spent little time on performance optimization.
Overall, we achieved the research goals of the collaboration in producing a new hardware system and algorithms that: -Produce reliable DPD results (agree with published DPD implementation) [53]; -Are useful for practitioners; -Are competitive in performance with existing single-node systems.
We are currently assembling the second generation of POETS hardware (POETS Gen2), which is substantially more powerful as it is built on more recent FPGAs. This increases the number of cores by a factor of 10, and the floating-point performance per core by a factor of 4, so we have included in Table 2 estimated figures for the POETS-DPD implementation running on this new hardware. At this point, the performance and efficiency start to become competitive with multi-GPU systems, even without additional tuning and algorithmic optimizations. For example, we have already found that dynamic load-balancing within the POETS hardware dramatically improves simulation rate.
Looking further ahead, we are actively exploring new algorithms and optimizations which substantially improve the performance per FPGA node. In particular, adding custom instructions designed to support DPD calculations will reduce the number of instructions per bead interaction by a factor of 10. We included tentative estimates for the eventual performance we are targeting using the hardware currently being assembled, based on reasonable extrapolations. Clearly more research is needed, and this level of performance is an aspirational goal, but having demonstrated that the basic idea works and is useful we have confidence that this target is worth pursuing.

Conclusions
We live in interesting times: The power of computer systems is increasing year-by-year, bringing biological fidelity at increasing length scales and ever longer time periods (almost) within our grasp. Existing technologies such as multi-core, MPI, and GPUs provide one way of scaling to larger simulations, but it is increasingly important to reduce the financial and environmental costs of the simulations. Exotic architectures with custom hardware are another way, but it is unclear whether they work for real-world problems, or even if they give the correct results.
The principal goal of this paper was to identify the equilibrium morphologies of a model of phase separating proteins near a two-component membrane using coarsegrained simulations. But a secondary goal was to demonstrate how biophysical simulation needs can drive inter-disciplinary research that yields faster and cheaper simulations. By embedding the development of a new hardware simulation platform into the scientific process, we create a feedback loop between the computational modeling pull and the computer technology push that ensures the hardware advances work in practice-not just in theory. The results presented in this paper were generated on distinct compute platforms (Single core: Ryzon ThreadRipper 3970X; 4.5 GHz x86 32-core desktop; POETS machine: 49000 RISC-V threads): Each platform (running different codebases) provides consistent results, which is an important and often overlooked aspect of simulation-based research.
Our final aim was to illuminate the path towards using event-based computing to extend computational biophysics into more realistic arenas of membranes in health and disease. Extensions to our work could follow several interesting routes. The assumption of ideal mixing of the lipid species in the membrane could be relaxed to explore the converse effect of domain formation on the bulk phase separation. The model IDPs could also be more closely modeled on real proteins by adding multiple, punctate binding sites, or a mixture of IDPs could be included. However, increasing the biological fidelity creates more complex models with more parameters, making a systematic exploration of their behaviour challenging on conventional architectures. The POETs computer framework introduced here accelerates DPD simulations sufficiently that the exploration of high-dimensional parameter space models is feasible while simultaneously reducing the energy consumed. As simulations are increasingly used to augment wet-lab experiments, this will become an increasingly important route to sustainable science.