Structural Models for the Dynamic Effects of Loss-of-Function Variants in the Human SIM1 Protein Transcriptional Activation Domain

Single-minded homologue 1 (SIM1) is a transcription factor with numerous physiological and developmental functions. SIM1 is a member of the class I basic helix-loop-helix-PER-ARNT-SIM (bHLH–PAS) transcription factor family, which includes several other conserved proteins, among them the hypoxia-inducible factors, the aryl hydrocarbon receptor, the neuronal PAS proteins, and the CLOCK circadian regulator. Recent studies of HIF-α-ARNT and CLOCK-BMAL1 protein complexes have revealed the organization of their bHLH, PASA, and PASB domains and provided insight into how these heterodimeric protein complexes form; however, experimental structures for SIM1 have been lacking. Here, we describe the first full-length atomic structural model for human SIM1 in a heterodimeric complex with its binding partner ARNT, and we analyze several pathogenic variants using molecular dynamics simulations. Using local and global positional deviation metrics, we address the structural basis of each mutant in terms of the deleterious structural reorganizations that could alter protein function. We propose new experiments to probe these hypotheses and examine an interesting dynamic behavior of SIM1. The conformational dynamics reveal changes in local and global regions that represent a mechanism for dysfunction in the variants presented. In addition, we used our ab initio hybrid model to predict further variant hotspots that could be engineered either as counter-variants (restoring wild-type function) or as basic research probes.


Molecular Dynamics Simulations
Molecular dynamics (MD) simulations were run on each model for conformational sampling. The primary purpose of MD in this setting is to examine any conformational variability that may occur with different dimerization partners, such as ARNT2. Briefly, each SIM1 system was minimized with relaxed restraints, using either steepest descent (SD) or Polak-Ribière conjugate gradient (PRCG), and equilibrated in solvent at physiological salt concentration. More detailed descriptions of our MD methodology are given in the literature [26-29]. The refinement protocol included the following steps: (1) minimization with explicit water molecules and ions, (2) energy minimization of the entire system, and (3) MD simulation (MDS) for >10 ns to relax to the force field (OPLS3/Amber) [30,31]. Following refinement, production simulations were completed to collect data, with an additional MD production length of >500 ns. This overall production length includes the contributions of the various MD runs performed for adequate sampling of the dynamical phenomena (including MD refinement/setup) and of each long (>100 ns) production run. As these production runs revealed significant structural changes resulting from the point mutations, they were sufficient to accomplish our study goal. We suspect that longer simulations would reveal even greater structural rearrangements; however, our MDS sampling already identified sufficient mechanistic detail to explain the detrimental nature of these SIM1 variants.
OPLS3 (Desmond)/Amber (NAMD2) force fields were used with the current release of the NAnoscale Molecular Dynamics 2 engine [31,32]. The simulated system, including hydrogens, consisted of ~2.0 × 10⁵ atoms, solvated with SPC/E water and ions. In all cases, we neutralized the system with counter-ions and then added 150 mM Na⁺Cl⁻ to reproduce physiological ionic strength. SPC/E water molecules were added around the protein to a depth of 15-18 Å from the edge of the molecule, depending on the side [33]. Our protocol has been previously described in the literature [29]. Simulations were carried out using the particle mesh Ewald technique with periodic boundary conditions and a 9 Å nonbonded cut-off, using SHAKE with a 2 fs timestep.
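The setup numbers above lend themselves to a quick sanity check. The following is a minimal sketch, not the authors' setup script: it converts a 150 mM salt concentration into a number of Na⁺/Cl⁻ pairs for a given solvent volume and converts a simulation length into integration steps at a 2 fs timestep. The 100 Å cubic box used in the example is a hypothetical value chosen for illustration, since the text does not report box dimensions, and the function names are invented here.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def ion_pairs_for_concentration(volume_A3: float, molar: float = 0.150) -> int:
    """Salt pairs needed to reach `molar` (mol/L) in a volume given in cubic angstroms."""
    litres = volume_A3 * 1e-27  # 1 A^3 = 1e-30 m^3 = 1e-27 L
    return round(molar * litres * AVOGADRO)


def n_steps(sim_time_ns: float, timestep_fs: float = 2.0) -> int:
    """Number of integration steps for a run of `sim_time_ns` nanoseconds."""
    return round(sim_time_ns * 1e6 / timestep_fs)  # 1 ns = 1e6 fs


if __name__ == "__main__":
    # Hypothetical 100 A cubic box (assumption, not from the text).
    print(ion_pairs_for_concentration(100.0 ** 3))  # 90
    # A 150 ns run at 2 fs per step.
    print(n_steps(150.0))  # 75000000
```

The pair count is in addition to any counter-ions used for neutralization, which depend on the net charge of the solute.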
Pre-equilibration started with three stages of minimization, using 10,000 steps of SD and PRCG with relaxing restraints, followed by 1000 ps of heating under MD with the atomic positions of nucleic acid and protein fixed. Then, two cycles of minimization (5000 steps each) and heating (1000 ps) were carried out with soft restraints of 10 and 5 kcal/(mol·Å²) applied to all backbone atoms and metals. Next, 5000 steps of minimization were performed with solute restraints reduced to 1 kcal/(mol·Å²). Following that, 400 ps of MDS were completed with the restraints (1 kcal/(mol·Å²)) relaxed until all atoms were unrestrained, while the system was slowly heated from 1 K to the desired 310 K using velocity rescaling during this equilibration phase. Additionally, isothermal-isobaric ensemble (NPT) equilibration was carried out using velocity rescaling for >10 ns. Finally, production runs of MD were carried out with constant pressure boundary conditions (relaxation time of 1.0 ps) for over 500 ns. A constant temperature of 310 K was maintained using the Berendsen weak-coupling algorithm with a time constant of 1.0 ps. SHAKE constraints were applied to all hydrogens to eliminate X-H vibrations, which permitted a longer simulation time step (2 fs). Our equilibration and production run protocols are described in the literature [27,34-36]. Translational and rotational center-of-mass motions were initially removed. Periodically, simulations were interrupted to remove the center-of-mass motion again by subtraction of velocities, to account for the "flying ice-cube" effect [37]. Following the simulation, the individual frames were superposed back to the origin to remove rotation and translation effects.
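Two of the steps above can be illustrated compactly. The sketch below, written with NumPy, shows (i) removal of translational center-of-mass motion by subtracting the mass-weighted mean velocity and (ii) the standard Berendsen weak-coupling velocity scaling factor, λ = sqrt(1 + (Δt/τ)(T₀/T − 1)), with the 310 K target, 2 fs step, and 1.0 ps time constant quoted in the text. It is an illustration only: the function names are invented here, rotational motion is not handled, and the MD engines perform these operations internally.

```python
import numpy as np


def remove_com_velocity(velocities, masses):
    """Subtract the mass-weighted mean velocity so net linear momentum is zero."""
    v = np.asarray(velocities, dtype=float)  # shape (n_atoms, 3)
    m = np.asarray(masses, dtype=float)      # shape (n_atoms,)
    v_com = (m[:, None] * v).sum(axis=0) / m.sum()
    return v - v_com


def berendsen_lambda(t_current, t_target=310.0, dt_ps=0.002, tau_ps=1.0):
    """Berendsen velocity scaling factor for one integration step."""
    return float(np.sqrt(1.0 + (dt_ps / tau_ps) * (t_target / t_current - 1.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    vel = rng.normal(size=(5, 3))
    mass = np.array([12.0, 1.0, 16.0, 14.0, 1.0])
    fixed = remove_com_velocity(vel, mass)
    # Net momentum after correction is numerically zero.
    print(np.allclose((mass[:, None] * fixed).sum(axis=0), 0.0))
    # A system that is slightly too cold is scaled up very gently.
    print(berendsen_lambda(300.0) > 1.0)
```

Because λ stays close to 1 when τ is much larger than Δt, the coupling nudges the temperature toward the target rather than forcing it at every step.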

SIM1 and ARNT Heterodimer Interface Has Contacts Consistent with Stabilizing Interface
We observed numerous strong contacts between ARNT and SIM1 in the heterodimer structure. Additionally, we found many soft contacts that contribute favorably to the interface. An H-bond of 2.2-2.5 Å with a 20° angle is considered a strong contact, while 2.6-3.2 Å is considered a soft contact. Our default cutoff is set at 2.5 Å, which is technically a strong contact. Sections 3.1.1 through 3.1.5 discuss a domain-by-domain analysis of the dimer contacts.
Figure caption (fragment): ARNT structure is shown in van der Waals (VdW) spheres with carbon in dark gray, oxygen in red, nitrogen in blue, and polar hydrogens in white. The inset is shown without ARNT in the same orientation for reference. (C) Same dimeric SIM1:ARNT structure rotated 90° about the X-axis.
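The strong/soft contact criteria stated for the interface can be written as a small classifier. This is a sketch under explicit assumptions: the 20° value is interpreted as the maximum deviation from a linear donor-hydrogen-acceptor geometry, the angle test is applied only to the strong class (as the text attaches it only there), and distances between 2.5 and 2.6 Å are assigned to the soft class. The function name and these interpretations are not taken from the authors' analysis tools.

```python
def classify_contact(distance_A: float, angle_dev_deg: float) -> str:
    """Classify a hydrogen-bond contact as 'strong', 'soft', or 'none'.

    distance_A    : hydrogen-acceptor distance in angstroms
    angle_dev_deg : deviation from linearity in degrees (assumed meaning of the 20 degree cutoff)
    """
    if 2.2 <= distance_A <= 2.5 and angle_dev_deg <= 20.0:
        return "strong"
    if 2.5 < distance_A <= 3.2:
        # Assumption: soft contacts are judged on distance alone.
        return "soft"
    return "none"


if __name__ == "__main__":
    for d, a in [(2.4, 10.0), (2.9, 35.0), (3.5, 5.0)]:
        print(d, a, classify_contact(d, a))
```

Applied to every donor-acceptor pair across the SIM1:ARNT interface, a classifier of this kind yields the per-domain tallies of strong and soft contacts.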

SIM1 Dynamics Demonstrates Pathogenic Variants Have Both Local and Global Effects
The two most illustrative poses for SIM1 are shown in Figure 4A (side and top views). The pathogenic variants are located either at the far N-terminal side (position 46) or well into the C-terminal end of the SM domain (positions 707, 715, and 740) (Figure 4A-E). A simulation of the full-length wild-type (WT) SIM1 (apo) was completed (Supplementary Movie S1). Simulations were performed on apo SIM1 to assess the impact of each variant on the conformational dynamics prior to association with ARNT. Variations in the conformational presentation would thereby reduce the likelihood of forming a proper SIM1:ARNT interface. We believe a mechanistic investigation of the variants would be biased by the presence of ARNT pre-dimerized with SIM1 in the simulations, and would fail to reveal important structural reorganizations influencing the interface. The central region (residues 300-335) is not displayed here, in order to reveal interesting possible interactions between the N-terminal and C-terminal regions of SIM1 in its apo form (Figure 5 and Supplementary Movie S1). Global measurements are given and discussed in Sections 3.2.1 through 3.2.4.

D707H Variant
The D707H variant induces reorganization by substituting negatively charged Asp with positively charged His (Figure 4C). Nearby residues R525, F699, H702, Y705, F706, H707, K708, H709, Y711, and T712 are all disturbed over the 150 ns simulation by the change from D to H (Figure 4C and Supplementary Movie S5). The Supplementary Movie shows a zoom into the 12 Å region surrounding H707 during the simulation.

G715V Variant
The G715V variant propagates reorganization through the insertion of a hydrophobic moiety (valine) where there was no side chain (Figure 4D). Adjacent residues R521, H523, R525, T711, L713, T714, V715, Y716, and H720 are all influenced over the 150 ns simulation by the change from G to V (Figures 3A and 4D, and Supplementary Movie S5). The Supplementary Movie shows a close-up of the 12 Å region surrounding V715 during the simulation. This valine could be particularly disruptive to the helix arrangement (Figure 3A).

D740H Variant
The D740H variant causes reorganization, likely by exchanging negatively charged Asp for positively charged His (Figure 4E). Proximate residues A474, N511, S512, P514, I682, N729, Y730, L732, H738, and F739 are all impacted over the 150 ns simulation by the change from D to H (Figure 4E and Supplementary Movie S5). The Supplementary Movie displays the 12 Å region surrounding H740 during the simulation.

Detailed Analyses for Local Deviations in Geometry Lead to Larger Amplitude Changes via Correlated Motions
A global measure of the change in the full-length SIM1 (apo) structure during the molecular dynamics simulations is given in Figure 6. Here, we report that only G715V reaches a more grossly changed state over the course of the 150 ns simulation. However, we observed interesting motion between the N-term and C-term of apo WT SIM1 (Figure 5A,B and Supplementary Movie S1), which could be a stable unbound form of SIM1. Mutant G715V gives an RMSD of around 15 Å from the starting conformation, having moved into a completely different global orientation between the N- and C-term due to the flexibility from the loosened helix. Mutants T46R and D707H showed less RMSD shift than WT (~5 Å difference), while mutant D740H converges with WT after 100 ns of simulation (Figure 4). The global RMSD shows that after approximately 100 ns each simulation settled into a general global conformation, as the RMSD retains only small fluctuations around an average. The initial structure for all of these models was essentially the same, aside from the in silico point mutations, and each variant settled to a different average RMSD with respect to this initial state: G715V ~15 Å, WT ~12 Å, D740H ~11 Å, T46R ~9 Å, and D707H ~9 Å. Because the global comparisons do not address individual residues or other reasons for the variants' loss of activity, we pursued multiple other metrics for analysis.
First, the local RMSD within 8 Å of the mutated residue gives a good indication of how much local geometric rearrangement occurs as a consequence of the individual variant; it was measured with respect to the initial frame of the WT SIM1 structure. Mutant T46R has the smallest RMSD change of the set (~3 Å from initial), while H707, V715, and H740 all have large jumps to >6 Å from their initial frames (Figure 3B). Mutant H740 has the biggest change early in the simulation, after which the residues settle around 8 Å from initial, while V715 shows the greatest number of large fluctuations, ranging from 2-10 Å for the first 45 ns, which corresponds to the helix loosening and destabilization of the SM domain. The H707 mutant has much the same effect as H740, but with lesser amplitude (~6 Å average).
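The global and local RMSD measures discussed above can be sketched with NumPy. The example uses the Kabsch algorithm for optimal superposition before computing the RMSD, and selects the local region as all atoms within 8 Å of the mutated residue in the reference frame. Two points are assumptions rather than statements about the authors' workflow: the function names are invented here, and the local region is superposed on itself (an alternative is to superpose globally and then measure the local deviation, which gives larger values when domains move as rigid bodies).

```python
import numpy as np


def kabsch_rmsd(mobile, reference):
    """RMSD between two (n_atoms, 3) coordinate sets after optimal superposition."""
    P = np.asarray(mobile, dtype=float)
    Q = np.asarray(reference, dtype=float)
    Pc = P - P.mean(axis=0)  # remove translation
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc            # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against improper rotations
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))


def local_mask(ref_coords, site_coords, cutoff=8.0):
    """Boolean mask of atoms within `cutoff` angstroms of any atom of the site."""
    ref = np.asarray(ref_coords, dtype=float)
    site = np.asarray(site_coords, dtype=float)
    dist = np.linalg.norm(ref[:, None, :] - site[None, :, :], axis=2)
    return (dist <= cutoff).any(axis=1)


def local_rmsd(frame, ref_coords, site_atom_indices, cutoff=8.0):
    """RMSD of the region around a mutated residue, relative to the reference frame."""
    ref = np.asarray(ref_coords, dtype=float)
    mask = local_mask(ref, ref[site_atom_indices], cutoff)
    return kabsch_rmsd(np.asarray(frame, dtype=float)[mask], ref[mask])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.normal(size=(50, 3)) * 6.0
    t = 0.7
    rot = np.array([[np.cos(t), -np.sin(t), 0.0],
                    [np.sin(t), np.cos(t), 0.0],
                    [0.0, 0.0, 1.0]])
    moved = ref @ rot.T + np.array([3.0, -2.0, 1.0])
    # A pure rotation plus translation superposes back to (numerically) zero RMSD.
    print(kabsch_rmsd(moved, ref) < 1e-8)
    print(local_rmsd(moved, ref, [0, 1, 2]) < 1e-8)
```

Evaluating `local_rmsd` for every trajectory frame gives a time series comparable in kind to the local RMSD traces described for each variant.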
To further assess the effect of the variants on the rest of the structure, a per-residue root-mean-square fluctuation (RMSF) calculation was completed to determine which residues fluctuated the most over the entire time course of the simulation, i.e., the time-averaged fluctuation (Figure 3C). While flattened values indicate a region of lower mobility, residues with larger fluctuations indicate a more dynamic structure undergoing rapid changes that can contribute to large conformational changes. Ignoring the trailing tail ends (residues 1-20 and 750-766) is generally prudent when considering RMSF, since the termini unsurprisingly have mobility in excess of other regions of the protein.
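The per-residue RMSF described above can be sketched as follows. The example assumes a trajectory array that has already been superposed onto a common reference and holds one representative atom per residue (for instance the Cα); both are assumptions for illustration, as are the function names. The helper that masks residues 1-20 and 750-766 follows the terminal ranges given in the text.

```python
import numpy as np


def rmsf(trajectory):
    """Time-averaged fluctuation per atom for a superposed trajectory.

    trajectory : array of shape (n_frames, n_atoms, 3)
    returns    : array of shape (n_atoms,)
    """
    traj = np.asarray(trajectory, dtype=float)
    mean_pos = traj.mean(axis=0)                     # average structure
    sq_dev = ((traj - mean_pos) ** 2).sum(axis=2)    # squared displacement per frame
    return np.sqrt(sq_dev.mean(axis=0))


def core_mask(residue_numbers, n_term_end=20, c_term_start=750):
    """True for residues outside the flexible termini (1-20 and 750-766 in the text)."""
    r = np.asarray(residue_numbers)
    return (r > n_term_end) & (r < c_term_start)


if __name__ == "__main__":
    # One atom oscillating between x = -1 and x = +1, one atom held still.
    traj = np.array([[[-1.0, 0.0, 0.0], [5.0, 5.0, 5.0]],
                     [[1.0, 0.0, 0.0], [5.0, 5.0, 5.0]]])
    print(rmsf(traj))  # [1. 0.]
    print(core_mask([10, 21, 749, 760]))  # [False  True  True False]
```

Comparing the masked RMSF profile of a variant against that of WT highlights the residue ranges where the variant is more or less mobile.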
Mutant H740 had the largest amplitude changes (6-13 Å), which were in residues 65-75, 150, 169, 199, 338-362, 423-431, 451-452, 556, 689, and 735 (Figure 3C). D740 is located in a partially buried sheet, tightly packed against neighboring residues. D740 appears optimized in that position, with strong polar contacts to R471, S731, and N729. A histidine in that position makes severe clashes with those and/or other residues, and is also likely repelled by R471. The extensive reordering of nearby regions needed to accommodate a histidine is evidenced by the substantial RMSF peaks in proximity to the aforementioned polar contacts. Close behind in amplitude (4-11 Å) was mutant V715, with changes in residues 43-66, 105-107, 114, 155, 198, 257, 343-368, 406-435, 480, 535-581, 637-638, and 735-742. G715 is located on the partially buried side of a helix, and a valine in that position makes severe clashes with the side chains of Y711 and E719, as well as the backbone and C-beta of H523. The significant structural rearrangements that must occur to accommodate a valine in that semi-buried portion of the helix corroborate the largest global RMSD and the significant localized RMSF peaks of that variant.
Mutant H707 has only a few peaks (>6 Å) that exceed the WT graph, which occur at positions 57-76, 144-147, 208, and 545. D707 resides on a helix with nearby residues Q704, K708, R525, and H527, which are notably mainly basic residues. Interestingly, R46 mostly mirrors the WT RMSF, but does have a few peaks with different values: positions 63-65 (8.84 Å) surpass WT (~6 Å), residues 532-603 are lower than WT by ~2 Å for that entire sequence, and the profile similarly stays flattened from 682-740. T46 is on the solvent-exposed side of a helix, with fewer residues within reach of the side chain, and exchanging a polar residue for a charged one in a solvent-exposed and unconstrained area explains the minimal dynamical impact, shown by the minimal RMSD and RMSF changes. However, T46 is one of the bHLH residues that makes direct contact with ARNT, so the exchange to the bulkier arginine is likely the cause of this mutant's pathogenicity.
In general, regions of elevated RMSF in the variants often correspond to portions of domains that make direct contact with ARNT, as discussed in Section 3.1. The majority of these elevated peaks are not in proximity to the mutations and are therefore altered relative to the motions of WT SIM1 through an allosteric mechanism. Consequently, most of the deleterious effects of these variants would not be uncovered by static structural predictions, with the possible exception of T46R.

Particle Size Changes as a Consequence of the Variant Chosen
Another typical analysis examines the spatial arrangement, or compactness, of the global structure. Using a radius of gyration (RoG) calculation (akin to a hydrodynamic radius), we estimate the root-mean-square distance of all atoms in the structure from the centroid (particle center of mass). The RoG can grow or shrink depending on a variety of factors.
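A radius of gyration calculation of the kind described above can be sketched in a few lines. The example uses the standard mass-weighted definition, Rg = sqrt(Σ mᵢ |rᵢ − r_com|² / Σ mᵢ); the choice of mass weighting and the function name are assumptions, since the text does not state which estimator the analysis software used.

```python
import numpy as np


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame.

    coords : array of shape (n_atoms, 3), in angstroms
    masses : array of shape (n_atoms,)
    """
    x = np.asarray(coords, dtype=float)
    m = np.asarray(masses, dtype=float)
    com = (m[:, None] * x).sum(axis=0) / m.sum()  # center of mass
    sq_dist = ((x - com) ** 2).sum(axis=1)
    return float(np.sqrt((m * sq_dist).sum() / m.sum()))


if __name__ == "__main__":
    # Two equal masses 2 A apart: each sits 1 A from the center of mass.
    print(radius_of_gyration([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [12.0, 12.0]))  # 1.0
```

Evaluated frame by frame, this gives an RoG-versus-time trace in which a collapse toward a compact state appears as a falling curve and an extended state as a plateau at larger values.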
Based on the earlier observations of the interaction between the N- and C-term in apo SIM1 (Supplementary Movie S1 and Figure 5), the RoG would be expected to collapse over the course of the simulation when plotted versus time. This is precisely what is seen for the WT sequence (green line) (Figure 3D), which starts around 41 Å and stabilizes at ~38 Å. Variants R46, H707, and H740 maintain a larger RoG than WT, although H740 does collapse to around 40 Å after 55 ns of simulation. R46 and H707, however, maintain larger RoG values of 42 and 44 Å, respectively. Intriguingly, V715 collapses to around 34 Å after just 25 ns of simulation, thus forming the most compact of the structures. These compact versus extended conformations may alter the ease of binding to partner proteins such as ARNT. The R46 mutant shows the least local deformation but a large global reorganization, which may work against its activity.

Stabilization of the Local Region Shifts through Hydrogen Bonding Network Disruptions (Triggering the Correlated Motion Cascade)
Given the local and global changes in structure, examining the shift in the local hydrogen-bonding (H-bond) network relative to the WT sequence can help establish a triggering mechanism that released the conformational change. All H-bonds were measured within an 8 Å cutoff of the residue (and included the entire residue within that cutoff).
Comparing the H-bonds of WT with those of R46 reveals an important difference, namely the loss of over 50% of the stabilizing H-bonds (a drop from 14 to 6 H-bonds) (Figure 3E). This loss of H-bonds could explain how the loosened N-term maintains a larger RoG but still has smaller peaks on RMSF, since it is more unwound but not as interactive as in the C-term variants (Supplementary Movies S2 and S4). Mutant H707 has a slightly increased average number of hydrogen bonds, between 8 and 10, whereas WT in the same region has only 4-7. The V715 mutant follows a similar trend to H707, with an average just over 10 against 8 for WT. Mutant H740 has over 11 H-bonds on average, while WT in the same region maintains approximately 8 during the simulation. From this list, we can observe that R46 lost 50% while the other variants gained 20-30% hydrogen bonds during the same time interval.
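The relative changes in H-bond count quoted above follow from a simple percent-change calculation. The sketch below applies it to two of the count pairs given in the text (14 to 6 for R46, and roughly 8 to 10 for V715); the function name is invented here, and the V715 value treats "just over 10" as 10 for illustration.

```python
def percent_change(wt_count: float, variant_count: float) -> float:
    """Relative change of the variant H-bond count with respect to WT, in percent."""
    return 100.0 * (variant_count - wt_count) / wt_count


if __name__ == "__main__":
    print(round(percent_change(14, 6), 1))   # -57.1  (R46 versus WT)
    print(round(percent_change(8, 10), 1))   # 25.0   (V715 versus WT, approximate)
```

A negative value indicates a net loss of local H-bonds relative to WT and a positive value a net gain.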

Apo SIM1 Has Room to Move Forming Intra-Molecular Interactions
The effect of the dampened H-bonds in R46 carried over to the intra-domain interactions (N-term to C-term) (Figure 5A,B), where WT shows domain interaction and R46 stays extended (not shown). SIM1 residues with a possibility of interaction over the course of a very long simulation include (N-term) H119, P145, Y146, H147, S148, V151, and E153, with (C-term) Q686, T687, D690, H691, P692, and R728. Chemical crosslinking could be conducted on these as a means of abrogating ARNT binding for validation. WT has intrinsic motion to move in this way (Supplementary Movie S1), which may help facilitate binding to ARNT or other important molecules (DNA, etc.).
In order to examine the spatial relationship between variants in the C-terminal region of SIM1, we constructed an ab initio model of this region, which has not been structurally characterized to date (Figure 7). The p.G715 residue is in a helix that faces toward solvent in the single-minded 1 C-terminal domain, which has a plethora of residues available as possible interactions (Supplementary Tables S1-S3). Substitution of valine for glycine at this position leads to a local increase in hydrophobicity and is predicted to disrupt helix stability over time. Validation of our hybrid model for SIM1 through generation of a high-quality crystal structure may be useful in mapping additional variant hotspots and in generating hypotheses regarding the functional consequences of pathogenic variants that fall within the transcription regulatory domain.

Discussion
Several disease-associated variants have been identified in the C-terminal transcription regulatory region around the p.G715V variant identified in this patient; for this study, these include D707H and D740H. We also examined the N-term mutant p.T46R, which has known pathogenicity and also serves as a control for the C-term group. Lastly, we studied WT SIM1 for comparison with all variants.

Mutant T46R
The mutant T46R has a pronounced difference in global structure despite minimal differences in local conformational switching (RMSD/RMSF), but the disrupted H-bond network and highly increased RoG give a good indication that the N-term is somewhat de-activated by this variant, since the protein is the most extended of all SIM1 variants. This difference is enough that the N- to C-term dynamics are highly dampened, and there would likely be lessened DNA binding capacity or ARNT binding (Figures 3B-E and 4B, Supplementary Movies S3 and S4). In addition, T46 makes direct contact with ARNT, which is likely a contributing factor as to why this variant had a severe impact on ARNT binding.

Figure 7. Mapping of all relevant positions in the SM domain (C-term) and their relative position. The SM domain is colored pink in ribbons with key residues in thick licorice rendering for emphasis (carbons-orange, oxygen-red, nitrogen-blue).

Mutant D707H
The mutant D707H has a pronounced switch in the local conformational region (RMSD/RMSF); its H-bond network is somewhat increased and its RoG is much increased from WT, which may cause the protein to have too much labile motion to properly bind ARNT, cofactors, or DNA. When comparing RMSF for the variants with the WT sequence, the mutant H707 is second most increased, coming after H740. This variant has the second largest RoG (compared with WT), which may contribute to it being too extended to make adequate contacts with partner proteins or to bind ARNT (Figures 1C, 3B-D,G, and 4D, and Supplementary Movies S3 and S5).

Mutant D740H
The mutant D740H has the largest switch in the local conformational region (RMSD/RMSF), with some added H-bond network. Its RoG is only marginally increased from WT, but it has huge spikes in RMSF for individual SIM1 residues that might alter how ARNT binds. The N- to C-term dynamics are similar to WT, but the individual fluctuating residues give SIM1 unique conformations (Figure 4D) that likely affect how ARNT binds (Figures 1C and 3B-D,H, and Supplementary Movies S3 and S7).

Mutant G715V
Our structural models indicate that G715V would have a detrimental effect on the protein's function via an altered local structure that perturbs multiple regions of the structure and may perturb the ARNT binding affinity for SIM1. Mutant G715V has minimal changes in the H-bonding of the local region, but large shifts in the local RMSD that arise from the other interactions discussed above. These new interactions establish a very stable structure that has the lowest RoG of any model in this study. This rather compact state of G715V SIM1 would likely frustrate ARNT binding and other partner complexes (Figures 3B,D and 4D, and Supplementary Movies S4 and S6). While the C-terminal region of SIM1 does not mediate interactions with ARNT, ARNT2, or DNA, a variant in this region could potentially disrupt or alter recruitment of regulatory co-factors and hence affect the function of the SIM1-ARNT2 heterodimer in target gene regulation. However, the identities of these co-factors remain to be determined. Future studies will examine the spatial relationship of this variant and other neighboring variants in the context of a SIM1/ARNT2 heterodimer, which may reveal novel pathological mechanisms of disease.

Conclusions
In this study, we constructed a full-length model of the SIM1:ARNT dimer. In addition, we performed MDS of apo SIM1 to examine the impact of known pathogenic variants. Of the variants discussed, only T46R makes direct contact with ARNT. Static structural predictions might explain T46R as pathogenic via alteration of a direct contact, though the disruption of the hydrogen-bonding network and of the N- to C-term dynamics would be impossible to discern from a static structure. The remaining variants, however, would not be predicted as pathogenic from a simple structural inspection. We propose that the pathogenicity of these variants derives from increasing the conformational flexibility of several regions that make direct contact with ARNT, through a dynamic and allosteric mechanism. The increase in mobility of these binding regions impedes structural pre-configuration, thus explaining the somewhat lesser impact of these variants on ARNT binding, relative to T46R.