Structural Insight into the Binding of TGIF1 to SIN3A PAH2 Domain through a C-Terminal Amphipathic Helix

TGIF1 is a transcriptional repressor playing crucial roles in human development and function and is associated with holoprosencephaly and various cancers. TGIF1-directed transcriptional repression of specific genes depends on the recruitment of corepressor SIN3A. However, to date, the exact region of TGIF1 binding to SIN3A was not clear, and the structural basis for the binding was unknown. Here, we demonstrate that TGIF1 utilizes a C-terminal domain (termed as SIN3A-interacting domain, SID) to bind with SIN3A PAH2. The TGIF1 SID adopts a disordered structure at the apo state but forms an amphipathic helix binding into the hydrophobic cleft of SIN3A PAH2 through the nonpolar side at the holo state. Residues F379, L382 and V383 of TGIF1 buried in the hydrophobic core of the complex are critical for the binding. Moreover, homodimerization of TGIF1 through the SID and key residues of F379, L382 and V383 was evidenced, which suggests a dual role of TGIF1 SID and a correlation between dimerization and SIN3A-PAH2 binding. This study provides a structural insight into the binding of TGIF1 with SIN3A, improves the knowledge of the structure–function relationship of TGIF1 and its homologs and will help in recognizing an undiscovered SIN3A-PAH2 binder and developing a peptide inhibitor for cancer treatment.

Sin3A is a general transcription co-regulatory factor and commonly acts as a scaffold protein assembling the DNA-binding TFs, HDACs and other chromatin-modifying enzymes into a large complex, which facilitates transcription repression through changing the chromatin compaction [36,37]. In recent years, emerging roles of SIN3A in cancer development have been revealed [38]. Sin3A contains five defined domains including three paired amphipathic helix (PAH) domains, an HDAC-interaction domain (HID) and a C-terminal domain. Therein, PAH2 is a major docking site of diverse TFs for mediating SIN3A-targeted regulation of specific gene transcription [39][40][41][42]. Peptides and small molecule inhibitors for SIN3A PAH2 have been designed to block its binding with TFs and inhibit triple-negative breast cancer (TNBC) cell metastasis [43][44][45]. The complex structures of SIN3A PAH2 binding with Sin3A-interaction domains (SIDs) of Mad1, HBP1 and Pf1 have been solved, respectively [40,46,47]. In these structures, SIN3A PAH2 adopts a similar four-helix bundle fold with a hydrophobic cleft, into which the single-helix SID of Mad1, HBP1 or Pf1 can insert. However, as the sequence conservation of the known SIDs is low, it is hard to predict the location of the SID in other TFs, such as TGIF1, through sequence analysis.
In this study, we applied NMR spectroscopy, structure simulation and biochemical methods to investigate the structural basis of TGIF1 binding to SIN3A PAH2. The results reveal that the TGIF1 RD2 truncation covering residues M256-A401 was structurally disordered and existed as a monomer in solution. Within this truncation, the region harboring residues F376-E394 contributed to the binding of SIN3A as the SID. Structure simulation showed that TGIF1 SID formed an amphipathic helix binding into the hydrophobic cleft of SIN3A PAH2. Furthermore, F379, L382 and V383 were identified as the key residues of TGIF1 SID for SIN3A-PAH2 binding through site-directed mutagenesis combined with a yeast two-hybrid (Y2H) assay. Interestingly, although TGIF1 RD2 was not observed to dimerize in solution, the Y2H assay indicated that it can dimerize via the SID in a cell, implying a potential dual role of TGIF1 SID and a relationship between dimerization and SIN3A binding of TGIF1.

TGIF1 256-401 Adopts an Intrinsically Disordered Structure
In order to understand the structural basis of TGIF1 for recruiting corepressor SIN3A, a recombinant TGIF1 truncation, TGIF1 256-401, which covers RD2, was produced from E. coli. TGIF1 256-401 was found in the inclusion bodies of the bacterial lysate, unlike the TGIF1 256-375 truncation previously reported to be present in the supernatant of the bacterial lysate [32], probably because more than half of the residues from F376 to A401 are hydrophobic and tend to aggregate intermolecularly. Sequence alignment among the TGIF1 homologs in different vertebrates showed that this region is highly conserved in sequence (Figure 1B).
The circular dichroism (CD) spectra of TGIF1 256-375 and TGIF1 256-401 both showed a significant negative peak near 200 nm, indicating an intrinsically disordered structure of both TGIF1 truncations (Figure 2A). NMR methods have unique advantages in studying intrinsically disordered proteins (IDPs); for example, they can provide residue-level structural information. Thus, 15N-labeled TGIF1 256-375 and TGIF1 256-401 were each prepared for the NMR study. The 1H-15N HSQC spectra of both TGIF1 256-375 and TGIF1 256-401 showed a narrow distribution of cross-peaks in the 1H dimension (concentrated between 7.7 and 8.7 ppm), consistent with a disordered structure (Figure 2B). The signal intensities of the cross-peaks were uniform, and the peak number was consistent with the residue number of each TGIF1 truncation, suggesting that they have a relatively stable conformation in solution. When the two spectra were overlaid, most cross-peaks of TGIF1 256-375 and TGIF1 256-401 overlapped well except those from the additional residues F376-A401, indicating that these residues do not have significant intramolecular interaction with the residues M256-D375. Taken together, these CD and NMR data revealed an intrinsically disordered structure of TGIF1 256-401.

TGIF1 376-401 Plays a Major Role in Binding to SIN3A PAH2
The binding interface of TGIF1 for SIN3A PAH2 was subsequently investigated using NMR titration of 15N-labeled TGIF1 256-375 or TGIF1 256-401 with non-labeled SIN3A PAH2. For TGIF1 256-375, no significant change in the signal intensity or position of the cross-peaks occurred when the molar ratio of TGIF1 256-375 vs. SIN3A PAH2 was increased from 1:0 to 1:2 (Figure 3A), suggesting that SIN3A PAH2 interacted either not at all or only very weakly with TGIF1 256-375 under these conditions. In contrast, when the molar ratio of TGIF1 256-401 vs. SIN3A PAH2 was gradually increased from 1:0 to 1:2, although no obvious shift in the position of the cross-peaks was observed, the signal intensities of many cross-peaks significantly decreased and eventually disappeared (Figure 3B), which indicated substantial binding of TGIF1 256-401 to SIN3A PAH2. Specifically, the residues whose cross-peaks disappeared were concentrated in the region covering F376 to A401, indicating that the principal binding interface of TGIF1 256-401 for SIN3A PAH2 should be located in the C-terminal region harboring residues F376-A401.
At the same time, a Y2H experiment between SIN3A PAH2 and different TGIF1 truncations was performed, in which SIN3A PAH2 was fused with the BD domain of yeast GAL4 and TGIF1 truncations were fused with the GAL4 AD domain. The result shows that the yeast cells expressing BD-SIN3A-PAH2 and AD-TGIF1 256-401 could grow normally on SD4 medium (lacking Ade, His, Leu and Trp), whereas those expressing BD-SIN3A-PAH2 and AD-TGIF1 256-375 could not (Figure 3C), confirming the dominant role of residues F376-A401 in binding to SIN3A PAH2.
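The way a titration of this kind localizes a binding interface can be sketched in a few lines: residues are flagged when their HSQC cross-peak intensity in the presence of the partner drops below some fraction of the free-state intensity. The function name, the 0.2 cutoff and the intensity values below are illustrative assumptions, not data or analysis parameters from this study.

```python
from typing import Dict, List


def broadened_residues(free: Dict[int, float],
                       bound: Dict[int, float],
                       cutoff: float = 0.2) -> List[int]:
    """Return residue numbers whose intensity ratio I_bound / I_free is below cutoff.

    The cutoff is a hypothetical choice made for this illustration.
    """
    flagged = []
    for resnum, i_free in sorted(free.items()):
        if i_free <= 0:
            continue  # no usable reference intensity for this peak
        # A peak that is absent in the bound spectrum counts as fully broadened.
        ratio = bound.get(resnum, 0.0) / i_free
        if ratio < cutoff:
            flagged.append(resnum)
    return flagged


# Hypothetical intensities in arbitrary units (NOT measured values).
free = {300: 1.00, 350: 0.95, 376: 1.10, 382: 0.90, 394: 1.05}
bound = {300: 0.97, 350: 0.90, 376: 0.05, 382: 0.00}

print(broadened_residues(free, bound))
```

With these made-up numbers only the C-terminal residues are flagged, which mirrors the qualitative pattern described for TGIF1 256-401.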

Figure 3 (caption, as recovered): ... 15N-labeled TGIF1 256-401 mixed with non-labeled SIN3A PAH2 at molar ratios of 1:0 (red), 1:0.5 (cyan), 1:1 (orange) and 1:2 (blue). (C) Y2H experiments between SIN3A PAH2 and TGIF1 truncations. Yeast transformants harboring both AD- and BD-derived constructs were grown on SD2 (-Trp/-Leu) medium for growth control and SD4 (-Trp/-Leu/-His/-Ade) medium for the interaction test.

TGIF1 376-401 Sequence Is Conserved among TGIF1 Homologs and Exhibits Similarity to SIDs of Other TFs
As mentioned earlier, the sequences of residues F376-A401 among the TGIF1 homologs in different vertebrates are highly conserved. Considering that this region of human TGIF1 plays a major role in binding to SIN3A PAH2, the binding of SIN3A should be an evolutionarily conserved function of TGIF1 for transcription regulation. Meanwhile, the sequence similarity of TGIF1 F376-A401 with the previously identified SIDs of Mad1, HBP1 and Pf1 was analyzed by sequence alignment, which revealed an eight-residue motif with moderate similarity and suggested that the residues F376-E394 were probably sufficient for SIN3A-PAH2 binding (Figure 4A). Moreover, the consensus sequence of the eight-residue motifs could be summarized as φ-x-x-L-φ-x-φ-A, wherein "φ" represents a hydrophobic residue and "x" represents a non-conserved residue. As inferred from the sequence similarity, TGIF1 SID may employ a similar binding model for SIN3A PAH2 as the other three SIDs. Regrettably, the cross-peaks of TGIF1 SID in complex with SIN3A PAH2 were broadened, some beyond detection in the NMR spectrum, preventing the direct structure determination of holo TGIF1 SID and the complex.

Figure 4 (caption, as recovered): (B) The structure model of TGIF1-SID/SIN3A-PAH2 complex is shown in cartoon representation with secondary structure elements labeled respectively. TGIF1 SID is colored in red (helix) and green (loop), while SIN3A PAH2 is colored in cyan (helix) and pink (loop). The N- and C-termini are labeled with red and cyan fonts for TGIF1 SID and SIN3A PAH2, respectively. (C) The structure model of TGIF1-SID/SIN3A-PAH2 complex is shown with the hydrophobic level displayed from high (red) to low (white). TGIF1 SID is shown in cartoon tube representation with the side chains of the residues interacting with SIN3A PAH2 displayed as sticks and labeled with residue number. SIN3A PAH2 is shown in surface representation with four helices labeled. (D) Helical wheel representation of 18 residues (F376-M393) in TGIF1 SID. Polar residues are shown in cyan circles, while nonpolar residues are in magenta circles. The underlined residues have contact with SIN3A PAH2 in the complex structure model. (E,F) Side-chain interactions between TGIF1 SID (backbone in magenta) and SIN3A PAH2 (backbone in cyan). The side chains of TGIF1-SID residues involved in interaction are shown as sticks with carbon colored in green, and those of SIN3A PAH2 are shown as sticks with carbon colored in orange. Non-interacting parts of SIN3A PAH2 were omitted for clarity. Hydrogen bond and salt bridge are shown with black dashed lines.

Structure Model of TGIF1-SID/SIN3A-PAH2 Complex Reveals the Interaction Pattern
In order to obtain the detailed interaction mechanism, the complex structure model of TGIF1 SID and SIN3A PAH2 was established by multi-step molecular modeling (Figure 4B). In the complex structure, SIN3A PAH2 adopts a four-helix bundle structure with a deep hydrophobic pocket formed between helices α1 and α2. TGIF1 SID (F376-E394) adopts a single amphipathic α-helix with one side full of hydrophobic residues and the other side mainly containing hydrophilic residues. The helix binds to the hydrophobic pocket of SIN3A PAH2 through its hydrophobic side (Figure 4C,D). The overall complex structure is similar to the structures of SIN3A PAH2 in complex with the SIDs of Mad1, HBP1 and Pf1. Interestingly, CD and NMR data suggested a largely disordered structure of TGIF1 SID at the apo state, implying a potential disorder-to-order transition of TGIF1 SID upon SIN3A-PAH2 binding. This is similar to the behavior of the SIDs of Mad1, HBP1 and Pf1, suggesting a common behavior for PAH2 binders.
The side-chain conformations of TGIF1 SID and SIN3A PAH2 in the complex structure model were well defined, enabling a detailed analysis of interaction at residue level. The residues in the α1 and α2 helices are dominant in the pocket of SIN3A PAH2 for TGIF1-SID binding, while several residues in the α3 and α4 helices are also involved. The pocket floor is defined by the side chains of A307, Y310, L332, Y335, E355, V358, F376, F379 and L380, which make hydrophobic interactions with the N-terminal part of TGIF1 SID including residues F376, F379, L382 and V383 (Figure 4E). The pocket edge of SIN3A PAH2 is defined by a set of residues involving F304, I308, V311, K315, Y325, K326, L329, H333, Q336 and Q339 from the α1 and α2 helices (Figure 4F). F304, I308, V311, K315, Y325, L329 and H333 make hydrophobic interactions with L381, V385, A386, L387, A390 and M393 of TGIF1. Q336 and Q339 of SIN3A make polar interactions with TGIF1 Q380, with a hydrogen bond formed between SIN3A Q336 and TGIF1 Q380. SIN3A K326 shows an electrostatic interaction with TGIF1 E394, with a salt bridge formed between their side chains. Importantly, F379, L382 and V383 of TGIF1 are deeply embedded in the pocket of SIN3A PAH2, such that they appear to be an indispensable part of the hydrophobic core of the complex. Overall, TGIF1 SID recognizes SIN3A PAH2 through an amphipathic helix whose nonpolar side makes hydrophobic interactions with the pocket. Generally, the interaction pattern in the structure model of the TGIF1-SID/SIN3A-PAH2 complex conferred by the conserved residues in the consensus motif "φ-x-x-L-φ-x-φ-A" is similar to those in the complex structures of SIN3A-PAH2/Mad1-SID, SIN3A-PAH2/HBP1-SID and SIN3A-PAH2/Pf1-SID. Nevertheless, the interaction details mediated by the non-conserved residues in TGIF1 SID differ considerably from those in the other three complexes.
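The amphipathic character of the helix can be illustrated with a helical-wheel calculation. The sketch below assumes an ideal α-helix (3.6 residues per turn, so 100° of rotation per residue) and places F376 at 0°; the function name and this choice of origin are assumptions made for illustration, and the residue list is simply the set of hydrophobic TGIF1 residues named above as contacting SIN3A PAH2.

```python
def wheel_angle(resnum: int, first: int = 376, step: float = 100.0) -> float:
    """Angular position (degrees) of a residue on an ideal alpha-helical wheel.

    Assumes 100 degrees per residue and puts residue `first` at 0 degrees.
    """
    return ((resnum - first) * step) % 360.0


# Hydrophobic TGIF1 residues reported to contact SIN3A PAH2 in the model.
contact_residues = {
    376: "F", 379: "F", 381: "L", 382: "L", 383: "V",
    385: "V", 386: "A", 387: "L", 390: "A", 393: "M",
}

for resnum, aa in sorted(contact_residues.items()):
    print(f"{aa}{resnum}: {wheel_angle(resnum):5.1f} deg")
```

Residues whose angles fall within a common arc of the wheel lie on the same face of the helix, which is the geometric meaning of "one side full of hydrophobic residues".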
In order to further evaluate the importance of individual TGIF1 residues at the binding interface for SIN3A PAH2, 13 residues of TGIF1 256-401 with bulky side chains were each mutated to alanine, and the resulting mutants were subsequently tested for SIN3A-PAH2 binding through a Y2H experiment. Among the 13 mutants, F379A, L382A and V383A failed to bind SIN3A PAH2, as the yeast cells co-expressing these mutants and SIN3A PAH2 could grow only on SD2 medium and not on SD4 medium (Figure 5). The other mutants did not show obviously disrupted binding to SIN3A PAH2. These data indicate that F379, L382 and V383 are crucial for binding to SIN3A PAH2, consistent with the complex structure model, in which F379, L382 and V383 are an indispensable part of the complex hydrophobic core. Similarly, mutations of L370 and M373 of HBP1 and F210 and L213 of Pf1, which correspond to F379 and L382 of TGIF1 in sequence alignment, also impair the binding with SIN3A PAH2 [40,47], supporting a common mechanism. However, the key role of the residue at the position corresponding to TGIF1 V383 in the consensus motif was demonstrated for the first time in this study, calling for more investigations of the residues at this position in SIDs of other SIN3A-PAH2 binders.

TGIF1 SID Mediates Homodimerization of TGIF1 in Cell
A previous study reported that the C-terminal part (residues F237-A401) of TGIF1 mediates its homodimerization in human 293 cells [48]. Thus, multi-angle light scattering (MALS) coupled with size-exclusion chromatography (SEC) was carried out for recombinant TGIF1 256-401 to determine whether it existed as a homodimer in solution. The results show a single peak in the chromatogram with a calculated average molecular weight (MW) of 16.03 × 10^3 Da (±0.618%), which was close to the theoretical MW (15.63 kDa) of the TGIF1 256-401 monomer (Figure 6A). This indicates that TGIF1 256-401 exists as a monomer in solution instead of a homodimer, which may be because the disordered structure of apo-state TGIF1 256-401 is insufficient for homodimer formation. At the same time, a Y2H assay was used to investigate the homodimerization of TGIF1 256-401. The result showed that TGIF1 256-401 can homodimerize in a yeast cell, as the cells co-expressing AD-TGIF1 256-401 and BD-TGIF1 256-401 grew well on SD4 medium (Figure 6B). Furthermore, the yeast cells co-expressing AD-TGIF1 256-375 and BD-TGIF1 256-401 could not grow on SD4 medium (Figure 6B), indicating that the residues F376-A401 essential for binding to SIN3A PAH2 are also crucial for homodimerization of TGIF1 in a cell. The inconsistent results of MALS and Y2H may be due to the different states of TGIF1 in the two experiments.
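The arithmetic behind the monomer assignment can be made explicit. The sketch below uses only the two masses and the relative uncertainty quoted above; estimating the oligomeric state by rounding the mass ratio is a common simplification assumed here for illustration, not necessarily the procedure used by the authors.

```python
def oligomeric_state(measured_kda: float, monomer_kda: float) -> int:
    """Nearest whole number of subunits implied by a measured molar mass."""
    return round(measured_kda / monomer_kda)


measured_kda = 16.03         # SEC-MALS average MW reported in the text
rel_uncertainty = 0.618e-2   # +/- 0.618 % reported in the text
monomer_kda = 15.63          # theoretical monomer MW reported in the text

ratio = measured_kda / monomer_kda
abs_uncertainty_kda = measured_kda * rel_uncertainty

print(f"measured / monomer = {ratio:.3f}")
print(f"uncertainty        = +/- {abs_uncertainty_kda:.2f} kDa")
print(f"expected dimer MW  = {2 * monomer_kda:.2f} kDa")
print(f"oligomeric state   = {oligomeric_state(measured_kda, monomer_kda)}")
```

The measured mass lies within about 3% of the monomer value and far from the roughly 31 kDa expected for a dimer.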
Given that the region covering residues F376-A401 mediates the homodimerization of TGIF1 in yeast cells, the key residues for homodimerization were subsequently identified by Y2H employing the 13 AD-TGIF1 256-401 mutants which were used in the Y2H assay with SIN3A PAH2. Among the 13 mutants, F379A, L382A and V383A could not homodimerize as the yeast cells co-expressing these mutants and BD-TGIF1 256-401 could only grow on SD2 medium but not on SD4 medium ( Figure 6C). Other mutants did not show obviously disrupted homodimerization. These data indicate that the residues F379, L382 and V383 are crucial for homodimerization, which are identical to those crucial for binding with SIN3A PAH2. Taken together, these results suggest that TGIF1-SID plays a dual role of mediating the dimerization and the binding with SIN3A.

Discussion
The homeodomain protein TGIF1 is a transcriptional repressor playing crucial roles in human development and function and is associated with HPE and various cancers. TGIF1-directed transcriptional repression of specific genes depends on the recruitment of corepressor SIN3A and subsequent HDACs to alter the chromatin accessibility, which is mediated by the previously designated RD2 of TGIF1. However, due to the inconsistent results of previous studies [21,22], to date, the exact region of TGIF1 for binding to SIN3A was not clear, and the structural basis for the binding was unknown. Here, we evidence that the residues F376-E394 of TGIF1 (designated as TGIF1 SID) play a pivotal role in binding to SIN3A PAH2 and establish a structural model for the TGIF1-SID/SIN3A-PAH2 complex. Interestingly, we reveal that TGIF1 undergoes homodimerization through the SID in a cell, implying a dual role of the SID.

Disorder-to-Order Transition of TGIF1 SID upon Binding to SIN3A PAH2
Previous NMR studies showed that the region covering residues M256-D375 of TGIF1 is disordered at the apo state in solution [31,32]. Building on these studies, we further show here that the residues F376-A401 are also disordered at the apo state. On the other hand, molecular modeling indicated an α-helix structure of TGIF1 F376-E394. Similarly, an α-helix conformation of TGIF1 F376-A401 was recently predicted in the structure model of TGIF1 by AlphaFold2 and is recorded under the ID AF-Q15583-F1 in the AlphaFold protein structure database [49]. There thus appears to be an inconsistency between the experimental and predicted results. However, the predicted α-helix structure of TGIF1 SID most likely arises because its homologous sequences with a solved helical structure, such as the SIDs of Mad1, HBP1 and Pf1, are largely at the holo state and provide the major reference information for the structure predictions by I-TASSER and AlphaFold. In fact, Mad1-SID and HBP1-SID are disordered at the apo state [40]. Thus, the predicted conformation of TGIF1 SID reasonably represents its holo state, which can be docked to SIN3A PAH2 with a high score. Similar to Mad1 SID and HBP1 SID, TGIF1 SID should undergo a disorder-to-order transition between apo and holo states. These results call for caution when researchers deal with a new structure model predicted by software such as I-TASSER and AlphaFold: although these tools have shown remarkable success in independent assessments of accuracy, proteins may adopt an experimentally undiscovered conformation that is not yet included in their reference data pool.

Information from the Interaction Pattern of TGIF1 SID with SIN3A PAH2
Following the determination of TGIF1 F376-A401 as the major binding region for SIN3A PAH2, a complex structure model was obtained through molecular modeling in this study, which identified F376-E394 as the SID of TGIF1 and revealed a moderately conserved binding pattern between TGIF1 SID and SIN3A PAH2. Hydrophobic interactions made by the N-terminal F379, L382 and V383 residues of TGIF1 SID with the pocket floor of SIN3A PAH2 provide the pivotal contribution to the binding of the two proteins, as mutation of any one of the three residues can abolish the binding. Sequence alignment of TGIF1 SID with SIDs of Mad1, HBP1 and Pf1 indicates that the three residues in SIDs are conserved. Moreover, previous structure studies revealed that the corresponding three residues in Mad1, HBP1 and Pf1 all make critical interactions with SIN3A PAH2, although the role of the residue corresponding to TGIF1 V383 was not subjected to mutation analysis [40,46,47]. On the other hand, F376 at the N-terminal end of TGIF1 SID and those residues interacting with the pocket edge of SIN3A PAH2 may play an auxiliary role to enhance the binding. Corresponding residues in Mad1, HBP1 and Pf1 are variable in amino acid type, indicating that the interaction pattern is specific to each individual SIN3A binder. In general, the SIN3A-PAH2 binders have a consensus sequence of φ-x-x-L-φ-x-φ-A, and all form an amphipathic helix at the holo state.
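A consensus of this form can be turned into a simple pattern search, which is one way such a motif could be used to screen candidate sequences. The sketch below is illustrative only: the set of residues counted as hydrophobic (φ) is an assumption, and the test string is a partial TGIF1 376-394 sequence assembled solely from the residues named in this text, with "X" as a placeholder wherever the residue identity is not stated here.

```python
import re
from typing import List, Tuple

# Residues treated as hydrophobic (phi); this choice is an assumption.
HYDROPHOBIC = "AVILMFWY"

# phi-x-x-L-phi-x-phi-A, wrapped in a lookahead so overlapping hits are found.
SID_PATTERN = re.compile(
    rf"(?=([{HYDROPHOBIC}]..L[{HYDROPHOBIC}].[{HYDROPHOBIC}]A))"
)


def find_sid_motifs(seq: str, first_resnum: int = 1) -> List[Tuple[int, str]]:
    """Return (residue number of motif start, matched 8-mer) for every hit."""
    return [(m.start() + first_resnum, m.group(1))
            for m in SID_PATTERN.finditer(seq)]


# Partial TGIF1 376-394: only residues named in the text are filled in;
# 'X' marks positions whose identity is not given here.
PARTIAL_TGIF1_SID = "FXXFQLLVXVALXXAXXME"

print(find_sid_motifs(PARTIAL_TGIF1_SID, first_resnum=376))
```

A match on a pattern this loose is only a starting point; as noted earlier, sequence conservation among known SIDs is low, so any hit would still need experimental confirmation.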
TGIF1 and SIN3A are associated with many cancers. A decoy peptide containing the MAD1-SID sequence showed inhibitory activity to TNBC cells [43,45]. Identification of the TGIF1-SID sequence enriches the SID sequence pool that can be referred to when the decoy peptide targeting SIN3A is further developed in the future. Moreover, the structure model of the TGIF1-SID/SIN3A-PAH2 complex provides important knowledge for reasonable modification to increase the binding affinity of the decoy peptide. Thus, this work will help in the development of anti-cancer drugs for the clinical treatment of TGIF1-related disease.

Correlation between Homodimerization and SIN3A Binding of TGIF1
A previous study focusing on the relationship of TGIF1 and HPE showed, using a co-immunoprecipitation assay in human 293 cells, that the C-terminal part of TGIF1 harboring residues F237-A401, which largely belongs to the designated RD2 domain, mediates the formation of a TGIF1 homodimer [48]. The homodimerization of TGIF1 was reproduced using a Y2H assay for a TGIF1 256-401 truncation in this study, with the identification of the exact region and key residues for homodimerization. However, recombinant TGIF1 256-401 produced from E. coli existed as a monomer in solution as shown by the MALS-SEC assay. This raises the possibility that the homodimerization of TGIF1 in a cell is promoted by factors not yet identified, such as post-translational modification. On the other hand, the intrinsically disordered structure of TGIF1 256-401 may not be able to provide an effective homodimerization interface. The fact that recombinant TGIF1 256-401 fell into the inclusion body during expression in E. coli suggests that it underwent irregular intermolecular interactions leading to insoluble aggregation. Post-translational modification or the environment in a eukaryotic cell may remodel the conformation of TGIF1 256-401 to form an interface for homodimerization. Considering that homodimerization and SIN3A binding utilize the same domain and key residues, there should be an antagonism between the two behaviors of TGIF1, and which one is dominant must depend on which has the higher binding affinity. However, it is also very likely that TGIF1 256-401 forms dimers indirectly through binding to some adaptor protein. SIN3A may be a candidate for the adaptor protein, given that the region and key residues for TGIF1 homodimerization in a cell and binding with SIN3A are identical and there is a conserved SIN3A homolog in yeast. Currently, the role of TGIF1 homodimerization is not clear.
If TGIF1 homodimerization is achieved indirectly through binding to SIN3A, it will be an additional effect of complex assembly. If TGIF1 homodimerizes directly through the SID or indirectly through binding to an adaptor protein other than SIN3A, there will be a competition between the homodimerization and SIN3A binding, which leads to a negative role in TGIF1/SIN3A-mediated transcription repression. In any event, the role of TGIF1 homodimerization needs further investigation.

Sequence and Function Comparison of Human TGIF Homologs
The human genome has four TGIF genes, which encode the TGIF1, TGIF2, TGIF2LX and TGIF2LY proteins. TGIF2 functions redundantly with TGIF1 in the transcription repression of many genes [13,27,50,51], while the functions of TGIF2LX and TGIF2LY remain obscure. Sequence alignment revealed that the homeodomain is the only functional domain shared by all four TGIF proteins (Figure 7). TGIF1, TGIF2 and TGIF2LX, but not TGIF2LY, have the SID for binding to SIN3A PAH2, suggesting that these three proteins use a similar mechanism for transcription regulation, namely recruiting the corepressor SIN3A. Consistent with this, TGIF2 can recruit HDAC1 for transcription repression [52], which, according to current knowledge, is reasonably considered to depend on first recruiting SIN3A [36,37]. An Fbxw7-targeted motif can also be found in TGIF2, implying a potential Fbxw7-mediated turnover of TGIF2 through a ubiquitin-dependent degradation pathway. However, the "PLDLS" motif for recruiting CtBP1/2 and the RD2a of TGIF1 are absent from TGIF2, TGIF2LX and TGIF2LY, suggesting that TGIF1 has unique molecular functions among the four proteins.

Production of Recombinant Proteins
The coding DNA fragments of TGIF1 256-375, TGIF1 256-401 and SIN3A PAH2 (residues S295-N384) were cloned from human HeLa cells and inserted into a modified pET32 vector, which allows recombinant expression of each protein fused to the purification tag "MHHHHHHSSGLVPRGS". After DNA sequencing, the plasmids containing the coding sequences were individually transformed into E. coli Rosetta (DE3) cells, and protein expression was induced using a method similar to that described previously [32].
The purification of TGIF1 256-375 was carried out as previously described [32]. For TGIF1 256-401 purification, E. coli cells were collected by centrifugation and resuspended in solution A (20 mM Tris-HCl, 500 mM NaCl, 1 mM phenylmethylsulfonyl fluoride (PMSF), pH 7.8) for lysis by sonication. After high-speed centrifugation, the supernatant was discarded, and the pellet was washed twice with solution B (20 mM Tris-HCl, 500 mM NaCl, 2 mM EDTA, pH 7.8) and then twice with Milli-Q water. Subsequently, the pellet was dissolved in solution C (20 mM Tris-HCl, 500 mM NaCl, 6 M guanidine hydrochloride, pH 7.8) and clarified by high-speed centrifugation and filtration. The clarified solution was loaded onto an ÄKTAxpress™ chromatography system (GE Healthcare, Boston, MA, USA) equipped with a Ni-affinity column (HisTrap IMAC HP™ column, 5 mL). The TGIF1 256-401 protein was eluted with solution D (20 mM Tris-HCl, 500 mM NaCl, 6 M guanidine hydrochloride, 250 mM imidazole, pH 7.8) and then dialyzed several times against solution E (20 mM Tris-HCl, 500 mM NaCl, pH 7.8). The purification tag was removed by incubation with thrombin at 20 °C for 3 h. The resulting TGIF1 256-401, with only two additional residues ("GS") at its N-terminus, was further purified by gel filtration chromatography on an NGC chromatography system (Bio-Rad, Hercules, CA, USA) equipped with a HiLoad 26/60 Superdex 75 column (GE Healthcare). The purified protein was concentrated to a final concentration of 0.4 mM for NMR study in solution F (90% H2O/10% D2O (v/v), 20 mM HEPES, 80 mM NaCl, 2 mM dithiothreitol (DTT), 0.05% NaN3, pH 6.4).
For purification of SIN3A PAH2, the harvested E. coli cells were resuspended in solution G (20 mM Tris-HCl, 100 mM NaCl, 1 mM PMSF, pH 8.0) for lysis by sonication. The supernatant was clarified by centrifugation and filtration and then loaded onto the ÄKTAxpress™ system equipped with a Ni-affinity column (HisTrap IMAC HP™ column, 5 mL). The protein was eluted with solution H (20 mM Tris-HCl, 100 mM NaCl, 250 mM imidazole, at pH 8.0) and dialyzed against solution I (20 mM Tris-HCl, 100 mM NaCl, pH 8.0). Subsequently, the purification tag was removed, and the resulting SIN3A PAH2 was further purified through the gel filtration chromatography mentioned above. The purified protein was exchanged into solution F and concentrated just before NMR titration.

Circular Dichroism
CD spectra of TGIF1 256-375 and TGIF1 256-401 from 190 to 260 nm were recorded on a Chirascan™ CD spectrometer (Applied Photophysics, Leatherhead, Surrey, UK) using a quartz cell with a 0.2 cm path length, a step size of 1 nm and a bandwidth of 1 nm at 25 °C. Measurements were conducted with 10 µM protein in 10 mM KH2PO4, pH 6.5. Each sample was scanned three times; the scans were averaged, and the spectrum of the buffer solution (recorded as the baseline) was subtracted to generate the final spectrum.
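The scan averaging and baseline subtraction described above can be sketched as follows. This is a minimal illustration in plain Python, not the instrument software's procedure; the function name `process_cd_scans` and the list-based data layout are assumptions made for the example, and no measured values are reproduced here.

```python
def process_cd_scans(sample_scans, buffer_scan):
    """Average replicate CD scans point by point, then subtract the buffer baseline.

    sample_scans: list of scans, each a list of ellipticity values on the same
                  wavelength grid (three scans per sample in the protocol above).
    buffer_scan:  ellipticity values of the buffer alone on that same grid.
    """
    n_points = len(buffer_scan)
    if not sample_scans or any(len(scan) != n_points for scan in sample_scans):
        raise ValueError("every scan must share the wavelength grid of the buffer scan")
    n_scans = len(sample_scans)
    averaged = [sum(scan[i] for scan in sample_scans) / n_scans for i in range(n_points)]
    return [avg - base for avg, base in zip(averaged, buffer_scan)]


# Wavelength grid implied by the acquisition settings: 190 to 260 nm in 1 nm steps.
wavelengths_nm = list(range(190, 261))
```

With a 1 nm step, the grid holds 71 points, so each scan passed to the function would be a list of 71 values.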

Multi-Angle Light Scattering
Multi-angle static light scattering (MALS) analysis of TGIF1 256-401 was carried out on a DAWN HELEOS II MALS detector (Wyatt Technology Corp., Santa Barbara, CA, USA) coupled with a Superdex™ 75 10/300 GL column (GE Healthcare) at a flow rate of 0.5 mL/min at room temperature in a solution of 20 mM HEPES, 80 mM NaCl, 2 mM DTT, 0.05% NaN3, pH 6.4.
The concentration of TGIF1 256-401 was 0.2 mM. The data were analyzed using the ASTRA 7.1 software package (Wyatt Technology Corp.). The weight-average molar mass was calculated using the theoretical UV extinction coefficient (280 nm) of TGIF1 256-401 and a protein dn/dc value of 0.185 mL/g.
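For readers unfamiliar with the quantity reported here, the weight-average molar mass over an elution peak is the concentration-weighted mean of the per-slice molar masses, Mw = Σ(c_i·M_i) / Σ(c_i). The sketch below only illustrates that definition and how it relates to a monomer/dimer assignment; the actual analysis was performed in ASTRA, and the function names and the inputs a user would supply are hypothetical.

```python
def weight_average_molar_mass(concentrations, molar_masses):
    """Return Mw = sum(c_i * M_i) / sum(c_i) over the slices of an elution peak.

    concentrations: per-slice mass concentrations (e.g. from UV absorbance at 280 nm).
    molar_masses:   per-slice molar masses (e.g. from light scattering), same length.
    """
    if len(concentrations) != len(molar_masses):
        raise ValueError("need one molar mass per concentration slice")
    total = sum(concentrations)
    if total <= 0:
        raise ValueError("total concentration must be positive")
    return sum(c * m for c, m in zip(concentrations, molar_masses)) / total


def apparent_oligomer_number(measured_mw, monomer_mass):
    """Ratio of measured Mw to the sequence-derived monomer mass (about 1 for a monomer)."""
    return measured_mw / monomer_mass
```

A ratio close to 1 corresponds to the monomeric state reported for recombinant TGIF1 256-401, whereas a homodimer would give a ratio close to 2.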

NMR Experiments
NMR spectra were collected on a Bruker Avance III 850 MHz spectrometer equipped with a cryogenic probe at 293 K (20 °C). The TGIF1 256-401 concentration was 0.4 mM. The NMR data were processed using NMRPipe [53] and analyzed using Sparky [54]. NMR titration experiments were performed by mixing 0.1 mM 15N-labeled TGIF1 256-375 or TGIF1 256-401 with non-labeled SIN3A PAH2 at the indicated molar ratios in solution F. After gentle shaking for 1 h to allow the binding to reach equilibrium, 1H-15N HSQC spectra were collected.
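The arithmetic behind preparing a titration point at a given molar ratio can be sketched as below. Only the 0.1 mM concentration of the labeled protein comes from the text; the sample volume, the SIN3A PAH2 stock concentration and the example ratio are hypothetical values chosen for illustration, and the function name is likewise an assumption.

```python
def titration_point(ratio, protein_mm, sample_ul, stock_mm):
    """Compute the titrant volume for a target ligand:protein molar ratio.

    ratio:      target molar ratio of unlabeled ligand to labeled protein.
    protein_mm: starting concentration of the labeled protein (mM).
    sample_ul:  starting sample volume (µL); hypothetical in this sketch.
    stock_mm:   concentration of the unlabeled ligand stock (mM); hypothetical.

    Returns (cumulative titrant volume in µL, labeled-protein concentration
    in mM after dilution by the added titrant).
    """
    if ratio < 0 or protein_mm <= 0 or sample_ul <= 0 or stock_mm <= 0:
        raise ValueError("ratio must be non-negative and all quantities positive")
    protein_nmol = protein_mm * sample_ul        # mM * µL = nmol
    titrant_ul = ratio * protein_nmol / stock_mm
    diluted_mm = protein_nmol / (sample_ul + titrant_ul)
    return titrant_ul, diluted_mm


# Example with assumed values: 500 µL of 0.1 mM labeled protein, 2 mM ligand stock.
volume_ul, protein_after_mm = titration_point(1.0, 0.1, 500.0, 2.0)
```

The second return value makes the dilution of the labeled protein explicit, which matters when a concentrated titrant stock is not available.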

Yeast Two-Hybrid
Yeast two-hybrid assays were performed using the Matchmaker Yeast Transformation System (Clontech, Palo Alto, CA, USA). The coding DNA fragments of TGIF1 256-375, TGIF1 256-401 and SIN3A-PAH2 were each inserted into the pGADT7 and pGBKT7 vectors. Yeast AH109 cells were co-transformed with the indicated pairs of pGADT7 and pGBKT7 constructs according to the manual. All yeast transformants were grown on SD2 (-Trp/-Leu) medium to confirm successful co-transformation and on SD4 (-Trp/-Leu/-His/-Ade) medium to test the prey-bait interaction.
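The two-medium readout described above follows a simple decision logic, sketched here for clarity. The function name and the returned labels are illustrative assumptions, not part of the Matchmaker protocol.

```python
def interpret_y2h(grows_on_sd2, grows_on_sd4):
    """Interpret colony growth for one pGADT7/pGBKT7 construct pair.

    grows_on_sd2: growth on SD2 (-Trp/-Leu), which selects for both plasmids.
    grows_on_sd4: growth on SD4 (-Trp/-Leu/-His/-Ade), which additionally
                  requires reporter activation by a prey-bait interaction.
    """
    if not grows_on_sd2:
        # Without both plasmids the interaction test is not informative.
        return "co-transformation failed"
    if grows_on_sd4:
        return "interaction detected"
    return "no interaction detected"
```

Growth on SD2 is therefore a prerequisite for reading the SD4 plate: a pair that fails on SD2 is uninformative rather than negative.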

Molecular Modeling
The structure model of TGIF1 SID was built through de novo modeling using I-TASSER [55] (http://zhanglab.ccmb.med.umich.edu/I-TASSER/ (accessed on 19 November 2021)). The sequence of TGIF1 SID (F376-E394) was submitted to I-TASSER with the recommended settings for structure modeling. The first model of TGIF1 SID generated by I-TASSER and the SIN3A-PAH2 structure from the PDB (ID: 2L9S.B) were used to build the complex structure model of the two proteins through molecular simulation, which included multiple steps of docking and optimization. First, initial structures for the complex were calculated using ZDOCK [56]. The model with the highest score was selected for further optimization using the molecular docking tool ClusPro [57], which optimizes the binding of the receptor and the ligand in multiple steps by exhaustively sampling the free energy landscape. Ten structural models representing the most-populated clusters were generated, and the one with the highest score was refined as the final complex structure model. PyMol 2.5 and its related programs were used to analyze the structure and produce the images.
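As a simple geometric check that is independent of the docking pipeline, the key residues reported for the SID (F379, L382 and V383) can be placed on a helical wheel. The sketch below assumes an ideal α-helix with 3.6 residues per turn (100° per residue), which is a textbook approximation rather than a value taken from the model; the function names are illustrative.

```python
DEGREES_PER_RESIDUE = 100.0  # ideal alpha-helix: 360 degrees / 3.6 residues per turn


def wheel_angles(residue_numbers, reference):
    """Angular position of each residue on a helical wheel, relative to `reference`."""
    return {
        r: ((r - reference) * DEGREES_PER_RESIDUE) % 360.0
        for r in residue_numbers
    }


def smallest_arc(angles):
    """Width in degrees of the smallest arc of the wheel containing all angles."""
    ordered = sorted(angles)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    gaps.append(360.0 - ordered[-1] + ordered[0])  # wrap-around gap
    return 360.0 - max(gaps)


key_residues = [379, 382, 383]
angles = wheel_angles(key_residues, reference=379)
arc = smallest_arc(angles.values())
```

Under this assumption the three residues fall at 0°, 300° and 40°, that is, within a 100° arc, which is consistent with their side chains lying on one face of the helix, as expected for the nonpolar side of an amphipathic helix.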

Conclusions
In conclusion, we demonstrated that TGIF1 utilizes a C-terminal motif (termed SID), ranging from F376 to E394, to bind SIN3A PAH2. The TGIF1 SID adopts a disordered structure in the apo state, whereas it forms an amphipathic helix upon binding to SIN3A PAH2. In the complex, SIN3A PAH2 adopts a four-helix bundle structure with a deep hydrophobic cleft, into which the TGIF1 SID binds through the nonpolar side of the amphipathic helix. Residues F379, L382 and V383 of the TGIF1 SID, which are conserved among SIN3A-PAH2 binders, are buried in the hydrophobic core of the complex and are critical for the binding. Although recombinant TGIF1 256-401 exists as a monomer in solution, homodimerization of TGIF1 through the SID was observed in a Y2H assay, which suggests a dual role of the TGIF1 SID and a correlation between homodimerization and SIN3A-PAH2 binding of TGIF1. This study provides insight into the binding mechanism of TGIF1 with SIN3A, improves the understanding of the structure-function relationship of TGIF1 and reinforces the knowledge of the sequence and structure characteristics of SIN3A-PAH2 binders. The results can be applied to interpret the function of TGIF1 homologs, not only in humans but also in other vertebrates, to recognize potential SIN3A-PAH2 binders and to design peptide inhibitors that block SIN3A-TF interactions for cancer treatment.