Article

Definition and Discovery of Tandem SH3-Binding Motifs Interacting with Members of the p47phox-Related Protein Family

1
Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, 1088 Budapest, Hungary
2
Structural Biology Brussels (SBB), Department of Bioengineering Sciences, Vrije Universiteit Brussel (VUB), 1050 Brussels, Belgium
3
VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnologie (VIB), 1050 Brussels, Belgium
4
Department of Bioinformatics, Semmelweis University, Tűzoltó u. 7, 1094 Budapest, Hungary
5
Institute of Molecular Life Sciences, HUN-REN Research Centre for Natural Sciences, 1117 Budapest, Hungary
*
Authors to whom correspondence should be addressed.
Biomolecules 2025, 15(12), 1641; https://doi.org/10.3390/biom15121641 (registering DOI)
Submission received: 15 September 2025 / Revised: 7 November 2025 / Accepted: 18 November 2025 / Published: 22 November 2025
(This article belongs to the Special Issue Protein Biophysics)

Abstract

SH3 domains are widespread protein modules that mostly bind to proline-rich short linear motifs (SLiMs). Most known SH3 domain-motif interactions and canonical or non-canonical recognition specificities are described for individual SH3 domains. Although cooperation and coordinated motif binding between tandem SH3 domains have already been described for members of the p47phox-related protein family, individual cases have never been collected and analyzed collectively, which has precluded the definition of the binding preferences and the targeted discovery of further instances. Here, we apply an integrative approach that includes data collection, curation, bioinformatics analyses and state-of-the-art structure prediction methods to fill these gaps. A search of the human proteome with the sequence signatures of SH3 tandemization and follow-up structure analyses suggest that SH3 tandemization could be specific for this family. We define the optimal binding preference of tandemly arranged SH3 domains as [PAVIL]PPR[PR][^DE][^DE] and propose potential new instances of this SLiM among the family members and their binding partners. Structure predictions suggest the possibility of a novel, reverse binding mode for certain motif instances. In all, our comprehensive analysis of this unique SH3 binding mode enabled the identification of novel, interesting tandem SH3-binding motif candidates with potential therapeutic relevance.

1. Introduction

SH3 domains are small globular protein modules that are widespread interaction specialists. They represent one of the most abundant protein domain families: there are several hundred SH3 domain-containing proteins in the human proteome, many of them harboring more than one such domain [1]. They function in diverse signaling pathways, most often in membrane receptor-associated signaling pathways/networks. The majority of them are kinases or multi-domain adaptor, scaffold or effector proteins [1]. The most prominent function of SH3 domains is to specifically recognize (mainly proline-rich) peptide motifs in proteins, so-called short linear motifs (SLiMs), but through the years they turned out to be implicated in diverse recognition modes, including domain-domain, domain-RNA and domain-lipid interactions [1,2]. They are central players in the human protein–protein interaction (PPI) network, mediating the assembly of protein complexes through PPIs, and also enable switch-like autoregulatory mechanisms through controllable intrachain domain-motif interactions [3,4,5]. Recently, several SH3-mediated multivalent PPIs were reported to drive liquid–liquid phase separation and thereby contribute to the formation of functionally specialized biomolecular condensates that confer efficiency and fidelity on cellular signaling [6,7,8].
Regarding their recognition specificities, SH3 domains are highly versatile [2,9]. A plethora of low-throughput and high-throughput (mainly phage display-based [10]) studies aimed at identifying the peptide-binding specificities of individual SH3 domains, leading to the description of canonical [11,12,13] and non-canonical [10,14,15,16] binding specificities in both forward and reverse orientations [11,12]. The SH3-peptide complex structures helped the elucidation of some important specificity determinants of SH3 domains, i.e., the residues that comprise the shallow hydrophobic pockets of their binding cleft, which contribute to the binding of peptides that typically adopt a polyproline type II helix conformation [12,13]. Also, SH3 domain-mediated interactions turned out to be highly regulated by phosphorylations affecting the recognized motifs within binding partners or the peptide binding clefts of the SH3 domains themselves [17]. The SH3 domains of the human proteome were comprehensively compiled and classified according to different aspects, and their hitherto described binding specificities were also collected, summarized and analyzed [1,9]. Yet, this comprehensive dataset clearly shows that we are far from fully understanding the function of these multifaceted protein modules, and that the specificities and cellular binding partners of the majority of human SH3 domains remain undiscovered.
Besides falling short of the comprehensive description of binding specificities for all individual SH3 domains, another aspect of SH3 domain functioning that remains clearly understudied is a possible cooperation between different SH3 domains within the same protein or between different proteins. The paucity of such insights mainly stems from technical difficulties. In multi-domain modular proteins, globular domains are usually connected by disordered linker regions. While this arrangement allows for the free movement, rotations and independent functioning of neighboring domains, and therefore it is highly advantageous in cells, the increased length and conformational freedom of modular proteins make their in vitro experimental handling and characterization highly problematic. For this reason, the function and interactions of modular proteins are usually studied by separately investigating the binding properties of the individual protein modules. While this approach can certainly eliminate many technical difficulties and bring valuable insights, it is insufficient for uncovering functions/interactions that require cooperation between different modules of a protein.
The best-described case of cooperation between neighboring, tandemly arranged SH3 domains is that of p47phox (gene name: NCF1). This protein regulates the assembly and activation of the phagocyte NADPH oxidase complex that catalyzes the conversion of oxygen to superoxide and other reactive oxygen species (ROS), and which is composed of membrane-embedded and cytoplasmic components [18]. The cytoplasmic p47phox adaptor protein has an autoinhibited, closed conformation wherein its tandem SH3 domains (tSH3s) associate and form a common, composite binding groove that interacts with an autoinhibitory motif [18,19,20], a mode of binding that was also termed the superSH3 binding mode [21]. The composite ligand binding groove is formed by the conventional ligand binding grooves of the N- and C-SH3 domains, which bind the motif in opposite orientations (termed as plus/minus orientations) [20]. The core motif inserting into the composite binding groove adopts a polyproline type II (PPII) helix conformation, similarly to canonical SH3-binding motifs [19,20]. Several phosphorylations of serine residues in the vicinity of the autoregulatory motif are required to open up this closed conformation [22,23] and enable association of the tSH3s of p47phox to a similar motif within p22phox (gene name: CYBA) [18,19,24], thereby forming a bridge between the membrane-embedded and cytoplasmic components of the NADPH oxidase and activating the complex [18,19].
The p47phox-related organizer superfamily of proteins has 5 members, p47phox (gene: NCF1), p41phox (gene: NOXO1), p40phox (gene: NCF4), Tks4 (gene: SH3PXD2B) and Tks5 (gene: SH3PXD2A), with all of them harboring an N-terminal PX domain and one or more SH3 domains. For simplicity, we will refer to the proteins by their gene names from now on (but intentionally avoid using italic font to make clear that we talk about proteins). SH3PXD2B and SH3PXD2A will be referred to by their alternative gene names: TKS4 and TKS5, and the protein family will be referred to as the NCF1 family. Potential association and motif-binding of tSH3 has only been scarcely studied in other members of the superfamily [25,26,27] and, to our knowledge, it has never been investigated in proteins outside of this superfamily, although there are many proteins harboring multiple consecutive SH3 domains. Studying this binding mode experimentally is highly challenging, as the association of the tSH3s is very weak: even for NCF1, the two domains only associate when linked (ensuring high local concentration), not when placed separately in solution [19]. Also, binding of the peptide motif has a clear stabilization effect on the complex [19], therefore studying a construct with a pair of linked SH3 domains alone might still not be sufficient to detect the association.
In this study, we first comprehensively analyze available literature evidence on the association and motif binding of tSH3 in members of the NCF1 family, to then use the accumulated motif instances, their structural properties (if available), evolutionary conservation patterns and mutation data to define the underlying binding determinants as a novel short linear motif (SLiM). In this step, we follow the strategy that mirrors the annotation procedure for new motif classes in the Eukaryotic Linear Motif (ELM) database [28,29]. The resulting motif definition is then used to propose new motif instances in the family members and their binding partners. Furthermore, we aim to establish the sequence requirements (on the domain side) for the association of tSH3 in members of the NCF1 family using the available complex structures, as well as conservation patterns within the family and through evolution. Using these signatures, we screen the human proteome to see if we can identify other multi-SH3 domain proteins whose tSH3 could potentially associate and form a common binding groove for motif binding.

2. Methods

2.1. Identification of Tandem SH3 Domains

To identify SH3 domain-containing proteins, we searched for human proteins with the PF07653, PF00018, PF14604 PFAM domain identifiers in the InterPro database [30]. Domain boundaries (taken from InterPro annotations) were evaluated and selected as tandems if the linker region connecting two SH3 domains was shorter than 60 amino acids. In the case of the NOXO1 protein (a member of the NCF1 family that is in the focus of this work), one of the two SH3 domains was missing from the InterPro annotations, and therefore it needed to be added based on UniProt annotation (where it was present in accordance with the literature).
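The tandem-selection rule described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the `Domain` tuple, the `find_tandems` helper and the example coordinates are hypothetical, and the linker length is assumed to be the number of residues lying strictly between two consecutive SH3 domains.

```python
from typing import List, NamedTuple, Tuple


class Domain(NamedTuple):
    start: int  # first residue of the domain (1-based, inclusive)
    end: int    # last residue of the domain (1-based, inclusive)


def find_tandems(domains: List[Domain], max_linker: int = 60) -> List[Tuple[Domain, Domain]]:
    """Return consecutive SH3 domain pairs whose linker is shorter than max_linker."""
    ordered = sorted(domains, key=lambda d: d.start)
    tandems = []
    for first, second in zip(ordered, ordered[1:]):
        # Assumption: linker = residues strictly between the two domains
        linker_len = second.start - first.end - 1
        if 0 <= linker_len < max_linker:
            tandems.append((first, second))
    return tandems


# Hypothetical domain boundaries, for illustration only
example = [Domain(160, 215), Domain(230, 285), Domain(450, 505)]
pairs = find_tandems(example)
```

In this toy input, only the first two domains (14-residue linker) qualify as a tandem; the third domain is separated by a much longer linker and is left out.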

2.2. Multiple Sequence Alignments

Orthologous proteins were collected from OMA [31] (using UniProt AC identifiers) and selected to uniformly cover vertebrates (Table S1). Sequences were aligned with ClustalΩ (version 1.2.4) [32] and visualized in Jalview (version 2.11.5) [33].

2.3. Search for Structures and Structural Analysis

Possible tandem structures were collected using different methods. The three PFAM IDs belonging to SH3 domains (PF07653, PF00018, PF14604) were searched in 3did [34] to retrieve SH3-containing entries. Each entry was manually assessed to retain those in which multiple SH3 domains occur and a peptide is part of the complex. To validate the results using an independent method, the tandem structures found were also used as input for FoldSeek [35], and other tandem SH3s were searched in the PDB100 dataset [36]. From the available structures, the residue-level domain-peptide interactions were predicted using RING [37] with default settings.

2.4. Structure Prediction

We used ColabFold 1.5.5 [38] (based on AlphaFold2 (AF) [39]) to predict the tandem domain-motif interactions defined in Table 1. We opted for AF2 over AF3 for the following reasons: (1) several high-throughput analyses concluded that AF2 can reliably predict motif-domain interactions [40,41], whereas no such investigation is available for AF3; (2) in ColabFold, we can directly control which template structure is used for the prediction; (3) a modification of AF2 can produce actifpTM [42], a score developed specifically to assess the quality of motif-domain interactions; (4) we have previously shown that this methodology can capture slight deviations in binding mode [43]. For each interaction, we used PDB:7YXW as a template and provided the sequence of the tandem SH3 domain together with the core motif (±4 positions). Among the predicted models, we selected the one with the highest actifpTM (Table S2). We used ChimeraX [44] for visualization. In edge cases (reverse motif orientation, SH3KBP1/CIN85 clustering), we also confirmed the structure prediction using AF3 [45].
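Two bookkeeping steps from this procedure, extracting the core motif with ±4 flanking positions and picking the model with the highest interface score, can be sketched as below. The helper names, the toy sequence and the score values are all hypothetical; the sketch does not call ColabFold and does not compute actifpTM, it only assumes that one score per model is already available.

```python
from typing import Dict, Tuple


def motif_window(sequence: str, core_start: int, core_end: int, flank: int = 4) -> str:
    """Return the core motif plus `flank` residues on each side (1-based, inclusive)."""
    lo = max(1, core_start - flank)            # clip at the N-terminus
    hi = min(len(sequence), core_end + flank)  # clip at the C-terminus
    return sequence[lo - 1:hi]


def best_model(scores: Dict[str, float]) -> Tuple[str, float]:
    """Pick the model with the highest interface score (e.g., actifpTM)."""
    name = max(scores, key=scores.get)
    return name, scores[name]


# Toy sequence with a five-residue core at positions 10-14; not a real protein
toy = "MSTEGQKLAPPPRPKTSVQEG"
peptide = motif_window(toy, 10, 14)

# Made-up scores for three hypothetical models
chosen = best_model({"model_1": 0.71, "model_2": 0.83, "model_3": 0.65})
```

Clipping at the sequence termini keeps the window valid for motifs that lie closer than four residues to either end.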

2.5. Residue-Wise Contributions to Gibbs Free Energies in the Tandem SH3 Motif

Residue-wise contributions to Gibbs free energies were predicted by MMGBSA [47] on the HawkDock webserver [48,49] with NCF1’s autoinhibitory cis-regulatory tandem SH3 domain-binding peptide complexes (PDB codes: 1NG2 and 1UEC). Since the webserver is only compatible with dimers, we edited the structures in the following way: The second biological assembly was retrieved from RCSB PDB, and the second chain (originally rendered as model) was converted to chain B, then for both chains 13-residue-long peptides (aa. 294–306) were renamed to chains C and D with the deletion of their flanking regions (aa. 278–293 and aa. 307–332). NCF1-CYBA heterodimeric PDB structures were also used but did not need to be modified.

2.6. Conformational Ensemble Prediction

To emulate the structural ensemble of the SH3 tandems with and without the Pro-rich binding motifs, the ColabFold (1.5.5) version [50] of BioEmu 1.1 [51] was used with the following parameter setup: Number of samples: 30. Number of samples to randomly select for clustering: all available samples. The coverage threshold used for FoldSeek [35] clustering was set to 0.7, with a minimum TM-score of 0.6, and 95% sequence identity threshold used for clustering. Shorter constructs only constituting the two SH3 domains were tested using the following region borders (UniProt numbering): for NCF1: aa. 156–285, for NOXO1: aa. 163–296, for SH3KBP1/CIN85: aa. 1–157, for TKS4: aa. 152–280 and for TKS5 isoform 3: aa. 166–297. Longer constructs also comprising the Pro-rich peptide were the following: NCF1: aa. 156–310, NOXO1: aa. 163–343, SH3KBP1/CIN85: aa. 1–157 fused to aa. 334–432, TKS4: aa. 152–358 and TKS5 isoform 3: aa. 166–412. For the comparison of conformers to tandem SH3 reference crystal structure PDB:7YXW, PyMOL’s (ver. 3.1.1., Schrödinger LLC, New York, NY, USA) superimposition tool was used: ‘super sele, topology.pdb, cycles = 0, target_state = i’, where sele corresponds to 7YXW’s PDB region 160–211 and 229–283, and ‘i’ is a given conformer of topology.pdb.

2.7. Motif Prediction

We scanned the canonical human proteome (obtained from UniProt [52], 2025_1 release) for the defined strong binding motif ([PAVIL]PPR[PR][^DE][^DE]). For each hit, we calculated or added the following properties: (1) Predicted protein disorder [53]. (2) Motif conservation, defined using the regular expression: we used a set of Chordata species proteomes (Table S3) and searched for orthologs of the human proteins using BLAST (version 2.12.0+); for each species, the sequence with the highest sequence identity (above 20%) was selected, provided that it covered at least 60% of the human sequence. The selected sequences were then aligned with ClustalΩ [32], and the regular expression was searched in the orthologous proteins. Because disordered regions are challenging to align, we allowed a deviation of up to 50 residues in the alignment for regular expression hits. (3) Broad-level localization of the motif-containing protein (i.e., defining intracellular proteins using the same methodology as was used during the development of TOPDOM [54]). (4) The number of times the motif is found after randomly shuffling the protein sequence 100 times, to assess whether the motif hit is likely to occur simply due to the high Pro/Arg content of the sequence. (5) Whether the identified protein is listed as a partner of NCF1, NOXO1, TKS4 or TKS5 in BioGRID [55] and/or IntAct [56]. Additional partners of the TKS4 protein were derived from a recently published dataset [57].
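The regular expression scan and the shuffling control can be illustrated with Python's standard `re` and `random` modules. This is a sketch under stated assumptions, not the code used in the study: the function names are hypothetical, the toy sequence is invented (only the 7-mer "PPPRPKT" is taken from the text), and the shuffling control is assumed to count how many of the shuffled sequences contain at least one hit.

```python
import random
import re
from typing import List, Tuple

# Lookahead with a capture group so that overlapping hits are also reported
STRONG_MOTIF = re.compile(r"(?=([PAVIL]PPR[PR][^DE][^DE]))")


def scan(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start, matched 7-mer) for every motif hit."""
    return [(m.start() + 1, m.group(1)) for m in STRONG_MOTIF.finditer(sequence)]


def shuffled_hits(sequence: str, n_shuffles: int = 100, seed: int = 0) -> int:
    """Count shuffled sequences (same composition) containing at least one hit."""
    rng = random.Random(seed)  # fixed seed keeps the control reproducible
    residues = list(sequence)
    count = 0
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        if scan("".join(residues)):
            count += 1
    return count


# Invented flanks around a 7-mer quoted in the text; not a real protein sequence
toy = "MKTPPPRPKTGEL"
hits = scan(toy)
background = shuffled_hits(toy)
```

A high count from `shuffled_hits` would indicate that a hit can be expected from amino acid composition alone, which is the rationale given for property (4).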

3. Results

3.1. Multiple Lines of Evidence Support the Cooperation of Tandem SH3 Domains in Motif Binding for Members of the p47phox-Related Organizer Superfamily

Among the five members of the p47phox-related organizer superfamily (referred to as the NCF1 family), four have multiple SH3 domains, of which the first two are in proximity in sequence, connected only by a relatively short linker (Figure 1). In the following sections, we discuss the literature evidence supporting that the tSH3 domains (tSH3s) engage in joint motif binding in the four family members and collect all instances of the tSH3-binding motifs hitherto described (Table 1).
NCF1, the eponym of the family, is the major cytoplasmic regulatory subunit of the NADPH oxidase complex [18]. It has an autoinhibited closed conformation where the tSH3s form a composite binding cleft to bind to a C-terminal proline-rich “APPRR” motif within the autoinhibitory region (AIR) [19]. This conformation can be resolved by phosphorylations directly downstream of the motif [22], leading to an active, open conformation, where the tSH3s of NCF1 can bind to a similar “PPPRP” motif within one of the catalytic subunits of the oxidase, namely p22phox (gene: CYBA), which activates the oxidase complex [18,19,24]. The two motifs are very similar, and the available PDB structures of the interactions prove that both SH3 domains of NCF1 are required for motif binding in both cases [19] (see Table 1 rows 1 and 2).
NOXO1 is a homologue of NCF1 with 22% identity that also functions in the activation of superoxide-producing NADPH oxidases. Since NOXO1 lacks the autoinhibitory region (AIR) present in NCF1, it was first considered as a constitutively active organizer of the NADPH oxidase [18,25] through binding to CYBA by its tSH3. The NOXO1-CYBA interaction is similarly disrupted by the chronic granulomatous disease (CGD)-associated P156Q mutation in the CYBA motif 155-PPPRP-159, suggesting a similar binding mechanism as seen for the NCF1-CYBA interaction [18,25]. However, a relatively weak autoinhibitory interaction was uncovered between the tSH3s of NOXO1 and its C-terminal Pro-rich region, namely residues 325-TAPPPTVPTRPS-336 [46,58] that weakens the binding to CYBA. Mutations disrupting the autoinhibitory motif enable CYBA binding with higher affinity [46]. It is important to note that the identified segment harbors the sequence “VPTRP” that is very similar to the motif bound within CYBA and to those bound by the tSH3s of other NCF1 family members (see Table 1). Interestingly, the same Pro-rich region of NOXO1 is known to mediate the interaction with the SH3 domain of NCF2/NOXA1 [25].
TKS4 and TKS5 represent a distinctly different subfamily within the p47phox-related organizer superfamily. They are large, modular adaptor proteins that contribute to the formation of podosomes and invadopodia and thus are implicated in cell motility, including the migration of cancer cells during metastasis [59,60,61]. Interestingly, TKS5 has been demonstrated to fulfil a similar role to NCF1 by binding to CYBA and thereby facilitating ROS production during invadopodia formation [27].
The N-terminal part of TKS4 closely resembles NCF1 and was demonstrated to mediate similar intramolecular interactions, as shown in a study by Merő B. and colleagues [26]. Their in vitro binding assays and SAXS results support that TKS4 has an autoinhibited conformation wherein the 3rd SH3 domain binds to the N-terminal PX domain and the tandemly arranged 1st and 2nd SH3 domains are both required to engage with a highly conserved “PPPRR” motif located between residues 346 and 350 [26]. Regions of the protein downstream of the 3rd SH3 domain have not been investigated [26].
The tandem SH3 binding mode has also been suggested for an isoform of TKS5 wherein the linker connecting the first two SH3 domains is shorter than in the canonical isoform (see UniProt: Q5TCZ1-3) [21]. Also, the tandem SH3s of TKS5 were reported to interact with the same region of CYBA as NCF1 and NOXO1, and the binding is similarly compromised by the disease-associated P156Q CYBA mutation. Furthermore, Rufer AC et al. demonstrated some isoform-specific protein–protein interactions for TKS5 by peptide spot membrane assays, wherein four peptides of SOS1 were exclusively bound by the tSH3s of the short TKS5 isoform [21]. Peptide 14, covering human SOS1 residues 1145–1164, contains “VPPRR”, so we also considered it as a validated instance of the PPRR motif (Table 1). We cannot explain the exclusive binding of peptides 2, 6 and 25 to the tSH3s of the short TKS5 isoform, because those do not contain anything resembling a PPRR motif.

3.2. The Binding Preference of the Tandem SH3 Domains of the NCF1 Family: Definition of the tSH3-Binding Motif

By taking into account the contacts in the available complex structures, constraints imposed by the PPII helix conformation on the motif residues [62], binding strengths, evolutionary conservation signatures, data from mutational analyses, and experimentally verified disease mutations, we here make an attempt to define the crucial binding determinants of tSH3 (Figure 2). We describe the motif as a regular expression commonly used to define short linear motifs (SLiMs) [28,29] and to identify novel motif instances in protein sequences.
Based on the vertebrate alignments of known motif instances (Figure S1) and DSSP secondary structure annotations in the available X-ray structures, we found that five consecutive residues show high conservation and a tendency to adopt a PPII helix conformation, thereby forming the core of the proposed motif (Figure 3). A strong/strict version of the motif core can be defined as [PAVIL]PPR[PR]. In the available X-ray structures depicting the interaction of the NCF1 tSH3s with the autoregulatory motif (PDB IDs: 1NG2 and 1UEC) or the motif in the partner CYBA (PDB IDs: 1OV3 and 7YXW), the motif core forms a polyproline type II helix (PPII) structure [20] with minor variations in the starting and finishing positions, as detected by DSSP [63]. At the first position of the core motif, the selection of residues (Pro and hydrophobic residues) was defined based on the vertebrate alignments of the validated motif instances (Figure S1) and the fact that this position is already part of the PPII helix observed in the structures. At the second position, Pro seems to be strictly required based on several observations: (1) there is absolutely no variation in the alignments at this position, (2) the chronic granulomatous disease-associated mutation within CYBA (P156Q) hits this second position and is known to abrogate binding, (3) the Pro at this position (equivalent to P299 within the AIR of NCF1) makes contacts with both domains of the tandem SH3 [19,20]. At the third position, Pro seems to be strongly preferred for strong binding, based on its invariant conservation in the strong motif instances (NCF1 AIR, CYBA, TKS4 autoregulatory motif, SOS1 motif). Also, this Pro (equivalent to P300 within the AIR of NCF1) makes several contacts with a hydrophobic pocket of the first SH3 domain of tSH3s in the available PDB complexes [19,20].
At the fourth position, Arg is invariably conserved among vertebrates for all the motif instances, and it is known to make both hydrophobic and electrostatic contacts with the second SH3 domain [19,20], so it seems to be strictly required for binding. At the fifth position, either Pro or Arg seems to be allowed, since only these two residues can be seen in the motif alignments. The residue at this position contacts the first SH3 domain and it was predicted as the last residue of the PPII helix in one of the available structures of the complex (PDB ID: 7YXW). Finally, some restrictions can be defined for the C-terminal flanking residues of the motif. At the first and second positions immediately following the motif core (6th and 7th positions) negatively charged residues are not allowed because they weaken the interaction. This is evident because phosphorylation or a change to phosphomimetic residues [22] of the serine residues at these positions of the NCF1 autoregulatory motif was demonstrated to weaken the binding [19]. A negatively charged residue at position 6 would induce charge repulsion with a negatively charged residue of the 2nd SH3 domain (Glu241 in NCF1) [19]. We also calculated Molecular Mechanics, Generalized Born Surface Area (MMGBSA) on the crystal structures available in PDB to estimate the relative contribution of the motif residues to the tandem SH3 domain-motif complex formation. The results confirmed that the absolutely conserved positions (Arg in position 4 and Pro in position 2 of our motif definition) have the largest energetic contribution to binding (Figure 2, Table S4). Based on these considerations, a strong tSH3-binding motif can be defined as [PAVIL]PPR[PR][^DE][^DE].
In AlphaFold (AF)-predicted models of tSH3-motif complexes of all the known or proposed motif instances in Table 1 that adhere to this strong motif definition, the motifs are perfectly aligned (see Section 2.4; Figure 3A). We use the predicted complexes for consistency, but they correspond precisely to the respective PDB structures, where available (Figure S2). Notably, in Figure 3A the NCF1 autoinhibitory peptide (Table 1, row 1) is followed by a helix, which was removed from the figure for better visibility. The original figure is available as Figure S3, and all AF2-predicted models are available as Data S1.
Some validated motifs do not adhere to this strong motif definition. The weakly binding NOXO1 C-terminal motif (Kd of 50 μM; see Table 1 row 3) contains a Thr at the 3rd motif position, and the vertebrate alignment of this motif suggests that Ala and Val can also be accommodated here (Figure S1). Importantly, these residues are compatible with the residues observed in middle or penultimate positions of PPII helix structures [62]. Based on AF-predicted models, this weaker motif can bind to the tSH3s of NOXO1 in a conformation very similar to that of the strong motifs binding to their respective tandem SH3s (Figure 3B). Therefore, a less strict version of the motif can be defined as [PAVIL]P[PTAV]R[RP][^DE][^DE].
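The relationship between the strong and the less strict motif definitions can be made concrete with a small classifier. This is an illustrative sketch only: the `classify` helper is hypothetical, "PPPRPKT" and "PPPRRIA" are 7-mers quoted elsewhere in the text, and the remaining peptides are invented to show the effect of a Thr at position 3 or an acidic residue at position 6.

```python
import re

# The two regular expressions as defined in the text
STRONG = re.compile(r"[PAVIL]PPR[PR][^DE][^DE]")
RELAXED = re.compile(r"[PAVIL]P[PTAV]R[RP][^DE][^DE]")


def classify(peptide: str) -> str:
    """Label a 7-residue peptide as 'strong', 'relaxed' or 'none'."""
    if STRONG.fullmatch(peptide):
        return "strong"
    if RELAXED.fullmatch(peptide):
        return "relaxed"
    return "none"


labels = {
    "PPPRPKT": classify("PPPRPKT"),  # 7-mer quoted in the text
    "PPPRRIA": classify("PPPRRIA"),  # 7-mer quoted in the text
    "PPTRRIA": classify("PPTRRIA"),  # invented: Thr at position 3
    "PPPRREA": classify("PPPRREA"),  # invented: acidic residue at position 6
}
```

Because every strong motif also satisfies the relaxed expression, the strong pattern is tested first; a peptide with an acidic residue at position 6 or 7 is rejected by both definitions.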
In the validated TKS4 autoregulatory motif (residues 346-350), as well as in the homologous motif of the closely related TKS5, there is a conserved negatively charged residue at the 6th position, the position directly following the core motif (Table 1 rows 5 and 10, Figure S1). Therefore, these motifs do not fit the strong motif definition given above. A negatively charged residue at this position weakens the interaction based on several pieces of evidence [19]. The interaction of the TKS4 peptide was confirmed by biophysical experiments, suggesting a relatively weak affinity (Kd = 12 μM) [26]. Interestingly, AF2 predicts a reverse binding mode for these two similar motifs in all five models of the output (Figure 3C). AF3 was also used on the autoregulatory cases, and the resulting models of the complexes support the forward binding mode for motifs fitting the strong motif definition and the reverse binding mode for the TKS4 and TKS5 autoregulatory motifs that reside ahead of their 3rd SH3 domains (Figure S4). The validity of the predicted reverse binding mode cannot be judged based on the currently available data, since the experiments performed by Merő et al. do not provide information on the directionality of binding [26] and structural evidence is not available. For these two peptides, the reverse binding mode could be enabled by Arg residues occupying the 0 position (the one directly preceding the core motif). In the predicted models of the reverse binding mode, these arginines occupy a similar position as the strictly required Arg at position 4 of the forward-binding strong motifs (Figure 3D). Still, while the latter can form salt bridges with both of the adjacent negatively charged residues within the 2nd SH3 domain of the tandem (alignment positions 267 and 268 in Figure 4 below), Arg0 of the predicted reverse motif is slightly shifted and therefore can only contact the second of the two acidic residues (Figure 3D).
TKS4 and TKS5 are closely related, and the region harboring this potential autoregulatory motif is fully conserved in their vertebrate alignments (the conservation pattern is not island-like as usually seen for functional SLiMs [64], and there is no sequence variation at the associated positions). Therefore, a more general reverse tSH3-binding motif cannot be defined based on these two cases.

3.3. Detection of Tandem SH3-Binding Motif Candidates Within the NCF1 Family and Their Binding Partners

We propose three potential novel autoregulatory motif candidates residing in intrinsically disordered regions (IDRs) [65,66] of NCF1 family proteins (see the last three rows of Table 1 and Figure 1), of which two fit the strong motif definition proposed above, and one is a candidate for the reverse binding mode. A conserved, C-terminal, proline-rich motif of NCF1 “365-VPPRP-369” (Table 1, row 8) shows homology to the validated C-terminal autoregulatory motif of NOXO1; moreover, it is likely a stronger binder since it has an optimal Pro residue at the third motif position. We are not aware of an alternative intramolecular interaction within NCF1 wherein its tSH3s interact with this conserved Pro-rich region instead of the well-studied AIR motif. However, we propose that such an interaction might exist because the “VPPRP” motif is highly similar to the well-studied AIR and CYBA motifs and contains all the key residues known to be sufficient for binding to the tSH3s of NCF1 [19]. Furthermore, these C-terminal motifs in NOXO1 and NCF1 both contain a Ser at the position directly following the core motif, which suggests that they could also be regulated by phosphorylation (like the well-studied AIR-resident motif). Accordingly, the respective Ser residue can be phosphorylated in both proteins based on PhosphoSitePlus annotations [67].
Additionally, we propose that TKS4 might have a 2nd, C-terminal autoregulatory motif (Table 1, row 9). Notably, in the AlphaFold2 [39,68] model of TKS4, which became available since the publication by Merő et al., the tSH3s bind to a “757-VPPRR-761” motif that differs from the one proposed by the in vitro experiments [26]. This alternative autoregulatory motif fits the strong motif definition and is accordingly predicted to bind in the forward mode (and likely with high affinity), in contrast to the experimentally studied TKS4 autoregulatory motif, which is predicted to bind in reverse mode. Also, the proposed motif is located within the long IDR connecting the 3rd and 4th SH3 domains of TKS4 that was not covered by the in vitro studies.
To our knowledge, an autoregulatory intramolecular interaction has never been proposed for TKS5. Yet, a conserved “400-PPPRR-404” motif, located upstream of the 3rd SH3 domain (see Figure 1 and Table 1, row 10) could likely serve as the autoregulatory module of the short TKS5 isoform (UniProt AC: Q5TCZ1-3), the one compatible with SH3 tandemization. At least one autoregulatory motif was described in all other NCF1 family members, so autoregulation is likely a conserved feature within the family. The functional importance of this motif is further supported by its homology with the validated autoregulatory motif of the closely related TKS4 [26]. Additionally, since the core motif is directly followed by an acidic residue, and preceded by an Arg, this motif is predicted to bind in the reverse binding mode, similarly to its TKS4 homolog (Figure 3C,D).
To identify motif candidates within the known binding partners of the NCF1 family members, the strong motif definition ([PAVIL]PPR[PR][^DE][^DE]) was first searched in the whole canonical human proteome (Table S5) [52] and then the resulting 324 proteins were checked for an overlap with known binding partners of the NCF1 family members cataloged in BioGRID [69] and IntAct [56]. Predicted structural disorder [53] and sequence conservation were calculated to help evaluate the possible functionality of the identified hits [66]. The localizations of the proteins were also considered, since NCF1 family members are all cytoplasmic proteins that can be attached to the plasma membrane by their PX domains under certain circumstances. Besides self-interactions and partners already covered in Table 1, some interesting motif candidates could be identified (Table 2, Figure S5).
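The pattern-based search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the function name `find_strong_motifs` and its `first_residue` parameter are hypothetical, and the inputs are the seven-residue segments quoted in this section rather than full proteome sequences. The motif definition is already valid regular-expression syntax; wrapping it in a lookahead lets overlapping hits be reported as well.

```python
import re

# Strong tSH3-binding motif definition taken from the text.
# The lookahead wrapper reports overlapping matches too.
STRONG_MOTIF = re.compile(r"(?=([PAVIL]PPR[PR][^DE][^DE]))")


def find_strong_motifs(sequence, first_residue=1):
    """Return (start, end, match) tuples in 1-based residue numbering.

    `first_residue` is the residue number of sequence[0]; it is a
    hypothetical convenience parameter for scanning sequence fragments.
    """
    hits = []
    for m in STRONG_MOTIF.finditer(sequence):
        start = first_residue + m.start()
        hits.append((start, start + 6, m.group(1)))
    return hits


# Seven-residue segments quoted in the text, numbered from their first residue.
print(find_strong_motifs("PPPRPKT", first_residue=227))  # NCF2
print(find_strong_motifs("PPPRRIA", first_residue=326))  # RelA
print(find_strong_motifs("PPPRPPP", first_residue=789))  # ADAM19
```

A real screen would additionally filter the hits by predicted disorder, conservation and known interaction partners, as described above; those steps are not modeled here.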
NCF2/p67phox is a well-known interaction partner of NCF1 in the assembly of the NADPH oxidase, wherein the C-terminal SH3 domain of NCF2 is known to interact with a C-terminal Pro-rich motif in NCF1 [70], the one that we propose as a potential secondary autoregulatory site within NCF1 (Table 1, row 8). Since NCF2 has a Pro-rich disordered region with “PPPRPKT” between residues 227-233 that fits our strong motif definition, there could be a secondary interaction between the two proteins, where the tSH3s of NCF1 (after being released from autoinhibition) interact not only with the motif within CYBA [18], but also with a motif within the other cytoplasmic oxidase subunit, NCF2. Although an interaction between NCF1 SH3 domain(s) and NCF2 has been suggested [18], to our knowledge the mechanism and the binding site within NCF2 have not been precisely described yet. Based on our motif definition, a tSH3-binding motif maps to residues 227-PPPRPKT-233 of NCF2. The key residues of this motif are almost fully conserved in vertebrates (Figure S5).
The interaction between NCF1 and the RelA protein has been demonstrated using a range of interaction detection methods and is proposed to be important in the activation of the NF-κB pathway by IL-1 in endothelial cells [71]. The study demonstrated that the tSH3s of NCF1 are required for the interaction, and it narrowed down the interacting region within RelA to a larger Pro-rich IDR between the Rel homology and transactivation domains (region 301-431 in UniProt AC: Q04206) [71]. Based on our motif definition, the tSH3-binding motif can be precisely located to residues 326-PPPRRIA-332 of RelA. The key residues of the motif are generally conserved among placental mammals; only some Pro to Thr (or Ala) changes can be seen at the 3rd motif position (Figure S5), a change that could still be compatible with binding according to the less strict motif definition.
The disordered cytoplasmic tails of ADAM family metalloproteinases were tested for interactions with a number of individual SH3 domains [72]. The 1st SH3 domain of NCF1 [72] and the 1st and 5th SH3 domains of TKS5 were found to bind to Pro-rich regions within ADAM19 [72,73]. It has a promising tSH3-binding motif between residues 789-PPPRPPP-795 (that is fully conserved among mammals (Figure S5)), so we propose that this potential interaction should be tested using tandemly arranged SH3 domain constructs.
SYN1 and SRSF2 do not seem to be genuine interaction partners of NCF1. For SYN1, only very weak binding was seen in vitro with the 2nd SH3 domain of NCF1 [74], with no biological relevance suggested for the interaction. The SRSF2 splicing factor is a primarily nuclear protein (in contrast to NCF1, which is cytoplasmic), and the NCF1-SRSF2 interaction was only detected in a single clone during a yeast two-hybrid screen [75]. The NOXO1-ARHGAP8 interaction is also not well supported: it was detected in only one high-throughput affinity capture-MS analysis [76], and due to the characteristics of the methodology, the interaction could also be indirect.
SH3 domain-binding protein 1 (SH3BP1) is a GTPase-activating protein (GAP) inactivating the RAC1 GTPase at the leading edge of migrating cells [77], thereby facilitating the reorganization of the cytoskeleton at cell protrusions, such as podosomes, invadopodia and others. Its interaction with TKS4 has been detected by TKS4 co-immunoprecipitation experiments in different cell types [57]. The interaction is further supported by the fact that SH3BP1 and TKS4 share many common binding partners, including the adaptor proteins CIN85 and CD2AP [78,79], the Src kinase, and the actin capping protein [57,79]. Therefore, cooperation between the two proteins in actin remodeling within podosomes and invadopodia is highly likely, and the detected tSH3-binding motif suggests a possible binding mechanism. However, the conservation of the potential tSH3-binding motif is not very strong (Figure S5), and the long, C-terminal IDR of SH3BP1 is so rich in prolines that there are plenty of potential SH3-binding sites in it (this is where its name “SH3 domain-binding protein 1” comes from). Thus, it could also be bound by any of the individual SH3 domains of TKS4, the binding preferences of which are poorly characterized [10].
GAB2, PLEKHA7, DENND1A and INPP5E have been suggested as TKS5 partners by a single study, where the interactome of cilia was characterized by high-throughput affinity capture-MS [80]. Since, to our knowledge, no other study has suggested ciliary localization for TKS5, we are not sure about the validity of these interactions. Also, affinity capture-MS tends to detect indirect interactions.
To sum up the evaluation of the detected motif hits within known partners of the four NCF1 family proteins (Table 2 and Table S5): (1) the binding sites of the tSH3s of NCF1 could likely be better defined within two partners, NCF2 and RelA, (2) although only individual SH3 domains of NCF1 (1st SH3) and TKS5 (1st and 5th SH3) were tested and shown to interact with the cytoplasmic tail of ADAM19, it has a promising tSH3-binding motif candidate, and (3) the interaction between TKS4 and SH3BP1 could depend on the detected tSH3-binding motif within SH3BP1, but individual SH3 domains of TKS4 could also mediate it. Some of the remaining hits are rather unlikely, while others are hard to judge based on the currently available information.

3.4. Sequence Signatures of SH3 Tandemization and Motif Binding Within the NCF1 Family

The tSH3s of the four family members are connected by a short, acidic linker, with a relatively well-preserved length within the family (Figure 4) and across evolution (Figure S6). Only TKS5 is known to have isoform-specific variation within the linker; the first two SH3 domains can only tandemize in the isoform with the shorter linker (UniProt AC: Q5TCZ1-3) [21].
Regarding the domain sequences, the GWW signature amino acid triplets in both domains appear crucial for tandemization and motif binding [19,20]. The middle Trp of this triplet is well-conserved among SH3 domains [10] and is a key residue in the binding of canonical as well as non-canonical peptide ligands by forming the shallow hydrophobic pockets designated as P−1 and P0 following the notation by Yu et al. [13]. The tryptophans at this position of both domains contact the strongly conserved Pro at the 2nd position of the tandem SH3-binding motif defined above. The Gly residues in the GWW signatures of both domains are important for tandemization, as larger residues at this position would cause a steric clash and prevent tandemization [19,20]. Accordingly, mutations to Ser at these positions are implicated in autosomal recessive chronic granulomatous disease (CGD) [81]. The last Trp in the triplets of both domains forms a hydrogen bond with the residue two positions ahead of the GWW triplet of the other SH3 domain (Figure 4); therefore, these Trp residues also seem strictly required for tandemization. Interestingly, the hydrogen-bonded residues ahead of the GWW triplets are not fully conserved within the family; at both of these positions, NOXO1 contains residues that are not even similar to the ones observed in other family members (Pro versus Glu in the N-SH3 domain, and Asp versus Leu in the C-SH3 domain). Groemping et al. proposed that only SH3 domains with the GWW triplet signature could likely form the tandem arrangement [19].
Interestingly, many of the domain positions that contact the tSH3-binding motif show a remarkable sequence variation within the family (Figure 4). Still, some conserved positions can be seen and considered as important motif-binding signatures. Firstly, the C-SH3 domain contains a highly conserved double-negative signature. These acidic residues form salt bridges with the strongly conserved Arg at the 4th position of the tSH3-binding motif. Secondly, the hydrophobic patch at the end of the N-SH3 domain has fully and highly conserved positions that are in contact with the proline residue at the 3rd position of the motif, as well as the residue in the 5th, [RP] position (Figure 4). These will be referred to as C-double negative (C-[DE][DE]) and N-PxxΦL (where Φ is an aromatic residue) signatures from now on. It is important to note that these three signatures seem to be specific to tSH3s, as several of their residue positions show remarkable sequence variation among SH3 domains in general (see SH3 alignment in Table S4 of Teyra J. et al. [10]).
The 3rd position of the core motif was defined as a strongly required Pro in the strong motif definition, but it is the one that became somewhat relaxed in the weak motif definition due to sequence variations seen in the weak-binding NOXO1 C-terminal autoregulatory motif. Interestingly, two of the three domain residue positions that consistently contact this P(3) motif residue in the complex structures show remarkable sequence variation within the NCF1 family (see P(3) contacting several residues in the N-PxxΦL signature in Figure 3). This could imply that certain family members are more suited to accommodate motifs with a variation at the 3rd position, while others are more conservative.
The sequence signatures for SH3 tandemization and composite peptide binding introduced above are well conserved for the NCF1 family members among vertebrates (Figure S6), along with the respective autoregulatory motifs and partner motifs (Figure S1). However, we identified an interesting case of co-evolution between inter-dependent protein interaction modules in the African elephant (Loxodonta africana). Here, the GWW signature triplets of both SH3 domains of NCF1 are disrupted/missing, as well as the autoregulatory (AIR) motif at the C-terminal part of the protein that is responsible for the inactive, closed conformation. Furthermore, the respective motif is also missing from the partner, the NADPH oxidase core complex subunit CYBA, so all interacting modules of the NADPH oxidase regulatory system are affected (Figure 5). These findings, along with the complete lack of the NOXO1 gene, strongly suggest that the regulation of the NADPH oxidase is completely altered in the African elephant.

3.5. Does SH3 Tandemization and Joint Binding of a Single Motif Occur Outside the NCF1 Family?

To explore potential candidates for SH3 domain tandemization and binding of a single motif by a composite binding cleft, we screened all proteins of the human proteome that had at least two SH3 domains (45 proteins; Table S6). Interestingly, besides the NCF1 family members and the two close paralogs of NCF1 (NCF1B and NCF1C; could be pseudogenes), only the adaptor proteins SH3KBP1/CIN85 and CD2AP/CMS had at least two consecutive SH3 domains with the GWW signature. While in CIN85 all three domains have the signature, and the first two are relatively close to each other (separated by a linker of ~30 residues [82]), in CD2AP only the last two SH3s have the signature, but those are relatively far apart (~100 residues).
Interestingly, the first two SH3 domains of CIN85 also exhibit the C-[DE][DE] signature and a slightly modified N-PxxΦL signature, with Val at the last position. Therefore, based on SH3 domain signatures, CIN85 could be an interesting candidate for intrachain SH3 tandemization (Figure S7). It is important to note, however, that the linker connecting the two domains is much longer than those of the NCF1 family members, and has a strong basic character at both termini, in contrast to NCF1 family members, where the linker is acidic. This could be a reason why no contacts have been identified between the SH3 domains within the CIN85 SH3AB construct (containing the first two SH3 domains of CIN85 connected by the linker) when studied by NMR previously [82]. It is also important to note that there are no isoforms of CIN85 and CD2AP, where the SH3 domains are preserved and only the linker is varied (as for TKS5).
Individual SH3 domains of CIN85 have a non-canonical binding preference for Px[PAV]xPR motifs (where x is any residue) [16] that differs from the binding preference of individual SH3 domains in the NCF1 family [82]. Therefore, even if we assume that they can form tandems, the binding preference might not be the same as for NCF1 family members. We applied different computational approaches to investigate whether the CIN85 SH3 domains could form tandems. First, we used both AF2 and AF3 to predict whether the linked first two SH3 domains of CIN85 could bind to a peptide within CIN85 that resembles the tSH3-binding motif. CIN85 has an “LPPRR” sequence (residues 417-421 in CIN85; UniProt AC: Q96B97) that does not fit the strong motif definition because it has a Glu in the C-terminal flanking region (at the 7th position), but might be a weak and/or reverse autoregulatory site. When testing the linked domains (residues 1-160 in CIN85; UniProt AC: Q96B97) with a peptide containing this motif and four flanking residues on both sides, the SH3 domains did not form a tandem arrangement in any of the resulting models (the GWW signatures were too distant to make contacts). Second, we also ran AF2 and AF3 with a peptide carrying the proposed autoregulatory motif of the TKS4 protein, because (1) it fits the strong motif definition, (2) the two proteins are binding partners [57,83] and (3) the peptide “QRPVVPPRRPPPP” (residues 753-765 in TKS4; UniProt AC: A1X283) also contains an overlapping PxVxPR motif that fits the binding preference of CIN85 SH3 domains [16] and was recently shown to be the binding site for the closely related CD2AP [78], suggesting that it is also the binding site for CIN85. The linked SH3 domains did not form a tandem arrangement in any of the resulting models with this peptide either.
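The overlap between the two motif types in the TKS4 peptide can be checked directly from the sequence quoted above. The following is a minimal sketch; the helper names `residue_span` and `spans_overlap` are hypothetical and introduced only for illustration, and the two patterns are the motif definitions given in the text.

```python
import re

STRONG_TSH3 = re.compile(r"[PAVIL]PPR[PR][^DE][^DE]")  # tandem SH3-binding motif
CIN85_SH3 = re.compile(r"P.[PAV].PR")                  # Px[PAV]xPR, x = any residue


def residue_span(pattern, peptide, first_residue):
    """Return the (first, last) residue numbers of the first match, or None."""
    m = pattern.search(peptide)
    if m is None:
        return None
    return (first_residue + m.start(), first_residue + m.end() - 1)


def spans_overlap(a, b):
    """True if two inclusive residue ranges share at least one residue."""
    return a is not None and b is not None and a[0] <= b[1] and b[0] <= a[1]


peptide = "QRPVVPPRRPPPP"  # TKS4 residues 753-765, as quoted in the text
tandem = residue_span(STRONG_TSH3, peptide, 753)
single = residue_span(CIN85_SH3, peptide, 753)
print(tandem, single, spans_overlap(tandem, single))
# prints: (757, 763) (755, 760) True
```

Under these definitions the seven-residue tSH3 motif spans residues 757-763 and the PxVxPR match spans residues 755-760, so the two sites share residues 757-760.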
We decided to check whether ensembles generated by the novel BioEmu AI-assisted ensemble generation tool [51] for NCF1 family members and CIN85 contain conformers with tandemly arranged SH3 domains and bound tSH3 motifs. To this end, we generated ensembles of 30 conformers using BioEmu [51] on the linked SH3 domains of the NCF1 family members and CIN85, with and without including their autoregulatory peptides in the studied constructs. Without the peptide being part of the constructs, linked SH3 domains failed to sample the typical domain tandem with multiple contacts. However, if the Pro-rich autoregulatory peptide was present C-terminally to the tSH3s, the tandem SH3 protein segments appeared to assemble in a small to large population of the ensemble, depending on the system. In the case of NCF1 (aa. 156–310), the interdomain assembly with the peptide appeared most commonly in the correct peptide-bound state. In the TKS4 system (aa. 152–358) and TKS5 system (aa. 166–412), the tandem SH3 was sampled in 19% and 17% of the ensemble population, although a plausible peptide binding pose was only detected in the reverse orientation (6% and 17% of conformers, respectively). The NOXO1 construct only sampled the tSH3 in less than 10% of conformers, while the autoregulatory peptide was never placed correctly (neither in forward, nor in reverse orientation). While the generated ensembles of NCF1 family members contained varying populations with correct tandem arrangements of the SH3 domains (RMSD < 5 Å from reference PDB:7YXW) (Figure S8), the one generated for CIN85 did not. For the CIN85 system, the addition of the peptide had no detectable effect on the success (p-value = 0.35, t-test on corresponding RMSD distributions), whereas the trend of improved tSH3 recovery held true for all other systems (ranging from NOXO1’s p = 0.0138 to NCF1’s p = 3.8 × 10⁻¹⁷, Table S7).
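The ensemble comparison above rests on two simple quantities: the fraction of conformers within the RMSD cutoff of the reference tandem, and a t statistic comparing the RMSD distributions with and without the peptide. The sketch below shows how such quantities could be computed. It makes several assumptions that are not taken from the study: the RMSD values are made up for illustration (they are not from Table S7), the function names are hypothetical, and Welch's unequal-variance form of the t statistic is used because the text does not specify which t-test variant was applied. Converting the statistic into a p-value would require the t distribution and is omitted.

```python
from math import sqrt
from statistics import mean, variance


def tandem_fraction(rmsds, cutoff=5.0):
    """Fraction of conformers with RMSD below `cutoff` (in Å) from the reference."""
    return sum(r < cutoff for r in rmsds) / len(rmsds)


def welch_t(a, b):
    """Welch's t statistic for two samples (assumed variant, see lead-in)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Made-up RMSD values in Å, for illustration only; not data from the study.
with_peptide = [3.0, 4.0, 12.0, 14.0]
without_peptide = [11.0, 13.0, 15.0, 17.0]

print(tandem_fraction(with_peptide))     # 0.5
print(tandem_fraction(without_peptide))  # 0.0
# A negative statistic means lower RMSDs (better tandem recovery) with the peptide.
print(welch_t(with_peptide, without_peptide) < 0)  # True
```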
Based on our computational analysis, the N-terminal SH3 domains of CIN85 do not seem to form tandems (which is in agreement with experimental observations). Since, based on sequence signatures and domain distances, CIN85 is the most promising candidate across the proteome, our results imply that intrachain tandemization of SH3 domains probably only occurs within the NCF1 family.
While probably not forming tandems intramolecularly, the N-terminal SH3 domain of CIN85 has already been described to undergo intermolecular clustering with a proline-arginine-rich peptide of CBL-b (PDB: 2BZ8) [84,85]. In these clusters, two different CIN85 proteins are pulled together by motif-binding-mediated clustering of their N-terminal SH3 domains. After identifying this structure and some other similar structures based on Desrochers G et al. [86], we performed a targeted search of the PDB for structures wherein there are at least two SH3 domains binding to the same peptide. Only eight structures fulfilled this criterion, including the five depicting the tSH3s of NCF1 binding to the autoinhibitory or CYBA peptides (listed in Table 1). In PDB ID:2BZ8 [84], introduced above, the two copies of the N-terminal SH3 domain of CIN85 bind to a single CBL-b peptide. In PDB:2D1X [87], two copies of the C-terminal SH3 domain of cortactin bind to a peptide of AMAP1. Finally, in PDB:5SXP [86], two copies of the ARHGEF7/beta-PIX SH3 domain bind to a peptide of the E3 ubiquitin ligase ITCH. Overall, the latter two novel cases also represent interchain SH3 clusterization events, as seen for the CIN85 N-terminal SH3 domain.
When comparing structures depicting intrachain tandemization and interchain clusterization (Figure 6), we found some important differences. In the case of tandemization (structures are only available for NCF1), the two domains are in contact and form a common binding groove (buried surface area between the two domains is 572 Å² [20]), where the GWW signature triplets and surrounding residues of both domains play a key role [19,20] (Figure 4). In contrast, in the structures depicting interchain SH3 clustering, the two SH3 domains do not form an interface: although they do have the GWW signature, their GWW triplets are much further apart, and they bind the peptide independently, i.e., through two independent binding grooves (Figure 6).
While our results suggest that CIN85 does not form intramolecular SH3 domain tandems, and thus those are likely specific to the NCF1 family, they underscore the diversity of possible interactions between SH3 domains. The observed interchain clustering of the CIN85 N-SH3 domain indicates that SH3 domain interactions are context-dependent, though the full scope, associated sequence signatures and mechanisms of these effects remain to be fully understood.

4. Discussion

SH3 domains are small but highly adaptable protein modules that evolved to fulfil a broad range of molecular functions [2]. The ~300 SH3 domains of the human proteome [1] display a fascinating variety of binding preferences [1,9,10], and thereby bring specificity and fidelity into cellular signaling. While most known SH3 domains follow a simple one-to-one peptide binding mode, there are exceptions to this rule. For instance, the IB1 SH3 domain is not known to bind any peptides; the residues usually involved in peptide binding instead form a unique dimerization interface [88]. In our study, we focused on a unique binding mode wherein two tandemly arranged SH3 domains of a single protein chain come together to form a common binding groove that accommodates a single Pro-rich peptide [18,19,20]. This binding mode has only been described for the NCF1 family of proteins: NCF1 [18,19,20,23,24,89,90], NOXO1 [25,46,58], TKS4 [26] and TKS5 [21,27]. While for NCF1 it was discovered and characterized in much detail decades ago [18,19,20,23,24,89,90], for the last family member, TKS4, it has only been verified recently [26]. Over the years, more and more evidence and examples accumulated, but (probably due to the big differences in timing and the imbalance of available data for the different family members) they remained scattered in the literature and have never been collected and comprehensively analyzed.
Here, we collected the scattered pieces of evidence, which enabled taking a fresh look at the gathered data, systematic cross-comparison and contextualization of published results across proteins and deriving novel insights. Definition of the tSH3-binding motif as a novel short linear motif (SLiM) based on the gathered instances was a crucial step towards identifying novel motif candidates. We relied on the classical approach for the definition of the tSH3-binding motif [28,29] and detection of potential motif instances in the human proteome. Due to their short length and few specificity-determining residue positions, motif definitions are degenerate and tend to occur in protein sequences by chance [64]. This burdens pattern-based motif searches with high false positive rates [64]. Therefore, in addition to applying the usual restrictions to eliminate false positives, such as considering motifs only within disordered regions [28,66], we also narrowed our hits to those occurring in the known binding partners of NCF1 family proteins, and manually curated the resulting list.
We applied state-of-the-art AlphaFold-based structure prediction approaches to assess the validity and likely binding modes of both validated motif instances with no available structures and novel motif instances proposed in our study. We applied the AlphaFold2-derived AlphaFold-Multimer method [91], which has demonstrated high accuracy in predicting protein complexes. Although folding and binding are based on similar biophysical principles, and the method was primarily expected to excel in predicting domain interactions, it can also reliably predict disordered regions bound to ordered domains and distinguish true interactions from decoys [92]. Because SLiMs are short and preferentially located in IDRs with poor sequence conservation, predicting the complex structures of SLiM-mediated interactions is particularly challenging. Such complexes may fall outside AlphaFold’s training set, and it is not clear to what extent AlphaFold can apply co-evolutionary signals for their prediction. Nevertheless, AlphaFold-Multimer has shown success in both systematically screening known interactions [40] and, in some cases, discovering new motif instances [41]. Furthermore, it has been shown to capture subtle differences in the binding modes of linear motifs [43,93], a feature particularly important for evaluating the versatile peptide binding modes of SH3 domains. A crucial feature for any prediction algorithm is to provide reliability or probability metrics for its predictions. While AlphaFold-Multimer provides predicted TM-scores (pTM), these scores are often biased towards the much larger globular domains and fail to represent short motifs. To overcome this issue, we also considered actifpTM [42] confidence scores, which are specifically adapted to measure the reliability of domain–peptide interactions. Although AlphaFold3 [45] is expected to perform well on motifs, no large-scale analysis has demonstrated this so far; therefore, we only used it as a complementary approach.
Out of curiosity, we also experimented with the recently developed BioEmu deep learning-based ensemble generation tool [51] to see if conformers representing the tandemization and motif binding of tSH3s can be observed among NCF1 family members. The rationale for using BioEmu was that it not only generates AlphaFold-like structural models based on a training set of rigid protein structures but is also trained on extensive molecular dynamics (MD) simulations and complementary experimental stability data from small proteins [94]. Interestingly, we observed that BioEmu often responds sensitively to the addition of autoregulatory Pro-rich peptides, which appear to promote tandemization of SH3 domains in NCF1 family proteins. It remains unclear whether tSH3 formation facilitates peptide binding, or whether peptide binding itself induces tandemization of the SH3 domains in the cellular context. This question has not been thoroughly investigated, and targeted kinetic studies will be necessary to elucidate the mechanisms underlying this intramolecular assembly.
Our structure prediction approach was reinforced by the observation that the complexes of tSH3s and motifs fitting our strong motif definition were predicted with almost identical structures as the experimentally determined complexes. At the same time, it highlighted the possibility that certain tSH3-binding motifs, seemingly the ones with at least one negatively charged residue in the two C-terminal flanking positions immediately following the core motif, bind in a reverse orientation. We suggest that this novel binding mode observed for the N-terminal autoregulatory tSH3-binding motif of TKS4 [26] and the homologous motif proposed for TKS5 autoregulation should be experimentally investigated.
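The flank-based tendency noted above can be written as a simple heuristic. The sketch below is only an illustration of that stated rule, not a validated predictor: the function name `predicted_orientation` and its return labels are hypothetical, the core pattern is taken from the strong motif definition, and the third test sequence is an invented example with an acidic residue in the flank.

```python
import re

# Five-residue core of the strong motif followed by two C-terminal flank residues.
CORE_WITH_FLANK = re.compile(r"[PAVIL]PPR[PR](..)")


def predicted_orientation(segment):
    """Classify a segment that starts at the first core position.

    Returns "forward" when neither flank residue is acidic, and
    "reverse candidate" when at least one of them is Asp or Glu.
    """
    m = CORE_WITH_FLANK.match(segment)
    if m is None:
        return "no core motif with two flanking residues"
    flank = m.group(1)
    return "reverse candidate" if set(flank) & {"D", "E"} else "forward"


print(predicted_orientation("PPPRPKT"))  # NCF2 segment from the text
print(predicted_orientation("PPPRRIA"))  # RelA segment from the text
print(predicted_orientation("PPPRRDA"))  # hypothetical sequence with an acidic flank
```

The first two segments are classified as forward and the invented third one as a reverse candidate; such a label only flags a motif for structural modeling and experimental testing.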
When screening the proteome for candidates of the tSH3 binding mode using the previously established sequence signatures of tandemization and joint motif binding, SH3KBP1/CIN85 was identified as the only promising candidate. However, in accordance with the results of a previous structural analysis by NMR [85], our structure prediction and ensemble generation approaches did not support the formation of the tandem arrangement, even in the presence of Pro-rich peptides harboring the tSH3-binding motif, or a PxxxPR motif fitting the binding preference of individual CIN85 SH3 domains. Based on these observations, the tandem SH3-binding mode is likely to be specific for the four members of the NCF1 protein family that have two or more SH3 domains.
We successfully identified promising candidate tSH3-binding motifs in both NCF1 family members (representing potential autoregulatory sites) and in their known binding partners. Since most NCF1 family members are directly implicated in diseases, the proposed novel autoregulatory mechanisms and mapped binding sites within partner proteins could have therapeutic relevance. Mutations preventing the tandemization of SH3 domains within NCF1 [19,20,81] or destroying the tSH3-binding motif in the partner CYBA [18] lead to a lack of functional NADPH oxidase and are implicated in Chronic Granulomatous Disease (CGD), a severe immune deficiency. Our proposed motifs (Table 1 and Table 2) include a potential secondary autoregulatory site within NCF1 (overlapping with the C-terminal Pro-rich binding site of the NCF2 SH3 domain [70]) and a potential tSH3-binding motif within NCF2 that could represent a secondary binding interface between these two crucial subunits of the NADPH oxidase [18,70]. Furthermore, we believe that our motif definition enabled precise mapping of the NCF1 tSH3-binding site within the protein RelA, which mediates the NCF1-RELA interaction underlying the RelA-mediated activation of the NF-κB pathway by IL-1 in endothelial cells [71].
The role of the tSH3s within the larger family members, TKS4 and TKS5 is much less understood [21,26,27,95], although both of them are heavily implicated in diseases, most importantly in the metastatic potential of different cancers, such as melanoma, lung cancer and colon cancer, among others [57,59,60,78,95,96,97]. Having both overlapping and specific functions, the two proteins are key in the formation and dynamic changes in podosomes and invadopodia [61,95]. They are important for the degradation of the extracellular matrix around these cell protrusions (through facilitating the transport of matrix metalloproteases to the cell surface [95] and directly communicating with them in the cytoplasm [72,73]), which helps free up the space required for cell migration. At the same time, they also play a role in the remodeling of the cytoskeleton inside the protrusions, e.g., TKS4 can directly interact with the actin capping protein [57] (and also indirectly through its interaction with CD2AP [78,98]), and therefore likely plays a key role in reorganizing the actin cytoskeleton at podosomes and invadopodia. Due to the crucial roles of TKS4 and TKS5 in the metastatic potential of cancer cells [95], understanding the regulatory mechanisms affecting their availability and/or functions, as well as the molecular mechanisms of their protein–protein interaction, could have direct therapeutic relevance.
We propose several novel candidate tSH3-binding motifs that could be important in the regulation and functioning of the two proteins (Table 1 and Table 2). First, we propose an autoregulatory tSH3-binding motif between the 3rd and 4th SH3 domains of TKS4 that was previously uninvestigated, even though its sequence and the AF2-predicted model of TKS4 indicate that it is a stronger site than the one previously suggested [26]. A further interesting detail is that the recently described CD2AP-interacting SH3-binding motif, PxVxPR [78], completely overlaps with this proposed downstream autoregulatory site of TKS4. This suggests that if the proposed intramolecular interaction exists, it could likely be resolved by the binding of CD2AP. However, the validity and biological relevance of this potential autoregulatory mechanism still await experimental validation.
Our structure predictions and ensemble calculations also suggest that the previously suggested autoregulatory tSH3-binding motif falling between the 2nd and 3rd SH3 domains of TKS4 [26] likely binds in reverse orientation (just as the homologous site within TKS5 that could be an autoregulatory motif of the short isoform), probably due to having a negatively charged residue at the first position of the C-terminal flank. While candidate tSH3-binding motifs have also been detected in a partner of TKS4 (SH3BP1 [57]) and in a partner of TKS5 (ADAM19 [72]), these motifs lie in long Pro-rich IDRs that harbour many potential SH3-binding sites; thus, the interactions could also be mediated by the individual SH3s of the two proteins. In the case of ADAM19, the 1st and 5th SH3 domains of TKS5 were demonstrated to mediate the interaction [72]. Nonetheless, we propose both the potential autoregulatory and the partner-binding candidate tSH3-binding motifs for experimental validation, wherein, ideally, constructs for both tandem and individual SH3 domains should be used in order to fully elucidate the participating protein modules and the underlying specificity determinants of the interactions.

5. Conclusions

In our study, we revisited a unique binding mode described for the p47phox-related protein family, wherein two tandemly arranged SH3 domains of a single protein chain come together to form a common binding groove that accommodates Pro-rich peptides. When screening the human proteome for other potential SH3 tandemization candidates based on the associated specific sequence signatures, only one promising candidate could be identified. However, our structure prediction and ensemble generation approaches did not support the formation of the tandem arrangement by the SH3 domains of this candidate, so we conclude that this binding mode is likely specific to the NCF1 protein family. Through the collection and comprehensive analysis of previously described tandem SH3-binding motif instances scattered in the literature, we successfully defined the binding preference of the tSH3s as a novel short linear motif (SLiM), [PAVIL]PPR[PR][^DE][^DE]. This motif definition was then used to discover novel motif instances within NCF1 family members and their interaction partners. The resulting hits were manually curated and their validity and potential binding modes evaluated by state-of-the-art AI-assisted structure prediction and ensemble generation approaches. Our results imply that most tSH3-motif interactions (those lacking structural information) rely on a binding mode similar to that described for the NCF1 tSH3s binding to its autoregulatory motif or a similar motif within CYBA, with some specific instances binding in reverse mode. We propose this novel reverse binding mode, along with the most promising candidate motif instances discovered, for experimental validation.
Given the involvement of several members of the studied protein family in different diseases, in particular TKS4 and TKS5, which are heavily implicated in the metastatic potential of diverse cancer types, the proposed autoregulatory and partner-binding tandem SH3-binding motifs could have direct therapeutic relevance.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biom15121641/s1, Figure S1: Sequence alignments of the experimentally verified and proposed motifs from Table 1 on vertebrate species. Figure S2: Structure alignment showing the correspondence between experimental and AF2-predicted structures. Figure S3: AlphaFold predicts different binding modes for the tandem SH3-binding motifs. Figure S4: AlphaFold3 models of the tandem SH3 domains with their autoinhibitory peptide(s). Figure S5: Sequence alignments of the proposed motifs within binding partners from Table 2 on vertebrate species. Figure S6: Sequence alignments of the tandem SH3 domains of NCF1 family members. Figure S7: Sequence alignment of the tandem SH3 domains in NCF1 family members and CIN85. Figure S8: Structural superposition statistics of the BioEmu predicted ensembles with the reference state of tandem SH3 domain. Table S1: List of species investigated on the alignment figures. Table S2: ActifpTM scores of AF2-predicted structures from Table 1. Table S3: List of species considered for calculation of sequence conservation for the strong tandem SH3-binding motifs detected in the human proteome. Table S4: Residue-wise contributions to Gibbs free energies predicted by MMGBSA. Table S5: List of strong tandem SH3-binding motif instances in the human proteome. Table S6: List of SH3 domains (with and without GWW signature), and their distance from each other. Table S7: Structural superposition results and corresponding statistics of the BioEmu predicted ensembles with the reference state of tandem SH3 domain. Data S1: AlphaFold2 predicted structures for tandem SH3 domains and motifs in Table 1 are available at https://zenodo.org/records/17522166 (accessed on 17 November 2025).

Author Contributions

Conceptualization, R.P., Z.E.K. and L.D.; Methodology, R.P., Z.E.K., L.D. and T.L.; Formal Analysis, Z.E.K., L.D., T.L. and R.P.; Investigation, Z.E.K., R.P. and L.D.; Data Curation, R.P. and Z.E.K.; Writing—Original Draft Preparation, R.P., L.D., Z.E.K. and T.L.; Writing—Review & Editing, R.P., Z.E.K., L.D. and T.L.; Visualization, Z.E.K., L.D., T.L. and R.P.; Funding Acquisition, R.P., L.D. and Z.E.K. All authors have read and agreed to the published version of the manuscript.

Funding

The project was implemented with support from the National Research, Development and Innovation Fund of the Ministry of Culture and Innovation, financed under the FK-142285 and PD-146564 funding schemes granted to R.P. and L.D. R.P. is a holder of the János Bolyai Research Fellowship of the Hungarian Academy of Sciences (BO/00174/22). T.L. was a postdoctoral innovation mandate holder (HBC.2022.0194) of the Flanders Innovation & Entrepreneurship Agency (VLAIO) between 2022 and 2024. The work was supported by the University Research Scholarship Programme 2024 (EKÖP) and FEBS ST Fellowship 2025 to Z.E.K.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AF: AlphaFold
AIR: Autoinhibitory region
CGD: Chronic granulomatous disease
ELM: Eukaryotic linear motif resource
IDR: Intrinsically disordered region
ITC: Isothermal titration calorimetry
PDB: Protein data bank
PPI: Protein–protein interaction
PPII: Polyproline type II
SLiM: Short linear motif
tSH3s: tandem SH3 domains

References

  1. Mehrabipour, M.; Jasemi, N.S.K.; Dvorsky, R.; Ahmadian, M.R. A Systematic Compilation of Human SH3 Domains: A Versatile Superfamily in Cellular Signaling. Cells 2023, 12, 2054.
  2. Kaneko, T.; Li, L.; Li, S.S.-C. The SH3 Domain—A Family of Versatile Peptide- and Protein-Recognition Module. Front. Biosci. 2008, 13, 4938–4952.
  3. Kapeller, R.; Prasad, K.V.; Janssen, O.; Hou, W.; Schaffhausen, B.S.; Rudd, C.E.; Cantley, L.C. Identification of Two SH3-Binding Motifs in the Regulatory Subunit of Phosphatidylinositol 3-Kinase. J. Biol. Chem. 1994, 269, 1927–1933.
  4. Merő, B.; Radnai, L.; Gógl, G.; Tőke, O.; Leveles, I.; Koprivanacz, K.; Szeder, B.; Dülk, M.; Kudlik, G.; Vas, V.; et al. Structural Insights into the Tyrosine Phosphorylation-Mediated Inhibition of SH3 Domain-Ligand Interactions. J. Biol. Chem. 2019, 294, 4608–4620.
  5. Rao, Y.; Ma, Q.; Vahedi-Faridi, A.; Sundborger, A.; Pechstein, A.; Puchkov, D.; Luo, L.; Shupliakov, O.; Saenger, W.; Haucke, V. Molecular Basis for SH3 Domain Regulation of F-BAR-Mediated Membrane Deformation. Proc. Natl. Acad. Sci. USA 2010, 107, 8213–8218.
  6. Ghosh, A.; Mazarakos, K.; Zhou, H.-X. Three Archetypical Classes of Macromolecular Regulators of Protein Liquid-Liquid Phase Separation. Proc. Natl. Acad. Sci. USA 2019, 116, 19474–19483.
  7. Tateno, K.; Ando, T.; Tabata, M.; Sugasawa, H.; Hayashi, T.; Yu, S.; Pm, S.; Inomata, K.; Mikawa, T.; Ito, Y.; et al. Different Molecular Recognition by Three Domains of the Full-Length GRB2 to SOS1 Proline-Rich Motifs and EGFR Phosphorylated Sites. Chem. Sci. 2024, 15, 15858–15872.
  8. Amaya, J.; Ryan, V.H.; Fawzi, N.L. The SH3 Domain of Fyn Kinase Interacts with and Induces Liquid-Liquid Phase Separation of the Low-Complexity Domain of hnRNPA2. J. Biol. Chem. 2018, 293, 19522–19531.
  9. Kazemein Jasemi, N.S.; Mehrabipour, M.; Magdalena Estirado, E.; Brunsveld, L.; Dvorsky, R.; Ahmadian, M.R. Functional Classification and Interaction Selectivity Landscape of the Human SH3 Domain Superfamily. Cells 2024, 13, 195.
  10. Teyra, J.; Huang, H.; Jain, S.; Guan, X.; Dong, A.; Liu, Y.; Tempel, W.; Min, J.; Tong, Y.; Kim, P.M.; et al. Comprehensive Analysis of the Human SH3 Domain Family Reveals a Wide Variety of Non-Canonical Specificities. Structure 2017, 25, 1598–1610.e3.
  11. Feng, S.; Chen, J.K.; Yu, H.; Simon, J.A.; Schreiber, S.L. Two Binding Orientations for Peptides to the Src SH3 Domain: Development of a General Model for SH3-Ligand Interactions. Science 1994, 266, 1241–1247.
  12. Lim, W.A.; Richards, F.M.; Fox, R.O. Structural Determinants of Peptide-Binding Orientation and of Sequence Specificity in SH3 Domains. Nature 1994, 372, 375–379.
  13. Yu, H.; Chen, J.K.; Feng, S.; Dalgarno, D.C.; Brauer, A.W.; Schreiber, S.L. Structural Basis for the Binding of Proline-Rich Peptides to SH3 Domains. Cell 1994, 76, 933–945.
  14. Aitio, O.; Hellman, M.; Kesti, T.; Kleino, I.; Samuilova, O.; Pääkkönen, K.; Tossavainen, H.; Saksela, K.; Permi, P. Structural Basis of PxxDY Motif Recognition in SH3 Binding. J. Mol. Biol. 2008, 382, 167–178.
  15. Liu, Q.; Berry, D.; Nash, P.; Pawson, T.; McGlade, C.J.; Li, S.S.-C. Structural Basis for Specific Binding of the Gads SH3 Domain to an RxxK Motif-Containing SLP-76 Peptide: A Novel Mode of Peptide Recognition. Mol. Cell 2003, 11, 471–481.
  16. Kurakin, A.V.; Wu, S.; Bredesen, D.E. Atypical Recognition Consensus of CIN85/SETA/Ruk SH3 Domains Revealed by Target-Assisted Iterative Screening. J. Biol. Chem. 2003, 278, 34102–34109.
  17. Dionne, U.; Chartier, F.J.M.; López de Los Santos, Y.; Lavoie, N.; Bernard, D.N.; Banerjee, S.L.; Otis, F.; Jacquet, K.; Tremblay, M.G.; Jain, M.; et al. Direct Phosphorylation of SRC Homology 3 Domains by Tyrosine Kinase Receptors Disassembles Ligand-Induced Signaling Networks. Mol. Cell 2018, 70, 995–1007.e11.
  18. Sumimoto, H.; Kage, Y.; Nunoi, H.; Sasaki, H.; Nose, T.; Fukumaki, Y.; Ohno, M.; Minakami, S.; Takeshige, K. Role of Src Homology 3 Domains in Assembly and Activation of the Phagocyte NADPH Oxidase. Proc. Natl. Acad. Sci. USA 1994, 91, 5345–5349.
  19. Groemping, Y.; Lapouge, K.; Smerdon, S.J.; Rittinger, K. Molecular Basis of Phosphorylation-Induced Activation of the NADPH Oxidase. Cell 2003, 113, 343–355.
  20. Yuzawa, S.; Suzuki, N.N.; Fujioka, Y.; Ogura, K.; Sumimoto, H.; Inagaki, F. A Molecular Mechanism for Autoinhibition of the Tandem SH3 Domains of p47phox, the Regulatory Subunit of the Phagocyte NADPH Oxidase. Genes Cells 2004, 9, 443–456.
  21. Rufer, A.C.; Rumpf, J.; von Holleben, M.; Beer, S.; Rittinger, K.; Groemping, Y. Isoform-Selective Interaction of the Adaptor Protein Tks5/FISH with Sos1 and Dynamins. J. Mol. Biol. 2009, 390, 939–950.
  22. Ago, T.; Nunoi, H.; Ito, T.; Sumimoto, H. Mechanism for Phosphorylation-Induced Activation of the Phagocyte NADPH Oxidase Protein p47(phox). Triple Replacement of Serines 303, 304, and 328 with Aspartates Disrupts the SH3 Domain-Mediated Intramolecular Interaction in p47(phox), Thereby Activating the Oxidase. J. Biol. Chem. 1999, 274, 33644–33653.
  23. Huang, J.; Kleinberg, M.E. Activation of the Phagocyte NADPH Oxidase Protein p47(phox). Phosphorylation Controls SH3 Domain-Dependent Binding to p22(phox). J. Biol. Chem. 1999, 274, 19731–19737.
  24. Nobuhisa, I.; Takeya, R.; Ogura, K.; Ueno, N.; Kohda, D.; Inagaki, F.; Sumimoto, H. Activation of the Superoxide-Producing Phagocyte NADPH Oxidase Requires Co-Operation between the Tandem SH3 Domains of p47phox in Recognition of a Polyproline Type II Helix and an Adjacent Alpha-Helix of p22phox. Biochem. J. 2006, 396, 183–192.
  25. Takeya, R.; Ueno, N.; Kami, K.; Taura, M.; Kohjima, M.; Izaki, T.; Nunoi, H.; Sumimoto, H. Novel Human Homologues of p47phox and p67phox Participate in Activation of Superoxide-Producing NADPH Oxidases. J. Biol. Chem. 2003, 278, 25234–25246.
  26. Merő, B.; Koprivanacz, K.; Cserkaszky, A.; Radnai, L.; Vas, V.; Kudlik, G.; Gógl, G.; Sok, P.; Póti, Á.L.; Szeder, B.; et al. Characterization of the Intramolecular Interactions and Regulatory Mechanisms of the Scaffold Protein Tks4. Int. J. Mol. Sci. 2021, 22, 8103.
  27. Diaz, B.; Shani, G.; Pass, I.; Anderson, D.; Quintavalle, M.; Courtneidge, S.A. Tks5-Dependent, Nox-Mediated Generation of Reactive Oxygen Species Is Necessary for Invadopodia Formation. Sci. Signal. 2009, 2, ra53.
  28. Kumar, M.; Michael, S.; Alvarado-Valverde, J.; Zeke, A.; Lazar, T.; Glavina, J.; Nagy-Kanta, E.; Donagh, J.M.; Kalman, Z.E.; Pascarelli, S.; et al. ELM-the Eukaryotic Linear Motif Resource-2024 Update. Nucleic Acids Res. 2024, 52, D442–D455.
  29. Gouw, M.; Alvarado-Valverde, J.; Čalyševa, J.; Diella, F.; Kumar, M.; Michael, S.; Van Roey, K.; Dinkel, H.; Gibson, T.J. How to Annotate and Submit a Short Linear Motif to the Eukaryotic Linear Motif Resource. Methods Mol. Biol. 2020, 2141, 73–102.
  30. Blum, M.; Andreeva, A.; Florentino, L.C.; Chuguransky, S.R.; Grego, T.; Hobbs, E.; Pinto, B.L.; Orr, A.; Paysan-Lafosse, T.; Ponamareva, I.; et al. InterPro: The Protein Sequence Classification Resource in 2025. Nucleic Acids Res. 2025, 53, D444–D456.
  31. Altenhoff, A.M.; Warwick Vesztrocy, A.; Bernard, C.; Train, C.-M.; Nicheperovich, A.; Prieto Baños, S.; Julca, I.; Moi, D.; Nevers, Y.; Majidian, S.; et al. OMA Orthology in 2024: Improved Prokaryote Coverage, Ancestral and Extant GO Enrichment, a Revamped Synteny Viewer and More in the OMA Ecosystem. Nucleic Acids Res. 2024, 52, D513–D521.
  32. Sievers, F.; Wilm, A.; Dineen, D.; Gibson, T.J.; Karplus, K.; Li, W.; Lopez, R.; McWilliam, H.; Remmert, M.; Söding, J.; et al. Fast, Scalable Generation of High-Quality Protein Multiple Sequence Alignments Using Clustal Omega. Mol. Syst. Biol. 2011, 7, 539.
  33. Procter, J.B.; Carstairs, G.M.; Soares, B.; Mourão, K.; Ofoegbu, T.C.; Barton, D.; Lui, L.; Menard, A.; Sherstnev, N.; Roldan-Martinez, D.; et al. Alignment of Biological Sequences with Jalview. Methods Mol. Biol. 2021, 2231, 203–224.
  34. Mosca, R.; Céol, A.; Stein, A.; Olivella, R.; Aloy, P. 3did: A Catalog of Domain-Based Interactions of Known Three-Dimensional Structure. Nucleic Acids Res. 2014, 42, D374–D379.
  35. van Kempen, M.; Kim, S.S.; Tumescheit, C.; Mirdita, M.; Lee, J.; Gilchrist, C.L.M.; Söding, J.; Steinegger, M. Fast and Accurate Protein Structure Search with Foldseek. Nat. Biotechnol. 2024, 42, 243–246.
  36. Burley, S.K.; Bhikadiya, C.; Bi, C.; Bittrich, S.; Chen, L.; Crichlow, G.V.; Duarte, J.M.; Dutta, S.; Fayazi, M.; Feng, Z.; et al. RCSB Protein Data Bank: Celebrating 50 Years of the PDB with New Tools for Understanding and Visualizing Biological Macromolecules in 3D. Protein Sci. 2022, 31, 187–208.
  37. Del Conte, A.; Camagni, G.F.; Clementel, D.; Minervini, G.; Monzon, A.M.; Ferrari, C.; Piovesan, D.; Tosatto, S.C.E. RING 4.0: Faster Residue Interaction Networks with Novel Interaction Types across over 35,000 Different Chemical Structures. Nucleic Acids Res. 2024, 52, W306–W312.
  38. Kim, G.; Lee, S.; Levy Karin, E.; Kim, H.; Moriwaki, Y.; Ovchinnikov, S.; Steinegger, M.; Mirdita, M. Easy and Accurate Protein Structure Prediction Using ColabFold. Nat. Protoc. 2025, 20, 620–642.
  39. Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Highly Accurate Protein Structure Prediction with AlphaFold. Nature 2021, 596, 583–589.
  40. Bret, H.; Gao, J.; Zea, D.J.; Andreani, J.; Guerois, R. From Interaction Networks to Interfaces, Scanning Intrinsically Disordered Regions Using AlphaFold2. Nat. Commun. 2024, 15, 597.
  41. Lee, C.Y.; Hubrich, D.; Varga, J.K.; Schäfer, C.; Welzel, M.; Schumbera, E.; Djokic, M.; Strom, J.M.; Schönfeld, J.; Geist, J.L.; et al. Systematic Discovery of Protein Interaction Interfaces Using AlphaFold and Experimental Validation. Mol. Syst. Biol. 2024, 20, 75–97.
  42. Varga, J.K.; Ovchinnikov, S.; Schueler-Furman, O. actifpTM: A Refined Confidence Metric of AlphaFold2 Predictions Involving Flexible Regions. Bioinformatics 2025, 41, btaf107.
  43. Zeke, A.; Gibson, T.J.; Dobson, L. Linear Motifs Regulating Protein Secretion, Sorting and Autophagy in Leishmania Parasites Are Diverged with Respect to Their Host Equivalents. PLoS Comput. Biol. 2024, 20, e1011902.
  44. Meng, E.C.; Goddard, T.D.; Pettersen, E.F.; Couch, G.S.; Pearson, Z.J.; Morris, J.H.; Ferrin, T.E. UCSF ChimeraX: Tools for Structure Building and Analysis. Protein Sci. 2023, 32, e4792.
  45. Abramson, J.; Adler, J.; Dunger, J.; Evans, R.; Green, T.; Pritzel, A.; Ronneberger, O.; Willmore, L.; Ballard, A.J.; Bambrick, J.; et al. Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3. Nature 2024, 630, 493–500.
  46. Dutta, S.; Rittinger, K. Regulation of NOXO1 Activity through Reversible Interactions with p22 and NOXA1. PLoS ONE 2010, 5, e10478.
  47. Gohlke, H.; Case, D.A. Converging Free Energy Estimates: MM-PB(GB)SA Studies on the Protein-Protein Complex Ras-Raf. J. Comput. Chem. 2004, 25, 238–250.
  48. Zhang, X.; Jiang, L.; Weng, G.; Shen, C.; Zhang, O.; Liu, M.; Zhang, C.; Gu, S.; Wang, J.; Wang, X.; et al. HawkDock Version 2: An Updated Web Server to Predict and Analyze the Structures of Protein-Protein Complexes. Nucleic Acids Res. 2025, 53, W306–W315.
  49. Sun, H.; Li, Y.; Tian, S.; Xu, L.; Hou, T. Assessing the Performance of MM/PBSA and MM/GBSA Methods. 4. Accuracies of MM/PBSA and MM/GBSA Methodologies Evaluated by Various Simulation Protocols Using PDBbind Data Set. Phys. Chem. Chem. Phys. 2014, 16, 16719–16729.
  50. Mirdita, M.; Schütze, K.; Moriwaki, Y.; Heo, L.; Ovchinnikov, S.; Steinegger, M. ColabFold: Making Protein Folding Accessible to All. Nat. Methods 2022, 19, 679–682.
  51. Lewis, S.; Hempel, T.; Jiménez-Luna, J.; Gastegger, M.; Xie, Y.; Foong, A.Y.K.; Satorras, V.G.; Abdin, O.; Veeling, B.S.; Zaporozhets, I.; et al. Scalable Emulation of Protein Equilibrium Ensembles with Generative Deep Learning. Science 2025, 389, eadv9817.
  52. UniProt Consortium. UniProt: The Universal Protein Knowledgebase in 2025. Nucleic Acids Res. 2025, 53, D609–D617.
  53. Erdős, G.; Dosztányi, Z. AIUPred: Combining Energy Estimation with Deep Learning for the Enhanced Prediction of Protein Disorder. Nucleic Acids Res. 2024, 52, W176–W181.
  54. Varga, J.; Dobson, L.; Tusnády, G.E. TOPDOM: Database of Conservatively Located Domains and Motifs in Proteins. Bioinformatics 2016, 32, 2725–2726.
  55. Oughtred, R.; Stark, C.; Breitkreutz, B.-J.; Rust, J.; Boucher, L.; Chang, C.; Kolas, N.; O’Donnell, L.; Leung, G.; McAdam, R.; et al. The BioGRID Interaction Database: 2019 Update. Nucleic Acids Res. 2019, 47, D529–D541.
  56. Del Toro, N.; Shrivastava, A.; Ragueneau, E.; Meldal, B.; Combe, C.; Barrera, E.; Perfetto, L.; How, K.; Ratan, P.; Shirodkar, G.; et al. The IntAct Database: Efficient Access to Fine-Grained Molecular Interaction Data. Nucleic Acids Res. 2022, 50, D648–D653.
  57. László, L.; Kurilla, A.; Tilajka, Á.; Pancsa, R.; Takács, T.; Novák, J.; Buday, L.; Vas, V. Unveiling Epithelial Plasticity Regulation in Lung Cancer: Exploring the Cross-Talk among Tks4 Scaffold Protein Partners. Mol. Biol. Cell 2024, 35, ar111.
  58. Yamamoto, A.; Kami, K.; Takeya, R.; Sumimoto, H. Interaction between the SH3 Domains and C-Terminal Proline-Rich Region in NADPH Oxidase Organizer 1 (Noxo1). Biochem. Biophys. Res. Commun. 2007, 352, 560–565.
  59. Iizuka, S.; Abdullah, C.; Buschman, M.D.; Diaz, B.; Courtneidge, S.A. The Role of Tks Adaptor Proteins in Invadopodia Formation, Growth and Metastasis of Melanoma. Oncotarget 2016, 7, 78473–78486.
  60. Courtneidge, S.A. Cell Migration and Invasion in Human Disease: The Tks Adaptor Proteins. Biochem. Soc. Trans. 2012, 40, 129–132.
  61. Buschman, M.D.; Bromann, P.A.; Cejudo-Martin, P.; Wen, F.; Pass, I.; Courtneidge, S.A. The Novel Adaptor Protein Tks4 (SH3PXD2B) Is Required for Functional Podosome Formation. Mol. Biol. Cell 2009, 20, 1302–1311.
  62. Stapley, B.J.; Creamer, T.P. A Survey of Left-Handed Polyproline II Helices. Protein Sci. 1999, 8, 587–595.
  63. Hekkelman, M.L.; Salmoral, D.Á.; Perrakis, A.; Joosten, R.P. DSSP 4: FAIR Annotation of Protein Secondary Structure. Protein Sci. 2025, 34, e70208.
  64. Davey, N.E.; Van Roey, K.; Weatheritt, R.J.; Toedt, G.; Uyar, B.; Altenberg, B.; Budd, A.; Diella, F.; Dinkel, H.; Gibson, T.J. Attributes of Short Linear Motifs. Mol. Biosyst. 2012, 8, 268–281.
  65. Van Roey, K.; Uyar, B.; Weatheritt, R.J.; Dinkel, H.; Seiler, M.; Budd, A.; Gibson, T.J.; Davey, N.E. Short Linear Motifs: Ubiquitous and Functionally Diverse Protein Interaction Modules Directing Cell Regulation. Chem. Rev. 2014, 114, 6733–6778.
  66. Fuxreiter, M.; Tompa, P.; Simon, I. Local Structural Disorder Imparts Plasticity on Linear Motifs. Bioinformatics 2007, 23, 950–956.
  67. Hornbeck, P.V.; Kornhauser, J.M.; Latham, V.; Murray, B.; Nandhikonda, V.; Nord, A.; Skrzypek, E.; Wheeler, T.; Zhang, B.; Gnad, F. 15 Years of PhosphoSitePlus®: Integrating Post-Translationally Modified Sites, Disease Variants and Isoforms. Nucleic Acids Res. 2019, 47, D433–D441.
  68. Varadi, M.; Bertoni, D.; Magana, P.; Paramval, U.; Pidruchna, I.; Radhakrishnan, M.; Tsenkov, M.; Nair, S.; Mirdita, M.; Yeo, J.; et al. AlphaFold Protein Structure Database in 2024: Providing Structure Coverage for over 214 Million Protein Sequences. Nucleic Acids Res. 2024, 52, D368–D375.
  69. Oughtred, R.; Rust, J.; Chang, C.; Breitkreutz, B.-J.; Stark, C.; Willems, A.; Boucher, L.; Leung, G.; Kolas, N.; Zhang, F.; et al. The BioGRID Database: A Comprehensive Biomedical Resource of Curated Protein, Genetic, and Chemical Interactions. Protein Sci. 2021, 30, 187–200.
  70. Finan, P.; Shimizu, Y.; Gout, I.; Hsuan, J.; Truong, O.; Butcher, C.; Bennett, P.; Waterfield, M.D.; Kellie, S. An SH3 Domain and Proline-Rich Sequence Mediate an Interaction between Two Components of the Phagocyte NADPH Oxidase Complex. J. Biol. Chem. 1994, 269, 13752–13755.
  71. Gu, Y.; Xu, Y.C.; Wu, R.F.; Nwariaku, F.E.; Souza, R.F.; Flores, S.C.; Terada, L.S. p47phox Participates in Activation of RelA in Endothelial Cells. J. Biol. Chem. 2003, 278, 17210–17217.
  72. Kleino, I.; Järviluoma, A.; Hepojoki, J.; Huovila, A.P.; Saksela, K. Preferred SH3 Domain Partners of ADAM Metalloproteases Include Shared and ADAM-Specific SH3 Interactions. PLoS ONE 2015, 10, e0121301.
  73. Abram, C.L.; Seals, D.F.; Pass, I.; Salinsky, D.; Maurer, L.; Roth, T.M.; Courtneidge, S.A. The Adaptor Protein Fish Associates with Members of the ADAMs Family and Localizes to Podosomes of Src-Transformed Cells. J. Biol. Chem. 2003, 278, 16844–16851.
  74. Onofri, F.; Giovedi, S.; Kao, H.T.; Valtorta, F.; Bongiorno Borbone, L.; De Camilli, P.; Greengard, P.; Benfenati, F. Specificity of the Binding of Synapsin I to Src Homology 3 Domains. J. Biol. Chem. 2000, 275, 29857–29867.
  75. Takeshita, F.; Ishii, K.J.; Kobiyama, K.; Kojima, Y.; Coban, C.; Sasaki, S.; Ishii, N.; Klinman, D.M.; Okuda, K.; Akira, S.; et al. TRAF4 Acts as a Silencer in TLR-Mediated Signaling through the Association with TRAF6 and TRIF. Eur. J. Immunol. 2005, 35, 2477–2485.
  76. Cho, N.H.; Cheveralls, K.C.; Brunner, A.-D.; Kim, K.; Michaelis, A.C.; Raghavan, P.; Kobayashi, H.; Savy, L.; Li, J.Y.; Canaj, H.; et al. OpenCell: Endogenous Tagging for the Cartography of Human Cellular Organization. Science 2022, 375, eabi6983.
  77. Parrini, M.C.; Sadou-Dubourgnoux, A.; Aoki, K.; Kunida, K.; Biondini, M.; Hatzoglou, A.; Poullet, P.; Formstecher, E.; Yeaman, C.; Matsuda, M.; et al. SH3BP1, an Exocyst-Associated RhoGAP, Inactivates Rac1 at the Front to Drive Cell Motility. Mol. Cell 2011, 42, 650–661.
  78. Kurilla, A.; László, L.; Takács, T.; Tilajka, Á.; Lukács, L.; Novák, J.; Pancsa, R.; Buday, L.; Vas, V. Studying the Association of TKS4 and CD2AP Scaffold Proteins and Their Implications in the Partial Epithelial-Mesenchymal Transition (EMT) Process. Int. J. Mol. Sci. 2023, 24, 15136.
  79. Elbediwy, A.; Zihni, C.; Terry, S.J.; Clark, P.; Matter, K.; Balda, M.S. Epithelial Junction Formation Requires Confinement of Cdc42 Activity by a Novel SH3BP1 Complex. J. Cell Biol. 2012, 198, 677–693.
  80. Boldt, K.; van Reeuwijk, J.; Lu, Q.; Koutroumpas, K.; Nguyen, T.-M.T.; Texier, Y.; van Beersum, S.E.C.; Horn, N.; Willer, J.R.; Mans, D.A.; et al. An Organelle-Specific Protein Landscape Identifies Novel Diseases and Molecular Mechanisms. Nat. Commun. 2016, 7, 11491.
  81. Noack, D.; Rae, J.; Cross, A.R.; Ellis, B.A.; Newburger, P.E.; Curnutte, J.T.; Heyworth, P.G. Autosomal Recessive Chronic Granulomatous Disease Caused by Defects in NCF-1, the Gene Encoding the Phagocyte p47-Phox: Mutations Not Arising in the NCF-1 Pseudogenes. Blood 2001, 97, 305–311.
  82. Ababou, A.; Pfuhl, M.; Ladbury, J.E. Novel Insights into the Mechanisms of CIN85 SH3 Domains Binding to Cbl Proteins: Solution-Based Investigations and in Vivo Implications. J. Mol. Biol. 2009, 387, 1120–1136.
  83. Havrylov, S.; Rzhepetskyy, Y.; Malinowska, A.; Drobot, L.; Redowicz, M.J. Proteins Recruited by SH3 Domains of Ruk/CIN85 Adaptor Identified by LC-MS/MS. Proteome Sci. 2009, 7, 21.
  84. Jozic, D.; Cárdenes, N.; Deribe, Y.L.; Moncalián, G.; Hoeller, D.; Groemping, Y.; Dikic, I.; Rittinger, K.; Bravo, J. Cbl Promotes Clustering of Endocytic Adaptor Proteins. Nat. Struct. Mol. Biol. 2005, 12, 972–979.
  85. Ceregido, M.A.; Garcia-Pino, A.; Ortega-Roldan, J.L.; Casares, S.; López Mayorga, O.; Bravo, J.; van Nuland, N.A.J.; Azuaga, A.I. Multimeric and Differential Binding of CIN85/CD2AP with Two Atypical Proline-Rich Sequences from CD2 and Cbl-B*. FEBS J. 2013, 280, 3399–3415.
  86. Desrochers, G.; Cappadocia, L.; Lussier-Price, M.; Ton, A.-T.; Ayoubi, R.; Serohijos, A.; Omichinski, J.G.; Angers, A. Molecular Basis of Interactions between SH3 Domain-Containing Proteins and the Proline-Rich Region of the Ubiquitin Ligase Itch. J. Biol. Chem. 2017, 292, 6325–6338.
  87. Hashimoto, S.; Hirose, M.; Hashimoto, A.; Morishige, M.; Yamada, A.; Hosaka, H.; Akagi, K.-I.; Ogawa, E.; Oneyama, C.; Agatsuma, T.; et al. Targeting AMAP1 and Cortactin Binding Bearing an Atypical Src Homology 3/proline Interface for Prevention of Breast Cancer Invasion and Metastasis. Proc. Natl. Acad. Sci. USA 2006, 103, 7036–7041.
  88. Kristensen, O.; Guenat, S.; Dar, I.; Allaman-Pillet, N.; Abderrahmani, A.; Ferdaoussi, M.; Roduit, R.; Maurer, F.; Beckmann, J.S.; Kastrup, J.S.; et al. A Unique Set of SH3-SH3 Interactions Controls IB1 Homodimerization. EMBO J. 2006, 25, 785–797.
  89. Ogura, K.; Nobuhisa, I.; Yuzawa, S.; Takeya, R.; Torikai, S.; Saikawa, K.; Sumimoto, H.; Inagaki, F. NMR Solution Structure of the Tandem Src Homology 3 Domains of p47phox Complexed with a p22phox-Derived Proline-Rich Peptide. J. Biol. Chem. 2006, 281, 3660–3668.
  90. Leto, T.L.; Adams, A.G.; de Mendez, I. Assembly of the Phagocyte NADPH Oxidase: Binding of Src Homology 3 Domains to Proline-Rich Targets. Proc. Natl. Acad. Sci. USA 1994, 91, 10650–10654.
  91. Evans, R.; O’Neill, M.; Pritzel, A.; Antropova, N.; Senior, A.; Green, T.; Žídek, A.; Bates, R.; Blackwell, S.; Yim, J.; et al. Protein Complex Prediction with AlphaFold-Multimer. bioRxiv 2021.
  92. Omidi, A.; Møller, M.H.; Malhis, N.; Bui, J.M.; Gsponer, J. AlphaFold-Multimer Accurately Captures Interactions and Dynamics of Intrinsically Disordered Protein Regions. Proc. Natl. Acad. Sci. USA 2024, 121, e2406407121.
  93. Ibrahim, T.; Khandare, V.; Mirkin, F.G.; Tumtas, Y.; Bubeck, D.; Bozkurt, T.O. AlphaFold2-Multimer Guided High-Accuracy Prediction of Typical and Atypical ATG8-Binding Motifs. PLoS Biol. 2023, 21, e3001962.
  94. Tsuboyama, K.; Dauparas, J.; Chen, J.; Laine, E.; Mohseni Behbahani, Y.; Weinstein, J.J.; Mangan, N.M.; Ovchinnikov, S.; Rocklin, G.J. Mega-Scale Experimental Analysis of Protein Folding Stability in Biology and Design. Nature 2023, 620, 434–444.
  95. Kudlik, G.; Takács, T.; Radnai, L.; Kurilla, A.; Szeder, B.; Koprivanacz, K.; Merő, B.L.; Buday, L.; Vas, V. Advances in Understanding TKS4 and TKS5: Molecular Scaffolds Regulating Cellular Processes from Podosome and Invadopodium Formation to Differentiation and Tissue Homeostasis. Int. J. Mol. Sci. 2020, 21, 8117.
  96. Tilajka, Á.; Kurilla, A.; László, L.; Lovrics, A.; Novák, J.; Takács, T.; Buday, L.; Vas, V. Predictive Value Analysis of the Interaction Network of Tks4 Scaffold Protein in Colon Cancer. Front. Mol. Biosci. 2024, 11, 1414805.
  97. Jacksi, M.; Schad, E.; Buday, L.; Tantos, A. Absence of Scaffold Protein Tks4 Disrupts Several Signaling Pathways in Colon Cancer Cells. Int. J. Mol. Sci. 2023, 24, 1310.
  98. Zhao, J.; Bruck, S.; Cemerski, S.; Zhang, L.; Butler, B.; Dani, A.; Cooper, J.A.; Shaw, A.S. CD2AP Links Cortactin and Capping Protein at the Cell Periphery to Facilitate Formation of Lamellipodia. Mol. Cell. Biol. 2013, 33, 38–47.
Figure 1. Domain maps of the NCF1 family members with more than one SH3 domain. Domain maps of NCF1, NOXO1, TKS4 and TKS5. PX domains are depicted in blue, SH3 domains in green. Tandemized SH3 domains are marked with a light green background. Autoregulatory tandem SH3-binding motifs (tSH3 mot) are marked in orange; among these, the experimentally verified cases are connected to the tandem domains by continuous lines, while the proposed motif instances are connected by dashed lines. The gene names of partner proteins that are experimentally proven to bind to the tSH3s, along with the positions of their tSH3-binding motifs, are indicated in purple.
Figure 2. Considerations driving motif definition. We relied on six different considerations when defining the strong tSH3-binding motif, which are marked with six different colors. In the middle, the regular expression of the strong motif definition is depicted, and the different considerations used to define the (dis)allowed residues in particular positions of the motif are mapped onto the respective positions with stripes of matching color code. In the upper left corner, vertebrate alignment of the tSH3-binding motif of CYBA is shown (the motif is missing from the African elephant, an observation also discussed in a later section). In the middle, the inhibitory effect of a disease mutation in this motif is depicted. On the right, the NCF1 autoregulatory tSH3-binding motif is depicted, wherein phosphorylations of the serine residues following the core motif (in the 6th and 7th positions) inhibit motif binding. Experiments confirmed that introducing phosphomimetic (acidic) residues into these positions also hinders binding, which led to their exclusion in the respective positions of the motif definition. In the bottom right corner, results of MMGBSA calculations are depicted, which estimate the relative contributions of motif residues to the tSH3 domain-motif complex formation.
Figure 3. AlphaFold predicts different binding modes for the tandem SH3-binding motifs. AlphaFold-predicted models of the complexes suggest different binding modes based on motif sequence characteristics. (A) All motifs fitting the strong motif definition bind very similarly (this is also evident based on the experimentally determined structures listed in Table 1). (B) The weak binder NOXO1 C-terminal motif, which has a Thr at the 3rd position of its core, binds almost identically to the strong binders, which supports the validity of the weak motif definition. (C) A reverse binding mode is predicted for a pair of homologous motifs in TKS4 and TKS5, where a negatively charged residue follows the core motif, but an Arg occupies position 0, directly preceding the core motif. (D) Comparison of the positioning of Arg4 in the forward binding mode versus Arg0 in the reverse binding mode and their distances to the Glu and Asp residues of the C-double negative signature (defined in a following section) on the domain side.
Figure 4. Sequence signatures of tandemization and motif binding within the tSH3s of the four family members. Positions suggested to be important in tandemization are marked with red boxes (based on [19,20]). Amino acids that participate in binding the partner peptides within the available complex structures (PDB IDs: 1NG2, 1UEC, 1OV3, 1WLP, 7YXW) are marked by black boxes. The residues of the contacting peptides are marked above the contacted alignment positions (bold letter: found in three or more structures, normal: found in at least two structures). The numbering of peptide/motif residues follows the one introduced in the previous section on the definition of the tandem SH3-binding motif, so, for instance, R(4) stands for the conserved Arg at the 4th position of the motif. The domain and linker boundaries, as well as the sequence signatures of tandemization and joint motif binding, are indicated in gray and black below the alignment.
Figure 5. Example of co-evolution between the NCF1 tandem SH3 domains and the respective binding motifs within NCF1 and the partner CYBA in the African elephant (Loxodonta africana). Sequence alignments of the NCF1 tandem SH3 domains, the autoregulatory motif and the tSH3-binding motif in CYBA are shown to highlight that all of them have been lost or substantially altered in the African elephant.
Figure 6. Structure alignment of intrachain tandem SH3 domains and interchain SH3 clustering. (A) Structure alignment of intrachain and interchain experimental structures (7ywx, 1ov3, 5spx, 2d1x, 2bz8). (B) Rotated view of intrachain tandem (top) and interchain clustering (bottom) SH3 domains. (C) A closer view highlighting the distance between the SH3 domains in intrachain tandem (top) and interchain clustering (bottom) arrangements. The depicted distances are calculated between the Gly residues of the GWW signature triplets of the two domains.
Table 1. Experimentally validated and proposed tandem SH3-binding motifs of the NCF1 family.
| Partner with SH3s (by Gene Name) | UniProt AC; Tandem SH3 Positions | Partner with Motif (by Gene Name) | UniProt AC; Peptide Positions | Peptide Sequence | Autoregulatory/Partner (A/P) | Kd/Not Measured (NM) | PDB Structures | Reference | Status: Validated/Proposed |
|---|---|---|---|---|---|---|---|---|---|
| NCF1 | P14598 158-283 | NCF1 | P14598 296-304 | RGAPPRRSS | A | 29 µM | 1NG2, 1UEC | [19] | Validated |
| NCF1 | P14598 158-283 | CYBA | P13498 153-161 | SNPPPRPPA | P | 0.19 µM | 1OV3, 1WLP, 7YXW | [18,19] | Validated |
| NOXO1 | Q8NFA2 165-294 | NOXO1 | Q8NFA2 329-337 | PTVPTRPSP | A | 50 µM | N/A | [46] | Validated |
| NOXO1 | Q8NFA2 165-294 | CYBA | P13498 153-161 | SNPPPRPPA | P | 0.15 µM | N/A | [46] | Validated |
| TKS4 | A1X283 155-278 | TKS4 | A1X283 344-352 | QRPPPRRDM | A | 12 ± 4 µM | N/A | [26] | Validated |
| TKS5 | Q5TCZ1-3 168-297 | CYBA | P13498 153-161 | SNPPPRPPA | P | NM | N/A | [27] | Validated |
| TKS5 | Q5TCZ1-3 168-297 | SOS1 | Q07889 1151-1159 | PPVPPRRRP | P | NM | N/A | [21] | Validated |
| NCF1 | P14598 158-283 | NCF1 | P14598 363-371 | PAVPPRPSA | A | N/A | N/A | N/A | Proposed |
| TKS4 | A1X283 155-278 | TKS4 | A1X283 755-763 | PVVPPRRPP | A | N/A | N/A | N/A | Proposed |
| TKS5 | Q5TCZ1-3 168-297 | TKS5 | Q5TCZ1-3 398-406 | TRPPPRRES | A | N/A | N/A | N/A | Proposed |
In the peptide sequences residues belonging to the core motifs are underscored.
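As an illustration of how the strong motif definition from the abstract, [PAVIL]PPR[PR][^DE][^DE], can be applied to the peptides of Table 1, the following minimal Python sketch scans each peptide with that regular expression. The function name `find_strong_motif` and the position arithmetic (which assumes that the first peptide residue sits at the listed start position) are illustrative assumptions and not part of the published workflow; a match only means that the sequence fits the pattern, not that binding has been demonstrated.

```python
import re

# Strong tandem SH3-binding motif as given in the abstract.
STRONG_MOTIF = re.compile(r"[PAVIL]PPR[PR][^DE][^DE]")

# Distinct peptides from Table 1: (gene of the motif-bearing protein,
# position of the first peptide residue, peptide sequence).
PEPTIDES = [
    ("NCF1", 296, "RGAPPRRSS"),
    ("CYBA", 153, "SNPPPRPPA"),
    ("NOXO1", 329, "PTVPTRPSP"),
    ("TKS4", 344, "QRPPPRRDM"),
    ("SOS1", 1151, "PPVPPRRRP"),
    ("NCF1", 363, "PAVPPRPSA"),
    ("TKS4", 755, "PVVPPRRPP"),
    ("TKS5", 398, "TRPPPRRES"),
]


def find_strong_motif(start, sequence):
    """Return (position of first matched residue, matched text), or None.

    Assumes `start` is the protein position of sequence[0] (hypothetical
    helper written for this example only).
    """
    match = STRONG_MOTIF.search(sequence)
    if match is None:
        return None
    return (start + match.start(), match.group())


if __name__ == "__main__":
    for gene, start, sequence in PEPTIDES:
        hit = find_strong_motif(start, sequence)
        label = "no strong-motif match" if hit is None else f"{hit[1]} at {hit[0]}"
        print(f"{gene:6s} {sequence}  {label}")
```

Under this pattern, the NOXO1 peptide (Thr at the 3rd core position) and the TKS4 344-352 and TKS5 398-406 peptides (an acidic residue following the core) do not match, which agrees with their description as weak or reverse-mode candidates in the caption of Figure 3.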
Table 2. Partners of the NCF1 family proteins which contain at least one strong tandem SH3-binding motif.
| NCF1 family protein | Identified partners with the motif positions indicated |
|---|---|
| NCF1 | NCF2 (227-233), SRSF2 (105-111), ADAM19 (789-795), RELA (326-332), SYN1 (676-682) |
| NOXO1 | ARHGAP8 (213-219) |
| TKS4 | SH3BP1 (538-544) * |
| TKS5 | GAB2 (351-357), PLEKHA7 (919-925), ADAM19 (789-795), RIN3 (385-391), DENND1A (950-956), INPP5E (189-195) |
In the table, self interactions (NCF1-NCF1, TKS4-TKS4), interactions with pseudogenes (NCF1-NCF1C) and the interactions covered in Table 1 (NCF1-CYBA, NOXO1-CYBA, TKS5-SOS1) are not listed. The hits that seemed most promising based on literature curation are in bold font. * For TKS4, no novel hits could be identified based on BioGRID and IntAct. The partner lists of in-house TKS4 co-immunoprecipitation experiments performed in five different lung cancer cell lines [57] were also obtained and overlapped with the motif hits for the whole proteome. SH3BP1 could be identified as a partner harboring the motif, which was detected in 2 of the 5 cell lines as a TKS4 partner.
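The overlap step described in the table note (intersecting partner lists with proteome-wide motif hits and counting in how many cell lines a motif-bearing partner was detected) could be sketched roughly as follows. The function name, the cell-line labels and the gene identifiers are placeholders invented for this example; they are not data from the study.

```python
from collections import Counter


def partners_with_motif(partners_by_line, motif_hits):
    """Count, for each motif-bearing protein, the number of cell lines
    in which it appears as an interaction partner."""
    counts = Counter()
    for partners in partners_by_line.values():
        # Keep only partners that also carry a motif hit.
        for gene in set(partners) & motif_hits:
            counts[gene] += 1
    return dict(counts)


# Placeholder inputs (hypothetical, for illustration only).
partners_by_line = {
    "line_1": {"GENE_A", "GENE_B"},
    "line_2": {"GENE_B", "GENE_C"},
    "line_3": {"GENE_D"},
}
motif_hits = {"GENE_B", "GENE_D", "GENE_E"}

result = partners_with_motif(partners_by_line, motif_hits)
print(result)
```

Here GENE_B carries a motif hit and is detected in two of the three placeholder cell lines, which mirrors the kind of statement made for SH3BP1 in the note above.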

Share and Cite

MDPI and ACS Style

Kalman, Z.E.; Lazar, T.; Dobson, L.; Pancsa, R. Definition and Discovery of Tandem SH3-Binding Motifs Interacting with Members of the p47phox-Related Protein Family. Biomolecules 2025, 15, 1641. https://doi.org/10.3390/biom15121641