PInteract: Detecting Aromatic-Involving Motifs in Proteins and Protein-Nucleic Acid Complexes

Li, Dong; Pucci, Fabrizio; Rooman, Marianne

doi:10.3390/biom15081204

Open AccessArticle

PInteract: Detecting Aromatic-Involving Motifs in Proteins and Protein-Nucleic Acid Complexes

by

Dong Li

^1,2

,

Fabrizio Pucci

^1,2 and

Marianne Rooman

^1,2,*

¹

Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050 Brussels, Belgium

²

Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles, 1050 Brussels, Belgium

^*

Author to whom correspondence should be addressed.

Biomolecules 2025, 15(8), 1204; https://doi.org/10.3390/biom15081204

Submission received: 22 July 2025 / Revised: 15 August 2025 / Accepted: 16 August 2025 / Published: 21 August 2025

(This article belongs to the Special Issue Computational Analysis and Conformational Modeling for Protein Structure and Interaction)

Download

Browse Figures

Versions Notes

Abstract

With the recent development of accurate protein structure prediction tools, virtually all protein sequences now have an experimental or a modeled structure. It has therefore become essential to develop fast algorithms capable of detecting non-covalent interactions not only within proteins but also in protein-protein, protein-DNA, protein-RNA, and protein-ligand complexes. Interactions involving aromatic compounds, particularly their

π

molecular orbitals, hold unique significance among molecular interactions due to the electron delocalization, which is known to play a key role in processes such as protein aggregation. In this paper, we present PInteract, an algorithm that detects

π

-involving interactions in input structures based on geometric criteria, including

π

-

π

, cation-

π

, amino-

π

, His-

π

, and sulfur-

π

interactions. In addition, it is capable of detecting chains and clusters of

π

interactions as well as particular recurrent motifs at protein-DNA and protein-RNA interfaces, called stair motifs, consisting of a particular combination of

π

-

π

stacking, cation/amino/His-

π

and H-bond interactions.

Keywords:

stacking geometry; T-shaped geometry; protein stability; protein affinity; solubility; aggregation

1. Introduction

Among the various types of noncovalent interactions that stabilize the 3-dimensional (3D) structures of proteins and protein-involving complexes, interactions where one of the partners carries an aromatic ring have a status apart [1,2]. Indeed, their London dispersion contribution to the free energy is particularly important, due to the strength of

π

-electron delocalization. In particular, interactions between two aromatic moieties, called

π

-

π

interactions [3], have large dispersion energies especially if the aromatic rings are parallel. When the rings are in T-shaped conformations, the electrostatic contributions are increased at the expense of dispersion contributions. Both T-shaped and stacked conformations are energetically favorable; their relative preference depends on the particular type of extracyclic atoms and on the environment. Here, we considered

π

-

π

interactions in which one of the partners is an aromatic amino acid (Phe, Tyr, Trp) and the other is either another aromatic amino acid or a nucleic acid base (Gua, Ade, Cyt, Thy, Ura).

In addition to

π

-

π

interactions, we considered cation-

π

, amino-

π

, and His-

π

interactions, where the

π

-partner is the ring of an aromatic amino acid or nucleic acid base. In cation-

π

interactions [4,5,6], the other partner is a positively charged amino acid Arg or Lys. In Lys-

π

interactions, the electrostatic contribution is predominant, whereas in Arg-

π

interactions the dispersion energy is also important due to the charge delocalization on the planar guanidinium group of the Arg side chain. In amino-

π

interactions [5,7], sometimes also called amide-

π

, the second partner is the neutral side chain of Asn or Gln, which is the planar and polarizable formamide group; the electrostatic contribution is here small and the dispersion energy is the main stabilizing force. The His-

π

interaction [8] is, in essence, contained in other subtypes according to the protonation state of the His: when neutral, His-

π

can be considered as a

π

-

π

interaction, and when protonated, as a cation-

π

interaction. This makes His-

π

a special and versatile pH-dependent interaction that plays a unique role in function and molecular recognition [9,10]. We chose to treat it as a separate form of

π

interaction.

The last

π

-involving type of interactions we considered is sulfur-

π

interaction [11,12], also called thiol-

π

, which links an aromatic moiety of an amino acid or nucleic acid base with a sulfur atom of Met or Cys side chains. Such interactions have been shown to be stabilizing, though less than other

π

interactions [13], and to be sometimes of functional importance [14].

Different

π

interactions are often found to combine into larger motifs, such as chains or clusters of

π

-

π

interactions [15] and of cation/amino/His-

π

interactions [16]. For example, a positively charged ion can be sandwiched between two aromatic rings, or an aromatic ring can be sandwiched between two positively charged groups. Moreover, recurrent motifs called stair motifs, which mix

π

-

π

interactions and H-bonds with cation-

π

, amino-

π

, or His-

π

interactions, are also commonly found at protein-DNA interfaces [16,17].

There has been great success in studying the importance of aromatic-involving interactions in biological processes [18,19,20,21]. For example, cation-

π

and

π

-

π

interactions have been shown to dictate the phase transition in liquid-liquid phase separation [22]. Furthermore, by mutating some residues in histones into Trp, thus increasing the number of cation-

π

interactions, researchers managed to improve the affinity between histones and their reader proteins, which further enhances downstream genetic regulations [23]. Moreover, a mutation from Phe to Thr in BanLec, a banana protein which is regarded as a potential antiviral agent and works by binding to glycan molecules via

π

-

π

stacking, was demonstrated to reduce the protein’s antiviral competence as the mutation disrupted the capacity of glycan binding of BanLec. Sulfur-

π

interactions, despite a relatively lower appearance frequency, have been pointed out for contribution to protein stability [24]. Finally, His-

π

interactions, being interchangeable among different electrostatic states, has been shown to impact protein activity according to the environmental salt concentration.

Several databases of aromatic interactions in proteins have been publicly released, including CAD [25] and

A^{2}

ID [26], each encompassing different types of aromatic interactions. Despite these advancements, there is still a lack of publicly released computational tools that allow the identification of different types of aromatic interactions. To fill this gap, we developed PInteract, a program written in C, which is able to quickly and accurately localize aromatic-involving interactions in input structures using criteria that combine both distances and angles, which is more chemically reasonable than distance-only tools [27]. With a wide range of supported types of input structures, including protein monomers, multimers, and protein-nucleic acid complexes, we expect PInteract to be an efficient tool in the calculation and analysis of aromatic interactions in biological molecules.

2. Materials and Methods

2.1. Types of Interactions

In the interactions we considered, the first partner is an aromatic moiety of an amino acid (Phe, Tyr, Trp, His) or a nucleic acid (Gua, Ade, Cyt, Thy, Ura). The second partner is an amino acid that is either aromatic (Phe, Tyr, Trp), contains a sulfur atom (Met, Cys), carries a net positive charge (Lys, Arg), possesses a side chain amide group with both a partially positively charged

{NH}_{2}

group and a partially negatively charged C=O group (Asn, Gln), or is His, which is both aromatic and sometimes positively charged. According to the type of partner, the aromatic-involving interactions are referred to as

π

-

π

, sulfur-

π

, cation-

π

, amino-

π

, and His-

π

, respectively.

2.2. Design Principles

PInteract detects

π

interactions using geometric criteria which are in agreement with previously derived energetic principles established through quantum chemical studies; see e.g., [1,2,5,8,28,29]. The interaction energy, defined as

Δ E = E_{A - B} - (E_{A} + E_{B})

, serves as a unified descriptor for diverse

π

interactions, where

E_{A - B}

is the total energy of the interacting partners A and B, and

E_{A}

and

E_{B}

are the energies of the isolated components. A negative

Δ E

indicates a stabilizing interaction. Key factors that determine the geometry and strength of

π

interactions include:

The distance between the two interacting partners. Sometimes, this distance is defined as the distance between the centroids of the two interacting functional groups. In other studies, like here, it is defined as the closest distance between any two atoms of the two functional groups. We chose this distance definition because the considered aromatic moieties carry various substituents which influence how the ring interacts with other partners and breaks the symmetry around the ring’s center. Similarly, the electron delocalization in other functional groups is better captured by selecting the closest atom rather than the functional group’s centroid.
The $α$ angle; it is typically defined as the angle between the vector linking the centroids of both partners and the vector normal to the aromatic ring of the aromatic partner or of one of the aromatic partners in the case of $π$ - $π$ interactions. The complement of $α$ , i.e., $90^{\circ} - α$ , is called elevation angle and is sometimes also used.
The $β$ angle; it only applies to interactions in which the two functional groups are planar, i.e., contain an aromatic moiety (Phe, Tyr, Trp, His), a guanidinium group (Arg) or an formamide group (Asn, Gln). It is defined as the angle between the vectors normal to the planes of the two interacting functional groups and measures the degree of parallellism between the planes.

The energy values of the interacting partners vary significantly according to their interaction geometries. Optimal

Δ E

values are observed for inter-partner distances of 2.5–4.0 Å for cation-

π

and amino-

π

interactions [5,18,25,30,31], about 6.0 Å for sulfur-

π

[11,21], and up to about 5.0 Å for

π

-

π

interactions [2]. In addition to the distance, the angular parameter

α

constraints the interaction geometry and modulates the interaction energies. It allows the specification that the functional group of one partner is positioned above (or below) the aromatic moiety of the other, thereby enabling interaction with its

π

orbital. The

β

angle further influences the interaction energy by defining whether the interacting molecular planes adopt a stacked or T-shaped geometry.

For PInteract to be able to detect stair motifs at protein-DNA and protein-RNA interfaces, which combine a

π

-

π

interaction, an H-bond and a cation-

π

, amino-

π

, or His-

π

interaction between an amino acid and two nucleobases, H-bonds must be identified prior to running PInteract. For this purpose, we used the HBPLUS algorithm [32], which the user must install and execute before using PInteract. Note that all other interaction motifs considered do not require H-bond identification.

2.3. Technical Implementation

2.3.1. Functional Groups

The

π

interactions between functional groups are defined using geometric criteria. The atoms considered to be part of the functional groups are the following:

Aromatic moiety: all atoms that make up the aromatic ring or rings, as well as their centers referred to as mC5 or mC6 depending on whether the ring is 5- or 6-membered;
Positive charge: NH1, NH2, CZ and NE for Arg; and NZ, CE, H1 for Lys; ND1, CE1, NE2 for His if its caonsidered as a cation;
Partial positive charge: NE2, CD and OE1 for Gln; ND2, CG and OD1 for Asn;
Sulfur: SG for Cys; SD for Met.

By convention, partner 1 is taken as the aromatic moiety, or one of the aromatic moieties in the case of

π

-

π

interactions.

2.3.2. Distance Criterion

The first criterion is a distance threshold,

d_{\max}

, which specifies the maximum allowed separation between atoms involved in the interaction. Specifically, we compute the smallest distance d between any atom of the functional group of partner 1 and any atom of the functional group of partner 2, specified in Section 2.3.1. This is illustrated in Figure 1a. For the interaction to be considered valid, this distance must satisfy

d \leq d_{\max}

where:

$d_{\max} = 5.0$ Å for $π$ - $π$ interactions;
$d_{\max} = 6.0$ Å for sulfur- $π$ interactions;
$d_{\max} = 4.5$ Å for cation- $π$ , amino- $π$ , and His- $π$ interactions.

These distance thresholds are provided by default, but they can easily be modified by the user. This is particularly important when using low-resolution or modeled protein 3D structures, in which the positioning of the side chain atoms is not sufficiently accurate.

If an aromatic partner consists of a fused double-ring system, one ring with 5 atoms and the other with 6, as is the case for Trp, Ade and Gua, we determine whether the closest distance involves an atom that is closer to the centroid of the 6-atom ring or of the 5-atom ring. In the former case, we focus on the 6-atom ring for checking whether the other geometrical criteria are satisfied and in the latter case, on the 5-atom ring.

2.3.3. Angle Criterion

To ensure that the interacting partner is positioned within the effective electronic range of the

π

orbital, and thus that the interaction is

π

-mediated, we apply an angular constraint based on a cylindrical model, illustrated in Figure 1a. The cylinder’s base is centered on the geometric center C of the aromatic ring of partner 1 (or of the ring closest to partner 2 in the case of a double aromatic moiety), and its radius is equal to

r_{m a x}

taken to be:

$r_{m a x} = 2 r$ for cation- $π$ , His- $π$ , amino- $π$ and sulfur- $π$ interactions;
$r_{m a x} = 3 r$ for $π$ - $π$ interactions.

where r is the radius of the aromatic ring. The height of the cylinder is such that the closest atom of the functional group of partner 2 (noted A) is included in the cylinder’s second (“upper”) base. If this is impossible, thus if A remains outside the cylinder regardless of its height, the interaction is not considered as a

π

interaction.

The

r_{m a x}

threshold can be translated into an angle criterion. Indeed, for A to be in the cylinder, the angle

α

formed by vector

\vec{C A}

, linking the aromatic moiety’s center to A, and the vector normal to the aromatic ring (taken in the direction of A so that

α \leq 90^{\circ}

) must have a value that is lower than a maximum value, called

α_{m a x}

. This value is reached when A is on the cylinder’s edge, as shown in Figure 1a. Note that the

α_{m a x}

threshold value depends on the distance d between the the 2 partners, unlike the

r_{m a x}

threshold.

In the case of a

π

-

π

interaction, the cylinder is constructed in turn on the aromatic ring of partners 1 and 2, and the configuration yielding the smallest

α

-angle value is retained. This procedure ensures that the two

π

groups are treated symmetrically.

The

r_{m a x}

thresholds are provided by default in the PInteract program but, like for the distance thresholds, they can easily be changed by the user. Since the asymmetrically distributed electron cloud can extend beyond the aromatic ring’s atomic boundaries, using a larger radius ensures inclusion of geometries where the functional group of partner 2 is laterally offset from the ring’s centroid of partner 1 but still engages in energetically significant

π

interactions.

2.3.4. Plane Parallelism Angle

In the case the functional groups of the two partners are planar, as in Arg-

π

, amino-

π

, His-

π

, and

π

-

π

interactions (see Section 2.2), we compute the smallest angle

β

between the vectors that are normal to the planes of the interacting functional groups, as illustrated in Figure 1b for an Arg-Trp interaction. The

β

value is equal to

0^{\circ}

when the two planes are parallel, or equivalently when the two partners are in stacked conformation, and equal to

90^{\circ}

when the planes are perpendicular or in T-shaped conformation.

2.4. Structure Datasets’ Construction

To apply PInteract at proteome scale, we set up several non-redundant and good-resolution structure datasets, which contain:

$D_{monomer}$ : 5590 protein monomers;
$D_{homodimer}$ : 3940 protein homodimers;
$D_{heterodimer}$ : 1037 protein heterodimers;
$D_{TCRpMHC}$ : 51 complexes between a T-cell receptor (TCR) and a peptide-bound major histocompatibility complex (pMHC);
$D_{AbAg}$ : 495 antibody-antigen complexes;
$D_{protNA}$ : 573 protein-DNA or protein-RNA complexes.

The

D_{monomer}

and

D_{heterodimer}

datasets were collected and curated from X-ray structures of protein monomers and heterodimers, respectively, in the Protein DataBank (PDB) [33] using the PISCES server [34]; the

D_{homodimer}

dataset from the HomodimerDB database [35]; the

D_{TCRpMHC}

set from the TCR3d database [36,37]; the

D_{AbAg}

set from the SAbDab database [38,39,40]; and the

D_{protNA}

set from the DNAproDB and RNAproDB databases [41,42].

To ensure high-quality datasets, we applied the following filters: no missing main-chain atomic coordinates; resolution

\leq 2.5

Å; R-factor

\leq 0.25

; and sequence length between 40 and 10,000 residues. Redundancy within each dataset was reduced by applying pairwise sequence identity thresholds. For

D_{monomer}

,

D_{homodimer}

,

D_{heterodimer}

, and

D_{protNA}

(considering only protein chains), the threshold was set to ≤30%. For

D_{AbAg}

and

D_{TCRpMHC}

, we used sequence identity thresholds of ≤70% for antibodies and TCRs (considering complementarity-determining regions only), and ≤90% for antigens and MHCs. Non-redundancy was defined using an “or” rule: two antibody-antigen complexes were considered non-redundant if the antibody sequences shared ≤70% identity or the antigen sequences shared ≤90% identity; the same rule was applied to TCR-pMHC complexes.

To compute the frequency of occurrence of

π

interactions in these datasets, we divided the number of occurrences of a given type by the total number of interactions. More specifically, for

D_{monomer}

, the total number of interactions was estimated as the number of residue pairs in contact, defined as having at least one heavy atom from each residue’s side chain within a maximum distance of 5Å. For

D_{heterodimer}

and

D_{homodimer}

, the frequencies were obtained in the same way, except that only contacts across the protein-protein interface were considered, and for

D_{protNA}

, only contacts across the protein-DNA/RNA interface. In

D_{AbAg}

, where the complexes typically consist of three chains, the considered interactions link the antigen and antibody chains. In

D_{TCRpMHC}

, where the complexes consist of four or five chains, the interactions taken into account are between the TCR chains and the pMHC molecule.

3. Results

3.1. The PInteract Algorithm

We developed the PInteract algorithm to identify

π

-involving interactions in proteins, protein-protein complexes, and interfaces between proteins and DNA, RNA, or nucleic acid base-containing ligands. These interactions include

π

-

π

, sulfur-

π

, cation-

π

, amino-

π

, and His-

π

interactions, which are determined based on the geometric criteria detailed in Section 2.3.

The input to PInteract is the name of a directory that contains structure files in PDB format [33]. The output consists of three files: PInteract.csv, a table of individual

π

interactions; PInteract1.csv, a table of

π

-chains and stair motifs; and PInteract.txt, a human-readable text file containing all the content from PInteract.csv and PInteract1.csv, with comments and explanations prefixed by # to aid understanding. PInteract is implemented in C and is extremely fast, enabling the detection of

π

interactions at the proteome scale. The code can be freely downloaded from our GitHub repository: https://github.com/3BioCompBio/PInteract (accessed on 19 August 2025). All technical details are provided in this repository.

3.2. Examples of $π$ -Involving Interactions

Here, we provide examples of PInteract applications to identify specific geometric arrangements of

π

-involving interactions. The first example of PInteract’s application is illustrated in Figure 2a. A cation-

π

and a

π

-

π

interaction were found at the interface between hen egg lysozyme and the heavy chain of an antibody (PDB ID: 2QDJ). These two interactions are generally particularly important in antibody-antigen recognition [43]. In this specific case, for example, their disruption has been experimentally shown to reduce the antibody-antigen binding affinity by at least two orders of magnitude [44].

PInteract is also able to detect

π

interactions with ligands that contain a nucleic acid base, such as ATP or GTP. An example of such a protein-ligand

π

interaction is shown in Figure 2b. A Met and a Phe residue of a NAD kinase (PDB ID: 1Z0S) make a sulfur-

π

and a

π

-

π

interaction, respectively, with the Ade base included in the ATP molecule.

Successive

π

-involving interactions identified by our algorithm are referred to as

π

-chains in the output files. These chains represent continuous networks of interacting

π

-systems, which may play a structural or functional role in stabilizing molecular assemblies. They can be successive interactions of the same type, e.g., clusters of

π

-

π

interactions or a double cation-

π

interaction where the cation is sandwiched between aromatic rings or where an aromatic moiety is sandwiched between two cations. They can also mix different types of

π

interactions, e.g., a

π

-

π

interaction followed by a cation-

π

interaction. An example is given in Figure 2c, which shows a

π

-chain formed of three

π

-

π

, two cation-

π

and three sulfur-

π

interactions in a bacterial esterase from an Alcaligenes species (PDB ID: 1QLW).

Another type of mixed interactions identified by PInteract is the stair motif [16,17], recurrently observed at protein-DNA and protein-RNA interfaces. These motifs consist of two consecutive stacked nucleic acid bases that form a

π

-

π

interaction, and an amino acid that forms both a cation-

π

, amino-

π

or His-

π

interaction with one of the nucleic acid bases and an H-bond with the next base along the base stack. As mentioned in Section 2.2, for PInteract to be able to detect such motifs, H-bonds must be identified beforehand using the HBPLUS program [32]. An example of four successive stair motifs is shown in Figure 2d, where five stacked nucleobases make H-bonds and His-

π

, cation-

π

and amino-

π

interactions with a His, two Arg and an Asn of a zinc finger protein (PDB ID: 1A1G).

3.3. Large-Scale Analysis of $π$ -Involving Interactions

In order to show the applicability of our tool for large-scale analysis of

π

-involving interactions, we investigated the relative frequency of each type of

π

interaction across the proteins in the six

D

datasets described in Section 2.4: monomeric proteins, homodimers, heterodimers, antibody-antigen complexes, TCR-pMHC complexes, and protein-DNA and RNA complexes. These frequencies were computed as the number of interactions identified by PInteract divided by the total number of interactions in the protein (see Section 2.4 for details). Note that in doing so, we did not normalize for amino acid frequencies. Our objective was to compare the types of interactions that contribute most to protein and protein complex stabilization in different contexts, independently of sequence composition.

As shown in Table 1 and Figure 3, protein-DNA/RNA interfaces contain, on the average, by far the highest numbers of cation-

π

and

π

-

π

interactions, although with very large standard deviations, and relatively few other types of

π

-involving interactions. This is largely due to the frequent occurrence of positively charged amino acids in the vicinity of the negatively charged DNA or RNA sugar-phosphate backbone and their recurrent tendency to form stair motifs [16,17]; aromatic amino acids also establish stabilizing interactions with nucleobases or sugars of the sugar-phosphate backbone of DNA or RNA molecules [45]. The large standard deviation arises from the fact that in some complexes, the interactions involve the sugar–phosphate backbone rather than the nucleobases.

Antibody-antigen interfaces rank second for cation-

π

and

π

-

π

interactions and also contain numerous amino-

π

interactions; the role of these interactions in antibody–antigen binding is well established [43,46]. TCR–pMHC binding interfaces display numerous

π

-

π

interactions—fewer than antibody-antigen interfaces, but more than observed in monomers, heterodimers, and homodimers. The amount of cation-

π

interactions is the same in TCR–pMHC and heterodimer interfaces and higher than in monomers. The proportion of amino-

π

and His-

π

is identical in TCR-pMHC and antibody-antigen interfaces and larger than in heterodimer interfaces and in monomers. In contrast, sulfur-

π

interactions are rare in general, and even rarer in TCR-pMHC interfaces. Notably, the homodimer interfaces contain the lowest number of

π

-involving interactions in general. These different tendencies are statistically significant, as shown in Supplementary Table S1. However, it should be emphasized that these represent average trends. The standard deviations are very large (Table 1), indicating substantial variability between proteins.

To investigate the characteristic distances between the functional groups involved in

π

interactions and evaluate the relevance of our predefined distance thresholds

d_{m a x}

, we analyzed the distribution of distances d (defined in Section 2.3.2) for cation-

π

, amino-

π

, His-

π

, sulfur-

π

, and

π

-

π

interactions detected in dataset

D_{monomer}

. The results are shown in Figure 4. Clearly, as a general trend, the interaction distances consistently exceed 3 Å. The d-distributions exhibit characteristic peaks at different values according to the interaction types: 3.6 Å for cation-

π

, amino-

π

, and His-

π

interactions, and 3.8 Å for sulfur-

π

and

π

-

π

interactions. Interestingly, the peak is less marked for sulfur-

π

interactions, showing that the geometry of this type of interaction is less constrained. Our analysis also confirms that the chosen thresholds for each interaction type are reasonable, as these are located near the tail of the distribution, corresponding to low-density regions. Note again that the different distance thresholds can be modified by the users.

We also analyzed the distribution of

α

angle values, defined in Section 2.3.3, which represent the position of the functional group of partner 2 above (or below) the aromatic ring of partner 1. A value of

α = 0

indicates that this functional group is just above (or below) the aromatic ring’s center. We plotted the frequency of

(1 - cos α)

values, which represent the lateral displacement relative to the ring’s center, for the different interaction types observed in

D_{monomer}

. As shown in Figure 5, all

(1 - cos α)

values have an almost equal frequency up to

α

angles of about 30° (

1 - cos 30^{\circ} \approx 0.13

) for all interaction types except

π

-

π

. This indicates that the formation of

π

interactions requires the interacting partner to be clearly above (or below) the aromatic plane, where the

π

-orbital is situated; however, no clear preference for a particular displacement from the ring’s center is observed, which reflects a certain plasticity of the interaction geometry. For

α

angles larger than about 45° (

1 - cos 45^{\circ} \approx 0.23

), the frequency of all non-

π

-

π

interactions totally vanishes. The fact that the decrease is gradual is related to the fact that the angle threshold value

α_{m a x}

depends on the distance d between interacting partners and is thus not clear-cut.

In contrast,

π

-

π

interactions exhibit a clear preference for small displacement values, with

α

values between

0^{\circ}

and about

18^{\circ}

(

0 \leq 1 - cos 18^{\circ} \leq 0.05

). This means that the closest atom of the aromatic ring of partner 2 is preferably situated above the centroid of the aromatic ring of partner 1. It has however to be noted that these displacements are not always computed between the centroids of the two interacting aromatic rings, but often between an atom of one aromatic ring and the center of the other ring. This can explain the overrepresentation of small

α

angles. Also, larger displacements are observed for this type of

π

interactions, up to (

1 - cos 53^{\circ} \approx 0.40

); this is due to the larger

α_{m a x}

value accepted for this type of interaction (Section 2.3.3).

The same tendencies are observed in the

(1 - cos α)

distributions of antibody-antigen complexes, TCR-pMHC, homodimers, and heterodimers. In contrast, at protein-DNA and protein-RNA interfaces, larger angles are also observed, with displacement values up to (

1 - cos 60^{\circ} = 0.5

) (see Supplementary Figure S1b).

Finally, we computed the distribution of

β

angles, or rather of the

(1 - cos β)

displacement values, where

β

is the angle between the planes of the functional groups of the two interacting partners, when these groups are planar (Section 2.3.4). As shown in Figure 6, this distribution varies markedly depending on the type of

π

interaction. For Arg-involving cation-

π

interactions, smaller

(1 - cos β)

values—corresponding to parallel, stacked conformations—are strongly preferred. A similar but slightly less marked trend is observed for amino-

π

and His-

π

interactions. To an even much lesser extent, this trend is also observed for

π

-

π

interactions. However, these interactions tend to accommodate a wide range of conformations, from stacked to T-shaped, with almost no preference, suggesting a high plasticity in interaction geometry. Note that these trends are similar to those observed in the

(1 - cos β)

distributions of all considered sets of protein-protein interfaces. At protein–DNA and protein–RNA interfaces, however,

π

–

π

interactions display a stronger preference for stacked conformations, which can be explained by their ability to partially intercalate into the nucleobase stack (see Supplementary Figure S2b).

To further evaluate whether these observed preferences of

α

and

β

angles are preserved in computationally predicted structures, we applied PInteract to selected subsets of the AlphaFold Protein Structure Database (AlphaFoldDB) [47], as described in Supplementary Section S3. As shown in Supplementary Figures S3 and S4, the resulting distributions of

α

and

β

angles have a high degree of consistency with those observed in experimentally solved crystal structures, confirming that models predicted by AlphaFold [48] largely retain the geometric characteristics of natural

π

interactions. A minor exception is the attenuated peak of the

β

angle distribution for small

β

values, suggesting a mild underestimation by AlphaFold of the preference for stacked conformations.

3.4. Relationship Between $π$ Interactions and Protein Solubility and Aggregation

Solubility and aggregation are important protein properties that have huge impacts on protein fitness. Molecular mechanisms underlying protein solubility and aggregation have attracted a lot of attention but are still challenging to explain. They are complex properties that arise from the interplay of various forces in both the folded and unfolded or aggregated states. One of the results in this context is that interactions known to decrease solubility and favor aggregation involve delocalized

π

-electrons [49,50,51,52].

Here, we investigated the role of intrachain

π

-involving interactions in protein solubility. For this purpose, we used the dataset of proteins with known structure and solubility values which was set up in [53]. This set contains curated entries from high-throughput experiments on the E. coli proteome [54] and S. cerevisae [55] using a cell-free expression system. Each entry in this dataset is a single-chain protein. We refer to this dataset as

D_{solubility}

.

We first computed the Pearson correlation between solubility values and the frequency of residues capable of forming

π

interactions; the results are shown in Table 2. The correlation is statistically significant and most strongly negative for aromatic residues (

ρ = - 0.17

), followed by Arg (

ρ = - 0.09

). His, Met/Cys, and Asn/Gln do not exhibit any significant solubility preference. In contrast, Lys showed a positive correlation with solubility (

ρ = 0.16

), consistent with the well-known Arg/Lys trend: soluble proteins tend to be depleted in arginine and enriched in lysine [52].

We further assessed whether the observed correlations differ when considering only residues engaged in

π

interactions, as identified by PInteract, compared to those that are not. As reported in Table 2, Arg, Met/Cys and aromatic residues involved in intrachain Arg-

π

, sulfur-

π

, and

π

-

π

interactions, respectively, tend to reduce solubility relative to the same residue types not participating in such interactions. Conversely, Lys exhibits the opposite trend, favoring solubility when it is not engaged in a

π

interaction. Our results thus indicate that interactions involving delocalized electronic

π

-systems appear to play a significant role in solubility mechanisms. This is further supported by the negative correlation of approximately −0.17 observed between protein solubility and the frequency of all detected

π

interactions.

As a further application of PInteract, we investigated

π

-involving interactions in the stabilization of fibril structures, which is still a matter of debate. Although aromatic residues are not strictly required for amyloid formation, they are known to facilitate and promote the fibril assembly process [56]. Specifically, we analyzed the set of fibril structures provided in [57]; we refer to this set as

D_{fibril}

. We observed that the system is dominated by

π

-

π

interactions, which account for as much as 91% of all detected

π

interaction types. We examined the geometric characteristics of these interactions by comparing the distributions of the

α

and

β

angles in both fibrilar and monomeric states. Although no significant difference was observed in the

α

angle distributions, the

β

angle distributions differ markedly between the two sets, as shown in Figure 7. In particular, fibrilar structures exhibit an almost exclusive preference for stacked conformations, likely driven by the geometric symmetry and packing constraints of extended amyloid aggregates, as illustrated for the A

β

40 fibril [58] in Supplementary Figure S5. In contrast, structures of soluble monomeric proteins display a much broader distribution of

β

angles (see Figure 6), reflecting a greater conformational variability.

3.5. PInteract: Fast, Scalable, and User-Friendly

PInteract is extremely easy to install and use—it only requires a standard GCC compiler and can be set up simply by cloning our GitHub repository. It is implemented in C, which makes it exceptionally fast and well-suited for large-scale analyses involving millions of structures. PInteract scales efficiently with the size of the dataset: a few thousands of structures can be processed in less than a minute on a standard laptop, and 100,000 structures require only 6 min (for details, see Supplementary Section S5). This level of efficiency makes PInteract particularly valuable for high-throughput structural screening tasks, as we have also highlighted in the previous sections.

4. Conclusions

With the recent breakthroughs in deep learning-based methods for biomolecular structure prediction, an unprecedented number of structures have been generated. We now have access to over a billion predicted protein structures, covering virtually the entire tree of life. This explosion of structural data calls for computational tools capable of efficiently analyzing these structures and extracting meaningful biological insights.

Here, we introduced PInteract, a fast computational tool designed to identify individual

π

interactions and clusters or chains of

π

interactions in proteins and in protein-protein, protein-RNA, protein-DNA, and protein-ligand complexes. Moreover, PInteract is the only automatic program that identifies stair motifs at protein-DNA and protein-RNA interfaces, which are recurrent motifs combining specific H-bonds and

π

interactions. We illustrated PInteract’s capabilities across several case studies, demonstrating its effectiveness in detecting

π

interactions in proteins as well as in protein complexes. We also showed that PInteract can perform large-scale computations on hundreds of thousands of protein structures within minutes, enabling the analysis of the geometric properties of

π

-involving interactions across large structural databases. Finally, we demonstrated how PInteract can be used to gain meaningful insights into the role of

π

interactions in protein solubility and aggregation.

In summary, we are confident that PInteract, with its ease of use and high computational efficiency, will serve as a valuable resource for the scientific community, enabling the systematic study of

π

-involving interactions and shedding light on the wide spectrum of molecular mechanisms in which these interactions play a pivotal role, from protein folding and stability to biomolecular recognition and aggregation.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biom15081204/s1.

Author Contributions

Conceptualization, F.P. and M.R.; methodology, M.R.; software, M.R.; validation, D.L., F.P. and M.R.; data curation, D.L.; writing—original draft preparation, D.L.; writing—review and editing, F.P. and M.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the FNRS-Belgian Fund for Scientific Research through a PDR project, and by a scholarship from the China Scholarship Council for D.L.

Data Availability Statement

Data available in a publicly accessible repository: https://github.com/3BioCompBio/PInteract, accessed on 19 August 2025.

Conflicts of Interest

The authors declare no conflicts of interest.

Correction Statement

This article has been republished with a minor correction to Data Availability Statement. This change does not affect the scientific content of the article.

References

Salonen, L.M.; Ellermann, M.; Diederich, F. Aromatic rings in chemical and biological recognition: Energetics and structures. Angew. Chem. Int. Ed. 2011, 50, 4808–4842. [Google Scholar] [CrossRef]
Calinsky, R.; Levy, Y. Aromatic Residues in Proteins: Re-Evaluating the Geometry and Energetics of π–π, Cation-π, and CH-π Interactions. J. Phys. Chem. B 2024, 128, 8687–8700. [Google Scholar] [CrossRef] [PubMed]
Burley, S.; Petsko, G.A. Aromatic-aromatic interaction: A mechanism of protein structure stabilization. Science 1985, 229, 23–28. [Google Scholar] [CrossRef] [PubMed]
Dougherty, D.A. Cation-π interactions in chemistry and biology: A new view of benzene, Phe, Tyr, and Trp. Science 1996, 271, 163–168. [Google Scholar] [CrossRef] [PubMed]
Wintjens, R.; Liévin, J.; Rooman, M.; Buisine, E. Contribution of cation-π interactions to the stability of protein-DNA complexes. J. Mol. Biol. 2000, 302, 393–408. [Google Scholar] [CrossRef]
Dougherty, D.A. The Cation-π Interaction in Chemistry and Biology. Chem. Rev. 2025, 125, 2793–2808. [Google Scholar] [CrossRef]
Imai, Y.N.; Inoue, Y.; Nakanishi, I.; Kitaura, K. Amide–π interactions between formamide and benzene. J. Comput. Chem. 2009, 30, 2267–2276. [Google Scholar] [CrossRef]
Cauët, E.; Rooman, M.; Wintjens, R.; Liévin, J.; Biot, C. Histidine-aromatic interactions in proteins and protein- ligand complexes: Quantum chemical study of X-ray and model structures. J. Chem. Theory Comput. 2005, 1, 472–483. [Google Scholar] [CrossRef]
Shimba, N.; Serber, Z.; Ledwidge, R.; Miller, S.M.; Craik, C.S.; Dötsch, V. Quantitative identification of the protonation state of histidines in vitro and in vivo. Biochemistry 2003, 42, 9227–9234. [Google Scholar] [CrossRef]
Calinsky, R.; Levy, Y. Histidine in Proteins: PH-Dependent Interplay between π–π, Cation–π, and CH–π Interactions. J. Chem. Theory Comput. 2024, 20, 6930–6945. [Google Scholar] [CrossRef]
Reid, K.; Lindley, P.; Thornton, J. Sulphur-aromatic interactions in proteins. FEBS Lett. 1985, 190, 209–213. [Google Scholar] [CrossRef]
Motherwell, W.B.; Moreno, R.B.; Pavlakos, I.; Arendorf, J.R.; Arif, T.; Tizzard, G.J.; Coles, S.J.; Aliev, A.E. Noncovalent interactions of π systems with sulfur: The atomic chameleon of molecular recognition. Angew. Chem. 2018, 130, 1207–1212. [Google Scholar] [CrossRef]
Nunar, L.; Aliev, A.E. Quantification of the Strength of π-Noncovalent Interactions in Molecular Balances using Density Functional Methods. Chemistry-Methods 2023, 3, e202200044. [Google Scholar] [CrossRef]
Daeffler, K.N.M.; Lester, H.A.; Dougherty, D.A. Functionally important aromatic–aromatic and sulfur-π interactions in the D2 dopamine receptor. J. Am. Chem. Soc. 2012, 134, 14890–14896. [Google Scholar] [CrossRef] [PubMed]
Lanzarotti, E.; Biekofsky, R.R.; Estrin, D.A.; Marti, M.A.; Turjanski, A.G. Aromatic–aromatic interactions in proteins: Beyond the dimer. J. Chem. Inf. Model. 2011, 51, 1623–1633. [Google Scholar] [CrossRef] [PubMed]
Rooman, M.; Liévin, J.; Buisine, E.; Wintjens, R. Cation–π/H-bond stair motifs at protein–DNA interfaces. J. Mol. Biol. 2002, 319, 67–76. [Google Scholar] [CrossRef]
Biot, C.; Wintjens, R.; Rooman, M. Stair motifs at protein- DNA interfaces: Nonadditivity of H-bond, stacking, and cation-π interactions. J. Am. Chem. Soc. 2004, 126, 6220–6221. [Google Scholar] [CrossRef]
Kumar, K.; Woo, S.M.; Siu, T.; Cortopassi, W.A.; Duarte, F.; Paton, R.S. Cation–π interactions in protein–ligand binding: Theory and data-mining reveal different roles for lysine and arginine. Chem. Sci. 2018, 9, 2655–2665. [Google Scholar] [CrossRef]
Kumar, N.; Gaur, A.S.; Sastry, G.N. A perspective on the nature of cation-π interactions. J. Chem. Sci. 2021, 133, 97. [Google Scholar] [CrossRef]
Shah, D.; Li, J.; Shaikh, A.R.; Rajagopalan, R. Arginine–aromatic interactions and their effects on arginine-induced solubilization of aromatic solutes and suppression of protein aggregation. Biotechnol. Prog. 2012, 28, 223–231. [Google Scholar] [CrossRef]
Zauhar, R.J.; Colbert, C.L.; Morgan, R.S.; Welsh, W.J. Evidence for a strong sulfur-aromatic interaction derived from crystallographic data. Biopolymers 2000, 53, 233–248. [Google Scholar] [CrossRef]
Krainer, G.; Welsh, T.J.; Joseph, J.A.; Espinosa, J.R.; Wittmann, S.; De Csilléry, E.; Sridhar, A.; Toprakcioglu, Z.; Gudiškytė, G.; Czekalska, M.A.; et al. Reentrant liquid condensate phase of proteins is stabilized by hydrophobic and non-ionic interactions. Nat. Commun. 2021, 12, 1085. [Google Scholar] [CrossRef]
Zhao, H.; Tang, L.; Fang, Y.; Liu, C.; Ding, W.; Zang, S.; Chen, Y.; Xu, W.; Yuan, Y.; Fang, D.; et al. Manipulating Cation–π Interactions of Reader Proteins in Living Cells with Genetic Code Expansion. J. Am. Chem. Soc. 2023, 145, 16406–16416. [Google Scholar] [CrossRef]
Valley, C.C.; Cembran, A.; Perlmutter, J.D.; Lewis, A.K.; Labello, N.P.; Gao, J.; Sachs, J.N. The Methionine-aromatic Motif Plays a Unique Role in Stabilizing Protein Structure. J. Biol. Chem. 2012, 287, 34979–34991. [Google Scholar] [CrossRef]
Kumar, Y.B.; Kumar, N.; John, L.; Mahanta, H.J.; Vaikundamani, S.; Nagamani, S.; Sastry, G.M.; Sastry, G.N. Analyzing the cation-aromatic interactions in proteins: Cation-aromatic database V2.0. Proteins Struct. Funct. Bioinform. 2024, 92, 179–191. [Google Scholar] [CrossRef]
Chourasia, M.; Sastry, G.M.; Sastry, G.N. Aromatic–Aromatic Interactions Database, A2ID: An analysis of aromatic π-networks in proteins. Int. J. Biol. Macromol. 2011, 48, 540–552. [Google Scholar] [CrossRef] [PubMed]
Tina, K.G.; Bhadra, R.; Srinivasan, N. PIC: Protein Interactions Calculator. Nucleic Acids Res. 2007, 35, W473–W476. [Google Scholar] [CrossRef] [PubMed]
Biot, C.; Buisine, E.; Kwasigroch, J.M.; Wintjens, R.; Rooman, M. Probing the energetic and structural role of amino acid/nucleobase cation-π interactions in protein-ligand complexes. J. Biol. Chem. 2002, 277, 40816–40822. [Google Scholar] [CrossRef] [PubMed]
Biot, C.; Buisine, E.; Rooman, M. Free-energy calculations of protein- ligand cation-π and amino- π interactions: From vacuum to proteinlike environments. J. Am. Chem. Soc. 2003, 125, 13988–13994. [Google Scholar] [CrossRef] [PubMed]
Minoux, H.; Chipot, C. Cation-π Interactions in Proteins: Can Simple Models Provide an Accurate Description? J. Am. Chem. Soc. 1999, 121, 10366–10372. [Google Scholar] [CrossRef]
Gallivan, J.P.; Dougherty, D.A. A Computational Study of Cation-π Interactions vs Salt Bridges in Aqueous Media: Implications for Protein Engineering. J. Am. Chem. Soc. 2000, 122, 870–874. [Google Scholar] [CrossRef]
McDonald, I.K.; Thornton, J.M. Satisfying hydrogen bonding potential in proteins. J. Mol. Biol. 1994, 238, 777–793. [Google Scholar] [CrossRef]
Berman, H.M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T.N.; Weissig, H.; Shindyalov, I.N.; Bourne, P.E. The protein data bank. Nucleic Acids Res. 2000, 28, 235–242. [Google Scholar] [CrossRef]
Wang, G.; Dunbrack, R.L., Jr. PISCES: A protein sequence culling server. Bioinformatics 2003, 19, 1589–1591. [Google Scholar] [CrossRef]
HomodimerDB. Available online: https://seq2fun.dcmb.med.umich.edu/HomodimerDB (accessed on 9 August 2025).
Gowthaman, R.; Pierce, B.G. TCR3d: The T cell receptor structural repertoire database. Bioinformatics 2019, 35, 5323–5325. [Google Scholar] [CrossRef] [PubMed]
Pierce, B.G.; Weng, Z. A flexible docking approach for prediction of T cell receptor–peptide–MHC complexes. Protein Sci. 2013, 22, 35–46. [Google Scholar] [CrossRef]
Dunbar, J.; Krawczyk, K.; Leem, J.; Baker, T.; Fuchs, A.; Georges, G.; Shi, J.; Deane, C.M. SAbDab: The structural antibody database. Nucleic Acids Res. 2014, 42, D1140–D1146. [Google Scholar] [CrossRef] [PubMed]
Schneider, C.; Raybould, M.I.; Deane, C.M. SAbDab in the age of biotherapeutics: Updates including SAbDab-nano, the nanobody structure tracker. Nucleic Acids Res. 2022, 50, D1368–D1372. [Google Scholar] [CrossRef]
Raybould, M.I.; Marks, C.; Lewis, A.P.; Shi, J.; Bujotzek, A.; Taddese, B.; Deane, C.M. Thera-SAbDab: The therapeutic structural antibody database. Nucleic Acids Res. 2020, 48, D383–D388. [Google Scholar] [CrossRef] [PubMed]
Mitra, R.; Cohen, A.S.; Sagendorf, J.M.; Berman, H.M.; Rohs, R. DNAproDB: An updated database for the automated and interactive analysis of protein–DNA complexes. Nucleic Acids Res. 2025, 53, D396–D402. [Google Scholar] [CrossRef]
Mitra, R.; Cohen, A.S.; Tang, W.Y.; Hosseini, H.; Hong, Y.; Berman, H.M.; Rohs, R. RNAproDB: A Webserver and Interactive Database for Analyzing Protein–RNA Interactions. J. Mol. Biol. 2025, 437, 169012. [Google Scholar] [CrossRef]
Dalkas, G.A.; Teheux, F.; Kwasigroch, J.M.; Rooman, M. Cation–π, amino–π, π–π, and H-bond interactions stabilize antigen–antibody interfaces. Proteins Struct. Funct. Bioinform. 2014, 82, 1734–1746. [Google Scholar] [CrossRef]
Akiba, H.; Tsumoto, K. Thermodynamics of antibody–antigen interaction revealed by mutation analysis of antibody variable regions. J. Biochem. 2015, 158, 1–13. [Google Scholar] [CrossRef] [PubMed]
Wilson, K.A.; Kellie, J.L.; Wetmore, S.D. DNA–protein π-interactions in nature: Abundance, structure, composition and strength of contacts between aromatic amino acids and DNA nucleobases or deoxyribose sugar. Nucleic Acids Res. 2014, 42, 6726–6741. [Google Scholar] [CrossRef]
Wu, D.; Sun, J.; Xu, T.; Wang, S.; Li, G.; Li, Y.; Cao, Z. Stacking and energetic contribution of aromatic islands at the binding interface of antibody proteins. Immunome Res. 2010, 6, S1. [Google Scholar] [CrossRef]
Varadi, M.; Bertoni, D.; Magana, P.; Paramval, U.; Pidruchna, I.; Radhakrishnan, M.; Tsenkov, M.; Nair, S.; Mirdita, M.; Yeo, J.; et al. AlphaFold Protein Structure Database in 2024: Providing structure coverage for over 214 million protein sequences. Nucleic Acids Res. 2024, 52, D368–D375. [Google Scholar] [CrossRef] [PubMed]
Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583–589. [Google Scholar] [CrossRef] [PubMed]
Hou, Q.; Bourgeas, R.; Pucci, F.; Rooman, M. Computational analysis of the amino acid interactions that promote or decrease protein solubility. Sci. Rep. 2018, 8, 14661. [Google Scholar] [CrossRef]
Vernon, R.M.; Chong, P.A.; Tsang, B.; Kim, T.H.; Bah, A.; Farber, P.; Lin, H.; Forman-Kay, J.D. Pi-Pi contacts are an overlooked protein feature relevant to phase separation. eLife 2018, 7, e31486. [Google Scholar] [CrossRef]
Gazit, E. A possible role for π-stacking in the self-assembly of amyloid fibrils. FASEB J. 2002, 16, 77–83. [Google Scholar] [CrossRef]
Warwicker, J.; Charonis, S.; Curtis, R.A. Lysine and arginine content of proteins: Computational analysis suggests a new tool for solubility design. Mol. Pharm. 2014, 11, 294–303. [Google Scholar] [CrossRef]
Hou, Q.; Kwasigroch, J.M.; Rooman, M.; Pucci, F. SOLart: A structure-based method to predict protein solubility and aggregation. Bioinformatics 2020, 36, 1445–1452. [Google Scholar] [CrossRef] [PubMed]
Niwa, T.; Ying, B.W.; Saito, K.; Jin, W.; Takada, S.; Ueda, T.; Taguchi, H. Bimodal protein solubility distribution revealed by an aggregation analysis of the entire ensemble of Escherichia coli proteins. Proc. Natl. Acad. Sci. USA 2009, 106, 4201–4206. [Google Scholar] [CrossRef] [PubMed]
Uemura, E.; Niwa, T.; Minami, S.; Takemoto, K.; Fukuchi, S.; Machida, K.; Imataka, H.; Ueda, T.; Ota, M.; Taguchi, H. Large-scale aggregation analysis of eukaryotic proteins reveals an involvement of intrinsically disordered regions in protein folding. Sci. Rep. 2018, 8, 678. [Google Scholar] [CrossRef] [PubMed]
Stanković, I.M.; Niu, S.; Hall, M.B.; Zarić, S.D. Role of aromatic amino acids in amyloid self-assembly. Int. J. Biol. Macromol. 2020, 156, 949–959. [Google Scholar] [CrossRef]
van der Kant, R.; Louros, N.; Schymkowitz, J.; Rousseau, F. Thermodynamic analysis of amyloid fibril structures reveals a common framework for stability in amyloid polymorphs. Structure 2022, 30, 1178–1189. [Google Scholar] [CrossRef]
Kollmer, M.; Close, W.; Funk, L.; Rasmussen, J.; Bsoul, A.; Schierhorn, A.; Schmidt, M.; Sigurdson, C.J.; Jucker, M.; Fändrich, M. Cryo-EM structure and polymorphism of Aβ amyloid fibrils purified from Alzheimer’s brain tissue. Nature Comm. 2019, 10, 4760. [Google Scholar] [CrossRef]

Figure 1. Geometric definition of

π

-involving interactions. (a) The aromatic moiety of partner 1 (here Trp) is in red; the centroid of its 6-atom ring is labeled C. The closest atoms between the two partners are atom B of the 6-atom aromatic ring and atom A of the functional group of partner 2, represented by a blue ball; their distance is equal to d. A must be inside a cylinder, whose lower base is a circle of radius

r_{m a x}

centered on C and its upper base passes through A and is centered on

C^{'}

. The angle between vector

\vec{C A}

and vector

\vec{C C^{'}}

which is normal to the cylinder’s bases, is called

α

. For A being inside the cylinder,

α

must be

\leq α_{m a x}

, where

α_{m a x}

is the angle between the normal vector

\vec{C C^{'}}

and vector

\vec{C A^{'}}

, where

A^{'}

is on the upper base’s edge and on the diameter passing through A and

C^{'}

. (b) Partner 1 (Trp) is in red and partner 2 (Arg), in blue. The planes formed by the 6-atom aromatic ring of Trp and by Arg’s guanidinium group are shown in red and blue, respectively, as well as the vectors normal to these planes. The angle between these vectors, called

β

, taken between

0^{\circ}

and

90^{\circ}

, is shown on a gray background and measures the deviation from the parallelism between the two planes.

Figure 1. Geometric definition of

π

-involving interactions. (a) The aromatic moiety of partner 1 (here Trp) is in red; the centroid of its 6-atom ring is labeled C. The closest atoms between the two partners are atom B of the 6-atom aromatic ring and atom A of the functional group of partner 2, represented by a blue ball; their distance is equal to d. A must be inside a cylinder, whose lower base is a circle of radius

r_{m a x}

centered on C and its upper base passes through A and is centered on

C^{'}

. The angle between vector

\vec{C A}

and vector

\vec{C C^{'}}

which is normal to the cylinder’s bases, is called

α

. For A being inside the cylinder,

α

must be

\leq α_{m a x}

, where

α_{m a x}

is the angle between the normal vector

\vec{C C^{'}}

and vector

\vec{C A^{'}}

, where

A^{'}

is on the upper base’s edge and on the diameter passing through A and

C^{'}

. (b) Partner 1 (Trp) is in red and partner 2 (Arg), in blue. The planes formed by the 6-atom aromatic ring of Trp and by Arg’s guanidinium group are shown in red and blue, respectively, as well as the vectors normal to these planes. The angle between these vectors, called

β

, taken between

0^{\circ}

and

90^{\circ}

, is shown on a gray background and measures the deviation from the parallelism between the two planes.

Figure 2. Illustration of

π

-involving interactions. (a) Example of

π

interactions at a protein-protein interface. Cation-

π

interaction between Lys Y97 and Tyr H33 and

π

-

π

interaction between Tyr H53 and Trp Y62 at the interface between a hen egg lysozyme and its cognate antibody (PDB ID: 2QDJ). Chain Y is the antigen depicted in pink; chain H is the antibody’s heavy chain and is shown in green; the light chain is shown in cyan. (b) Example of protein-ligand

π

interactions. ATP molecule C739 (in green spheres with its Ade nucleobase in green sticks), Phe B182 (in blue sticks) and Met B127 (in orange sticks) in an NAD kinase from Archaeoglobus fulgidus (PDB ID: 1Z0S). (c) Example of

π

-chains. Residues Trp 78, Trp 88, Phe 92, Arg 146, His 205, Trp 262, Arg 265, His 298, Met 301 in chain A of a proteobacterium’s carboxylesterase (PDB ID: 1QLW). Aromatic side chains are in green, sulfur-containing side chains in red, histidines in yellow, and positively charged side chains in blue. Backbone atoms are in gray. (d) Example of stair motifs at the protein-DNA interface. Nucleic acid bases Thy B5, Gua B6, Gua B7, Gua B8, Cyt B9 and residues His A149, Arg A146, Arg A124 and Asn A 121 in a zinc finger-DNA complex in Mus musculus (PDB ID: 1A1G). Nucleic acid bases are depicted as blue sticks and amino acid side chains, as red sticks. In the lower left corner, the H-bonds are depicted as green lines and the cation-

π

interaction as an orange line.

Figure 2. Illustration of

π

-involving interactions. (a) Example of

π

interactions at a protein-protein interface. Cation-

π

interaction between Lys Y97 and Tyr H33 and

π

-

π

interaction between Tyr H53 and Trp Y62 at the interface between a hen egg lysozyme and its cognate antibody (PDB ID: 2QDJ). Chain Y is the antigen depicted in pink; chain H is the antibody’s heavy chain and is shown in green; the light chain is shown in cyan. (b) Example of protein-ligand

π

interactions. ATP molecule C739 (in green spheres with its Ade nucleobase in green sticks), Phe B182 (in blue sticks) and Met B127 (in orange sticks) in an NAD kinase from Archaeoglobus fulgidus (PDB ID: 1Z0S). (c) Example of

π

-chains. Residues Trp 78, Trp 88, Phe 92, Arg 146, His 205, Trp 262, Arg 265, His 298, Met 301 in chain A of a proteobacterium’s carboxylesterase (PDB ID: 1QLW). Aromatic side chains are in green, sulfur-containing side chains in red, histidines in yellow, and positively charged side chains in blue. Backbone atoms are in gray. (d) Example of stair motifs at the protein-DNA interface. Nucleic acid bases Thy B5, Gua B6, Gua B7, Gua B8, Cyt B9 and residues His A149, Arg A146, Arg A124 and Asn A 121 in a zinc finger-DNA complex in Mus musculus (PDB ID: 1A1G). Nucleic acid bases are depicted as blue sticks and amino acid side chains, as red sticks. In the lower left corner, the H-bonds are depicted as green lines and the cation-

π

interaction as an orange line.

Figure 3. Comparison between the average relative frequency of each type of

π

interactions in the datasets

D_{monomer}

,

D_{homodimer}

,

D_{heterodimer}

,

D_{TCRpMHC}

,

D_{AbAg}

and

D_{protNA}

. The relative frequencies were computed as the number of interactions identified by PInteract divided by the total number of interactions in the protein, as defined in Section 2.4. For homo- and heterodimers, only the interactions occurring at the dimer interface were counted; for antibody-antigen complexes, only interactions that link one of the two antibody chains with the antigen; for TCR-pMHC, only interactions between one of the two TCR chains and the pMHC molecule; and for protein-DNA/RNA, only interactions linking the protein with DNA or RNA.

Figure 3. Comparison between the average relative frequency of each type of

π

interactions in the datasets

D_{monomer}

,

D_{homodimer}

,

D_{heterodimer}

,

D_{TCRpMHC}

,

D_{AbAg}

and

D_{protNA}

. The relative frequencies were computed as the number of interactions identified by PInteract divided by the total number of interactions in the protein, as defined in Section 2.4. For homo- and heterodimers, only the interactions occurring at the dimer interface were counted; for antibody-antigen complexes, only interactions that link one of the two antibody chains with the antigen; for TCR-pMHC, only interactions between one of the two TCR chains and the pMHC molecule; and for protein-DNA/RNA, only interactions linking the protein with DNA or RNA.

Figure 4. Distribution of the distances d between functional groups of

π

-involving interactions, defined in Section 2.3.2, for all considered types of

π

interactions detected in the

D_{monomer}

dataset.

Figure 4. Distribution of the distances d between functional groups of

π

-involving interactions, defined in Section 2.3.2, for all considered types of

π

interactions detected in the

D_{monomer}

dataset.

Figure 5. Distribution of the values of the lateral displacement with respect to the aromatic rings’ center,

(1 - cos α)

, in which the

α

angle measures the location of the functional group of interacting partner 2 above (or below) the plane of the aromatic ring of partner 1 (see Section 2.3.3), for different types of

π

interactions detected in the

D_{monomer}

dataset. The smaller the value of

(1 - cos α)

, the more directly the functional group of partner 2 is positioned above (or below) the center of the aromatic ring. See Supplementary Figure S1 for the corresponding

α

angle distributions.

Figure 5. Distribution of the values of the lateral displacement with respect to the aromatic rings’ center,

(1 - cos α)

, in which the

α

angle measures the location of the functional group of interacting partner 2 above (or below) the plane of the aromatic ring of partner 1 (see Section 2.3.3), for different types of

π

interactions detected in the

D_{monomer}

dataset. The smaller the value of

(1 - cos α)

, the more directly the functional group of partner 2 is positioned above (or below) the center of the aromatic ring. See Supplementary Figure S1 for the corresponding

α

angle distributions.

Figure 6. Distribution of

1 - cos β

values, in which the

β

angle measures the degree of parallelism between the interacting planar functional groups (see Section 2.3.4), for different types of

π

interactions detected in the

D_{monomer}

dataset. Parallel or stacked conformations correspond to

β = 0

° and

1 - cos β = 0

; perpendicular or T-shaped conformations, to

β = 90

° and

1 - cos β = 1

. See Supplementary Figure S2 for the corresponding

β

angle distributions.

Figure 6. Distribution of

1 - cos β

values, in which the

β

angle measures the degree of parallelism between the interacting planar functional groups (see Section 2.3.4), for different types of

π

interactions detected in the

D_{monomer}

dataset. Parallel or stacked conformations correspond to

β = 0

° and

1 - cos β = 0

; perpendicular or T-shaped conformations, to

β = 90

° and

1 - cos β = 1

. See Supplementary Figure S2 for the corresponding

β

angle distributions.

Figure 7. Distributions of

(1 - cos α)

and

(1 - cos β)

values of

π

-

π

interactions in the

D_{fibril}

set.

Figure 7. Distributions of

(1 - cos α)

and

(1 - cos β)

values of

π

-

π

interactions in the

D_{fibril}

set.

Table 1. Average relative frequencies of

π

interactions in the set of monomers

D_{monomer}

and in the sets of protein-protein complexes

D_{homodimer}

,

D_{heterodimer}

,

D_{TCRpMHC}

,

D_{AbAg}

, and

D_{protNA}

, computed as the number of interactions identified by PInteract divided by the total number of interactions in the protein (see Section 2.4). The standard deviation is given in parentheses. The associated p-values are given in Supplementary Table S1.

Table 1. Average relative frequencies of

π

interactions in the set of monomers

D_{monomer}

and in the sets of protein-protein complexes

D_{homodimer}

,

D_{heterodimer}

,

D_{TCRpMHC}

,

D_{AbAg}

, and

D_{protNA}

, computed as the number of interactions identified by PInteract divided by the total number of interactions in the protein (see Section 2.4). The standard deviation is given in parentheses. The associated p-values are given in Supplementary Table S1.

Dataset	Cation- $π$	Amino- $π$	His- $π$	Sulfur- $π$	$π$ - $π$
$D_{monomer}$	0.50 (0.46)	0.30 (0.33)	0.14 (0.22)	0.28 (0.37)	1.77 (1.13)
$D_{homodimer}$	0.33 (1.14)	0.17 (0.81)	0.09 (0.51)	0.10 (0.56)	1.13 (2.00)
$D_{heterodimer}$	0.79 (1.80)	0.37 (1.36)	0.14 (0.73)	0.30 (0.89)	1.37 (2.38)
$D_{TCRpMHC}$	0.70 (1.22)	0.86 (1.73)	0.35 (0.84)	0.09 (0.44)	2.12 (2.97)
$D_{AbAg}$	1.72 (3.21)	0.93 (1.93)	0.34 (1.15)	0.20 (0.78)	2.59 (3.89)
$D_{protNA}$	2.02 (12.67)	0.41 (1.08)	0.23 (0.89)	0.47 (4.29)	3.73 (10.41)

Table 2. Pearson correlation coefficients

ρ

between protein solubility and (1) the residue frequency F; (2) the frequency

F_{π}

of residues involved in a

π

interaction; (3) the frequency

F_{n o n - π}

of residues not involved in a

π

interaction, computed from the

D_{solubility}

dataset. A cross (×) indicates that the result is not statistically significant (p-value > 0.01).

Table 2. Pearson correlation coefficients

ρ

between protein solubility and (1) the residue frequency F; (2) the frequency

F_{π}

of residues involved in a

π

interaction; (3) the frequency

F_{n o n - π}

of residues not involved in a

π

interaction, computed from the

D_{solubility}

dataset. A cross (×) indicates that the result is not statistically significant (p-value > 0.01).

	Arg	Lys	Phe/Tyr/Trp	Asn/Gln	His	Met/Cys
F	−0.09	0.14	−0.17	− ${0.09}^{\times}$	− ${0.04}^{\times}$	${0.00}^{\times}$
$F_{π}$	−0.14	− ${0.04}^{\times}$	−0.23	− ${0.05}^{\times}$	− ${0.04}^{\times}$	−0.15
$F_{n o n - π}$	− ${0.07}^{\times}$	0.15	${0.00}^{\times}$	− ${0.09}^{\times}$	− ${0.02}^{\times}$	${0.04}^{\times}$

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Li, D.; Pucci, F.; Rooman, M. PInteract: Detecting Aromatic-Involving Motifs in Proteins and Protein-Nucleic Acid Complexes. Biomolecules 2025, 15, 1204. https://doi.org/10.3390/biom15081204

AMA Style

Li D, Pucci F, Rooman M. PInteract: Detecting Aromatic-Involving Motifs in Proteins and Protein-Nucleic Acid Complexes. Biomolecules. 2025; 15(8):1204. https://doi.org/10.3390/biom15081204

Chicago/Turabian Style

Li, Dong, Fabrizio Pucci, and Marianne Rooman. 2025. "PInteract: Detecting Aromatic-Involving Motifs in Proteins and Protein-Nucleic Acid Complexes" Biomolecules 15, no. 8: 1204. https://doi.org/10.3390/biom15081204

APA Style

Li, D., Pucci, F., & Rooman, M. (2025). PInteract: Detecting Aromatic-Involving Motifs in Proteins and Protein-Nucleic Acid Complexes. Biomolecules, 15(8), 1204. https://doi.org/10.3390/biom15081204

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

PInteract: Detecting Aromatic-Involving Motifs in Proteins and Protein-Nucleic Acid Complexes

Abstract

1. Introduction

2. Materials and Methods

2.1. Types of Interactions

2.2. Design Principles

2.3. Technical Implementation

2.3.1. Functional Groups

2.3.2. Distance Criterion

2.3.3. Angle Criterion

2.3.4. Plane Parallelism Angle

2.4. Structure Datasets’ Construction

3. Results

3.1. The PInteract Algorithm

3.2. Examples of $π$ -Involving Interactions

3.3. Large-Scale Analysis of $π$ -Involving Interactions

3.4. Relationship Between $π$ Interactions and Protein Solubility and Aggregation

3.5. PInteract: Fast, Scalable, and User-Friendly

4. Conclusions

Supplementary Materials

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

Correction Statement

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

PInteract: Detecting Aromatic-Involving Motifs in Proteins and Protein-Nucleic Acid Complexes

Abstract

1. Introduction

2. Materials and Methods

2.1. Types of Interactions

2.2. Design Principles

2.3. Technical Implementation

2.3.1. Functional Groups

2.3.2. Distance Criterion

2.3.3. Angle Criterion

2.3.4. Plane Parallelism Angle

2.4. Structure Datasets’ Construction

3. Results

3.1. The PInteract Algorithm

3.2. Examples of π -Involving Interactions

3.3. Large-Scale Analysis of π -Involving Interactions

3.4. Relationship Between π Interactions and Protein Solubility and Aggregation

3.5. PInteract: Fast, Scalable, and User-Friendly

4. Conclusions

Supplementary Materials

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

Correction Statement

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

3.2. Examples of $π$ -Involving Interactions

3.3. Large-Scale Analysis of $π$ -Involving Interactions

3.4. Relationship Between $π$ Interactions and Protein Solubility and Aggregation