1. Introduction
The enzyme 4-hydroxyphenylpyruvate dioxygenase (4-HPPD) is an Fe(II)-dependent, non-heme oxygenase that catalyzes the conversion of 4-hydroxyphenylpyruvate (4-HPP) to homogentisate (HGA), the second step in the tyrosine catabolic pathway [1]. 4-HPPD is encoded by the HPD gene located on chromosome 12, which is represented by 14 exons and 4 transcripts (splice variants), with 207 orthologues and 1 paralogue, and is associated with 6 phenotypes [1,2]. Almost all organisms express 4-HPPD in the cell cytoplasm (with the exception of some Gram-positive bacteria). The enzyme (393 residues in the human form) has a subunit mass of 40–50 kDa and is organized as a tetramer (typically in bacteria) or a dimer (human) [3]. Two domains characterize it: the vicinal oxygen chelate 1 (VOC 1) domain (residues 18–149) and the VOC 2 domain (residues 180–338) (residue numbering based on the 4-HPPD with UniProt code P32754). The VOC domain defines an enzyme family that catalyzes a wide variety of chemistries derived from a shared fundamental mechanism: bidentate coordination of a divalent metal center (Fe(II)) by a substrate, intermediate, or transition state bearing vicinal oxygen atoms [4]. The active site is located within a barrel-like β-sheet that accommodates a metal-binding site with an iron(II) ion as a cofactor. The iron ion is coordinated by a highly conserved residue triad: H183, H266, and E349 [5].
The primary structure of the enzyme can be divided into a more variable N-terminus and a more conserved C-terminus, which fold into discrete domains. The active site is formed exclusively from residues of the C-terminal region; the function of the N-terminal half of the protein is not known [5,6].
The 4-HPPD C-terminus acts as a gate, regulating access to the active site and isolating the bound substrate during catalysis [7,8]. Coverage of the active site by a C-terminal extension is a recurring feature observed in numerous 2-oxoglutarate-dependent oxygenases [5]. In the human enzyme, the regulatory gating function is performed by a C-terminal tail, whereas in bacterial 4-HPPD enzymes, such as those of Pseudomonas fluorescens [9] or Streptomyces avermitilis [10], a C-terminal α-helix regulates the gating mechanism [9,10,11].
E349 plays a crucial role in catalysis: it is proposed to be indispensable both for the Fe(II)/substrate complex and for the activation of dioxygen upon substrate binding, which is the initial step in 4-HPPD oxygenase activity. H266 plays a key role in regulating the geometry of the reactive oxygen intermediate, facilitating the coupling of the decarboxylation and hydroxylation reactions. H183 plays a critical role in metal ion coordination, determining the correct orientation of the reactive oxygen intermediate in the oxidative reaction; this is an important step in protecting the protein from damage caused by alternative oxidative reactions.
Structural, computational, and spectroscopic studies have shown that binding of the substrate (4-HPP) triggers activation of the metal center, making it susceptible to dioxygen binding [12,13,14,15].
4-HPPD catalysis proceeds in the absence of water molecules inside the active site. This condition is favored by residues F347, F371, P339, R378, and Q375, whose side chains rearrange to close the gate. The water molecules are then displaced, and catalysis of the substrate begins. The C-terminal region plays a crucial role in arranging the residues surrounding the active site, regulating gate opening/closing and, consequently, the start/end of enzyme catalysis [6].
In the 4-HPPD 3D structure, the C-terminal residues N380, L381, T382, and N383 form an interaction network with K247, K248, E254, Y258, and Q375 (close to the active site) that holds the C-terminal α-helix in the correct orientation for gate regulation. Furthermore, the last five residues of the C-terminus (V389, V390, P391, G392, and M393) appear to be key in gate regulation. Unfortunately, no full-length 4-HPPD 3D structure is available to study the biological, functional, and structural role of the enzyme's C-terminal tail.
However, in vitro mutagenesis of C-terminal residues has provided important molecular insights: mutations of Q375 and R378 in human 4-HPPD (two residues belonging to the C-terminal tail) cause a loss of enzyme activity, showing the crucial role of this region for the enzyme's biological function [6]. In detail, the Q375N mutation allows solvent into the active site, leading to the loss of enzyme function, while the R378K mutation abolishes the key R378/E254 structural interaction, showing the role of this interaction in regulating enzyme catalysis [6].
Currently, the absence of a full-length 4-HPPD 3D structure prevents the use of experimental methods to define the enzyme's molecular mechanism, making computational study the only useful approach to investigate the enzyme gating process.
To obtain biological, functional, and structural insights into human 4-HPPD, we applied an integrative and comprehensive bioinformatics pipeline to dissect and explore the dynamic features of the wild-type and experimental mutants to define the molecular mechanism responsible for enzyme gating regulation.
In this study, novel key residues involved in enzyme gating were identified for the first time. These residues provide fine regulation of enzyme activation/inactivation and bring to light important molecular features. Our results could open new frontiers in the target/drug discovery field, not only to further investigate human 4-HPPD-related diseases but also to propose new strategies to finely regulate the enzyme’s biological activity.
2. Methods
The 3D structure (PDB ID: 5EC3) and FASTA sequence (UniProtKB entry: P32754) of human 4-HPPD in the apo state were retrieved from the RCSB Protein Data Bank [16] and UniProt [17], respectively. The mutant 3D structures (Q375N, R378K, Y221A, and K39A) were obtained with the DUET tool [18]. To avoid errors during the molecular dynamics (MD) simulations, missing side chains and steric clashes in the PDB files were corrected through molecular modeling using PyMOD 3.0 (Department of Biochemical Sciences, Sapienza University of Rome, Rome, Italy) [19] with MODELLER 10.5, and validated using PROCHECK [20]. The CHARMM-GUI platform [21] was used to prepare all systems for MD, using the charmm36-mar2019 force field to assign all molecular parameters, while GROMACS 2019.3 (University of Groningen, Royal Institute of Technology, Uppsala University, Uppsala, Sweden) [22] with CUDA support was used to perform the MD runs. In brief, the structures were immersed in a cubic box filled with TIP3P water molecules and counter-ions to balance the net charge of the system. Simulations were run under periodic boundary conditions. The energy of the system was minimized as suggested in previous work [23]: 5000 steps of steepest-descent minimization were set to converge to a minimum energy with forces below 100 kJ/mol/nm. All the cMD simulations were integrated with a time step of 2 fs. A V-rescale thermostat maintained the temperature at 300 K, and a Nosé–Hoover barostat maintained the system pressure at 1 atm with a low damping of 1 ps⁻¹. The LINCS algorithm constrained the lengths of bonds involving hydrogen atoms. Each system was simulated for 1 µs, and each simulation was carried out three times (thus 15 MD runs, for a total time of 15 µs). Next, simulated annealing was performed to assess structural stability upon in silico mutagenesis, setting six annealing times (from 0 ns to 5 ns) and six annealing temperatures (from 300 K to 550 K). All MD analyses were performed with the GROMACS 2019.3 package [22]. PCA was performed on the entire trajectory of each system, considering the three replicates, using Bio3D implemented in RStudio 4.1.2 (Posit, Vienna, Austria), while the free energy landscape was evaluated using the GROMACS 2019.3 package. Secondary structure analyses were carried out with Timeline, a VMD plugin.
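As a quick consistency check on the protocol above, the following minimal Python sketch tallies the number of integration steps per run and the aggregate simulation time from the values stated in the text. The even spacing of the six annealing points is an assumption (the text gives only the end points), and all variable and function names are illustrative.

```python
# Bookkeeping for the MD protocol; numeric values are taken from the text.
TIMESTEP_FS = 2                      # integration time step (fs)
RUN_LENGTH_US = 1                    # length of each production run (µs)
SYSTEMS = ["WT", "Q375N", "R378K", "Y221A", "K39A"]
REPLICAS = 3                         # each system was simulated three times

FS_PER_US = 10**9                    # femtoseconds in one microsecond
steps_per_run = RUN_LENGTH_US * FS_PER_US // TIMESTEP_FS
total_runs = len(SYSTEMS) * REPLICAS
total_time_us = total_runs * RUN_LENGTH_US


def linear_schedule(start, stop, n_points):
    """Return n_points evenly spaced values from start to stop (inclusive)."""
    step = (stop - start) / (n_points - 1)
    return [start + i * step for i in range(n_points)]


# ASSUMPTION: the six annealing points are evenly spaced between the
# stated end points; the actual schedule is not given in the text.
annealing_times_ns = linear_schedule(0, 5, 6)
annealing_temps_k = linear_schedule(300, 550, 6)

print(f"steps per run: {steps_per_run}")
print(f"runs: {total_runs}, aggregate time: {total_time_us} µs")
for t, temp in zip(annealing_times_ns, annealing_temps_k):
    print(f"t = {t:.0f} ns -> T = {temp:.0f} K")
```

Under these values, each 1 µs run corresponds to 5 × 10⁸ integration steps, and five systems in triplicate give the 15 runs reported.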
GRACE [24] was used to generate the MD graphs, and PyMOL 3.0 was used as the molecular graphics interface to produce the images of the biological systems. To confirm the potential key residues, we applied an evolutionary approach using the BLASTp 2.15.0 tool (National Library of Medicine, MD, USA) [25] to perform two different multiple sequence alignments (MSAs) between the human 4-HPPD primary structure and (i) the “Non-redundant protein sequences (nr)” database, excluding the “mammalia” class, and (ii) the “Non-redundant protein sequences (nr)” database, considering only the “mammalia” class. The MSAs were performed using the BLOSUM62 matrix, a word size of 6, and a maximum of 5000 target sequences; all other parameters were left at their default values. The MSA results were analyzed and displayed with the Skylign 1.8.2 tool (HHMI Janelia Farm Research Campus, VA, USA) [26]. All data were computed on a workstation with an AMD Ryzen Threadripper PRO 5965WX 24-core CPU (4.5 GHz, 140 MB cache), 128 GB of Kingston DDR4 3200 MHz RAM, an Nvidia GeForce RTX 4090 24 GB graphics card, and a 2 TB NVMe M.2 SSD plus two 14 TB SATA3 HDDs as storage, running Ubuntu 14.04.
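The per-position conservation that underlies the comparison of the two alignments can be illustrated with a minimal Python sketch. This is not the BLASTp/Skylign workflow itself: the function `column_frequencies` is a hypothetical helper, and the aligned fragments are made up for illustration and are not real 4-HPPD sequences.

```python
from collections import Counter


def column_frequencies(alignment, column):
    """Return {residue: fraction} for one alignment column, ignoring gaps."""
    residues = [seq[column] for seq in alignment if seq[column] != "-"]
    if not residues:
        return {}
    counts = Counter(residues)
    total = len(residues)
    return {aa: n / total for aa, n in counts.items()}


# Toy, made-up aligned fragments (NOT real 4-HPPD sequences).
toy_alignment = [
    "GAYKL",
    "GAYKL",
    "GAFKL",
    "GAF-L",
]

# Column 2 mixes Y and F (a conservative substitution); column 3 is fully
# conserved once the gap is ignored.
print(column_frequencies(toy_alignment, 2))
print(column_frequencies(toy_alignment, 3))
```

Applied column by column, frequencies of this kind are what a sequence logo summarizes visually.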
4. Discussion
Human 4-HPPD catalyzes the conversion of 4-hydroxyphenylpyruvic acid to homogentisic acid, one of the steps in tyrosine catabolism.
Catalysis occurs within the active site. Oxygenases of this type catalyze several kinds of reactions, including hydroxylations, desaturations, and oxidative ring closures, and the reaction environment may have pharmaceutical and medical implications. The conversion of the substrate is very complex, involving oxidative decarboxylation, side-chain migration, and aromatic hydroxylation in a single catalytic cycle.
A barrel-like β-sheet buries the human 4-HPPD active site, which is finely regulated by a C-terminal tail able to cover it. The tail acts as a gate, regulating substrate access to the active site and isolating the bound substrate during catalysis.
Human 4-HPPD alterations have been observed in severe and rare diseases such as tyrosinemia type III and hawkinsinuria, for which no cure is available. Hence, understanding the molecular mechanism of 4-HPPD gating is a crucial step toward regulating 4-HPPD and/or using it as a therapeutic target.
In vitro truncation experiments showed the key role of the C-terminal tail in enzyme activity.
Here, applying an integrative bioinformatics/evolutionary study, we dissected and explored wild-type and mutant 4-HPPD in depth. This brought to light, for the first time, a molecular mechanism regulating enzyme gating, provided the full-length 3D structure of human 4-HPPD, and revealed Y221 and K39 as two novel pivotal amino acid residues that steer and stabilize the conformational change of the C-terminal tail that defines enzyme activation/inactivation.
RMSD analyses confirmed the structural stability of the systems considered in this study and the reliability of the MD protocols. Furthermore, when the backbone RMSD was computed without the C-terminal tail, a significantly lower and more stable RMSD profile was obtained, indicating the highly flexible behavior of the C-tail within the protein structure.
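The comparison between the full-backbone RMSD and the RMSD computed without the tail can be sketched in Python as follows. This is a minimal illustration, not the GROMACS analysis used in the study: it assumes frames are already superposed on the reference (no fitting step), and the coordinates, the atom count, and the choice of which atoms form the "tail" are all synthetic and hypothetical.

```python
import numpy as np


def rmsd_per_frame(traj, reference, atom_mask=None):
    """RMSD of each frame against a reference, without any fitting.

    traj: array of shape (n_frames, n_atoms, 3)
    reference: array of shape (n_atoms, 3)
    atom_mask: optional boolean array selecting the atoms to include
    """
    if atom_mask is not None:
        traj = traj[:, atom_mask, :]
        reference = reference[atom_mask, :]
    diff = traj - reference
    return np.sqrt((diff ** 2).sum(axis=2).mean(axis=1))


# Synthetic system: 10 atoms, the last 3 playing the role of a flexible tail.
rng = np.random.default_rng(0)
reference = rng.normal(size=(10, 3))
tail_mask = np.zeros(10, dtype=bool)
tail_mask[-3:] = True

# Three frames: only the tail atoms move away from the reference.
traj = np.repeat(reference[None, :, :], 3, axis=0)
traj[1, tail_mask] += np.array([0.3, 0.0, 0.0])
traj[2, tail_mask] += np.array([0.0, 0.6, 0.0])

rmsd_all = rmsd_per_frame(traj, reference)
rmsd_core = rmsd_per_frame(traj, reference, atom_mask=~tail_mask)
print("all atoms:   ", np.round(rmsd_all, 3))
print("without tail:", np.round(rmsd_core, 3))
```

In this toy case all of the deviation comes from the tail atoms, so excluding them brings the RMSD to zero; in a real trajectory the drop would be partial.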
Based on the experimental data, which showed the role of the C-terminal tail in the gating mechanism, we identified potential residues involved in this process by considering significant interactions between the C-terminal tail residues and any other residue present on the protein surface. Interaction network analyses performed for wild-type and Q375N and R378K mutant 4-HPPD provided interesting results.
Wild-type 4-HPPD formed hydrogen bonds between Y221 and R378 and between K39 and M393 with significant occupancy, indirectly indicating that these residues lie close to each other. In contrast, no hydrogen bond involving Y221 or K39 was detected in either mutant, suggesting the potential involvement of these residues in gating regulation.
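Hydrogen bond occupancy is, in essence, the fraction of trajectory frames in which the bond criterion is met. The minimal Python sketch below illustrates the idea with a distance-only criterion. It is a simplification and an assumption: 0.35 nm is a commonly used donor-acceptor cutoff, the angle criterion normally applied alongside it is omitted, and the two distance series are made up for illustration rather than taken from the simulations.

```python
import numpy as np


def contact_occupancy(distances_nm, cutoff_nm=0.35):
    """Fraction of frames in which the distance is within the cutoff."""
    distances_nm = np.asarray(distances_nm, dtype=float)
    return float((distances_nm <= cutoff_nm).mean())


# Made-up donor-acceptor distances (nm), one value per frame.
wt_like = [0.29, 0.31, 0.30, 0.52, 0.28, 0.33, 0.30, 0.61, 0.29, 0.32]
mutant_like = [0.71, 0.85, 0.64, 0.90, 0.77, 0.69, 0.34, 0.88, 0.93, 0.80]

print(f"wild-type-like occupancy: {contact_occupancy(wt_like):.0%}")
print(f"mutant-like occupancy:    {contact_occupancy(mutant_like):.0%}")
```

A pair that stays within the cutoff for most frames is reported with high occupancy, while a pair that only touches transiently scores close to zero.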
PCA showed, with high statistical accuracy, two different conformational states only for wild-type 4-HPPD, representing the open and closed states that follow the conformational change of the C-terminal tail. To further support these results, the free energy landscape (FEL) of each system was explored. Wild-type 4-HPPD showed two energy wells and a larger, more “rugged” and complex free-energy surface than any of the mutants. This evidence supports the view that the wild-type enzyme has two conformational sub-states (open and closed), richer conformational diversity, and more complex dynamic behavior than the mutants. The RMSD distribution further validates our results, clearly showing two different conformational states of the wild-type enzyme throughout the entire MD simulation. Across the three replicates, the wild-type enzyme gave similar results in all the analyses performed; a slight difference was noted only for the PCA and FEL. This difference could be explained by the highly flexible behavior of the C-terminal tail. Since the tail is made up of 19 residues, its behavior can vary slightly, both in the timing of interaction formation and in its movement toward the N-terminal region; however, in all three replicates, the wild-type enzyme always reaches an open and a closed state following the conformational change of the C-terminal tail. In contrast, none of the mutants showed a significant PCA result or two stable states in the FEL, strongly supporting our analyses.
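The relationship between a two-state population and two energy wells can be illustrated with a minimal Python sketch. It assumes the landscape is obtained by Boltzmann inversion of a 2D histogram of the PC1/PC2 projections, G = −k_B·T·ln(P/P_max); the text does not state which GROMACS settings were used, so this is an illustration rather than the authors' procedure. The projections are synthetic (two Gaussian clusters standing in for the open and closed states), and the function name and bin count are hypothetical.

```python
import numpy as np

KB_KJ_PER_MOL_K = 0.0083144626  # molar gas constant, kJ/(mol K)


def free_energy_surface(pc1, pc2, temperature_k=300.0, bins=30):
    """Boltzmann-invert a 2D histogram: G = -kT ln(P / P_max), in kJ/mol.

    Bins that were never visited receive +inf.
    """
    hist, xedges, yedges = np.histogram2d(pc1, pc2, bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):
        g = -KB_KJ_PER_MOL_K * temperature_k * np.log(prob / prob.max())
    return g, xedges, yedges


# Synthetic projections: two clusters standing in for two conformational states.
rng = np.random.default_rng(0)
state_a = rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(5000, 2))
state_b = rng.normal(loc=[2.0, 0.0], scale=0.5, size=(5000, 2))
proj = np.vstack([state_a, state_b])

g, xedges, yedges = free_energy_surface(proj[:, 0], proj[:, 1])

# Locate the deepest bin and report its PC1 coordinate.
i, j = np.unravel_index(np.argmin(g), g.shape)
x_center = 0.5 * (xedges[i] + xedges[i + 1])
print(f"deepest well at PC1 = {x_center:.2f}, G = {g[i, j]:.2f} kJ/mol")
```

With a single dominant state the same procedure yields one well; two well-separated populations, as in this toy data set, yield two wells divided by a sparsely sampled high-energy region.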
Conformational analyses of the wild-type enzyme showed formation of the Q375-R378 hydrogen bond. This interaction would provide a first stabilization of the C-terminal tail, allowing the formation of a second hydrogen bond between R378 and Y221. Here, Y221 steers the C-terminal tail toward the N-terminal region (open/inactive state). Finally, the K39-M393 hydrogen bond would stabilize the C-terminal tail in a conformation that fully covers the enzyme active site (closed/active state).
Our results suggest that the absence of the initial interaction between Q375 and R378 may impair stabilization of the C-terminal tail, preventing hydrogen bond formation with Y221 and movement of the C-tail toward the N-terminal region.
The conformational analyses of the Q375N and R378K mutants supported this hypothesis: no hydrogen bond with Y221 was detected, resulting in high flexibility and random conformations of the C-terminal tail. This prevented its correct shaping toward the N-terminal region and left the enzyme in an open/inactive state for the whole MD run.
This evidence suggests a potential role for the Y221 and K39 residues in enzyme gating regulation; therefore, Y221A and K39A mutants were generated in silico to evaluate their dynamic behavior.
The Y221A mutant analyses showed, first, formation of the Q375-R378 hydrogen bond and then movement of the C-terminal tail toward A221, where no hydrogen bond was observed. As a result, the C-terminal tail adopted a shape like that seen in the Q375N and R378K mutants, supporting the potential role of Y221 in enzyme gating.
The K39A mutant provided significant evidence in support of our study: first, the hydrogen bond between R378 and Q375 was established, and subsequently the Y221-R378 hydrogen bond formed; thus, the C-terminal tail was oriented toward the N-terminal region. Here, the presence of A39 did not allow hydrogen bond formation with M393, destabilizing the C-terminal tail and causing a “semi-open/semi-active” conformation. Our result is supported by experimental evidence that 4-HPPD is involved in hawkinsinuria as a result of an alanine-to-threonine mutation at position 33 in the human enzyme [28], which causes an incorrect lock of the active site and allows the entrance of water, leading to catalytic aberrations. Interestingly, A33 is located on the same α-helix as K39, and the A33T mutation could cause a structural distortion of the α-helix or a polar interaction with M393 during its movement toward K39, preventing the C-terminal tail from adopting the correct shape and causing the “semi-open/semi-active” conformation of the enzyme.
The evolutionary approach (MSA analyses) showed that Y221 and K39 were always conserved during protein evolution in the “mammalia” class, while in all other organisms (excluding the “mammalia” class), Y221 was poorly conserved and was substituted by phenylalanine.
At first sight, this evidence would seem to contradict our result. However, the C-terminal tail (involved in the gating mechanism) has been acquired only in the “mammalia” class. Interestingly, position 221 is occupied by phenylalanine in all other organisms, a conservative substitution that suggests a potential functional/structural role. It is therefore highly conceivable that the Y221-R378 and K39-M393 pairs co-evolved to establish the hydrogen bond pattern that finely regulates the enzyme gating mechanism.
Annealing molecular dynamics and secondary structure analyses showed that the Y221A and K39A mutants remained structurally stable, with an RMSD trend similar to that of wild-type 4-HPPD, indicating that the mutations we propose cause no structural destabilization.