Comparative Analysis of Bacteriophytochrome Agp2 and Its Engineered Photoactivatable NIR Fluorescent Proteins PAiRFP1 and PAiRFP2

Two photoactivatable near infrared fluorescent proteins (NIR FPs) named “PAiRFP1” and “PAiRFP2” are formed by directed molecular evolution from Agp2, a bathy bacteriophytochrome of Agrobacterium tumefaciens C58. There are 15 and 24 amino acid substitutions in the structure of PAiRFP1 and PAiRFP2, respectively. A comprehensive molecular exploration of these bacteriophytochrome photoreceptors (BphPs) are required to understand the structure dynamics. In this study, the NIR fluorescence emission spectra for PAiRFP1 were recorded upon repeated excitation and the fluorescence intensity of PAiRFP1 tends to increase as the irradiation time was prolonged. We also predicted that mutations Q168L, V244F, and A480V in Agp2 will enhance the molecular stability and flexibility. During molecular dynamics (MD) simulations, the average root mean square deviations of Agp2, PAiRFP1, and PAiRFP2 were found to be 0.40, 0.49, and 0.48 nm, respectively. The structure of PAiRFP1 and PAiRFP2 were more deviated than Agp2 from its native conformation and the hydrophobic regions that were buried in PAiRFP1 and PAiRFP2 core exposed to solvent molecules. The eigenvalues and the trace of covariance matrix were found to be high for PAiRFP1 (597.90 nm2) and PAiRFP2 (726.74 nm2) when compared with Agp2 (535.79 nm2). It was also found that PAiRFP1 has more sharp Gibbs free energy global minima than Agp2 and PAiRFP2. This comparative analysis will help to gain deeper understanding on the structural changes during the evolution of photoactivatable NIR FPs. Further work can be carried out by combining PCR-based directed mutagenesis and spectroscopic methods to provide strategies for the rational designing of these PAiRFPs.


Introduction
Bacteriophytochrome photoreceptors (BphPs) are red and far-red light sensing proteins predominantly found in certain nonphotosynthetic and photosynthetic bacteria [1,2]. Several BphPs absorb and emit light in near infrared (NIR) region that provide an opportunity for deep tissue in vivo imaging. Mammalian tissues show maximum transparency called as NIR tissues transparency window (650-900 nm). The use of NIR light for in vivo imaging significantly reduces the absorption of light by water, melanin, and hemoglobin (Hb) [3,4]. Previous studies revealed that BphPs exhibit strong absorbance and weak fluorescence in NIR region [5,6]. BphPs have gained much consideration to use

Protein Expression, Purification, and In Vitro Assembly with BV
The Escherichia coli MC1061 cells were used as a host cell for the expression of PAiRFP1 protein as described in ref [18] with some modifications. The proteins were purified with nickel-affinity His-trap chelating column equipped with an AKTA purifier system (GE Healthcare). The resulting apoproteins were assembled at~3× molar excess of BV for overnight at 4 • C to allow holoprotein formation [25]. NAP-10 desalting column (GE Healthcare) was used to remove the unbound BV. The proteins were stored in a 50 mM Tris-HCl buffer at pH 7.8 for further use.

Spectroscopic Measurements
The NIR fluorescence emission spectra were detected upon repeated excitation at 680 nm for 10 min, using the F4600 spectrofluorometer (Hitachi). The protein samples were kept in dark and measured before and after repetitive irradiation using a 680 nm incident beam for 10 min. The fluorescence intensity and time-dependent dynamic curve for PAiRFP1 was plotted at 1-10 min immediately after repeated excitation. The UV-VIS NIR absorption spectra were recorded using U2910 spectrophotometer (Hitachi) at room temperature.

3D Structure Modelling
The three-dimensional (3D) structure of these BphPs was generated by homology modelling method using Modeller 9.10 [26,27]. The structural homologue search was performed in the Protein Data Bank (PDB) using basic local alignment search tool [28]. The missing residues in the crystal structures of Agp2 and PAiRFP2 were also modelled in order to obtain proper structure of Agp2, PAiRFP1, and PAiRFP2. The sequence of PAiRFP1 and PAiRFP2 were aligned with their templates Agp2 (PDB ID: 6G1Y) and PAiRFP2 (PDB ID: 6G1Z). The fold recognition methods were used to optimize the sequence-structure alignment. The most reliable models were evaluated on the basis of RMSD, TM score, and DOPE profile. The selected models were further refined using SCWRL 4.0 [29], and the GROMOS 43B1 force field implemented in Swiss-PdbViewer was used for energy minimization of the predicted 3D structures [30].

Mutation Analysis
Several computational servers such as DynaMut, DeepDDG, and Discovery Studio were used to predict the impact of mutations on the molecular flexibility and stability of proteins [31,32]. DynaMut is a powerful computational tool, which can be used to analyze the effect of single-point mutation on the stability and dynamics of protein resulting from vibrational entropy changes. It operates through Normal Mode Analysis (NMA) by implementing two different approaches Bio3D and ENCoM, which provide rapid, simplified, and powerful insightful exploration of protein dynamics [33][34][35][36], whereas DeepDDG server predicts the stability change of protein point mutations using neural networks [37]. These predictors were successfully applied to all the mutated sites in PAiRFP1 and PAiRFP2 to compute the impact of single-point mutations on the basis of protein folding free energy (∆∆G) and vibrational entropy changes (∆∆S) [31,[38][39][40][41].

Docking Studies
The literature review suggested that BV in BphPs tends to be in cis and trans form in Pr and Pfr states, respectively [42][43][44]. In order to understand the phenomena of Pr and Pfr conformations, the molecular docking of BV in cis and trans forms were performed with Agp2, PAiRFP1, and PAiRFP2 by Schrodinger molecular modelling software using CovDock that performs a series of automated steps based on a simple setup from the Maestro graphical interface [45,46]. The best docked poses of BV in both states were selected based on binding energy and proper orientations [25]. We assumed that the docked complex Agp2-BVcis, PAiRFP1-BVcis, and PAiRFP2-BVcis are in Pr states, whereas Agp2-BVtrans, PAiRFP1-BVtrans, and PAiRFP2-BVtrans are in Pfr states [18].

MD Simulations
Many marvelous biological functions in proteins and their profound dynamic mechanisms can be revealed by studying their internal motions using MD simulation [47][48][49][50][51]. The MD simulations were performed on apoproteins of Agp2, PAiRFP1, and PAiRFP2 at 300 K at the molecular mechanics level implemented in the GROMACS 2018.2 [52] using the GROMOS96 43a1 force field. The Agp2, PAiRFP1, and PAiRFP2 molecules were soaked in a cubic box of water molecules with a dimension of 10 Å, i.e., setting the box edge 10 Å from the molecule periphery using the gmx editconf module for creating boundary conditions and gmx solvate module for solvation. The simple point charge (spc216) water model was used to solvate the protein. The charges on Agp2, PAiRFP1, and PAiRFP2 molecules were neutralized by addition of Na + and Cl − ions using gmx genion module to maintain neutrality, preserving a physiological concentration (0.15 M). The detailed methodology used to analyze the trajectories has been described in our previous communications [51,[53][54][55]. All graphical presentations of the 3D models were prepared using PyMOL and VMD (Visual Molecular Dynamics) [56].

Essential Dynamics
The principal component analysis (PCA) or essential dynamics (ED) reflects the overall expansion of a protein during MD simulations. The sum of the eigenvalues is a measure of the overall mobility in the system to relate the elasticity of a protein under different environments. PCA or ED were calculated for atomic motions in Agp2, PAiRFP1, and PAiRFP2 molecules. The PCA is based on the diagonalization of the covariance matrix C, with the elements explained as follows: where r i represents the cartesian coordinate of the i th Cα atom, N is the number of Cα atoms, and <r i > indicates the time average over all configurations achieved during MD simulation. The MD projections of trajectories onto the key essential dynamics relates to the largest eigenvector, and the major fluctuations of the correlated atomic motions can be visualized [57].

Gibbs Free Energy Landscape
The structural features and conformational profiles of a protein can be obtained by Gibbs free energy landscape using conformational sampling methods [55]. The structural information and conformations profiles attained by MD simulations are used for further analysis. In order to obtain 2D and 3D depiction, the Gibbs free energy landscapes were projected onto the first principal component (PC1) and second principal component (PC2) with the highest eigenvalues calculated from PCA analysis for Agp2, PAiRFP1, and PAiRFP2. The free energy landscapes are defined by following equation as: where kB is the Boltzmann constant, T is the temperature, and P (PC1, PC2) is the normalized joint probability distribution.

Stability of NIR Fluorescence Emission
It is known that BV exhibited strong NIR absorption overlapping with its NIR fluorescence. The protein sample of PAiRFP1 was diluted so as to avoid the self-absorption from Pr state and inner filter effect from Pfr state. The protein samples were diluted to OD700 of 0.05, and the NIR fluorescence emission spectra were recorded at room temperature upon repeated excitation at 680 nm for 10 min. The single-exponential growth function was used to fit the time-dependent curve of the fluorescence intensity for PAiRFP1 at 1-10 min after repeated excitation. The results showed that the NIR fluorescence intensity tended to increase for PAiRFP1 as the irradiation time was prolonged ( Figure 1A,B). This phenomenon could be explained because Pfr state photoconverted into Pr state upon excitation at 680 nm was observed from the absorption spectra ( Figure 1C). Since only Pr state can emit NIR fluorescence, it was evident that NIR fluorescence intensity will be enhanced. Different from other BphP-based NIR FPs, PAiRFP1 and PAiRFP2 have a PHY domain, which is essential for Pfr→Pr photoconversion [13,17,19,20].

Gibbs Free Energy Landscape
The structural features and conformational profiles of a protein can be obtained by Gibbs free energy landscape using conformational sampling methods [55]. The structural information and conformations profiles attained by MD simulations are used for further analysis. In order to obtain 2D and 3D depiction, the Gibbs free energy landscapes were projected onto the first principal component (PC1) and second principal component (PC2) with the highest eigenvalues calculated from PCA analysis for Agp2, PAiRFP1, and PAiRFP2. The free energy landscapes are defined by following equation as: where kB is the Boltzmann constant, T is the temperature, and P(PC1, PC2) is the normalized joint probability distribution.

Stability of NIR Fluorescence Emission
It is known that BV exhibited strong NIR absorption overlapping with its NIR fluorescence. The protein sample of PAiRFP1 was diluted so as to avoid the self-absorption from Pr state and inner filter effect from Pfr state. The protein samples were diluted to OD700 of 0.05, and the NIR fluorescence emission spectra were recorded at room temperature upon repeated excitation at 680 nm for 10 min. The single-exponential growth function was used to fit the time-dependent curve of the fluorescence intensity for PAiRFP1 at 1-10 min after repeated excitation. The results showed that the NIR fluorescence intensity tended to increase for PAiRFP1 as the irradiation time was prolonged ( Figure 1A,B). This phenomenon could be explained because Pfr state photoconverted into Pr state upon excitation at 680 nm was observed from the absorption spectra ( Figure 1C). Since only Pr state can emit NIR fluorescence, it was evident that NIR fluorescence intensity will be enhanced. Different from other BphP-based NIR FPs, PAiRFP1 and PAiRFP2 have a PHY domain, which is essential for Pfr→Pr photoconversion [13,17,19,20].

Structure Analysis
The sequence of PAiRFP1 and PAiRFP2 showed 96% and 94% similarity with wild-type Agp2. The obtained structures of the PAiRFP1 and PAiRFP2 exhibited low violations of restraints, and hence it is assumed to be more precise. The final alignment of 3D models of PAiRFP1 and PAiRFP2 with Agp2 displayed a RMSD values of 0.13 and 0.19 Å, respectively. The alignment of PAiRFP1 with

Hydrogen Bonds and van der Waals Interactions
The Discovery Studio was used to analyze the different kinds of noncovalent interactions at the mutated sites in the predicted models of PAiRFP1 and PAiRFP2 and compared them with Agp2. The H-bonding and van der Waals forces pattern in the predicted protein structures of PAiRFP1 and PAiRFP2 were analyzed by Discovery Studio at the mutated sites, and compared them with Agp2 (Tables 1-3). In the PAS domain, mutation at residue K69 and R83 on Agp2 will not affect hydrogen bonding and van der Waals interactions in PAiRFP2. While mutation in G120 of Agp2 will result in loss and gain of van der Waals atomic interactions in PAiRFP2 with residues S121 and S286, respectively. In case of PAiRFP1, the mutation of M163, Q168, and Y280 sites from Agp2 to 163L, 168L, and 280C led to decrease in hydrogen bonding. While, the mutation of other sites from Agp2 did not affect hydrogen bonding. The overall van der Waals forces increased when Q168, Y280, E294, H303, and A480 sites from Agp2 were mutated to 168L, 280C, 294V, 303R, and V480, whereas the mutation of R220 and H498 sites from Agp2 to 220P and 498Y led to decrease in van der Waals forces. In summary, mutations in residues E294, H303, and A480 might improve the stability of Agp2, while residues R220 and H498 have no positive effects on the structural stability of Agp2. Our previous communications also suggested that reverse mutations V480A in PAiRFP1 will decrease the extinction coefficient and relative fluorescence quantum yield [18]. In case of PAiRFP2, the mutation of G120, S243, and E494 sites from Agp2 to 120D, 243N, and 494G led to increase in hydrogen bonding such as D120-S121, N243-Y23-F244, and G494-H498, whereas the mutation of M163, Y280, A487 sites from Agp2 to 163L, 280C, and 487T led to decrease in hydrogen bonding. Additionally, it is found that the overall van der Waals forces increase when Q168, V244, Y280, H333, M351, G409, and A487 sites from Agp2 were mutated to 168E, 244F, 280C, 333R, 351I, 409D, and 487T, whereas the mutation of M163, R220, D349, L419, T469 sites from Agp2 to 163L, 220P, 349R, 419I, and 469S led to decrease in van der Waals forces. Table 1. Analysis of the mutational sites of PAiRFP1 and PAiRFP2 and their comparison with wild-type Agp2 in "period clock protein" (Per), "aromatic hydrocarbon receptor nuclear translocator" (ARNT), and "single-minded protein" (Sim) (PAS) domain.

Stability and Flexibility Analysis
The DynaMut and DeepDDG servers were used to predict the impact of mutations on the molecular flexibility and stability of PAiRFP1 and PAiRFP2. These predictors were successfully applied to all the mutated sites in PAiRFP1 and PAiRFP2 to compute the impact of single-point mutation on the basis of protein folding free energy (∆∆G) and vibrational entropy changes (∆∆S).

Interaction of BV in Cis and Trans Forms
The BV in cis and trans forms were docked into the active site of Agp2, PAiRFP1, and PAiRFP2. The top poses of BV in both forms were selected. BV was covalently bound to Cys13 of Agp2, PAiRFP1, and PAiRFP2 as it is present in nature. The superimposition of the docked complexes suggested that BV is bound to Agp2, PAiRFP1, and PAiRFP2 in the same position and proper orientations. All the docked complexes such as Agp2-BVcis, Agp2-BVtrans, PAiRFP1-BVcis, PAiRFP1-BVtrans, PAiRFP2-BVcis, and PAiRFP2-BVtrans were analyzed (Figure 2A-F). These cis and trans orientations of BV are considered as Pr and Pfr states of each BphP, respectively. It clearly indicates that the residual interactions change upon different orientations of BV in BphPs ( Table 6). The overall van der Waals interactions were increased in case of Agp2-BVcis and PAiRFP1-BVcis forms as compared to their trans orientations of BV. The electrostatic interactions were increased in case of PAiRFP2-BVcis form as compared to PAiRFP2-BVtrans. The van der Waals interactions were also increased in case of PAiRFP2-BVtrans as compared to PAiRFP2-BVcis.
The hydrophobic and hydrophilic SASA for Agp2 were found to be 112.04 and 132.54 nm 2 , respectively. The hydrophobic and hydrophilic SASA for PAiRFP1 were found to be 120.47 and 128.91 nm 2 , respectively. The hydrophobic and hydrophilic SASA for PAiRFP2 were found to be 117. 45 and 130.89 nm 2 , respectively. The hydrophobic SASA was increased in case of PAiRFP1 and PAiRFP2 as compared to Agp2. The hydrophilic SASA was decreased in case of PAiRFP1 and PAiRFP2 as compared to Agp2. The hydrophobic SASA was found to increase in case of PAiRFP1 than PAiRFP2. There was a slight decrease in hydrophilic SASA in case of PAiRFP1 than PAiRFP2. The graph clearly shows that the hydrophobic regions that were buried in the PAiRFP1 core were exposed to solvent than PAiRFP2.  The hydrophobic and hydrophilic SASA for Agp2 were found to be 112.04 and 132.54 nm 2 , respectively. The hydrophobic and hydrophilic SASA for PAiRFP1 were found to be 120.47 and 128.91 nm 2 , respectively. The hydrophobic and hydrophilic SASA for PAiRFP2 were found to be 117.45 and 130.89 nm 2 , respectively. The hydrophobic SASA was increased in case of PAiRFP1 and PAiRFP2 as compared to Agp2. The hydrophilic SASA was decreased in case of PAiRFP1 and PAiRFP2 as compared to Agp2. The hydrophobic SASA was found to increase in case of PAiRFP1 than PAiRFP2. There was a slight decrease in hydrophilic SASA in case of PAiRFP1 than PAiRFP2. The graph clearly shows that the hydrophobic regions that were buried in the PAiRFP1 core were exposed to solvent than PAiRFP2.  3.6. Structural Dynamics

Structural Deviations and Compactness
To explore the structural dynamics of Agp2, PAiRFP1, and PAiRFP2, the root mean square deviation (RMSD), root mean square fluctuations, and the radius of gyration (Rg) were analyzed [22]. The average RMSD values of Agp2, PAiRFP1, and PAiRFP2 were found to be 0.40, 0.49, and 0.48 nm, respectively ( Figure 3A). Agp2 showed least RMSD fluctuation as compared to PAiRFP1 and PAiRFP2, whereas PAiRFP1 and PAiRFP2 showed similar patterns in the RMSD plot. Further, the root mean square fluctuation (RMSF) of the Agp2, PAiRFP1, and PAiRFP2 were plotted ( Figure 3B,C). deviation (RMSD), root mean square fluctuations, and the radius of gyration (Rg) were analyzed [22]. The average RMSD values of Agp2, PAiRFP1, and PAiRFP2 were found to be 0.40, 0.49, and 0.48 nm, respectively ( Figure 3A). Agp2 showed least RMSD fluctuation as compared to PAiRFP1 and PAiRFP2, whereas PAiRFP1 and PAiRFP2 showed similar patterns in the RMSD plot. Further, the root mean square fluctuation (RMSF) of the Agp2, PAiRFP1, and PAiRFP2 were plotted ( Figure 3B, C). The average RMSF of the Agp2, PAiRFP1, and PAiRFP2 were found to be 0.15, 0.16, and 0.18 nm, respectively. The rise in residual fluctuations are due to mutations in the structure of PAiRFP1 and PAiRFP2. The Rg is linked to the tertiary structural volume of Agp2, PAiRFP1, and PAiRFP2. A protein, which has higher Rg is assumed to have less tight packing. The average Rg values for Agp2, PAiRFP1, and PAiRFP2 were found to be 2.70, 2.60, and 2.77 nm, respectively. It is found that PAiRFP1 has more tight packing than Agp2 and PAiRFP2 in its tertiary structure ( Figure 3D).

Solvent Accessible Surface Area
The solvent accessible surface area (SASA) is explained as the surface area of a protein which forms networks with its solvent molecules [23]. The average SASA values with respect to backbone for Agp2, PAiRFP1, and PAiRFP2 were found to be 248.60, 252.88, and 250.42 nm 2 , respectively ( Figure 4A). The average SASA values with respect to protein for Agp2, PAiRFP1, and PAiRFP2 were found to be 215.09, 217.67, and 223.87 nm 2 , respectively ( Figure 4B). The SASA plot suggested that PAiRFP1 and PAiRFP2 have more SASA than Agp2 due to point mutations. This can be presumed as the internal residues of PAiRFP1 and PAiRFP2 were exposed to solvent due to these point mutations. Solvation plays crucial roles in monitoring the stability of protein structure. The solvation The average RMSF of the Agp2, PAiRFP1, and PAiRFP2 were found to be 0.15, 0.16, and 0.18 nm, respectively. The rise in residual fluctuations are due to mutations in the structure of PAiRFP1 and PAiRFP2. The Rg is linked to the tertiary structural volume of Agp2, PAiRFP1, and PAiRFP2. A protein, which has higher Rg is assumed to have less tight packing. The average Rg values for Agp2, PAiRFP1, and PAiRFP2 were found to be 2.70, 2.60, and 2.77 nm, respectively. It is found that PAiRFP1 has more tight packing than Agp2 and PAiRFP2 in its tertiary structure ( Figure 3D).

Solvent Accessible Surface Area
The solvent accessible surface area (SASA) is explained as the surface area of a protein which forms networks with its solvent molecules [23]. The average SASA values with respect to backbone for Agp2, PAiRFP1, and PAiRFP2 were found to be 248.60, 252.88, and 250.42 nm 2 , respectively ( Figure 4A). The average SASA values with respect to protein for Agp2, PAiRFP1, and PAiRFP2 were found to be 215.09, 217.67, and 223.87 nm 2 , respectively ( Figure 4B). The SASA plot suggested that PAiRFP1 and PAiRFP2 have more SASA than Agp2 due to point mutations. This can be presumed as the internal residues of PAiRFP1 and PAiRFP2 were exposed to solvent due to these point mutations. Solvation plays crucial roles in monitoring the stability of protein structure. The solvation effect is measured by the solvation free energy and reflects the atomic-level interactions between the protein and solvent. The free energy of solvation for Agp2, PAiRFP1, and PAiRFP2 were found to be 307.15, 313.21, and 321.49 kJ/mol/nm 2 , respectively ( Figure 4C). The mutations in Agp2 result in high solvation energy. Additionally, the SASA plots were further resolved into hydrophobic and hydrophilic regions ( Figure 4G-I). effect is measured by the solvation free energy and reflects the atomic-level interactions between the protein and solvent. The free energy of solvation for Agp2, PAiRFP1, and PAiRFP2 were found to be 307.15, 313.21, and 321.49 kJ/mol/nm 2 , respectively ( Figure 4C). The mutations in Agp2 result in high solvation energy. Additionally, the SASA plots were further resolved into hydrophobic and hydrophilic regions ( Figure 4G-I).

Hydrogen Bonds and Secondary Structure Analysis
The free energy difference (ΔG) between the folded and unfolded states of proteins is affected by changes in atomic interactions. Changes in the interaction among residues within a protein and its surroundings affect the entropy of the system, thus affecting flexibility/rigidity of the structure. Proteins are stabilized by covalent disulfide bonds and the noncovalent hydrophobic, electrostatic, van der Waals, and hydrogen bonds. The hydrogen bond is a significant factor in stabilizing protein conformations. To check the hydrogen bond formations between main and side chains of Agp2, PAiRFP1, and PAiRFP2, the hydrogen bonds paired within 0.35 nm were estimated during the 100 ns MD simulations. The average number of hydrogen bonds between main and side chains of Agp2, PAiRFP1, and PAiRFP2 were found to be 192, 188, and 196, respectively ( Figure 5A). The average number of hydrogen bonds decrease in case of PAiRFP1 and increase in case of PAiRFP2. This may be due to the increase in SASA of PAiRFP1.

Hydrogen Bonds and Secondary Structure Analysis
The free energy difference (∆G) between the folded and unfolded states of proteins is affected by changes in atomic interactions. Changes in the interaction among residues within a protein and its surroundings affect the entropy of the system, thus affecting flexibility/rigidity of the structure. Proteins are stabilized by covalent disulfide bonds and the noncovalent hydrophobic, electrostatic, van der Waals, and hydrogen bonds. The hydrogen bond is a significant factor in stabilizing protein conformations. To check the hydrogen bond formations between main and side chains of Agp2, PAiRFP1, and PAiRFP2, the hydrogen bonds paired within 0.35 nm were estimated during the 100 ns MD simulations. The average number of hydrogen bonds between main and side chains of Agp2, PAiRFP1, and PAiRFP2 were found to be 192, 188, and 196, respectively ( Figure 5A). The average number of hydrogen bonds decrease in case of PAiRFP1 and increase in case of PAiRFP2. This may be due to the increase in SASA of PAiRFP1.
The purpose of the secondary structure is to spot the structural features of Agp2, PAiRFP1, and PAiRFP2. The secondary structure assignments in Agp2, PAiRFP1, and PAiRFP2 such as α-helix, β-sheet, and turn were split into individual residues at each time step. The average number of residues participated in secondary structure formation of Agp2, PAiRFP1, and PAiRFP2 were compared (Table 7). It was found that the overall average residues participated in structure formation in case of Agp2, PAiRFP1, and PAiRFP2 were the same. In case of PAiRFP1, the α-helix slightly decreases, whereas β-sheets remain the same when compared with Agp2. In case of PAiRFP2, β-sheets slightly decrease, whereas α-helix remains the same ( Figure 5B-D). Further, the volume and density of Agp2, PAiRFP1, and PAiRFP2 were calculated. It was found that the volume of Agp2, PAiRFP1, and PAiRFP2 were 91.94, 92.85, and 93.09 nm 3 , respectively. The average density of Agp2, PAiRFP1, and PAiRFP2 were found to be 1018. 35,1013.08, and 1009.72 g/L, respectively ( Figure 6A,B). It is clear that the volume of PAiRFP1 and PAiRFP2 are slightly higher than Agp2, which is in line with SASA data. Similarly, the density of PAiRFP1 and PAiRFP2 are less than Agp2. The purpose of the secondary structure is to spot the structural features of Agp2, PAiRFP1, and PAiRFP2. The secondary structure assignments in Agp2, PAiRFP1, and PAiRFP2 such as α-helix, βsheet, and turn were split into individual residues at each time step. The average number of residues participated in secondary structure formation of Agp2, PAiRFP1, and PAiRFP2 were compared (Table 7). It was found that the overall average residues participated in structure formation in case of Agp2, PAiRFP1, and PAiRFP2 were the same. In case of PAiRFP1, the α-helix slightly decreases, whereas β-sheets remain the same when compared with Agp2. In case of PAiRFP2, β-sheets slightly decrease, whereas α-helix remains the same ( Figure 5B-D). Further, the volume and density of Agp2, PAiRFP1, and PAiRFP2 were calculated. It was found that the volume of Agp2, PAiRFP1, and PAiRFP2 were 91.94, 92.85, and 93.09 nm 3 , respectively. The average density of Agp2, PAiRFP1, and PAiRFP2 were found to be 1018.35, 1013.08, and 1009.72 g/L, respectively ( Figure 6A,B). It is clear that the volume of PAiRFP1 and PAiRFP2 are slightly higher than Agp2, which is in line with SASA data. Similarly, the density of PAiRFP1 and PAiRFP2 are less than Agp2. Table 7. Percentage of residues in Agp2, PAiRFP1, and PAiRFP2 participated in average structure formation during 100 ns molecular dynamics (MD) simulations. Agp2  65  21  21  1  13  9  34  1  PAiRFP1  65  20  21  2  13  10  32  1  PAiRFP2  65  21  20  2  13  9 34 1 * Structure = α-helix + β-sheet + β-bridge + turn.

Principal Component Analysis
The PCA or ED show an overall expansion of Agp2, PAiRFP1, and PAiRFP2 molecules during MD simulation. The PCA for Agp2, PAiRFP1, and PAiRFP2 were calculated using gmx covar module with respect to the backbone. The ED identifies the substantial average atomic motions of Agp2, PAiRFP1, and PAiRFP2. The sum of the eigenvalues is a measure of the overall mobility in the system, and it relates with the elasticity of a protein. The eigenvalues and the trace of the covariance matrix were found to be 535.79, 597.90, and 726.74 nm 2 for Agp2, PAiRFP1, and PAiRFP2, respectively. The eigenvalues and the trace of covariance matrix were found to be high for PAiRFP1 and PAiRFP2 when compared with Agp2. The higher values of eigenvalues and the trace of covariance matrix suggest that the average random fluctuations are more in case of PAiRFP2 than PAiRFP1. Greater eigenvalues indicate more expansion of PAiRFP2 than PAiRFP1. Figure 7A-F shows detailed multidimensional covariance matrix for each atom pair covariance. The 2D projections of trajectories on eigenvectors displayed diverse projections of Agp2, PAiRFP1, and PAiRFP2 molecules. There was a major difference of projections of trajectories in case of PAiRFP2. The variations in the position of atoms are very different in case of PAiRFP2 than PAiRFP1 and Agp2. This might be due to different conformations of PAiRFP2 during MD simulations. The root mean square atomic fluctuations of Agp2, PAiRFP1, and PAiRFP2 were also monitored during the PCA calculations. The random atomic fluctuations of PAiRFP2 were more than PAiRFP1 and Agp2. The eigenvector components were further resolved into x, y, and z directions.

Principal Component Analysis
The PCA or ED show an overall expansion of Agp2, PAiRFP1, and PAiRFP2 molecules during MD simulation. The PCA for Agp2, PAiRFP1, and PAiRFP2 were calculated using gmx covar module with respect to the backbone. The ED identifies the substantial average atomic motions of Agp2, PAiRFP1, and PAiRFP2. The sum of the eigenvalues is a measure of the overall mobility in the system, and it relates with the elasticity of a protein. The eigenvalues and the trace of the covariance matrix were found to be 535.79, 597.90, and 726.74 nm 2 for Agp2, PAiRFP1, and PAiRFP2, respectively. The eigenvalues and the trace of covariance matrix were found to be high for PAiRFP1 and PAiRFP2 when compared with Agp2. The higher values of eigenvalues and the trace of covariance matrix suggest that the average random fluctuations are more in case of PAiRFP2 than PAiRFP1. Greater eigenvalues indicate more expansion of PAiRFP2 than PAiRFP1. Figure 7A-F shows detailed multidimensional covariance matrix for each atom pair covariance. The 2D projections of trajectories on eigenvectors displayed diverse projections of Agp2, PAiRFP1, and PAiRFP2 molecules. There was a major difference of projections of trajectories in case of PAiRFP2. The variations in the position of atoms are very different in case of PAiRFP2 than PAiRFP1 and Agp2. This might be due to different conformations of PAiRFP2 during MD simulations. The root mean square atomic fluctuations of Agp2, PAiRFP1, and PAiRFP2 were also monitored during the PCA calculations. The random atomic fluctuations of PAiRFP2 were more than PAiRFP1 and Agp2. The eigenvector components were further resolved into x, y, and z directions.

Gibbs Free Energy Landscape
The gmx covar, gmx anaeig, and gmx sham modules were utilized to calculate the Gibbs free energy landscape by their own initial (PC1) and next (PC2) eigenvectors projections. The color-coded energy landscape displayed varied forms for Agp2, PAiRFP1, and PAiRFP2 ( Figure 8A-C). The covariance matrix for each atom pair covariance shows different patterns in each case. The corresponding free energy contour map with deep blue shade represents lower energy state. The main free energy well in the global free energy minimum region of Agp2, PAiRFP1, and PAiRFP2 is different. A comparison

Gibbs Free Energy Landscape
The gmx covar, gmx anaeig, and gmx sham modules were utilized to calculate the Gibbs free energy landscape by their own initial (PC1) and next (PC2) eigenvectors projections. The color-coded energy landscape displayed varied forms for Agp2, PAiRFP1, and PAiRFP2 ( Figure 8A-C). The covariance matrix for each atom pair covariance shows different patterns in each case. The corresponding free energy contour map with deep blue shade represents lower energy state. The main free energy well in the global free energy minimum region of Agp2, PAiRFP1, and PAiRFP2 is different. A comparison between the full views of the Gibbs free energy values of Agp2, PAiRFP1, and PAiRFP2 suggested that these BphPs have different patterns of global minima. PAiRFP1 has a sharper stable global minimum than Agp2 and PAiRFP2. A single deep blue color well is generally treated as stable. The single sharp stable global minima might be the reason for improved photophysical properties of PAiRFP1.

Discussion
BphPs contain heme breakdown product, linear tetrapyrrole biliverdin (BV) IXα, as a chromophore. BV is the most abundant and ubiquitous product in cells of different eukaryotic organisms. Most of the BphPs have been successfully used as a template for the development of NIR FPs [5,6]. Low light scattering and reduced autofluorescence of NIR light in biological tissues provide higher SNR for deep tissue imaging as compared to visible light [3,4]. The development of such NIR FPs enable scientists to thoroughly monitor the different biological processes like cellular pH, metabolite concentrations and homeostasis, protein-protein interactions, expression, localization, dynamics, and solubility of proteins [58][59][60][61][62].
In this study, the NIR fluorescence emission spectra for PAiRFP1 were detected upon repeated excitation at 680 nm for 10 min, and it was found that the fluorescence intensity of PAiRFP1 enhanced as the irradiation time was prolonged. The Pfr state photoconverted into Pr state upon excitation at 680 nm observed from the absorption spectra ( Figure 1C). Since only Pr state can emit NIR fluorescence, it was evident that NIR fluorescence intensity will be enhanced. Upon absorption of light, BphPs are reversibly photoconverted between two spectrally distinct and relatively stable forms termed as Pr and Pfr states. The irradiation of Pr and Pfr forms with red and far-red light lead to cis (Z or Zusammen) and trans (E or Entgegen) isomerization of C15 = C16 double bond between ring C and D within the bilin chromophore system. This causes a cascade of conformational changes within the protein structure [63]. All the nitrogen atoms of BV become protonated during the stable forms of Pr and Pfr states and briefly deprotonated at ring B or C during Pr→Pfr photoconversion [64]. The canonical BphPs gain stable Pr state in dark and produce Pfr state upon illumination with red light. The Pfr state then reverts back into relaxed Pr state either by thermal dark reversion or by irradiation with far-red light [65]. Several engineered canonical BphPs are unable to convert to Pfr state and display only Pr state. The bathy BphPs adopt stable Pfr state in dark and yield Pr state upon irradiation with far-red light. The Pr state then converts back into relaxed Pfr state either by thermal dark reversion or by irradiation with red light [20]. In engineered bathy BphPs, like PAiRFP1 and PAiRFP2, the Pr and Pfr states coexist in the resting state.
Usually, these two states cannot coexist in dark because they exhibit different thermodynamic stability. Activation barrier occurs (Ea) between Pr and Pfr on the ground state. However, once BphP is activated to the excited state, Ea will become smaller, therefore resulting in light-induced isomerization from Pr to Pfr on the excited state. That is called "photoconversion." After Pfr is formed,

Discussion
BphPs contain heme breakdown product, linear tetrapyrrole biliverdin (BV) IXα, as a chromophore. BV is the most abundant and ubiquitous product in cells of different eukaryotic organisms. Most of the BphPs have been successfully used as a template for the development of NIR FPs [5,6]. Low light scattering and reduced autofluorescence of NIR light in biological tissues provide higher SNR for deep tissue imaging as compared to visible light [3,4]. The development of such NIR FPs enable scientists to thoroughly monitor the different biological processes like cellular pH, metabolite concentrations and homeostasis, protein-protein interactions, expression, localization, dynamics, and solubility of proteins [58][59][60][61][62].
In this study, the NIR fluorescence emission spectra for PAiRFP1 were detected upon repeated excitation at 680 nm for 10 min, and it was found that the fluorescence intensity of PAiRFP1 enhanced as the irradiation time was prolonged. The Pfr state photoconverted into Pr state upon excitation at 680 nm observed from the absorption spectra ( Figure 1C). Since only Pr state can emit NIR fluorescence, it was evident that NIR fluorescence intensity will be enhanced. Upon absorption of light, BphPs are reversibly photoconverted between two spectrally distinct and relatively stable forms termed as Pr and Pfr states. The irradiation of Pr and Pfr forms with red and far-red light lead to cis (Z or Zusammen) and trans (E or Entgegen) isomerization of C15 = C16 double bond between ring C and D within the bilin chromophore system. This causes a cascade of conformational changes within the protein structure [63]. All the nitrogen atoms of BV become protonated during the stable forms of Pr and Pfr states and briefly deprotonated at ring B or C during Pr→Pfr photoconversion [64]. The canonical BphPs gain stable Pr state in dark and produce Pfr state upon illumination with red light. The Pfr state then reverts back into relaxed Pr state either by thermal dark reversion or by irradiation with far-red light [65]. Several engineered canonical BphPs are unable to convert to Pfr state and display only Pr state. The bathy BphPs adopt stable Pfr state in dark and yield Pr state upon irradiation with far-red light. The Pr state then converts back into relaxed Pfr state either by thermal dark reversion or by irradiation with red light [20]. In engineered bathy BphPs, like PAiRFP1 and PAiRFP2, the Pr and Pfr states coexist in the resting state.
Usually, these two states cannot coexist in dark because they exhibit different thermodynamic stability. Activation barrier occurs (Ea) between Pr and Pfr on the ground state. However, once BphP is activated to the excited state, Ea will become smaller, therefore resulting in light-induced isomerization from Pr to Pfr on the excited state. That is called "photoconversion". After Pfr is formed, it can revert back to Pr via light-dependent or light-independent way that is called dark recovery. The fact that dark recovery can happen infers that the activation energy Ea is not too high. The contrary situation was observed for Bathy BphPs. The energy of both Ea and Ea changes with BphP species and with the microenvironment surrounding BV. Dark recovery is related to dynamics or kinetics, where the rate is dependent on the activation energy between Pr and the intermediates. Therefore, we cannot exclude the possibility that Pr and Pfr states can coexist in resting state in PAiRFPs. Dark recovery also depends on temperature in canonical or pH-in bathy BphPs [66,67]. We have also studied the assembly dynamics of BV with apoprotein of PAiRFP1. As shown in Figure 9A,B, Pr state was formed quickly and then converted to Pfr state. PAiRFPs. Dark recovery also depends on temperature in canonical or pH-in bathy BphPs [66,67]. We have also studied the assembly dynamics of BV with apoprotein of PAiRFP1. As shown in Figure  9A,B, Pr state was formed quickly and then converted to Pfr state. Almost all BphPs experience two reversible photocycles such as Pr→Pfr and Pfr→Pr. Low temperature X-ray crystallographic data, Fourier-transform infrared spectroscopy (FTIR), and Raman spectroscopy results showed that Pfr can be phototransformed into Pr mainly via three intermediates including Lumi-F, Meta-Fa, and Meta-Fb. In addition, Pr experiences Lumi-R, Meta-R, and Meta-Rc to reach Pfr state. Piatkevich et al. used 77K and 245K absorption spectroscopic methods to investigate the two photocycles in both PAiRFP1 and PAiRFP2. It was found that Pfr could switch to Pr smoothly. Nevertheless, the Pr→Pfr photoconversion was blocked at the Meta-Ra intermediate in PAiRFP1. Since Meta-Ra→Meta-Rc was the rate-determining step for the formation of Pfr, the Pr→Pfr photoconversion failed in PAiRFP1. Instead, Meta-Ra returned back to the initial Pr state. This explained why Pr state of PAiRFP1 could still emit NIR fluorescence even when the PHY domain was present. In spite of that, it was unclear about the key factors that blocked Meta-Ra→Meta-Rc→Pfr conversion in PAiRFP1. Some experimental methods such as femtosecond transient absorption spectroscopy, femtosecond Raman spectroscopy, and nanosecond flash photolysis are required to gain deeper understanding on the time scale of fs, ps, and ns to unravel this mechanism. Additionally, some theoretical calculations will also provide useful information regarding the structure and stability of both Meta-Ra and Meta-Rc.
Our present study suggested that after several rounds of mutations by Piatkevich et al., the mutants Q168L, V244F, and A480V will enhance the molecular stability and flexibility of Agp2. A series of each round of library of different mutants that are constructed by random mutagenesis are explained in Figure 10. We also found that structures of PAiRFP1 and PAiRFP2 are highly deviated than Agp2 from its native conformation, but PAiRFP1 exhibited more tight packing as compared to Agp2 and PAiRFP2 during 100 ns MD simulations. The hydrogen bonds between main chain and side chains of PAiRFP1 were less than Agp2 and PAiRFP2, which led to the exposure of the hydrophobic region of PAiRFP1 core to more solvent molecules. Furthermore, the eigenvalues and Almost all BphPs experience two reversible photocycles such as Pr→Pfr and Pfr→Pr. Low temperature X-ray crystallographic data, Fourier-transform infrared spectroscopy (FTIR), and Raman spectroscopy results showed that Pfr can be phototransformed into Pr mainly via three intermediates including Lumi-F, Meta-Fa, and Meta-Fb. In addition, Pr experiences Lumi-R, Meta-R, and Meta-Rc to reach Pfr state. Piatkevich et al. used 77K and 245K absorption spectroscopic methods to investigate the two photocycles in both PAiRFP1 and PAiRFP2. It was found that Pfr could switch to Pr smoothly. Nevertheless, the Pr→Pfr photoconversion was blocked at the Meta-Ra intermediate in PAiRFP1. Since Meta-Ra→Meta-Rc was the rate-determining step for the formation of Pfr, the Pr→Pfr photoconversion failed in PAiRFP1. Instead, Meta-Ra returned back to the initial Pr state. This explained why Pr state of PAiRFP1 could still emit NIR fluorescence even when the PHY domain was present. In spite of that, it was unclear about the key factors that blocked Meta-Ra→Meta-Rc→Pfr conversion in PAiRFP1. Some experimental methods such as femtosecond transient absorption spectroscopy, femtosecond Raman spectroscopy, and nanosecond flash photolysis are required to gain deeper understanding on the time scale of fs, ps, and ns to unravel this mechanism. Additionally, some theoretical calculations will also provide useful information regarding the structure and stability of both Meta-Ra and Meta-Rc.
Our present study suggested that after several rounds of mutations by Piatkevich et al., the mutants Q168L, V244F, and A480V will enhance the molecular stability and flexibility of Agp2. A series of each round of library of different mutants that are constructed by random mutagenesis are explained in Figure 10. We also found that structures of PAiRFP1 and PAiRFP2 are highly deviated than Agp2 from its native conformation, but PAiRFP1 exhibited more tight packing as compared to Agp2 and PAiRFP2 during 100 ns MD simulations. The hydrogen bonds between main chain and side chains of PAiRFP1 were less than Agp2 and PAiRFP2, which led to the exposure of the hydrophobic region of PAiRFP1 core to more solvent molecules. Furthermore, the eigenvalues and trace of covariance matrix were found higher for PAiRFP1 and PAiRFP2 as compared to Agp2. It means that different point mutations increase the random atomic fluctuations of these PA NIR FPs during their directed molecular evolution from Agp2. The detailed computational analysis provided useful understanding structure dynamics of Agp2, PAiRFP1, and PAiRFP2. We found that several mutations are stabilizing, destabilizing, favorable, and unfavorable. The predicted mutations such as Q168L, V244F, and A480V in Agp2 will enhance the molecular stability and flexibility of protein. It is possible to further design a better PA NIR FPs from Agp2 than PAiRFP1 and PAiRFP2. We carried out structural analysis and found that the average SASA values with respect to backbone for Agp2, PAiRFP1, and PAiRFP2 were found to be 248.60, 252.88, and 250.42 nm 2 , respectively. PAiRFP1 and PAiRFP2 had higher SASA than Agp2. Similar phenomenon was observed for the free energy of solvation. The hydrogen bond and secondary structural analysis revealed that the volume of PAiRFP1 and PAiRFP2 are slightly larger than Agp2. Principal component analysis showed that both PAiRFP1 and PAiRFP2 exhibit higher eigenvalues and the trace of covariance matrix. It suggested that there are more random atomic fluctuations in PAiRFP1 and PAiRFP2. All of these aspects might be the reason why PAiRFP1 and PAiRFP2 have slower dark recovery rate than Agp2. The dark recovery is related to dynamics or kinetics, where the rate is dependent on the activation energy between Pr and the intermediates. Further study is needed to unravel this phenomenon.
Although we have performed detailed comparative analysis of Agp2, PAiRFP1, and PAiRFP2, there are many possible research limitations. The experimental validations to support this computational prediction are further required. The results of computational analysis can have several variations when it is executed by different software using different force fields. Additionally, the theoretical prediction may have several variations when it comes to experimental level such as stabilizing and destabilizing effects of amino acids in the structures of Agp2, PAiRFP1, and PAiRFP2. The difference may arise due to different positions of amino acids in the structure of protein. We carried out structural analysis and found that the average SASA values with respect to backbone for Agp2, PAiRFP1, and PAiRFP2 were found to be 248.60, 252.88, and 250.42 nm 2 , respectively. PAiRFP1 and PAiRFP2 had higher SASA than Agp2. Similar phenomenon was observed for the free energy of solvation. The hydrogen bond and secondary structural analysis revealed that the volume of PAiRFP1 and PAiRFP2 are slightly larger than Agp2. Principal component analysis showed that both PAiRFP1 and PAiRFP2 exhibit higher eigenvalues and the trace of covariance matrix. It suggested that there are more random atomic fluctuations in PAiRFP1 and PAiRFP2. All of these aspects might be the reason why PAiRFP1 and PAiRFP2 have slower dark recovery rate than Agp2. The dark recovery is related to dynamics or kinetics, where the rate is dependent on the activation energy between Pr and the intermediates. Further study is needed to unravel this phenomenon.

Conclusions
Although we have performed detailed comparative analysis of Agp2, PAiRFP1, and PAiRFP2, there are many possible research limitations. The experimental validations to support this computational prediction are further required. The results of computational analysis can have several variations when it is executed by different software using different force fields. Additionally, the theoretical prediction may have several variations when it comes to experimental level such as stabilizing and destabilizing effects of amino acids in the structures of Agp2, PAiRFP1, and PAiRFP2. The difference may arise due to different positions of amino acids in the structure of protein.

Conclusions
In this study, a comprehensive molecular exploration and comparative studies of Agp2, PAiRFP1, and PAiRFP2 are carried out. The PAiRFP1 was expressed and purified, and the fluorescence intensity and time-dependent dynamic curve for PAiRFP1 was obtained. It was found that both PAiRFP1 and PAiRFP2 perform photoactivation behavior because Pr and Pfr states are coexisting in resting state, and Pfr state can be photoconverted into Pr state, which emits fluorescence upon absorption of far-red light. The computational findings suggested that the structure of PAiRFP1 and PAiRFP2 are more deviated and the hydrophobic cores that were buried in PAiRFP1 and PAiRFP2 were exposed to solvent molecules during the MD simulations. We found that some mutations at residues Q168, V244, and A480 of Agp2 are more useful as it enhances the molecular stability and flexibility. This study is helpful to gain understanding of structural changes during the evolution of these photoactivatable of NIR FPs.