New Findings on LMO7 Transcripts, Proteins and Regulatory Regions in Human and Vertebrate Model Organisms and the Intracellular Distribution in Skeletal Muscle Cells

LMO7 is a multifunctional PDZ–LIM protein that can interact with different molecular partners and is found in several intracellular locations. The aim of this work was to shed light on LMO7 evolution, alternative transcripts, protein structure and gene regulation through multiple in silico analyses. We also explored the intracellular distribution of the LMO7 protein in chicken and zebrafish embryonic skeletal muscle cells by means of confocal fluorescence microscopy. Our results revealed a single LMO7 gene in mammals, sauropsids, Xenopus and in the holostean fish spotted gar while two lmo7 genes (lmo7a and lmo7b) were identified in teleost fishes. In addition, several different transcripts were predicted for LMO7 in human and in major vertebrate model organisms (mouse, chicken, Xenopus and zebrafish). Bioinformatics tools revealed several structural features of the LMO7 protein including intrinsically disordered regions. We found the LMO7 protein in multiple intracellular compartments in chicken and zebrafish skeletal muscle cells, such as membrane adhesion sites and the perinuclear region. Curiously, the LMO7 protein was detected within the nuclei of muscle cells in chicken but not in zebrafish. Our data showed that a conserved regulatory element may be related to muscle-specific LMO7 expression. Our findings uncover new and important information about LMO7 and open new challenges to understanding how the diverse regulation, structure and distribution of this protein are integrated into highly complex vertebrate cellular milieux, such as skeletal muscle cells.


Introduction
LIM domain only protein 7 (LMO7) is a large protein (1683 residues) that orchestrates many protein-protein interactions. It contains a LIM domain (a unique cysteine-rich zinc-binding domain), a calponin homology (CH) domain, a PDZ domain and an F-box (FBX) domain [1]. Because they combine distinct functional domains in one protein, the PDZ-LIM proteins have been related to wide-ranging and multicompartmental cell functions during development and homeostasis. For instance, these proteins are known to mediate signaling between the nucleus and the cytoplasm, sequester nuclear factors in the cytoplasm and interact with actin microfilaments [2]. In addition, the PDZ-LIM proteins facilitate the assembly of protein complexes, regulate gene transcription, control mitosis progression, influence the spindle assembly checkpoint and are involved in ciliary function [3][4][5][6][7].

Identification of LMO7 Orthologous Genes in Different Vertebrates
To identify the orthologs of the human LMO7 gene in species representative of different vertebrate groups, we started by performing a systematic search in the Ensembl and Gene NCBI databases. A single LMO7 gene was identified in the genome of placental and marsupial mammals (mouse and opossum), sauropsids (chicken, zebra finch, Chinese softshell turtle and anole lizards), amphibian (Xenopus tropicalis) and in the spotted gar, a holostean fish. In contrast, in teleost fishes (zebrafish and catfish), two lmo7 genes (lmo7a and lmo7b) were identified (Table S1).
To understand the evolutionary relationship between the LMO7 sequences identified, we performed a phylogenetic analysis based on the predicted amino acid sequences. The Branchiostoma floridae (cephalochordate) Lmo7 protein was used as an out-group. The UniProt or NCBI IDs of the proteins used to build the LMO7 phylogenetic tree are presented in Table S1. Our results revealed that the LMO7 sequences form two subgroups. One encompasses the tetrapod LMO7 and Lmo7a of holostean and teleost fishes while the other includes the lmo7b teleost protein (Figure 1). The common ancestry of the different LMO7 orthologs is further evidenced by the presence of syntenic gene sets around the LMO7 genes, as displayed in Table S2. Our data on placental mammals, chicken and Xenopus are in accordance with a previous study but differ regarding zebrafish, given that only a single lmo7 gene was described previously in this species [13].
The presence of two lmo7 genes in the genome of teleost fishes is expected given that an additional round of whole genome duplication occurred at the base of the teleost fish lineage [18,19]. Although additional analyses are required to determine which teleost species bear both lmo7 genes or whether there are further unidentified paralogs in specific teleost fishes, our results demonstrate for the first time that lmo7 underwent duplication in teleosts, generating lmo7a and lmo7b.

Characterization of the LMO7 Transcripts in Human and Vertebrate Model Organisms
Alternative splicing-derived transcripts of LMO7 have been described for human [20] and other placental mammals [2]. However, data on LMO7 variants generated either by alternative splicing or distinct starting sites for transcription or termination are still scarce. Therefore, we analyzed the information about the LMO7 transcripts available in the Ensembl genome browser for humans and the major vertebrate model organisms (mouse, chicken, Xenopus and zebrafish).
Taken together, our findings indicate that studies of expression at the mRNA and protein levels are required to validate the bioinformatic predictions about the LMO7 variant transcripts in human and in the main vertebrate model organisms. In addition, as multiple non-coding transcripts were predicted to be generated from the vertebrate LMO7 genes, functional studies to elucidate the role of such transcripts are required. Given that non-coding transcripts can regulate gene expression in different ways [22] and some can generate active micropeptides [23], non-coding LMO7 transcripts can potentially play important roles that have yet to be described.

Dissecting the Structural Features of the LMO7 Protein
To deepen our comparative analysis, we evaluated the structural features of the LMO7 proteins encoded by the transcripts shown in Figure 2. The human LMO7 (hLMO7; UniProt ID Q8WWI1) contains three well-folded domains, as predicted from the primary structure (The UniProt Consortium, 2021 [24]). Moreover, there is a domain of unknown function (DUF4757) found in two regions: residues 294-382 and residues 650-787, as reported by the PFAM database [25]. There is one three-dimensional structural model for the PDZ domain of LMO7 (PDB 2eaq, residues 1037-1126) (Figure 3A). This X-ray crystallographic structure shows five antiparallel β-strands resembling a β-barrel (also known as the β-finger) next to an α-helical segment. The fold shared by different types of PDZ domains accounts for their binding to redundant targets. PDZ domains are widespread across life, from bacteria to mammals, and recognize short linear motifs (SLiMs) in various proteins, contributing to the assembly of multicomponent complexes (reviewed in [26]). Close to the N-terminus, there is a calponin homology (CH) domain involved in the interaction with cytoskeletal proteins and signaling. The C-terminal LIM domain is composed of two zinc fingers that function in cytoskeleton reorganization. Collectively, all three domains of LMO7 are involved in protein-protein interactions. However, the predicted domains only cover 16% of the hLMO7 primary structure. Moreover, LMO7 shows nuclear localization, acting on transcriptional activation and differentiation, processes related to proteins containing long regions of intrinsic disorder [27], and none of its globular domains is related to direct nucleic acid binding. We therefore analyzed whether hLMO7 contains intrinsically disordered regions (IDRs), which are highly flexible regions that lack a fixed secondary and/or tertiary structure [28].
Instead, IDRs are best described as an ensemble of conformations, challenging the structure-function relationship since the plasticity of unordered regions enables control of several regulatory pathways [29]. Analysis by means of the charge-hydropathy (CH) plot [30] shows that the chemical composition of the LMO7 primary structure is similar to that of proteins containing long IDRs (Figure 3C). Prediction of disordered regions by means of a set of seven different algorithms revealed two long C-terminal disordered regions interspaced with the PDZ and LIM domains, respectively, and a short N-terminal disordered region (Figure 4). In addition to PONDR, IUPred, PrDOS and their derivatives, the D2P2 database [31] verifies IDRs by means of the combination with the Spritz algorithms [32] that have been trained on NMR and X-ray crystallography data as well as biophysical characterizations gathered in the DisProt database (available at disprot.org). The results from D2P2 show consensual regions of disorder (75% accordance among the predictors) as follows: residues 321-344 (where the nuclear localization signal is found), 753-929, 938-1042 and 1236-1603. Interestingly, the enrichment in intrinsic disorder is conserved along evolution (Figures 3C and 4). The LMO7 sequences from mouse (Mus musculus), chicken (Gallus gallus) and zebrafish (Danio rerio) show an enriched disorder content, with the longest disordered segment appearing in the C-terminus (comprising around 400 residues: 1200-1600 for human and mouse, 1100-1500 for chicken and 900-1300 for zebrafish Lmo7b) (Figure 4). To investigate whether the hLMO7 segments could fold into coiled coils, we used ncoils [33] and observed three segments with high propensity: residues 721-776, 1227-1277 and 1347-1384. Coiled coil structures become folded into α-helical segments upon intermolecular interactions and can form supercoils containing up to seven helices wrapped around each other.
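The consensus step described above (keeping only residues that at least 75% of the predictors call disordered) can be illustrated with a minimal sketch. The function name, the predictor names and the toy calls below are hypothetical and serve only as an illustration; this is not the D2P2 implementation, which may define agreement differently.

```python
from typing import Dict, List, Tuple


def consensus_regions(
    predictions: Dict[str, List[bool]],
    min_agreement: float = 0.75,
    min_length: int = 1,
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) segments in which at least
    `min_agreement` of the predictors call each residue disordered."""
    tracks = list(predictions.values())
    if not tracks:
        return []
    length = len(tracks[0])
    if any(len(track) != length for track in tracks):
        raise ValueError("all predictor tracks must have the same length")

    regions: List[Tuple[int, int]] = []
    start = None  # 0-based start of the segment currently being extended
    for pos in range(length):
        votes = sum(track[pos] for track in tracks)
        agreed = votes / len(tracks) >= min_agreement
        if agreed and start is None:
            start = pos
        elif not agreed and start is not None:
            if pos - start >= min_length:
                regions.append((start + 1, pos))
            start = None
    # Close a segment that runs to the end of the sequence.
    if start is not None and length - start >= min_length:
        regions.append((start + 1, length))
    return regions


# Toy data: four hypothetical predictors over a 10-residue sequence
# (1 = residue called disordered, 0 = residue called ordered).
raw_calls = {
    "predictor_a": [0, 0, 1, 1, 1, 1, 0, 0, 1, 1],
    "predictor_b": [0, 1, 1, 1, 1, 0, 0, 0, 1, 1],
    "predictor_c": [0, 0, 1, 1, 1, 1, 0, 0, 0, 1],
    "predictor_d": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1],
}
toy = {name: [bool(v) for v in calls] for name, calls in raw_calls.items()}
print(consensus_regions(toy))
```

Raising `min_length` discards short consensus stretches, which may be useful when only long IDRs are of interest.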
Since hLMO7 is involved in many protein interactions, coiled coils could orchestrate the assemblage of large protein clusters. The high content of intrinsic disorder in hLMO7 (PONDR-VLXT total score: 52.5%) together with the prediction of coiled coils, which also function as oligomerization domains, prompted us to verify whether hLMO7 presents potential regions that could drive liquid-liquid phase separation (LLPS). Two bioinformatic tools, catGRANULE [34] and PScore [35], predicted a short C-terminal region comprising about 50 residues, with a potential to drive homotypic LLPS (Figure 5). The catGRANULE algorithm predicts condensation based on the following characteristics: primary sequence composition, intrinsic disorder and nucleic acid-binding propensities [34]. According to catGRANULE, hLMO7 showed a strong total score for LLPS (score: 1.024; a score above zero indicates LLPS) (Figure 5). In contrast, the overall PScore, which was developed on the high frequency of planar sp2 pi-pi interactions from LLPS "driver" proteins, was only 3.66, below the minimum threshold of 4, which indicates LLPS [35] (Figure 5). Nonetheless, multiple heterotypic interactions with its molecular partners promoted by its three domains and the IDRs could drive the assemblage of condensates. Consequently, it is essential to assess whether hLMO7 can form biomolecular condensates in vivo.
Regarding other motifs encoded by the hLMO7 sequence, analysis by Phobius [36] predicts a transmembrane motif (TM; residues 32-50) at the extreme N-terminus (Figure 3A). This may contribute to the insertion in the plasma membrane, and perhaps also to insertion in the nuclear membrane, as LMO7 is found in the nuclear membrane during the differentiation of mouse myoblasts [17].
Interestingly, several LIM domain-containing proteins undergo nucleocytoplasmic shuttling under specific conditions. Indeed, LMO7 is a DNA-binding protein that enhances transcription of key genes related to myogenic differentiation [17]. Accordingly, Holaska et al. [8] suggest two nuclear export signals (NES; residues 118-127 and 650-659) in the N-terminus and a C-terminal nuclear localization signal (NLS; residues 1189-11,996) based on bioinformatic prediction tools. However, to the best of our knowledge, the evaluation tools used in their work were not stated. Hence, we also searched for NLS/NES annotated in the database of nuclear transport sequences based on the experimentally proven sequences [37] and found only one motif that may function as a nuclear localization sequence (NLS) in the N-terminus (328-LRKKP-333) for the human protein ( Figure 3B). It was reported that LMO7 N-terminus contains a predicted transactivation domain (TAD). Additionally, residues 888-1320 of recombinant hLMO7 seem to be involved in DNA binding, as evidenced by the electrophoresis mobility shift assay (EMSA) [17]. Using the 9aaTAD prediction tool [38], we found that the N-terminus contains three motifs and the C-terminus contains one motif (residues 931-939) with 92% identity to nine-amino-acid transactivation domains (9aaTAD). Interestingly, this propensity for a C-terminal TAD could explain the requirement of residues 888-1320 from hLMO7 to bind myogenic promoters in vitro. In addition, NLSs are found in LMO7 from all the species studied, apart from the b isoform from zebrafish. A perfect match for 9aaTADs was observed for LMO7 from zebrafish (isoform a), chicken and mouse ( Figure 3B). Based on a search through the PhosphoSitePlus database [39], we report the post-translational modifications (PTMs) that occur in hLMO7 ( Figure 3C). Most phosphorylated residues lie within the IDRs of hLMO7. 
We speculate that phosphorylation/dephosphorylation of serine 318, serine 322 and tyrosine 323 controls nuclear shuttling.
To identify differences in domain organization between the five protein-coding splicing variants of hLMO7 reported in UniProt (ID Q8WWI1), we performed primary structure alignment by Clustal Omega (Figures 6 and 7A). The alignment showed that isoform 5 does not possess the predicted TM motif, the CH domain or a short disordered region (residues 182 to 285, predicted by seven disorder algorithms) that precedes the NLS (Figures 6 and 7A). Proteins containing a CH domain have been implicated in actin and tubulin binding and are believed to connect the cytoskeleton to signaling pathways [40]. Thus, isoform 5 probably lacks the ability to interact with several proteins to transduce signaling via the CH domain.
Figure 6. Alignment of the hLMO7 isoforms by Clustal Omega [41]. The predicted domains, motifs and intrinsically disordered regions (IDR) reported in the Database of Disordered Protein Predictions (D2P2) [31] are marked by rectangles and the sequences indicated above. Isoform 5 lacks the putative transmembrane α-helix (TM, predicted by Phobius at https://phobius.sbc.su.se/) and the calponin homology domain (CH). Isoform 3 lacks residues 356-690. The LIM domain is missing in isoform 4. Isoforms 2 and 5 do not have the key zinc ion coordination motifs (dark blue rectangles). Cysteine and histidine residues that might also be involved in Zn(II) binding are marked in purple and blue, respectively. NLS, nuclear localization signal predicted by NLSdb (https://rostlab.org/services/nlsdb/).
Isoform 3 lacks residues 359 to 690, which are predicted to contain two main antiparallel α-helices and four short α-helices together with long disordered regions (Figures 6 and 7B, highlighted in green). Because the function of the region that is missing in isoform 3 is unknown, it is not possible to predict the functional outcome. Interestingly, all five hLMO7 isoforms share the main long disordered regions, the PDZ domain and the predicted NLS (Figures 6 and 7A). Analysis using the I-TASSER structural alignment program (TM-align) [42] showed that the closest structural analog to the hLMO7 LIM domain is the LIM domain from thyroid receptor interacting protein 6 (TRIP6; PDB 1X61). Superimposition of the 3D structural models from the LIM domain of hLMO7 (obtained by AlphaFold) and TRIP6 (solved by solution nuclear magnetic resonance spectroscopy) showed a root-mean-square deviation (RMSD) of 2.255 Å, indicating a similar fold despite a low level (27%) of sequence identity, as assessed by Clustal Omega. The LIM domain is absent in isoform 4 and is truncated in isoforms 2 and 5 (Figure 6), presumably impacting the protein-protein interactions which the LIM domain orchestrates. Specifically, isoforms 2 and 5 do not contain two key Zn(II) binding sites (Figure 7C, highlighted in blue).
Since proper folding requires zinc ion coordination, the absence of binding residues (Figure 6, dark blue rectangles) would abolish LIM-related functions such as recruitment of diverse proteins, including DNA-binding proteins [43].
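As an illustration of what a sequence identity value such as the 27% reported above expresses, the following minimal sketch computes percent identity from two rows of a pairwise alignment. It assumes that identity is counted over columns where neither sequence has a gap. Clustal Omega may use a different denominator, so values need not match exactly. The two short aligned strings are invented for the example and are not the hLMO7 or TRIP6 LIM sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the aligned (gap-free) columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Invented toy alignment (8 gap-free columns, 2 identical residues).
toy_a = "CAKCG-KEI"
toy_b = "CSVCNKPIR"
print(percent_identity(toy_a, toy_b))
```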
Additionally, we used AlphaFold and analysis of coevolutionary couplings to obtain structural insights on hLMO7. AlphaFold is a cutting-edge machine-learning approach that predicts protein three-dimensional structural models for the proteome of human and twenty other organisms. Despite this deep-learning method reporting a very low confidence level for the structural definition of IDRs, a significant correlation to disorder predictors has been demonstrated [44]. The predicted Local Distance Difference Test score (pLDDT, ranging from 0 to 100) is a per-residue estimate of how closely the AlphaFold model is expected to agree with an experimentally determined structure. Consistent with the analysis by the seven disorder algorithms, hLMO7 is also predicted by AlphaFold to contain multiple domains linked by long disordered regions (Figure 8).
Figure 8. Three-dimensional structure model prediction of hLMO7 indicates the presence of well-folded domains and many intrinsic disorder regions with long-range evolutionary couplings. The evolutionary couplings of overlapping regions of LMO7 were calculated by the EVcoupling server (A). The quality scores (Q) for the identification of evolutionary couplings are indicated for each segment and range from 0 (worst) to 10 (best). The pink-shaded squares highlight well-folded domains previously described (residues 61-215, 1037-1125 and 1606-1675) as well as the one identified here (residues 634-701). The 3D structure calculated by the AlphaFold server is indicated in (B). Each globular domain highlighted by EVcoupling is indicated in the 3D model. The full-length protein 3D model is colored by the fold prediction confidence score for each amino acid residue (bottom right legend).
The analysis of evolutionary couplings among the amino acid residues of hLMO7 by the EVcoupling server indicated several significant couplings consistent with globular domains (Figure 8A). Indeed, these match well with the fold predictions by the AlphaFold server and identify the calponin homology, PDZ and LIM domains (Figure 8B). However, it is worth noting that there are many significant evolutionary couplings within the sequence of LMO7 that are in regions with a low score for AlphaFold prediction confidence (less than 70). These regions are predicted to be mostly intrinsically disordered (Figure 4), and we speculate that these intrinsically disordered regions with significant long-range evolutionary couplings might undergo disorder-to-order transitions, depending on the solution composition of specific cellular contexts. It is also interesting that an unknown domain reported by PFAM (DUF4757) exhibits an abundance of evolutionary couplings among residues 634-701 (Figure 8). This domain is also not well-defined in the 3D model of hLMO7 and has intermediate values of the disorder score (mean disorder around 0.5).
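The comparison described above, between evolutionary couplings and AlphaFold confidence, can be sketched as a simple filter that keeps couplings whose two residues both fall below the pLDDT cutoff of 70. The data layout, the function name and the minimum sequence separation of six residues are assumptions made for this illustration; they are not taken from the EVcoupling or AlphaFold servers, and the toy values are invented.

```python
from typing import List, Tuple

# (residue_i, residue_j, coupling_score), residues numbered from 1
Coupling = Tuple[int, int, float]


def couplings_in_low_confidence(
    couplings: List[Coupling],
    plddt: List[float],
    plddt_cutoff: float = 70.0,
    min_separation: int = 6,
) -> List[Coupling]:
    """Keep couplings whose residues are both below the pLDDT cutoff and
    at least `min_separation` positions apart in the sequence."""
    selected = []
    for res_i, res_j, score in couplings:
        if abs(res_i - res_j) < min_separation:
            continue  # ignore short-range pairs
        if plddt[res_i - 1] < plddt_cutoff and plddt[res_j - 1] < plddt_cutoff:
            selected.append((res_i, res_j, score))
    return selected


# Invented toy data for a 12-residue sequence.
toy_plddt = [90, 88, 65, 60, 55, 50, 45, 52, 58, 66, 85, 92]
toy_couplings = [(3, 10, 0.8), (1, 12, 0.9), (4, 6, 0.7), (5, 11, 0.6)]
print(couplings_in_low_confidence(toy_couplings, toy_plddt))
```

Only the pair (3, 10) survives in this toy case: the other pairs are either short-range or involve a confidently modeled residue.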

Identification of a Putative Regulator of LMO7 Transcription in the Myogenic Context
Given that LMO7 has been described as having a role during skeletal muscle differentiation of different vertebrate species [8,10], we wondered whether the LMO7 ortholog genes would be regulated by an evolutionarily conserved element (e.g., a promoter or an enhancer). Therefore, we searched for evolutionarily conserved regions (ECRs) in the LMO7 loci of human, mouse, chicken, Xenopus tropicalis and zebrafish (lmo7a loci) using the ECR genome browser.
Our comparative analysis revealed an ECR present in the LMO7 loci of all species analyzed except for zebrafish. In humans, this conserved element (hLMO7 ECR) is found upstream of the first exon of the Lmo7-207 transcript (ENST00000465261.6; see Table S1 for details). Several potential transcription factor binding sites (TFBS) for proteins involved in skeletal myogenesis, such as PAX3, MYOD, MYOGENIN and MEF2, were identified in the human ECR (Figure 9). In addition, this ECR contains LEF1/TCF4 potential binding sites, indicating that this element may be responsive to WNT ligands, which are known to play multiple and pivotal roles in skeletal muscle development [45]. The MultiTF search for conserved TFBSs among human and other species showed that MEF2 and LEF1/TCF4 sites are evolutionarily conserved.

To further evaluate the regulatory potential of the human hLMO7 ECR, its sequence was used to perform a BLAT search against the human genome (assembly GRCh38/hg38) in the UCSC genome browser. Our analysis revealed that the hLMO7 ECR partially overlaps with a predicted LMO7 regulatory element of the GeneHancer promoter/enhancer catalog (ID GH13J075758), a database of human regulatory elements and their inferred target genes [46]. Of interest, the GH13J075758 element was also categorized as a super enhancer related to myogenic differentiation in dbSUPER (ID SE_37705), a database of super enhancers in the mouse and human genomes [47].
Overall, our findings indicated that a conserved regulatory element is found in the LMO7 loci of humans and other vertebrates, that may be involved in regulating gene transcription in the context of skeletal myogenesis. Functional studies are required to confirm the role of the ECR identified here as an LMO7 regulatory element.
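To make the idea of a TFBS search concrete, the sketch below scans a DNA sequence on both strands for simplified IUPAC consensus motifs: the E-box (CANNTG) bound by MYOD and MYOGENIN, a MEF2 site (YTAWWWWTAR) and a LEF1/TCF site (CTTTGWW). This is a deliberately simplified stand-in, not the method used here, since tools such as MultiTF typically rely on position weight matrices. The function names and the toy sequence are hypothetical, and no PAX3 motif is included.

```python
import re
from typing import Dict, List, Tuple

# IUPAC nucleotide codes translated to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

# Simplified textbook consensus motifs (an assumption of this sketch).
CONSENSUS = {
    "E-box": "CANNTG",
    "MEF2": "YTAWWWWTAR",
    "LEF1/TCF": "CTTTGWW",
}


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan_motifs(
    sequence: str, motifs: Dict[str, str] = CONSENSUS
) -> List[Tuple[str, int, str, str]]:
    """Return (motif, 1-based start on the + strand, strand, matched site).

    A site matching on both strands at the same coordinates (palindromic
    motifs such as the E-box) is reported once, on the + strand.
    """
    seq = sequence.upper()
    rc = reverse_complement(seq)
    found = {}
    for name, consensus in motifs.items():
        # The lookahead lets overlapping sites be reported as well.
        body = "".join(IUPAC[base] for base in consensus)
        regex = re.compile("(?=(" + body + "))")
        width = len(consensus)
        for match in regex.finditer(seq):
            found.setdefault((name, match.start() + 1), ("+", match.group(1)))
        for match in regex.finditer(rc):
            start = len(seq) - match.start() - width + 1
            found.setdefault((name, start), ("-", match.group(1)))
    hits = [(name, start, strand, site)
            for (name, start), (strand, site) in found.items()]
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Invented toy sequence containing one site of each kind.
toy_sequence = "GGCAGCTGAACTATTTTTAGCCCTTTGATGG"
for hit in scan_motifs(toy_sequence):
    print(hit)
```

Consensus matching of this kind yields many false positives in real genomic sequence, which is one reason conservation across species is used as an additional filter.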

Intracellular Distribution of the LMO7 Protein in Chicken and Zebrafish Embryonic Muscle Cells
Since the PDZ-LIM proteins have been associated with multicompartmental cell functions during development, we decided to explore the cellular distribution of the LMO7 protein during vertebrate skeletal muscle development. To achieve this objective, we analyzed the intracellular localization of LMO7 in two widely used vertebrate animal models for the study of myogenesis. Zebrafish embryos and primary cultures of chicken embryonic muscle cells were labeled for LMO7 and analyzed with a confocal laser microscope. First, we analyzed the localization of Lmo7 in zebrafish somites. Zebrafish embryos are particularly appropriate for studies on skeletal muscle cell development because of their transparency and external development, which allow access to embryos for easy and detailed visualization. Furthermore, in a single zebrafish embryo, it is possible to analyze different differentiation stages of somite progenitor muscle cells. Zebrafish somites form one after another from tissue at the tail end of the embryo so that somites near the tail of the fish are younger and somites near the head are older. Our results showed Lmo7 near the septa and the notochord, and in the cytoplasm of myogenic cells in the somites from the trunk region of prim 25 zebrafish embryos [48] (Figure 10A-C). In somites from the caudal region of prim 25 zebrafish embryos, Lmo7 was found in the cytoplasm and was particularly concentrated in the perinuclear region of myogenic cells ( Figure 10D-F). No labeling of Lmo7 was detected within the nuclei of progenitor skeletal muscle cells in zebrafish embryos (Figure 10). The presence of Lmo7 near the septa of zebrafish somites might be related to its role in skeletal muscle cell adhesion to the extracellular matrix (ECM), whereas Lmo7 perinuclear localization might be related to its role in intracellular signaling. 
Recently, the perinuclear region of eukaryotic cells has been described as a space that concentrates signaling proteins distributed in a 3D network of cytoskeletal filaments and organelles [49].
Next, we analyzed the distribution of LMO7 in chicken myogenic cell cultures under a fluorescence confocal microscope and found LMO7 within the nuclei of mononucleated myoblasts and in the cytoplasm (perinuclear region) of multinucleated myotubes (Figure 11A-P). These results agree with a previous report showing that upon myotube formation in the mouse C2C12 cell line, LMO7 shuttled from the nucleus to the cytoplasm [17]. Furthermore, our data are in accordance with the predicted function of the PDZ-LIM proteins regarding their ability to mediate signals between the nucleus and the cytoplasm [5] since LMO7 was found in the nucleus and in the perinuclear cloud of chicken muscle cells. The presence of LMO7 within the nuclei of chicken myoblasts reinforces the idea that LMO7 has a role in the regulation of gene expression in skeletal muscle cells [8,17]. Since beta-catenin is a major transcriptional regulator in muscle cells [50], we decided to investigate a possible crosstalk between the Wnt/beta-catenin pathway and the LMO7 signaling pathways during chicken myogenesis.
To test that, we treated chicken myogenic cells with two activators of the Wnt/beta-catenin pathway, BIO and Wnt3a, and analyzed possible alterations in the intracellular distribution of LMO7. We selected BIO and Wnt3a for these experiments since both molecules have been shown to be robust activators of the canonical Wnt/beta-catenin signaling pathway. BIO is a potent and selective pharmacological inhibitor of glycogen synthase kinase-3β (GSK3-β), and it is well-established that inhibition of GSK3-β allows the nuclear translocation of beta-catenin and the subsequent beta-catenin-dependent regulation of its target genes [51]. Different from BIO, Wnt3a is a member of the canonical Wnt glycoproteins that can bind to its receptor Frizzled (Fz) and coreceptor lipoprotein receptor-related protein (LRP5/6) at the plasma membrane of the target cells and activate the canonical Wnt/beta-catenin signaling pathway. Since these two activators of the Wnt/beta-catenin pathway (BIO and Wnt3a) differ in their mechanisms of action, testing their effects in chicken muscle cells could provide more robust information on the possible interplay between LMO7 and the Wnt/beta-catenin signaling pathways. Interestingly, both BIO and Wnt3a induced an increase in the nuclear labeling of LMO7, which was concentrated in specific compartments within the nuclei of muscle cells (Figure 11, insets M-P). Curiously, LMO7 labeling in fibroblasts was lower than in muscle cells in all the experimental conditions ( Figure 11A-P), pointing to a muscle-specific role of LMO7. Quantification of the LMO7 labeling showed an increase in the presence of LMO7 within the nuclei of all the cell phenotypes (myoblasts, myotubes and fibroblasts) after treatment with BIO and Wnt3a ( Figure 11Q). 
We also quantified the number of LMO7-positive nuclear aggregates in all the experimental conditions and found a significant increase in these aggregates in myoblasts, myotubes and fibroblasts after the activation of the Wnt/beta-catenin pathway (Figure 11R). Since chromatin is not randomly distributed within the interphase nuclei of eukaryotic cells [52], our data suggest that LMO7 could be concentrated at specific nuclear territories with active transcription. Further studies are necessary to identify the specific nuclear domain where LMO7 localizes after the activation of the Wnt signaling pathway in skeletal muscle cells.
Importantly, the LMO7 protein was detected within the nuclei of chicken myoblasts and in the cytoplasm of zebrafish somites (Figures 10 and 11). No labeling of LMO7 was detected within the nuclei of muscle cells in zebrafish embryos (Figure 10). The differences in the intracellular distribution of the LMO7 protein between zebrafish and chicken muscle cells may have different, not mutually exclusive explanations. (i) We analyzed chicken muscle cells grown in vitro, whereas the zebrafish embryos were analyzed in vivo. It is possible that the mechanical stress caused by the in vitro culture conditions induced the nuclear translocation of LMO7 and the subsequent activation of target genes. LMO7 has been reported to associate with focal adhesions (cell-ECM adhesions) through its interaction with p130Cas, a key signaling component of focal adhesions, and this association has been reported to allow muscle cells to withstand mechanical stress [9]. Importantly, focal adhesions are rarely seen in vivo, such as in zebrafish embryos, and therefore LMO7 may have a specific role in stress-related responses of muscle cells grown in vitro. (ii) As described above, our results revealed that the teleost Lmo7 genes/proteins are separated into lmo7a/Lmo7a and lmo7b/Lmo7b. We cannot exclude the possibility that the antibody against Lmo7 that we used in the immunofluorescence experiments with zebrafish embryos detected only one zebrafish Lmo7 protein isoform and that the other, undetected isoform could have a different intracellular localization (including muscle cell nuclei in zebrafish somites). More experiments are needed to explore this hypothesis.

A Bibliometric Glimpse of All the Published Data on LMO7
Finally, we performed an exploratory analysis of the data retrieved from the PubMed (https://pubmed.ncbi.nlm.nih.gov/) database. Using the descriptors "LMO7" OR "Lmo7" OR "LMO-7" OR "Lmo-7" OR "lmo7" OR "lmo-7" OR "Lim domain only protein 7", the search returned 76 articles published between 1998 and 2021 (as of 10 October 2021). First, we analyzed the number of LMO7 articles published per year and observed that the number of LMO7 publications has increased over the years, particularly after 2017, which highlights the growing relevance of LMO7 studies (Figure 12). Then, we analyzed the frequency of words from titles and abstracts of the articles using the VOSviewer software [53]. Figure 13 depicts a term map of co-occurrence relations between the scientific terms found in the titles and abstracts of the 76 LMO7 articles. VOSviewer has its own clustering technique [54], which is based on the citation relations between the clusters. The most frequent words are represented by colored nodes. Five different colors (red, blue, green, yellow and purple) represent five clusters of different scientific contexts for the 76 LMO7 articles (Figure 13). Red represents the central node, where most of the interactions occur, and which is related to studies of LMO7 in cardiac and skeletal muscles (myocardium, Emery-Dreifuss muscular dystrophy). Blue is the second most important node and is associated with the regulation of gene expression (exons, alternative splicing) by LMO7. Green represents studies of the role of LMO7 in cancer cells and organs (oncogene proteins, lung cancer). Yellow is related to the role of LMO7 in cell adhesion (adherens junctions, cadherin). Finally, purple shows the work developed in human genetics of LMO7 (genome-wide association study and single nucleotide polymorphism). The sizes of the nodes denote the citations; that is, the larger the size of the node, the greater the number of citations.
The larger nodes were "lim domain proteins", "humans", "animals", "mice" and "tumor cell line", representing the biological models used in LMO7 studies. Importantly, the large size of the nodes "transcription factors", "signal transduction" and "cell adhesion" highlights the versatile functions of LMO7 in the regulation of gene expression, signaling and adhesion processes. The nodes "microfilament proteins" and "cadherins" point to the participation of LMO7 in cadherin/actin-based intercellular adhesion. Taken together, these bibliometric analyses show that LMO7 studies are concentrated in transcriptional regulation, signaling and adhesion in muscles and cancer. Curiously, no node related to intrinsically disordered proteins was detected, reinforcing the novelty of our data.
Figure 13. Bibliometric analysis of the co-occurrence relations between scientific terms found in the LMO7 articles. A term map of co-occurrence relations between the scientific terms was created from the bibliometric data retrieved from the titles and abstracts of the published LMO7 articles.
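The search and the per-year count described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used for the analysis: the function names `build_query` and `articles_per_year` are hypothetical, and the list of publication years is placeholder data rather than the 76 retrieved records.

```python
from collections import Counter

# Descriptors as listed in the text.
DESCRIPTORS = [
    "LMO7", "Lmo7", "LMO-7", "Lmo-7", "lmo7", "lmo-7",
    "Lim domain only protein 7",
]


def build_query(descriptors):
    """Join quoted descriptors with OR, mirroring the search string in the text."""
    return " OR ".join(f'"{d}"' for d in descriptors)


def articles_per_year(years):
    """Count articles for each publication year, sorted by year."""
    return dict(sorted(Counter(years).items()))


# Hypothetical publication years, for illustration only.
toy_years = [1998, 2017, 2019, 2019, 2021, 2021, 2021]

query = build_query(DESCRIPTORS)
counts = articles_per_year(toy_years)
print(query)
print(counts)
```

With real records, the list of years would come from the exported PubMed results; the counting step itself would stay the same.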

Identification of the LMO7 Orthologous Genes
Orthologs of the human LMO7 gene were identified by textual searches in the NCBI Gene database (https://www.ncbi.nlm.nih.gov/gene/) and the Ensembl genome browser (https://www.ensembl.org/index.html). In addition to human, the species evaluated were mouse (Mus musculus), opossum (Monodelphis domestica), chicken (Gallus gallus), zebra finch (Taeniopygia guttata), anole lizard (Anolis carolinensis), Western clawed frog (Xenopus tropicalis), zebrafish (Danio rerio), channel catfish (Ictalurus punctatus) and spotted gar (Lepisosteus oculatus). The spotted gar was included in our analyses as a representative of holostean fishes, a group that separated from teleost fishes before the additional whole-genome duplication that occurred in the teleost lineage [18]. The genes surrounding LMO7 were also annotated, since chromosomal regions displaying conserved synteny are believed to share common ancestry.

Phylogenetic Relationships among the LMO7 Proteins
Multiple alignments of the predicted LMO7 protein amino acid sequences were conducted using ClustalW in the Molecular Evolutionary Genetics Analysis (MEGA) software, version 11 [55]. Phylogenetic trees were built using the neighbor-joining method of the same software. The robustness of the groupings was assessed using 1000 bootstrap replicates.
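The idea behind the bootstrap test can be illustrated with a simplified Python sketch. It does not reproduce MEGA or the neighbor-joining algorithm: as a stand-in, it resamples alignment columns with replacement and asks how often one sequence remains the single closest to another by the proportion of differing sites. The sequences and the function names (`p_distance`, `bootstrap_replicate`, `pairing_support`) are hypothetical and chosen only for illustration.

```python
import random


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def pairing_support(alignment, a, b, n_replicates=1000, seed=0):
    """Fraction of replicates in which b is the single closest sequence to a."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_replicates):
        rep = bootstrap_replicate(alignment, rng)
        d_ab = p_distance(rep[a], rep[b])
        others = [p_distance(rep[a], seq)
                  for name, seq in rep.items() if name not in (a, b)]
        if all(d_ab < d for d in others):
            hits += 1
    return hits / n_replicates


# Toy aligned sequences (hypothetical, not real LMO7 data).
toy_alignment = {
    "seq_A": "MKTAYIAKQR",
    "seq_B": "MKTAYIAKQR",
    "seq_C": "LRSGFLGRNK",
}
support = pairing_support(toy_alignment, "seq_A", "seq_B")
print(f"support for A+B: {support:.2f}")
```

In a real analysis, each replicate would be used to rebuild a full tree, and the support value of a grouping would be the percentage of replicate trees that contain it.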

LMO7 Transcripts in Human and Vertebrate Model Organisms
Information about the LMO7 transcripts annotated in human and major vertebrate model organisms (mouse, chicken, Xenopus tropicalis and zebrafish) was obtained from the Ensembl genome browser (https://www.ensembl.org/index.html). Transcript flags, which help to identify transcripts of higher quality, were used for selecting the main LMO7 transcripts for human and mouse. Given that transcript flags are not available for the other species analyzed, the longest transcript was chosen for comparisons. Exon-intron DNA sequences were downloaded from the Ensembl genome browser and used to create transcript graphics with the online resource Exon-Intron Graphic Maker (http://wormweb.org/exonintron).

Structural Features of LMO7 by Bioinformatics Tools
The intrinsically disordered regions of LMO7 were analyzed by means of the D2P2 database [31] and seven disorder predictors: PONDR-FIT [56], PONDR-VLXT [57], IUPred-long (long regions of intrinsic disorder) and IUPred-short (short regions of intrinsic disorder) [58], PONDR-VSL2 and PONDR-VL3 [59] and PrDOS [60]. The average disorder profile was obtained by calculating the mean of the disorder scores from the seven computational tools. Scores > 0.5 indicate amino acid residues in disordered regions, whereas scores from 0.2 to 0.5 indicate residues in flexible segments. The CatGRANULE [34] and PSCore algorithms were used to predict liquid-liquid phase separation. Coiled coil regions were predicted by ncoils [33]. The Phobius server was used to identify a transmembrane α-helical motif [36]. Nuclear export sequences (NES) and nuclear localization signals (NLS) were analyzed in [37]. Nine-amino-acid transactivation domains (TADs) were analyzed by means of the 9aaTAD prediction tool (available at https://www.med.muni.cz/9aaTAD/index.php) using the moderately stringent pattern, and the motifs selected in Figure 3B showed at least 92% match [38]. The 3D structure model prediction was obtained by means of AlphaFold [40]. Coevolution of amino acid residues was analyzed by EVcouplings (available at https://evcouplings.org/) using the default parameters [61].
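The averaging and thresholding steps for the disorder profile can be sketched as follows. This is a minimal illustration under stated assumptions: the predictor names and scores are toy placeholders, the function names are hypothetical, and the label "ordered" for scores below 0.2 is our own shorthand, since the text only defines the disordered and flexible ranges.

```python
from statistics import mean


def mean_disorder_profile(predictions):
    """Average per-residue disorder scores across predictors.

    `predictions` maps a predictor name to a list of per-residue scores
    (0 to 1); every list must cover the same residues.
    """
    lengths = {len(scores) for scores in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must score the same number of residues")
    return [mean(column) for column in zip(*predictions.values())]


def classify_residue(score):
    """Apply the thresholds stated in the text (> 0.5 and 0.2 to 0.5)."""
    if score > 0.5:
        return "disordered"
    if score >= 0.2:
        return "flexible"
    return "ordered"  # assumed label; not defined in the text


# Toy scores for a 4-residue peptide (hypothetical, not LMO7 output).
toy_predictions = {
    "predictor_A": [0.9, 0.4, 0.1, 0.8],
    "predictor_B": [0.7, 0.2, 0.1, 0.6],
}
profile = mean_disorder_profile(toy_predictions)
labels = [classify_residue(score) for score in profile]
print(labels)
```

With the seven predictors used here, `predictions` would simply hold seven score lists of equal length, one per tool.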

Identification of the LMO7 Putative Regulatory Elements
Comparative genomics is a useful tool to identify gene regulatory elements (e.g., promoters or enhancers) [62]. Therefore, we searched for evolutionarily conserved regions (ECRs) among the LMO7 loci of human, mouse, opossum, chicken, Xenopus and zebrafish using the ECR browser (http://ecrbrowser.dcode.org). Human LMO7 was chosen as the base for comparisons with the orthologous sequences. The default parameters of this browser were used in the analyses (ECR = minimum length of 100 bp and 70% sequence identity). The Mulan tool (http://mulan.dcode.org/) was used to access the MultiTF algorithm, which identifies conserved transcription factor binding sites [63]. The sequence of the human ECR identified was downloaded and used for a BLAT search in the UCSC genome browser (https://genome.ucsc.edu/), followed by an analysis using the Regulation tracks.
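To make the ECR thresholds concrete, the sketch below scans two pre-aligned sequences for windows of at least 100 bp with at least 70% identity. It is a deliberately simplified illustration and not the algorithm implemented in the ECR browser; the function names and the toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of aligned columns in which both sequences carry the same base."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def find_conserved_windows(a, b, min_length=100, min_identity=70.0):
    """Return the start positions of windows that meet both thresholds."""
    starts = []
    for start in range(0, len(a) - min_length + 1):
        window_a = a[start:start + min_length]
        window_b = b[start:start + min_length]
        if percent_identity(window_a, window_b) >= min_identity:
            starts.append(start)
    return starts


# Toy aligned sequences (hypothetical): 120 bp, the first 40 positions diverged.
base_seq = "ACGT" * 30
other_seq = "N" * 40 + base_seq[40:]

hits = find_conserved_windows(base_seq, other_seq)
print(hits)
```

Overlapping windows that pass the thresholds would normally be merged into a single conserved region; that step is omitted here for brevity.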

Bibliometric Analysis
For the bibliometric evaluation, we performed exploratory analyses of the data retrieved from the articles present in the PubMed database (https://pubmed.ncbi.nlm.nih.gov/). The query was performed on 10 October 2021 by using the following descriptors: "LMO7" OR "Lmo7" OR "LMO-7" OR "Lmo-7" OR "lmo7" OR "lmo-7" OR "Lim domain only protein 7". We found a total of 76 articles in a period that spanned from 1998 to 2021. We used the freely available software VOSviewer (https://www.vosviewer.com/) to analyze the frequency of words that appear in the titles and abstracts of the 76 articles. A term map of co-occurrence relations between scientific terms was created [53] using the following parameters: all keywords, full counting, minimum number of occurrences of a keyword as 3 (of the 649 keywords, 65 met the threshold), maximum length of circles of 100 and maximum size of lines of 1000.
chloride for 15 min and 3% BSA-PBS/T for 30 min and incubated with the primary antibody against Lmo7 (diluted 1:100 in a blocking solution) for 1 h at 37 °C. Then, the embryos and the cells were washed for 30 min with PBS/T and incubated for 2 h at 37 °C with Alexa Fluor-conjugated secondary antibody (diluted 1:200 in a blocking solution). The nuclei were labeled with 0.1 µg/mL of DAPI in 0.9% NaCl. The embryos and the cells were mounted on #1.5 24 × 60 mm glass coverslips (with spacers in the case of embryos) using Prolong Gold (Molecular Probes). The cells and the embryos were examined under a Leica TCS SPE fluorescence confocal laser scanning microscope (Leica, Wetzlar, Germany). The control experiments with only the secondary antibodies showed only faint background staining (data not shown). Quantification of the LMO7 fluorescence labeling was performed using the Fiji software [65], and figure panels were produced with the Adobe Photoshop software (Adobe Systems Inc., San José, CA, USA), where some of the original fluorescence grayscale images were pseudo-colored and superimposed.

Statistical Analysis
Statistical analysis was carried out using the GraphPad Prism software version 8. The results of three independent experiments are expressed as the means ± standard deviation. Statistical analysis of the data related to the quantification of nucleus versus cytoplasm localization of LMO7 and LMO7-positive nuclear aggregates was performed with two-way ANOVA followed by Tukey's post hoc test. p < 0.05 was considered statistically significant.

Institutional Review Board Statement: The use of chicken embryos and zebrafish embryos was approved by the Ethics Committee for Animal Care and Use in Scientific Research of the Federal University of Rio de Janeiro and received the following approval numbers: 069/19 for chicken embryos and 039/20 for zebrafish embryos.

Informed Consent Statement: Not applicable.
Data Availability Statement: The authors confirm that the data supporting the findings of this study are available within the article.

Conflicts of Interest: The authors declare that they have no competing interests.