A Duplication–Divergence Hypergraph Model for Protein Complex Data
Abstract
1. Introduction
2. The Duplication–Divergence Hypergraph Model
- V is a finite set of nodes, which in our context represent individual proteins;
- the hyperedge set is a collection of hyperedges, where each hyperedge is a non-empty subset of V, corresponding to a protein complex.
- Protein Duplication: A protein v is selected uniformly at random from the current node set. A new protein is created as a duplicate of v. This new protein initially inherits the complex membership of its parent:
- For every hyperedge that includes v, a new hyperedge is generated by replacing v with its duplicate.
- The duplicated hyperedges are collected into a separate set. The updated node set is the current node set together with the duplicate, and the hyperedge set (prior to divergence) is the union of the original hyperedges and the duplicated hyperedges.
- Hyperedge Divergence: To model functional divergence over time, we introduce stochastic loss of interactions:
- Each duplicated hyperedge is independently removed from the hypergraph with the model's divergence probability.
- Hyperedges in the original set remain unchanged during this step. This reflects the biological observation that duplicated proteins may not retain all their functions indefinitely. Mechanisms such as subfunctionalisation and neofunctionalisation can lead to the partitioning or reconfiguration of protein complex participation [36].
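The final updated hypergraph after this step consists of the updated node set together with the original hyperedges and those duplicated hyperedges that survive divergence.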
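The duplication and divergence steps above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ddh_step`, the parameter name `q` for the divergence probability, the integer node labels, and the representation of the hyperedge set as a Python set of frozensets are all choices made for this example.

```python
import random


def ddh_step(nodes, hyperedges, q, rng):
    """One duplication-divergence step on a hypergraph (illustrative sketch).

    nodes      : set of integer node labels (proteins); integers are an assumption
    hyperedges : set of frozensets of nodes (complexes)
    q          : assumed name for the divergence (loss) probability
    rng        : random.Random instance
    Returns the updated (nodes, hyperedges); the inputs are not modified.
    """
    # Duplication: pick the parent uniformly at random and create a fresh label.
    parent = rng.choice(sorted(nodes))
    duplicate = max(nodes) + 1

    # Copy every hyperedge containing the parent, swapping parent for duplicate.
    duplicated = {
        (e - {parent}) | {duplicate} for e in hyperedges if parent in e
    }

    # Divergence: each duplicated hyperedge is lost independently with prob. q.
    survivors = {e for e in duplicated if rng.random() >= q}

    # Original hyperedges are never removed in this step.
    return nodes | {duplicate}, hyperedges | survivors
```

Calling `ddh_step` repeatedly, starting from a small seed hypergraph and passing the returned pair back in, grows the hypergraph by one protein per step. A duplicate whose copied hyperedges are all lost simply remains as an isolated node in this sketch.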
3. Network Robustness and Attack Strategies
3.1. Attack Strategies
- Node selection: A protein is selected uniformly at random from the current node set. This selection emulates a perturbation event such as a targeted genetic knockout or deleterious mutation that impairs the functionality of a specific protein.
- Complex disruption: Each hyperedge that contains the selected protein is independently removed from the hypergraph with probability r. That is, a complex is disrupted with probability r if it contains the targeted protein.
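The two attack steps above can be illustrated with a short sketch. The function name `attack_step` and the set-of-frozensets representation are assumptions made for this example. The node set is left unchanged here because the description only mentions the removal of hyperedges; that reading is also an assumption.

```python
import random


def attack_step(nodes, hyperedges, r, rng):
    """Apply one attack (illustrative sketch).

    A protein is picked uniformly at random; every hyperedge containing it is
    then removed independently with probability r.
    Returns (target, surviving_hyperedges); the inputs are not modified.
    """
    # Node selection: uniform over the current node set.
    target = rng.choice(sorted(nodes))

    # Complex disruption: hyperedges without the target always survive;
    # hyperedges with the target survive with probability 1 - r.
    surviving = {
        e for e in hyperedges
        if target not in e or rng.random() >= r
    }
    return target, surviving
```

With r = 1 every complex containing the targeted protein is disrupted, and with r = 0 the hypergraph is unchanged, so intermediate values of r interpolate between no damage and full loss of the target's complexes.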
3.2. Quantifying Network Efficiency via Clique Projection
- Each hyperedge is represented as a node in the projected graph.
- An edge is drawn between two nodes if the corresponding hyperedges intersect, i.e., share at least one protein. In particular, since every hyperedge is non-empty and therefore intersects itself, the line graph has a self-loop at each of its nodes.
- The weight of each edge is defined as follows:
- Increased intersection reduces weight when the size of the union of the two hyperedges is fixed: Complexes that share more proteins exhibit some redundancy, which corresponds to smaller weights and perhaps a less efficient structure.
- Increased union increases weight when the size of the intersection of the two hyperedges is fixed: Hyperedges with large cardinalities have the potential to introduce larger weights. Biologically, this weighting assumes that interactions involving larger complexes have greater potential to influence network efficiency. This behaviour is motivated by [45], showing that a large protein complex is more likely to be able to provide the correct functional groups for catalysis, and by [46], showing that large protein complex interfaces have evolved to promote cotranslational assembly.
- Compositional diversity affects linkage: The use of both union and intersection terms ensures that distant and weakly overlapping complexes contribute more weight.
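The construction of the weighted line graph can be sketched as follows. The weight used here, the size of the union divided by the size of the intersection, is purely an assumption for illustration: it is one simple function that decreases with the intersection at fixed union and increases with the union at fixed intersection, and it should not be read as the weight defined in the paper. The names `assumed_weight` and `weighted_line_graph` are likewise hypothetical.

```python
from itertools import combinations_with_replacement


def assumed_weight(e1, e2):
    """Illustrative weight only (NOT taken from the source):
    size of the union divided by size of the intersection."""
    return len(e1 | e2) / len(e1 & e2)


def weighted_line_graph(hyperedges, weight=assumed_weight):
    """Return {(i, j): w} over index pairs i <= j of intersecting hyperedges.

    Each hyperedge becomes a node (identified by its index in the input);
    pairs with i == j are the self-loops of the line graph.
    """
    edges = list(hyperedges)
    graph = {}
    for i, j in combinations_with_replacement(range(len(edges)), 2):
        # An edge exists only if the two hyperedges share at least one node.
        if edges[i] & edges[j]:
            graph[(i, j)] = weight(edges[i], edges[j])
    return graph
```

Passing a different `weight` function leaves the rest of the construction unchanged, which makes it straightforward to substitute the actual definition.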
4. Evolution of the Number of Hyperedges in the DDH Model
1. If , then there exists an such that ;
2. If , then ;
3. If , then .
1. For each j, converges almost surely as in Theorem 2 with .
2. For the global process we have the following: where each is the almost sure limit of . In particular, all components with vanish in this scaling limit.
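As a complement to the asymptotic statements above, the expected one-step change in the number of hyperedges follows directly from the model description in Section 2. The notation below is assumed for illustration and may differ from the original: $H_t = (V_t, E_t)$ denotes the hypergraph at time $t$, $q$ the divergence probability, and $d_t(v)$ the number of hyperedges in $E_t$ that contain $v$. The parent $v$ is uniform on $V_t$, each of its $d_t(v)$ hyperedges yields a distinct new hyperedge containing the duplicate, and each new hyperedge survives independently with probability $1-q$.

```latex
\begin{align*}
\mathbb{E}\bigl[\,|E_{t+1}| - |E_t| \,\big|\, H_t\bigr]
  &= \frac{1}{|V_t|} \sum_{v \in V_t} (1-q)\, d_t(v) \\
  &= \frac{1-q}{|V_t|} \sum_{e \in E_t} |e| .
\end{align*}
```

The second equality is a double count: summing $d_t(v)$ over nodes counts each hyperedge once per member. Under these assumptions the expected growth in one step is therefore $(1-q)$ times the average number of complexes per protein.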
5. Simulation
5.1. Dataset
5.2. Empirical Validation
5.3. Parameter Tuning
6. Discussion
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Koyutürk, M.; Kim, Y.; Topkara, U.; Subramaniam, S.; Szpankowski, W.; Grama, A. Pairwise alignment of protein interaction networks. J. Comput. Biol. 2006, 13, 182–199.
- He, J.; Ford, H.C.; Carroll, J.; Douglas, C.; Gonzales, E.; Ding, S.; Fearnley, I.M.; Walker, J.E. Assembly of the membrane domain of ATP synthase in human mitochondria. Proc. Natl. Acad. Sci. USA 2018, 115, 2988–2993.
- Spirin, V.; Mirny, L.A. Protein complexes and functional modules in molecular networks. Proc. Natl. Acad. Sci. USA 2003, 100, 12123–12128.
- Juszkiewicz, S.; Hegde, R.S. Quality Control of Orphaned Proteins. Mol. Cell 2018, 71, 443–457.
- Pla-Prats, C.; Thomä, N.H. Quality control of protein complex assembly by the ubiquitin–proteasome system. Trends Cell Biol. 2022, 32, 696–706.
- Barabási, A.L.; Oltvai, Z.N. Network biology: Understanding the cell’s functional organization. Nat. Rev. Genet. 2004, 5, 101–113.
- Pereira-Leal, J.B.; Levy, E.D.; Teichmann, S.A. The origins and evolution of functional modules: Lessons from protein complexes. Philos. Trans. R. Soc. B Biol. Sci. 2006, 361, 507–517.
- Taylor, M.B.; Ehrenreich, I.M. Higher-order genetic interactions and their contribution to complex traits. Trends Genet. 2016, 31, 34–40.
- Vidal, M.; Cusick, M.E.; Barabási, A.L. Interactome Networks and Human Disease. Cell 2011, 144, 986–998.
- Marsh, J.A.; Teichmann, S.A. Structure, dynamics, assembly, and evolution of protein complexes. Annu. Rev. Biochem. 2015, 84, 551–575.
- Welch, G.R. On the Role of Organized Multienzyme Systems in Cellular Metabolism: A General Synthesis. In Progress in Biophysics and Molecular Biology; Elsevier: Amsterdam, The Netherlands, 1978; pp. 103–191.
- Wang, Z.; Wang, P.; Li, Y.; Peng, H.; Zhu, Y.; Mohandas, N.; Liu, J. Interplay between cofactors and transcription factors in hematopoiesis and hematological malignancies. Signal Transduct. Target. Ther. 2021, 6, 24.
- Benson, A.R.; Abebe, R.; Schaub, M.T.; Jadbabaie, A.; Kleinberg, J. Simplicial closure and higher-order link prediction. Proc. Natl. Acad. Sci. USA 2018, 115, E11221–E11230.
- Iacopini, I.; Petri, G.; Barrat, A.; Latora, V. Simplicial models of social contagion. Nat. Commun. 2019, 10, 2485.
- Zhang, Y.; Lucas, M.; Battiston, F. Higher-order interactions shape collective dynamics differently in hypergraphs and simplicial complexes. Nat. Commun. 2023, 14, 1605.
- Landry, N.W.; Young, J.G.; Eikmeier, N. The simpliciality of higher-order networks. EPJ Data Sci. 2024, 13, 17.
- Battiston, F.; Cencetti, G.; Iacopini, I.; Latora, V.; Lucas, M.; Patania, A.; Young, J.G.; Petri, G. Networks beyond pairwise interactions: Structure and dynamics. Phys. Rep. 2020, 874, 1–92.
- Berge, C. Hypergraphs: Combinatorics of Finite Sets; Elsevier: Amsterdam, The Netherlands, 1984; Volume 45.
- Benson, A.R.; Gleich, D.F.; Leskovec, J. Higher-order organization of complex networks. Science 2016, 353, 163–166.
- Di Gaetano, L.; Battiston, F.; Starnini, M. Percolation and topological properties of temporal higher-order networks. Phys. Rev. Lett. 2024, 132, 037401.
- Tudisco, F.; Higham, D.J. Node and edge nonlinear eigenvector centrality for hypergraphs. Commun. Phys. 2021, 4, 201.
- Lotito, Q.F.; Musciotto, F.; Montresor, A.; Battiston, F. Higher-order motif analysis in hypergraphs. Commun. Phys. 2022, 5, 79.
- Contisciani, M.; Battiston, F.; De Bacco, C. Inference of hyperedges and overlapping communities in hypergraphs. Nat. Commun. 2022, 13, 7229.
- Battiston, F.; Amico, E.; Barrat, A.; Bianconi, G.; Ferraz de Arruda, G.; Franceschiello, B.; Iacopini, I.; Kéfi, S.; Latora, V.; Moreno, Y.; et al. The physics of higher-order interactions in complex systems. Nat. Phys. 2021, 17, 1093–1098.
- Torres, L.; Blevins, A.S.; Bassett, D.S.; Eliassi-Rad, T. The Why, How, and When of Representations for Complex Systems. SIAM Rev. 2021, 63, 435–485.
- Klimm, F.; Deane, C.M.; Reinert, G. Hypergraphs for predicting essential genes using multiprotein complex data. J. Complex Netw. 2021, 9, cnaa028.
- Murgas, K.A.; Saucan, E.; Sandhu, R. Hypergraph geometry reflects higher-order dynamics in protein interaction networks. Sci. Rep. 2022, 12, 20879.
- Klamt, S.; Haus, U.U.; Theis, F. Hypergraphs and Cellular Networks. PLoS Comput. Biol. 2009, 5, e1000385.
- Franzese, N.; Groce, A.; Murali, T.M.; Ritz, A. Hypergraph-based connectivity measures for signaling pathway topologies. PLoS Comput. Biol. 2019, 15, e1007384.
- Estrada, E.; Rodríguez-Velázquez, J.A. Subgraph centrality and clustering in complex hyper-networks. Phys. A Stat. Mech. Appl. 2006, 364, 581–594.
- Feng, S.; Heath, E.; Jefferson, B.; Joslyn, C.; Kvinge, H.; Mitchell, H.D.; Praggastis, B.; Eisfeld, A.J.; Sims, A.C.; Thackray, L.B.; et al. Hypergraph models of biological networks to identify genes critical to pathogenic viral response. BMC Bioinform. 2021, 22, 287.
- Bianconi, G.; Dorogovtsev, S.N. Theory of percolation on hypergraphs. Phys. Rev. E 2024, 109, 014306.
- Chung, F.; Lu, L.; Dewey, G.; Galas, D. Duplication Models for Biological Networks. J. Comput. Biol. 2003, 10, 677–687.
- Ispolatov, I.; Krapivsky, P.L.; Yuryev, A. Duplication-divergence model of protein interaction network. Phys. Rev. E Stat. Nonlin. Soft. Matter. Phys. 2005, 71, 061911.
- Zhang, R.; Reinert, G. Simulating Weak Attacks in a New Duplication–Divergence Model with Node Loss. Entropy 2024, 26, 813.
- Birchler, J.A.; Yang, H. The multiple fates of gene duplications: Deletion, hypofunctionalization, subfunctionalization, neofunctionalization, dosage balance constraints, and neutral variation. Plant Cell 2022, 34, 2466–2474.
- Ágoston, V.; Csermely, P.; Pongor, S. Multiple weak hits confuse complex systems: A transcriptional regulatory network as an example. Phys. Rev. E 2005, 71, 051909.
- Sun, H.; Bianconi, G. Higher-order percolation processes on multiplex hypergraphs. Phys. Rev. E 2021, 104, 034306.
- Peng, H.; Qian, C.; Zhao, D.; Zhong, M.; Ling, X.; Wang, W. Disintegrate hypergraph networks by attacking hyperedge. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 4679–4685.
- Feng, R.; Ke, Q.; She, L.; Kong, X.; Liu, C.; Zhan, X.X. Hypergraph dismantling with spectral clustering. Commun. Nonlinear Sci. Numer. Simul. 2025, 105, 108975.
- Genetti, S.; Ribaga, E.; Cunegatti, E.; Lotito, Q.F.; Iacca, G. Influence maximization in hypergraphs using multi-objective evolutionary algorithms. In Proceedings of the International Conference on Parallel Problem Solving from Nature, Hagenberg, Austria, 14–18 September 2024.
- Peng, P.; Fan, T.; Lü, L. Network higher-order structure dismantling. Entropy 2024, 26, 248.
- Latora, V.; Marchiori, M. Efficient behavior of small-world networks. Phys. Rev. Lett. 2001, 87, 198701.
- Vasilyeva, E.; Romance, M.; Samoylenko, I.; Kovalenko, K.; Musatov, D.; Raigorodskii, A.M.; Boccaletti, S. Distances in Higher-Order Networks and the Metric Structure of Hypergraphs. Entropy 2023, 25, 923.
- Bergendahl, L.T.; Marsh, J.A. Functional determinants of protein assembly into homomeric complexes. Sci. Rep. 2017, 7, 4932.
- Badonyi, M.; Marsh, J.A. Large protein complex interfaces have evolved to promote cotranslational assembly. eLife 2022, 11, e79602.
- Balu, S.; Huget, S.; Medina Reyes, J.; Ragueneau, E.; Panneerselvam, K.; Fischer, S.; Claussen, E.; Kourtis, S.; Combe, C.; Meldal, B.; et al. Complex portal 2025: Predicted human complexes and enhanced visualisation tools for the comparison of orthologous and paralogous complexes. Nucleic Acids Res. 2025, 53, D644–D650.
- Mancastroppa, M.; Iacopini, I.; Petri, G.; Barrat, A. Hyper-cores promote localization and efficient seeding in higher-order processes. Nat. Commun. 2023, 14, 6223.
- Galimberti, E.; Barrat, A.; Bonchi, F.; Cattuto, C.; Gullo, F. Mining (maximal) span-cores from temporal networks. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Torino, Italy, 22–26 October 2018; pp. 107–116.
- Lotito, Q.F.; Montresor, A. Efficient algorithms to mine maximal span-trusses from temporal graphs. arXiv 2020, arXiv:2009.01928.
- Lo, T.; Reinert, G.; Zhang, R. Isolated vertices in two duplication-divergence models with edge deletion. arXiv 2025, arXiv:2501.11077.
- De Silva, E.; Stumpf, M.P. Complex networks and simple models in biology. J. R. Soc. Interface 2005, 2, 419–430.





© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, R.; Reinert, G. A Duplication–Divergence Hypergraph Model for Protein Complex Data. Complexities 2025, 1, 7. https://doi.org/10.3390/complexities1010007

