Weighted Homology of Bi-Structures over Certain Discrete Valuation Rings

Abstract: An RNA bi-structure is a pair of RNA secondary structures that are considered as arc-diagrams. We present a novel weighted homology theory for RNA bi-structures, which is obtained through the intersections of loops. The weighted homology of the intersection complex $X$ features a new boundary operator and is formulated over a discrete valuation ring, $R$. We establish basic properties of the weighted complex and show how to deform it in order to eliminate any 3-simplices. We connect the simplicial homology, $H_i(X)$, and the weighted homology, $H_{i,R}(X)$, in two ways: first, via chain maps, and second, via the relative homology. We compute $H_{0,R}(X)$ by means of a recursive contraction procedure on a weighted spanning tree and $H_{1,R}(X)$ via an inflation map, by which the simplicial homology of the 1-skeleton allows us to determine the weighted homology $H_{1,R}(X)$. The homology module $H_{2,R}(X)$ is naturally obtained from $H_2(X)$ via chain maps. Furthermore, we show that all weighted homology modules $H_{i,R}(X)$ are trivial for $i > 2$. The invariant factors of our structure theorems, as well as the weighted Whitehead moves facilitating the removal of filled tetrahedra, are given a combinatorial interpretation. The weighted homology of bi-structures augments the simplicial counterpart by introducing novel torsion submodules and preserving the free submodules that appear in the simplicial homology.


Introduction
This paper is concerned with the weighted homology of the nerve complex of RNA bi-structures [1]. The weighted nerve complex of such bi-structures provides a framework for studying the computational complexity of algorithms for identifying sequences that are thermodynamically stable with respect to a given bi-structure. Such sequences hold the key for evolutionary optimization [2].
The weighted homology introduced here augments the simplicial homology by making the weights of the simplices an integral part of the theory. In our situation, these weights encode the cardinality of intersections of loops in a bi-structure. Along the lines of [3], the notion of a weighted complex is introduced. Differently from [3], the weights of simplices in this paper are taken from a discrete valuation ring that contains a copy of the integers as units and in which the weight of a simplex divides the weight of its faces.

Background
RNA sequences are linear, single-stranded sequences composed of the bases A, U, G, and C. In contrast to double-stranded DNA, RNA sequences fold into structures (see Figure 1). A particularly important class of RNA structures is that of RNA secondary structures [4]. Secondary structures can be represented as planar, non-crossing arc diagrams [5,6]. By means of its folded structure, RNA facilitates a plethora of important functional roles [7]. The boundary components of an RNA secondary structure, when viewed as a fatgraph [8], are called loops, and they determine the minimum free energy of the structure.
Figure 1 (right panel): the diagram representation of the structure on the set of vertices [19] with two additional vertices, 0 and 20, forming the rainbow r = (0, 20), the loop s = [3,4] ∪ [8,12] ∪ [16,17] (shaded), and the corresponding intervals (underlined). The arc x = (3,17) is the maximal arc of s, $\alpha_s = x$.
In [10], a bijection between linear trees and secondary structures was presented. In [13], the authors enumerated k-non-crossing RNA structures by employing a bijection between k-non-crossing partial matchings and walks in the interior of the Weyl-chamber C 0 based on a certain bijection between oscillating tableaux and matchings [19][20][21]. The geometrical realization of a certain complex of secondary structures was studied in [22].
A key determinant of a sequence-structure pair is its thermodynamic stability, which is measured via its free energy. Given a sequence σ, RNA structure prediction algorithms identify structures by assuming minimal or low free energy with respect to σ [23]. Given any RNA sequence-structure pair $(a, S)$, the free energy $\eta(a, S)$ is computed by adding all loop energy contributions [24,25], i.e., $\eta(a, S) = \sum_{s \in S} \eta(a, s)$. Here, $\eta(a, s)$ depends on the loop type of $s$ (hairpin, interior, multi-branch, etc.) and the particular nucleotides of the sequence $a$ that appear in the loop $s$. Details on loop decompositions, partition functions, and recursive constructions for the folded structure, derived via polynomial-time dynamic programming (DP), can be found in [9,23,26-28]. In [12], the notion of bi-secondary structures was introduced. Bi-secondary structures are central to identifying sequences that can realize two mutually exclusive minimum free energy (mfe) conformations. Such sequences naturally appear in the context of evolutionary transitions [29] and of RNA riboswitches, i.e., sequences that exhibit two distinct, stable configurations [30].

Motivation
This paper was motivated by the problem of computing thermodynamically stable sequences for two distinguished secondary structures, $S$ and $T$ [2]. The problem can be formalized as the computation of a sequence $a$ that minimizes an energy function $\eta(a, S, T)$ of the sequence and both structures. The sub-problems of the underlying DP routine are associated with sets of loops and recursively constructed by adding one loop at a time, where subsequently added nucleotides affect the energy calculation if they appear in multiple loops simultaneously; see Figure 2. This leads to the consideration of all such intersections as a simplicial complex, $X$, whose homology provides insights into the above optimization problem.
In [1,31], we computed the simplicial homology groups of X and proved that only the second homology group carries information. This group was identified to be free, and its generators were identified as the crossing components within the diagram representation. All known RNA riboswitches exhibit one such crossing component, while random pairs of structures contain multiple crossing components.
The simplicial homology of X in [1] only allows us to express the existence of loop intersections. However, the crucial determinant for the algorithmic complexity of the DP routine in [2] is the size of such intersections; see Figure 2. The recursion is over the set of loops, successively examining one loop at a time. Each such examination affects certain vertices that the currently examined loop shares with the unprocessed loops.
Figure 2. Top: a bi-structure decomposed into a distinguished sub-structure and its complement. The vertices contained in both (white) control the complexity of the dynamic programming (DP) routine in [2]. Bottom: the recursive step of the DP routine. Note that the recursion changes the set of white vertices by removing some and adding others.
It is thus natural to employ a homology theory that accounts for the intersection size. The notion of weighted homology, as originally formulated in [3], assigns a certain weight to each simplex and imposes specific divisibility conditions on these weights.
Our framework represents a variation on weighted homology as follows: We consider homology with coefficients in a discrete valuation ring, which, in fact, makes any divisibility constraints obsolete and allows us to deal with the cardinality of intersections in a natural way. The boundary operator introduced here is new, and so is our approach to connecting the weighted and simplicial homologies. More specifically, the construction of an inflation map relating $H_1(X^1)$ and $H_{1,R}(X^1)$ is a new technique. The computation of the weighted homology groups, $H_{0,R}(X)$, $H_{1,R}(X)$, and $H_{2,R}(X)$, is also novel. Specifically, the idea of obtaining a combinatorial interpretation for the invariant factors in the structure theorems for $H_{0,R}(X)$ and $H_{1,R}(X)$ via particular $X^1$-spanning trees is original. The chain maps between $C_n(X)$ and $C_{n,R}(X)$, which relate the $H_i(X)$ and $H_{i,R}(X)$ introduced here, are new and nontrivial. In particular, the injection $\mathrm{inj} : H_2(X) \to H_{2,R}(X)$ is a result of the fact that, in the expressions for the generators of $H_{2,R}(X)$, any triangle has a corresponding intersection size of one.

Organization
The paper is organized as follows: In Section 2, the simplicial loop complex of a bi-structure is introduced and the splicing of nucleotides is recalled [1]. In Section 3, the weighted complex of a bi-structure and a modified splicing procedure are discussed. In Section 4, the weighted homology and simplicial collapses [3,32] are introduced. In Section 5, a relative point of view is adopted by relating the simplicial and weighted homologies. Specifically, we introduce the inflation map, which maps classes of unweighted 1-cycles into corresponding weighted 1-cycles. In Section 6, we construct certain spanning trees that will be instrumental for proving our main results in Section 7. Here, we compute the weighted homology modules of bi-structures and provide combinatorial interpretations.
The torsion module emerging in the zeroth homology stems from a recursive contraction scheme on the distinguished weighted spanning trees of the complex. The torsion of the first homology module can be understood via particular weighted edges that reference the aforementioned spanning tree. We conclude the paper with the computation of the second homology module, with all others being trivial. We integrate our results and outline future work in Section 8.

The Simplicial Loop Complex of a Bi-Structure
An RNA diagram $S$ over the set $[n] := \{1, \dots, n\}$ is a vertex-labeled graph, whose vertices represent nucleotides, drawn on the horizontal axis and labeled with elements of the set $[n]$. An arc $\alpha = (i, j)$, $i < j$, is an ordered pair of vertices, which represents the base pairing between the $i$-th and $j$-th nucleotides. Each $S$-vertex can be paired with at most one other vertex, and the arc that connects them is drawn in the upper half-plane. An RNA diagram over the set $[n]$ is augmented with two additional vertices associated with positions $0$ and $n + 1$, together with the arc $(0, n + 1)$, which is called the rainbow, representing the closing base pair of the external loop of the RNA structure; see [23]. The set of vertices $\{i, i + 1, \dots, j - 1, j\}$ is called an interval, and is denoted by $[i, j]$. The set $[0, n + 1] := \{0, 1, \dots, n, n + 1\}$ is referred to as the backbone of the diagram. Two arcs, $(i, j)$ and $(p, q)$ with $i < p$, are called crossing if and only if $i < p < j < q$. $S$ is called a secondary structure if it does not contain any crossing arcs. A loop $s$ in a secondary structure $S$ is represented as the disjoint union of intervals $s = \dot\bigcup_{i=1}^{k} [a_i, b_i]$ such that $(a_1, b_k)$ and $(b_i, a_{i+1})$, for $1 \le i \le k - 1$, are arcs, and any other interval-vertices are unpaired. We denote by $\alpha_s$ the unique, maximal arc $(a_1, b_k)$ of the loop $s$, and we have a correspondence between loops and their maximal arcs; see Figure 1.
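The definitions of crossing arcs and loops can be illustrated by a minimal Python sketch. The function names `is_crossing` and `loop_of_arc` and the arc list `arcs` are hypothetical; the arcs are chosen only so that the loop with maximal arc (3, 17) has the intervals [3,4] ∪ [8,12] ∪ [16,17] quoted for Figure 1. The sketch assumes a non-crossing input.

```python
def is_crossing(arc1, arc2):
    """Arcs (i, j) and (p, q) with i < p cross iff i < p < j < q."""
    (i, j), (p, q) = sorted([arc1, arc2])  # order by left endpoint
    return i < p < j < q


def loop_of_arc(arcs, maximal_arc):
    """Intervals [a_1, b_1], ..., [a_k, b_k] of the loop with the given maximal arc."""
    right_end = {i: j for i, j in arcs}
    a, b = maximal_arc
    intervals, start, pos = [], a, a + 1
    while pos < b:
        if pos in right_end:
            # pos is the left endpoint of an inner arc: close the current
            # interval and jump to the right endpoint of that arc
            intervals.append((start, pos))
            start = pos = right_end[pos]
        pos += 1
    intervals.append((start, b))
    return intervals


# Hypothetical secondary structure on the backbone [0, 20].
arcs = [(0, 20), (3, 17), (4, 8), (12, 16)]
loop = loop_of_arc(arcs, (3, 17))
```

Here `loop` lists the intervals whose outer endpoints form the maximal arc (3, 17), while (4, 8) and (12, 16) are the inner arcs connecting consecutive intervals.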
S-arcs and loops can be endowed with a partial order: (k, l) ≺ S (i, j) ⇐⇒ i < k < l < j. Abusing the notation, for the two loops s 1 , s 2 ∈ S, we write s 1 ≺ S s 2 if α s 1 ≺ S α s 2 . Accordingly, (a) any unpaired vertex is contained in exactly one loop, (b) any non-rainbow arc appears in exactly two loops, and (c) the Hasse diagram of (S, ≺ S ) is a rooted tree Tr(S) with the rainbow arc as the root.
Given two secondary structures S and T, we refer to R = (S, T) as a bi-secondary structure (bi-structure). Abusing the notation, we let R = S∪T be the loop set of R. We represent a bi-structure R = (S, T) with the S-arcs in the upper and the T-arcs in the lower half-plane along the same horizontal backbone.
Let $A = \{A_0, A_1, \dots, A_m\}$ be a collection of finite sets. We call $B = \{A_{i_0}, \dots, A_{i_d}\} \subseteq A$ a $d$-simplex of $A$ if $\bigcap_{k=0}^{d} A_{i_k} \neq \emptyset$, and we denote the cardinality of this intersection by $\omega(B)$. Let $K_d(A)$ denote the set of $d$-simplices of $A$.
Let $R = (S, T)$ be a bi-structure with the loop complex $X = \dot\bigcup_{d=0}^{\infty} K_d(R)$. We fix a simplicial order, $\prec_R$, on the loops and specify a $d$-simplex, $\sigma \in X$, by the ordered tuple of its loops, with edges between two loops of the same secondary structure being pure edges, and any other edge being mixed. By construction, any 2-simplex $\Delta \in X$ contains exactly one pure and two mixed edges. Furthermore, $X$ cannot contain simplices of a dimension greater than or equal to four, as that would imply that three or more loops in the same secondary structure intersect non-trivially. Moreover, for any 3-simplex $\sigma = [s_0, s_1, t_0, t_1] \in X$, we have $\omega(\sigma) \in \{1, 2\}$.
In [31], we investigated the effect of splicing a nucleotide $p$ into two adjacent nucleotides $q_1, q_2$ such that the two arcs incident to $p$ were resolved into two non-crossing arcs. In the case $\omega(\sigma) = 2$, splicing does not change $X$, and if $\omega(\sigma) = 1$, splicing induces a specific alteration: the removal of $\sigma$ from $X$, as well as a distinguished free edge, $\tau^\sigma$, and exactly two free triangles, $\Delta^\sigma_1, \Delta^\sigma_2$, glued at $\tau^\sigma$. We refer to $(\tau^\sigma, \Delta^\sigma_1, \Delta^\sigma_2)$ as a σ-butterfly; see Figure 4.
Lemma 1. Let $R = (S, T)$ be a bi-structure with complex $X$; then, there exists a simple bi-structure $R' = (S', T')$ with a complex $X' < X$, where $X'$ is derived from $X$ by successively removing any 3-simplex, $\sigma$, together with an associated σ-butterfly.
Thus, successively splicing all nucleotides in $P$ induces a bi-structure $(S', T')$, together with an associated complex $X' < X$, which is obtained by removing all σ-butterflies and their corresponding σs from $X$.
Removing a σ-butterfly and its corresponding σ is equivalent to $X \searrow X'$, a simplicial collapse consisting of two elementary collapses [32,33]. The first removes $\sigma$ and $\Delta^\sigma_1$, and the second removes $\tau^\sigma$ and $\Delta^\sigma_2$.
Proposition 1. Let $X$ be the loop complex of a bi-structure $R = (S, T)$, and let $X' < X$ be an $X$-sub-complex obtained by removing all 3-simplices, $\sigma$, from $X$, together with corresponding σ-butterflies. Then, $X'$ is the complex of a bi-structure $R' = (S', T')$ that satisfies $(X')^2 = X'$ and, for any $k \ge 0$, $H_k(X') \cong H_k(X)$.
Note that after splicing, we have simplified $X$ to a complex of a bi-structure, $X'$, thus reducing its maximum simplex dimension and maintaining its homology. In the following, we generalize this to weighted complexes.
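The loop complex of this section can be sketched as the nerve of the loop set. In the following Python sketch, the loops are a hypothetical toy example (a bi-structure on the backbone {0, ..., 5} in which S has the arcs (0,5), (1,3) and T has the arcs (0,5), (2,4), so that the two inner arcs cross); the function name `nerve` is likewise an assumption of this illustration.

```python
from itertools import combinations

# Hypothetical loops, given as vertex sets: s0, s1 belong to S and t0, t1 to T.
loops = {
    "s0": {0, 1, 3, 4, 5}, "s1": {1, 2, 3},
    "t0": {0, 1, 2, 4, 5}, "t1": {2, 3, 4},
}


def nerve(loops):
    """Map each simplex (sorted tuple of loop names) to the size of its intersection."""
    simplices = {}
    names = sorted(loops)
    for size in range(1, len(names) + 1):
        for simplex in combinations(names, size):
            common = set.intersection(*(loops[name] for name in simplex))
            if common:  # a simplex exists iff the loops share a vertex
                simplices[simplex] = len(common)
    return simplices


X = nerve(loops)
triangles = {s: w for s, w in X.items() if len(s) == 3}
```

In this toy example, the complex consists of four vertices, six edges, and four triangles, each triangle having intersection size one; the four loops have no common nucleotide, so no 3-simplex appears.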

µ-Splicings and the Weighted Complex of a Bi-Structure
We begin by modifying the splicing introduced in Section 2 as follows: Instead of replacing a nucleotide p ∈ P by the two nucleotides q 1 < q 2 , we substitute it by three nucleotides, q 1 < a < q 2 , such that the arcs in R that share p as an endpoint now have endpoints q 1 and q 2 , which are non-crossing, and a is unpaired. We call this a µ-splicing of p.
Note that µ-splicing and splicing all $p \in P$ have exactly the same effect on $X$, namely, eventually removing any 3-simplex $\sigma$ and a corresponding σ-butterfly. In terms of bi-structures, from $(S, T)$, µ-splicing $P$ produces the bi-structure $(S', T')$ augmented by a set of distinguished, unpaired nucleotides $A = \{a_1, \dots, a_n\}$, which only appear in 0- and 1-simplices. By construction, the complex $X'$ of $(S', T')$ does not contain any 3-simplices. Furthermore, after µ-splicing $P$, any distinguished nucleotide produced is flanked by two distinct endpoints of non-crossing arcs. Thus, for any simplex $\sigma \in X'$, we have $h(\sigma) \ge 0$, where $h(\sigma)$ denotes the difference between the number of non-distinguished and distinguished nucleotides contained in $\sigma$. Note that $h$ is a natural generalization of $\omega$ in the context of complexes of bi-structures. Accordingly, we define a weighted complex associated with a bi-structure as follows.
First, we recall that a ring $R$ is called a discrete valuation ring if it is a principal ideal domain (PID) with exactly one non-zero maximal ideal. Any irreducible element is a generator of this ideal and is called a uniformizer for $R$; any two such elements differ only by multiplication with a unit.
Definition 1. Let $X$ be a complex of a bi-structure $(S, T)$ such that for any $\sigma \in X$, we have $h(\sigma) \ge 0$. Let $R \supset \mathbb{Z}$ be a discrete valuation ring with uniformizer $\pi$. We define the weight function $v : X \to R$, $v(\sigma) = \pi^{h(\sigma)}$, and call $(X, v)$ the weighted complex of $(S, T)$.
Lemma 2. Let $(S, T)$ be a bi-structure with no distinguished vertices and complex $X$. Let $(S', T')$ be derived from $(S, T)$ via µ-splicing $P$, and denote its complex by $X'$. Then, there exists an embedding of $X'$ into $X$ whose image is the $X$-subcomplex obtained by removing any 3-simplices and associated butterflies. Moreover, this embedding is an embedding of weighted complexes, i.e., it preserves the weights of simplices.
Proof. Successively µ-splicing $P$ generates the series of complexes $X_{|P|} = X > \dots > X_0 = X'$. By construction, given $X_i$ and a 3-simplex $\sigma$, the corresponding µ-splicing at $p$ replaces $p$ with the triple $q^i_1 < a_i < q^i_2$. Since $a_i$ is unpaired, $a_i$ is contained in exactly two loops, $\lambda_{i-1}$ and $\mu_{i-1}$, the only two loops that change size in terms of the number of nucleotides when passing to $X_{i-1}$. Here, the distinguished nucleotide contributes $-1$ to $h$. Any other $X_i$-simplex, apart from $\sigma$ and its σ-butterfly, which are removed, is not affected by the splicing and consequently retains its weight in $X_{i-1}$, whence the lemma.

Weighted Homology
Consider $\partial^v_n : C_{n,R}(X) \to C_{n-1,R}(X)$, where $C_{n,R}(X)$ denotes the free $R$-module generated by all $n$-simplices contained in $X$. Let $\partial^v_n$ be given by
$$\partial^v_n(\sigma) = \sum_{i=0}^{n} (-1)^i \, \frac{v(\sigma_i)}{v(\sigma)} \, \sigma_i,$$
where $\sigma_i$ denotes the $i$-th face of $\sigma$. The coefficient $v(\sigma_i)/v(\sigma)$ is, in view of $h(\sigma) \le h(\sigma_i)$, a non-negative power of the uniformizer $\pi$ and, hence, still an element of the ring $R$. As a result, the boundary operator $\partial^v_n$ is well defined. Since $\partial^v_{n-1} \circ \partial^v_n = 0$, the weighted homology modules $H_{n,R}(X) = \mathrm{Ker}(\partial^v_n)/\mathrm{Im}(\partial^v_{n+1})$ are well defined. For a weighted subcomplex $Y$ of $X$, we denote the relative homology groups by $H_{n,R}(X, Y)$. These are connected via the long exact sequence
$$\cdots \to H_{n,R}(Y) \to H_{n,R}(X) \to H_{n,R}(X, Y) \to H_{n-1,R}(Y) \to \cdots$$
Lemma 2 guarantees that, for any weighted complex $(X, v)$ of a bi-structure $(S, T)$, there exists a bi-structure $(S', T')$ with a weighted complex $(X', v)$ such that $(X', v) < (X, v)$ and where $(X')^2 = X'$. Thus, the relative homology groups $H_{k,R}(X, X')$ and their associated long sequence are well defined. However, a weighted complex does not have to be induced by a bi-structure. The previous definitions still hold as long as $\sigma' \subseteq \sigma \implies v(\sigma) \mid v(\sigma')$ for an arbitrary $(X, v)$-pair. Weighted complexes represent a general combinatorial-algebraic framework that enhances the simplicial homology for a wide range of applications. The next proposition extracts the algebraic core of µ-splicings; it represents a variation of [32], and it also appears in some form in [3].
Proposition 2. Let $(X, v)$ be a weighted complex and let $\tau, \sigma \in X$ such that: (a) $\tau$ is σ-free; (b) $v(\tau) = v(\sigma)$. Let $Y$ be the $X$-subcomplex obtained by removing any $\tau'$ with $\tau \subseteq \tau' \subseteq \sigma$ ($X \searrow Y$). Then, $H_{n,R}(X) \cong H_{n,R}(Y)$ for any $n \ge 0$.
Proof. Claim 1. It suffices to prove the proposition for a σ-free $\tau$ with the property $\dim(\tau) + 1 = \dim(\sigma)$. We may assume that $\sigma = [v_0, \dots, v_k]$ and $\tau = [v_0]$. The lattice of sub-simplices of $\sigma$ is isomorphic to a binary $k$-cube, where cover relations are induced by face inclusions.
We prove by induction on $k$ that a $k$-cube can be recursively decomposed into a sequence of pairs $\Sigma_k = (\sigma_i, \tau_i)_{1 \le i \le 2^{k-1}}$, where $\sigma_i$ is a maximal simplex with a free face $\tau_i$. The induction basis, $k = 1$, consists of a single pair. For the induction step, note that a $k$-cube decomposes into two disjoint copies of a $(k - 1)$-cube, depending on whether the last coordinate is zero or one. By the induction hypothesis, for a $(k - 1)$-cube, there exists a sequence, $\Sigma_{k-1} = (\sigma_i, \tau_i)_{1 \le i \le 2^{k-2}}$, with the desired properties; appending one or zero as the last coordinate yields two mappings from the $(k - 1)$-cube into the $k$-cube. By construction, these form a family of disjoint-type embeddings, i.e., their images are injective copies of the $(k - 1)$-cube that are vertex disjoint in the $k$-cube. Accordingly, we obtain $\Sigma^1_{k-1}$ and $\Sigma^0_{k-1}$. As the two sub-cubes are disjoint, we can immediately construct the desired sequence for the $k$-cube, $\Sigma_k = \Sigma^1_{k-1} \Sigma^0_{k-1}$; the claim follows.
Claim 2. The relative homology $H_{n,R}(X, Y)$ vanishes for any $n$. Note that only $C_{k,R}(X, Y)$ and $C_{k-1,R}(X, Y)$ are nontrivial, so $H_{n,R}(X, Y) = 0$ for any $n \neq k, k - 1$. For $n = k, k - 1$, the relative boundary map sends $\sigma$ to $\pm \frac{v(\tau)}{v(\sigma)} \tau = \pm \tau$ by (b), which is an isomorphism, and Claim 2 follows.
The proposition is implied in view of Claim 2 and the long exact sequence of the pair $(X, Y)$.
Applied to any 3-simplex $\sigma \in X$ and its corresponding σ-butterfly, Proposition 2 implies the following.
Corollary 1. Let $(X, v)$ be the complex of the bi-structure $(S, T)$, and let $(X', v)$ be the complex of $(S', T')$ that is obtained by completely µ-splicing $(S, T)$. Then, $H_{k,R}(X) \cong H_{k,R}(X')$ for any $k \ge 0$.
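The weighted boundary operator of this section can be explored with a small Python sketch. It assumes that the operator has the form $\partial^v_n(\sigma) = \sum_i (-1)^i \frac{v(\sigma_i)}{v(\sigma)} \sigma_i$ with $v(\sigma) = \pi^{h(\sigma)}$ and, for a complex without distinguished nucleotides, takes $h$ to be the intersection size. The loops are a hypothetical toy example, and a chain is encoded as a dictionary mapping `(simplex, k)` to an integer `c`, standing for the term $c \cdot \pi^k \cdot$ simplex; this encoding is a choice of the sketch, not of the source.

```python
from itertools import combinations

# Hypothetical loops of a toy bi-structure, given as vertex sets.
loops = {
    "s0": {0, 1, 3, 4, 5}, "s1": {1, 2, 3},
    "t0": {0, 1, 2, 4, 5}, "t1": {2, 3, 4},
}


def h(simplex):
    """Intersection size of the loops of a simplex (no distinguished nucleotides here)."""
    return len(set.intersection(*(loops[name] for name in simplex)))


def weighted_boundary(chain):
    """Apply the assumed weighted boundary operator to a chain."""
    result = {}
    for (simplex, k), c in chain.items():
        if len(simplex) == 1:
            continue  # 0-simplices have zero boundary
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]
            # v(face) / v(simplex) = pi ** (h(face) - h(simplex))
            key = (face, k + h(face) - h(simplex))
            result[key] = result.get(key, 0) + (-1) ** i * c
    return {key: c for key, c in result.items() if c != 0}


triangles = [s for s in combinations(sorted(loops), 3) if h(s) > 0]
boundary = weighted_boundary({(("s0", "s1", "t0"), 0): 1})
```

Applying `weighted_boundary` twice to any triangle of the toy complex returns the empty chain, which illustrates $\partial^v_{n-1} \circ \partial^v_n = 0$ in this example.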

The Inflation Map
In view of Corollary 1, we can assume that $X$ is the complex of a bi-structure with the property $X^2 = X$, where $X^k$ denotes the simplicial complex induced by all $k$-simplices in $X$ [34]. Clearly, the natural embedding $X^1 \to X$ is an embedding of weighted complexes, and we have the exact sequence
$$0 \to C_{n,R}(X^1) \to C_{n,R}(X) \to C_{n,R}(X, X^1) \to 0,$$
producing the long exact sequence of weighted homology modules. We now adopt a relative point of view by relating the simplicial homology $H_k(X)$ to the weighted homology $H_{k,R}(X)$. To this end, we consider the following two segments of the long homology sequences:
$$H_2(X) \to H_2(X, X^1) \xrightarrow{\hat\partial_2} H_1(X^1) \to H_1(X) = 0, \qquad (1)$$
$$H_{2,R}(X) \to H_{2,R}(X, X^1) \xrightarrow{\hat\partial^v_2} H_{1,R}(X^1) \to H_{1,R}(X),$$
where the surjectivity of $\hat\partial_2$ is given by $H_1(X) = 0$ [1]. While these sequences belong to different categories, the first two vertical maps between them are natural injections and relate simplicial and weighted homology modules.
Indeed, the first vertical injection is induced by chain maps of the form $\theta_n : C_n(X) \to C_{n,R}(X)$, $\theta_n(\sum_i n_i \sigma_i) = \sum_i n_i v(\sigma_i) \sigma_i$, which commute with the boundary operators, i.e., $\partial^v_n \circ \theta_n = \theta_{n-1} \circ \partial_n$. The second vertical injection is given by the fact that, since $\mathbb{Z} \subset R$, $C_2(X)$ is a $\mathbb{Z}$-submodule of $C_{2,R}(X)$ and, by construction, $H_2(X, X^1) = C_2(X)$ and $H_{2,R}(X, X^1) = C_{2,R}(X)$.
This motivates us to ask if there exists a $\mathbb{Z}$-linear map $H_1(X^1) \to H_{1,R}(X^1)$ that makes the above diagram commutative. It turns out that such a map exists and will prove useful for understanding $H_{1,R}(X)$.
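The compatibility of the chain maps $\theta_n$ with the two boundary operators can be checked on a single simplex. The following short derivation is a sketch that assumes the weighted boundary operator has the form $\partial^v_n(\sigma) = \sum_i (-1)^i \frac{v(\sigma_i)}{v(\sigma)} \sigma_i$, where $\sigma_i$ denotes the $i$-th face of $\sigma$, and that $\partial_n$ is the usual simplicial boundary.

```latex
\begin{align*}
\partial^v_n\bigl(\theta_n(\sigma)\bigr)
  &= v(\sigma)\,\partial^v_n(\sigma)
   = v(\sigma)\sum_{i=0}^{n}(-1)^i\,\frac{v(\sigma_i)}{v(\sigma)}\,\sigma_i \\
  &= \sum_{i=0}^{n}(-1)^i\,v(\sigma_i)\,\sigma_i
   = \theta_{n-1}\Bigl(\sum_{i=0}^{n}(-1)^i\,\sigma_i\Bigr)
   = \theta_{n-1}\bigl(\partial_n(\sigma)\bigr).
\end{align*}
```

The weight $v(\sigma)$ introduced by $\theta_n$ cancels the denominator of the weighted boundary, which is why each face $\sigma_i$ ends up with exactly the coefficient $v(\sigma_i)$ that $\theta_{n-1}$ assigns to it.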

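Lemma 3. The mapping
$$\mathrm{infl} : H_1(X^1) \to H_{1,R}(X^1), \qquad \mathrm{infl}(c) = \hat\partial^v_2\Bigl(\sum_i \Delta_i\Bigr),$$
where $c = \hat\partial_2(\sum_i \Delta_i)$, is well defined and $\mathbb{Z}$-linear, and it makes the diagram formed by the two sequences above commutative.
Proof. By construction, it suffices to show that infl is well defined. Observe that, in view of sequence (1), $\hat\partial_2$ is surjective. Hence, any $c \in H_1(X^1)$ can be expressed as $c = \sum_i \hat\partial_2(\Delta_i)$, where $\sum_i \Delta_i$ is unique modulo elements of $\mathrm{Ker}(\hat\partial_2)$. Secondly, in [31], we showed that $H_2(X)$ is freely generated by the elements $\nabla_\beta = \sum_j \Delta_j$, where $\Delta_j$ is contained in the $(S, T)$-crossing component $\beta$ and $v(\Delta_j) = 1$. Since all 2-simplices of $\nabla_\beta$ carry the same weight, $\hat\partial^v_2(\nabla_\beta) = 0$, so infl does not depend on the choice of $\sum_i \Delta_i$.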
We next employ $\mathrm{infl} : H_1(X^1) \to H_{1,R}(X^1)$ to obtain information about $H_{1,R}(X^1)$. $X^1$ is connected, and for a spanning tree $T_{X^1}$, the free generators of $H_1(X^1)$ are in bijection with the mixed, cycle-closing $X^1$-edges $\kappa$ for $T_{X^1}$. A generator $c_\kappa \in H_1(X^1)$ is a sum of edges $u_j \in T_{X^1}$ and the closing edge $\kappa$, with each such edge appearing exactly once. We fix $T_{X^1}$ as follows: Consider the trees $T_S$, $T_T$ of the sub-complexes for $S$ and $T$, respectively, and let $T_{X^1}$ be the tree obtained by connecting the $T_S$- and $T_T$-roots with the mixed edge $\omega$, corresponding to the two rainbow loops.

Proof.
With $T_{X^1}$ as above, consider a generator $c_\kappa = \sum_j u_j \in H_1(X^1)$ where, for $1 < j < n$, the $u_j$ are pure S- or T-edges, $u_1 = \omega$, and $u_n = \kappa$. Since $\hat\partial_2$ is surjective, we can also write $c_\kappa = \hat\partial_2(\sum_j \Delta_j)$. This involves additional mutually canceling edges, $k$, all of which are mixed and appear as faces of pairs of 2-simplices, $(\Delta_k, \Delta'_k)$. Without loss of generality (w.l.o.g.), we can traverse the cycle $c_\kappa$ beginning at $\omega$ via the pure S-edges until we arrive at $\kappa$, after which we return back to $\omega$ via the pure T-edges.
For each such pair, the weights $v(\Delta_k)$ and $v(\Delta'_k)$ either coincide or differ, and we accordingly categorize the $k$ edges into $k_\alpha$- and $k_\beta$-edges, respectively. As a result, $\mathrm{infl}(c_\kappa)$ is an $R$-linear combination of the edges $u_j$ and the $k_\beta$-edges, where $u_j \in \hat\partial_2(\Delta_j)$. Note that only $\kappa$ and $\omega$ are mixed $c_\kappa$-edges, and so $\{u_j\} \cap \{k_\beta\} = \emptyset$. Any set $M$ of $c_\kappa$-cycles induces a set of distinct mixed faces $\omega, \kappa_1, \dots, \kappa_h$. In the two sub-trees of $S$ and $T$ formed by the respective pure edges of the $c_\kappa$-paths, these mixed faces connect an S-leaf with a T-minimal vertex or a T-leaf with an S-minimal vertex. By construction, these edges are distinct from any of the $k_\beta$ produced by any $M$-cycle, and the coefficients of any of these $c_{\kappa_i}$, $1 \le i \le h$, in a linear combination $\sum_{c_\kappa \in M} r_\kappa \cdot \mathrm{infl}(c_\kappa) = 0$ are zero. Iterating this argument, we arrive at $r_\kappa = 0$ for all $c_\kappa \in M$. Accordingly, $\{\mathrm{infl}(c_\kappa) \mid \kappa\}$ freely generates $\langle \mathrm{infl}(c_\kappa) \mid \kappa \rangle_R$.

The Rank of H 1,R (X 1 )
Next, we compute the rank of H 1,R (X 1 ) by constructing a basis. Note that an element in H 1 (X 1 ) is a sum of cycles, and it is natural to ask what the analogue of a cycle in H 1,R (X 1 ) is.
Let c = ∑ i e i ∈ H 1 (X 1 ) be minimal in terms of the number of edges it contains. A straightforward computation shows that there exist r i ∈ R with ∂ v 1 (∑ i r i · e i ) = 0, where the multi-set (r i ) i is unique up to a scalar factor. This shows that, except for a scalar multiple, minimal H 1 (X 1 ) cycles determine the distinguished H 1,R (X 1 ) cycles, and we shall call a cycle c = ∑ i r i e i ∈ H 1,R (X 1 ) elementary if gcd({r i }) = 1 and ∑ i e i ∈ H 1 (X 1 ) is a minimal cycle. An elementary cycle is precisely the H 1,R (X 1 )-analogue of a minimal H 1 (X 1 )-cycle.
In view of $X^1$ being connected, a basis of $H_1(X^1)$ can be composed of all minimal cycles derived from a fixed $X^1$-spanning tree $T$: Any $X^1$ non-tree edge added to this tree results in a unique minimal cycle in the basis that is labeled by the edge. One key observation here is that $\{\partial_1(e) \mid e \in T\}$ generates $\partial_1(C_1(X^1))$ for an arbitrary spanning tree $T$ of $X^1$.
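The spanning-tree construction of a cycle basis can be sketched in Python as follows. The function name `fundamental_cycles` and the example graph (the complete graph on four hypothetical loops) are assumptions of this illustration; the sketch uses a breadth-first spanning tree and ignores orientations and weights.

```python
from collections import deque


def fundamental_cycles(vertices, edges):
    """Return a BFS spanning tree and, for each non-tree edge, the cycle it closes."""
    adj = {u: [] for u in vertices}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    root = vertices[0]
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    tree = {frozenset((w, p)) for w, p in parent.items() if p is not None}

    def path_to_root(u):
        path = set()
        while parent[u] is not None:
            path.add(frozenset((u, parent[u])))
            u = parent[u]
        return path

    cycles = {}
    for a, b in edges:
        e = frozenset((a, b))
        if e not in tree:
            # the symmetric difference drops the shared part of the two root paths
            cycles[e] = (path_to_root(a) ^ path_to_root(b)) | {e}
    return tree, cycles


vertices = ["s0", "s1", "t0", "t1"]
edges = [("s0", "s1"), ("s0", "t0"), ("s0", "t1"),
         ("s1", "t0"), ("s1", "t1"), ("t0", "t1")]
tree, cycles = fundamental_cycles(vertices, edges)
```

For a connected graph, the number of cycles produced equals the number of edges minus the number of vertices plus one, here 6 − 4 + 1 = 3.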
To derive a basis of $H_{1,R}(X^1)$ in an analogous fashion, an arbitrary choice of an $X^1$-spanning tree is no longer sufficient because of the weights involved. However, in the following, we construct a distinguished tree $T^\#_{X^1}$ that is particularly well suited to the understanding of the embedding of $H_1(X^1)$ into $H_{1,R}(X^1)$.

Lemma 5. There exists an $X^1$-spanning tree, $T^\#_{X^1}$, such that the following sequence is exact:
$$0 \to \langle T^\#_{X^1} \rangle_R \xrightarrow{\partial^v_1} C_{0,R}(X) \to H_{0,R}(X) \to 0.$$
Proof. To this end, we fix an arbitrary $X^1$-spanning tree $T_0$ and let $M = X^1 \setminus T_0$. We examine $M$-edges one by one, yielding a chain of processed edges, $\emptyset = M_0 \subset \dots \subset M_{k-1} \subset M_k$, and a sequence of trees, $T_0, \dots, T_{k-1}, T_k$. We claim that these have the property $\partial^v_1(\langle T_k \rangle_R) = \partial^v_1(\langle M_k \cup T_0 \rangle_R)$. The above holds trivially for $k = 0$, and we assume that $M_{k-1}$ and $T_{k-1}$ are constructed. Let $e_k \in M \setminus M_{k-1}$ and set $M_k = M_{k-1} \cup \{e_k\}$. Then, there exist $r_e \in R$, not all zero, such that $\sum_{e \in \{e_k\} \cup T_{k-1}} r_e \partial^v_1(e) = 0$. Letting $r = \gcd\{r_e\}$ and $r_e = r r'_e$ leads to $\sum_{e \in \{e_k\} \cup T_{k-1}} r'_e \partial^v_1(e) = 0$, where, by construction, at least one edge has a coefficient of one. In case $r'_{e_k} = 1$, set $T_k = T_{k-1}$. Otherwise, remove a $T_{k-1}$-edge with coefficient one, and add $e_k$ to $T_{k-1}$, obtaining $T_k$. Note then that $\partial^v_1(\langle T_k \rangle_R) = \partial^v_1(\langle M_k \cup T_0 \rangle_R)$. By induction, when the process terminates, it yields the desired $T^\#_{X^1}$.
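The edge-exchange procedure in the proof of Lemma 5 can be sketched in Python. The sketch rests on an assumption that is not stated in the source: if $\partial^v_1(e) = \frac{v(u')}{v(e)} u' - \frac{v(u)}{v(e)} u$ for an edge $e = (u, u')$, then a short computation suggests that the normalized relation on the cycle closed by $e_k$ has coefficients $\pm \pi^{h(e) - m}$, where $m$ is the minimal value of $h$ on that cycle, so that "coefficient one" amounts to "minimal $h$ on the cycle". The names `tree_path`, `distinguished_tree`, `edge_h`, and `T0`, as well as the example values, are hypothetical.

```python
from collections import deque


def tree_path(tree_edges, a, b):
    """Edges on the unique path between a and b in a tree (edges are frozensets)."""
    adj = {}
    for e in tree_edges:
        x, y = sorted(e)
        adj.setdefault(x, []).append(y)
        adj.setdefault(y, []).append(x)
    parent = {a: None}
    queue = deque([a])
    while queue:
        x = queue.popleft()
        for y in adj.get(x, []):
            if y not in parent:
                parent[y] = x
                queue.append(y)
    path = []
    while b != a:
        path.append(frozenset((b, parent[b])))
        b = parent[b]
    return path


def distinguished_tree(edge_h, initial_tree):
    """Process the non-tree edges one by one, swapping as in the proof of Lemma 5."""
    tree = set(initial_tree)
    for e in sorted(set(edge_h) - tree, key=sorted):
        a, b = sorted(e)
        path = tree_path(tree, a, b)
        m = min(edge_h[f] for f in path + [e])
        if edge_h[e] == m:
            continue  # closing edge has coefficient one: keep the tree
        # otherwise remove a tree edge with coefficient one and add e
        f = next(f for f in path if edge_h[f] == m)
        tree.remove(f)
        tree.add(e)
    return tree


# Hypothetical exponents h(e) of the edge weights v(e) = pi ** h(e).
edge_h = {
    frozenset(("s0", "s1")): 2, frozenset(("s0", "t0")): 4,
    frozenset(("s0", "t1")): 2, frozenset(("s1", "t0")): 2,
    frozenset(("s1", "t1")): 2, frozenset(("t0", "t1")): 2,
}
T0 = {frozenset(("s0", "s1")), frozenset(("s1", "t0")), frozenset(("s1", "t1"))}
T_sharp = distinguished_tree(edge_h, T0)
```

Under the stated assumption, each swap replaces an edge of minimal $h$ on a cycle by one of larger $h$, so the procedure behaves like an incremental maximum-weight spanning tree update with respect to $h$; in the toy example, the heavy edge between `s0` and `t0` enters the tree.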
T # X 1 induces a new set of closing edges κ # and the associated minimal H 1 (X 1 )-cycles c κ # . We shall show that the corresponding elementary cycles constitute a basis of H 1,R (X 1 ).

Proposition 3. There exists an isomorphism
$$H_{1,R}(X^1) \cong \bigoplus_{\kappa^\#} R \, c^v_{\kappa^\#}.$$
Furthermore, $H_{1,R}(X^1)$ is freely generated by the set of elementary cycles $c^v_{\kappa^\#}$ and has rank $e - v + 1$, where $e$ and $v$ are the numbers of 1- and 0-simplices in $X^1$.
Proof. $H_{1,R}(X^1)$ is free, as it is a sub-module of $C_{1,R}(X^1)$, a finitely generated free module over the PID $R$. By construction, to each $c_{\kappa^\#} \in H_1(X^1)$, there corresponds a unique elementary $H_{1,R}(X^1)$-cycle $c^v_{\kappa^\#}$, whose $\kappa^\#$-coefficient is one. Furthermore, $c^v_{\kappa^\#}$ is non-torsion in $H_{1,R}(X^1)$; if $r \cdot c^v_{\kappa^\#} = 0$, $R$ being an integral domain implies $r = 0$.
Claim. Any $c \in H_{1,R}(X^1)$ has the representation $c = \sum_{\kappa^\#} r_{\kappa^\#} c^v_{\kappa^\#}$ with unique $r_{\kappa^\#} \in R$. We proceed by induction on the number of $T^\#_{X^1}$-closing edges in $c$. To establish the induction basis, let $r_{\kappa^\#} \kappa^\#$ be the unique term that contains $\kappa^\#$ in $c$. Using the fact that $\kappa^\#$ appears with a coefficient of one in $c^v_{\kappa^\#}$, we derive $c = r_{\kappa^\#} c^v_{\kappa^\#}$. For the induction step, distinguishing $r_{\kappa^\#_1} \kappa^\#_1$ as a closing edge summand in $c$, the cycle $c - r_{\kappa^\#_1} c^v_{\kappa^\#_1}$ contains one closing edge fewer, and the induction hypothesis applies. From the claim and the fact that the $c^v_{\kappa^\#}$ are non-torsion, we conclude that $H_{1,R}(X^1) \cong \bigoplus_{\kappa^\#} R \, c^v_{\kappa^\#}$, hence the proposition.

The Modules of the Weighted Homology
By Lemma 5, we have $H_{0,R}(X) \cong C_{0,R}(X) / \partial^v_1(\langle T^\#_{X^1} \rangle_R)$. Since $H_{0,R}(X)$ is a finitely generated module over the PID $R$, we arrive at
$$H_{0,R}(X) \cong R \oplus M_\pi,$$
where $n_1 \ge \dots \ge n_{v-1} \ge 0$, and $M_\pi = \bigoplus_{j=1}^{v-1} R/(\pi^{n_j})$ is the π-torsion module of $H_{0,R}(X)$. We shall proceed by computing $M_\pi$. Note that the $T^\#_{X^1}$-tree is a spanning tree of $X^1$, and $\{\partial^v_1(e) \mid e \in T^\#_{X^1}\}$ generates $\mathrm{Im}(\partial^v_1)$. Thus, we can compute $H_{0,R}(X)$ via $T^\#_{X^1}$. To this end, we process $T^\#_{X^1}$ as follows: We select a $T^\#_{X^1}$-vertex $u_i$ and an incident edge $e_i = (u_j, u_i)$ for which $\log_\pi \frac{v(u_i)}{v(e_i)}$ is minimal. We contract $e_i$ onto $u_j$ by removing $u_i$ and replacing any edges of the form $(u_k, u_i)$ with $(u_k, u_j)$ while maintaining their original weights, $v(u_k, u_j) := v(u_k, u_i)$. Recursively contracting all $(v - 1)$ edges in $T^\#_{X^1}$ produces a sequence of trees $T^\#_{X^1} = T_0, \dots, T_{v-1}$, where $T_{v-1} = u_v$ is a single vertex, at which point the process terminates. Relabeling the $T^\#_{X^1}$-vertices, we may assume that $u_i$ is removed at the $i$th step of this process.
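The contraction procedure described above can be sketched in Python. The sketch works with the exponents $h$ of the weights, i.e., it assumes $v(u) = \pi^{h(u)}$ and $v(e) = \pi^{h(e)}$ with $h(e) \le h(u)$ for every endpoint $u$ of $e$, and records the exponent $\log_\pi \frac{v(u_i)}{v(e_i)}$ at each step. The function name `torsion_exponents`, the tie-breaking rule, and the two example trees are hypothetical.

```python
def torsion_exponents(vertex_h, tree_edges):
    """Contract a weighted tree edge by edge and return the exponents of each step.

    vertex_h maps a vertex to h(vertex); tree_edges is a list of (u, w, h_edge).
    """
    vh = dict(vertex_h)
    edges = list(tree_edges)
    exponents = []
    while edges:
        # all (vertex, incident edge) pairs with exponent h(vertex) - h(edge)
        candidates = []
        for idx, (u, w, he) in enumerate(edges):
            candidates.append((vh[u] - he, u, w, idx))
            candidates.append((vh[w] - he, w, u, idx))
        n, removed, kept, idx = min(candidates)  # ties are broken lexicographically
        exponents.append(n)
        edges.pop(idx)
        # redirect the remaining edges at the removed vertex, keeping their weights
        edges = [(kept if a == removed else a, kept if b == removed else b, he)
                 for a, b, he in edges]
        del vh[removed]
    return exponents


# Hypothetical path u1 - u2 - u3 and star with center c.
path_exponents = torsion_exponents(
    {"u1": 2, "u2": 3, "u3": 1}, [("u1", "u2", 1), ("u2", "u3", 0)])
star_exponents = torsion_exponents(
    {"c": 1, "a": 5, "b": 5}, [("c", "a", 1), ("c", "b", 1)])
```

For the path, both contraction steps have exponent one, which agrees with a direct Smith normal form computation for the two boundary vectors $\pi^2 u_2 - \pi u_1$ and $\pi u_3 - \pi^3 u_2$. For the star, the first step has exponent zero and therefore contributes no torsion, while the second step has exponent four.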

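Theorem 1. There exists an isomorphism
$$H_{0,R}(X) \cong R \oplus \bigoplus_{i=1}^{v-1} R \Big/ \Bigl(\frac{v(u_i)}{v(e_i)}\Bigr),$$
where $u_i$ and $e_i$ are the vertex and the edge of the $i$th contraction step, as described above.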
Proof. Let E m = {e i = (u a i , u i ) | 0 ≤ i ≤ m} denote the set of recursively contracted edges up to and including the mth step. Set e i = (u i , u a i ), where the formal fractions are well defined by the minimality of the denominator exponent. Note We proceed by induction on i, where Lemma 5 implies the induction basis i = 0. Consider the contraction at step i: , and Claim 1 follows by induction.
where u i is removed from T i−1 at the ith contraction step. We rewrite the z in terms of the x i . This is done recursively, starting with u 1 , as follows: Let C p (u q ) be the coefficient of u q at step p. Then, v(u i ) u a i , and we obtain the updated u a i -coefficient Processing all u i , we arrive at z = ∑ v−1 =1 r i x i + r v x v , and Claim 2 follows.
, and we make the following ansatz: First, ϕ is a well-defined homomorphism: , and consequently, there exist unique coefficients, .
The linear independence of the $x_i$, in turn, implies that $(r_i - r'_i) = s_i n_i$ for any $i$, from which $(r_i - r'_i) x_i = s_i \partial^v_1(e_i)$, and thus, $r_i x_i = r'_i x_i$. $\varphi$ is, by construction, surjective and does not have a nontrivial kernel, from which Claim 3 follows.
By construction, $R x_v \cong R$, and as for the direct summands $R x_k$, for $1 \le k \le v - 1$, we consider the mapping $h : R x_k \to R / R n_k$, $h(r x_k) = r \bmod n_k$. Suppose that $(r_k - r'_k) x_k = s_k \partial^v_1(e_k)$; then, $\partial^v_1(e_k) = n_k x_k$ implies that $(r_k - r'_k) = s_k n_k$, so $h$ is well defined. Clearly, $h$ is surjective, and it remains to check the injectivity. $h(r x_k) = 0$ implies that $r = s n_k$, and $s n_k x_k = s \partial^v_1(e_k)$ implies that $r x_k = 0$, hence the theorem.
We proceed by analyzing $H_{1,R}(X)$. We begin with the expression for $\mathrm{infl}(c_{\kappa^\#})$ in Lemma 4, in which $\Delta_j$ is the 2-simplex containing the edge $u_j$, and the $k_\beta$ are mixed edges that appear when transitioning between 2-simplices with different weights that share a $k_\beta$ face.
The "troublemakers" here are the off-diagonal terms of the embedding, $k_\beta$, which stem from passing from trivial to nontrivial crossing components and vice versa. We shall eliminate such transitions by appropriately deforming $X$ without affecting its homology. We then have to check that the deformation gives rise to an inflation map that is just a restriction of the original inflation map for $X$.
Let $\Delta \in X$ be a 2-simplex corresponding to a non-crossing arc, i.e., $\Delta$ is maximal in $X$. W.l.o.g., we can assume that $\Delta = [s_1, s_2, t]$ with $s_1, s_2 \in S$, $t \in T$. Then, $\tau = [s_1, s_2]$ is ∆-free, and by construction, v(τ) = v(∆) = 2. Removing $\tau$ and $\Delta$ from $X$ produces a sub-complex $X'$ with $X \searrow X'$. By Proposition 2, we may remove all 2-simplices contained in $X$ corresponding to non-crossing arcs together with their pure faces without changing the homology. Accordingly, the resulting $\tilde X$ is an $X$-sub-complex such that $H_{i,R}(\tilde X) \cong H_{i,R}(X)$. Note that $\tilde X$ is, in general, not a complex induced by a bi-structure.
Lemma 6. There exists an $\tilde X^1$-spanning tree, $T^\#_{\tilde X^1}$, that is also a $T^\#_{X^1}$-tree.
Clearly, $H_2(X^1, \tilde X^1) = 0$ and $H_{2,R}(X^1, \tilde X^1) = 0$, and by the long homology sequences, $I_* : H_1(\tilde X^1) \to H_1(X^1)$ and $I_* : H_{1,R}(\tilde X^1) \to H_{1,R}(X^1)$ are embeddings. Lemma 6 shows that $T^\#_{\tilde X^1}$ is also a $T^\#_{X^1}$-tree, from which $H_1(X^1)$ is obtained from $H_1(\tilde X^1)$ by just adding all elementary cycles induced by the pure edges that are not contained in $\tilde X$. Consequently, removing these edges from $X^1$ is equivalent to restricting the inflation map $\mathrm{infl} : H_1(X^1) \to H_{1,R}(X^1)$ to the sub-complex $\tilde X^1$, and Claim 1 follows.
Claim 2. For $\mathrm{infl}|_{H_1(\tilde X^1)}$ : where $\Delta_j$ is the 2-simplex containing the edge $u_j$, and $k_\beta$ are mixed edges that appear when
In the simplicial homology, only the zeroth and second homology groups are nontrivial [1]. Since the complex of a bi-structure is, by construction, connected, the simplicial homology carries key information exclusively via $H_2(X)$.
The weighted homology not only preserves the information provided by the simplicial homology but also supplies additional information. Specifically, connectivity is still picked up by the free submodule of $H_{0,R}(X)$; however, additional torsion now emerges. This torsion is given a concrete combinatorial interpretation via the compression algorithm on the spanning tree $T^{\#}_{X_1}$. In fact, the compression procedure naturally generates the free submodule of rank one at its final step.
In the case of $H_{2,R}(X)$, we consider $H_2(X)$ and observe that the nontrivial chain maps $\theta_n : C_n(X) \to C_{n,R}(X)$, $\theta_n(\sum_i n_i \sigma_i) = \sum_i n_i v(\sigma_i) \sigma_i$, produce an injection $\mathrm{inj} : H_2(X) \to H_{2,R}(X)$, in which all coefficients are one, since any 2-simplex contained in the expression of an $H_2(X)$-generator (i.e., appearing in a crossing component) has a weight of one. As a result, passing to the ring $R$ preserves the information present in $H_2(X)$.
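The action of the chain map θ on a formal sum of simplices can be illustrated with a minimal Python sketch. It is not taken from the paper and rests on simplifying assumptions: weights are modeled as plain integers rather than elements of the ring R, and the simplex labels, the weight table, and the helper name `theta` are hypothetical.

```python
from typing import Dict, Tuple

Simplex = Tuple[str, ...]      # a simplex as a tuple of vertex labels
Chain = Dict[Simplex, int]     # formal sum: simplex -> integer coefficient


def theta(chain: Chain, weight: Dict[Simplex, int]) -> Chain:
    """Scale each coefficient n_i by the weight v(sigma_i) of its simplex."""
    image: Chain = {}
    for simplex, coeff in chain.items():
        scaled = coeff * weight[simplex]
        if scaled != 0:        # drop terms whose coefficient vanishes
            image[simplex] = scaled
    return image


# Hypothetical weights (assumption: integers stand in for elements of R).
weight = {
    ("s1", "t1", "t2"): 1,
    ("s2", "t1", "t2"): 1,
    ("s1", "s2", "t1"): 2,
}

# A chain supported only on weight-one simplices, and one that is not.
unit_chain = {("s1", "t1", "t2"): 1, ("s2", "t1", "t2"): -1}
mixed_chain = {("s1", "t1", "t2"): 1, ("s1", "s2", "t1"): 3}

print(theta(unit_chain, weight))   # coefficients are unchanged
print(theta(mixed_chain, weight))  # the weight-2 simplex is scaled by 2
```

On a chain supported on weight-one simplices, the sketch returns the chain unchanged, which mirrors the observation that all coefficients of the injection are one.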
In the computation of $H_{1,R}(X)$, we employ the simplicial homology in a different way. Here, we consider the relative homology and transfer information from the 1-skeleton, $X_1$. While $H_1(X)$ is trivial [1], $H_1(X_1)$ carries information, which we can utilize by extracting it from the embedded exact sequence of Lemma 4. After resolving some technicalities, this allows us to compute $H_{1,R}(X)$. Note that passing to the deformation $\tilde{X}$ greatly simplifies the inflation map, and we observe that it acts on $\tilde{X}$ as $\theta_1$ would on the 1-skeleton.
The structure theorems for the weighted homology enable the classification of bi-structures by their $R$-modules. Such classifications can provide insights into the algorithmic complexity of problems involving bi-structures. For instance, the problem of computing thermodynamically stable sequences for a bi-structure can be solved in polynomial time when the second homology $R$-module $H_{2,R}(X)$ is trivial [2].
As for future work, we shall extend the homological analysis to RNA-RNA interaction structures [35]. An interaction structure can be represented as a diagram with two backbones drawn horizontally on top of each other such that both intra-molecular and inter-molecular bonds are non-crossing; see Figure 5.
The inter-molecular arcs naturally induce an equivalence relation $\varphi$ on the vertices of the two backbones. The intersection of two loops is accordingly defined to be the set of $\varphi$-equivalence classes of vertices that they share. A bi-secondary structure can then be viewed as a particular type of interaction structure, namely, an interaction structure with two identical backbones, where interaction arcs connect any pair $(i, i)$ for $0 \le i \le n+1$. In contrast to a bi-structure, an interaction structure can exhibit nontrivial first simplicial homology, which requires revisiting the notion of a crossing component for interaction structures. The analysis involves many more topologies, in particular surfaces, as well as Mayer-Vietoris sequences [34].
The framework of the weighted homology is by no means restricted to bi-structures or interaction structures. It also gives rise to the consideration of the dissimilarity complex of a finite set of genomic sequences, which we briefly discuss in the following. A multiple sequence alignment $W$ of genomic sequences is defined to be a finite set of words of equal length $m$ over the finite alphabet $A = \{a_1, \dots, a_s\}$. For any $k \in \{1, \dots, m\}$, let $f_k : W \to A$ be the map that returns the symbol at position $k$ of a word, namely, $f_k(w) = a_{w_k}$ for $w = a_{w_1} \dots a_{w_m} \in W$.
A $(d+1)$-subset $\sigma \subseteq W$ forms a $d$-simplex if there exists at least one position $j$ such that $|\{f_j(w) \mid w \in \sigma\}| = d+1$. In other words, all $d+1$ sequences in $\sigma$ contain mutually different nucleotide types at site $j$. For a fixed $\sigma$, the number of positions $j$ for which this holds is called the dissimilarity of $\sigma$. This leads to the aforementioned dissimilarity complex for genomic sequences; see Figure 6. This complex is, by definition, low dimensional; its dimension is at most $s-1$. For $d+1 = 2$, the weights on the 1-skeleton of the dissimilarity complex recover the well-known Hamming distance, i.e., the number of positions in which two genetic sequences differ. It may be worth pointing out that this framework integrates well-known generalizations of distances, as these appear naturally as weights in the dissimilarity complex. Triangle inequalities generalize to tetrahedron inequalities, etc., and this could enhance our understanding of genetic sequences. This is because basic constructs that reflect the ancestral relations between sequences (e.g., trees, such as the tree of life) depend on the notion of Hamming distance. Our preliminary investigations suggest that the homology of the dissimilarity complex captures evolutionary events. For example, the free rank of the first homology module gives bounds on the possible recombinations within the sequence set.
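The definition of the dissimilarity complex can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than an implementation from the paper: the function names `dissimilarity` and `dissimilarity_complex` and the toy alignments are hypothetical, and the bound on the subset size uses the symbols actually observed in the alignment in place of the full alphabet size s.

```python
from itertools import combinations


def dissimilarity(sigma):
    """Count positions j at which all words in sigma carry distinct symbols."""
    words = list(sigma)
    # zip(*words) iterates over alignment columns (words have equal length).
    return sum(1 for column in zip(*words) if len(set(column)) == len(words))


def dissimilarity_complex(alignment):
    """Map each simplex (a sorted tuple of words) to its dissimilarity.

    A subset is a simplex only if its dissimilarity is positive. A d-simplex
    needs d + 1 distinct symbols in one column, so subsets larger than the
    number of observed symbols are never simplices.
    """
    words = sorted(set(alignment))
    observed_symbols = {symbol for word in words for symbol in word}
    max_size = min(len(observed_symbols), len(words))
    simplices = {}
    for size in range(1, max_size + 1):
        for sigma in combinations(words, size):
            weight = dissimilarity(sigma)
            if weight > 0:
                simplices[sigma] = weight
    return simplices


# Hypothetical toy alignments of length m = 4.
filled = dissimilarity_complex(["ACGT", "CGGA", "GTGA"])
hollow = dissimilarity_complex(["ACGT", "AGGA", "CCGA"])

for sigma, weight in filled.items():
    print(sigma, weight)
print(("ACGT", "AGGA", "CCGA") in hollow)  # is the triple a 2-simplex?
```

In the first alignment, the three words differ pairwise at positions 1 and 2, so the triple forms a 2-simplex. In the second, every pair differs somewhere but no column carries three distinct symbols, so only the boundary of the triangle is present. For pairs, the computed weight coincides with the Hamming distance.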