Next Article in Journal
Metaheuristic to Optimize Computational Convergence in Convection-Diffusion and Driven-Cavity Problems
Previous Article in Journal
Halpern-Subgradient Extragradient Method for Solving Equilibrium and Common Fixed Point Problems in Reflexive Banach Spaces
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Weighted Homology of Bi-Structures over Certain Discrete Valuation Rings

by
Andrei Bura
1,†,‡,§,
Qijun He
1,†,‡,§ and
Christian Reidys
1,2,*,†,‡,§
1
Biocomplexity Institute and Initiative, University of Virginia, Charlottesville, VA 22904-4298, USA
2
Mathematics Department, University of Virginia, Charlottesville, VA 22904-4137, USA
*
Author to whom correspondence should be addressed.
Current address: Physical Address: Biocomplexity Institute, Town Center Four, 994 Research Park Boulevard, Charlottesville, VA 22904-4298, USA.
Mailing address: Biocomplexity Institute, P.O. Box 400298, Charlottesville, VA 22904-4298, USA.
§
These authors contributed equally to this work.
Mathematics 2021, 9(7), 744; https://doi.org/10.3390/math9070744
Submission received: 23 February 2021 / Revised: 27 March 2021 / Accepted: 29 March 2021 / Published: 31 March 2021
(This article belongs to the Section Algebra, Geometry and Topology)

Abstract

:
An RNA bi-structure is a pair of RNA secondary structures that are considered as arc-diagrams. We present a novel weighted homology theory for RNA bi-structures, which was obtained through the intersections of loops. The weighted homology of the intersection complex X features a new boundary operator and is formulated over a discrete valuation ring, R. We establish basic properties of the weighted complex and show how to deform it in order to eliminate any 3-simplices. We connect the simplicial homology, H i ( X ) , and weighted homology, H i , R ( X ) , in two ways: first, via chain maps, and second, via the relative homology. We compute H 0 , R ( X ) by means of a recursive contraction procedure on a weighted spanning tree and H 1 , R ( X ) via an inflation map, by which the simplicial homology of the 1-skeleton allows us to determine the weighted homology H 1 , R ( X ) . The homology module H 2 , R ( X ) is naturally obtained from H 2 ( X ) via chain maps. Furthermore, we show that all weighted homology modules H i , R ( X ) are trivial for i > 2 . The invariant factors of our structure theorems, as well as the weighted Whitehead moves facilitating the removal of filled tetrahedra, are given a combinatorial interpretation. The weighted homology of bi-structures augments the simplicial counterpart by introducing novel torsion submodules and preserving the free submodules that appear in the simplicial homology.

1. Introduction

This paper is concerned with the weighted homology of the nerve complex of RNA bi-structures [1]. The weighted nerve complex of such bi-structures provides a framework for studying the computational complexity of algorithms for identifying sequences that are thermodynamically stable with respect to a given bi-structure. Such sequences hold the key for evolutionary optimization [2].
The weighted homology introduced here augments the simplicial homology by making the weights of the simplices an integral part of the theory. In our situation, these weights encode the cardinality of intersections of loops in a bi-structure. Along the lines of [3], the notion of a weighted complex is introduced. Differently from [3], the weights of simplices in this paper are taken from a discrete valuation ring that contains a copy of the integers as units and in which the weight of a simplex divides the weight of its faces.

1.1. Background

RNA sequences are linear, single-stranded sequences composed of the bases A,U,G, and C. In contrast to double-stranded DNA, RNA sequences fold into structures (see Figure 1). A particularly important class of RNA structures is that of RNA secondary structures [4]. Secondary structures can be represented as planar, non-crossing arc diagrams [5,6]. By means of its folded structure, RNA facilitates a plethora of important functional roles [7]. The boundary components of an RNA secondary structure, when viewed as a fatgraph [8], are called loops, and they determine the minimum free energy of the structure.
Secondary structures have been studied from the perspectives of enumerative combinatorics [9,10,11,12], algebraic combinatorics [13], matrix models [14,15], and topology [16,17,18].
In [10], a bijection between linear trees and secondary structures was presented. In [13], the authors enumerated k-non-crossing RNA structures by employing a bijection between k-non-crossing partial matchings and walks in the interior of the Weyl-chamber C 0 based on a certain bijection between oscillating tableaux and matchings [19,20,21]. The geometrical realization of a certain complex of secondary structures was studied in [22].
A key determinant of a sequence–structure pair is its thermodynamic stability, which is measured via its free energy. Given a sequence σ , RNA structure prediction algorithms identify structures by assuming minimal or low free energy with respect to σ [23]. Given any RNA sequence–structure pair ( a , S ) , the free energy η ( a , S ) is computed by adding all loop energy contributions [24,25], i.e., η ( a , S ) = s S η ( a , s ) . Here, η ( a , s ) depends on the loop type of s (hairpin, interior, multi-branch, etc.) and the particular nucleotides of the sequence a that appear in the loop s. Details on loop decompositions, partition functions, and recursive constructions for the folded structure—derived via polynomial-time dynamic programming (DP)—can be found in [9,23,26,27,28]. In [12], the notion of bi-secondary structures was introduced. Bi-secondary structures are central to identifying sequences that can realize two mutually exclusive minimum free energy (mfe) conformations. Such sequences naturally appear in the context of evolutionary transitions [29] and of RNA riboswitches, i.e., sequences that exhibit two distinct, stable configurations [30].

1.2. Motivation

This paper was motivated by the problem of computing thermodynamically stable sequences for two distinguished secondary structures, S and T [2]. The problem can be formalized as the computation of a sequence a that minimizes η ( a , S , T ) , where
η ( a , S , T ) = s S η ( a , s ) + t T η ( a , t ) .
The sub-problems of the underlying DP routine are associated with sets of loops and recursively constructed by adding one loop at a time, where subsequently added nucleotides affect the energy calculation if they appear in multiple loops simultaneously; see Figure 2. This leads to the consideration of all such intersections as a simplicial complex, X, whose homology provides insights into the above optimization problem.
In [1,31], we computed the simplicial homology groups of X and proved that only the second homology group carries information. This group was identified to be free, and its generators were identified as the crossing components within the diagram representation. All known RNA riboswitches exhibit one such crossing component, while random pairs of structures contain multiple crossing components.
The simplicial homology of X in [1] only allows us to express the existence of loop intersections. However, the crucial determinant for the algorithmic complexity of the DP routine in [2] is the size of such intersections; see Figure 2. The recursion is over the set of loops, successively examining one loop at a time. Each such examination affects certain vertices that the currently examined loop shares with the unprocessed loops.
It is thus natural to employ a homology theory that accounts for the intersection size. The notion of weighted homology as originally formulated [3], assigns a certain weight to each simplex, imposing specific divisibility conditions on these weights.
Our framework represents a variation on weighted homology as follows: We consider homology with coefficients in a discrete valuation ring, which, in fact, makes any divisibility constraints obsolete and allows us to deal with the cardinality of intersections in a natural way. The boundary operator introduced here is new, and so is our approach to connecting the weighted and simplicial homologies. More specifically, the construction of an inflation map relating H 1 ( X 1 ) and H 1 , R ( X 1 ) is a new technique. The computation of the weighted homology groups, H 0 , R , H 1 , R ( X ) , and H 2 , R ( X ) , is also novel. Specifically, the idea of obtaining a combinatorial interpretation for the invariant factors in the structure theorems for H 0 , R and H 1 , R ( X ) via particular X 1 -spanning trees is original. The chain maps between C n ( X ) and C n , R ( X ) , which relate the H i ( X ) and H i , R ( X ) introduced here, are new and nontrivial. In particular, the injection inj : H 2 ( X ) H 2 , R ( X ) that induces the isomorphism H 2 , R ( X ) R Z H 2 ( X ) is a result of the fact that, in the expressions for the generators of H 2 , R ( X ) , any triangle has a corresponding intersection size of one.

1.3. Organization

The paper is organized as follows: In Section 2, the simplicial loop complex of a bi-structure is introduced and the splicing of nucleotides is recalled [1]. In Section 3, the weighted complex of a bi-structure and a modified splicing procedure are discussed. In Section 4, the weighted homology and simplicial collapses [3,32] are introduced. In Section 5, a relative point of view is adopted by relating the simplicial and weighted homologies. Specifically, we introduce the inflation map, which maps classes of unweighted 1-cycles into corresponding weighted 1-cycles. In Section 6, we construct certain spanning trees that will be instrumental for proving our main results in Section 7. Here, we compute the weighted homology modules of bi-structures and provide combinatorial interpretations. The torsion module emerging in the zeroth homology stems from a recursive contraction scheme on the distinguished weighted spanning trees of the complex. The torsion of the first homology module can be understood via particular weighted edges that reference the aforementioned spanning tree. We conclude the paper with the computation of the second homology module, with all others being trivial. We integrate our results and outline future work in Section 8.

2. The Simplicial Loop Complex of a Bi-Structure

An RNA diagram S over the set [ n ] : = { 1 , , n } is a vertex-labeled graph—whose vertices represent nucleotides—drawn on the horizontal axis and labeled with elements of the set [ n ] . An arc α = ( i , j ) , i < j , is an ordered pair of vertices, which represents the base pairing between the i-th and j-th nucleotides. Each S-vertex can be paired with at most one other vertex, and the arc that connects them is drawn in the upper half-plane. An RNA diagram over the set [ n ] is augmented with two additional vertices associated with positions 0 and n + 1 , together with the arc ( 0 , n + 1 ) , which is called the rainbow, representing the closing base pair of the external loop of the RNA structure; see [23]. The set of vertices { i , i + 1 , , j 1 , j } is called an interval, and is denoted by [ i , j ] . The set [ 0 , n + 1 ] : = { 0 , 1 , , n , n + 1 } is referred to as the backbone of the diagram. Two arcs, ( i , j ) and ( p , q ) with i < p , are called crossing if and only if i < p < j < q . S is called a secondary structure if it does not contain any crossing arcs. A loop s in a secondary structure S is represented as the disjoint union of intervals s = ˙ i = 1 k [ a i , b i ] , such that ( a 1 , b k ) and ( b i , a i + 1 ) , for 1 i k 1 , are arcs, and any other interval-vertices are unpaired. We denote by α s the unique, maximal arc ( a 1 , b k ) of the loop s, and we have a correspondence between loops and their maximal arcs; see Figure 1.
S-arcs and loops can be endowed with a partial order: ( k , l ) S ( i , j ) i < k < l < j . Abusing the notation, for the two loops s 1 , s 2 S , we write s 1 S s 2 if α s 1 S α s 2 . Accordingly, (a) any unpaired vertex is contained in exactly one loop, (b) any non-rainbow arc appears in exactly two loops, and (c) the Hasse diagram of ( S , S ) is a rooted tree Tr ( S ) with the rainbow arc as the root.
Given two secondary structures S and T, we refer to R = ( S , T ) as a bi-secondary structure (bi-structure). Abusing the notation, we let R = S ˙ T be the loop set of R. We represent a bi-structure R = ( S , T ) with the S-arcs in the upper and the T-arcs in the lower half-plane along the same horizontal backbone.
Let A = { A 0 , A 1 , , A m } be a collection of finite sets. We call B = { A i 0 , , A i d } A a d-simplex of A if k = 0 d A i k . We set Ω ( B ) = k = 0 d A i k and let ω ( B ) = | Ω ( B ) | 0 . Let K d ( A ) be the set of all d-simplices of A. Then, the complex (nerve) of A is K ( A ) = d = 0 K d ( A ) 2 A . A d -simplex B K ( A ) is called a d -face of B if d d and B B . Let B be a d -face of a maximal simplex B, where d < d . Then, B is B-free if no other maximal simplex of K ( A ) contains B as a face. By construction, K ( A ) is a simplicial complex. For a secondary structure S, K ( S ) Tr ( S ) is a tree.
Let R = ( S , T ) be a bi-structure with the loop complex X = ˙ d = 0 K d ( R ) . Then, a simplicial order, R , is given by
r 1 R r 2 r 1 , r 2 T and r 1 T r 2 r 1 , r 2 S and r 1 S r 2 r 1 S , r 2 T .
We specify a d-simplex, σ X , by the ordered d + 1 -tuple σ = [ r i 0 , r i 1 , . . . , r i d ] , where r i 0 R r i 1 R R r i d .
A 1-simplex τ = [ r i 0 , r i 1 ] X is called pure if r i 0 and r i 1 are loops in the same secondary structure and mixed otherwise; see Figure 3.
By construction, any 2-simplex Δ X contains exactly one pure and two mixed edges. Furthermore, X cannot contain simplices of a dimension greater than or equal to four, as that would imply that three or more loops in the same secondary structure intersect non-trivially. Moreover, for any σ = [ s 0 , s 1 , t 0 , t 1 ] K 3 ( X ) , 1 ω ( σ ) 2 and Ω ( σ ) P = { p [ n ] deg ( p ) = 4 } .
In [31], we investigated the effect of splicing a nucleotide p into two adjacent nucleotides q 1 , q 2 such that the two arcs incident to p were resolved into two non-crossing arcs. In the case ω ( σ ) = 2 , splitting does not change X, and if ω ( σ ) = 1 , splitting induces a specific alteration: the removal of σ from X, as well as a distinguished free edge, τ σ , and exactly two free triangles, Δ 1 σ , Δ 2 σ , glued at τ σ . We refer to ( τ σ , Δ 1 σ , Δ 2 σ ) as a σ -butterfly; see Figure 4.
Lemma 1.
Let R = ( S , T ) be a bi-structure with complex X; then, there exists a simple bi-structure R = ( S , T ) with a complex X < X , where X is derived from X by successively removing any 3-simplex, σ, together with an associated σ-butterfly.
Thus, successively splicing all nucleotides in P induces a bi-structure ( S , T ) , together with an associated complex X < X , which is obtained by removing all σ -butterflies and their corresponding σ s from X.
Removing a σ -butterfly and its corresponding σ is equivalent to X X —a simplicial collapse consisting of two elementary collapses [32,33]. The first removes σ and Δ 1 σ , and the second removes τ σ and Δ 2 σ .
Proposition 1.
Let X be the loop complex of a bi-structure R = ( S , T ) , and let X < X be an X-sub complex obtained by removing all 3-simplices, σ, from X, together with corresponding σ-butterflies. Then, X is the complex of a bi-structure R = ( S , T ) that satisfies ( X ) 2 = X for any k 0 , H k ( X ) H k ( X ) .
Note that after splicing, we have simplified X to a complex of a bi-structure, X , thus reducing its maximum simplex dimension and maintaining its homology. In the following, we generalize this to weighted complexes.

3. μ -Splicings and the Weighted Complex of a Bi-Structure

We begin by modifying the splicing introduced in Section 2 as follows: Instead of replacing a nucleotide p P by the two nucleotides q 1 < q 2 , we substitute it by three nucleotides, q 1 < a < q 2 , such that the arcs in R that share p as an endpoint now have endpoints q 1 and q 2 , which are non-crossing, and a is unpaired. We call this a μ -splicing of p.
Note that μ -splicing and splicing all p P have exactly the same effect on X—namely, eventually removing any 3-simplex σ and a corresponding σ -butterfly. In terms of bi-structures, from ( S , T ) , μ -splicing P produces the bi-structure ( S , T ) augmented by a set of distinguished, unpaired nucleotides A = { a 1 , , a n } , which only appear in 0- and 1-simplices. By construction, the complex X of ( S , T ) does not contain any 3-simplices. Furthermore, after μ -splicing P, any distinguished nucleotide produced is flanked by two distinct endpoints of non-crossing arcs. Thus, for any simplex σ X , we have h ( σ ) 0 , where h ( σ ) denotes the difference between the number of non-distinguished and distinguished nucleotides contained in σ . Note that h is a natural generalization of ω in the context of complexes of bi-structures. Accordingly, we define a weighted complex associated with a bi-structure as follows.
First, we recall that a ring R is called a discrete valuation ring if it is a Principal Ideal Domain (PID) with exactly one non-zero maximal ideal. Any irreducible element is a generator of this ideal and is called a uniformizer for R, and furthermore, any two such elements differ only up to multiplication with a unit.
Definition 1.
Let X be a complex of a bi-structure ( S , T ) such that for any σ X , we have h ( σ ) 0 . Let R Z be a discrete valuation ring with uniformizer π. We define the weight function v : X R , v ( σ ) = π h ( σ ) , and call ( X , v ) the weighted complex of ( S , T ) .
Lemma 2.
Let ( S , T ) be a bi-structure with no distinguished vertices and complex X. Let ( S , T ) be derived from ( S , T ) via μ-splicing P, and we denote its complex by X . Then, there exists an embedding
ϵ : ( X , v ) ( X , v ) ,
where ϵ ( X ) is a X-subcomplex obtained by removing any 3-simplices and associated butterflies. Moreover, ϵ is an embedding of weighted complexes: σ = ϵ ( σ ) v ( σ ) = v ( σ ) .
Proof. 
Successively μ -splicing P generates the series of complexes X | P | = X > > X 0 = X . By construction, given X i and a 3-simplex σ , the corresponding μ -splicing at p replaces p with the triple q 1 i < a i < q 2 i . Since a i is unpaired, a i is contained in exactly two loops, λ i 1 , μ i 1 , the only two loops that change size in terms of number of nucleotides when passing to X i 1 . Since the distinguished nucleotide contributes a 1 , we note that v ( λ i 1 ) = v ( λ i ) and v ( μ i 1 ) = v ( μ i ) ; furthermore, v ( [ λ i , μ i ] ) = v ( [ λ i 1 , μ i 1 ] ) . Any other X i -simplex—differently from σ and its σ -butterfly, which are removed—is not affected by the splicing, and consequently retains its weight in X i 1 , whence the lemma. □

4. Weighted Homology

Consider n v : C n , R ( X ) C n 1 , R ( X ) , where C n , R ( X ) denotes the free R-module generated by all n-simplices contained in X. Let n v be given by
n v ( σ ) = i = 0 n v ( σ ^ i ) v ( σ ) · ( 1 ) i σ ^ i ,
where the symbol v ( σ ^ i ) v ( σ ) = π h ( σ ^ i ) π h ( σ ) is defined to be π h ( σ ^ i ) h ( σ ) . This is, in view of h ( σ ) h ( σ ^ i ) , a non-negative power of the uniformizer π and, hence, still an element of the ring R. As a result, the boundary operator n v is well defined. Since v ( σ ^ i , j ) v ( σ ^ i ) · v ( σ ^ i ) v ( σ ) = v ( σ ^ j , i ) v ( σ ^ j ) · v ( σ ^ j ) v ( σ ) , we obtain n 1 v ( n v ( σ ) ) = 0 , i.e., n v is a boundary map.
Let H n , R ( X ) denote the n v -homology groups. For ( Y , v ) ( X , v ) , we have
0 C n , R ( Y ) I C n , R ( X ) J C n , R ( X , Y ) 0 , v v ¯ v 0 C n 1 , R ( Y ) I C n 1 , R ( X ) J C n 1 , R ( X , Y ) 0
and we denote the relative homology groups by H n , R ( X , Y ) . These are connected via the long exact sequence
H n , R ( X , Y ) ^ j v H n 1 ( Y ) I * H n 1 ( X ) J * H n 1 ( X , Y ) ^ n 1 v .
Lemma 2 guarantees that, for any weighted complex ( X , v ) of a bi-structure ( S , T ) , there exists a bi-structure ( S , T ) with a weighted complex ( X , v ) such that ( X , v ) < ( X , v ) and where X 2 = X . Thus, the relative homology groups H k , R ( X , X ) and their associated long sequence are well defined.
However, a weighted complex does not have to be induced by a bi-structure. The previous definitions still hold as long as σ σ v ( σ ) | v ( σ ) for an arbitrary ( X , v ) -pair. Weighted complexes represent a general combinatorial-algebraic framework that enhances the simplicial homology for a wide range of applications. The next proposition extracts the algebraic core of μ -splicings; it represents a variation of [32], and it also appears in some form in [3].
Proposition 2.
Let ( X , v ) be a weighted complex and let τ , σ X such that: (a) τ is σ-free; (b) v ( τ ) = v ( σ ) . Let Y be the X-subcomplex obtained by removing any τ τ σ ( X Y ). Then,
n 0 ; H n , R ( X ) H n , R ( Y ) .
Proof. 
Claim 1. It suffices to prove the Lemma for a σ-free τ with the property dim ( τ ) + 1 = dim ( σ ) .
We may assume that σ = [ v 0 , , v k ] and τ = [ v 0 ] . The lattice of sub-simplices of σ is isomorphic to a binary k-cube, where cover relations are induced by face inclusions.
We prove by induction on k that a k-cube can be recursively decomposed into a sequence of pairs Σ k = ( σ i , τ i ) 1 i 2 k 1 , where σ i is a maximal simplex with a free face τ i . The induction basis Σ 1 = ( [ 1 ] , [ 0 ] ) is immediate.
For the induction step, note that a k-cube decomposes into two disjoint copies of a ( k 1 ) -cube based on if the last coordinate is zero or one. By the induction hypothesis, for a ( k 1 ) -cube, there exists a sequence, Σ k 1 = ( σ i , τ i ) 1 i 2 k 2 , with the desired properties; appending one or zero as the last coordinate yields two mappings from the k 1 cube into the k cube. By construction, these form a family of disjoint-type embeddings, i.e., their images are injective copies of the k 1 -cube that are vertex disjoint in the k-cube. Accordingly, we obtain Σ k 1 1 = ( σ i 1 , τ i 1 ) i and Σ k 1 0 = ( σ i 0 , τ i 0 ) i . As the two sub-cubes are disjoint, we can immediately construct the desired sequence for the k-cube, Σ k = Σ k 1 1 Σ k 1 0 ; the claim follows.
Claim 2. Let X Y , where σ X maximal, τ is σ-free, and dim ( τ ) + 1 = dim ( σ ) = k . Then, for any n 0 , H n , R ( X , Y ) = 0 .
Note that only C k , R ( X , Y ) and C k 1 , R ( X , Y ) are non-trivial. C k + 1 , R ( X , Y ) = 0 implies Im ( ^ k + 1 v ) = 0 ; hence, H k , R ( X , Y ) = Ker ( ^ k v ) . Note that C k , R ( X , Y ) = σ R , so ^ k v ( r · σ ) = 0 implies r = 0 ; thus, H k , R ( X , Y ) = 0 . Similarly, C k 1 , R ( X , Y ) = τ R , so Ker ( ^ k 1 v ) C k 1 , R ( X , X ) is also generated by τ . Now, ^ k v ( σ ) = v ( τ ) / v ( σ ) · τ = τ , and so H k 1 , R ( X , Y ) = 0 . Finally, for n k , k 1 , Ker ( ^ n v ) C n , R ( X , Y ) = 0 , so H n , R ( X , Y ) = 0 for any n k , k 1 , and Claim 2 follows.
The proposition is implied in view of Claim 2 and the following long exact sequence:
H n + 1 , R ( X , Y ) n v H n , R ( Y ) I * H n , R ( X ) J * H n , R ( X , Y ) .
 □
In view of X X and v ( τ σ ) = v ( Δ 1 σ ) = v ( Δ 2 σ ) = v ( σ ) , for any σ X 3 and its corresponding σ -butterfly, Proposition 2 implies the following.
Corollary 1.
Let ( X , v ) be the complex of the bi-structure ( S , T ) , and let ( X , v ) be the complex of ( S , T ) that is obtained by completely μ-splicing ( S , T ) . Then,
k 0 , H k , R ( X ) H k , R ( X ) .

5. The Inflation Map

In view of Corollary 1, we can assume that X is the complex of a bi-structure with the property X 2 = X , where X k denotes the simplicial complex induced by all k-simplices in X [34]. Clearly, the natural embedding X 1 X is an embedding of weighted complexes, and we have the exact sequence
0 C n , R ( X 1 ) I C n , R ( X ) J C n , R ( X , X 1 ) 0 , v v ¯ v 0 C n 1 , R ( X 1 ) I C n 1 , R ( X ) J C n 1 , R ( X , X 1 ) 0
producing the long exact sequence
0 I * H 2 , R ( X ) J * H 2 , R ( X , X 1 ) ^ 2 v H 1 , R ( X 1 ) I * H 1 , R ( X ) 0 .
We now adopt a relative point of view by relating the simplicial homology H k ( X ) to the weighted homology H k , R ( X ) . To this end, we consider the following two segments of the long homology sequences
0 H 2 , R ( X ) J * H 2 , R ( X , X 1 ) ^ 2 v H 1 , R ( X 1 ) I * H 1 , R ( X ) 0 , inj inj ? 0 H 2 ( X ) J * H 2 ( X , X 1 ) ^ 2 H 1 ( X 1 ) I * 0
where the surjectivity of ^ 2 is given by H 1 ( X ) = 0 [1].
While these sequences belong to different categories, the first two vertical maps are natural injections and relate simplicial and weighted homology modules.
Indeed, the first vertical injection is induced by chain maps of the form θ n : C n ( X ) C n , R ( X ) , θ ( i n i σ i ) = i n i v ( σ i ) σ i , which make the following diagram commutative:
C n ( X ) n C n 1 ( X ) θ n θ n 1 C n 1 ( X ) n v C n 1 , R ( X )
The second vertical injection is given by the fact that, since Z R , C 2 ( X ) is an R-submodule of C 2 , R ( X ) and, by construction, H 2 ( X , X 1 ) = C 2 ( X ) and H 2 , R ( X , X 1 ) = C 2 , R ( X ) .
This motivates us to ask if there exists a Z -linear map that makes the above diagram commutative. It turns out that such a map exists and will prove useful for understanding H 1 , R ( X ) .
Lemma 3.
The mapping
infl : H 1 ( X 1 ) H 1 , R ( X 1 ) , infl ( c ) = ^ 2 v i Δ i ,
where c = 2 ^ ( i Δ i ) , is well defined and Z -linear; the following diagram is commutative:
H 2 , R ( X , X 1 ) ^ 2 v H 1 , R ( X 1 ) inj infl H 2 ( X , X 1 ) ^ 2 H 1 ( X 1 ) .
Proof. 
By construction, it suffices to show that infl is well defined. Observe that, in view of sequence (1), ^ 2 is surjective. Hence, any c H 1 ( X 1 ) can be expressed as c = i ^ 2 ( Δ i ) , where i Δ i is unique, modulo elements of Ker ( ^ 2 ) . Secondly, in [31], we showed that
Ker ( ^ 2 ) = β Z β ,
where β = j Δ j ; Δ j is contained in the ( S , T ) -crossing component β and v ( Δ j ) = 1 .
Any face e contained in ^ 2 ( β ) appears with opposite signs in two distinct the 2-simplices Δ e and Δ e , where v ( Δ e ) = v ( Δ e ) = 1 . Therefore,
^ 2 v ( β ) = e ^ 2 ( β ) v ( e ) v ( Δ e ) · e v ( e ) v ( Δ e ) · e = 0 ,
i.e., inj ( β ) Ker ( ^ 2 v ) ; infl is well defined, and a lemma follows. □
We next employ infl : H 1 ( X 1 ) H 1 , R ( X 1 ) to obtain information about H 1 , R ( X 1 ) . X 1 is connected, and for a spanning tree T X 1 , the free generators of H 1 ( X 1 ) are in bijection with κ ’s mixed X 1 cycle closing edges for T X 1 . A generator c κ H 1 ( X 1 ) is a sum of edges u j T X 1 and the closing edge κ , with each such edge appearing exactly once. We fix T X 1 as follows: Consider the trees T S , T T of the sub-complexes for S and T, respectively, and let T X 1 be the tree obtained by connecting the T S , T T -roots with the mixed edge ω , corresponding to the two rainbow loops.
Lemma 4.
The following sequence of R modules is exact:
0 R Z infl ( H 1 ( X 1 ) ) inj H 1 , R ( X 1 ) I * H 1 , R ( X ) 0 .
Proof. 
With T X 1 as above, consider a generator c κ = j u j H 1 ( X 1 ) where, for 1 < j < n , u j are pure S- or T-edges and u 1 = ω while u n = κ . Since ^ 2 is surjective, we can also write
c κ = i ^ 2 ( Δ i ) = i ( e 0 , i e 1 , i + e 2 , i ) .
This involves additional mutually canceling edges, k, all of which are mixed and appear as faces of pairs of 2-simplices, ( Δ k , Δ k ) . Without loss of generality (w.l.o.g.), we can traverse the cycle c κ beginning at ω via the pure S-edges until we arrive at κ , after which we return back to ω via the pure T-edges. Any ( Δ k , Δ k ) pair satisfies either v ( Δ k ) = v ( Δ k ) or v ( Δ k ) v ( Δ k ) , and we accordingly categorize the k edges into k α - and k β -edges, respectively. As a result, in
infl ( c κ ) = i ^ 2 v ( Δ i ) = i v ( e 0 , i ) v ( Δ i ) · e 0 , i v ( e 1 , i ) v ( Δ i ) · e 1 , i + v ( e 2 , i ) v ( Δ i ) · e 2 , i ,
k β -edges persist with the coefficient, say f ( k β ) R , while all k α -edges cancel. Hence,
infl ( c κ ) = j v ( u j ) v ( Δ j ) · u j + k β f ( k β ) · k β ,
where u j ^ 2 ( Δ j ) . Note that only κ and ω are mixed c κ -edges, and so { u j } { k β } = .
Any set M of c κ -cycles induces a set of distinct mixed faces ω , κ 1 , , κ h . In the two sub-trees of S and T formed by the respective pure edges of the c κ -paths, these mixed faces connect an S-leaf with a T-minimal vertex or a T-leaf with an S-minimal vertex. By construction, these edges are distinct from any of the k β produced by any M-cycle, and the coefficients of any of these c κ i 1 i h in a linear combination c κ M r κ · infl ( c κ ) = 0 are zero. Iterating this argument, we arrive at
κ r κ · infl ( c κ ) = 0 r κ = 0 .
Accordingly, { infl ( c κ ) κ } freely generate { infl ( c κ ) κ } R , and the following diagram is commutative:
H 2 , R ( X , X 1 ) ^ 2 v H 1 , R ( X 1 ) I * H 1 , R ( X ) 0 inj infl H 2 ( X , X 1 ) ^ 2 H 1 ( X 1 ) I * 0 0
Since Im ( ^ 2 v ) = R Z Im ( infl ) , we have H 1 , R ( X ) H 1 , R ( X 1 ) / ( R Z Im ( infl ) ) , whence
0 R Z infl ( H 1 ( X 1 ) ) inj H 1 , R ( X 1 ) I * H 1 , R ( X ) 0 .
 □

6. The Rank of H 1 , R ( X 1 )

Next, we compute the rank of H 1 , R ( X 1 ) by constructing a basis. Note that an element in H 1 ( X 1 ) is a sum of cycles, and it is natural to ask what the analogue of a cycle in H 1 , R ( X 1 ) is.
Let c = i e i H 1 ( X 1 ) be minimal in terms of the number of edges it contains. A straightforward computation shows that there exist r i R with 1 v ( i r i · e i ) = 0 , where the multi-set ( r i ) i is unique up to a scalar factor. This shows that, except for a scalar multiple, minimal H 1 ( X 1 ) cycles determine the distinguished H 1 , R ( X 1 ) cycles, and we shall call a cycle c = i r i e i H 1 , R ( X 1 ) elementary if gcd ( { r i } ) = 1 and i e i H 1 ( X 1 ) is a minimal cycle. An elementary cycle is precisely the H 1 , R ( X 1 ) -analogue of a minimal H 1 ( X 1 ) -cycle.
In view of X 1 being connected, a basis of H 1 ( X 1 ) can be comprised of all minimal cycles derived from a fixed X 1 -spanning tree T : Any X 1 non-tree edge added to this tree results in a unique minimal cycle in the basis that is labeled by the edge. One key observation here is that { 1 ( e ) | e T } generates 1 ( C 1 ( X 1 ) ) for an arbitrary spanning tree T of X 1 .
To derive a basis of H 1 , R ( X 1 ) in an analogous fashion, an arbitrary choice of an X 1 -spanning tree is no longer sufficient because of the weights involved. However, in the following, we construct a distinguished tree T X 1 # that is particularly well suited to the understanding of the embedding of H 1 ( X 1 ) into H 1 , R ( X 1 ) .
Lemma 5.
There exists an X 1 -spanning tree, T X 1 # , such that the following sequence is exact:
0 C 1 , R ( T X 1 # ) 1 v C 0 , R ( X ) proj H 0 , R ( X ) 0 .
Proof. 
It suffices to construct T X 1 # such that 1 v ( T X 1 # R ) = 1 v ( X 1 R ) . To this end, we fix an arbitrary X 1 -spanning tree T 0 and let M = X 1 T 0 . We examine M-edges one by one, yielding a chain of processed edges, = M 0 , , M k 1 M k , and a sequence of trees, T 0 , , T k 1 , T k . We claim that these have the property 1 v ( T k R ) = 1 v ( M k T 0 R ) .
The above holds trivially for k = 0 , and we assume that M k 1 and T k 1 are constructed with 1 v ( T k 1 R ) = 1 v ( M k 1 T 0 R ) . Let e k M M k 1 and set M k = M k 1 { e k } . Then, there exist r e R such that e { e k } T k 1 r e 1 v ( e ) = 0 . Letting r = gcd { r e } and r e = r r e leads to e { e k } T k 1 r e 1 v ( e ) = 0 , where, by construction, at least one edge has a coefficient of one. In case the r e k = 1 , set T k = T k 1 . Otherwise, remove a T k 1 -edge with coefficient one, and add e k to T k 1 , obtaining T k . Note then that
1 v ( T k R ) = 1 v ( { e k } T k 1 R ) = 1 v ( { e k } R ) + 1 v ( M k 1 T 0 R ) = 1 v ( M k T 0 R ) .
By induction, when the processes terminates, it yields the desired T X 1 # . □
T X 1 # induces a new set of closing edges κ # and the associated minimal H 1 ( X 1 ) -cycles c κ # . We shall show that the corresponding elementary cycles constitute a basis of H 1 , R ( X 1 ) .
Proposition 3.
There exists an isomorphism
H 1 , R ( X 1 ) R Z H 1 ( X 1 ) .
Furthermore, H 1 , R ( X 1 ) is freely generated by the set of elementary cycles c κ # and has rank e v + 1 , where e and v are the numbers of 1- and 0-simplices in X 1 .
Proof. 
H 1 , R ( X 1 ) is free, as it is a sub-module of the finitely generated free module over the PID R and C 1 , R ( X 1 ) . By construction, to each c κ # H 1 ( X 1 ) , there corresponds a unique elementary H 1 , R ( X 1 ) -cycle c κ # v , whose κ # coefficient is one. Furthermore, c κ # v is non-torsion in H 1 , R ( X 1 ) ; if r · c κ # v = 0 , R being an integral domain implies r = 0 .
Claim. Any c H 1 , R ( X 1 ) has the representation c = κ r κ # c κ # v with a unique r κ # R .
We proceed by induction on the number of T X 1 # -closing edges in c.
To establish the induction basis, let r κ # κ # be the unique term that contains κ # in c. Using the fact that κ # appears with a coefficient of one in c κ # v , we derive c = r κ # c κ # v .
For the induction step, distinguishing r κ 1 # κ 1 # as a closing edge summand in c, we have c r κ 1 # · c κ 1 # v = c ˜ H 1 , R ( X 1 ) with κ 1 # c ˜ . By inductive hypothesis, c ˜ = κ # c ˜ r κ # · c κ # v with unique coefficients r κ # , hence the claim.
From the claim and the fact that the c κ # v are non-torsion, we conclude that
H 1 , R ( X 1 ) κ R c κ # v R Z H 1 ( X 1 ) ,
hence the proposition. □

7. The Modules of the Weighted Homology

By Lemma 5, we have
rnk ( Im ( 1 v ) ) = rnk ( 1 v ( T X 1 # R ) ) = v 1 ,
from which rnk ( C 0 , R ( X ) / Im ( 1 v ) ) = 1 .
Since C 0 , R ( X ) / Im ( 1 v ) is a module over the PID R, we arrive at
H 0 , R ( X ) C 0 , R ( X ) / Im ( 1 v ) M π R ,
where n 1 n v 1 0 , and M π = j = 1 v 1 R / ( π n j ) is the π -torsion module of H 0 , R ( X ) .
We shall proceed by computing M π . Note that the T X 1 # -tree is a spanning tree of X 1 , and { 1 v ( e ) | e T X 1 # } generates Im ( 1 v ) . Thus, we can compute H 0 , R ( X ) via T X 1 # . To this end, we process T X 1 # as follows: We select a T X 1 # -vertex u i and an incident edge e i = ( u j , u i ) for which log π v ( u i ) v ( e i ) is minimal. We contract e i onto u j by removing u i and replacing any edges of the form ( u k , u i ) with ( u k , u j ) while maintaining their original weights v ( u k , u j ) : = v ( u k , u i ) . Recursively contracting all ( v 1 ) edges in T X 1 # produces a sequence of trees T X 1 # = T 0 , , T v 1 , where T v 1 = u v is a single vertex at which point the process terminates. Relabeling the T X 1 # -vertices, we may assume that u i is removed at the ith step of this process.
Theorem 1.
There exists an isomorphism
H 0 , R ( X ) i = 1 v 1 R / ( π n i ) R ,
where n i = log π v ( u i ) v ( e i ) , as described above.
Proof. 
Let E m = { e i = ( u a i , u i ) 0 i m } denote the set of recursively contracted edges up to and including the mth step. Set e i = ( u i , u a i ) , x i = u i v ( u a i ) v ( u i ) u a i , and x v = u v , where the formal fractions are well defined by the minimality of the denominator exponent. Note then that 1 v ( e i ) = v ( u i ) v ( e i ) u i v ( u a i ) v ( e i ) u a i = v ( u i ) v ( e i ) x i : = w i x i .
Claim 1. 1 v ( E i T i R ) = Im ( 1 v ) for any 0 i v 1 .
We proceed by induction on i, where Lemma 5 implies the induction basis i = 0 . Consider the contraction at step i: E i = E i 1 { e i = ( u a i , u i ) } . A T i 1 -edge of the form f k = ( u k , u i ) induces the T i -edge f k = ( u k , u a i ) with v ( f k ) = v ( f k ) . Then,
1 v ( f k ) 1 v ( f k ) = v ( u k ) v ( f k ) u k v ( u a i ) v ( f k ) u a i v ( u k ) v ( f k ) u k v ( u i ) v ( f k ) u i = 1 v ( e i )
implies 1 v ( E i 1 T i 1 R ) = 1 v ( E i T i R ) , and Claim 1 follows by induction.
Claim 2. C 0 , R ( X ) = i = 1 v R x i .
Let z = j = 1 v r j u j C 0 , R ( X ) , where u i is removed from T i 1 at the ith contraction step. We rewrite the z in terms of the x i . This is done recursively, starting with u 1 , as follows: Let C p ( u q ) be the coefficient of u q at step p. Then, x i = u i v ( u a i ) v ( u i ) u a i implies C i ( u i ) u i = C i ( u i ) x i + C i ( u i ) v ( u a i ) v ( u i ) u a i , and we obtain the updated u a i -coefficient
C i + 1 ( u a i ) = C i ( u a i ) + C i ( u i ) v ( u a i ) v ( u i ) .
Processing all u i , we arrive at z = = 1 v 1 r i x i + r v x v , and Claim 2 follows.
Claim 3. H 0 , R ( X ) i = 1 v 1 R x ¯ i R x v with x ¯ i = x i + 1 v ( e i ) R .
By Claim 2, we have H 0 , R ( X ) = ( i = 1 v R x i ) / Im ( 1 v ) , and we make the following ansatz:
φ : H 0 , R ( X ) i = 1 v R x ¯ i R x v , φ ( i = 1 v 1 r i x i + r v x v ) + Im ( 1 v ) = i = 1 v 1 r i x ¯ i + r v x v .
First, φ is a well-defined homomorphism: Suppose i r i x i i r i x i Im ( 1 v ) . Since x v is linearly independent from x i , 1 i v 1 , this immediately implies that r v = r v . Claim 1 guarantees that Im ( 1 v ) = 1 v ( E v 1 R ) , and consequently, there exist unique coefficients, α i , such that
i = 1 v 1 r i x i i = 1 v 1 r i x i = 1 v ( i = 1 v 1 α i e i ) .
In view of 1 v ( e i ) = n i x i , we obtain
i = 1 v 1 r i x i i = 1 v 1 r i x i = i = 1 v 1 α i 1 v ( e i ) = i = 1 v 1 s i n i x i .
The linear independence of x i , in turn, implies that ( r i r i ) = s i n i for any i, from which ( r i r i ) x i = s i 1 v ( e i ) , and thus, r i x ¯ i = r i x ¯ i .
φ is, by construction, surjective and does not have a nontrivial kernel from which Claim 3 follows.
By construction, R x v R , and as for the direct summands R x ¯ k , for 1 k v 1 , we consider the mapping h : R x ¯ k R / R n k , h ( r x ¯ k ) = r mod n k . Suppose that ( r k r k ) x k = s k 1 v ( e k ) ; then, 1 v ( e k ) = n k x k implies that ( r k r k ) = s k n k , so h is well defined. Clearly h is surjective, and it remains to check the injectivity. h ( r x ¯ k ) = 0 implies that r = s n k and s n k x k = s 1 v ( e k ) implies that r x ¯ k = 0 ¯ , hence the theorem. □
We proceed by analyzing H 1 , R ( X ) . We begin with the expression for infl ( c κ # ) in Lemma 4:
infl ( c κ # ) = j v ( u j ) v ( Δ j ) · u j + k β f ( k β ) · k β ,
where Δ j is the 2-simplex containing the edge u j , and k β are mixed edges that appear when transitioning between 2-simplices with different weights that share a k β face.
The “troublemakers” here are the off-diagonal terms of the embedding, k β , which stem from passing from trivial to nontrivial crossing components and vice versa. We shall eliminate such transitions by appropriately deforming X without affecting its homology. We then have to check that the deformation gives rise to an inflation map that is just a restriction of the original inflation map for X. Let Δ X be a 2-simplex corresponding to a non-crossing arc, i.e., Δ is maximal in X. W.l.o.g., we can assume that Δ = [ s 1 , s 2 , t ] with s 1 , s 2 S , t T . Then, τ = [ s 1 , s 2 ] is Δ -free, and by construction, v ( τ ) = v ( Δ ) = 2 . Removing τ and Δ from X produces a sub-complex X with X X . By Proposition 2, we may remove all 2-simplices contained in X corresponding to non-crossing arcs together with their pure faces without changing the homology. Accordingly, the resulting X ˜ is an X-sub-complex such that H i , R ( X ˜ ) H i , R ( X ) . Note that X ˜ is, in general, not a complex induced by a bi-structure.
Lemma 6.
There exists an X ˜ 1 -spanning tree, T X ˜ 1 # , with 1 v ( T X ˜ 1 # R ) = 1 v ( X 1 R ) .
Proof. 
Let T 0 be a fixed X ˜ 1 -spanning tree. We then continue with X ˜ 1 T 0 -edge replacements, as in Lemma 5, and the process terminates with the X ˜ 1 -spanning tree T k = T X ˜ 1 # , which has the property 1 v ( T X ˜ 1 # R ) = 1 v ( C 1 , R ( X ˜ ) ) .
Note that a pure edge e that is not in X ˜ can be written as e = 2 v ( Δ e ) + e 2 v ( Δ e ) , where Δ e X 2 is the unique 2-simplex containing the pure edge e. Since v ( e ) = v ( Δ e ) , we have ( e 2 v ( Δ e ) ) C 1 , R ( X ˜ ) or ( 2 v ( Δ e ) + e ) C 1 , R ( X ˜ ) , depending on the sign of e in 2 v ( Δ e ) . Consequently, 1 v ( C 1 , R ( X ˜ ) ) = 1 v ( C 1 , R ( X ) ) , and the lemma follows. □
Note that H 1 , R ( X ˜ ) Ker ( 1 v ( X ˜ ) ) / Im ( 2 v ( X ˜ ) ) and Ker ( 1 v ( X ˜ ) ) = H 1 , R ( X ˜ 1 ) . Furthermore, by Proposition 3, H 1 , R ( X ˜ 1 ) is generated by elementary cycles induced by T X ˜ 1 # . Hence, we can exploit the structure of T X ˜ 1 # in order to compute H 1 , R ( X ˜ ) H 1 , R ( X ) .
Theorem 2.
H 1 , R ( X ) is an R-torsion module, and there exists an isomorphism
H 1 , R ( X ) κ # R / ( π | κ # | 1 ) ,
where κ # is a closing edge of a minimal cycle c κ # H 1 ( X ˜ ) with respect to T X ˜ 1 # .
Proof. 
We begin by establishing that the inflation map naturally restricts to X ˜ .
Claim 1. We have the commutative diagram
0 H 1 ( X 1 ) infl H 1 , R ( X 1 ) I * H 1 , R ( X ) 0 . I * I * 0 H 1 ( X ˜ 1 ) infl | H 1 ( X ˜ 1 ) H 1 , R ( X ˜ 1 ) I * H 1 , R ( X ˜ ) 0
Clearly, H 2 ( X 1 , X ˜ 1 ) = 0 and H 2 , R ( X 1 , X ˜ 1 ) = 0 , and by the long homology sequences, I * : H 1 ( X ˜ 1 ) H 1 ( X 1 ) and I * : H 1 , R ( X ˜ 1 ) H 1 , R ( X 1 ) are embeddings. Lemma 6 shows that T X ˜ 1 # is also a T X 1 # -tree, from which H 1 ( X 1 ) is obtained from H 1 ( X ˜ 1 ) by just adding all elementary cycles induced by the pure edges that are not contained in X ˜ . Consequently, removing these edges from X 1 is equivalent to restricting the inflation map infl : H 1 ( X 1 ) H 1 , R ( X 1 ) to the sub-complex X ˜ 1 , and Claim 1 follows.
Claim 2. For infl | H 1 ( X ˜ 1 ) : H 1 ( X ˜ 1 ) H 1 , R ( X ˜ 1 ) , we have infl | H 1 ( X ˜ 1 ) ( c κ # ) = π | κ # | 1 c κ # v . As in Lemma 4:
infl ( c κ # ) = j v ( u j ) v ( Δ j ) · u j + k β f ( k β ) · k β ,
where Δ j is the 2-simplex containing the edge u j , and k β are mixed edges that appear when transitioning between 2-simplices with different weights that share a k β face. By construction, for any Δ X ˜ 2 , v ( Δ ) = π 1 , and so no k β edges emerge in the inflation images. In particular, κ # has the coefficient v ( κ # ) / v ( Δ κ # ) = π | κ # | 1 in infl ( c κ # ) . By Lemma 6 1 v ( T X ˜ 1 # R ) = 1 v ( X ˜ R ) , and so the corresponding elementary cycle c κ # v has the coefficient 1 at κ # . Since H 1 , R ( X ˜ 1 ) is freely generated by all elementary cycles, Claim 2 follows.
H 1 , R ( X ˜ 1 ) is freely generated by M = { c κ # v κ # } and is free of rank | M | . By Lemma 4, | M | = rnk ( infl ( H 1 ( X ˜ 1 ) ) ) , from which H 1 , R ( X ˜ ) H 1 , R ( X ˜ 1 ) / ( R Z infl ( H 1 ( X ˜ 1 ) ) ) is, by the structure theorem of finitely generated modules over PIDs, a full-torsion module. The exact sequence
0 R Z infl ( H 1 ( X ˜ 1 ) ) inj H 1 , R ( X ˜ 1 ) I * H 1 , R ( X ˜ ) 0
implies that R Z infl ( H 1 ( X ˜ 1 ) ) = { π | κ # | 1 c κ # v κ # } R , and clearly,
H 1 , R ( X ) H 1 , R ( X ˜ ) κ # R / ( π | κ # | 1 ) .
 □
Theorem 3.
The injection inj : H 2 ( X ) H 2 , R ( X ) induces an isomorphism of R-modules:
H 2 , R ( X ) R Z H 2 ( X ) .
Proof. 
The long homology sequence for the weighted homology induces the exact sequence
0 H 2 , R ( X ) J * H 2 , R ( X , X 1 ) ^ 2 v Im ( ^ 2 v ) 0 ,
where H 2 , R ( X ) Ker ( 2 v ) and H 2 , R ( X , X 1 ) = C 2 , R ( X ) . Hence, Im ( J * ) = Ker ( 2 v ) = Ker ( ^ 2 v ) , and the sequence is exact. In view of Im ( 2 v ^ ) R H 1 , R ( X 1 ) and Proposition 3, all modules in the sequence are free and, thus, projective; hence, rnk ( H 2 , R ( X ) ) + rnk ( Im ( ^ 2 v ) ) = rnk ( H 2 , R ( X , X 1 ) ) . Analogously, the exact sequence
0 H 2 ( X ) J * H 2 ( X , X 1 ) ^ 2 H 1 ( X 1 ) I * 0
of free modules implies that rnk ( H 2 ( X ) ) + rnk ( H 1 ( X 1 ) ) = rnk ( H 2 ( X , X 1 ) ) . Since
R Z H 2 ( X , X 1 ) ) H 2 , R ( X , X 1 )
and rnk ( Im ( ^ 2 v ) ) = rnk ( R Z infl ( H 1 ( X 1 ) ) ) = rnk ( H 1 ( X 1 ) ) , we can conclude that
rnk ( H 2 , R ( X ) ) = rnk ( H 2 ( X ) ) .
Finally, since both H 2 , R ( X ) and R Z H 2 ( X ) are free finitely generated modules over the PID R, the theorem follows. □
Remark 1.
Since X 2 = X , by construction, C i > 2 , R ( X ) = 0 . As such, H i > 2 , R ( X ) = 0 , and, in view of Corollary 1, this holds for weighted complexes stemming from arbitrary bi-structures.

8. Conclusions and Future Work

In this paper, we introduced the weighted homology of the nerve complex X of an RNA bi-structure. We demonstrated that the weighted homology of a bi-structure distinctively augments the simplicial homology.
In the simplicial homology, only the zeroth and second homology groups are nontrivial [1]. Since the complex of a bi-structure is, by construction, connected, the simplicial homology carries key information exclusively via H 2 ( X ) .
The weighted homology not only conserves the information provided by the simplicial homology, but it also supplies additional information. Specifically, connectivity is still picked up by the free submodule of H 0 , R ( X ) ; however, additional torsion now emerges. This torsion is given a concrete combinatorial interpretation via the compression algorithm on the spanning tree T X 1 # . In fact, the compression procedure naturally generates the free submodule of rank one at its final step.
In the case of H 2 , R ( X ) , we consider H 2 ( X ) and observe that the nontrivial chain maps θ n : C n ( X ) C n , R ( X ) , θ ( i n i σ i ) = i n i v ( σ i ) σ i , produce an injection inj : H 2 ( X ) H 2 , R ( X ) , in which all coefficients are one, since any 2-simplex contained in the expression of an H 2 ( X ) -generator (i.e., appearing in a crossing component) has a weight of one. As a result, passing to the ring R preserves the information present in H 2 ( X ) .
In the computation of H 1 , R ( X ) , we employed the simplicial homology in a different way. Here, we consider the relative homology and transcend the information from the 1-skeleton, X 1 . While H 1 ( X ) is trivial [1], H 1 ( X 1 ) carries information, which we can utilize by extracting from
0 H 2 , R ( X ) J * H 2 , R ( X , X 1 ) ^ 2 v H 1 , R ( X 1 ) I * H 1 , R ( X ) 0 inj inj infl 0 H 2 ( X ) J * H 2 ( X , X 1 ) ^ 2 H 1 ( X 1 ) I * 0 0
the embedded exact sequence of Lemma 4. After resolving some technicalities, this allows us to compute H 1 , R ( X ) . Note that passing to the deformation X ˜ greatly simplifies the inflation map, and we observe that it acts on X ˜ as θ 1 would on the 1-skeleton.
The structure theorems for the weighted homology enable the classification of bi-structures by their R-modules. Such classifications can provide insights into the algorithmic complexity of problems involving bi-structures. For instance, the problem of computing thermodynamically stable sequences for a bi-structure can be efficiently solved in polynomial time when the second homology R-module H 2 , R ( X ) is trivial [2].
As for future work, we shall extend the homological analysis to RNA–RNA interaction structures [35]. An interaction structure can be represented as a diagram with two backbones drawn horizontally on top of each other such that both intra-molecular and inter-molecular bonds are non-crossing; see Figure 5.
The inter-molecular arcs naturally induce an equivalence relation ϕ on the vertices of the two backbones. The intersection of two loops is accordingly defined to be the set of ϕ -equivalence classes of vertices that they share. A bi-secondary structure can then be viewed as a particular type of interaction structure—namely, an interaction structure with two identical backbones, where interaction arcs connect any pair ( i , i ) for 0 i n + 1 . In contrast to a bi-structure, an interaction structure can exhibit nontrivial first-simplicial homology and requires revisiting the notion of a crossing component for interaction structures. The analysis involves many more topologies—in particular, surfaces and Mayer–Vietoris sequences [34].
The framework of the weighted homology is by no means restricted to bi- or interaction structures. It also gives rise to the consideration of the dissimilarity complex of a finite set of genomic sequences, which we briefly discuss in the following. A multiple sequence alignment W of genomic sequences is defined to be a finite set of words of equal length m over the finite alphabet A = { a 1 , , a s } . For any k { 1 , , m } , let f k : W A be the map that returns the symbol at position k of a word, namely, f k ( w ) = a w k for w = a w 1 a w m W . A d + 1 subset σ W forms a d-simplex if there exists at least one position j such that | { f j ( w ) | w σ } | = d + 1 . In other words, all d + 1 sequences in σ contain mutually different nucleotide types at the site j. For a fixed σ , the number of different positions j for which the above holds is called the dissimilarity of σ . This leads to the aforementioned dissimilarity complex for genomic sequences; see Figure 6. This complex is, by definition, low dimensional; its highest dimension is s 1 . For d + 1 = 2 , the 1-skeleton of the dissimilarity complex captures the well-known Hamming distance—the number of positions in which two genetic sequences differ. It may be worth pointing out that this framework integrates well-known generalizations of distances, as these appear naturally as weights in the dissimilarity complex. Triangle inequalities generalize to tetrahedron inequalities, etc., and this could enhance our understanding of genetic sequences. This is because basic constructs—e.g., trees, such as the tree of life—that reflect the ancestral relations between sequences depend on the notion of Hamming distance. Our preliminary investigations suggest that the homology of the dissimilarity complex captures evolutionary events. For example, the free rank of the first homology module gives bounds of the possible recombinations within the sequence set.

Author Contributions

Conceptualization, A.B., Q.H. and C.R.; writing, A.B., Q.H. and C.R.; review, A.B., Q.H. and C.R.; investigation: A.B., Q.H. and C.R. All authors have contributed equally to this work. All authors have read and agreed to the possible publication of the manuscript.

Funding

This research received no external funding; the authors received no external financial support for the research, authorship, and/or publication of this article.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We gratefully acknowledge the comments from and discussions with Fenix Huang and Thomas Li. We also appreciate the constructive feedback provided by the anonymous reviewers.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bura, A.C.; He, Q.; Reidys, C.M. Loop Homology of Bi-secondary Structures. Discret. Math. 2021, 344, 112371. [Google Scholar] [CrossRef]
  2. Huang, F.W.; Barrett, C.L.; Reidys, C.M. The energy-spectrum of bicompatible sequences. arXiv 2019, arXiv:1910.00190. [Google Scholar]
  3. Ren, S.; Wu, C.; Wu, J. Weighted persistent homology. Rocky Mt. J. Math. 2018, 48, 2661–2687. [Google Scholar] [CrossRef] [Green Version]
  4. Holley, R.W.; Apgar, J.; Everett, G.A.; Madison, J.T.; Marquisee, M.; Merrill, S.H.; Penswick, J.R.; Zamir, A. Structure of a ribonucleic acid. Science 1965, 147, 1462–1465. [Google Scholar] [CrossRef] [PubMed]
  5. Thirumalai, D.; Lee, N.; Woodson, S.A.; Klimov, D. Early events in RNA folding. Annu. Rev. Phys. Chem. 2001, 52, 751–762. [Google Scholar] [CrossRef] [Green Version]
  6. Fresco, J.R.; Alberts, B.M.; Doty, P. Some molecular details of the secondary structure of ribonucleic acid. Nature 1960, 188, 98–101. [Google Scholar] [CrossRef] [PubMed]
  7. Darnell, J.E. RNA: Life’s Indispensable Molecule; Cold Spring Harbor Laboratory Press: New York, NY, USA, 2011. [Google Scholar]
  8. Chapuy, G. A new combinatorial identity for unicellular maps, via a direct bijective approach. Adv. Appl. Math. 2011, 47, 874–893. [Google Scholar] [CrossRef]
  9. Waterman, M.S. Secondary structure of single-stranded nucleic acids. Adv. Math. Suppl. Stud. 1978, 1, 167–212. [Google Scholar]
  10. Schmitt, W.R.; Waterman, M.S. Linear trees and RNA secondary structure. DIscrete Appl. Math. 1994, 51, 317–323. [Google Scholar] [CrossRef]
  11. Hofacker, I.L.; Schuster, P.; Stadler, P.F. Combinatorics of RNA secondary structures. Discret. Appl. Math. 1998, 88, 207–237. [Google Scholar] [CrossRef] [Green Version]
  12. Haslinger, C.; Stadler, P.F. RNA structures with pseudo-knots: Graph-theoretical, combinatorial, and statistical properties. Bull. Math. Biol. 1999, 61, 437–467. [Google Scholar] [CrossRef] [PubMed]
  13. Jin, E.Y.; Qin, J.; Reidys, C.M. Combinatorics of RNA structures with pseudoknots. Bull. Math. Biol. 2008, 70, 45–67. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  14. Orland, H.; Zee, A. RNA folding and large N matrix theory. Nucl. Phys. B 2002, 620, 456–476. [Google Scholar] [CrossRef] [Green Version]
  15. Andersen, J.E.; Chekhov, L.O.; Penner, R.; Reidys, C.M.; Sułkowski, P. Topological recursion for chord diagrams, RNA complexes, and cells in moduli spaces. Nucl. Phys. B 2013, 866, 414–443. [Google Scholar] [CrossRef] [Green Version]
  16. Bon, M.; Vernizzi, G.; Orland, H.; Zee, A. Topological classification of RNA structures. J. Mol. Biol. 2008, 379, 900–911. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  17. Andersen, J.E.; Penner, R.C.; Reidys, C.M.; Waterman, M.S. Topological classification and enumeration of RNA structures by genus. J. Math. Biol. 2013, 67, 1261–1278. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  18. Huang, F.W.; Reidys, C.M. Shapes of topological RNA structures. Math. Biosci. 2015, 270, 57–65. [Google Scholar] [CrossRef] [Green Version]
  19. Chen, W.; Deng, E.; Du, R.; Stanley, R.; Yan, C. Crossings and nestings of matchings and partitions. Trans. Am. Math. Soc. 2007, 359, 1555–1575. [Google Scholar] [CrossRef] [Green Version]
  20. Stanley, R.P. Enumerative Combinatorics; Wadsworth Publ.: Belmont, CA, USA, 1986. [Google Scholar]
  21. Sundaram, S. The Cauchy Identity for sp (2n). J. Comb. Theory Ser. A 1990, 52, 209–238. [Google Scholar] [CrossRef] [Green Version]
  22. Penner, R.; Waterman, M.S. Spaces of RNA secondary structures. Adv. Math. 1993, 101, 31–49. [Google Scholar] [CrossRef] [Green Version]
  23. Zuker, M.; Sankoff, D. RNA secondary structures and their prediction. Bull. Math. Biol. 1984, 46, 591–621. [Google Scholar] [CrossRef]
  24. Gralla, J.; Crothers, D.M. Free energy of imperfect nucleic acid helices: II. Small hairpin loops. J. Mol. Biol. 1973, 73, 497–511. [Google Scholar] [CrossRef]
  25. Turner, D.H.; Mathews, D.H. NNDB: The nearest neighbor parameter database for predicting stability of nucleic acid secondary structure. Nucleic Acids Res. 2009, 38, D280–D282. [Google Scholar] [CrossRef] [PubMed]
  26. Nussinov, R.; Jacobson, A.B. Fast algorithm for predicting the secondary structure of single-stranded RNA. Proc. Natl. Acad. Sci. USA 1980, 77, 6309–6313. [Google Scholar] [CrossRef] [Green Version]
  27. Waterman, M.S.; Smith, T.F. Rapid dynamic programming algorithms for RNA secondary structure. Adv. Appl. Math. 1986, 7, 455–464. [Google Scholar] [CrossRef] [Green Version]
  28. McCaskill, J.S. The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolym. Orig. Res. Biomol. 1990, 29, 1105–1119. [Google Scholar] [CrossRef]
  29. Reidys, C.M. Random induced subgraphs of generalizedn-cubes. Adv. Appl. Math. 1997, 19, 360–377. [Google Scholar] [CrossRef] [Green Version]
  30. Flamm, C.; Hofacker, I.L.; Maurer-Stroh, S.; Stadler, P.F.; Zehl, M. Design of multistable RNA molecules. RNA 2001, 7, 254–265. [Google Scholar] [CrossRef] [Green Version]
  31. Bura, A.C.; He, Q.; Reidys, C.M. Loop homology of bi-secondary structures II. arXiv 2019, arXiv:1909.01222. [Google Scholar]
  32. Whitehead, J. Simplicial Spaces, Nuclei and m-Groups. Proc. Lond. Math. Soc. 1939, 2, 243–327. [Google Scholar] [CrossRef]
  33. Cohen, M.M. A Course in Simple-Homotopy Theory, 2nd ed.; Springer: Berlin, Germany, 1973. [Google Scholar]
  34. Hatcher, A. Algebraic Topology; Cambridge University Press: Cambridge, UK, 2000. [Google Scholar]
  35. Andersen, J.E.; Huang, F.W.; Penner, R.C.; Reidys, C.M. Topology of RNA-RNA interaction structures. J. Comput. Biol. 2012, 19, 928–943. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Left: a planar RNA secondary structure with a closing base pair r = ( 0 , 20 ) . Right: its diagram representation on the set of vertices [ 19 ] with two additional vertices, 0 and 20, forming the rainbow r = ( 0 , 20 ) , the loop s = [ 3 , 4 ] [ 8 , 12 ] [ 16 , 17 ] (shaded), and the corresponding intervals (underlined). The arc x = ( 3 , 17 ) is the maximal arc of s, α s = x .
Figure 1. Left: a planar RNA secondary structure with a closing base pair r = ( 0 , 20 ) . Right: its diagram representation on the set of vertices [ 19 ] with two additional vertices, 0 and 20, forming the rainbow r = ( 0 , 20 ) , the loop s = [ 3 , 4 ] [ 8 , 12 ] [ 16 , 17 ] (shaded), and the corresponding intervals (underlined). The arc x = ( 3 , 17 ) is the maximal arc of s, α s = x .
Mathematics 09 00744 g001
Figure 2. Top: a bi-structure decomposed into a distinguished sub-structure and its complement. The vertices contained in both (white) control the complexity of the dynamic programming (DP) routine in [2]. Bottom: the recursive step of the DP routine. Note that the recursion changes the set of white vertices by removing some and adding others.
Figure 2. Top: a bi-structure decomposed into a distinguished sub-structure and its complement. The vertices contained in both (white) control the complexity of the dynamic programming (DP) routine in [2]. Bottom: the recursive step of the DP routine. Note that the recursion changes the set of white vertices by removing some and adding others.
Mathematics 09 00744 g002
Figure 3. Left: a bi-structure R = ( S , T ) with S-loops a , b , S-rainbow arc α a , T-loops c , d , and T-rainbow arc α c . Right: its weighted loop nerve K 0 ( R ) = { a , b , c , d } , K 1 ( R ) = { [ b , a ] , [ b , c ] , [ b , d ] , [ d , c ] , [ a , c ] , [ a , d ] } (with [ b , a ] and [ d , c ] being pure edges, and any other edge being mixed), and K 2 ( R ) = { [ b , a , c ] , [ b , d , c ] , [ b , a , d ] , [ a , d , c ] } . We have ω ( a ) = 7 , ω ( b ) = 7 , ω ( c ) = 8 , ω ( d ) = 6 , ω ( [ b , a , c ] ) = ω ( [ b , d , c ] ) = ω ( [ b , a , d ] ) = ω ( [ a , d , c ] ) = 1 , while the weights of 1-simplices are displayed directly in the figure.
Figure 3. Left: a bi-structure R = ( S , T ) with S-loops a , b , S-rainbow arc α a , T-loops c , d , and T-rainbow arc α c . Right: its weighted loop nerve K 0 ( R ) = { a , b , c , d } , K 1 ( R ) = { [ b , a ] , [ b , c ] , [ b , d ] , [ d , c ] , [ a , c ] , [ a , d ] } (with [ b , a ] and [ d , c ] being pure edges, and any other edge being mixed), and K 2 ( R ) = { [ b , a , c ] , [ b , d , c ] , [ b , a , d ] , [ a , d , c ] } . We have ω ( a ) = 7 , ω ( b ) = 7 , ω ( c ) = 8 , ω ( d ) = 6 , ω ( [ b , a , c ] ) = ω ( [ b , d , c ] ) = ω ( [ b , a , d ] ) = ω ( [ a , d , c ] ) = 1 , while the weights of 1-simplices are displayed directly in the figure.
Mathematics 09 00744 g003
Figure 4. Splicing and removing tetrahedra together with their corresponding butterflies.
Figure 4. Splicing and removing tetrahedra together with their corresponding butterflies.
Mathematics 09 00744 g004
Figure 5. Left: an interaction structure with two interaction arcs, x and y. Right: the corresponding interaction complex.
Figure 5. Left: an interaction structure with two interaction arcs, x and y. Right: the corresponding interaction complex.
Mathematics 09 00744 g005
Figure 6. A multiple-sequence alignment of three sequences X , Y , Z , their corresponding weighted simplices, and the associated dissimilarity complex.
Figure 6. A multiple-sequence alignment of three sequences X , Y , Z , their corresponding weighted simplices, and the associated dissimilarity complex.
Mathematics 09 00744 g006
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Bura, A.; He, Q.; Reidys, C. Weighted Homology of Bi-Structures over Certain Discrete Valuation Rings. Mathematics 2021, 9, 744. https://doi.org/10.3390/math9070744

AMA Style

Bura A, He Q, Reidys C. Weighted Homology of Bi-Structures over Certain Discrete Valuation Rings. Mathematics. 2021; 9(7):744. https://doi.org/10.3390/math9070744

Chicago/Turabian Style

Bura, Andrei, Qijun He, and Christian Reidys. 2021. "Weighted Homology of Bi-Structures over Certain Discrete Valuation Rings" Mathematics 9, no. 7: 744. https://doi.org/10.3390/math9070744

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop