# Graph Coverings for Investigating Non Local Structures in Proteins, Music and Poems

## Abstract

## 1. Introduction

#### A Brief Review of the Literature

## 2. Graph Coverings and Conjugacy Classes of a Finitely Generated Group

## 3. Graph Coverings for Proteins

#### 3.1. The D614G Variant (Minus RBD) of the SARS-CoV-2 Spike Protein

AYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHDNPVLPF…

AYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVV.

NTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKV…

FVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTV

rel(H,E,C,G,I,T,4) =

CCCCCCCCEEEEEECCCCCCCEEEEECCCCCCCCCCEEEEEECCCCCCCC…

HHHHHHHHCC444444CHHHHHHHHHHHHHHHHHHHHCCCGGGGGHHHHH

HHIIIIICCCCCCCCCCCCCCCCCCTTTTTCCCCCCCCCHHHHHHHHHHH…

CCCTTTTTCCCCCTTTTTCCCC44444EEEEEECC,
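The secondary-structure string above is read as a word `rel` in the letters H, E, C, G, I, T and 4. The following Python sketch shows one way such a word could be compacted into powers of letters and how letters could be merged to shrink the alphabet. The helper names `merge_letters`, `run_length` and `as_power_word` are hypothetical, the input fragment is illustrative rather than copied from the sequence, and the merge 4 → H, T → C follows the convention stated in the caption of Table 3.

```python
from itertools import groupby


def merge_letters(word, mapping):
    """Replace letters according to mapping; other letters stay unchanged."""
    return "".join(mapping.get(ch, ch) for ch in word)


def run_length(word):
    """Return the word as a list of (letter, exponent) pairs."""
    return [(letter, len(list(group))) for letter, group in groupby(word)]


def as_power_word(word):
    """Format a word as powers of letters, e.g. 'HHC' -> 'H^2 C'."""
    parts = []
    for letter, exponent in run_length(word):
        parts.append(letter if exponent == 1 else f"{letter}^{exponent}")
    return " ".join(parts)


# Illustrative fragment (assumption: built by hand, not taken from the data).
fragment = "H" * 8 + "C" * 2 + "4" * 6 + "C"
print(as_power_word(fragment))

# Merge 4 into H and T into C, as in the convention of Table 3.
reduced = merge_letters(fragment, {"4": "H", "T": "C"})
print(as_power_word(reduced))
```

Merging letters before compaction is what reduces the number of generators, so the same string can be analyzed with 6, 5, 4 or 3 letters.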

#### 3.2. The $\beta$-2-Glycoprotein 1 or Apolipoprotein-H

## 4. Graph Coverings for Musical Forms

#### 4.1. The Sequence Isoc$(X;1)$, the Golden Ratio and More

#### 4.1.1. The Fibonacci Sequence

#### 4.1.2. The Period Doubling Cascade

#### 4.1.3. Musical Forms of the Classical Age

#### 4.2. The Sequence Isoc$(X;2)$ in Twentieth Century Music and Jazz

## 5. Graph Coverings for Prose and Poems

#### 5.1. Graph Coverings for Prose

- Le gamin du céleste Empire hésita d’abord; puis, se ravisant, il répondit: “Je vais vous le dire”. Peu d’instants après, il reparut, tenant dans ses bras un fort gros chat, et le regardant, comme on dit, dans le blanc des yeux, il affirma sans hésiter: “Il n’est pas encore tout à fait midi.” Ce qui était vrai.

#### 5.2. Graph Coverings for Poems

## 6. Conclusions

## References

**Figure 1.** A picture of the secondary structure of the D614G variant (minus RBD) of the SARS-CoV-2 spike protein, found in the Protein Data Bank in Europe [22].

**Figure 2.** A picture of the secondary structure of apolipoprotein-H obtained with the software [24].

**Table 1.** The number Isoc$(X;d)$ for small values of the first Betti number r (alias the number of generators of the free group ${F}_{r}$) and of the index d. Each entry is thus the number of conjugacy classes of subgroups of index d in the free group of rank r.

r | d = 1 | d = 2 | d = 3 | d = 4 | d = 5 | d = 6 | d = 7 |
---|---|---|---|---|---|---|---|
1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
2 | 1 | 3 | 7 | 26 | 97 | 624 | 4163 |
3 | 1 | 7 | 41 | 604 | 13,753 | 504,243 | 24,824,785 |
4 | 1 | 15 | 235 | 14,120 | 1,712,845 | 371,515,454 | 127,635,996,839 |
5 | 1 | 31 | 1361 | 334,576 | 207,009,649 | 268,530,771,271 | 644,969,015,852,641 |
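The small entries of Table 1 can be checked by brute force. The sketch below relies on a standard correspondence: conjugacy classes of index-d subgroups of ${F}_{r}$ match transitive r-tuples of permutations of d points, counted up to simultaneous conjugation. The orbits are counted with Burnside's lemma. This is an illustrative check, not necessarily how the table was produced, and the function names are hypothetical. It is only practical for small r and d, since the search space grows like $(d!)^{r}$.

```python
from itertools import permutations, product
from math import factorial


def compose(p, q):
    """Return the permutation p after q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(p)))


def is_transitive(gens, d):
    """Check whether the permutations in gens act transitively on range(d)."""
    seen = {0}
    stack = [0]
    while stack:
        x = stack.pop()
        for g in gens:
            y = g[x]
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == d


def count_conjugacy_classes(r, d):
    """Count conjugacy classes of index-d subgroups of the free group F_r.

    Burnside's lemma: the number of orbits of S_d acting by simultaneous
    conjugation on transitive r-tuples equals the average number of
    transitive r-tuples fixed by g, i.e. lying in the centralizer of g.
    """
    sym = list(permutations(range(d)))
    total = 0
    for g in sym:
        centralizer = [s for s in sym if compose(s, g) == compose(g, s)]
        total += sum(
            1 for gens in product(centralizer, repeat=r)
            if is_transitive(gens, d)
        )
    return total // factorial(d)


for r in range(1, 4):
    print(r, [count_conjugacy_classes(r, d) for d in range(1, 5)])
```

For r = 2 the loop should reproduce 1, 3, 7, 26, the first four entries of the corresponding row of Table 1.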

**Table 2.** Group analysis of the D614G variant (minus RBD) of the SARS-CoV-2 spike protein. Bold numbers indicate that the cardinality structure of the conjugacy classes (cc) of subgroups of G fits that of the free group ${F}_{r-1}$ when the encoding makes use of r letters. In the last column, r is the first Betti number of the generating group ${f}_{p}$.

PDB 6XS6: AYTNSFTRGVYYPDKVFRSSVLHSTQDL … | Cardinality Structure of cc of Subgroups | r |
---|---|---|
6 letters H, E, C, G, I, T | [1,31,1361,334576] | 5 |
5 letters H, E, C, G, I | [1,15,235,14120] | 4 |
4 letters H, E, C, G | [1,7,41,604,14720] | 3 |
3 letters H, E, C | [1,3,7,30,127,926] | 2 |
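The comparison that the caption of Table 2 describes, between an observed cardinality sequence and that of a free group, can be sketched as a prefix match. In the Python sketch below, the reference rows are copied from Table 1 and the observed sequences from Table 2. The names `FREE_GROUP_CC` and `matching_prefix` are hypothetical, and reading "fits" as "agrees on an initial segment" is an assumption.

```python
# Reference rows copied from Table 1 (free group rank -> counts for d = 1..7).
FREE_GROUP_CC = {
    2: [1, 3, 7, 26, 97, 624, 4163],
    3: [1, 7, 41, 604, 13753, 504243, 24824785],
    4: [1, 15, 235, 14120, 1712845, 371515454, 127635996839],
    5: [1, 31, 1361, 334576, 207009649, 268530771271, 644969015852641],
}


def matching_prefix(observed, rank):
    """Return how many leading terms of observed agree with the free group row."""
    count = 0
    for a, b in zip(observed, FREE_GROUP_CC[rank]):
        if a != b:
            break
        count += 1
    return count


# Observed sequences from Table 2, keyed by number of letters.
observed = {
    6: ([1, 31, 1361, 334576], 5),
    5: ([1, 15, 235, 14120], 4),
    4: ([1, 7, 41, 604, 14720], 3),
    3: ([1, 3, 7, 30, 127, 926], 2),
}

for letters, (sequence, rank) in observed.items():
    agree = matching_prefix(sequence, rank)
    print(f"{letters} letters: {agree} of {len(sequence)} terms match F_{rank}")
```

With these inputs, the 6- and 5-letter encodings agree with the free group on every listed term, while the 4- and 3-letter encodings depart from it at index 5 and index 4, respectively.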

**Table 3.** Group analysis of apolipoprotein-H (PDB 6V06). Bold numbers indicate that the cardinality structure of cc of subgroups of ${f}_{p}$ fits that of the free group ${F}_{2}$ when the encoding makes use of 3 letters. The first model is the one used in the previous section [24], where we took $4=H$ and $T=C$. The other models of secondary structure, with segments E, H and C, are from the software tools PORTER, PHYRE2 and RAPTORX. References to these tools may be found in our recent paper [3]. The notation r in column 3 means the first Betti number of ${f}_{p}$.

PDB 6V06: GRTCPKPDDLPFSTVVPLKTFYEPG… | Cardinality Structure of cc of Subgroups | r |
---|---|---|
Konagurthu | [1,3,7,26,218,2241] | 2 |
PORTER | [1,3,7,26,97,624] | . |
PHYRE2 | [1,3,7,26,157,1046] | . |
RAPTORX | [1,7,17,134,923,13317] | 3 |

**Table 4.** Group analysis of a few musical forms whose structure of subgroups, apart from exceptions, is close to Isoc$(X;d)$ with $d=2$ (in the upper part of the table) or $d=3$ (in the lower part of the table). Of course, the forms A-B-C and A-B-C-D have the cardinality sequence of cc of subgroups exactly equal to Isoc$(X;2)$ and Isoc$(X;3)$, respectively.

Musical Form | Ref | Card. Struct. of cc of Subgr. | r |
---|---|---|---|
A-B-C-B-A | arch, Béla Bartók | [1,3,7,26,97,624,4163,34470,314493] | 2 |
A-B-A-C-A-B-A | . | . | . |
A-B-A-C-A, A-B-A-C-A-B-A | rondo | . | . |
A-B-A-C | . | . | . |
A-A-B-C-C | Haydn [32], djanba ([33], Figure 9.8) | . | . |
A-A-A-A-B-B-A-A-C-C-A-A | twelve-bar blues, standard | [1,7,14,109,396,3347,19758,287340] | 3 |
A-A-A-A-B-B-A-A-C-B-A-A | twelve-bar blues, variation 1 | [1,3,7,26,97,624,4163,34470,314493] | 2 |
A-A-A-A-B-B-A-A-B-C-A-C | twelve-bar blues, variation 2 | [1,3,7,26,127,799,5168,42879] | . |
A-B-C | Isoc$(X;2)$ | [1,3,7,26,97,624,4163,34470,314493] | 2 |
A-A-B-B-C-C-D-D | pot pourri | [1,15,82,1583,30242] | 4 |
A-B-A-C-A-D-A | rondo | [1,7,41,604,13753,504243] | 3 |
A-B-C-D | Isoc$(X;3)$ | [1,7,41,604,13753,504243,24824785] | 3 |

**Table 5.** Group analysis of an excerpt of a short prose poem, Le vieux saltimbanque, by Charles Baudelaire. The text is split into segments encoded by the symbols H (for nouns and adjectives), E (for verbs), A (for prepositions), B (for adverbs), or C (for the other types: conjunctions, punctuation marks and so on). The cardinality structure of the cc of subgroups of small index is compared to the one obtained from 10 runs of a random sequence of similar length (i.e., length 250) with the corresponding number of letters.

Le gamin du céleste Empire … Ce qui était vrai. | Card. Seq. of cc of Subgroups | r |
---|---|---|
3 letters: rel=${C}^{2}{H}^{5}{C}^{2}{H}^{7}{H}^{6}{E}^{6}{C}^{7}C{C}^{4}C{C}^{2}{E}^{8}C\cdots $ | [1,3,7,34,131] | 2 |
4 letters: rel=${C}^{2}{H}^{5}{A}^{2}{H}^{7}{H}^{6}{E}^{6}{C}^{7}C{C}^{4}C{C}^{2}{E}^{8}C\cdots $ | [1,7,41,636,14364] | 3 |
5 letters: rel=${C}^{2}{H}^{5}{A}^{2}{H}^{7}{H}^{6}{E}^{6}{B}^{7}C{B}^{4}C{C}^{2}{E}^{8}C\cdots $ | [1,15,235,14376,.] | 4 |
[Random[1,3]: i in [1..250]] | [1,1,1,2,4,4] | 1 |
(10 runs) | [1,3,2,9,5,20] | 2 |
. | [1,3,1,6,6,15] | . |
. | [1,3,7,30,124,987] | . |
. | [1,7,17,126,323,2445] | 3 |
. | etc. | . |
Isoc(X;2) | [1,3,7,26,97,624] | 2 |
[Random[1,4]: i in [1..250]] | [1,3,7,30,.] ($\times 3$) | 2 |
(10 runs) | [1,3,10,51,.] ($\times 3$) | . |
. | [1,3,7,26,457] | . |
. | [1,3,10,39,.] | . |
. | [1,3,13,52,.] | . |
. | [1,7,20,143,.] | 3 |
Isoc(X;3) | [1,7,41,604,13753] | 3 |
[Random[1,5]: i in [1..250]] | [1,7,41,620,.] ($\times 3$) | 3 |
(10 runs) | [1,7,41,636,.] ($\times 3$) | . |
. | [1,7,41,604,.] ($\times 2$) | . |
. | [1,7,41,668,.] | . |
. | [1,7,50,819,.] | . |
Isoc(X;4) | [1,15,235,14120,1712845] | 4 |
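The baseline rows of Table 5 are labelled with a list comprehension that draws 250 random integers in a fixed range, one per position of the word. A Python analogue is sketched below, under stated assumptions: the function name `random_word`, the seed and the choice of letters "HEC" are illustrative, and letters are drawn uniformly and independently. Computing the cardinality sequence of the resulting group is outside the scope of this sketch.

```python
import random


def random_word(length, letters, seed=None):
    """Draw a word of the given length, each letter uniformly from letters."""
    rng = random.Random(seed)  # local generator, so runs are reproducible
    return "".join(rng.choice(letters) for _ in range(length))


# Ten baseline words of length 250 over 3 letters, one seed per run.
baseline = [random_word(250, "HEC", seed=run) for run in range(10)]
for run, word in enumerate(baseline):
    print(run, word[:40] + "...")
```

Each word would then play the role of the relation `rel` in the same group analysis as the encoded text, which is what makes the comparison in the table meaningful.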

**Table 6.** Group structure of the poem Le Bateau Ivre (The Drunken Boat) by Arthur Rimbaud. Only the first strophe (which has four lines) is analyzed, first in its original form, then in an English translation. Each line is split into segments encoded by the symbols H (for nouns and adjectives), E (for verbs) or C (for the other types: conjunctions, adverbs, prepositions, punctuation marks and so on). The group relation is displayed for the first line only. The cardinality structure of cc of subgroups of small index is compared to the one obtained from 10 runs of a sequence of random 3-letter words of similar length (i.e., length 35).

Le Bateau Ivre, first strophe | Card. Seq. of cc of Subgroups | r |
---|---|---|
Comme je descendais des fleuves impassibles, | [1,1,7,17,114,1395,36973] | 1 |
rel=${C}^{4}{C}^{2}{E}^{10}{C}^{3}{H}^{7}{H}^{11}C$ | | |
Je ne me sentis plus guidé par les haleurs: | [1,3,7,26,97,624,4171] | 2 |
Des Peaux-Rouges criards les avaient pris pour cibles | [1,3,7,26,97,624,4163] | . |
Les ayant cloués nus aux poteaux de couleurs. | [1,3,7,26,97,624,4163] | . |
As I was floating down unconcerned rivers | [1,3,7,26,97,624,4163,34470] | 2 |
rel=${C}^{2}C{E}^{3}{E}^{8}{C}^{4}{E}^{11}{H}^{6}$ | | |
I no longer felt myself steered by the haulers: | [1,3,7,26,101,656,4227] | 2 |
Gaudy Redskins had taken them for targets | [1,3,7,26,97,624,4163,324935] | . |
Nailing them naked to coloured stakes. | [1,3,7,42,202,1682,9204] | . |
[Random[1,3]: i in [1..35]] | [1,3,7,30,.] ($\times 3$) | 2 |
(10 runs) | [1,3,7,26,.] ($\times 3$) | . |
. | [1,3,7,.,.] | . |
. | [1,3,10,.,.] ($\times 2$) | . |
. | [1,3,13,.,.] | . |
Isoc(X;2) | [1,3,7,26,97,624,4163,34470] | 2 |
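The relation displayed for the English first line in Table 6 suggests that each word contributes one letter (its grammatical class) raised to the number of characters in the word. The Python sketch below encodes a hand-tagged line in that way. The function name `relation_from_tagged_words` is hypothetical, the tags are read off the displayed relation rather than produced by a tagger, and the rule "exponent equals word length" is an inference from that relation, not a stated definition.

```python
def relation_from_tagged_words(tagged):
    """Build a relation string from (word, symbol) pairs.

    Each word becomes its symbol raised to the word's length;
    an exponent of 1 is left implicit.
    """
    parts = []
    for word, symbol in tagged:
        n = len(word)
        parts.append(symbol if n == 1 else f"{symbol}^{n}")
    return " ".join(parts)


# Tags taken from the relation shown for the English first line (3 letters).
first_line = [
    ("As", "C"),
    ("I", "C"),
    ("was", "E"),
    ("floating", "E"),
    ("down", "C"),
    ("unconcerned", "E"),
    ("rivers", "H"),
]

print(relation_from_tagged_words(first_line))
```

Retagging "down" with A instead of C gives the 4-letter variant of the same line, which is the only change between the 3-letter and 4-letter encodings of this line.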

**Table 7.** The same as in Table 6, but each line is split into segments encoded by the symbols H (for nouns and adjectives), E (for verbs), A (for prepositions), or C (for the other types: conjunctions, adverbs, punctuation marks and so on). The cardinality structure of cc of subgroups of small index is compared to the one obtained from 10 runs of a sequence of random 4-letter words of similar length (i.e., length 35).

Le Bateau Ivre, first strophe | Card. Seq. of cc of Subgroups | r |
---|---|---|
Comme je descendais des fleuves impassibles, | [1,7,41,604,13753] | 3 |
rel=${C}^{4}{C}^{2}{E}^{10}{A}^{3}{H}^{7}{H}^{11}C$ | | |
Je ne me sentis plus guidé par les haleurs: | [1,7,41,604,13753] | . |
Des Peaux-Rouges criards les avaient pris pour cibles | [1,7,41,604,13753] | . |
Les ayant cloués nus aux poteaux de couleurs. | [1,7,41,604,13753] | . |
As I was floating down unconcerned rivers | [1,7,59,1386,27011] | 3 |
rel=${C}^{2}C{E}^{3}{E}^{8}{A}^{4}{E}^{11}{H}^{6}$ | | |
I no longer felt myself steered by the haulers: | [1,7,41,604,13753] | . |
Gaudy Redskins had taken them for targets | [1,7,50,1763,51582] | . |
Nailing them naked to coloured stakes. | [1,7,59,1002,18671] | . |
[Random[1,4]: i in [1..35]] | [1,7,50,755,.] ($\times 2$) | 3 |
(10 runs) | [1,7,41,604,.] ($\times 3$) | . |
. | [1,7,41,.,.] ($\times 2$) | . |
. | [1,7,50,739,.] ($\times 2$) | . |
. | [1,7,59,.,.] | . |
Isoc(X;3) | [1,7,41,604,13753] | 3 |

