Projected Gradient Descent Method for Tropical Principal Component Analysis over Tree Space

Yoshida, Ruriko

doi:10.3390/math13111776

Department of Operations Research, Naval Postgraduate School, Monterey, CA 93943, USA

Mathematics 2025, 13(11), 1776; https://doi.org/10.3390/math13111776

Submission received: 16 April 2025 / Revised: 23 May 2025 / Accepted: 24 May 2025 / Published: 27 May 2025

Abstract

Tropical principal component analysis (PCA) is an analogue of classical PCA in the setting of tropical geometry. It has been applied to visualize a set of gene trees over a space of phylogenetic trees, which is a union of lower-dimensional polyhedral cones in a Euclidean space of dimension

m (m - 1) / 2

, where m is the number of leaves. In this paper, we introduce a projected gradient descent method to estimate the tropical principal polytope over the space of phylogenetic trees, and we apply it to an Apicomplexa dataset. With computational experiments against Markov Chain Monte Carlo (MCMC) samplers, we show that our projected gradient descent method yields a lower sum of tropical distances between observations and their projections onto the estimated best-fit tropical polytope, compared with the MCMC-based approach.

Keywords:

phylogenomics; unsupervised learning; non-Euclidean geometry; tropical geometry

MSC:

14T90; 92D15; 62R01

1. Introduction

Phylogenomics is a relatively new field that applies tools from phylogenetics to genome data. One of the key tasks in phylogenomics is to analyze gene trees, which are phylogenetic trees representing the evolutionary histories of genes in the genome. In this work, we use an unsupervised learning method to visualize how gene trees are distributed over the space of phylogenetic trees, that is, the set of all possible phylogenetic trees with a fixed set of labels for all leaves.

A phylogenetic tree T on a given set of leaves

[m] : = {1, \dots, m}

is a weighted tree in which the internal nodes are unlabeled, the leaves are labeled, and the branch lengths represent evolutionary time and mutation rates. In phylogenetics, a phylogenetic tree on the set of species

[m]

represents their evolutionary history. In phylogenomics, we construct a phylogenetic tree from an alignment or sequence data for each gene in a given genome. A phylogenetic tree reconstructed from a gene alignment is called a gene tree. Since different genes may have distinct evolutionary histories, gene trees can vary in topology and branch lengths. Thus, it is a statistical challenge to analyze a set of phylogenetic trees.

When conducting statistical analysis on a set of phylogenetic trees, we represent each tree as a vector in a high-dimensional vector space. One common method is to compute the pairwise distances between all pairs of distinct leaves in

[m]

, resulting in a vector in

R^{(\binom{m}{2})}

. However, not every vector in

R^{(\binom{m}{2})}

corresponds to a valid phylogenetic tree on

[m]

. In 1974, Buneman showed [1] that a vector derived from all possible pairwise distances between leaves in

[m]

must satisfy the four-point conditions to represent a phylogenetic tree. For an equidistant tree—namely, a rooted phylogenetic tree in which the total edge weight from the root to each leaf in

[m]

is equal (see Definition 9)—the vector must satisfy the three-point condition to correspond to the phylogenetic tree (Theorem 1).

In 2006, Ardila and Klivans showed that the space of all phylogenetic trees on

[m]

is a union of

m - 2

dimensional cones in

R^{(\binom{m}{2})}

, and that this space is not classically convex [2]. Therefore, classical statistical methods cannot be directly applied to a set of phylogenetic trees, as these methods assume a Euclidean sample space.

However, Ardila and Klivans also showed that the space of equidistant trees, that is, rooted phylogenetic trees on

[m]

as defined in Definition 9, is a tropical Grassmannian. So, the space of equidistant trees on

[m]

is tropically convex and forms a tropical linear space with the max-plus algebra over the tropical projective space. Therefore, we can apply tropical linear algebra to perform statistical analysis on the space of equidistant trees on

[m]

.

In 2019, Yoshida et al. introduced tropical principal component analysis (PCA), an analogue of a classical PCA from the perspective of tropical geometry, to visualize how gene trees are distributed over the space of equidistant trees on

[m]

using max-plus algebra [3]. For

s \leq (\binom{m}{2})

, they defined the

(s - 1)

-th order tropical principal polytope, or the best-fit tropical polytope with s vertices, whose vertices serve as analogues of the first s classical principal components. They showed that computing these vertices can be formulated as a mixed integer linear programming problem, as shown in Problem 1 [3]. Later, Page et al. developed a Markov Chain Monte Carlo (MCMC) method to estimate the vertices of the tropical principal polytope from a set of gene trees [13].

In this work, inspired by recent work on tropical gradient descent [4], we introduce a projected gradient descent method to compute the set of vertices of the tropical principal polytope from a set of gene trees. We compute subgradients of the objective function of the mixed integer programming problem that defines the tropical principal polytope (Theorem 3). Then, we apply our method to Apicomplexa data from [5], and our experiments using the R package TML version 2.3.0 [6] show that our method outperforms the MCMC-based approach in terms of both computational time and the value of the cost function.

This paper is organized as follows. In Section 2, we introduce the basics of tropical geometry. In Section 3, we review the notions of metrics and ultrametrics, and discuss the isometry between the space of equidistant trees on

[m]

and the space of ultrametrics on the finite set

[m]

, based on results by Buneman [1]. In Section 4, we present tropical PCA and the (s - 1)-th order tropical principal polytope for

s \leq e

, where

e : = (\binom{m}{2})

. Section 5 provides experimental results on simulated data and on the Apicomplexa dataset from [5].

2. Tropical Basics

In this section, we introduce the basics of tropical geometry to be used for our main results. Let

1

= (1, \dots, 1) \in R^{e}

. Then, throughout this paper, we consider the tropical projective torus,

R^{e} / R 1

which is isomorphic to

R^{e - 1}

. This means that

R^{e} / R 1

is equivalent to a hyperplane in

R^{e}

. This implies that for a point

x : = (x_{1}, \dots, x_{e}) \in R^{e} / R 1

,

(x_{1}, \dots, x_{e}) = (x_{1} + c, \dots, x_{e} + c)

for any

c \in R

. See [7] for more details.

Throughout this paper, we consider tropically convex sets defined by the max-plus algebra provided in Definition 1.

Definition 1 (Tropical Arithmetic Operations).

The tropical semiring

(R \cup {- \infty}, \oplus, ⊙)

is defined using the following tropical addition ⊕ and multiplication ⊙:

a \oplus b := \max \{a, b\}, \quad a ⊙ b := a + b

for any

a, b \in R \cup {- \infty}

.

Remark 1.

- \infty

is the identity element under addition ⊕ and 0 is the identity element under multiplication ⊙ over

(R \cup {- \infty}, \oplus, ⊙)

.

Definition 2 (Tropical Scalar Multiplication and Vector Addition).

For any

a, b \in R \cup {- \infty}

and for any

v = (v_{1}, \dots, v_{e}), w = (w_{1}, \dots, w_{e}) \in {(R \cup {- \infty})}^{e}

, the tropical scalar multiplication and tropical vector addition are defined as:

a ⊙ v \oplus b ⊙ w : = (max {a + v_{1}, b + w_{1}}, \dots, max {a + v_{e}, b + w_{e}}) .
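The operations in Definitions 1 and 2 can be sketched in a few lines of Python. The function names below (trop_add, trop_mul, trop_combination) are illustrative choices made for this sketch and are not taken from the TML package or from the author's code.

```python
NEG_INF = float("-inf")  # identity element for tropical addition


def trop_add(a, b):
    """Tropical addition: a (+) b = max(a, b)."""
    return max(a, b)


def trop_mul(a, b):
    """Tropical multiplication: a (.) b = a + b."""
    return a + b


def trop_combination(a, v, b, w):
    """Coordinate-wise a (.) v (+) b (.) w, as in Definition 2."""
    return [trop_add(trop_mul(a, vi), trop_mul(b, wi)) for vi, wi in zip(v, w)]


# Toy vectors chosen only for easy arithmetic.
v = [0.0, 2.0, 5.0]
w = [1.0, 1.0, 1.0]
combo = trop_combination(1.0, v, 3.0, w)  # max(1 + v_i, 3 + w_i) per coordinate
```

With these toy inputs, each coordinate of combo is the larger of 1 + v_i and 3 + w_i, which is exactly the max-plus linear combination used to build tropical polytopes later in the paper.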

Definition 3 (Generalized Hilbert Projective Metric).

For any points

v := (v_{1}, \dots, v_{e}), w := (w_{1}, \dots, w_{e}) \in R^{e} / R 1

, the tropical metric,

d_{tr}

, between v and w, is defined as:

d_{tr} (v, w) : = max_{i \in {1, \dots, e}} \{v_{i} - w_{i}\} - min_{i \in {1, \dots, e}} \{v_{i} - w_{i}\} .

(1)

Remark 2.

The tropical metric

d_{tr}

is a metric on

R^{e} / R 1

.
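As a small numerical illustration of Equation (1), the sketch below computes the tropical metric and checks that it does not change when a constant multiple of the all-ones vector is added to one argument, which is why it is well defined on the tropical projective torus. The function name tropical_distance is a label chosen for this sketch.

```python
def tropical_distance(v, w):
    """d_tr(v, w) = max_i (v_i - w_i) - min_i (v_i - w_i), Equation (1)."""
    diffs = [vi - wi for vi, wi in zip(v, w)]
    return max(diffs) - min(diffs)


# Toy vectors chosen only for easy arithmetic.
v = [0.0, 2.0, 5.0]
w = [1.0, 1.0, 1.0]
d = tropical_distance(v, w)  # differences are (-1, 1, 4)

# Adding c * (1, ..., 1) to v represents the same point of the torus.
shifted = [vi + 10.0 for vi in v]
d_shifted = tropical_distance(shifted, w)
```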

Definition 4.

A subset

S \subset R^{e}

is called tropically convex if it contains the point

a ⊙ x \oplus b ⊙ y

for all

x, y \in S

and all

a, b \in R

. The tropical convex hull or tropical polytope,

tconv (V)

, of a given finite subset

V \subset R^{e}

is the smallest tropically convex set containing

V \subset R^{e}

. In addition,

tconv (V)

can be written as the set of all tropical linear combinations

tconv (V) = \{a_{1} ⊙ v_{1} \oplus a_{2} ⊙ v_{2} \oplus \dots \oplus a_{r} ⊙ v_{r} : v_{1}, \dots, v_{r} \in V \text{ and } a_{1}, \dots, a_{r} \in R\} .

Any tropically convex subset S of

R^{e}

is closed under tropical scalar multiplication,

R ⊙ S \subseteq S

, i.e., if

x \in S

, then

x + c \cdot 1 \in S for all c \in R

. Thus, the tropically convex set S is identified as its quotient in the tropical projective torus

R^{e} / R 1

.

Definition 5

(Max-tropical Hyperplane [8]). A max-tropical hyperplane

H_{ω}^{max}

is the set of points

x \in R^{e} / R 1

, such that

max_{i \in {1, \dots, e}} {x_{i} + ω_{i}}

(2)

is attained at least twice, where

ω : = (ω_{1}, \dots, ω_{e}) \in R^{e} / R 1

.

Definition 6

(Min-tropical Hyperplane [8]). A min-tropical hyperplane

H_{ω}^{min}

is the set of points

x \in R^{e} / R 1

, such that

min_{i \in {1, \dots, e}} {x_{i} + ω_{i}}

(3)

is attained at least twice, where

ω : = (ω_{1}, \dots, ω_{e}) \in R^{e} / R 1

.

Remark 3.

A min-tropical hyperplane

H_{ω}^{min}

and a max-tropical hyperplane

H_{ω}^{max}

are tropically convex over

{(R \cup {- \infty})}^{e} / R 1

.

Definition 7

(Max-tropical Sectors from Section 5.5 in [8]). For

i \in [e]

, the i-th open sector of

H_{ω}^{max}

is defined as

S_{ω}^{max, i} : = {x \in R^{e} / R 1 | ω_{i} + x_{i} > ω_{j} + x_{j}, \forall j \neq i} .

(4)

and the i-th closed sector of

H_{ω}^{max}

is defined as

{\bar{S}}_{ω}^{max, i} : = {x \in R^{e} / R 1 | ω_{i} + x_{i} \geq ω_{j} + x_{j}, \forall j \neq i} .

(5)

Definition 8

(Min-tropical Sectors). For

i \in [e]

, the i-th open sector of

H_{ω}^{min}

is defined as

S_{ω}^{min, i} := \{x \in R^{e} / R 1 | ω_{i} + x_{i} < ω_{j} + x_{j}, \forall j \neq i\},

(6)

and the i-th closed sector of

H_{ω}^{min}

is defined as

{\bar{S}}_{ω}^{min, i} := \{x \in R^{e} / R 1 | ω_{i} + x_{i} \leq ω_{j} + x_{j}, \forall j \neq i\} .

(7)

3. Space of Phylogenetic Trees

A phylogenetic tree is a rooted or unrooted tree whose exterior nodes have unique labels, whose interior nodes do not have labels, and whose edges have non-negative weights. In this paper, we focus on equidistant trees, that is, rooted phylogenetic trees in which the total weight of the path from the root to each leaf is the same. Let

[m] : = {1, \dots, m}

be the set of leaf labels on an equidistant tree T.

Definition 9.

An equidistant tree T on

[m]

is a rooted phylogenetic tree on

[m]

, such that the total weight from the root to each leaf

i \in [m]

is equal to a constant

h > 0

for all

i \in [m]

. The constant h is called the height of T.

Example 1.

Figure 1 shows an equidistant tree of height 1 on

[4] : = {1, 2, 3, 4}

.

Suppose

D (i, j)

is the total weight on the unique path from a leaf

i \in [m]

to a leaf

j \in [m]

on a phylogenetic tree T. Then,

D = (D (1, 2), D (1, 3), \dots, D (m - 1, m)) \in R_{\geq 0}^{e}

, where

e : = (\binom{m}{2})

, is a metric, that is, D satisfies

\begin{matrix} D (i, j) \leq D (i, k) + D (k, j) \\ D (i, j) = D (j, i) \\ D (i, i) = 0 \end{matrix}

for all

i, j, k \in [m]

. The metric D is the tree metric of a phylogenetic tree T.

If, for a metric D, the maximum

\max \{D (i, j), D (i, k), D (k, j)\}

is achieved at least twice for all distinct

i, j, k \in [m]

, then D is called an ultrametric. Suppose

D (i, j)

is the total weight of the path between leaves i and j, for

i, j \in [m]

, on an equidistant tree T. Then we have the following theorem.

Theorem 1

(noted in [1]). Suppose we have a rooted phylogenetic tree T with a leaf label set

[m]

and D as its tree metric. Then, D is an ultrametric if and only if T is an equidistant tree. In addition, we can uniquely reconstruct T from D.
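The three-point condition behind Theorem 1 is easy to test numerically. The sketch below is only an illustration: the helper name is_ultrametric and the toy tree metric are assumptions made here. The metric corresponds to a hypothetical equidistant tree of height 1 with topology ((1,2),(3,4)), not necessarily the tree in Figure 1.

```python
from itertools import combinations


def is_ultrametric(D, m, tol=1e-12):
    """Check the three-point condition for leaves 1..m.

    D is a dict mapping each pair (i, j) with i < j to the distance D(i, j).
    For every triple, the largest of the three distances must be attained twice.
    """
    for i, j, k in combinations(range(1, m + 1), 3):
        smallest, middle, largest = sorted([D[(i, j)], D[(i, k)], D[(j, k)]])
        if largest - middle > tol:
            return False
    return True


# Hypothetical equidistant tree ((1,2),(3,4)) of height 1:
# leaves 1 and 2 merge at height 0.25, leaves 3 and 4 at height 0.5.
tree_metric = {(1, 2): 0.5, (1, 3): 2.0, (1, 4): 2.0,
               (2, 3): 2.0, (2, 4): 2.0, (3, 4): 1.0}

# Shrinking a single entry breaks the condition for the triple (1, 2, 3).
perturbed = dict(tree_metric)
perturbed[(1, 3)] = 1.5
```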

Using Theorem 1, we consider the space of ultrametrics on

[m]

as the space of phylogenetic trees, which is the set of all possible equidistant trees with the leaf set

[m]

. Let

U_{m}

be the space of ultrametrics on

[m]

.

With tropical geometry, it can be shown that

U_{m}

is a tropical subspace over the tropical projective space

{(R \cup {- \infty})}^{e} / R 1

. Let

L_{m}

denote the subspace of

R^{e}

, defined by the linear equations

x_{i j} - x_{i k} + x_{j k} = 0

for

1 \leq i < j < k \leq m

. The tropicalization

Trop (L_{m}) \subseteq {(R \cup {- \infty})}^{e} / R 1

is the tropical linear space consisting of points

(u_{12}, u_{13}, \dots, u_{m - 1, m})

, such that the maximum

\max (u_{i j}, u_{i k}, u_{j k})

is achieved at least twice for all distinct

i, j, k \in [m]

.

In addition, it is important to note that the tropical linear space

Trop (L_{m})

corresponds to the graphic matroid of the complete graph

K_{m}

.

Theorem 2

(Theorem 2.18 in [3]). The image of

U_{m}

in the tropical projective torus

R^{e} / R 1

coincides with

Trop (L_{m})

.

Projection onto Tree Space

In tropical geometry, it is well-known that

U_{m}

is the support of a pointed simplicial fan of dimension

m - 2

and it has

2^{m} - m - 2

rays, defined as clade metrics [2].

Definition 10.

Suppose we have an equidistant phylogenetic tree T with leaf set

[m]

. A clade of T with leaves

σ \subset [m]

is the equidistant subtree of T formed by including all common ancestral interior nodes of combinations of leaves in σ, while excluding any common ancestors that involve leaves from

[m] \setminus σ

in T, along with the edges in T that connect these interior nodes to the leaves in σ.

We note that a clade of an equidistant tree T with leaf set

σ \subset [m]

is a subtree of T with leaves

σ

. Feichtner showed that each topology of equidistant trees can be encoded by a nested set, that is, a set of clades

{σ_{1}, \dots, σ_{C}}

, where

C \in {1, \dots, m - 2}

, such that

σ_{i} \subset σ_{j}, or σ_{j} \subset σ_{i}, or σ_{i} \cap σ_{j} = \emptyset

for all

1 \leq i \leq j \leq C

and

| σ_{k} | \geq 2

for all

k = 1, \dots, C

[9].

Definition 11

(Clade Ultrametrics). We consider an equidistant tree T on the leaf set

[m]

. Let

σ \subset [m]

be a proper subset of

[m]

with at least two elements. Let

D_{σ} : = (D_{σ} (1, 2), \dots, D_{σ} (m - 1, m)) \in U_{m}

, such that

D_{σ} (i, j) = \begin{cases} 0 & \text{if } i, j \in σ \\ 1 & \text{otherwise.} \end{cases}

Then,

D_{σ}

is called a clade ultrametric.

We note that Ardila and Klivans showed that the set of clade ultrametrics forms a set of generators—i.e., rays—of a pointed simplicial fan of dimension

m - 2

, in terms of classical arithmetic in Euclidean geometry. In this paper, we use an extreme clade ultrametric, which is an analogue of a clade ultrametric defined using the max-plus algebra. This is done by replacing the identity element of classical addition with the identity of tropical addition ⊕ (namely, replacing 0 with

- \infty

), and replacing the identity element of classical multiplication with that of tropical multiplication ⊙ (namely, replacing 1 with 0).

Definition 12

(Extreme Clade Ultrametrics). We consider an equidistant tree T on the leaf set

[m]

. Let

σ \subset [m]

be a proper subset with at least two elements. Let

D_{σ} : = (D_{σ} (1, 2), \dots, D_{σ} (m - 1, m)) \in U_{m}

, such that

D_{σ} (i, j) = \begin{cases} - \infty & \text{if } i, j \in σ \\ 0 & \text{otherwise.} \end{cases}

Then,

D_{σ}

is called an extreme clade ultrametric.

Remark 4.

In polyhedral geometry, a polyhedral cone generated by a set of rays

V = {v^{1}, \dots, v^{k}} \subset R^{e - 1}

is defined as

C (V) = \{x \in R^{e - 1} | x = \sum_{i = 1}^{k} α_{i} v^{i}, α_{i} \geq 0 \text{ for all } i = 1, \dots, k\} .

We replace classical addition with ⊕ and classical multiplication with ⊙ for

V = {v^{1}, \dots, v^{k}} \subset R^{e} / R 1 ≅ R^{e - 1}

; thus, we have

Trop (C (V)) = \{x \in R^{e} / R 1 | x = ⨁_{i = 1}^{k} α_{i} ⊙ v^{i}, α_{i} \in R \text{ for all } i = 1, \dots, k\}

which is a tropical polytope defined by V.

Proposition 1

([10]). The set of all extreme clade ultrametrics,

U_{m}^{\infty}

, is a generating set of

U_{m}

in terms of max-plus algebra.

Proposition 1 is a tropical geometric analogue of the simplicial complex result by Ardila and Klivans in [2], obtained by replacing classical addition with ⊕ and classical multiplication with ⊙.

Definition 13

(Projection Map [10]). The tropical projection map to ultrametric tree space

π_{U_{m}} : {(R \cup {- \infty})}^{e} / R 1 \to U_{m}

is given by

π_{U_{m}} (x) = ⨁_{v \in U_{m}^{\infty}} λ_{v} ⊙ v, \quad \text{where } λ_{v} = \min_{j : v_{j} = 0} \{x_{j}\}

for

x : = (x_{1}, \dots, x_{e}) \in R^{e} / R 1

.

Proposition 2

([10]). For all

x \in R^{e} / R 1

, we have

d_{tr} (x, x^{'}) \leq d_{tr} (x, y)

for all

y \in U_{m}

, and where

x^{'} = π_{U_{m}} (x)

is defined by Definition 13.

The following proposition is key to the projected gradient method we propose in Section 4.

Proposition 3

(Lemma 19, [11]). The projection map

π_{U_{m}} (x)

is non-expansive in terms of

d_{tr}

, i.e.,

d_{tr} (π_{U_{m}} (u), π_{U_{m}} (v)) \leq d_{tr} (u, v)

for all

u, v \in R^{e} / R 1

.

Remark 5.

The projection map

π_{U_{m}}

is equivalent to the single linkage hierarchical clustering method [12]. Therefore, in the computational experiment described in Section 5, we use a single linkage hierarchical clustering method to project a subgradient.
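Remark 5 can be made concrete with a small sketch. It assumes that the projection is computed as the subdominant ultrametric, which is what single linkage clustering produces: the projected value for a pair of leaves is the minimum, over all paths between them, of the largest entry along the path. The helper names pairs and project_onto_tree_space are hypothetical, and the cubic-time loop is chosen for clarity rather than speed.

```python
def pairs(m):
    """Leaf pairs in the order (1,2), (1,3), ..., (m-1,m), using 0-based indices."""
    return [(i, j) for i in range(m) for j in range(i + 1, m)]


def project_onto_tree_space(x, m):
    """Subdominant ultrametric of x (single linkage), via a min-max path update."""
    M = [[float("-inf")] * m for _ in range(m)]
    for (i, j), value in zip(pairs(m), x):
        M[i][j] = M[j][i] = value
    # Floyd-Warshall over the (min, max) semiring: best bottleneck path through k.
    for k in range(m):
        for i in range(m):
            for j in range(m):
                if i != j:
                    M[i][j] = min(M[i][j], max(M[i][k], M[k][j]))
    return [M[i][j] for i, j in pairs(m)]


# Toy input on m = 3 leaves: (D(1,2), D(1,3), D(2,3)) = (1, 5, 2) is not an ultrametric.
x = [1.0, 5.0, 2.0]
projected = project_onto_tree_space(x, 3)
```

In this toy case only the entry for the pair (1,3) changes: the path 1, 2, 3 has largest entry 2, which is smaller than the direct value 5. An input that is already an ultrametric is returned unchanged.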

4. Tropical Principal Component Analysis

Yoshida et al. introduced the notion of tropical principal component analysis (PCA), an analysis based on the best-fit tropical hyperplane or tropical polytope [3]. In particular, they applied tropical PCA using tropical polytopes to samples of ultrametrics within the space of ultrametrics. In this section, we consider

U_{m} \subset R^{e} / R 1

, where m is the number of leaves and

e = (\binom{m}{2})

. Suppose we have a sample

S : = {u_{1}, \dots u_{n}} \subset U_{m}

.

Definition 14

(Definition 3.1 in [13]). The

(s - 1)

-th order tropical principal polytope

P

minimizes

\sum_{i = 1}^{n} d_{tr} (u_{i}, w_{i})

over all tropical polytopes with s vertices, where

w_{i}

is the projection of

u_{i}

onto

P

, that is,

d_{tr} (u_{i}, w_{i}) \leq d_{tr} (u_{i}, x)

for all

x \in P

. It is also called the best-fit tropical polytope with s vertices for the sample

S = \{u_{1}, \dots, u_{n}\}

. The s vertices of

P

are called the

(s - 1)

-th order tropical principal components.

Remark 6.

The 0-th tropical principal polytope is a tropical Fermat–Weber point of a sample

S

with respect to

d_{tr}

. A tropical Fermat–Weber point of

S

with respect to

d_{tr}

is defined as

x^{*} : = \underset{x \in R^{e} / R 1}{arg min} \sum_{i = 1}^{n} d_{tr} (x, u_{i}) .

A tropical Fermat–Weber point of

S

with respect to

d_{tr}

is not necessarily unique, and the set of all tropical Fermat–Weber points forms both a classical polytope and a tropical polytope [14].

In this paper, we focus on the

(s - 1)

-th order principal components over

U_{m} \subset R^{e} / R 1

for

s > 1

. Our problem can be written as follows:

Problem 1.

We seek a solution for the following optimization problem:

min_{D^{(1)}, \dots, D^{(s)} \in U_{m}} \sum_{i = 1}^{n} d_{tr} (u_{i}, w_{i})

where

w_{i} = λ_{1}^{i} ⊙ D^{(1)} \oplus \dots \oplus λ_{s}^{i} ⊙ D^{(s)}, where λ_{k}^{i} = \min (u_{i} - D^{(k)}),

(8)

and

d_{tr} (u_{i}, w_{i}) = max {| u_{i} (k) - w_{i} (k) - u_{i} (l) + w_{i} (l) | : 1 \leq k < l \leq e}

(9)

with

u_{i} = (u_{i} (1), \dots, u_{i} (e)) and w_{i} = (w_{i} (1), \dots, w_{i} (e)) .

(10)
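The objective in Problem 1 can be evaluated directly from Equations (8) and (9). The sketch below is a minimal illustration: the function names are assumptions made here, and the toy points are arbitrary points of the tropical projective torus chosen for easy arithmetic rather than ultrametrics.

```python
def tropical_distance(v, w):
    """d_tr(v, w) = max_i (v_i - w_i) - min_i (v_i - w_i)."""
    diffs = [vi - wi for vi, wi in zip(v, w)]
    return max(diffs) - min(diffs)


def project_onto_tconv(u, vertices):
    """Equation (8): w = (+)_k lambda_k (.) D^(k), lambda_k = min_j (u(j) - D^(k)(j))."""
    e = len(u)
    lams = [min(u[j] - D[j] for j in range(e)) for D in vertices]
    return [max(lam + D[j] for lam, D in zip(lams, vertices)) for j in range(e)]


def objective(sample, vertices):
    """Sum of tropical distances between observations and their projections."""
    return sum(tropical_distance(u, project_onto_tconv(u, vertices)) for u in sample)


# Toy data: two observations and a candidate polytope with s = 2 vertices.
sample = [[0.0, 3.0, 1.0], [0.0, 2.0, 5.0]]
vertices = [[0.0, 0.0, 0.0], [0.0, 2.0, 5.0]]
w_first = project_onto_tconv(sample[0], vertices)
value = objective(sample, vertices)
```

The second observation coincides with a vertex, so it contributes zero to the objective; the whole value comes from the first observation.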

Remark 7

(Proposition 4.2 in [3]). Problem 1 can be formulated as a mixed integer programming problem.

In this section, we consider subgradients of Problem 1. Here, we are interested in computing

\frac{\partial d_{tr} (u_{i}, w_{i})}{\partial D^{(k)}} .

First, we notice that

\begin{matrix} \frac{\partial d_{tr} (u_{i}, w_{i})}{\partial D^{(k)}} & = & \frac{\partial d_{tr} (u_{i}, w_{i})}{\partial w_{i}} \frac{\partial w_{i}}{\partial D^{(k)}} \end{matrix}

by the chain rule.

Let

δ_{i j} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}

be the Kronecker delta. Then, we have the following lemma:

Lemma 1

(Lemma 10 in [14]). For any two points

x, p \in R^{e} / R 1

, the gradient at x of the tropical distance between x and p is given by

\frac{\partial d_{tr} (p, x)}{\partial x (l^{'})} = δ_{l^{'} t} - δ_{l^{'} t^{'}}, \quad \text{where } x \in S_{- p}^{max, t} \cap S_{- p}^{min, t^{'}},

(11)

provided there are no ties in

(x - p)

, which ensures that the min- and max-sectors are uniquely identifiable; that is, the point x is inside of open sectors and not on the boundary of

H_{- p}

.
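Read concretely, Equation (11) says that the gradient has a +1 at the coordinate where x - p is largest and a -1 at the coordinate where it is smallest. The sketch below illustrates this under the no-ties assumption of Lemma 1; the function names are chosen for this example only.

```python
def tropical_distance(p, x):
    """d_tr(p, x) = max_i (x_i - p_i) - min_i (x_i - p_i)."""
    diffs = [xi - pi for xi, pi in zip(x, p)]
    return max(diffs) - min(diffs)


def tropical_distance_gradient(p, x):
    """Gradient of d_tr(p, x) with respect to x, assuming no ties in x - p."""
    diffs = [xi - pi for xi, pi in zip(x, p)]
    t = diffs.index(max(diffs))        # index t of the open max-sector containing x
    t_prime = diffs.index(min(diffs))  # index t' of the open min-sector containing x
    grad = [0.0] * len(x)
    grad[t] += 1.0
    grad[t_prime] -= 1.0
    return grad


# Toy points chosen only for easy arithmetic.
p = [0.0, 0.0, 0.0]
x = [0.0, 3.0, 1.0]
grad = tropical_distance_gradient(p, x)

# Moving x along the max coordinate by 0.5 changes the distance by 0.5.
x_moved = [0.0, 3.5, 1.0]
change = tropical_distance(p, x_moved) - tropical_distance(p, x)
```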

Also, we notice that

\begin{matrix} w_{i} (l) & = & max [(min_{j} (u_{i} (j) - D^{(1)} (j)) 1 + D^{(1)}) (l), \dots, (min_{j} (u_{i} (j) - D^{(s)} (j)) 1 + D^{(s)}) (l)], \end{matrix}

for

l = 1, \dots, e

.

Lemma 2.

\frac{\partial w_{i} (l^{'})}{\partial D^{(k)} (l)} = \begin{cases} - 1 & \text{if } \operatorname{argmax}_{k^{'}} \{(D^{(k^{'})} + \min_{j} (u_{i} (j) - D^{(k^{'})} (j)) 1) (l^{'})\} = k, \ \operatorname{argmin}_{j} (u_{i} (j) - D^{(k)} (j)) = l, \text{ and } l^{'} \neq l, \\ 1 & \text{if } \operatorname{argmax}_{k^{'}} \{(D^{(k^{'})} + \min_{j} (u_{i} (j) - D^{(k^{'})} (j)) 1) (l^{'})\} = k, \ \operatorname{argmin}_{j} (u_{i} (j) - D^{(k)} (j)) \neq l, \text{ and } l^{'} = l, \\ 0 & \text{otherwise,} \end{cases}

(12)

for

i = 1, \dots, n

,

k = 1, \dots, s

, and for

l = 1, \dots, e

.

Proof.

This follows by direct computation from the expression for w_i(l) given above. □

Lemma 3.

A subgradient of Problem 1 over

R^{e} / R 1

is

\sum_{i = 1}^{n} \frac{\partial d_{tr} (u_{i}, w_{i})}{\partial D^{(k)} (l)} = \sum_{i = 1}^{n} \sum_{l^{'} = 1}^{e} \frac{\partial d_{tr} (u_{i}, w_{i})}{\partial w_{i} (l^{'})} \frac{\partial w_{i} (l^{'})}{\partial D^{(k)} (l)}

which can be obtained by Equations (11) and (12).
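The chain rule in Lemma 3 can be sketched as follows. This is an illustrative reading of Equations (11) and (12), not the author's R implementation: the function name objective_subgradient is an assumption, and ties are broken by taking the first index. For each observation, only the coordinates t and t' from Lemma 1 contribute, and each contribution is routed to the vertex that attains the maximum at that coordinate.

```python
def objective_subgradient(sample, vertices):
    """Subgradient of sum_i d_tr(u_i, w_i) with respect to each vertex D^(k).

    Returns a list G with G[k][l] approximating d(objective) / d D^(k)(l).
    """
    s, e = len(vertices), len(vertices[0])
    G = [[0.0] * e for _ in range(s)]
    for u in sample:
        diffs = [[u[j] - D[j] for j in range(e)] for D in vertices]
        lams = [min(row) for row in diffs]                    # lambda_k in Eq. (8)
        jstar = [row.index(lam) for row, lam in zip(diffs, lams)]  # argmin_j per vertex
        cand = [[lam + D[j] for j in range(e)] for lam, D in zip(lams, vertices)]
        w = [max(cand[k][j] for k in range(s)) for j in range(e)]
        winner = [max(range(s), key=lambda k: cand[k][j]) for j in range(e)]
        gap = [w[j] - u[j] for j in range(e)]
        t, t_prime = gap.index(max(gap)), gap.index(min(gap))  # Lemma 1 with p = u
        for l_prime, sign in ((t, 1.0), (t_prime, -1.0)):
            k = winner[l_prime]
            G[k][l_prime] += sign     # case l' = l in Eq. (12)
            G[k][jstar[k]] -= sign    # case l = argmin_j in Eq. (12)
    return G


# Toy data: one observation and s = 2 vertices, chosen for easy arithmetic.
sample = [[0.0, 3.0, 1.0]]
vertices = [[0.0, 0.0, 0.0], [0.0, 2.0, 5.0]]
G = objective_subgradient(sample, vertices)
```

In this toy case the second vertex does not determine either the largest or the smallest gap, so its row of G is zero, while raising the second coordinate of the first vertex lowers the objective.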

Theorem 3.

The subgradient of Problem 1 over

U_{m}

is

π_{U_{m}^{\infty}} (\sum_{i = 1}^{n} \frac{\partial d_{tr} (u_{i}, w_{i})}{\partial D^{(k)} (l)})

where

\sum_{i = 1}^{n} \frac{\partial d_{tr} (u_{i}, w_{i})}{\partial D^{(k)} (l)}

is obtained in Lemma 3.

Proof.

Using Proposition 3, we know that

π_{U_{m}^{\infty}}

is non-expansive in terms of

d_{tr}

. Therefore, we have

d_{tr} (0, π_{U_{m}^{\infty}} (\sum_{i = 1}^{n} \frac{\partial d_{tr} (u_{i}, w_{i})}{\partial D^{(k)} (l)} (x))) \leq d_{tr} (0, \sum_{i = 1}^{n} \frac{\partial d_{tr} (u_{i}, w_{i})}{\partial D^{(k)} (l)} (x)) .

Therefore,

d_{tr} (0, π_{U_{m}^{\infty}} (\sum_{i = 1}^{n} \frac{\partial d_{tr} (u_{i}, w_{i})}{\partial D^{(k)} (l)} (x))) = 0

when x is at a critical point.

Suppose

x^{*} \in U_{m}

is an optimal solution for Problem 1. Then, let

x^{t + 1} : = x^{t} - α_{t} \sum_{i = 1}^{n} \frac{\partial d_{tr} (u_{i}, w_{i})}{\partial D^{(k)} (l)} (x^{t}) .

Since

\sum_{i = 1}^{n} \frac{\partial d_{tr} (u_{i}, w_{i})}{\partial D^{(k)} (l)}

is a subgradient by Lemma 3, we have

\begin{matrix} \sum_{i = 1}^{n} d_{tr} (u_{i}, π_{tconv (D^{(1)}, \dots, D^{(k - 1)}, x^{t + 1}, D^{(k + 1)}, \dots, D^{(s)})} (u_{i})) \\ \leq & \sum_{i = 1}^{n} d_{tr} (u_{i}, π_{tconv (D^{(1)}, \dots, D^{(k - 1)}, x^{t}, D^{(k + 1)}, \dots, D^{(s)})} (u_{i})), \end{matrix}

for

k = 1, \dots, s

. Using Proposition 3, we have

\begin{matrix} \sum_{i = 1}^{n} d_{tr} (π_{U_{m}^{\infty}} (u_{i}), π_{U_{m}^{\infty}} (π_{tconv (D^{(1)}, \dots, D^{(k - 1)}, x^{t + 1}, D^{(k + 1)}, \dots, D^{(s)})} (u_{i}))) \\ = & \sum_{i = 1}^{n} d_{tr} (u_{i}, π_{U_{m}^{\infty}} (π_{tconv (D^{(1)}, \dots, D^{(k - 1)}, x^{t + 1}, D^{(k + 1)}, \dots, D^{(s)})} (u_{i}))) \\ \leq & \sum_{i = 1}^{n} d_{tr} (u_{i}, π_{tconv (D^{(1)}, \dots, D^{(k - 1)}, x^{t + 1}, D^{(k + 1)}, \dots, D^{(s)})} (u_{i})) \\ \leq & \sum_{i = 1}^{n} d_{tr} (u_{i}, π_{tconv (D^{(1)}, \dots, D^{(k - 1)}, x^{t}, D^{(k + 1)}, \dots, D^{(s)})} (u_{i})), \end{matrix}

for

k = 1, \dots, s

. □
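Putting the pieces together, one possible shape of the resulting iteration is sketched below. This is an assumption-laden illustration, not the author's R code: it uses the standard projected subgradient form (take a step, then project each vertex onto tree space with single linkage), keeps the best iterate seen, and decays the learning rate by a constant factor as in Section 5. All function names, the toy sample of ultrametrics on four leaves, and the choice of initial vertices are hypothetical.

```python
def pairs(m):
    return [(i, j) for i in range(m) for j in range(i + 1, m)]


def project_onto_tree_space(x, m):
    """Subdominant ultrametric of x (single linkage), via min-max path updates."""
    M = [[float("-inf")] * m for _ in range(m)]
    for (i, j), value in zip(pairs(m), x):
        M[i][j] = M[j][i] = value
    for k in range(m):
        for i in range(m):
            for j in range(m):
                if i != j:
                    M[i][j] = min(M[i][j], max(M[i][k], M[k][j]))
    return [M[i][j] for i, j in pairs(m)]


def objective_and_subgradient(sample, vertices):
    """Objective of Problem 1 and a subgradient with respect to every vertex."""
    s, e = len(vertices), len(vertices[0])
    total, G = 0.0, [[0.0] * e for _ in range(s)]
    for u in sample:
        diffs = [[u[j] - D[j] for j in range(e)] for D in vertices]
        lams = [min(row) for row in diffs]
        jstar = [row.index(lam) for row, lam in zip(diffs, lams)]
        cand = [[lam + D[j] for j in range(e)] for lam, D in zip(lams, vertices)]
        winner = [max(range(s), key=lambda k: cand[k][j]) for j in range(e)]
        gap = [cand[winner[j]][j] - u[j] for j in range(e)]  # w_i - u_i
        total += max(gap) - min(gap)
        t, t_prime = gap.index(max(gap)), gap.index(min(gap))
        for l_prime, sign in ((t, 1.0), (t_prime, -1.0)):
            k = winner[l_prime]
            G[k][l_prime] += sign
            G[k][jstar[k]] -= sign
    return total, G


def projected_subgradient_descent(sample, init, m, rate=0.01, decay=0.999, iters=100):
    """Step along the negative subgradient, then project each vertex onto tree space."""
    vertices = [list(D) for D in init]
    best, best_value = None, float("inf")
    for _ in range(iters):
        value, G = objective_and_subgradient(sample, vertices)
        if value < best_value:  # keep the best iterate evaluated so far
            best_value, best = value, [list(D) for D in vertices]
        vertices = [
            project_onto_tree_space([d - rate * g for d, g in zip(D, Gk)], m)
            for D, Gk in zip(vertices, G)
        ]
        rate *= decay  # decreasing learning rate schedule
    return best, best_value


# Hypothetical sample of four ultrametrics on m = 4 leaves (e = 6 coordinates).
sample = [
    [0.4, 2.0, 2.0, 2.0, 2.0, 0.6],
    [0.5, 1.2, 2.0, 1.2, 2.0, 2.0],
    [2.0, 0.8, 2.0, 2.0, 1.0, 2.0],
    [0.3, 2.0, 1.5, 2.0, 1.5, 2.0],
]
init = [sample[0], sample[1]]  # s = 2 starting vertices taken from the sample
best, best_value = projected_subgradient_descent(sample, init, m=4)
```

Because the best iterate is tracked, the returned objective value is never worse than the value at the starting vertices, and every returned vertex is an ultrametric whenever the starting vertices are.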

5. Computational Experiments

5.1. Simulated Dataset

We generate gene trees under the multispecies coalescent model (MCM) using a given species tree via the software Mesquite version 3.81 [15]. We fixed the effective population size at

N_{e} = 100{,}000

and varied

R = \frac{S D}{N_{e}}

where

S D

is the species depth, defined as the number of generations from the most recent common ancestor (the root of the tree) to its leaves.

In this experiment, we simulate 1000 trees for each value of R in {0.25, 0.5, 1.0, 2.0, 5.0, 10.0}. We run 100 iterations of both the MCMC method and our projected gradient method. To account for variability in performance, we repeat both methods 10 times for each value of R. For each run, we compute the sum of tropical distances

d_{tr}

between the observed trees and their projections onto the estimated tropical polytope. This sum, representing the total “magnitude of errors” (SE) in terms of

d_{tr}

, corresponds to the estimated optimal value of the optimization problem in Problem 1. The results are shown in Figure 2. For the learning rate schedule, we start with

λ = 0.001

and multiply it by

0.999

at each iteration. Note that this learning rate has not been optimized. Therefore, investigating an optimal learning rate schedule remains a direction for future work. As for the stopping criterion, we terminate the algorithm when the preset maximum number of iterations is reached. Each iteration has time complexity

O (m^{2} n)

, where n is the sample size.

5.2. Empirical Dataset

We apply our projected gradient method to estimate the best-fit tropical polytope for an empirical dataset consisting of 268 orthologous sequences from eight protozoa species, as presented in [5]. This dataset contains gene trees reconstructed from the following species: Babesia bovis (Bb), Cryptosporidium parvum (Cp), Eimeria tenella (Et) [15], Plasmodium falciparum (Pf) [11], Plasmodium vivax (Pv), Theileria annulata (Ta), and Toxoplasma gondii (Tg). The outgroup is a free-living ciliate, Tetrahymena thermophila (Tt) (Figure 3).

In order to run this experiment, we used a Mac Pro laptop (Apple Inc., Cupertino, CA, USA) with Apple M4 Max and 128 GB memory. We implemented our projected gradient method using R.

We set the maximum number of iterations to 100 for our projected gradient method, and to 1000 for the MCMC method. For the learning rate schedule, we initialize

λ = 0.01

and multiply it by

0.999

at each iteration. Note that the learning rate and scheduling have not been optimized. Using the projected gradient descent to estimate the tropical principal polytope on this dataset takes

6.82

s, resulting in an estimated optimal value of

360.8589

for the optimization problem in Problem 1. In comparison, the Markov Chain Monte Carlo (MCMC) method implemented via the TML package [6] yields an estimated optimal value of

397.6459

, with a computational time of

54.70

s over 1000 iterations.

6. Conclusions

In this work, we introduce a novel method to approximate the best-fit tropical polytope that explains a sample of gene trees. We show that our gradient method effectively reduces the objective function when an appropriate learning rate is used. Computational experiments show that this method achieves a lower sum of tropical distances between observations and their projections onto the estimated best-fit tropical polytope, compared to the MCMC approach proposed by Page et al. [13]. Although we implement a decreasing learning rate in our experiments, the optimal learning rate schedule for this problem remains an open question.

We implement our method in R and the source code is available at http://polytopes.net/Tropical_Gradient2.zip (accessed on 15 May 2025).

Funding

This research was funded by NSF Division of Mathematical Sciences: Statistics Program DMS 2409819.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

PCA	Principal Component Analysis
MCMC	Markov Chain Monte Carlo

References

1. Buneman, P. A note on the metric properties of trees. J. Comb. Theory Ser. B 1974, 17, 48–50.
2. Ardila, F.; Klivans, C.J. The Bergman Complex of a Matroid and Phylogenetic Trees. J. Comb. Theory Ser. B 2006, 96, 38–49.
3. Yoshida, R.; Zhang, L.; Zhang, X. Tropical Principal Component Analysis and its Application to Phylogenetics. Bull. Math. Biol. 2019, 81, 568–597.
4. Talbut, R.; Monod, A. Tropical Gradient Descent. arXiv 2024, arXiv:2405.19551.
5. Kuo, C.; Wares, J.P.; Kissinger, J.C. The Apicomplexan Whole-Genome Phylogeny: An Analysis of Incongruence among Gene Trees. Mol. Biol. Evol. 2008, 25, 2689–2698.
6. Barnhill, D.; Yoshida, R.; Aliatimis, G.; Miura, K. Tropical geometric tools for machine learning: The TML package. J. Softw. Algebra Geom. 2024, 14, 133–174.
7. Maclagan, D.; Sturmfels, B. Introduction to Tropical Geometry; Graduate Studies in Mathematics; American Mathematical Society: Providence, RI, USA, 2015; Volume 161.
8. Joswig, M. Essentials of Tropical Combinatorics; Springer: New York, NY, USA, 2021.
9. Feichtner, E.M. Complexes of trees and nested set complexes. Pac. J. Math. 2004, 227, 271–286.
10. Ardila, F. Subdominant Matroid Ultrametrics. Ann. Comb. 2005, 8, 379–389.
11. Jaggi, M.; Katz, G.; Wagner, U. New results in tropical discrete geometry. Preprint 2008. Available online: https://api.semanticscholar.org/CorpusID:14825187 (accessed on 14 April 2025).
12. Gascuel, O.; McKenzie, A. Performance analysis of hierarchical clustering algorithms. J. Classif. 2004, 21, 3–18.
13. Page, R.; Yoshida, R.; Zhang, L. Tropical principal component analysis on the space of phylogenetic trees. Bioinformatics 2020, 36, 4590–4598.
14. Barnhill, D.; Sabol, J.; Yoshida, R.; Miura, K. Tropical Fermat-Weber Polytropes. arXiv 2024, arXiv:2402.14287.
15. Maddison, W.P.; Maddison, D. Mesquite: A Modular System for Evolutionary Analysis. Version 2.72. 2009. Available online: http://mesquiteproject.org (accessed on 11 August 2016).

Figure 1. An equidistant tree with

[4] : = {1, 2, 3, 4}

from Example 1. The height of the tree is 1.

Figure 2. Side-by-side boxplots for the SE for each method and ratio. We repeat computation 10 times for each ratio and method.

Figure 3. (Left): Estimated second-order tropical principal polytope. (Right): Each color represents a tree topology. The number inside each bracket is the frequency of the tree topology. 1 represents “Pv”, 2 represents “Pf”, 3 represents “Tg”, 4 represents “Et”, 5 represents “Cp”, 6 represents “Ta”, 7 represents “Bb” and 8 represents “Tt”.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
