Semi-Supervised Segmentation of Echocardiography Videos Using Graph Signal Processing

El rai, Marwa Chendeb; Darweesh, Muna; Al-Saad, Mina

doi:10.3390/electronics11213462


by Marwa Chendeb El rai ¹,²,*,†, Muna Darweesh ²,† and Mina Al-Saad ²,†

¹ Mathematics Department, School of Arts and Sciences, American University in Dubai, Dubai 28282, United Arab Emirates

² College of Engineering and Information Technology, University of Dubai, Dubai 14143, United Arab Emirates

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Electronics 2022, 11(21), 3462; https://doi.org/10.3390/electronics11213462

Submission received: 15 September 2022 / Revised: 7 October 2022 / Accepted: 13 October 2022 / Published: 26 October 2022


Abstract

Machine learning and computer vision algorithms can provide a precise and automated interpretation of medical videos. The segmentation of the left ventricle in echocardiography videos plays an essential role in cardiology for carrying out clinical cardiac diagnosis and monitoring the patient’s condition. Most deep learning algorithms developed for video segmentation require an enormous amount of labeled data to generate accurate results. Thus, there is a need to develop new semi-supervised segmentation methods because labeled data are scarce and costly. In recent research, semi-supervised learning approaches based on graph signal processing have emerged in computer vision due to their ability to exploit the geometrical structure of data. Video object segmentation can be considered as a node classification problem. In this paper, we propose a new approach called GraphECV, based on graph signal processing for semi-supervised video object segmentation, and apply it to the segmentation of the left ventricle in echocardiography videos. GraphECV includes instance segmentation; extraction of temporal, texture, and statistical features to represent the nodes; construction of a graph using K-nearest neighbors; graph sampling to embed the graph with a small amount of labeled nodes or graph signals; and finally a semi-supervised learning approach based on the minimization of the Sobolev norm of graph signals. The new algorithm is evaluated using two publicly available echocardiography video datasets, EchoNet-Dynamic and CAMUS. The proposed approach outperforms other state-of-the-art methods under challenging background conditions.

Keywords:

echocardiography; video object segmentation; deep learning; graph signal processing; semi-supervised learning

1. Introduction

The World Health Organization (WHO) reports that cardiovascular diseases are the leading cause of death, with an estimated 17.9 million deaths every year [1]. Echocardiography is a safe and low-cost test for cardiac diagnosis [2]. It is a non-invasive examination that observes all the structures of the heart, namely the valves and the cavities (atria and ventricles). Echocardiography can explore the cardiac origin of symptoms, such as shortness of breath, chest pain, or malaise. It evaluates the impact on the heart of a disease, such as high blood pressure or pulmonary arterial hypertension, or of certain medications. Echocardiography also diagnoses heart failure, which prevents blood from flowing back when it enters one of the heart chambers or when it is expelled by the heart.

The left ventricle is the principal pumping cavity of the heart. It pumps oxygen-rich blood into the aorta and to the rest of the body. Most cardiac functions, such as myocardial motion analysis and ejection fraction estimation, are determined from the left ventricle. In the last two decades, advances in imaging technology, computer vision, and machine learning techniques have improved the diagnosis and treatment of cardiovascular diseases. These modern imaging technologies have improved the diagnostic procedures, which in turn increases accuracy and optimizes the workload of healthcare workers. Although deep learning has contributed to medical diagnosis, many obstacles need to be resolved before deployment. Specifically, deep learning techniques require huge amounts of data to perform as competently as human judgment. Furthermore, due to privacy laws and the various standards applied in different healthcare industries, medical data are rarely available compared to the data available in other research fields [3]. In addition, the labeling of medical data is complex and time-consuming, and it requires experienced healthcare professionals. Therefore, the main idea of this research is to develop a method called GraphECV that requires little labeled data for the semi-supervised segmentation of the left ventricle. It is based on graph signal processing.

Recently, a semi-supervised background subtraction approach based on the theory of graph signal processing was proposed in [4] and applied to video object segmentation (VOS). This method was later adapted to the semi-supervised segmentation of synthetic aperture radar (SAR) images, where ship objects have different sizes and are embedded in complex environments [5]. In this work, we propose a semi-supervised learning segmentation algorithm for echocardiography videos called GraphECV. Our algorithm is inspired by the work of [4,6]. GraphECV has the advantage of requiring less labeled data during the training phase than other deep learning approaches while adapting to the complex background and texture of echocardiography scenarios. Experimental findings show the effectiveness of our proposed method; it outperforms the other state-of-the-art approaches.

The principal contributions of this work are as follows:

Our method supports a new semi-supervised learning model for the left ventricle in echocardiography videos that integrates the graph signal processing where nodes are classified into the left ventricle or background.
The motion, temporal, statistical, and texture features are used to represent the nodes on the graph. This integration has not appeared in the literature.
The experiments were applied over two public datasets: EchoNet Dynamic and CAMUS. Despite the scarcity of labeled data, GraphECV surpassed many of the state-of-the-art methods.

The rest of the paper is organized as follows. Section 2 summarizes the related works. Section 3 discusses the basic concepts and GraphECV, the proposed semi-supervised learning segmentation algorithm for echocardiography videos based on graph signal processing. Section 4 presents the experimental results, including the description of the datasets and the analysis of the ablation studies. Finally, Section 5 concludes and outlines future directions of this research.

2. Related Work

This section briefly surveys: (1) Graph Signal Processing and its application in computer vision, and (2) supervised and semi-supervised Video Object segmentations (VOS).

2.1. Graph Signal Processing

Graph Signal processing (GSP) is a domain of signal processing that deals with data illustrated on graphs. It reflects the interaction between two connected fields: applied mathematics and signal processing. The data are depicted as signals and defined in groups of nodes on a weighted graph. It gives a natural representation that integrates both the data and the underlying structure of the geometry [7]. In video processing, GSP has played an important role in analyzing natural signals that are in irregular domains [8]. It is a very beneficial task as it considers the spatio-temporal relationships of the pixels. GSP and its application in the field of machine learning were widely discussed in [7].

2.2. Video Object Segmentation

Although complex, methods for video object segmentation (VOS) emerged after abundant research tools had been developed for image object segmentation. VOS can be categorised into three main groups: supervised, semi-supervised, and unsupervised learning methods. VOS has been investigated with classical methods such as the conditional random field (CRF) [9,10] and with deep learning algorithms, where authors explored the embedding of crucial and challenging temporal information in VOS [11,12,13]. A common choice for reflecting the temporal relation between frames in VOS is optical flow [14]. Segflow, proposed by [11], introduced a second channel for optical flow [15] besides the CNN network used for image segmentation. In order to utilize most of the temporal information relevant to VOS, the authors in [16] created the TMANet method, which integrates the attention mechanism with convolutional neural networks. Another study [17] measured the similarity between two consecutive frames to obtain the attention coefficients. To further boost VOS accuracy, ref. [18] computed the motion cue to strengthen the temporal representation of the target frame and its neighbours. Some deep learning works tackle VOS as classical image segmentation by considering the frames independently. The main drawback is the failure to capture the dynamics of the video [19,20]. In addition to the temporal features, authors used the sequential long short-term memory (LSTM) model to learn the interference of redundant video frames [21,22]. Deep learning methods require a large amount of annotated data to obtain high performance levels. Many studies addressed the prediction of labels with few labeled samples in the training phase [23,24]. Proposal-generation, refinement and merging for video object segmentation (PReMVOS) automatically generates pixel masks over video sequences when only the first frame is annotated [25]. The authors in [26] introduced a new approach that extracts texture features for image indexing and retrieval in the biomedical field. The authors in [27] proposed a study to predict COVID-19 for diabetic patients based on a fuzzy inference system and machine learning algorithms.

Echocardiography (cardiac ultrasound imaging) is primarily used as a clinical tool for the evaluation of different cardiovascular functions [28]. The authors of [29] proposed a semi-segmentation algorithm to segment the left ventricle endocardium from echocardiography videos and adaptive spatio-temporal semantic calibration to ensure the alignment of the feature maps for the consecutive frames, which will reduce the effect of the speckle noise.

In addition, the temporal information acquired from the feature map of the neighboring frame is propagated to the current frame to improve the segmentation performance. However, its performance might degrade with irregular cardiac motion or low-contrast videos. Another approach, proposed by Sultan et al., segments the anterior mitral leaflet (AML), which is required to diagnose rheumatic heart disease. In this algorithm, the video frames are converted to a virtual M-mode space after specifying a single point initialization on the posterior wall of the aorta, and the AML is then segmented in this space. However, the algorithm has low robustness since only a single seed point is available, which might affect the segmentation process, and the algorithm misses the AML tip. A joint learning approach for left ventricular motion tracking and segmentation in spatio-temporal echocardiographic sequences was proposed by Ta et al. The features learned from both the segmentation and motion tracking branches are joined bi-directionally, thus utilizing features that might not be captured by one branch alone. The semi-segmentation technique adopted is U-Net. The proposed technique includes physiological constraints that ensure realistic cardiac functioning, which helps in reducing the dependency on accurate segmentation. Attia et al. proposed an automated segmentation technique that extracts the intra-frame and the motion information [2]. Sigit et al. reported a segmentation method that applies the B-Spline to identify the cardiac cavity in the initial frame and optical flow to track and detect the border for each frame in the video [30].

3. Graph Signal Processing and Semi-Supervised Echocardiography Video Segmentation

Figure 1 displays a graphical illustration of our proposed semi-supervised left ventricle segmentation. The framework proposed in this work can be divided into the following steps: a deep learning algorithm FgSegNet_S [31] to segment the left ventricle; handcrafted features extracted from each instance, including optical flow, motion, texture, and statistical features crucial to illustrate the spatio-temporal information of each instance node; graph construction using K-nearest neighbors; graph signal revealed by the annotated images; sampling of graph signals; and finally, to label all the nodes and reconstruct the graph, a semi-supervised Sobolev norm minimization technique is applied. The latter permits us to evaluate the unlabeled nodes and classifies them between the left ventricle and the background with a limited input amount of labeled samples. The sampling known in classical digital signal processing is identical to the sampling of signals or labeled data on the graphs. The concept used here is to reconstruct all the signals from the embedded labeled data belonging to the left ventricle.

3.1. Introduction to Signal Processing on Graphs

GSP is an expanding area of research that extends classical analytical methods to irregular domains, exploiting the topology of the underlying graphs through Laplacian and Fourier analyses. GSP is therefore a crucial tool, able to associate signals (left ventricle activity) with the other parts of the heart, which are considered as background in this study. In GSP, an undirected graph is defined as $G = (ν, η)$, with a collection of nodes $ν$ and a collection of edges $η = \{(i, j)\}$, where $w_{ij}$ is the weight reflecting the similarity and correlation between the samples at nodes i and j; $w_{ij}$ is an element of the adjacency matrix $W$, and $w_{ij} = 0$ when $(i, j) \notin η$. The graph signal defined over the N nodes of graph G is $f \in \mathbb{R}^{N}$. A subset A of the nodes consists of M sampled nodes, where the ratio of M to N is known as the sampling density. Hence, the sampled graph signal is defined as $y(A) = Sy$, where $S = [δ_{a_{1}}, δ_{a_{2}}, \ldots, δ_{a_{M}}]^{T}$ is a binary decimation matrix and each $δ_{a_{m}}$ is an N-dimensional Kronecker column vector [32].
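As a minimal sketch of the sampling operation described above, the snippet below builds a binary decimation matrix from a list of sampled node indices and applies it to a toy graph signal. The helper names `decimation_matrix` and `sample_signal`, as well as the toy values, are illustrative assumptions and not part of the original method.

```python
def decimation_matrix(sampled_nodes, n_nodes):
    """Build the M x N binary matrix S whose m-th row is the Kronecker
    delta vector selecting node a_m (hypothetical helper)."""
    return [[1 if col == a else 0 for col in range(n_nodes)]
            for a in sampled_nodes]


def sample_signal(S, y):
    """Compute y(A) = S y for a graph signal y given as a plain list."""
    return [sum(s * v for s, v in zip(row, y)) for row in S]


# Toy example: N = 5 nodes, M = 2 sampled (labeled) nodes.
signal = [10, 11, 12, 13, 14]
S = decimation_matrix([0, 3], n_nodes=len(signal))
sampled = sample_signal(S, signal)
sampling_density = len(S) / len(signal)  # ratio M / N
```

Here the sampled signal keeps only the values at nodes 0 and 3, and the sampling density is 2/5.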

3.2. Instance Segmentation

The complexity, gray-level appearance, and background of echocardiography images differ from those of optical images. They require an additional module in any segmentation method to enhance the contrast between the features in order to differentiate between the different parts of an ultrasound image. In 2018, the FgSegNet_S method [31] was developed for video foreground-background segmentation. It includes a Feature Pooling Module (FPM) integrated between the encoder and the decoder CNN networks of the SegNet [33] segmentation approach. The FPM extracts multi-scale features from the output of the encoder CNN, and these features form the input to the following decoder CNN. In addition, the FPM provides feature pooling that is robust to probe motion [34] and allows better classification under the uncertainty caused by speckle noise [35]. An ablation study was conducted to compare many semantic and instance segmentation methods. For echocardiography videos, FgSegNet_S outperformed the other segmentation methods. The results are shown in Table 1.

3.3. Feature Extraction and Nodes Representation

Many features of interest can be integrated, such as the texture, motion, and optical flow features that are pivotal in medical videos. Thus, the temporal information was included through the estimation of optical flow [14]. The idea is to use echo features, which are different from optical parameters. With gray scale and no color information, echocardiography videos acquired using an ultrasound imaging system differ markedly from optical images. It is essential to determine features that distinguish the left ventricle from the other tissues and the background. Statistical features are estimated based on the fact that ultrasound tissue can be characterized by a generalized Gamma distribution (GΓD) [36]. This distribution can help distinguish between the left ventricle, other tissues, and background, as the left ventricle pixels appear darker than the others.

3.3.1. Statistical Representation of Echocardiography Data

Echocardiography videos, like any other ultrasound images, are characterized by the presence of a salt-and-pepper pattern called speckle. It is the result of an important spatial heterogeneity between close pixels. The study of backscattered echoes from tissues demands a proper analysis of the ultrasonic signals, which can be provided through their statistical description [37]. The parameters of a statistical model permit the generation of discriminative descriptors that are crucial for the classification and identification of the left ventricle and other tissues [38]. The experiments reported in [36] show that the generalized Gamma distribution (GΓD) can precisely describe the behavior of myocardial tissue and the left ventricle in echocardiography images. The parameters of the GΓD offer a posterior probability which is helpful for classification and segmentation [36,39]. Hence, the gray level statistics of echocardiography videos can be modeled by the GΓD [36,40], which has the following probability density function (PDF):

$$p(x) = \frac{|φ| κ^{κ}}{θ Γ(κ)} \left(\frac{x}{θ}\right)^{κ φ - 1} \exp\left(- κ \left(\frac{x}{θ}\right)^{φ}\right) .$$

(1)
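The density in Equation (1) can be checked numerically. The following sketch, written with the Python standard library only, evaluates the PDF and verifies with a simple trapezoidal rule that it integrates to approximately one. The parameter values, the integration range, and the helper names `ggd_pdf` and `trapezoid` are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def ggd_pdf(x, theta, kappa, phi):
    """Generalized Gamma density of Equation (1); returns 0 for x <= 0."""
    if x <= 0:
        return 0.0
    ratio = x / theta
    coef = abs(phi) * kappa ** kappa / (theta * math.gamma(kappa))
    return coef * ratio ** (kappa * phi - 1) * math.exp(-kappa * ratio ** phi)


def trapezoid(f, a, b, steps=200_000):
    """Plain trapezoidal rule, used only as a sanity check."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


# Hypothetical parameter values chosen only for illustration.
area = trapezoid(lambda x: ggd_pdf(x, theta=1.0, kappa=2.0, phi=1.5), 0.0, 20.0)

# With kappa = phi = 1 the density reduces to an exponential law of mean theta.
exp_case = ggd_pdf(1.0, theta=2.0, kappa=1.0, phi=1.0)
```

The special case at the end illustrates that the exponential distribution is one member of this family, which is a convenient reference when testing an implementation.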

In this equation, $Γ(\cdot)$ symbolizes the Gamma function. The scale $θ$, the shape $κ$, and the power $φ$ parameters of the distribution are estimated using the Mellin transform and second-kind statistics [36,41]. The log-cumulants of the GΓD can be expressed as follows [42]:

$$\hat{ζ}_{1} = \log(θ) + \frac{Φ_{0}(κ) - \log(κ)}{φ}$$

(2)

$$\hat{ζ}_{i} = \frac{Φ_{0}(i - 1, κ)}{φ^{i}}, \quad i = 2, 3, \ldots$$

(3)
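In second-kind statistics, the log-cumulants are estimated from the logarithm of the data: the first one is the mean of log x, and the second and third are the second and third central moments of log x. The sketch below computes these empirical estimates and compares them with Equations (2) and (3) in the exponential special case (θ = κ = φ = 1), where the first log-cumulant equals minus the Euler-Mascheroni constant and the second equals π²/6. The deterministic quantile samples and the helper name `log_cumulants` are illustrative assumptions.

```python
import math


def log_cumulants(samples):
    """Empirical first three log-cumulants of strictly positive samples."""
    logs = [math.log(x) for x in samples]
    n = len(logs)
    z1 = sum(logs) / n
    z2 = sum((v - z1) ** 2 for v in logs) / n
    z3 = sum((v - z1) ** 3 for v in logs) / n
    return z1, z2, z3


# Deterministic "samples": quantiles of a unit-mean exponential law,
# used instead of random draws so the result is reproducible.
n = 100_000
samples = [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]
z1, z2, z3 = log_cumulants(samples)

euler_gamma = 0.5772156649015329  # Euler-Mascheroni constant
expected_z1 = -euler_gamma        # Equation (2) with theta = kappa = phi = 1
expected_z2 = math.pi ** 2 / 6    # Equation (3) with i = 2, kappa = phi = 1
```

In practice the samples would be the gray levels of a segmented region, and the estimates would feed the parameter formulas that follow.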

where $Φ_{0}(x, y)$ and $Φ_{0}(x)$ are the Polygamma and Digamma functions, respectively. The higher orders of the Polygamma function lead to the estimation of the shape parameter as follows:

$$\hat{κ} = \frac{λ_{1}}{3 λ_{0}} + \sqrt[3]{-\frac{O}{2} + \sqrt{\frac{O^{2}}{4} + \frac{T^{3}}{27}}} + \sqrt[3]{-\frac{O}{2} + \sqrt{\frac{O^{2}}{4} + \frac{T^{3}}{27}}}$$

(4)

where the quantities O and T are defined as:

$$O = \frac{3 λ_{0} λ_{2} - λ_{1}^{2}}{3 λ_{0}^{2}}$$

(5)

$$T = \frac{3 λ_{1}^{3} - 9 λ_{0} λ_{1} λ_{2} + 27 λ_{0}^{2} λ_{3}}{27 λ_{0}^{3}}$$

(6)

and the coefficients $λ_{i}$ are expressed as follows:

$$\begin{aligned} λ_{0} &= 8 \hat{ζ}_{3}^{2} \\ λ_{1} &= 4 (3 \hat{ζ}_{3}^{2} - 2 \hat{ζ}_{2}^{3}) \\ λ_{2} &= 2 (3 \hat{ζ}_{3}^{2} - 8 \hat{ζ}_{2}^{3}) \\ λ_{3} &= \hat{ζ}_{3}^{2} - 8 \hat{ζ}_{2}^{3} \end{aligned}$$

(7)

Using the log-cumulants calculated in Equations (2) and (3) and the estimated $\hat{κ}$ from Equation (4), the parameters $φ$ and $θ$ are evaluated as:

$$\hat{φ} = \operatorname{sign}(- \hat{ζ}_{3}) \sqrt{\frac{1}{\hat{ζ}_{2}} Φ_{0}(1, \hat{κ})}$$

(8)

$$\hat{θ} = \exp\left\{\hat{ζ}_{1} - \frac{Φ_{0}(\hat{κ}) - \log(\hat{κ})}{\hat{φ}}\right\}$$

(9)

The classical statistical features, such as kurtosis, skewness, mean, and standard deviation, are extracted too.
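The classical statistics listed above can be computed directly from the gray levels of a segmented instance. The sketch below uses population (biased) moments and the non-excess definition of kurtosis; the paper does not state which conventions were used, so these choices, the function name `classical_statistics`, and the toy input are assumptions.

```python
import math


def classical_statistics(values):
    """Mean, standard deviation, skewness and kurtosis (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)
    skewness = m3 / std ** 3
    kurtosis = m4 / m2 ** 2  # non-excess kurtosis (3 for a Gaussian)
    return mean, std, skewness, kurtosis


# Hypothetical gray levels of one segmented instance.
mean, std, skewness, kurtosis = classical_statistics([1, 2, 3, 4, 5])
```

A constant region has zero variance, so a real implementation would need to guard against division by zero before computing skewness and kurtosis.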

3.3.2. Texture Features

In general, the texture features are used due to their major influence in image segmentation [43]. In the case of echocardiography video frames having complex separations between the boundary regions, the texture features are essential to discriminate between boundaries as they are considered a function of the spatial variation of pixel intensities of gray values. Consequently, local binary pattern (LBP), entropy, and intensities are determined to represent the texture features [44]. LBP texture descriptors have a potential capacity to distinguish between the tiny differences in terms of topography and texture [44] as they contain details from different areas of the left ventricle [45]. The entropy is determined as the first occurrence for texture analysis.
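To illustrate how an LBP descriptor encodes local texture, the following sketch implements the basic 3 × 3 operator: each of the eight neighbours is compared with the centre pixel and contributes one bit to an 8-bit code. The paper does not specify the LBP variant, radius, or neighbour ordering, so everything here, including the function name `lbp_3x3`, is an assumption made for illustration.

```python
def lbp_3x3(img):
    """Basic 8-neighbour LBP codes for the interior pixels of a 2-D list
    of gray values. Border pixels are skipped."""
    # Neighbour offsets, clockwise from the top-left corner (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(img), len(img[0])
    codes = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets its bit.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


flat = lbp_3x3([[5, 5, 5], [5, 5, 5], [5, 5, 5]])
peak = lbp_3x3([[0, 0, 0], [0, 9, 0], [0, 0, 0]])
corner = lbp_3x3([[9, 0, 0], [0, 5, 0], [0, 0, 0]])
```

A histogram of such codes over a region is the usual texture feature; a uniform patch yields the all-ones code, while an isolated bright pixel yields zero.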

3.3.3. Nodes Representation of Segmented Instances

The first step of the framework generates output masks by applying FgSegNet_S. For each instance, the motion, temporal, statistical, and texture features are estimated. Then, they are concatenated to represent the instances on the vertices of the graph. The resulting feature vectors have a length of 148.

3.4. Graph Construction

The GraphECV algorithm uses K-nearest neighbors (k-NN), as reported in most of the STOA methods for graph construction. We consider $X = [x_{1}, x_{2}, \ldots, x_{N}]^{T}$ the matrix of features of the N vertices. By linking the K neighbors of each vertex or node, the following kernel is implemented to estimate the weight of each edge:

$$w_{ij} = \exp\left(- \frac{\| x_{i} - x_{j} \|_{2}^{2}}{σ^{2}}\right)$$

(10)

Here, $w_{ij}$ is calculated using a Gaussian kernel [46], which reflects the similarity and correlation between the samples at nodes i and j. High values of $w_{ij}$ indicate that the instances are well correlated; $σ$ is the standard deviation of the kernel, computed over the edge set $ε$ as:

$$σ = \frac{1}{|ε| + N} \sum_{(i, j) \in ε} \| x_{i} - x_{j} \|_{2}^{2}$$

(11)
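The graph construction of Equations (10) and (11) can be sketched in a few lines. The example below links each node to its k nearest neighbours, symmetrizes the edge set, and weights the edges with the Gaussian kernel. It takes the normalizer in Equation (11) to be the number of edges plus the number of nodes, which is an interpretation of the printed formula; the function name `knn_gaussian_graph` and the toy feature vectors are also assumptions.

```python
import math


def knn_gaussian_graph(X, k):
    """Return a symmetric weight matrix W for a k-NN graph over the rows of X."""
    n = len(X)

    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    # Undirected edge set: (i, j) is kept if j is among the k nearest of i.
    edges = set()
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: sqdist(X[i], X[j]))[:k]
        for j in nearest:
            edges.add((min(i, j), max(i, j)))

    # Kernel width following Equation (11) as printed (squared distances).
    # Note: sigma is zero if all linked points coincide; not handled here.
    sigma = sum(sqdist(X[i], X[j]) for i, j in edges) / (len(edges) + n)

    W = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        w = math.exp(-sqdist(X[i], X[j]) / sigma ** 2)  # Equation (10)
        W[i][j] = W[j][i] = w
    return W


# Toy features: two well-separated pairs of points.
X = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
W = knn_gaussian_graph(X, k=1)
```

With k = 1 the two pairs form separate components, which mirrors the observation in the experiments that small k values lose global information.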

3.5. Graph Signals

In this work, the matrix $Y \in \{0, 1\}^{N \times 2}$ is considered a graph signal with two classes ($p = 2$): left ventricle and background. Each row of $Y$ indicates whether the segmented region belongs to the left ventricle ($[0, 1]$) or to the background ($[1, 0]$). In order to identify whether a node is background or left ventricle, the intersection over union and the intersection over vertex are computed [4].

3.6. Semi-Supervised Learning

The GraphECV algorithm relies on the minimization of the Sobolev norm, a semi-supervised learning approach, to reconstruct the graph signal after the sampling operation. Variational splines or the combinatorial Laplace operator are introduced as tools to minimize the Sobolev norm [47]. The graph signal of labeled (or sampled) data is defined as:

$$y_{p}(A) = S y_{p}$$

(12)

where $p = 1, 2$. The Sobolev norm of the graph signals is expressed as:

$$\| z_{p} \|_{α, ϵ}^{2} = \| (L + ϵ I)^{α / 2} z_{p} \|_{2}^{2} .$$

(13)

Consequently, the semi-supervised learning approach aiming to minimize the Sobolev norm can be viewed as an optimization problem expressed as:

$$\min_{z_{p}} \; z_{p}^{T} (L + ϵ I)^{α} z_{p} \quad \text{s.t.} \quad S z_{p} - y_{p}(A) = 0$$

(14)

where $L$ is the combinatorial Laplacian matrix of the graph G, and $I$ is the identity matrix. As $(L + ϵ I)$ is an invertible matrix for $ϵ > 0$ in undirected graphs [48], the solution of the optimization problem in Equation (14) is:

$$\tilde{Z} = ((L + ϵ I)^{-1})^{α} S^{T} (S ((L + ϵ I)^{-1})^{α} S^{T})^{-1} y(A) .$$

(15)

Experimentally, $ϵ$ is set to 0.2 and $α$ to 1 in this work. These values provided the best results after running different experiments for $ϵ = [0.2, 0.1, 10, 20]$ and $α = [1, 2]$. The graph signal processing toolbox developed by [49] is employed to solve the optimization problem introduced in Equation (14).
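For intuition, the closed-form solution in Equation (15) can be reproduced on a tiny graph without any toolbox. The sketch below handles only the case α = 1 and a single class column, and it uses a small Gauss-Jordan inversion written for this example; it is not the toolbox of [49]. The function names, the toy graph, and the labels are assumptions.

```python
def invert(M):
    """Gauss-Jordan inverse of a small, non-singular square matrix."""
    n = len(M)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def sobolev_reconstruct(W, sampled, y_sampled, eps=0.2):
    """Equation (15) with alpha = 1: reconstruct one graph-signal column
    from its values on the sampled nodes. W must have a zero diagonal."""
    n = len(W)
    # L + eps * I, with L = D - W the combinatorial Laplacian.
    shifted = [[(sum(W[i]) + eps if i == j else 0.0) - W[i][j]
                for j in range(n)] for i in range(n)]
    K = invert(shifted)
    # S K S^T keeps the sampled rows and columns of K; K S^T keeps its columns.
    K_AA_inv = invert([[K[a][b] for b in sampled] for a in sampled])
    m = len(sampled)
    coeffs = [sum(K_AA_inv[r][c] * y_sampled[c] for c in range(m))
              for r in range(m)]
    return [sum(K[i][a] * coeffs[j] for j, a in enumerate(sampled))
            for i in range(n)]


# Toy path graph 0 - 1 - 2 - 3 with a weak link between nodes 1 and 2.
W = [[0.0, 1.0, 0.0, 0.0],
     [1.0, 0.0, 0.1, 0.0],
     [0.0, 0.1, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0]]
# Node 0 is labeled left ventricle (1) and node 3 background (0).
z = sobolev_reconstruct(W, sampled=[0, 3], y_sampled=[1.0, 0.0], eps=0.2)
```

The reconstructed values at the unlabeled nodes follow the strongly connected labeled neighbour: node 1 ends up close to 1 and node 2 close to 0, so thresholding at 0.5 classifies them accordingly.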

4. Experimental Results

This section introduces the echocardiography videos datasets used in the current work, the evaluation metrics applied to assess the performance of the proposed methodology, the implementation details, and the experiments performed during the implementation of our proposed method GraphECV.

4.1. Datasets

4.1.1. EchoNet-Dynamic Dataset

The EchoNet-Dynamic Dataset [50] is the first dataset reported in the literature with 10,030 echocardiography videos. The 2-D gray scale videos are generated from 10,030 individuals with unique visits. The videos have a resolution of 112 × 112 pixels, and it is a four-chamber view [50]. Two frames of each video were manually labeled by medical professionals [29].

4.1.2. CAMUS Dataset

The second dataset adopted for this research is the Cardiac Acquisitions for Multi-structure Ultrasound Segmentation (CAMUS) dataset [51]. It was introduced to the research community in 2019. This 2-D dataset contains the medical exams of 500 patients. The data were collected at various acquisition settings with no prerequisite, so some cases were challenging to trace. In addition, in some cases, the wall is not visible. A portion of the data were acquired in a five-chamber view setting rather than a four-chamber view setting, since the probe orientation was unfeasible. This would produce realistic scenarios [51].

4.2. Evaluation Metric

The evaluation criterion is the Dice coefficient ($DC$), also known as the $F1$-measure. The Dice coefficient is a geometric metric which measures the pixel similarity between the ground truth data and the corresponding predicted segmentation. It is expressed as follows [52]:

$$DC = \frac{2 | X \cap Y |}{| X | + | Y |} = \frac{2 TP}{2 TP + FP + FN}$$

(16)

where X is the predicted data, and Y is the ground truth. The Dice coefficient determines the overlap between X and Y. $TP$ (True Positive), $FP$ (False Positive), and $FN$ (False Negative) represent the number of pixels that are correctly assigned as labels, incorrectly assigned as labels, and incorrectly assigned as no labels in X, respectively.
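As a small worked example of Equation (16), the sketch below counts TP, FP, and FN over two binary masks and returns the Dice coefficient. The function name `dice_coefficient`, the convention of returning 1 when both masks are empty, and the toy masks are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice coefficient between two binary masks given as 2-D lists."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # pixel correctly labeled as left ventricle
            elif p and not t:
                fp += 1      # pixel wrongly labeled as left ventricle
            elif t and not p:
                fn += 1      # left ventricle pixel that was missed
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
dc = dice_coefficient(pred, truth)  # TP = 2, FP = 1, FN = 1
```

For these masks the score is 2·2 / (2·2 + 1 + 1) = 4/6.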

4.3. Implementation Details

Python 3.7 and Detectron [53] were used for the implementation of instance and semantic segmentation. FgSegNet_S was trained for 200 epochs using a learning rate of 0.00035 and a batch size of 5. The graph signal processing toolbox [49] was utilized for the reconstruction of graph signals. The experiments were run on an NVIDIA GeForce RTX GPU.

4.4. Results

Several ablation studies were conducted over the datasets to analyze the performance of our proposed model (GraphECV), which involves many parameters, such as the percentage of labeled ground truth frames used during the training process, the number of neighbors k for k-NN in the graph construction, and the parameters $α$ and $ϵ$ for the Sobolev norm. In addition, the experimental results analyze and discuss some components of the framework in Figure 1, such as the segmentation method. To represent the nodes on the graph during the graph construction, many semantic segmentation methods were applied. FgSegNet_S attained the best performance due to the FPM module implemented between the encoder and decoder. FPM was able to better segment the dense spatial content of echocardiography videos. FgSegNet_S outperformed STOA segmentation algorithms such as Mask-RCNN [54], Unet [55], and Deeplab [56], especially in the case of a small amount of annotated data (5%). Table 1 reports the segmentation performance in terms of the Dice coefficient score for the different segmentation methods. k-NN is responsible for the construction of the graph. Table 2 summarizes the performance of GraphECV with various values of the k parameter (k = 5, k = 10, k = 20, and k = 30). For both datasets, the best results were obtained for k = 30, where all the nodes were connected. On the other hand, for small k values, the graph was missing global information of the database. For this experiment, 5% of annotated data was used.

The parameters $α$ and $ϵ$ of the semi-supervised learning block are associated with the Sobolev minimization. Sobolev minimization experiments were performed for $ϵ = 0.2$, 0.5, 1, 20, and $α = 1, 2$. Table 3 and Table 4 summarize the performance of the Sobolev norm minimization for $α = 1$ and $α = 2$, respectively. The best results were obtained for $ϵ = 0.2$ and $α = 1$. Higher values of $α$ make the Laplacian matrix denser, which in turn results in computational and memory problems.

For the variation of the percentage of labeled data, the experiments were conducted using 5%, 10%, 20%, 30%, and 50% of labeled data. The nodes of the graph were weighted upon calculation of the motion, temporal, texture, and statistical features. To show the discriminative effect of the statistical parameters of the GΓD, GraphECV was also applied without the integration of statistical features. The results obtained in both cases are reported in Table 5. When the framework was trained using 5% and 30% of labeled data, integrating the statistical parameters of the GΓD outperformed by approximately 10% the case where the GΓD parameters were not included.

Our method was compared with semi-supervised and supervised STOA methods for VOS. The supervised STOA methods include proposal-generation, refinement and merging for video object segmentation (PReMVOS) [57] and one shot video object segmentation (OSVOS) [58], while the semi-supervised STOA methods are the temporal memory attention network (TMANet) [16] and a corrective fusion network for efficient semantic segmentation on video (Accel) [14]. For a fair comparison, these methods were applied on the same test datasets. Figure 2 and Figure 3 show the visual results of the proposed GraphECV method and the STOA algorithms for the EchoNet-Dynamic and CAMUS datasets, respectively. Table 6 and Table 7 compare the quantitative results of the GraphECV approach on the EchoNet-Dynamic and CAMUS datasets with those of the STOA VOS methods across all the percentages of labeled data. The whole training data (100%) was used in the case of the EchoNet-Dynamic dataset to compare with the baseline deep learning segmentation method EchoNet-Dynamic developed by [50] (the EchoNet-Dynamic method is called Echonet here to differentiate between the method and the dataset). The performance of our proposed framework surpasses the other STOA methods for all the percentages of labeled data on both datasets. We can observe an improvement of the Dice coefficient when the percentage of labeled data increases from 5% to higher percentages. Even with a very small amount of annotated data (5%), our results show competitive performance compared to other STOA methods trained over 50% or fully annotated data. This is mainly due to the semi-supervised learning, which yields rigorous discrimination of the left ventricle on graph nodes.

5. Conclusions

Accurate interpretation and analysis of echocardiography videos are important in assessing cardiovascular diseases. In this research paper, we proposed a new tool, GraphECV, for semi-supervised segmentation of echocardiography videos aiming to detect the left ventricle. The framework of GraphECV requires segmentation and extraction of texture, statistical, and temporal features to represent the nodes on the graph, application of K-nearest neighbors to construct the graph, graph sampling by embedding the graph with few labeled data, and, finally, semi-supervised learning to reconstruct the graph. The proposed algorithm was evaluated on two publicly available echocardiography datasets. Through the experiments, GraphECV consistently outperformed several STOA methods by a significant margin.

For future research directions, we intend to address the problem of real-time processing for echocardiography videos which improves the diagnostic process and the healthcare of the patient. Furthermore, the semi-supervised learning based on Graph Signal Processing can explore other relevant features capable of enhancing the representation of nodes on the graph.

Author Contributions

Methodology, M.C.E.r.; Writing—original draft, M.D.; Writing—review & editing, M.A.-S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The EchoNet-Dynamic dataset is publicly available at https://echonet.github.io/dynamic/ (accessed on 5 September 2022). The CAMUS dataset is publicly available at https://www.creatis.insa-lyon.fr/Challenge/camus/databases.html (accessed on 5 September 2022).

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CAMUS	Cardiac Acquisitions for Multi-structure Ultrasound Segmentation
FPM	Feature Pooling Module
GΓD	Generalized Gamma Distribution
GSP	Graph Signal Processing
LBP	Local Binary Pattern
STOA	State-Of-The-Art
VOS	Video Object Segmentation

References

1. World Health Organization. Cardiovascular Diseases. Available online: https://www.who.int/health-topics/cardiovascular-diseases#tab=tab_1 (accessed on 5 September 2022).
2. Attia, D.; Benazza-Benyahia, A. Left ventricle detection in echocardiography videos. In Proceedings of the 2018 4th International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), Sousse, Tunisia, 21–24 March 2018; pp. 1–6.
3. Madani, A.; Jia Rui Ong, A.T.; Mofrad, M.R.K. Deep echocardiography: Data-efficient supervised and semi-supervised deep learning towards automated diagnosis of cardiac disease. NPJ Digit. Med. 2018, 1, 59.
4. Giraldo, J.H.; Javed, S.; Bouwmans, T. Graph Moving Object Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 2485–2503.
5. Chendeb El Rai, M.; Giraldo, J.H.; Al-Saad, M.; Darweesh, M.; Bouwmans, T. SemiSegSAR: A Semi-Supervised Segmentation Algorithm for Ship SAR Images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
6. Chendeb El Rai, M.; Al-Saad, M.; Darweesh, M.; Al Mansoori, S.; Al Ahmad, H.; Mansoor, W. Moving Objects Segmentation in Infrared Scene Videos. In Proceedings of the 2021 4th International Conference on Signal Processing and Information Security (ICSPIS), Dubai, United Arab Emirates, 24–25 November 2021; pp. 17–20.
7. Dong, X.; Thanou, D.; Toni, L.; Bronstein, M.; Frossard, P. Graph Signal Processing for Machine Learning: A Review and New Perspectives. IEEE Signal Process. Mag. 2020, 37, 117–127.
8. Stankovic, L.; Mandic, D.P.; Dakovic, M.; Kisil, I.; Sejdic, E.; Constantinides, A.G. Understanding the Basis of Graph Signal Processing via an Intuitive Example-Driven Approach [Lecture Notes]. IEEE Signal Process. Mag. 2019, 36, 133–145.
9. Zhen, M.; Li, S.; Zhou, L.; Shang, J.; Feng, H.; Fang, T.; Quan, L. Learning discriminative feature with crf for unsupervised video object segmentation. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2020; pp. 445–462.
10. Bhatti, A.H.; Rahman, A.U.; Butt, A.A. Unsupervised video object segmentation using conditional random fields. Signal Image Video Process. 2019, 13, 9–16.
11. Cheng, J.; Tsai, Y.H.; Wang, S.; Yang, M.H. Segflow: Joint learning for video object segmentation and optical flow. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 686–695.
12. Ren, Z.; Yan, J.; Ni, B.; Liu, B.; Yang, X.; Zha, H. Unsupervised deep learning for optical flow estimation. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017.
13. Ding, M.; Wang, Z.; Zhou, B.; Shi, J.; Lu, Z.; Luo, P. Every frame counts: Joint learning of video segmentation and optical flow. Proc. AAAI Conf. Artif. Intell. 2020, 34, 10713–10720.
14. Jain, S.; Wang, X.; Gonzalez, J.E. Accel: A corrective fusion network for efficient semantic segmentation on video. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 8866–8875.
15. Ilg, E.; Mayer, N.; Saikia, T.; Keuper, M.; Dosovitskiy, A.; Brox, T. Flownet 2.0: Evolution of optical flow estimation with deep networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2462–2470.
16. Wang, H.; Wang, W.; Liu, J. Temporal Memory Attention for Video Semantic Segmentation. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; pp. 2254–2258.
17. Zhang, X.; Xia, Y. LSMVOS: Long-Short-Term Similarity Matching for Video Object. arXiv 2020, arXiv:2009.00771.
18. Xiao, H.; Feng, J.; Lin, G.; Liu, Y.; Zhang, M. Monet: Deep motion exploitation for video object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1140–1148.
19. Girisha, S.; Pai, M.M.M.; Verma, U.; Pai, R.M. Performance Analysis of Semantic Segmentation Algorithms for Finely Annotated New UAV Aerial Video Dataset (ManipalUAVid). IEEE Access 2019, 7, 136239–136253.
20. Smistad, E.; Østvik, A. 2D left ventricle segmentation using deep learning. In Proceedings of the 2017 IEEE International Ultrasonics Symposium (IUS), Washington, DC, USA, 6–9 September 2017; pp. 1–4.
21. Yuan, Y.; Liang, X.; Wang, X.; Yeung, D.Y.; Gupta, A. Temporal dynamic graph LSTM for action-driven video object detection. In Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA, 6–9 September 2017; pp. 1801–1810.
22. Xu, N.; Yang, L.; Fan, Y.; Yang, J.; Yue, D.; Liang, Y.; Price, B.; Cohen, S.; Huang, T. Youtube-vos: Sequence-to-sequence video object segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 585–601.
23. Wang, W.; Shen, J.; Porikli, F.; Yang, R. Semi-supervised video object segmentation with super-trajectories. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 985–998.
24. Duarte, K.; Rawat, Y.S.; Shah, M. Capsulevos: Semi-supervised video object segmentation using capsule routing. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 8480–8489.
25. Luiten, J.; Voigtlaender, P.; Leibe, B. PReMVOS: Proposal-generation, Refinement and Merging for the YouTube-VOS Challenge on Video Object Segmentation 2018. In Proceedings of the 1st Large-Scale Video Object Segmentation Challenge - ECCV Workshops, Munich, Germany, 8–14 September 2018.
26. Deep, G.; Kaur, J.; Singh, S.P.; Nayak, S.R.; Kumar, M.; Kautish, S. MeQryEP: A Texture Based Descriptor for Biomedical Image Retrieval. J. Healthc. Eng. 2022, 2022, 9505229.
27. Aggarwal, A.; Madam Chakradar, M.; AL-Dois, H. COVID-19 Risk Prediction for Diabetic Patients Using Fuzzy Inference System and Machine Learning Approaches. J. Healthc. Eng. 2022, 2022, 4096950.
28. Dangi, S.; Yaniv, Z.; Linte, C.A. Left Ventricle Segmentation and Quantification from Cardiac Cine MR Images via Multi-task Learning. arXiv 2018, arXiv:1809.10221.
29. Wu, H.; Liu, J.; Xiao, F.; Wen, Z.; Cheng, L.; Qin, J. Semi-supervised segmentation of echocardiography videos via noise-resilient spatiotemporal semantic calibration and fusion. Med. Image Anal. 2022, 78, 102397.
30. Sigit, R.; Rochmawati, E. Segmentation echocardiography video using B-Spline and optical flow. In Proceedings of the International Conference on Knowledge Creation and Intelligent Computing (KCIC), Manado, Indonesia, 15–17 November 2016; pp. 226–231.
31. Lim, L.; Keles, H. Foreground segmentation using convolutional neural networks for multiscale feature encoding. Pattern Recognit. Lett. 2018, 112, 256–262.
32. Ortega, A.; Frossard, P.; Kovačević, J.; Moura, J.; Vandergheynst, P. Graph Signal Processing: Overview, Challenges, and Applications. Proc. IEEE 2018, 106, 808–828.
33. Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
34. Lim, L.A.; Keles, H.Y. Learning multi-scale features for foreground segmentation. Pattern Anal. Appl. 2019, 23, 1369–1380.
35. Gifani, P.; Behnam, H.; Haddadi, F.; Sani, Z.A.; Gifani, P. Echocardiography noise reduction using sparse representation. Comput. Electr. Eng. 2016, 53, 301–318.
36. Vegas-Sanchez-Ferrero, G.; Aja-Fernandez, S.; Palencia, C.; Martin-Fernandez, M. A generalized gamma mixture model for ultrasonic tissue characterization. Comput. Math. Methods Med. 2012, 2012, 481923.
37. Prager, R.W.; Gee, A.H.; Treece, G.M.; Berman, L.H. Analysis of speckle in ultrasound images using fractional order statistics and the homodyned k-distribution. Ultrasonics 2002, 40, 133–137.
38. Byra, M.; Nowicki, A.; Wróblewska-Piotrzkowska, H.; Dobruch-Sobczak, K. Classification of breast lesions using segmented quantitative ultrasound maps of homodyned K distribution parameters. Med. Phys. 2016, 43, 5561–5569.
39. Huang, L.F. The Nakagami and its related distributions. WSEAS Trans. Math. 2016, 15, 477–485.
40. Vegas-Sanchez-Ferrero, G.; Seabra, J.; Rodriguez-Leor, O.; Serrano-Vida, A.; Aja-Fernandez, S.; Palencia, C.; Martin-Fernandez, M.; Sanches, J. Gamma mixture classifier for plaque detection in intravascular ultrasonic images. IEEE Trans. Ultrason. Ferroelectr. Freq. Control. 2014, 61, 44–61.
41. Nicolas, J.M.; Anfinsen, S.N. Introduction to Second Kind Statistics: Application of Log-Moments and Log-Cumulants to the Analysis of Radar Image Distributions. 2011. Available online: https://www.semanticscholar.org/paper/Introduction-to-Second-Kind-Statistics%3A-Application-Nicolas-Anfinsen/cd3f5316c6975bf512bc25cff20ef8529442c52a (accessed on 5 September 2022).
42. Pappas, O.A.; Anantrasirichai, N.; Achim, A.M.; Adams, B.A. River Planform Extraction From High-Resolution SAR Images via Generalized Gamma Distribution Superpixel Classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 3942–3955.
43. Skorton, D.J.; Collins, S.M.; Nichols, J.; Pandian, N.G.; Bean, J.A.; Kerber, R.E. Quantitative texture analysis in two-dimensional echocardiography: Application to the diagnosis of experimental myocardial contusion. Circulation 1983, 68, 217–223.
44. Iakovidis, D.K.; Keramidas, E.G.; Maroulis, D.E. Fuzzy Local Binary Patterns for Ultrasound Texture Characterization. In Proceedings of the International Conference Image Analysis and Recognition, Póvoa de Varzim, Portugal, 25–27 June 2008; pp. 750–759.
45. Moghaddasi, H.; Nourian, S. Automatic assessment of mitral regurgitation severity based on extensive textural features on 2D echocardiography videos. Comput. Biol. Med. 2016, 73, 47–55.
46. Dong, X.; Thanou, D.; Frossard, P.; Vandergheynst, P. Learning Laplacian Matrix in Smooth Graph Signal Representations. IEEE Trans. Signal Process. 2016, 64, 6160–6173.
47. Pesenson, I. Variational splines and Paley–Wiener spaces on combinatorial graphs. Constr. Approx. 2009, 29, 1–21.
48. Giraldo, J.H.; Bouwmans, T. Semi-Supervised Background Subtraction Of Unseen Videos: Minimization Of The Total Variation Of Graph Signals. In Proceedings of the 2020 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 25–28 October 2020; pp. 3224–3228.
49. Perraudin, N.; Paratte, J.; Shuman, L.M.; Kalofolias, V.; Vandergheynst, P.; Hammond, D. GSPBOX: A toolbox for signal processing on graphs. arXiv 2014, arXiv:1408.5781.
50. Ouyang, D.; He, B.; Ghorbani, A.; Yuan, N.; Ebinger, J.; Langlotz, C.P.; Heidenreich, P.A.; Harrington, R.A.; Liang, D.H.; Ashley, E.A.; et al. Video-based AI for beat-to-beat assessment of cardiac function. Nature 2020, 580, 252–256.
51. Leclerc, S.; Smistad, E.; Pedrosa, J.; Østvik, A.; Cervenansky, F.; Espinosa, F.; Espeland, T.; Berg, E.A.R.; Jodoin, P.M.; Grenier, T.; et al. Deep Learning for Segmentation Using an Open Large-Scale Dataset in 2D Echocardiography. IEEE Trans. Med. Imaging 2019, 38, 2198–2210.
52. Perazzi, F.; Pont-Tuset, J.; McWilliams, B.; Van Gool, L.; Gross, M.; Sorkine-Hornung, A. A benchmark dataset and evaluation methodology for video object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 724–732.
53. Detectron 2. 2019. Available online: https://github.com/facebookresearch/detectron2 (accessed on 5 September 2022).
54. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988.
55. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; pp. 234–241.
56. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848.
57. Luiten, J.; Voigtlaender, P.; Leibe, B. PReMVOS: Proposal-generation, Refinement and Merging for Video Object Segmentation. In Asian Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2018.
58. Caelles, S.; Maninis, K.K.; Pont-Tuset, J.; Leal-Taixé, L.; Cremers, D.; Van Gool, L. One-shot video object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 221–230.

Figure 1. Pipeline of the proposed GraphECV method. Instance segmentation is applied first. Motion, temporal, textural, and statistical features are then extracted to represent the nodes on the graph. The graph signal, which propagates the labeled data into the graph, identifies whether a node is left ventricle or background. A set of nodes is sampled via graph sampling, and the labels of all nodes in the graph are reconstructed with a semi-supervised learning algorithm.

Figure 2. EchoNet-Dynamic dataset: visual results of the proposed GraphECV method and other STOA algorithms. Frames from different videos, showing a variety of left ventricle motions, are presented from top to bottom. From left to right: original frame, ground truth, PreMVOS [57], Tmanet [16], Accel [14], Echonet [50], OSVOS [58], and our proposed GraphECV.

Figure 3. CAMUS dataset: visual results of the proposed GraphECV method and other STOA algorithms. Frames from different videos, showing a variety of left ventricle motions, are presented from top to bottom. From left to right: original frame, ground truth, PreMVOS [57], Tmanet [16], Accel [14], OSVOS [58], and our proposed GraphECV.

Table 1. Dice coefficient score of our proposed method when using different segmentation methods for node representation. The percentage of annotated data is 5%.

Dataset	Unet	DeepLab	Mask-RCNN	FgSegNet_S
EchoNet-Dynamic	0.6432	0.7890	0.742	0.9113
CAMUS	0.6995	0.7890	0.7695	0.9270

Table 2. Dice coefficient score of our proposed method with change in the construction of the graph. This ablation encompasses K-nearest neighbors with k = 5, k = 10, k = 20, and k = 30.

Dataset	k = 5	k = 10	k = 20	k = 30
EchoNet-Dynamic	0.8320	0.8523	0.8789	0.9113
CAMUS	0.8459	0.8648	0.8837	0.9270

Table 3. Dice coefficient score of our proposed method with ablation encompassing the Sobolev minimization parameters with α = 1. The percentage of annotated data is 5%.

Dataset	ϵ = 0.1	ϵ = 0.2	ϵ = 10	ϵ = 20
EchoNet-Dynamic	0.8589	0.9113	0.8954	0.8561
CAMUS	0.8796	0.9270	0.9014	0.8924

Table 4. Dice coefficient score of our proposed method with ablation encompassing the Sobolev minimization parameters with α = 2. The percentage of annotated data is 5%.

Dataset	ϵ = 0.1	ϵ = 0.2	ϵ = 10	ϵ = 20
EchoNet-Dynamic	0.8321	0.8652	0.8789	0.8958
CAMUS	0.8591	0.8687	0.8956	0.9087

Table 5. Average Dice coefficient reported on the datasets without and with the integration of the statistical parameters of GΓD; 5% and 30% of labeled data were used during the training process.

	With GΓD		Without GΓD
Dataset	5%	30%	5%	30%
EchoNet-Dynamic	0.9113	0.9301	0.7919	0.8212
CAMUS	0.9270	0.9329	0.8210	0.8531

Table 6. EchoNet-Dynamic dataset: statistical comparison based on the average Dice coefficient with supervised and semi-supervised STOA methods for different percentages of labeled training data.

Method	5%	10%	20%	30%	50%	100%
OSVOS	0.8012	0.8745	0.8823	0.8912	0.9025	0.9132
PreMVOS	0.7532	0.7756	0.7845	0.7960	0.8023	0.8245
Echonet	-	-	-	-	-	0.9200
Accel	0.8496	0.8579	0.8598	0.8609	0.8641	0.8756
TMANET	0.8523	0.8699	0.8752	0.8895	0.9027	0.9132
Ours	0.9113	0.9209	0.9285	0.9301	0.9355	0.9389

Table 7. CAMUS dataset: statistical comparison based on the average Dice coefficient with supervised and semi-supervised STOA methods for different percentages of labeled training data.

Method	5%	10%	20%	30%	50%
OSVOS	0.8352	0.8401	0.8479	0.8654	0.8845
PreMVOS	0.8629	0.8710	0.8746	0.8810	0.89553
Accel	0.8745	0.8891	0.8954	0.9058	0.9125
TMANET	0.8862	0.8954	0.9018	0.9132	0.9258
Ours	0.9270	0.9257	0.9289	0.9329	0.9396

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

El rai, M.C.; Darweesh, M.; Al-Saad, M. Semi-Supervised Segmentation of Echocardiography Videos Using Graph Signal Processing. Electronics 2022, 11, 3462. https://doi.org/10.3390/electronics11213462

