This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

The aim of this paper is to present a mobile agents model for the distributed classification of Big Data. The main challenge is to reduce the communication costs between the processing elements (PEs) in parallel and distributed computational models in order to ensure the scalability and efficiency of the method. The proposed distributed method integrates a new communication mechanism, intended to deliver High Performance Computing (HPC) for distributed programs as for parallel ones, by means of a team of cooperative mobile agents that relies on asynchronous communication. This team of mobile agents implements the distributed version of the Fuzzy C-Means algorithm (DFCM) and performs the Big Data classification in the distributed system. The paper presents the proposed scheme and its DFCM algorithm, together with experimental results that illustrate the scalability and efficiency of this distributed method.

Advances in computer science have introduced data-intensive applications with complex tasks in many domains (Internet of Things (IoT), cloud computing, data mining, Big Data analysis, etc.), which call for High Performance Computing (HPC).

These applications must process large amounts of data and carry out complex tasks. Their scalability and efficiency depend on their ability to handle both, as well as on the processing environment in which they are deployed. For example, in the medical domain, an application that analyzes cerebral MRI (magnetic resonance imaging) images with clustering algorithms has to process a large amount of data, which requires considerable processing power to achieve HPC.

Clustering algorithms are widely used in the medical field to analyze MRI images and to diagnose and detect abnormal regions based on their classification. However, these tasks require high-performance computational models that provide efficiency and flexibility for the most complex clustering algorithms, such as the Fuzzy C-Means algorithm. How can these requirements be met in a parallel and distributed computational model running on a distributed system, given the challenge of reducing the communication cost in such models? We present a cooperative computational processing model that addresses these requirements. This paper is organized as follows:

We provide the model of parallel and distributed computing where the distributed DFCM method is assigned to be implemented (

We demonstrate that the DFCM method implementation-based mobile agents is promising for Big Data classification (

To highlight the aim of this paper, we start with a brief overview about the parallel and distributed computational models [

The multi-agent system (MAS) [

Mobile agents have useful skills, such as autonomy, mobility, and asynchronous communication. They communicate by sending asynchronous ACL (agent communication language) messages to each other, which significantly reduces the communication cost in the computational model. Mobile agents thus provide efficient communication mechanisms for HPC.
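The asynchronous exchange described above can be illustrated with a minimal sketch. It is not the JADE API: the `Agent` class, its mailbox, and the `worker_behaviour` function are hypothetical stand-ins built on Python threads and queues. Only the performative names `REQUEST` and `INFORM` are borrowed from the FIPA ACL vocabulary. The point is that a sender queues its message and continues working without waiting for the receiver.

```python
import queue
import threading


class Agent:
    """Toy agent with a private mailbox (an assumption, not the JADE API)."""

    def __init__(self, name):
        self.name = name
        self.mailbox = queue.Queue()  # incoming messages wait here

    def send(self, receiver, performative, content):
        # Non-blocking: the message is queued and the sender carries on.
        receiver.mailbox.put({"sender": self.name,
                              "performative": performative,
                              "content": content})

    def receive(self, timeout=1.0):
        # The receiver picks the message up whenever it is ready.
        return self.mailbox.get(timeout=timeout)


def worker_behaviour(worker, leader):
    """Wait for one request, do a local computation, report the result."""
    request = worker.receive()
    partial = sum(request["content"])  # stand-in for a local computation
    worker.send(leader, "INFORM", partial)


def run_demo():
    leader = Agent("leader")
    workers = [Agent(f"worker-{k}") for k in range(3)]
    threads = [threading.Thread(target=worker_behaviour, args=(w, leader))
               for w in workers]
    for t in threads:
        t.start()
    # The leader posts every request without waiting for any reply.
    for k, w in enumerate(workers):
        leader.send(w, "REQUEST", [k, k + 1])
    # Replies arrive in whatever order the workers finish.
    replies = [leader.receive() for _ in workers]
    for t in threads:
        t.join()
    return sorted(r["content"] for r in replies)
```

Because replies may arrive in any order, the leader in this sketch collects them by count rather than by sender, which is one simple way to tolerate asynchronous delivery.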

The cooperative computational model where the proposed (DFCM) method is assigned to be implemented is a parallel and distributed virtual machine based on mobile agents. This machine is built over a distributed computing grid of size (

The distributed classification is performed by the implementation of the proposed DFCM method on a cooperative mobile agents team works model as illustrated in

The well-known clustering algorithm named the fuzzy c-means (FCM) is proposed by Dunn [_{1}, _{2}, …, _{N}} that minimizes the objective function given by the following equation:

The membership matrix U has the properties:

_{j}_{i}

V_{i}

_{i}_{j}

N Number of data.

c Number of clusters 2 ≤

To reach a minimum of dissimilarity function there are two conditions:
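For reference, the standard FCM formulation found in the literature can be written as below. The notation is an assumption and may differ from the symbols of the original equations: $x_j$ is a data point, $v_i$ a class center, $u_{ij}$ the membership degree of $x_j$ in class $i$, and $m > 1$ the fuzzification exponent. The first line gives the objective function and the constraint on the memberships; the second line gives the two usual update conditions.

```latex
J_m(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{N} u_{ij}^{m} \, \lVert x_j - v_i \rVert^{2},
\qquad \sum_{i=1}^{c} u_{ij} = 1 \quad \forall j

v_i = \frac{\sum_{j=1}^{N} u_{ij}^{m} \, x_j}{\sum_{j=1}^{N} u_{ij}^{m}},
\qquad
u_{ij} = \left[ \sum_{k=1}^{c}
  \left( \frac{\lVert x_j - v_i \rVert}{\lVert x_j - v_k \rVert} \right)^{2/(m-1)}
\right]^{-1}
```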

The standard FCM classification is achieved according to the following algorithm stages, which are summarized in
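As a rough illustration of the sequential stages, the following sketch runs the textbook FCM loop on scalar data such as gray levels. The function names, the default exponent `m = 2`, the threshold `eps`, and the iteration cap are assumptions made for the example and are not taken from the paper.

```python
def memberships(x, centers, m=2.0):
    """Membership degrees of one gray level x in every class."""
    dists = [abs(x - v) for v in centers]
    # A point sitting exactly on a center belongs fully to that class.
    if 0.0 in dists:
        return [1.0 if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]


def fcm(data, centers, m=2.0, eps=1e-4, max_iter=100):
    """Plain sequential FCM on scalar data; returns (centers, J, iterations)."""
    previous_j = None
    classes = range(len(centers))
    for iteration in range(1, max_iter + 1):
        # Stage 1: membership matrix for the current centers.
        u = [memberships(x, centers, m) for x in data]
        # Stage 2: objective function for the current centers.
        j = sum(u[n][i] ** m * (x - centers[i]) ** 2
                for n, x in enumerate(data) for i in classes)
        # Stage 3: each new center is a membership-weighted mean of the data.
        centers = [
            sum(u[n][i] ** m * x for n, x in enumerate(data))
            / sum(u[n][i] ** m for n in range(len(data)))
            for i in classes
        ]
        # Stage 4: stop when the objective function no longer changes.
        if previous_j is not None and abs(j - previous_j) < eps:
            break
        previous_j = j
    return centers, j, iteration
```

A call such as `fcm([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])` moves the two centers toward the two groups of values.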

The distributed algorithm described in

Mobile Team Leader Agent Initialization

In this step the Team Leader agent is initialized by the input MRI image and the values of

Grid Construction

The Team Leader agent splits the input image into (

Fuzzy C-Means Classification

Each AVPE(

For each iteration

The Team Leader agent sends the class centers to all the AVPEs.

Each AVPE(

Each AVPE(

where:

TE1: the sum Σ(u^{m} × data) computed for each class center

TE2: the sum Σ(u^{m}) computed for each class center

TE3: the sum Σ(u^{m} × distance²) computed for all classes. This term is computed by:

The Team Leader agent performs these three sub tasks: assembling the elementary results, computing the new class centers, and computing the objective function J_{t}

Assembling the elementary results

The Team Leader agent receives the elementary results (TE1(

Computing the global class centers

The Team Leader agent gets the computed global values (GTE1(_{i}

Computing the objective function J_{t}

The Team Leader agent gets the global value of GTE3(

The Team Leader agent tests the condition of the algorithm convergence (|J_{t} − J_{(t−1)}| < E_{th}).

// End of iteration

The Team Leader agent requests to each AVPE(

Each AVPE(

The Team Leader agent assembles the segmented elementary images and displays the segmented output image.
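The steps above can be sketched in a single process as follows. This is a minimal illustration under stated assumptions, not the authors' JADE implementation: the AVPEs are simulated by a plain loop over blocks, the image is a flat list of gray levels, TE1, TE2, and TE3 are read as the three partial sums described in the text (this mapping is an assumption), and the final labeling by highest membership is also an assumption. All function names are hypothetical.

```python
def memberships(x, centers, m=2.0):
    """Membership degrees of one gray level x in every class."""
    dists = [abs(x - v) for v in centers]
    if 0.0 in dists:  # a pixel lying exactly on a center
        return [1.0 if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]


def split_blocks(pixels, n_blocks):
    """Cut the flat pixel list into nearly equal elementary images."""
    size = -(-len(pixels) // n_blocks)  # ceiling division
    return [pixels[k:k + size] for k in range(0, len(pixels), size)]


def local_terms(block, centers, m=2.0):
    """Work of one AVPE: partial sums TE1 and TE2 per class, and TE3."""
    c = len(centers)
    te1, te2, te3 = [0.0] * c, [0.0] * c, 0.0
    for x in block:
        u = memberships(x, centers, m)
        for i in range(c):
            w = u[i] ** m
            te1[i] += w * x                      # sum of u^m * data
            te2[i] += w                          # sum of u^m
            te3 += w * (x - centers[i]) ** 2     # sum of u^m * distance^2
    return te1, te2, te3


def dfcm(pixels, centers, n_agents=4, m=2.0, e_th=1e-4, max_iter=100):
    """Team Leader loop; returns (centers, labels, iterations)."""
    blocks = split_blocks(pixels, n_agents)
    c = len(centers)
    previous_j = None
    for iteration in range(1, max_iter + 1):
        # In a real deployment each call would run on a separate AVPE.
        results = [local_terms(b, centers, m) for b in blocks]
        # Assemble the elementary results into global terms.
        gte1 = [sum(r[0][i] for r in results) for i in range(c)]
        gte2 = [sum(r[1][i] for r in results) for i in range(c)]
        j = sum(r[2] for r in results)  # global TE3 is the objective function
        centers = [gte1[i] / gte2[i] for i in range(c)]
        if previous_j is not None and abs(j - previous_j) < e_th:
            break
        previous_j = j
    # Segmentation: each pixel takes the class with the highest membership.
    labels = []
    for x in pixels:
        u = memberships(x, centers, m)
        labels.append(max(range(c), key=lambda i: u[i]))
    return centers, labels, iteration
```

Only the class centers travel from the leader to the workers, and only `c` values for TE1, `c` values for TE2, and one value for TE3 travel back, so the amount of data exchanged per iteration does not depend on the size of the image.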

A distributed computing environment, as illustrated in

JADE (Java Agent DEvelopment) [

The classification of the MRI image is performed on this platform by applying this middleware. It creates the main components of this model, which are:

The host container: this is the second container which is started in the platform after the main container, where the mobile team leader agent is deployed in order to perform its tasks in the grid.

The agent containers: these are the containers that are started in the platform, where the mobile team worker agents will move to perform their tasks.

The proposed DFCM algorithm is implemented in this model for MRI medical image analysis. To do so, we choose two cerebral MRI images: brain MRI image (Img1) in

To illustrate the effectiveness of the FCM implementation in this model, we present five case studies:

Dynamic convergence of this program for the MRI image (Img1) with two different class center initializations:

(

(

Dynamic convergence of this program for the MRI image (Img2) with two different class center initializations:

(

(

The DFCM classification time according to the number of agents involved in the classification, for the initial class centers (c1, c2, c3) = (1.1, 2.5, 3.8) for Img1, and (c1, c2, c3, c4, c5) = (1.5, 2.2, 3.8, 5.2, 8.6) for Img2. In

A detailed comparison between the FCM and the DFCM methods is made in

The DFCM classification time according to the number of nodes in the grid computing by considering 16 AVPEs for the two images (Img1) and (Img2). In

The speedup S(DFCM), its relative speedup S_{R}(DFCM), and the efficiency of the DFCM classification method are presented, respectively, in _{R}(DFCM) are illustrated in

T(FCM) is the classification time of the FCM method, which corresponds to one agent; and

T(DFCM) is the classification time of the DFCM method which corresponds to the number NA of agents.

There are several inspiring parallel methods of the clustering algorithm on massively parallel computational models which have demonstrated interesting clustering results. In [

The different parallel methods differ from each other by the computational models which are assigned to be implemented. Their implementations depend on the parallel computing strategies [

Building on these works, the proposed DFCM method is implemented on a scalable and efficient mobile agents model. We consider the JADE middleware-based mobile agents model a simple and suitable solution for implementing this distributed method.

In this paper, we presented a distributed fuzzy c-means method (DFCM) and its application to MRI image classification. The method is implemented on an SPMD model based on a grid of cooperative mobile agents. This model uses an asynchronous communication mechanism based on the exchange of ACL messages between the AVPEs. The results obtained with this implementation, covering the convergence of the class centers, the classification time, the speedup, and the efficiency, show that the proposed DFCM method can reduce the complexity of fuzzy clustering algorithms and, in particular, the communication cost between the PEs. The abilities of mobile agents thus provide the key properties that a distributed system needs to achieve HPC.

Fatéma Zahra Benchara proposed and implemented the distributed model, carried out the experiments, and wrote the paper. Mohamed Youssfi proposed the idea, implemented the sequential model, and analyzed and validated the experimental results. Omar Bouattane and Hassan Ouajji reviewed the paper.

The authors declare no conflict of interest.

A cooperative computational grid of size 4 × 4.

Computational model for DFCM classification overview.

Standard fuzzy c-means classification stages.

Distributed FCM algorithm organization chart.

Sequence diagram for DFCM on a cooperative multi-agent model.

Agent asynchronous communication mechanisms in a 2D mesh (4 × 4).

Multi-agent middleware overview.

Classification results by the elaborated DFCM model implementation for Img1.

Classification results by the elaborated DFCM Model implementation for Img2.

Dynamic convergence of the class centers starting from class centers (c1, c2, c3) = (1.1, 2.5, 3.8). (

Dynamic convergence of the class centers starting from class centers (c1, c2, c3) = (140.5, 149.5, 150.5). (

Dynamic convergence of the class centers starting from class centers (c1, c2, c3, c4, c5) = (1.5, 2.2, 3.8, 5.2, 8.6). (

Dynamic convergence of the class centers starting from class centers (c1, c2, c3, c4, c5) = (140.5, 149.5, 150.5, 220.2, 250.5). (

Time of DFCM classification for each MRI image depending on the number of agents with initial class centers (c1, c2, c3) = (1.1, 2.5, 3.8) for Img1, and (c1, c2, c3, c4, c5) = (1.5, 2.2, 3.8, 5.2, 8.6) for Img2.

DFCM classification time depending on the number of nodes in the grid using 16 AVPEs, with initial class centers (c1, c2, c3) = (1.1, 2.5, 3.8) for Img1, and (c1, c2, c3, c4, c5) = (1.5, 2.2, 3.8, 5.2, 8.6) for Img2.

DFCM classification data depending on the number of nodes in the grid using 16 AVPEs (

Relative Speedup of DFCM Classification depending on the number of AVPEs and with initial class centers (c1, c2, c3) = (1.1, 2.5, 3.8) for Img1, and (c1, c2, c3, c4, c5) = (1.5, 2.2, 3.8, 5.2, 8.6) for Img2.

Efficiency of DFCM classification depending on the number of AVPEs and with initial class centers (c1, c2, c3) = (1.1, 2.5, 3.8) for Img1, and (c1, c2, c3, c4, c5) = (1.5, 2.2, 3.8, 5.2, 8.6) for Img2.

Different states of the distributed fuzzy c-means (DFCM) algorithm for Img1 classification starting from different class center initializations.

Case | Initial C1 | Initial C2 | Initial C3 | Final C1 | Final C2 | Final C3 | Number of Iterations |
---|---|---|---|---|---|---|---|
CASE 1 | 1.1 | 2.5 | 3.8 | 1.100 | 97.667 | 146.569 | 13 |
CASE 2 | 140.1 | 149.5 | 150.8 | 1.100 | 97.661 | 146.566 | 20 |

Different states of the distributed fuzzy c-means (DFCM) algorithm for Img2 classification starting from different class center initializations.

Case | Initial C1 | Initial C2 | Initial C3 | Initial C4 | Initial C5 | Final C1 | Final C2 | Final C3 | Final C4 | Final C5 | Number of Iterations |
---|---|---|---|---|---|---|---|---|---|---|---|
CASE 1 | 1.5 | 2.2 | 3.8 | 5.2 | 8.6 | 1.742 | 67.587 | 101.709 | 238.983 | 170.040 | 35 |
CASE 2 | 140.5 | 149.5 | 150.5 | 220.5 | 250.5 | 1.764 | 67.967 | 101.858 | 239.140 | 170.560 | 26 |

FCM and DFCM method comparison for the classification of two images (Img1 and Img2).

FCM Classification Time (Img1) (ms) | FCM Classification Time (Img2) (ms) | Number of Agents | DFCM Classification Time (Img1) (ms) | DFCM Classification Time (Img2) (ms) |
---|---|---|---|---|
778 | 1509 | 1 | 778 | 1509 |
- | - | 2 | 334 | 860 |
- | - | 4 | 200 | 516 |
- | - | 8 | 144 | 371 |
- | - | 16 | 108 | 278 |
- | - | 32 | 103 | 266 |

Speedup of the distributed fuzzy c-means (DFCM) method for Img1 and Img2.

Number of Agents | S_{R}(DFCM) (Img1) (%) | S(DFCM) (Img1) | S_{R}(DFCM) (Img2) (%) | S(DFCM) (Img2) |
---|---|---|---|---|
1 | 0 | 1 | 0 | 1 |
2 | 57.069 | 2.32 | 43.008 | 1.75 |
4 | 74.293 | 3.89 | 65.805 | 2.92 |
8 | 81.491 | 5.4 | 75.414 | 4.06 |
16 | 86.118 | 7.2 | 81.577 | 5.42 |
32 | 86.76 | 7.55 | 82.372 | 5.67 |