Mathematical and Computational Applications
  • Article
  • Open Access

25 May 2018

The Impact of the Implementation Cost of Replication in Data Grid Job Scheduling

1. Department of Computer Science, COMSATS University, University Road, Tobe Camp, Abbottabad 22060, Pakistan
2. Department for Management of Science and Technology Development, Ton Duc Thang University, Ho Chi Minh City, Vietnam
3. Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
4. Department of Computer Science, University of Texas at San Antonio, San Antonio, TX 78249, USA
This article belongs to the Special Issue Applied Modern Mathematics in Complex Networks

Abstract

Data Grids deal with geographically distributed, large-scale, data-intensive applications. Scheduling schemes for data grids attempt not only to improve data access time, but also to improve the availability of data at the nodes where data requests are generated. Data replication techniques manage large data volumes by storing a number of data files efficiently. In this paper, we propose the centralized dynamic scheduling strategy-replica placement strategy (CDSS-RPS). CDSS-RPS schedules data and tasks so as to minimize the implementation cost and the data transfer time. CDSS-RPS consists of two algorithms, namely (a) the centralized dynamic scheduling strategy (CDSS) and (b) the replica placement strategy (RPS). CDSS considers the computing capacity of a node and finds an appropriate location for a job. RPS attempts to improve file access time through replication, on the basis of the number of accesses, the storage capacity of a computing node, and the response time of a requested file. Extensive simulations are carried out to demonstrate the effectiveness of the proposed strategy. Simulation results demonstrate that the replication and scheduling strategies improve the implementation cost and the average access time significantly.

1. Introduction

The term grid computing was first put forth by Foster and Kesselman [,] as a paradigm that provides reliable, consistent resource sharing and job execution in distributed systems. A data grid is an integrated architecture that offers a collection of storage and computational resources in distributed geographical locations []. Data grids involve the management and storage of huge amounts of data. Key data grid application areas include (a) climate simulation, (b) the Large Hadron Collider (LHC), (c) astronomy and earth observation, and (d) high-energy physics (HEP), all of which produce huge volumes of data (terabytes to petabytes) [,,,,,,,]. Data grid scheduling strategies should attempt to optimize data availability, job execution time, fault tolerance, bandwidth consumption, and response time [].
Techniques that are applied to overcome the data availability problem are (a) caching, (b) mirroring, and (c) replication. A cache is small and stores the most commonly requested data, but in large computing environments a small cache may not satisfy the data request rate of each computational resource []. Mirroring, as the name conveys, involves copying files from one server to another. Mirroring is not a practical technique, as it requires moving all of the data from one site to another continuously while the request rate changes periodically. Replication deploys several copies of data files on servers to fulfill client requests more rapidly. Replication normally increases the accessibility of data and optimizes response time and bandwidth consumption, which ultimately boosts system performance. Considering the dynamic data grid environment, which involves managing huge amounts of data, a replication technique is adopted here to improve data access time.
Data replication, when employed in data grids, improves efficiency and reliability in wide-area data access and retrieval [,]. Data replication techniques are of two types: (a) static and (b) dynamic []. In static replication, the created replicas remain on the same node until the user deletes them or their time duration expires. In dynamic replication, data files are duplicated based on user requests and the storage capacity of a computing node.
An important issue in data replication techniques is replica placement []. Replica placement is the assignment of the duplicate copies of data to an appropriate number of nodes. Replica placement is further categorized into two phases: (a) replica discovery and (b) replica selection. Replica discovery decides when and where to place the replica. Replica selection decides which file to replicate [,]. If sufficient storage space is not available in a node then the file replacement stage is activated for new replicas []. Thus, the best replica placement strategy improves the response time and bandwidth consumption significantly.
The motivation of this paper is to develop a scheduling and replication strategy that not only minimizes the implementation cost, but also efficiently schedules tasks with minimum data movement. The scheduling strategy should minimize the transfer and deletion actions of data files among computing nodes. Devising such a strategy is a challenge, as users' request patterns may change very frequently in a data grid environment. An example is a video server system, where the demand for videos changes each day; fulfilling the demand from the nearest server by replicating and replacing new files all the time is therefore not an appropriate methodology. Scheduling jobs onto appropriate computing nodes is essential, since data transfers among computing nodes increase the implementation cost and the data access time []. Therefore, our proposed strategy considers both replication and job scheduling.
In this paper, a centralized dynamic scheduling strategy-replica placement strategy (CDSS-RPS) is proposed. CDSS-RPS consists of two strategies: (a) the centralized dynamic scheduling strategy (CDSS) and (b) the replica placement strategy (RPS). CDSS considers a data grid architecture that supports effective scheduling of jobs and data replication. RPS places replicas on the computing sites that have enough storage space and computing capability. We give equal consideration to the replication and scheduling strategies in order to minimize the job execution time.
The main contributions of this paper are as follows:
  • We present a job scheduling strategy for data-intensive applications in a data grid environment. Our paper modifies the cost minimization strategy presented in [], and considers the replication strategy along with the scheduling strategy.
  • We present a novel data replication and scheduling strategy. Our approach consists of minimizing the data access time and implementation cost, i.e., (a) scheduling the tasks on the available computing node for the number of tasks requested, and (b) minimizing the transfer time of all data files with respect to their network bandwidth consumption.
  • GridSim Toolkit 4.1 [] was also used to define the data grid architecture that supports job scheduling and data replication, and to perform the simulation.
  • Extensive simulations were carried out to compare the implementation cost and turnaround time of the proposed strategy with the contemporary data replication strategies in a data grid environment.
The rest of the paper is organized as follows: Section 2 discusses the related work. In Section 3, the proposed work is described, i.e., the system model, the problem formulation, and the proposed methodology. The methodology is experimentally evaluated and the results are presented in Section 4. Finally, Section 5 concludes the paper.

3. Proposed Work

The following subsections explain the system model, the working of CDSS-RPS, and the architectural diagram in detail.

3.1. System Model

In a data grid system, efficient network performance is required to schedule and dispatch the jobs requested by the end user. Figure 1 shows how a data grid system supports, schedules, and dispatches jobs. A typical data grid system comprises a set of domains, each consisting of a replica server and a number of computing sites. Storage for replicas and computational resources for the jobs are provided by the replica server and the computing sites, respectively. Both of these functional devices are commonly known as nodes. A local area network (LAN) connects the nodes within a domain, all with the same network bandwidth, while a wide area network (WAN) connects nodes of different domains with differing network bandwidths.
Figure 1. System architecture.
If a replica server exists in the same domain as a computing site, it is known as the primary replica server (PRS) for that computing site, while all other replica servers are considered secondary replica servers (SRS). The data grid information service keeps a record of the available resources and provides a resource registration service in the data grid. Based on specific strategies, the grid scheduler allocates a task to specific computing nodes, as shown in Figure 1. A local storage device stores the data before it is allocated to the computing nodes. If the cache of a computing node does not contain the required data, the request for the data is directed to its PRS. The PRS searches within its domain first; if the required data is not available within the domain, the PRS searches among all replica servers and collects the data from the replica server having the highest-bandwidth connection to the computing node. An ideal grid scheduler assigns jobs to appropriate computing sites where all the required data files are available, which minimizes the search time significantly.
The data grid architecture is defined as follows:
  • There are D domains in the data grid.
  • For every domain d ∈ D, CS(d) denotes the set of computing sites situated in d.
  • RS(d) is the set of replica servers within each domain d.
  • NS(d) is the set of nodes within each domain d, where NS(d) = CS(d) ∪ RS(d).
  • Ci is the computing capability of a computing site i (a minimal data-model sketch follows this list).
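To make this notation concrete, the following minimal Python sketch models the same data model; the class and attribute names are illustrative assumptions, not identifiers from the paper or from GridSim.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    node_id: int
    storage_capacity: float            # free storage available for replicas

@dataclass
class ComputingSite(Node):
    computing_capability: float = 0.0  # Ci of the site

@dataclass
class Domain:
    domain_id: int
    computing_sites: List[ComputingSite] = field(default_factory=list)  # CS(d)
    replica_servers: List[Node] = field(default_factory=list)           # RS(d)

    def nodes(self) -> List[Node]:
        """NS(d) = CS(d) ∪ RS(d)."""
        return self.computing_sites + self.replica_servers
```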
Our CDSS-RPS strategy is based on two phases: CDSS and RPS. In CDSS, the most-accessed jobs are collected from every server by the meta scheduler, which then computes the average number of accesses per file. The RPS strategy is then applied to replicate the most-accessed files (identified by CDSS) at the computing sites where demand for the file occurs most frequently. The system architecture with the proposed strategy is explained below.

3.2. Centralised Dynamic Scheduling Strategy-Replica Placement Strategy Architecture Diagram

Figure 2 describes the working of the proposed strategy.
Figure 2. Architecture diagram for centralized dynamic scheduling strategy-replica placement strategies (CDSS-RPS).
The mechanisms of the architecture diagram are described below.

3.2.1. Grid User

A grid user submits a job to the meta scheduler by specifying the features of the job (i.e., the length of the job, expressed in millions of instructions (MI)) and the quality of service requirements (i.e., the processing time).

3.2.2. Meta Scheduler

The resource locator in the meta scheduler registers resources with the grid information service (GIS). At the time of registration, the resource locator obtains the number of computing resources, the computing capability, and the processing speed of each domain.

3.2.3. Scheduling Manager

The scheduling manager collects the details of the available resources and the number of users' job requests to execute. It sends the details of the number of jobs and the available resource information to the replicator, which evaluates the list of frequently accessed data files.

3.2.4. Replicator

The replicator service consists of three phases: (a) information service, (b) replication advisor, and (c) job placement. The information service obtains the information about the number of data files assigned to the computing sites of each domain. The replication advisor then makes a decision based on the collected information and invokes the CDSS algorithm discussed in Section 3.2.4.1. The replica placement of a data file and the job placement occur as discussed in Section 3.2.4.2 and Section 3.2.4.3. The reorganized information about resources and data files is updated on each replica server and on the computing sites of each domain.

3.2.4.1. Centralised Dynamic Scheduling Strategy (CDSS)

In the centralized dynamic scheduling strategy, there is a meta scheduler running in the system, as elaborated in Figure 3. Every primary replica server (PRS) collects the data access information from each computing site or resource, as discussed for Figure 1. The PRS sends the number of files and their number of accesses (NOA) to the meta scheduler. The meta scheduler aggregates the number of accesses of each file and stores the result in a history table H. The centralized dynamic scheduling strategy is invoked by the meta scheduler, which instructs the replica server to perform replication. The average number of accesses per file is calculated as follows.
$$\overline{NOA} = \frac{1}{|H|} \sum_{h \in H} NOA(h) \qquad (1)$$
where NOA(h) is the number of accesses of the file identified by its file identification (FID) in record h, |H| is the number of file records in the history table H, and $\overline{NOA}$ is the average number of accesses.
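As a quick numeric illustration of Equation (1), the sketch below averages the NOA values of a hypothetical history table (the dictionary layout and file IDs are assumptions):

```python
# Hypothetical history table H: FID -> NOA(h)
history = {"f1": 120, "f2": 45, "f3": 300}

# Equation (1): average number of accesses over all records in H
avg_noa = sum(history.values()) / len(history)
print(avg_noa)  # 155.0
```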
Figure 3. Illustrative example (a) without scheduler and (b) with scheduler.
The replica placement and scheduling strategy is presented in the CDSS algorithm.
Algorithm 1: CDSS Algorithm
Input: Number of files to be processed
Output: The list of files that are accessed most frequently
Begin
1: The meta scheduler collects the accessed-file information from each replica server.
2: The meta scheduler aggregates the number of accesses of each file.
3: The average number of accesses is calculated according to Equation (1).
4: For each file in the history table H do
5:   Compare the file's number of accesses with the average number of accesses.
6:   If the number of file accesses is less than the average number of accesses then
7:     Remove the file from the history table.
8:   End if
9: End for
10: Sort the remaining files in descending order of accesses.
11: The last record in the history table is denoted by LAF, the least accessed file.
12: Place the files on the replica servers following the history table H.
13: While the history table H is not empty do
14:   Pop record h from the history table H.
15:   Request the replica placement strategy (RPS) (Algorithm 2) to replicate the file FID(h).
16:   Update record h.
17:   If NOA(h) > NOA(LAF) then
18:     Re-insert record h into H in descending order of the NOA field.
19:   End if
20: End while
End
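The following Python sketch shows one possible reading of Algorithm 1; the history-table layout and the `rps_replicate` callback are assumptions, and the re-insertion step is simplified to a single pass so the sketch is guaranteed to terminate:

```python
def cdss(history, rps_replicate):
    """One reading of Algorithm 1. `history` maps FID -> NOA and
    `rps_replicate(fid)` stands in for the RPS call (Algorithm 2);
    both are assumed data layouts, not the paper's API."""
    avg = sum(history.values()) / len(history)            # Equation (1)
    # Drop files accessed less often than the average.
    hot = {fid: noa for fid, noa in history.items() if noa >= avg}
    # Sort the remaining records in descending order of NOA.
    table = sorted(hot.items(), key=lambda kv: kv[1], reverse=True)
    laf_noa = table[-1][1]                                # least accessed file
    while table:
        fid, _ = table.pop(0)                             # pop h from H
        rps_replicate(fid)                                # Algorithm 2
        # The paper re-inserts h when its updated NOA still exceeds the
        # LAF; this sketch makes a single pass instead.
    return laf_noa
```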
Computing sites with a higher computing capability fulfil requests on time, or sometimes even before the specified time. The computing capability of a domain d is defined as $c_d = \sum_{i \in CS(d)} c_i$, and the computing capability of all domains is $c_D = \sum_{d \in D} c_d$. Suppose that the data request rate is proportional to the computing capability. The proportionality between them is measured by the factor θ for domain d, and can be denoted by:
$$D\_REQ(d) = \theta \cdot c_d \qquad (2)$$
where the value of θ is between 0 and 1. Let us take θ = 1; then $D\_REQ(d) = c_d$. For a data file f, the probability that f is requested is denoted by $Prob_f$. Therefore, the data request rate for data f from domain d can be defined as
$$D\_REQ(d, f) = Prob_f \cdot D\_REQ(d) = \theta \cdot Prob_f \cdot c_d \qquad (3)$$
The computing node having the highest available bandwidth most likely has the capability to access the replicas directly. The bandwidth capacity between computing nodes within the same domain is assumed to be the same. $BW_{d,k}$ measures the bandwidth capacity between a computing node in domain d and replica server k. The set of replica servers that store the original data or the replicas of data f is denoted by $R\_SER_f$. The bandwidth capacity $B\_CAP$ while accessing the data f from any node in domain d can be defined as
$$B\_CAP(d, f) = \max_{k \in R\_SER_f} BW_{d,k} \qquad (4)$$
The average response time for data f is defined as follows
$$AvgRespTime(f) = \sum_{d \in D} \frac{c_d}{\max_{k \in R\_SER_f} BW_{d,k}} \qquad (5)$$
The objective of replica placement is to minimize this average response time for every data file f.
Replica placement is made on the basis of the largest bandwidth capacity and the storage space of each replica server. Let Rf be the set of servers containing replicas or the original copy of data f. Data f is replicated to a replica server where f does not already exist and which has enough storage space to hold the replica.
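A minimal sketch of Equations (2)–(5), assuming plain dictionaries for the per-domain capability (`cap[d]`) and the pairwise bandwidth (`bw[(d, k)]`); all names are illustrative:

```python
def data_request_rate(theta, prob_f, c_d):
    # Equations (2) and (3): D_REQ(d, f) = theta * Prob_f * c_d
    return theta * prob_f * c_d

def bandwidth_capacity(bw, d, servers_with_f):
    # Equation (4): B_CAP(d, f) = max over replica servers k holding f
    return max(bw[(d, k)] for k in servers_with_f)

def avg_response_time(domains, cap, bw, servers_with_f):
    # Equation (5): sum over all domains of c_d / B_CAP(d, f)
    return sum(cap[d] / bandwidth_capacity(bw, d, servers_with_f)
               for d in domains)
```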

3.2.4.2. Replica Placement Strategy

Let XNEW contain the set of data files that are requested in a new session, and XOLD contain the set of replicas from the old session. After the scheduling and placement of a data file f, data requests are generated on a number of replica servers. These requests take the replica placements XOLD and XNEW as inputs, and introduce the additional (superfluous) transfer actions as early as possible in the schedule of submitted data requests. Superfluous replicas are defined as the data files that are not present in the XNEW state but were available in the XOLD state. The corresponding deletion action is added to the schedule before the transfer action takes place. The next section explains the transfer and deletion mechanism, as well as how to produce a new valid schedule that incurs the lowest cost.
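A minimal sketch of this session transition, assuming XOLD and XNEW are dictionaries mapping each server to its set of file IDs (the dict-of-sets layout is an assumption):

```python
def plan_transition(x_old, x_new):
    """Derive deletion and transfer actions between two sessions."""
    deletions, transfers = [], []
    for server in x_old.keys() | x_new.keys():
        old = x_old.get(server, set())
        new = x_new.get(server, set())
        # Superfluous replicas: in the old state but not the new one.
        deletions += [(server, f) for f in old - new]
        # Files the new state requires that the server does not hold yet.
        transfers += [(server, f) for f in new - old]
    # Deletions are scheduled before the transfers to free storage early.
    return deletions, transfers
```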
Algorithm 2: RPS Algorithm
Input: The most accessed data files that need to be replicated
Output: Replicate the files on the computing sites
Begin
1: Calculate the computing capability of each domain, $c_d = \sum_{i \in CS(d)} c_i$, and of all domains, $c_D = \sum_{d \in D} c_d$.
2: The data request rate is proportional to the computing capability; the proportional relationship is given in Equations (2) and (3).
3: Set the bandwidth to be equal between computing nodes within a domain and varying outside the domain.
4: Calculate the highest available bandwidth between the nodes in domain d and replica server k, as shown in Equation (4).
5: Calculate the average response time for the most accessed data files using Equation (5).
6: If the bandwidth capacity and the storage capacity are sufficient for the file to be replicated then
7:   If the replica server does not already contain that data file then
8:     Place the replica.
9:   End if
10: Else
11:   Delete the data files that are not required in the new requested state.
12: End if
End
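The placement decision of Algorithm 2 can be sketched as follows; the `server` object with `free_space`, `files`, and `evictable()` is a hypothetical interface, not part of the paper:

```python
def rps_place(fid, size, server):
    """Placement decision of Algorithm 2. `server.evictable()` is assumed
    to yield (file, size) pairs not required in the new requested state."""
    if server.free_space >= size:
        if fid not in server.files:
            server.files.add(fid)             # replica placement
            server.free_space -= size
    else:
        # Delete files that are not required in the new requested state.
        for victim, vsize in server.evictable():
            server.files.discard(victim)
            server.free_space += vsize
            if server.free_space >= size:
                break
```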

3.2.4.3. Schedule Enhancement Operator

When the most accessed data files have been replicated over a number of servers, a request is generated to check the cost of transfers and deletions. If the requested data file is inaccessible from the host server, it is transferred from the computing site having the highest available bandwidth, and it is replicated on the host server as well. We next present Algorithm 3, which considers the bandwidth as the main criterion for the selection and deletion of a replica. If the storage capacity of a server is too small, the algorithm deletes the files that are not required in the new state but were available in the old state, or the files that are least recently used.
Algorithm 3: Schedule Enhancement Operator
Input: A request is generated
Output: The request is granted successfully
Begin
1: If the requested replica is in the computing site (Rr = Cs) then
2:   Fulfill the request.
3: Else (the requested replica is not in the computing site, Rr ≠ Cs)
4:   Search for the replica within the same domain.
5:   If the replica is found in the same domain (Rr ∈ Cd) then
6:     Move it to the requesting site (Rr = Cs).
7:     If not enough space is available then
8:       Fulfill the request directly // avoid duplication
9:     End if
10:   Else
11:     Search for the replica in the other domains.
12:     Create a list of computing sites that offer the requested replica.
13:     Select the replica from the computing site that offers the highest available bandwidth.
14:   End if
15:   If space is available to store the new replica then
16:     Store it.
17:   Else
18:     While not enough space is available do
19:       Delete the least accessed files.
20:     End while
21:     Store the new replica.
22:   End if
23: End if
End
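One possible reading of Algorithm 3 in Python; the site objects and the `bandwidth` function are assumed interfaces, and the eviction of least-accessed files is omitted for brevity:

```python
def resolve_request(fid, site, domain, all_domains, bandwidth):
    """Sketch of Algorithm 3. Sites are hypothetical objects with a
    `files` set and `free_space`; `bandwidth(a, b)` returns the link
    capacity between two nodes. The file is assumed to exist somewhere."""
    if fid in site.files:
        return site                                     # local hit
    # Search within the same domain first.
    holders = [s for s in domain.nodes()
               if fid in s.files and s is not site]
    if not holders:
        # Fall back to all domains and build the candidate list.
        holders = [s for d in all_domains for s in d.nodes()
                   if fid in s.files]
    # Select the holder offering the highest available bandwidth.
    source = max(holders, key=lambda s: bandwidth(site, s))
    if site.free_space > 0:
        site.files.add(fid)                             # store new replica
    # Otherwise the request is served directly to avoid duplication.
    return source
```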

3.2.5. Dispatcher

The dispatcher dispatches the tasks one by one from the queue to the computing sites for which they are scheduled.

3.2.6. Scheduling Receptor

The scheduling receptor gets the task completion details from each computing site and sends the completion details to the user.

3.3. Illustrative Example

For better understanding, we provide a complete scenario that elaborates the working of the proposed strategy via a simple example, shown in Figure 3. Figure 3a,b show the difference in implementation cost between submitting jobs without a scheduler and with a scheduler that defines the job submission and replication. Suppose there are three replica servers, one in each domain. The computing sites CSi are connected to each replica server with the same bandwidth within a domain, while the network bandwidth between different replica servers differs. Suppose S1 holds data files (a, b, b), S2 holds data files (c, c), and S3 holds data files (b, a). The network bandwidth is 8 bps between S1 and S2, 4 bps between S2 and S3, and 3 bps between S1 and S3. The network bandwidth between the computing sites and the replica server is assumed to be the same within a domain.
In Figure 3a, the request is submitted to an arbitrary computing site. Let us suppose that a job is submitted to CS1, and the job requires data file c. S1 does not contain data file c, so c is transferred from S2 to fulfill the job's request. If S1 does not have enough space, it deletes a file to place replica c. Similarly, the same situation occurs if b is requested at S2; this also requires transfers and deletions to fulfill the request. Transfers and deletions increase the implementation cost, delay data request completion, and cause congestion in network traffic. Figure 3b shows the same situation with the meta scheduler. The meta scheduler holds all the information of each domain and decides on job submission accordingly. It replicates the most-accessed data files to the computing sites where the data request rate is highest. In Figure 3b, the meta scheduler submits the jobs to S2 directly, minimizing the implementation cost and ultimately decreasing data access time as well. This shows that adding the scheduler significantly improves the efficiency of accomplishing user requests.
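To put rough numbers on the example, the sketch below assumes a purely illustrative file size of 24 bits (the paper does not specify sizes) and computes the transfer time of file c for each submission choice; it shows why submitting directly to S2 eliminates the transfer cost:

```python
# Purely illustrative: assume file c is 24 bits so the arithmetic is easy.
size_bits = 24
bw = {("S1", "S2"): 8, ("S2", "S3"): 4, ("S1", "S3"): 3}  # bps (Figure 3)

t_at_s1 = size_bits / bw[("S1", "S2")]  # job at CS1: fetch c from S2 -> 3.0 s
t_at_s3 = size_bits / bw[("S2", "S3")]  # job at CS3: fetch c from S2 -> 6.0 s
t_at_s2 = 0.0                           # meta scheduler sends job to S2 -> 0 s
```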

4. Simulation Results and Discussion

The following subsections explain the simulation results and their discussion in detail, covering the simulation setup, the performance evaluation parameters, the results, and the discussion of those results.

4.1. Simulation Setup

To evaluate the effectiveness of the proposed strategy, we carried out a simulation using the GridSim simulator [,]. For simulation in a data grid environment, we considered 25 domains, each with a server, also called the replica server. We took 80 computing sites that were randomly dispersed over the 25 domains. The links between computing-site nodes and server nodes were assigned randomly distributed bandwidths. Within a domain, the bandwidth was set in the range of 15 to 100 Mbps; outside the domain, the bandwidth was normally set to less than 15 Mbps. The number of files varied from 100 to 500, and the file sizes were uniformly distributed from 100 to 400. The primary replicas were assigned randomly to all the nodes.
We compared the performance of the proposed strategy with the contemporary strategies greedy object lowest cost first (GOLCF) and highest opportunity cost first (HOCF), as these strategies are more closely related to our work than the other, newer strategies mentioned in the related work. The performance evaluation parameters used to compare the results include the implementation cost, i.e., the cost associated with implementing the data grid; the geometric mean of turnaround time; the turnaround time; and the average turnaround time.

4.2. Results and Discussion

Simulation results, performance evaluation parameters, and their discussions are presented in this section.

4.2.1. Evaluation of Replica Transfer Scheduling Problem Algorithms

The implementation cost was first compared with the schedules produced by the non-scheduled RTSP algorithms. Figure 4 shows the performance of the schedules created by GOLCF and HOCF from []. The x-axis represents the number of replicas created per object, and the y-axis represents the implementation cost. Suppose that XNEW changes with the number of replicas that need to be created for each object. The storage capacity is assumed to be equal for all the replicas in XNEW.
Figure 4. Cost of replica transfer scheduling problem (RTSP)-based algorithms.

4.2.2. Comparison of Centralized Dynamic Scheduling Strategy-Replica Placement Strategy with Greedy Object Lowest Cost First and Highest Opportunity Cost First

For a performance comparison with CDSS-RPS, we chose the best contemporary algorithms, GOLCF and HOCF; the schedules of the RTSP algorithms were taken as input for the XOLD and XNEW states. Figure 5 presents the implementation cost for CDSS-RPS, GOLCF, and HOCF when the number of replicas per file varies from 5 to 100. The x-axis represents the number of replicas created per object, and the y-axis represents the implementation cost. The implementation cost is based on the number of transfers and deletions that occur in order to fulfill the requests; as the number of transfers increases, the implementation cost increases. Figure 5 shows that the implementation cost of our strategy is lower than that of the other two, making CDSS-RPS a better choice than GOLCF and HOCF. GOLCF makes its decision on the basis of the smallest link cost and gives better results than HOCF, but the files and jobs in GOLCF are not scheduled properly, as it does not consider the most-accessed files and their replication. The HOCF strategy is based on the nearest and second-nearest goal server. CDSS-RPS schedules the most accessed files and then replicates them on the basis of their accesses; this improves the implementation cost significantly, as shown in Figure 5.
Figure 5. Schedule implementation cost w.r.t. CDSS-RPS, greedy object lowest cost first (GOLCF), and highest opportunity cost first (HOCF).
As the number of replicas per object increases, the probability of transfers and deletions also increases. Initially, the difference between GOLCF and CDSS-RPS is very small, because the probability of a transfer action is low. As the number of replicas increases, the number of transfer actions also increases, which degrades the performance of GOLCF and HOCF.

4.2.3. Geometric Mean of Turnaround Time

Figure 6 compares the geometric mean of turnaround time (GMTT) of CDSS-RPS, GOLCF, and HOCF. The x-axis represents the number of jobs, and the y-axis represents the GMTT. A lower GMTT means improved access time and performance from a user perspective. Unlike the arithmetic mean, the GMTT is not dominated by a few jobs with very long turnaround times. CDSS-RPS schedules the files, then replicates them on the servers, and then applies the schedule enhancement operator, which handles the transfer and deletion actions if required. This step-wise mechanism improves the GMTT, as shown in Figure 6 below.
Figure 6. Geometric mean of turnaround time.
The GMTT is calculated as:
$$\mathrm{GMTT} = \left( \prod_{i=1}^{n} a_i \right)^{1/n}$$
where $a_i$ is the turnaround time of job i, and n is the number of jobs in the concerned workload.
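For reference, a direct implementation of the GMTT, computed in log space for numerical stability (the sample values are illustrative):

```python
import math

def gmtt(turnaround_times):
    """Geometric mean of turnaround times, computed in log space to
    avoid overflow for large workloads."""
    n = len(turnaround_times)
    return math.exp(sum(math.log(a) for a in turnaround_times) / n)

print(gmtt([2.0, 8.0]))  # 4.0
```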

4.2.4. Execution Time

The experimental results of the execution time for the number of jobs are shown in Figure 7. The execution time of a job is calculated as the sum of the time used to transfer the required file and the time to process the job. The time required to transfer the file is the most important factor affecting job execution time in a data grid. In Figure 7, the x-axis represents the number of jobs, and the y-axis represents the execution time (in milliseconds). The execution time of CDSS-RPS is compared with that of GOLCF and HOCF.
Figure 7. Execution time for all sets of jobs.
The turnaround time for 100 to 500 jobs is shown in Figure 8 below; the average turnaround time is 1.7.
Figure 8. Turnaround time.
The above results show that CDSS-RPS performs better than GOLCF and HOCF. GOLCF minimizes the implementation cost significantly, but does not schedule the files, which may increase the cost in the long run. CDSS-RPS schedules the files on the basis of their accesses; this minimizes the transfer and deletion probabilities and ultimately improves the implementation cost. CDSS-RPS extends the work of [] and gives better results than the RTSP heuristics.

5. Conclusions and Future Work

In this paper, the centralized dynamic scheduling strategy-replica placement strategy (CDSS-RPS) was proposed. Results were obtained via simulations, with reference to the plain RTSP algorithms. Our results indicate that the proposed CDSS-RPS algorithm provides an improved scheduling and replication mechanism compared to the RTSP heuristics. Although our paper shows significantly improved results, further testing of this strategy is still needed in order to advance our knowledge in this area. The CDSS-RPS strategy can also be studied in terms of fault tolerance. For example, a failure may occur at any computing site, which may increase the implementation cost substantially; we have assumed a scenario where no failures occur. It may also be worthwhile to research variants of the current problem. Investigating faults and applying fault-tolerant strategies may increase the execution time and the implementation cost, but the overall scheduling mechanism would be improved significantly. In the future, we plan to improve the replication strategy through further testing and comparisons with the latest techniques of a similar nature with regard to additional parameters.

Author Contributions

B.N. and F.I. designed the new techniques, performed the simulations, and wrote the results. S.S. and A.T.C. contributed to directing the research and to writing the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Foster, I.; Kesselman, C. (Eds.) The Grid 2: Blueprint for a New Computing Infrastructure; Elsevier: New York, NY, USA, 2003. [Google Scholar]
  2. Jolfaei, F.; Haghighat, A. Improvement of job scheduling and tow level data replication strategies in data grid. Int. J. Mob. Netw. Commun. Telemat. 2012, 2, 21–36. [Google Scholar] [CrossRef]
  3. Balasangameshwara, J.; Raju, N. A hybrid policy for fault tolerant load balancing in grid computing environments. J. Netw. Comput. Appl. 2012, 35, 412–422. [Google Scholar] [CrossRef]
  4. Abdi, S.; Pedram, H.; Mohamadi, S. The Impact of Data Replication on Job Scheduling Performance in Hierarchical data Grid. arXiv, 2010; arXiv:1010.0562. [Google Scholar]
  5. Berriman, G.B.; Deelman, E.; Good, J.C.; Jacob, J.C.; Katz, D.S.; Kesselman, C.; Laity, A.C.; Prince, T.A.; Singh, G.; Su, M.-H. Montage: A grid-enabled engine for delivering custom science-grade mosaics on demand. In Proceedings of the SPIE Astronomical Telescopes and Instrumentation, Glasgow, UK, 21–25 June 2004. [Google Scholar]
  6. Mineter, M.J.; Jarvis, C.H.; Dowers, S. From stand-alone programs towards grid-aware services and components: A case study in agricultural modelling with interpolated climate data. Environ. Model. Softw. 2003, 18, 379–391. [Google Scholar] [CrossRef]
  7. Foster, I.; Iamnitchi, A. On death, taxes, and the convergence of peer-to-peer and grid computing. In Proceedings of the International Workshop on Peer-to-Peer Systems, Berkeley, CA, USA, 21–22 February 2003. [Google Scholar]
  8. Grace, R.K.; Manimegalai, R. Dynamic replica placement and selection strategies in data grids—A comprehensive survey. J. Parallel Distrib. Comput. 2014, 74, 2099–2108. [Google Scholar] [CrossRef]
  9. Folling, A.; Grimme, C.; Lepping, J.; Papaspyrou, A. Robust load delegation in service grid environments. IEEE Trans. Parallel Distrib. Syst. 2010, 21, 1304–1316. [Google Scholar] [CrossRef]
  10. Li, H. Realistic workload modeling and its performance impacts in large-scale escience grids. IEEE Trans. Parallel Distrib. Syst. 2010, 21, 480–493. [Google Scholar] [CrossRef]
  11. Sonmez, O.; Mohamed, H.; Epema, D.H.J. On the benefit of processor coallocation in multicluster grid systems. IEEE Trans. Parallel Distrib. Syst. 2010, 21, 778–789. [Google Scholar] [CrossRef]
  12. Amjad, T.; Sher, M.; Daud, A. A survey of dynamic replication strategies for improving data availability in data grids. Future Gener. Comput. Syst. 2012, 28, 337–349. [Google Scholar] [CrossRef]
  13. Morris, J.H.; Satyanarayanan, M.; Conner, M.H.; Howard, J.H.; Rosenthal, D.S.; Smith, F.D. Andrew: A distributed personal computing environment. Commun. ACM 1986, 29, 184–201. [Google Scholar] [CrossRef]
  14. Balman, M. Failure-Awareness and Dynamic Adaptation in Data Scheduling. Master’s Thesis, Bogazici University, Istanbul, Turkey, 2008. [Google Scholar]
  15. Tang, M.; Lee, B.; Tang, X.; Yeo, C.-K. The impact of data replication on job scheduling performance in the Data Grid. Future Gener. Comput. Syst. 2006, 22, 254–268. [Google Scholar] [CrossRef]
  16. Mansouri, N.; Dastghaibyfard, G.H. Enhanced dynamic hierarchical replication and weighted scheduling strategy in data grid. J. Parallel Distrib. Comput. 2013, 73, 534–543. [Google Scholar] [CrossRef]
  17. Bsoul, M.; Abdallah, A.E.; Almakadmeh, K.; Tahat, N. A round-based data replication strategy. IEEE Trans. Parallel Distrib. Syst. 2016, 27, 31–39. [Google Scholar] [CrossRef]
  18. Nicholson, C.; Cameron, D.G.; Doyle, A.T.; Millar, A.P.; Stockinger, K. Dynamic data replication in lcg 2008. Concurr. Comput. Pract. Exp. 2008, 20, 1259–1271. [Google Scholar] [CrossRef]
  19. Loukopoulos, T.; Tziritas, N.; Lampsas, P.; Lalis, S. Implementing replica placements: Feasibility and cost minimization. In Proceedings of the Parallel and Distributed Processing Symposium, Long Beach, CA, USA, 26–30 March 2007. [Google Scholar]
  20. GridSim. 2010. Available online: http://www.buyya.com/gridsim/ (accessed on 17 May 2018).
  21. Horri, A.; Sepahvand, R.; Dastghaibyfard, G. A hierarchical scheduling and replication strategy. Int. J. Comput. Sci. Netw. Secur. 2008, 8, 30–35. [Google Scholar]
  22. Ranganathan, K.; Foster, I. Simulation studies of computation and data scheduling algorithms for data grids. J. Grid Comput. 2003, 1, 53–62. [Google Scholar] [CrossRef]
  23. Bsoul, M.; Al-Khasawneh, A.; Abdallah, E.E.; Kilani, Y. Enhanced fast spread replication strategy for data grid. J. Netw. Comput. Appl. 2011, 34, 575–580. [Google Scholar] [CrossRef]
  24. Radoslavov, P.; Govindan, R.; Estrin, D. Topology-informed internet replica placement. Comput. Commun. 2002, 25, 384–392. [Google Scholar] [CrossRef]
  25. Tang, X.; Xu, J. On replica placement for QoS-aware content distribution. In Proceedings of the INFOCOM 2004, Twenty-Third Annual Joint Conference of the IEEE Computer and Communications Societies, Hong Kong, China, 6 February 2004; Volume 2. [Google Scholar]
  26. Dowdy, L.W.; Foster, D.V. Comparative models of the file assignment problem. ACM Comput. Surv. 1982, 14, 287–313. [Google Scholar] [CrossRef]
  27. Loukopoulos, T.; Lampsas, P.; Ahmad, I. Continuous replica placement schemes in distributed systems. In Proceedings of the 19th Annual International Conference on Supercomputing, Cambridge, MA, USA, 20–22 June 2005. [Google Scholar]
  28. Maheswaran, M.; Ali, S.; Siegel, H.J.; Hensgen, D.; Freund, R.F. Dynamic mapping of a class of independent tasks onto heterogeneous computing systems. J. Parallel Distrib. Comput. 1999, 59, 107–131. [Google Scholar] [CrossRef]
  29. Hamscher, V.; Schwiegelshohn, U.; Streit, A.; Yahyapour, R. Evaluation of job-scheduling strategies for grid computing. In Proceedings of the Grid Computing—GRID, Bangalore, India, 17–20 December 2000; Springer: Berlin/Heidelberg, Germany, 2000; pp. 191–202. [Google Scholar]
  30. Shan, H.; Oliker, L.; Biswas, R. Job superscheduler architecture and performance in computational grid environments. In Proceedings of the ACM/IEEE conference on Supercomputing, Phoenix, AZ, USA, 15–21 November 2003. [Google Scholar]
  31. Park, S.-M.; Kim, J.H.; Ko, Y.B.; Yoon, W.S. Dynamic data grid replication strategy based on Internet hierarchy. In Proceedings of the International Conference on Grid and Cooperative Computing, Shanghai, China, 7 December 2003; Springer: Berlin/Heidelberg, Germany, 2004; pp. 838–846. [Google Scholar]
  32. Chang, R.-S.; Chen, P.-H. Complete and fragmented replica selection and retrieval in Data Grids. Future Gener. Comput. Syst. 2007, 23, 536–546. [Google Scholar] [CrossRef]
  33. Chang, R.-S.; Chang, J.-S.; Lin, S.-Y. Job scheduling and data replication on data grids. Future Gener. Comput. Syst. 2007, 23, 846–860. [Google Scholar] [CrossRef]
  34. Abdi, S.; Mohamadi, S. The Impact of Data Replication on Job Scheduling Performance in Hierarchical Data Grid. Int. J. Appl. Graph Theory Wirel. Ad Hoc Netw. Sens. Netw. 2010, 2, 15–25. [Google Scholar] [CrossRef]
  35. Abdi, S.; Mohamadi, S. Two level job scheduling and data replication in data grid. Int. J. Grid Comput. Appl. 2010, 1, 23–37. [Google Scholar] [CrossRef]
  36. Sashi, K.; Thanamani, A.S. Dynamic replication in a data grid using a Modified BHR Region Based Algorithm. Future Gener. Comput. Syst. 2011, 27, 202–210. [Google Scholar] [CrossRef]
  37. Andronikou, V.; Mamouras, K.; Tserpes, K.; Kyriazis, D.; Varvarigou, T. Dynamic QoS-aware data replication in grid environments based on data “importance”. Future Gener. Comput. Syst. 2012, 28, 544–553. [Google Scholar] [CrossRef]
  38. Taheri, J.; Lee, Y.C.; Zomaya, A.Y.; Siegel, H.J. A Bee Colony based optimization approach for simultaneous job scheduling and data replication in grid environments. Comput. Oper. Res. 2013, 40, 1564–1578. [Google Scholar] [CrossRef]
  39. Vrbsky, S.V.; Galloway, M.; Carr, R.; Nori, R.; Grubic, D. Decreasing power consumption with energy efficient data aware strategies. Future Gener. Comput. Syst. 2013, 29, 1152–1163. [Google Scholar] [CrossRef]
  40. Almuttairi, R.M.; Wankar, R.; Negi, A.; Rao, C.R.; Agarwal, A.; Buyya, R. A two phased service oriented Broker for replica selection in data grids. Future Gener. Comput. Syst. 2013, 29, 953–972. [Google Scholar] [CrossRef]
  41. Gui, X.; Kui, Y. A Global Dynamic Scheduling with Replica Selection Algorithm Using GridFTP. In Proceedings of the 18th International Conference on Parallel and Distributed Computing and Systems, Bangkok, Thailand, 18–20 December 2006; Volume 1. [Google Scholar]
  42. Li, W.; Yang, Y.; Yuan, D. Ensuring cloud data reliability with minimum replication by proactive replica checking. IEEE Trans. Comput. 2016, 65, 1494–1506. [Google Scholar] [CrossRef]
  43. Souravlas, S.; Sifaleras, A. Binary-tree based estimation of file requests for efficient data replication. IEEE Trans. Parallel Distrib. Syst. 2017, 28, 1839–1852. [Google Scholar] [CrossRef]
  44. Liu, G.; Shen, H.; Chandler, H. Selective data replication for online social networks with distributed datacenters. IEEE Trans. Parallel Distrib. Syst. 2016, 27, 2377–2393. [Google Scholar] [CrossRef]
  45. Michelon, G.A.; Lima, L.A.; de Oliveira, J.A.; Calsavara, A.; de Andrade, G.E. A strategy for data replication in mobile ad hoc networks. In Proceedings of the IEEE 22nd International Symposium on Modelling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), Banff, AB, Canada, 20–22 September 2017; pp. 486–489. [Google Scholar]
  46. Nahir, A.; Orda, A.; Raz, D. Replication-based load balancing. IEEE Trans. Parallel Distrib. Syst. 2016, 27, 494–507. [Google Scholar] [CrossRef]
  47. Souli-Jbali, R.; Hidri, M.S.; Ayed, R.B. Dynamic data replication-driven model in data grids. In Proceedings of the IEEE 39th Annual Computer Software and Applications Conference (COMPSAC), Taichung, Taiwan, 1–5 July 2015; Volume 3, pp. 393–397. [Google Scholar]
  48. Patil, R.; Zawar, M. Improving replication results through directory server data replication. In Proceedings of the International Conference on Trends in Electronics and Informatics (ICEI), Tirunelveli, India, 11–12 May 2017; pp. 677–681. [Google Scholar]
  49. Khalajzadeh, H.; Yuan, D.; Grundy, J.; Yang, Y. Cost-Effective Social Network Data Placement and Replication using Graph-Partitioning. In Proceedings of the IEEE International Conference on Cognitive Computing (ICCC), Honolulu, HI, USA, 25–30 June 2017; pp. 64–71. [Google Scholar]
  50. Usman, A.; Zhang, P.; Theel, O. A Component-Based Highly Available Data Replication Strategy Exploiting Operation Types and Hybrid Communication Mechanisms. In Proceedings of the IEEE International Conference on Services Computing (SCC), Honolulu, HI, USA, 25–30 June 2017; pp. 495–498. [Google Scholar]
  51. Mansouri, N. A threshold-based dynamic data replication and parallel job scheduling strategy to enhance Data Grid. Cluster Comput. 2014, 173, 957–977. [Google Scholar] [CrossRef]
  52. Zeng, L.; Veeravalli, B.; Zomaya, A.Y. An integrated task computation and data management scheduling strategy for workflow applications in cloud environments. J. Netw. Comput. Appl. 2015, 50, 39–48. [Google Scholar] [CrossRef]
  53. Casas, I.; Taheri, J.; Ranjan, R.; Wang, L.; Zomaya, A.Y. A balanced scheduler with data reuse and replication for scientific workflows in cloud computing systems. Future Gener. Comput. Syst. 2017, 74, 168–178. [Google Scholar] [CrossRef]
  54. Spaho, E.; Barolli, L.; Xhafa, F. Data replication strategies in P2P systems: A survey. In Proceedings of the 17th International Conference on Network-Based Information Systems (NBiS), Fiscianoon, Italy, 10–12 September 2014; pp. 302–309. [Google Scholar]
  55. Souravlas, S.; Sifaleras, A. Trends in data replication strategies: A survey. Int. J. Parallel Emerg. Distrib. Syst. 2017, 1–18. [Google Scholar] [CrossRef]
  56. Loukopoulos, T.; Tziritas, N.; Lampsas, P.; Lalis, S. Investigating the Replica Transfer Scheduling Problem. In Proceedings of the 18th International Conference on Parallel and Distributed Computing and Systems (PDCS’06), Bangkok, Thailand, 1–18 August 2018. [Google Scholar]
  57. Buyya, R.; Murshed, M. Gridsim: A toolkit for the modeling and simulation of distributed resource management and scheduling for grid computing. Concurr. Comput. Pract. Exp. 2002, 14, 1175–1220. [Google Scholar] [CrossRef]
