An M-dimension hybrid algorithm for scientific workflow scheduling in cloud computing

Cloud computing is increasingly used for workflow scheduling, particularly for scientific workflows. With the emergence of cloud computing, users can benefit from virtually unlimited resources with minimal hardware investment. In cloud computing, scheduling the tasks of a Scientific Workflow Application (SWFA) onto the available computing resources while optimizing the execution cost of the SWFA is one of the most challenging functions of a Workflow Management System (WfMS). Several cost-optimization strategies have been proposed to address the economic nature of Scientific Workflow Scheduling (SWFS) in cloud computing. The main aim of this paper is to provide a novel M-dimension hybrid algorithm that applies meta-heuristic algorithms, namely Hybrid Cost-effective Hybrid-Scheduling (HCHS), the Completion Time Driven Hyper-Heuristic (CTDHH), the genetic algorithm (GA), and particle swarm optimization (PSO), together with the heuristic algorithms IC-PCPD2 and IC-Loss. Experimental comparisons show that the proposed technique achieves the best performance across all considered experimental scenarios.


INTRODUCTION
Heterogeneous frameworks have advanced as the worldwide foundation for the future of electronic computing applications that immerse heterogeneous resources. A heterogeneous framework contains grid systems, clusters, cloud systems, and so on. To support complicated scientific experiments, distributed resources such as scientific instruments, applications, and computational devices must be grouped together when executing workflow tasks on heterogeneous systems [1]. Cloud computing is a novel computing paradigm that combines various computers into a large computing system for running several big operations. One of the main topics in cloud computing is task scheduling and the effective use of cloud resources [2].
A workflow is the most frequently applied model for representing applications and has been widely used in scientific computing domains such as astronomy, physics, and bioinformatics. The tasks of scientific workflows are computation-intensive or data-intensive, so they require a high-performance computing environment to provide computing resources. Cloud computing is a large-scale distributed computing platform whose computational resources are accessible on demand for such workflow applications [3]. It is well known that cloud workflow scheduling is an NP-complete problem, and workflow scheduling in a multi-cloud environment is even more complex [4]. This is because, in a multi-cloud environment, services are offered by multiple autonomous cloud IaaS platforms, and computing resources are gathered into one or more compound services. Choosing the optimal combination of services from several IaaS platforms to meet QoS requirements is therefore a challenging issue.
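As a concrete illustration of the workflow model discussed above, a scientific workflow can be represented as a directed acyclic graph (DAG) whose edges carry data dependencies between tasks. The sketch below is only illustrative (task names and data volumes are made up); it encodes such a DAG and derives one dependency-respecting execution order:

```python
from collections import defaultdict, deque

# Toy workflow DAG: tasks are nodes, directed edges (Ti -> Tj) carry the
# data volume transferred from Ti to Tj (values here are arbitrary).
edges = {("T1", "T2"): 10, ("T1", "T3"): 5, ("T2", "T4"): 8, ("T3", "T4"): 2}
tasks = {"T1", "T2", "T3", "T4"}

def topological_order(tasks, edges):
    """Return one valid execution order that respects all dependencies."""
    indeg = {t: 0 for t in tasks}
    succ = defaultdict(list)
    for (u, v) in edges:
        indeg[v] += 1
        succ[u].append(v)
    ready = deque(sorted(t for t in tasks if indeg[t] == 0))
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for v in succ[t]:          # releasing t may make successors ready
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order

print(topological_order(tasks, edges))  # ['T1', 'T2', 'T3', 'T4']
```

Any scheduler, heuristic or meta-heuristic, must respect such an ordering when assigning tasks to VMs.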
Running workflow applications on a heterogeneous system offers several advantages [1]: 1) the ability to build active applications using heterogeneous resources;
2) full use of resources located in a particular area, improving results and running costs;
3) access to the running intervals of different administrative domains for particular processing operations. Different scientific domains apply workflows to analyze large amounts of data and to efficiently execute complicated simulations and experiments. A process can be modeled as a workflow divided into smaller, simpler sub-processes (i.e., tasks). Such tasks can be distributed over several computing resources for more effective and faster execution. The scheduling of workflow tasks on shared platforms has been broadly studied over the years [5].
The cost-optimization challenge of SWFS in cloud computing is a multi-objective cost-aware problem that requires the consideration of three basic characteristics [6]: (1) various users often compete for cloud computing resources while trying to satisfy QoS constraints, (2) there are inter-dependencies between workflow tasks, and (3) these inter-dependencies cause high communication costs. However, taking into account the whole cost-optimization issue associated with these characteristics makes the SWFS process complex and demands a large amount of computational resources in terms of computation time. The present study proposes a novel technique that applies six algorithms simultaneously: four meta-heuristics (CTDHH [7], HCHS [6], PSO, and GA) and two heuristics (IC-PCPD2 and IC-Loss).
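The cost-aware nature of the problem can be made concrete with a toy cost model. The sketch below is only illustrative, not the paper's formulation: it charges compute cost per VM type and a communication cost for data crossing VM boundaries, with all prices, runtimes, and data volumes assumed:

```python
# Illustrative cost model (all numbers are made-up assumptions): the total
# cost of a candidate schedule is the execution cost on the chosen VM types
# plus the communication cost for data sent between tasks on different VMs.
exec_time = {("T1", "small"): 4.0, ("T1", "large"): 2.0,
             ("T2", "small"): 6.0, ("T2", "large"): 3.0}   # hours
price_per_hour = {"small": 0.1, "large": 0.4}              # $/hour
comm_cost_per_gb = 0.05                                    # $/GB

def schedule_cost(assignment, data_out_gb):
    """assignment: task -> VM type; data_out_gb: task -> GB sent to
    successors placed on a different VM (assumed known for this sketch)."""
    compute = sum(exec_time[(t, vm)] * price_per_hour[vm]
                  for t, vm in assignment.items())
    comm = sum(gb * comm_cost_per_gb for gb in data_out_gb.values())
    return compute + comm

cost = schedule_cost({"T1": "large", "T2": "small"}, {"T1": 2.0})
print(round(cost, 2))  # 2.0*0.4 + 6.0*0.1 + 2.0*0.05 = 1.5
```

An optimizer searches over `assignment` while such a cost function, together with deadline and budget constraints, defines the objective.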
This paper is organized as follows: Section 2 describes the related works. Section 3 describes the Hybrid Cost-effective Hybrid-Scheduling (HCHS) strategy, the heuristic algorithms IC-Loss and IC-PCPD2, and the Completion Time Driven Hyper-Heuristic (CTDHH). Section 4 provides the proposed method. Section 5 presents the results and discussion, and finally, Section 6 provides a conclusion.

REVIEW OF PREVIOUS WORKS
In this section, we study the existing work in this area. Most algorithms concentrate on producing near-optimal solutions. Within this group, we recognize two classes of techniques applied by the surveyed algorithms: heuristic and meta-heuristic strategies.

Heuristics based techniques
Generally, a heuristic is a collection of rules that aim to find a solution for a specific problem. These rules are specific to the problem and are designed so that a near-optimal solution can be found within an acceptable time frame. For the scheduling scenario considered here, a heuristic strategy uses information about cloud features and workflow tasks to identify a plan that meets the user's QoS requirements. Heuristics are simpler to implement and more predictable than meta-heuristic-based techniques [5].
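One common rule of this kind, used by deadline-constrained heuristics such as those surveyed below, is to split the user's overall deadline into per-level sub-deadlines in proportion to each level's estimated runtime. This is a hedged sketch of that general idea, not any specific published algorithm; the runtimes are assumptions:

```python
# Hedged illustration of proportional deadline distribution: given estimated
# runtimes of the workflow's levels, assign each level a cumulative
# sub-deadline proportional to its share of the total estimated runtime.
def sub_deadlines(level_runtimes, deadline):
    total = sum(level_runtimes)
    out, elapsed = [], 0.0
    for r in level_runtimes:
        elapsed += deadline * r / total   # this level's proportional share
        out.append(round(elapsed, 2))
    return out

# Three levels estimated at 4h, 2h, 2h under a 16h user deadline.
print(sub_deadlines([4.0, 2.0, 2.0], 16.0))  # [8.0, 12.0, 16.0]
```

A scheduler can then pick, for each level, the cheapest VM type that still meets the level's sub-deadline.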
In [8], the authors address the scientific workflow scheduling problem in dynamically provisioned commercial cloud environments, using workflow scheduling to obtain lower cost with good response times. They define two novel algorithms: Proportional Deadline Constrained (PDC) and Deadline Constrained Critical Path (DCCP). PDC increases parallelism in the workflow by dividing it into logical levels and proportionally sub-distributing the whole workflow deadline over them. The DCCP algorithm is similar to PDC; the basic difference is that it identifies constrained critical paths through the workflow and co-locates the tasks on those paths on the same instance.
In [9], a dynamic cost-minimization and deadline-constrained heuristic, JIT-C, is presented to schedule scientific applications in the cloud. To keep running costs low, resources are provisioned only just before they are required. The deadline is met by continuously monitoring task execution and dynamically making economic scheduling decisions for subsequent tasks so that the deadline constraint is not violated. Simulation tests performed on four popular workflows illustrate that, compared with other recent heuristics (RTC, IC-PCP, RCT), the presented algorithm achieves the highest hit rate in meeting deadlines.
In [10], the authors define a four-layer workflow scheduling system. A new scheduling policy, CWSA, is presented to schedule workflow applications in a multi-tenant cloud computing environment. Various performance metrics were analyzed, and extensive simulations were carried out to assess the performance of the proposed scheduling policy. CWSA was compared with various scheduling policies to highlight the performance and robustness of the presented solution. The obtained results illustrate that CWSA outperforms the other scheduling policies; significantly, CWSA was shown to use computing resources effectively, decreasing the idle time of cloud resource nodes.
In [11], the authors present a unified scheduling approach for workflows in the cloud to address problems associated with time and cost. The presented algorithm reduces the execution time of tasks at various steps by grouping tasks into sets according to their height. The completion time is reduced to the feasible minimum at every height; thus, allocating additional resources takes less time and the overall workflow completion time is decreased. The presented algorithm needs less time to allocate workflow tasks to virtual machines. The makespan of workflows is significantly decreased compared to other existing algorithms, and the total price of task execution is reduced by acquiring extra resources only when required.
In [12], an augmented shuffled frog leaping algorithm (ASFLA)-based technique is proposed for workflow and resource scheduling in an Infrastructure as a Service (IaaS) cloud environment. The results show that ASFLA outperforms other techniques in decreasing the execution cost of all considered workflows.

Meta-Heuristics based techniques
While heuristics are designed to work best on a particular problem, meta-heuristics are general-purpose algorithms designed to solve optimization problems. They are higher-level approaches that use problem-specific heuristics to find a near-optimal solution. Compared with heuristic-based algorithms, meta-heuristic strategies are generally more computationally intensive and take longer to execute; however, they tend to identify better plans because they explore various solutions using a guided search. Applying meta-heuristics to the workflow scheduling problem in clouds raises issues such as modeling a theoretically unbounded number of resources, formulating the problem to prevent invalid solutions (e.g., data dependency violations) so as to simplify convergence, and pruning the search space by applying heuristics based on the cloud resource model [5].
In [13], the authors propose GA-ETI, a planner of scientific applications for cloud systems that optimizes the execution makespan and monetary cost concurrently. GA-ETI applies an enhanced crossover that combines clusters of genes rather than splitting chromosomes randomly; it uses increment/decrement mutations to add or remove virtual machines from a given chromosome. These two modifications reduce the inherent randomness compared with the basic GA. GA-ETI solutions had lower monetary cost and makespan than the solutions produced by HEFT.
In [14], the aim is to reduce workflow makespan and cost under a reliability constraint. Because different cloud platforms have different failure coefficients, it is essential for cloud users to consider the mapping of VM types to tasks for the overall reliability of the workflow. To solve this problem, the presented MOS algorithm, based on particle swarm optimization, considers the location of task execution and the order of task data transmission at the same time. Simulation results, based on real-world scientific workflow structures, illustrate that the MOS algorithm outperforms the RANDOM and CMOHEFT algorithms on all multi-objective performance metrics.
In [15], the authors consider the static scheduling problem for real-time workflows in cloud systems. They try to reduce the execution cost while guaranteeing a makespan constraint. The algorithm can adaptively adjust the computing resources occupied during the planning process in an iterative way: PSO is used in every iteration, and iteration stops when the number of VMs applied becomes fixed. The algorithm removes the dependence on a fixed resource pool. Moreover, a fine-grained billing technique is used, in line with market trends.
In [16], an enhanced version of WFSACO is provided for scheduling workflows using ACO, reducing the makespan and decreasing the complexity of WFSACO. First, tasks in every workflow level are ordered according to task length and number of child tasks. The ordered tasks are then mapped to resources using ACO. To decrease the makespan, pheromone values and heuristic data are updated for every machine according to the feasibility of transition, following the assumptions of the presented strategy. The empirical results of the presented strategy are compared with those of other existing algorithms using additional metrics such as resource cost and planning time.
In [17], a new meta-heuristic technique based on Discrete Binary Cat Swarm Optimization (DBCSO) is offered, which tries to achieve an optimized makespan when running workflow applications in the cloud. Simulation results illustrate that DBCSO provides an optimized makespan and performs better for large numbers of tasks than standard particle swarm optimization (PSO) and binary particle swarm optimization (BPSO).
In [18], the authors present an indicator-based multi-objective gene expression programming algorithm. For every task-resource pair, a degree of superiority is computed by a superiority function, and the pair with the highest superiority is chosen at every stage. To search for the best superiority function, some low-level heuristics are first defined and applied as building blocks to construct the final heuristics. They then combine the indicator-based multi-objective optimization technique with a recently published GP variant (called SL-GEP) to search for several heuristics that offer different trade-offs between workflow execution time and cost.
In [19], the authors present a novel PEFT-based genetic algorithm strategy (PEFTGA) for decreasing execution time. An approach is deployed that allows the GA to concentrate on optimizing the objective of the chromosomes in order to obtain the most suitable mutated offspring. Once a feasible answer is achieved, the GA concentrates on execution-time optimization. The results illustrate that PEFTGA outperforms other schedules of tasks on virtual machines in terms of makespan; the completion time (makespan) of the presented PEFTGA algorithm is decreased by 25% on average compared with the standard GA.
In [20], the authors provide an adaptive privileged multi-objective workflow scheduling algorithm for the cloud computing environment, which optimizes objectives such as cost and makespan. While most existing work uses a single fitness function, they rely on a fitness function that combines total cost and makespan. It is observed that the adaptive privileged multi-objective workflow scheduling algorithm obtains better results than several existing workflow scheduling algorithms. It also balances the load on computing resources by spreading tasks over the available resources.

THE PROPOSED ALGORITHM
The idea provided in [6][7] is to use several meta-heuristic algorithms and to select one of them for execution at every step. One of the issues in [7] is the long running time of the algorithm, caused by a lack of attention to the initial population of the underlying algorithms. To solve this issue, the presented method uses heuristic algorithms to build the initial population of the underlying algorithms, obtaining better results in shorter running times. For this purpose, the IC-PCPD2 and IC-Loss algorithms have been applied to generate the initial population. The general algorithm of the presented technique is as follows:

Algorithm 1: M-dimension Hybrid-Scheduling Algorithm
Input: W=(T,E), where T=⋃_{i=1}^{n} T_i is the set of tasks, E={(T_i, T_j, data_ij) | (T_i, T_j) ∈ T × T} is the set of directed edges among tasks, H is the set of meta-heuristic algorithms {CTDHH, HCHS, GA, PSO}, c is the user budget for w, and d is the user deadline.
Output: The most optimal solution for the cost optimization of w.
1: α ← TimeOfFS(w) // α is the fastest schedule time of w
2: for each a_n, n ∈ {1,…,m} do // initialize the m meta-heuristic algorithms
3:   a_n ← ∅
4: end for
5: for each a_n, n ∈ {1,…,m} do // compute results for each meta-heuristic algorithm
6:   for i ← 1 to 5 do // compute the algorithms for different deadline factors
7:     insert a_n(w,c,d) into a_n
8:     α ← α*i
9:   end for
10: end for
11: for 1 to 5 do
12:   Run h with initial population a_n, n ∈ {1,…,m}, ∀h ∈ H
13: end for
14: compute DHHA(w)

The presented algorithm needs two basic inputs to work correctly: (i) W=(T,E), where T (vertices) is the collection of tasks and E (edges) is the collection of directed edges among tasks, and (ii) H, the collection of meta-heuristic algorithms: GA, CTDHH, HCHS, and particle swarm optimization (PSO). As illustrated in Algorithm 1, the first basic step of the presented technique is to execute each of the four meta-heuristic algorithms (PSO, CTDHH, HCHS, GA). In other words, the presented algorithm executes every developed algorithm (h) in (H) to schedule the submitted workflow tasks on the accessible VMs, four times for every specific scenario, where (h) denotes a Low Level Heuristic (LLH) that is a member of the collection (H).
First, the initial population is generated using two algorithms: IC-PCPD2 (with the user deadline limit) and IC-Loss (with the user budget limit). By generating the initial population this way, the running times of the algorithms in the presented method are decreased.
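The seeding step can be sketched as follows. This is only an illustration of the idea under assumed details (the function names, the perturbation scheme, and the stand-in heuristic outputs are all hypothetical, not the paper's implementation): the two heuristic schedules enter the population directly, and the rest of the population is filled with small perturbations of them, so the meta-heuristics start near feasible, low-cost regions of the search space.

```python
import random

# Sketch of seeding a meta-heuristic's initial population from heuristic
# schedules (details assumed): keep the heuristic schedules as-is, then fill
# the remaining slots with single-assignment mutations of them.
def seed_population(heuristic_schedules, tasks, vms, size, rng=None):
    rng = rng or random.Random(0)
    population = list(heuristic_schedules)       # heuristic seeds go in first
    while len(population) < size:
        base = dict(rng.choice(heuristic_schedules))
        t = rng.choice(tasks)                    # perturb one task assignment
        base[t] = rng.choice(vms)
        population.append(base)
    return population

ic_pcpd2 = {"T1": "vm1", "T2": "vm1"}   # stand-in for an IC-PCPD2 schedule
ic_loss  = {"T1": "vm2", "T2": "vm1"}   # stand-in for an IC-Loss schedule
pop = seed_population([ic_pcpd2, ic_loss], ["T1", "T2"], ["vm1", "vm2"], 6)
print(len(pop))  # 6
```

Each population member is a complete task-to-VM mapping, so every meta-heuristic (GA, PSO, CTDHH, HCHS) can start from it directly.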

Using IC-PCPD2 algorithm for initial population
IC-PCPD2 [21] is a two-step algorithm that has the same structure as the PCP algorithm, with two basic differences. First, the three path-determining policies have been replaced with a single novel policy to adapt to the new cost-pricing model. Second, the scheduling step is changed so that, when planning a task, it first attempts to use the remaining time slots of existing computation service instances; if this fails, it launches a new instance to run the task before its sub-deadline. In other words, IC-PCP is a one-step algorithm that applies an approach similar to the PCP deadline distribution step; however, instead of assigning sub-deadlines to the tasks of a partial critical path, it attempts to schedule them by identifying a (new or current) computation service instance that can run the whole path before its latest finish time.

Using IC-Loss algorithm for initial population
IC-Loss [22] adapts the Loss rescheduling algorithm to the cloud environment. Since, in IaaS clouds, rescheduling an individual task on a cheaper machine may raise the total execution cost, the rescheduling step of the Loss algorithm must be adapted: IC-Loss attempts to reschedule all tasks of an instance to a cheaper existing or new instance, in order to decrease the total execution cost. Because IC-Loss produces an optimal schedule with respect to the budget provided by the user, it has been applied to generate the initial population.
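The core decision inside such a rescheduling step can be illustrated with a toy check, which is a hedged simplification and not the published IC-Loss algorithm: move a task to a cheaper instance type only if its slowed-down runtime still fits the deadline. All prices and slowdown factors below are assumptions.

```python
# Hedged sketch of the "cheaper only if still on time" decision: each option
# is (hourly_price, runtime_factor); a cheaper instance type runs the task
# more slowly by runtime_factor. Pick the cheapest feasible option.
def try_cheaper(task_runtime, deadline, options):
    feasible = [(p, f) for (p, f) in options if task_runtime * f <= deadline]
    return min(feasible)[0] if feasible else None   # cheapest feasible price

# Runtime 4h, deadline 10h: the 2x-slower option (8h) fits,
# the 3x-slower one (12h) does not, so the $0.2/h type is chosen.
print(try_cheaper(4.0, 10.0, [(0.4, 1.0), (0.2, 2.0), (0.1, 3.0)]))  # 0.2
```

The real algorithm applies this reasoning at the level of whole instances and their task sets rather than single tasks.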

Completion time driven hyper-heuristic (CTDHH) algorithm
CTDHH [7] was proposed for the SWFS cost-optimization problem in the cloud. This algorithm is considered a newly developed method that is able to accelerate the execution time of meta-heuristic algorithms. CTDHH applies a High Level Heuristic (HLH) approach using four popular population-based meta-heuristic algorithms that act as Low Level Heuristics (LLH). The basic aim of the HLH approach is to intelligently lead the search process according to the performance of the employed meta-heuristic LLH algorithms. The performance of the CTDHH strategy has been widely assessed in comparison with four population-based strategies (GA, PSO, HIWO, IWO) and the existing hyper-heuristic strategy known as the Hyper-Heuristic Scheduling Algorithm (HHSA). Based on the lowest obtained completion time, the strategy actively leads the search process to identify the optimum solution by continuously recording the computed time scores (i.e., the completion times of the last runs) of all LLH algorithms for every considered scenario and after each execution. Accordingly, the completion-time-based hyper-heuristic becomes more efficient, since it can reuse and exploit the strengths of the best-performing LLH algorithms when searching for the optimal solution of the targeted cost-optimization problem. In terms of total computational cost, this strategy obtained the cheapest result compared with the baseline strategies. These results are influenced by the size and type of the SWFA: complicated and large submitted SWFA workflows ultimately make the SWFS strategies take longer to execute their tasks, whereas Montage SWFA tasks, for example, have fewer precedence constraints. Thus, the CTDHH strategy obtained the most optimal results for most of the SWFA datasets and for most considered scenarios in comparison with the baseline and HHSA strategies. In our presented technique, we applied this algorithm to build a suitable initial population.
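The selection logic described above, in which the HLH records each LLH's last completion time and runs the current best one next, can be sketched minimally. This is an illustration of the hyper-heuristic selection idea under assumed details (the score values are made up; the real CTDHH bookkeeping is richer):

```python
# Minimal sketch of completion-time-driven LLH selection: keep the last
# observed completion time (makespan) per low-level meta-heuristic and
# always dispatch the one with the lowest score.
scores = {"GA": 120.0, "PSO": 95.0, "HIWO": 110.0, "IWO": 130.0}  # assumed

def pick_llh(scores):
    return min(scores, key=scores.get)       # lowest completion time wins

def update(scores, name, completion_time):
    scores[name] = completion_time           # record the latest run's result

print(pick_llh(scores))  # PSO
update(scores, "GA", 80.0)                   # GA improves on its next run
print(pick_llh(scores))  # GA
```

Because scores are refreshed after every execution, the selection adapts if a different LLH starts performing best on the current workflow.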

Hybrid Cost-effective Hybrid-Scheduling (HCHS) algorithm
The Hybrid Cost-effective Hybrid-Scheduling (HCHS) strategy [6] is designed for service providers that lease VM instances from cloud providers to process hybrid workloads, while meeting the required response times of interactive services and the deadlines of batch jobs. In [6], heuristic algorithms are likewise used to build the initial population: the IC-PCP and IC-Loss algorithms are applied to generate the initial population of a meta-heuristic algorithm such as the Completion Time Driven Hyper-Heuristic (CTDHH), obtaining better results in shorter running times.

Genetic Algorithm (GA)
The GA strategy [23] is one of the basic approaches of the Evolutionary Computation domain. It evolved from population genetics for solving complicated optimization problems, and it uses the structure of Mendel's laws of genetics to organize chromosomes, genes, and alleles. It includes two operations, crossover and mutation, to generate improved solutions. A population member has two basic representations: genotype and phenotype. Carrying out these operations iteratively results in an adaptively fit solution. The aim of GA is to increase the payoff of candidate solutions in the population against a cost function from the problem domain. The GA approach repeatedly applies surrogates of the genetic mechanisms of mutation and recombination to a population of candidate solutions, where the cost function (also called the objective or fitness function), applied to each decoded candidate representation, probabilistically governs the contribution a given candidate solution makes to the creation of subsequent candidate solutions.
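The crossover/mutation loop described above can be sketched for a task-to-VM mapping. This is a generic, minimal GA, not the paper's configuration; the chromosome encoding (task index to VM index), the elitist selection, and the toy fitness are all illustrative assumptions:

```python
import random

# Minimal GA sketch: a chromosome maps each task index to a VM index.
def crossover(a, b, rng):
    cut = rng.randrange(1, len(a))      # single-point crossover
    return a[:cut] + b[cut:]

def mutate(chrom, n_vms, rng):
    c = list(chrom)
    c[rng.randrange(len(c))] = rng.randrange(n_vms)  # reassign one task
    return c

def evolve(pop, fitness, n_vms, generations=30, rng=None):
    rng = rng or random.Random(1)
    for _ in range(generations):
        pop.sort(key=fitness)                        # lower fitness = better
        elite = pop[: len(pop) // 2]                 # keep the better half
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite),
                                     rng), n_vms, rng)
                    for _ in range(len(pop) - len(elite))]
        pop = elite + children
    return min(pop, key=fitness)

rng = random.Random(0)
pop = [[rng.randrange(3) for _ in range(5)] for _ in range(8)]  # 8 random maps
best = evolve(pop, fitness=sum, n_vms=3)   # toy fitness: prefer low VM indices
print(best)
```

Elitism guarantees the best fitness never worsens between generations; in the proposed method the random initial population above would be replaced by heuristic seeds.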

Particle Swarm Optimisation (PSO) algorithm
The PSO strategy [24] is one of the basic approaches of the swarm intelligence field. It is based on the social foraging behavior of animals, for instance bird flocking and fish schooling. To find better solutions, two basic notions are used: (i) the global best position and (ii) the local (personal) best position. The swarm maintains a full history of positions for future reference. PSO aims to identify the optimal solution using the position information of the whole swarm in a multidimensional hyper-volume. The search starts by randomly assigning positions and velocities to all particles, and proceeds by incrementing each particle's position according to its velocity. The key idea of PSO is to search the problem space by updating, after each position change, the globally best position known to the particles. Accordingly, the particles converge together toward improved solutions, combining exploration and exploitation of better positions in the search space.
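The global-best and personal-best updates described above follow the standard PSO velocity rule. The sketch below is the textbook continuous variant, not the paper's scheduling-specific encoding; the inertia and acceleration coefficients are common default assumptions:

```python
import random

# Textbook PSO sketch: velocities pull each particle toward its own best
# position (pbest) and the swarm's best position (gbest).
def pso(f, dim, n=10, iters=50, w=0.7, c1=1.5, c2=1.5, rng=None):
    rng = rng or random.Random(0)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]
    gbest = min(pbest, key=f)[:]
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if f(pos[i]) < f(pbest[i]):      # improve personal best
                pbest[i] = pos[i][:]
                if f(pbest[i]) < f(gbest):   # improve global best
                    gbest = pbest[i][:]
    return gbest

# Minimize the sphere function x^2 + y^2; the swarm should approach (0, 0).
best = pso(lambda x: sum(v * v for v in x), dim=2)
print(best)
```

For workflow scheduling, positions are typically mapped to discrete task-to-VM assignments (e.g., by rounding each coordinate to a VM index).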

EVALUATION OF RESULTS
To evaluate the proposed method, it has been compared with the algorithms of [6][7], because these scheduling algorithms share the same criteria with the proposed algorithm. For these comparisons, five workflows were tested: Montage, Inspiral, Epigenomics, SIPHT, and CyberShake, with the classes available in Table 1. The experiments show that, for a given workflow, the better the IC-Loss and IC-PCPD2 algorithms perform in producing the initial population, the better the proposed method performs relative to the base algorithms. For cloud workflow scheduling, our proposed evolutionary algorithm is more stable and appears more likely to produce acceptable schedules. The average completion time of the presented technique is better than the average completion times of the CTDHH and HCHS strategies for all workflows. This is due to the heuristic algorithms (IC-PCPD2, IC-Loss) applied to generate the initial population of the algorithms (PSO, GA, CTDHH, HCHS) used to identify the most optimal solution for workflow cost optimization. As is evident in Table 6, the results of the proposed algorithm are better than those of the other algorithms, and the running time is decreased for every workflow.

CONCLUSION
One scheduling approach is to rely on the quality of service (QoS) desired by the user. QoS-based workflow scheduling algorithms usually consider a number of service quality parameters, such as cost parameters and task completion deadlines (and, in general, parameters in the service level agreement). In this paper, a combination of meta-heuristic and heuristic algorithms is presented for scheduling workflows in a cloud environment. Heuristic algorithms (IC-PCPD2, IC-Loss) were used to create the initial input and population of the evolutionary algorithms PSO, GA, CTDHH, and HCHS. According to the results, the proposed method is in many cases better than the initial algorithms. In the future, we plan to extend the proposed method to multiple workflows and multi-cloud environments.

FIGURE 5. Comparison of average completion time of the proposed algorithm with other algorithms (SIPHT workflow)