Abstract: Big data frameworks such as Spark and Hadoop are widely adopted to run analytics jobs in both research and industry. The cloud offers affordable compute resources that are easier to manage. Hence, many organizations are moving toward a cloud deployment of their big data computing clusters. However, job scheduling is a complex problem in the presence of various Service Level Agreement (SLA) objectives, such as monetary cost reduction and job performance improvement. Most existing research does not address multiple objectives together and fails to capture the inherent cluster and workload characteristics. In this article, we formulate the job scheduling problem of a cloud-deployed Spark cluster and propose a novel Reinforcement Learning (RL) model to accommodate the SLA objectives. We develop the RL cluster environment and implement Deep Reinforcement Learning (DRL) based schedulers in the TF-Agents framework. The proposed DRL-based scheduling agents work at a fine-grained level to place the executors of jobs while leveraging the pricing model of cloud VM instances. In addition, the DRL-based agents can learn the inherent characteristics of different types of jobs to find a proper placement that reduces both the total cluster VM usage cost and the average job duration. The results show that the proposed DRL-based algorithms can reduce the VM usage cost as far as is feasible.
Keywords: Cloud computing, cost-efficiency, performance improvement, deep reinforcement learning, big data.
DOI: 10.17148/IARJSET.2023.10106