Big data analysis has become a vital tool in many disciplines. Because it is resource-intensive, big data analysis is often performed in cloud computing environments. Cloud computing offers the potential for large-scale parallelism and scalable provisioning. However, determining an optimal deployment can be an expensive operation, so some form of performance prediction prior to deployment would be extremely useful. In this paper we explore the deployment of one such complex problem, the NGS pipeline. We use provenance execution data to populate models simulated in WorkflowSim and CloudSim. This allows us to explore different scenarios for runtime properties.
Llwaah, F., Cała, J., & Thomas, N. (2016). Simulation of runtime performance of big data workflows on the cloud. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9951 LNCS, pp. 141–155). Springer Verlag. https://doi.org/10.1007/978-3-319-46433-6_10