High Performance Computing for Big Data in Distributed Systems

Distributed Processing System, 2020, 1(2); doi: 10.38007/DPS.2020.010202.

Author(s)

Antonio Cruz Chavez

Corresponding Author:

Antonio Cruz Chavez

Affiliation(s)

Univ Quebec, Ecole Technol Super, Montreal, PQ, Canada

Abstract

With continued progress in high-performance computing technology and its applications, high-performance computers are being used ever more widely. This paper studies the research and application of high-performance computing for big data in distributed systems. It first designs a distributed hybrid storage system based on DRAM and SSD, then implements a phase-consistent strategy to optimize client-side cache performance, and improves read performance through a client-side metadata cache. The simulation results indicate that the designed system can be applied in an actual production environment.
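
As a rough illustration of the two ideas named in the abstract, the sketch below models a small DRAM tier backed by a larger SSD tier, plus a client that caches metadata to avoid repeated lookups. It is not the paper's implementation: the class names `HybridStore` and `MetadataClient`, the LRU demotion policy, the promote-on-read rule, and the use of plain dictionaries to stand in for DRAM, SSD, and the metadata server are all assumptions made for this example.

```python
from collections import OrderedDict


class HybridStore:
    """Toy two-tier store: a small LRU "DRAM" tier over a larger "SSD" tier.

    Both tiers are in-memory dictionaries here (an assumption for the sketch).
    """

    def __init__(self, dram_capacity):
        self.dram_capacity = dram_capacity
        self.dram = OrderedDict()  # hot tier, kept in least-recently-used order
        self.ssd = {}              # stand-in for the slower, larger tier

    def put(self, key, value):
        self.ssd.pop(key, None)    # keep each key in exactly one tier
        self.dram[key] = value
        self.dram.move_to_end(key)
        while len(self.dram) > self.dram_capacity:
            old_key, old_value = self.dram.popitem(last=False)
            self.ssd[old_key] = old_value  # demote the least recently used entry

    def get(self, key):
        if key in self.dram:
            self.dram.move_to_end(key)     # refresh recency on a DRAM hit
            return self.dram[key]
        if key in self.ssd:
            value = self.ssd[key]
            self.put(key, value)           # promote to DRAM on an SSD hit
            return value
        raise KeyError(key)

    def tier_of(self, key):
        if key in self.dram:
            return "dram"
        if key in self.ssd:
            return "ssd"
        return None


class MetadataClient:
    """Client that caches per-key metadata (here, only the value size)."""

    def __init__(self, store):
        self.store = store
        self.meta_cache = {}
        self.remote_lookups = 0            # counts simulated server round trips

    def stat(self, key):
        if key not in self.meta_cache:
            self.remote_lookups += 1
            self.meta_cache[key] = {"size": len(self.store.get(key))}
        return self.meta_cache[key]

    def write(self, key, value):
        self.store.put(key, value)
        self.meta_cache.pop(key, None)     # drop metadata that is now stale
```

With `dram_capacity=2`, writing three keys demotes the oldest one to the SSD tier, and reading it back promotes it again. Repeated `stat` calls for the same key cost only one simulated lookup until the key is rewritten.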

Keywords

Big Data, High Performance Computing, Distributed Systems, Meta Data

Cite This Paper

Antonio Cruz Chavez. High Performance Computing for Big Data in Distributed Systems. Distributed Processing System (2020), Vol. 1, Issue 2: 10-17. https://doi.org/10.38007/DPS.2020.010202.
