International Journal of Big Data Intelligent Technology, 2025, 6(2); doi: 10.38007/IJBDIT.2025.060215.
Jin Li
Morgan Stanley, 65 Irby Ave NW, Atlanta, GA, 30305, US
With the rapid advancement of big data technology, distributed architectures have become the mainstream industry approach to processing massive amounts of information. At this scale, query efficiency and system performance become the key factors constraining response speed and accuracy. This study analyzes the main performance bottlenecks of distributed data queries: storage response latency, hardware processing capacity limits, and data consistency assurance. It then proposes a targeted improvement for each: optimizing data distribution to reduce distributed storage network latency, regularly upgrading hardware facilities to address processing capacity limits, and adopting distributed locking strategies to maintain data consistency. According to the study, implementing these measures accelerates query response, ensures data accuracy, and reduces hardware costs and maintenance expenses. The results show that these optimization methods can enhance the overall performance of large-scale data processing systems.
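As one illustration of what "optimizing data distribution" can mean in practice, the sketch below uses consistent hashing to map record keys to storage nodes, so that a point query is routed to a single node and adding a node relocates only part of the data. This is an assumption made for illustration, not the method described in the study: the abstract does not name a specific distribution scheme, and the class `HashRing`, the node names, and the `order-*` keys are all hypothetical.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (the built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring mapping record keys to storage nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"order-{i}" for i in range(1000)]
placement = {k: ring.node_for(k) for k in keys}

# Adding a node should relocate only a fraction of the keys.
bigger = HashRing(["node-a", "node-b", "node-c", "node-d"])
moved = [k for k in keys if bigger.node_for(k) != placement[k]]
print(f"keys moved after adding a node: {len(moved)} of {len(keys)}")
```

With a plain `hash(key) % node_count` scheme, changing the node count would remap most keys; the ring keeps existing placements stable, which limits the data movement (and the associated network traffic) when the cluster grows.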
Distributed Data Query, Data Optimization, Large-Scale Data Processing, Performance Improvement, System Bottleneck
Jin Li. The Impact of Distributed Data Query Optimization on Large-Scale Data Processing. International Journal of Big Data Intelligent Technology (2025), Vol. 6, Issue 2: 139-146. https://doi.org/10.38007/IJBDIT.2025.060215.