International Journal of Big Data Intelligent Technology, 2026, 7(1); doi: 10.38007/IJBDIT.2026.0701112.
Zhixian Zhang
School of Professional Studies, New York University, New York, 10003, United States of America
Background: Data silos, inconsistent feature definitions, and inadequate software project management continue to limit the scale of enterprise AI projects. Methodology: This paper proposes a lakehouse-native architecture from the perspective of the software development engineer (SDE). It integrates data contracts, policy as code and data, and CI/CD of models, and establishes a multi-objective optimization model for pipeline resource allocation to meet latency and cost requirements. Results: Compared to baseline lakehouse settings, the proposed design reduces the median end-to-end latency of TPC-DS-style analytics workloads and streaming media services by 22.6%, a statistically significant improvement (p < 0.01), with a narrower 95% confidence interval. Conclusions: This approach enhances scalability and reusability for enterprises, while also enabling rule compliance through traceable metadata and legacy systems.
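The abstract does not spell out the multi-objective optimization model, so the following is only a minimal sketch of one common way such a problem can be posed: a weighted-sum scalarization of latency and cost, searched exhaustively over worker counts per pipeline stage. The stage names, work units, prices, the latency and cost formulas, and the `allocate` function are all hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical pipeline stages (assumption, not from the paper):
# (name, work units per batch, cost per worker-hour).
STAGES = [
    ("ingest", 120.0, 0.10),
    ("transform", 300.0, 0.25),
    ("serve", 60.0, 0.40),
]


def latency(workers):
    """Toy latency model: each stage's work divides evenly across its workers."""
    return sum(work / w for (_, work, _), w in zip(STAGES, workers))


def cost(workers):
    """Toy cost model: linear in the number of workers per stage."""
    return sum(price * w for (_, _, price), w in zip(STAGES, workers))


def allocate(latency_slo, budget, weight=0.5, max_workers=8):
    """Pick worker counts that satisfy both limits and minimize a weighted sum.

    weight=1.0 optimizes latency only, weight=0.0 optimizes cost only.
    Returns a tuple of worker counts, or None if no allocation is feasible.
    """
    best, best_score = None, float("inf")
    for workers in product(range(1, max_workers + 1), repeat=len(STAGES)):
        lat, c = latency(workers), cost(workers)
        if lat > latency_slo or c > budget:
            continue  # violates the latency or cost requirement
        # Normalize both objectives by their limits so they are comparable.
        score = weight * lat / latency_slo + (1 - weight) * c / budget
        if score < best_score:
            best, best_score = workers, score
    return best


if __name__ == "__main__":
    plan = allocate(latency_slo=200.0, budget=4.0)
    print(plan, latency(plan), cost(plan))
```

Sweeping `weight` between 0 and 1 traces out different latency and cost trade-offs under the same constraints. A weighted sum is only one possible scalarization, and exhaustive search is practical only for a handful of stages.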
Enterprise artificial intelligence; Data platform; Lakehouse; Data contract; MLOps; Scalability
Zhixian Zhang. Research on the Design of Scalable Enterprise-Level AI Systems Data Platform Architectures from an SDE Perspective. International Journal of Big Data Intelligent Technology (2026), Vol. 7, Issue 1: 96-101. https://doi.org/10.38007/IJBDIT.2026.0701112.