Machine Learning Theory and Practice, 2025, 5(1); doi: 10.38007/ML.2025.050112.
Xindi Wei
Pepperdine Graziadio Business School, Master of Science in Business Analytics, Malibu, California, 90263, USA
As intelligent technologies advance, machine learning is being applied across a growing range of industries, and data engineering has become a core factor in determining how efficient and practical a model is. The quality, cleaning, storage, and governance of data directly affect the quality and speed of model training. Optimization techniques such as feature extraction, hyperparameter tuning, and regularization continue to improve model performance. Data engineering is not only the foundation of model training but also plays an essential role in the later stages of model deployment, monitoring, and iterative updating. This paper presents the practical effects of machine learning models enhanced through feature engineering, distributed computing, and big data environments, with the aim of promoting the wider adoption and further development of intelligent technology.
Data engineering; Machine learning model; Feature engineering; Hyperparameter tuning
Xindi Wei. Optimization of Machine Learning Models and Application Supported by Data Engineering. Machine Learning Theory and Practice (2025), Vol. 5, Issue 1: 117-124. https://doi.org/10.38007/ML.2025.050112.