International Journal of World Medicine, 2025, 6(1); doi: 10.38007/IJWM.2025.060102.
Bukun Ren
College of Engineering, University of California, Berkeley, Berkeley, CA 94720, United States
With the wide adoption of Electronic Medical Records (EMR), effectively calculating the similarity between medical records has become a key problem in medical data analysis. Traditional rule-based and statistical similarity methods suffer from insufficient feature extraction and limited semantic understanding, so they struggle to meet the needs of complex medical data. This paper proposes a medical record similarity model based on deep learning and multi-modal summary extraction, combining natural language processing (NLP) and computer vision (CV) to extract deep semantic features from multi-modal data such as text and images. For text, BERT (Bidirectional Encoder Representations from Transformers) extracts features and a Pointer Network extracts the summary. For images, a convolutional neural network (CNN) such as ResNet extracts features, and an attention mechanism optimizes summary generation. A Transformer fuses the cross-modal features, and a Siamese network computes the similarity between medical records. Experimental results show that the proposed method outperforms traditional methods in accuracy, F1-score, and AUC, supporting its effectiveness for EMR similarity calculation. Finally, we implement a complete medical record similarity computing system based on Django and PyTorch, which provides technical support for medical big data analysis and intelligent diagnosis support.
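The Siamese comparison step described above can be illustrated with a minimal sketch. It is not the paper's implementation. Plain Python lists stand in for PyTorch tensors, a toy linear-plus-tanh encoder stands in for the learned network, and simple concatenation stands in for the Transformer-based fusion. All function names (`make_shared_encoder`, `fuse`, `record_similarity`) and all feature values are hypothetical. The only property the sketch aims to show is that both records pass through the same encoder weights before their embeddings are compared.

```python
import math
import random


def make_shared_encoder(in_dim, out_dim, seed=0):
    """Build a toy encoder (linear map + tanh) with fixed random weights.

    Assumption: this stands in for the trained Siamese branch; the same
    weights are applied to both records being compared.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def encode(features):
        if len(features) != in_dim:
            raise ValueError("expected %d features, got %d"
                             % (in_dim, len(features)))
        return [math.tanh(sum(w * x for w, x in zip(row, features)))
                for row in weights]

    return encode


def fuse(text_vec, image_vec):
    """Concatenate modality features (placeholder for Transformer fusion)."""
    return list(text_vec) + list(image_vec)


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def record_similarity(encode, record_a, record_b):
    """Encode both records with the SAME encoder, then compare embeddings."""
    emb_a = encode(fuse(*record_a))
    emb_b = encode(fuse(*record_b))
    return cosine_similarity(emb_a, emb_b)


# Hypothetical records: (text summary features, image summary features).
encode = make_shared_encoder(in_dim=5, out_dim=4)
rec_a = ([0.2, 0.7, 0.1], [0.9, 0.3])
rec_b = ([0.2, 0.7, 0.1], [0.9, 0.3])    # identical to rec_a
rec_c = ([-0.6, 0.1, 0.8], [0.0, -0.5])  # a different record

print("a vs b:", record_similarity(encode, rec_a, rec_b))
print("a vs c:", record_similarity(encode, rec_a, rec_c))
```

Because the two branches share weights, identical inputs map to identical embeddings and the score is symmetric in its arguments. In a trained system the encoder weights would be learned from labelled record pairs, not drawn at random.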
Electronic medical record, Multimodal summary, Deep learning, Similarity measure, Natural language processing
Bukun Ren. Calculation Model and System Implementation of Similarity Measurement of Multiple Electronic Medical Records Based on Deep Learning and Multi-Modal Abstract Extraction. International Journal of World Medicine (2025), Vol. 6, Issue 1: 9-18. https://doi.org/10.38007/IJWM.2025.060102.