Machine Learning Theory and Practice, 2020, 1(1); doi: 10.38007/ML.2020.010103.

Machine Learning Based News Text Classification

Author(s)

Tian Zhang

Corresponding Author

Tian Zhang

Affiliation(s)

Changchun Normal University, Changchun 130032, China

Abstract

With the rapid development of the Internet, the explosive growth of news data and the lack of effective management are becoming serious problems, and readers find it increasingly difficult to obtain valuable information quickly. Quickly finding valuable information in a large volume of news text is therefore a meaningful task for text classification. Existing methods still have shortcomings: they often combine the headline directly with the body text, which neglects the importance of the headline, and they rely on a single classification model, which leads to poor classification results. For this reason, the main objective of this paper is to investigate machine learning-based classification of news texts. The paper examines the current state of deep learning-based text classification and, taking the characteristics of news texts into account, chooses a machine learning-based text classification method to study news texts further. Tests of the system's response time and of the accuracy of its core algorithm indicate that the system performs well and meets its practical performance requirements. The proposed news system processes requests quickly and achieves a high accuracy rate, which helps users filter information and improves their news-reading experience.
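The abstract does not describe the paper's actual feature extraction or classifier, so the following is only a minimal sketch of the headline issue it raises, not the author's method. It assumes a bag-of-words representation in which headline tokens are counted `headline_weight` times, fed to a small multinomial Naive Bayes classifier. The names `features`, `NaiveBayes`, `train_and_predict`, the weight values, and the toy articles are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (the toy data has no punctuation)."""
    return text.lower().split()


def features(headline, body, headline_weight):
    """Weighted bag of words: each headline token counts headline_weight times."""
    counts = Counter()
    for tok in tokenize(body):
        counts[tok] += 1.0
    for tok in tokenize(headline):
        counts[tok] += headline_weight
    return counts


class NaiveBayes:
    """Multinomial Naive Bayes with add-alpha smoothing over weighted counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.token_totals = defaultdict(Counter)
        self.vocab = set()
        for feats, label in zip(docs, labels):
            self.token_totals[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.token_totals[label].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(n / n_docs)
            for tok, weight in feats.items():
                if tok in self.vocab:  # ignore tokens never seen in training
                    p = (self.token_totals[label][tok] + self.alpha) / denom
                    score += weight * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus: (headline, body, label).
TRAIN = [
    ("stocks rally", "shares market bank", "finance"),
    ("bank profits rise", "market shares investors", "finance"),
    ("team wins final", "match goal players", "sports"),
    ("players score goal", "team match fans", "sports"),
]

# A sports headline over a body dominated by finance vocabulary.
TEST_HEADLINE = "team wins"
TEST_BODY = "market bank shares market"


def train_and_predict(headline_weight):
    """Train on TRAIN and classify the test article with the given weight."""
    docs = [features(h, b, headline_weight) for h, b, _ in TRAIN]
    labels = [label for _, _, label in TRAIN]
    model = NaiveBayes().fit(docs, labels)
    return model.predict(features(TEST_HEADLINE, TEST_BODY, headline_weight))


print(train_and_predict(1.0))  # headline treated like body text
print(train_and_predict(3.0))  # headline tokens counted three times
```

On this toy data, a weight of 1.0 is equivalent to simply concatenating headline and body, and the finance-heavy body outvotes the sports headline. Raising the weight to 3.0 lets the headline decide the label. A real system would tune the weight on held-out data rather than fix it by hand.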

Keywords

Machine Learning, News Text, Text Classification, Feature Extraction

Cite This Paper

Tian Zhang. Machine Learning Based News Text Classification. Machine Learning Theory and Practice (2020), Vol. 1, Issue 1: 22-32. https://doi.org/10.38007/ML.2020.010103.
