Optimizing Mail Sorting with Naive Bayes Classifier and Enhanced Feature Extraction Method

Bibliographic details
Published in:	SN Computer Science, 2024-09, Vol. 5 (7), p. 914, Article 914
Main authors:	Pavithra, C., Saradha, M., Nisha, B. Antline
Format:	Article
Language:	English
Subjects:	Accuracy; Advances in Computational Approaches for Image Processing; Algorithms; Artificial intelligence; Classification; Cloud Applications and Network Security; Clutter; Computer Imaging; Computer Science; Computer Systems Organization and Communication Networks; Conditional probability; Data Structures and Information Theory; Datasets; Electronic mail systems; Entropy; Feature extraction; Information Systems and Communication Service; Language; Literature reviews; Machine learning; Maximum entropy method; Natural language; Original Research; Pattern Recognition and Graphics; Probability; Sentences; Software Engineering/Programming and Operating Systems; Spamming; Support vector machines; Text categorization; Vision; Wireless Networks
Online access:	Full text

Description
Abstract:	Email sorting is the process of organizing and categorizing incoming emails so that they can be managed and prioritized efficiently. With suitable sorting techniques, users can quickly identify important messages, reduce clutter, and improve overall productivity. The objective of this research is to implement a Naive Bayes classifier that labels emails as spam or not spam on the basis of conditional probability distributions. In this method, the bag of phrases is frequently used along with the maximum entropy method for text classification. Stop-word removal reduces redundant terms, and the frequency of each word is a key factor in training the classifier. The feature sets are then divided into positive and negative data using the binary values 0 and 1, and the corresponding probabilities are calculated with the Naive Bayes classifier: P(y = True | sentence) = 0.0073 and P(y = False | sentence) = 0.0123. The performance of the classifier is assessed after normalizing these values. We obtained normalized P(y = True | sentence) = 0.848 and normalized P(y = False | sentence) = 0.1511.
ISSN:	2661-8907, 2662-995X
DOI:	10.1007/s42979-024-03178-5
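
The steps summarized in the abstract (stop-word removal, word-frequency features, binary labels 0 and 1, Naive Bayes scoring, then normalization of the two class scores) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy messages, the stop-word list, the function names, and the add-one (Laplace) smoothing are all hypothetical, and single-word counts stand in for the bag of phrases and maximum entropy components, which are not reproduced here.

```python
import math
from collections import Counter

# Hypothetical stop-word list and toy training data, invented for illustration.
STOP_WORDS = {"the", "a", "is", "to", "for", "your", "at", "on", "you", "now"}

TRAIN = [
    ("win a free prize now", 1),             # 1 = spam
    ("claim your free money now", 1),
    ("free prize waiting for you", 1),
    ("meeting agenda for monday", 0),        # 0 = not spam
    ("please review the project report", 0),
    ("lunch at noon on monday", 0),
]


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def train(examples):
    """Collect per-class message counts and per-class word frequencies."""
    doc_counts = Counter()
    word_counts = {0: Counter(), 1: Counter()}
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return doc_counts, word_counts, vocab


def joint_scores(text, doc_counts, word_counts, vocab):
    """Return the unnormalized score P(y) * prod_w P(w | y) for each class."""
    total_docs = sum(doc_counts.values())
    tokens = tokenize(text)
    scores = {}
    for label in (0, 1):
        total_words = sum(word_counts[label].values())
        # Work in log space to avoid underflow on longer messages.
        log_p = math.log(doc_counts[label] / total_docs)
        for w in tokens:
            # Add-one smoothing (an assumption) keeps unseen words from
            # forcing the whole product to zero.
            log_p += math.log(
                (word_counts[label][w] + 1) / (total_words + len(vocab))
            )
        scores[label] = math.exp(log_p)
    return scores


def normalize(scores):
    """Rescale the class scores so that they sum to 1."""
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


# Example usage on a hypothetical incoming message.
model = train(TRAIN)
raw = joint_scores("free prize for you", *model)
posterior = normalize(raw)
predicted = max(posterior, key=posterior.get)
print(raw, posterior, predicted)
```

The raw scores are typically very small because they are products of many probabilities, which is why a normalization step is useful before comparing or reporting them: after rescaling, the two values sum to 1 and can be read as the relative support for each class.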