Optimizing Mail Sorting with Naive Bayes Classifier and Enhanced Feature Extraction Method

Bibliographic details
Published in: SN Computer Science, 2024-09, Vol. 5 (7), p. 914, Article 914
Main authors: Pavithra, C., Saradha, M., Nisha, B. Antline
Format: Article
Language: English
Description
Summary: Email sorting is the process of organizing and categorizing incoming emails so that they can be managed and prioritized efficiently. With suitable sorting techniques, users can quickly identify important messages, reduce clutter, and improve overall productivity. The objective of the research is to implement a Naive Bayes classifier that classifies emails as spam or not spam using conditional probability distributions. In this method, the bag of phrases is frequently used along with the maximum entropy method for text classification. Stop-word removal reduces redundant terms, and each word's frequency is a key factor in training the classifier. The feature sets are then labeled as positive or negative data using the binary values 0 and 1, and their probabilities are calculated with the Naive Bayes classifier: P(y = True | sentence) = 0.0073 and P(y = False | sentence) = 0.0123. The performance of the classifier is then assessed after normalizing these values. We obtained normalized P(y = True | sentence) = 0.848 and normalized P(y = False | sentence) = 0.1511.
ISSN: 2661-8907, 2662-995X
DOI: 10.1007/s42979-024-03178-5
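
The pipeline outlined in the summary (stop-word removal, word-frequency features, binary labels, Naive Bayes scores, then normalization) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the article: the toy training sentences, the stop-word list, the add-one smoothing, the function names, and the choice to normalize by dividing each class score by the sum of both. The summary does not state how its own normalization was carried out, and the maximum entropy method it mentions is not covered.

```python
import math
from collections import Counter

# Hypothetical stop-word list and toy data (1 = spam, 0 = not spam).
# Neither comes from the article; they only make the sketch runnable.
STOP_WORDS = {"a", "an", "the", "is", "to", "for", "of", "and",
              "you", "your", "at", "on"}

TRAIN = [
    ("win a free prize now", 1),
    ("claim your free money today", 1),
    ("free prize waiting for you", 1),
    ("meeting agenda for monday", 0),
    ("please review the project report", 0),
    ("lunch at noon on monday", 0),
]


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def train(examples):
    """Count documents per class and word frequencies per class."""
    class_counts = Counter()
    word_counts = {0: Counter(), 1: Counter()}
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def joint_scores(text, model):
    """Unnormalized P(y) * prod P(word | y) for y in {0, 1}.

    Uses add-one (Laplace) smoothing, an assumption of this sketch,
    and skips words that never appeared in training.
    """
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in (0, 1):
        log_p = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue
            count = word_counts[label][word]
            log_p += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = math.exp(log_p)
    return scores


def normalize(scores):
    """Rescale the two class scores so that they sum to 1."""
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


model = train(TRAIN)
raw = joint_scores("free prize today", model)
posterior = normalize(raw)
print("raw scores:", raw)
print("normalized:", posterior)
```

The raw scores are small because they multiply several per-word probabilities, which is why a normalization step is useful before comparing the two classes. Working in log space inside `joint_scores` avoids numerical underflow on longer messages.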