Optimizing Mail Sorting with Naive Bayes Classifier and Enhanced Feature Extraction Method
Published in: | SN Computer Science, 2024-09, Vol. 5 (7), p. 914, Article 914 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Email sorting is the process of organizing and categorizing incoming emails so that they can be managed and prioritized efficiently. With suitable sorting techniques, users can quickly identify important messages, reduce clutter, and improve overall productivity. The objective of this research is to implement a Naive Bayes classifier that labels emails as spam or not spam using conditional probability distributions. In this method, the bag of phrases is frequently used along with the maximum entropy method for text classification. Stop-word removal reduces redundant terms, and the frequency of each word is a key factor in training the classifier. The feature sets are then labeled as positive or negative data using the binary values 0 and 1, and the probability of each class is calculated with the Naive Bayes classifier, giving P(y = True | sentence) = 0.0073 and P(y = False | sentence) = 0.0123. The performance of the classifier is assessed after normalizing these values; the reported normalized values are P(y = True | sentence) = 0.848 and P(y = False | sentence) = 0.1511. |
---|---|
ISSN: | 2661-8907, 2662-995X |
DOI: | 10.1007/s42979-024-03178-5 |
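The pipeline outlined in the abstract (stop-word removal, per-class word frequencies, Naive Bayes scoring, then normalization) can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the toy training messages, the Laplace smoothing, and every function name below are assumptions chosen only for illustration, and the numbers it produces are unrelated to the values reported in the article.

```python
from collections import Counter
import math

# Hypothetical stop-word list and toy training data (not from the article).
STOP_WORDS = {"the", "a", "is", "to", "for", "you", "at", "and", "your"}

TRAIN = [
    ("win a free prize now", 1),            # 1 = spam
    ("free money claim your prize", 1),
    ("meeting agenda for monday", 0),       # 0 = not spam
    ("please review the project report", 0),
]


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def train(examples):
    """Count word frequencies per class and the number of messages per class."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def joint_scores(text, model):
    """Unnormalized score P(y) * prod P(word | y) for each class.

    Laplace (add-one) smoothing is an assumption of this sketch.
    Words outside the training vocabulary are ignored.
    """
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in (0, 1):
        log_score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokenize(text):
            if w in vocab:
                log_score += math.log((word_counts[label][w] + 1) / denom)
        scores[label] = math.exp(log_score)
    return scores


def normalize(scores):
    """Rescale the class scores so that they sum to 1."""
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


model = train(TRAIN)
raw = joint_scores("claim your free prize", model)
posterior = normalize(raw)
print("unnormalized:", raw)
print("normalized:  ", posterior)
```

The unnormalized scores are typically very small because they multiply many per-word probabilities; dividing each by their sum turns them into class probabilities that can be compared directly, which is the normalization step the abstract refers to.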