Email Classification

Email Classifier Multi-agent System

1. Project Overview

A course final project delivering an end-to-end email classification system. It expands a HuggingFace base dataset (jason23322/high-accuracy-email-classifier) with synthetic emails generated via OpenAI API, trains and evaluates models in notebooks, and ships a Streamlit app for interactive use. Deployed demo: https://email-manager.streamlit.app/.

2. Repository Structure

- Data: Combined dataset with synthetic augmentation.
- Notebooks: EDA, preprocessing, training, pipeline export, API call simulation (Classification.ipynb, Final project.ipynb).
- Email pipeline: Production script (email_pipeline.py) and joblib checker (check_joblib.py).
- Deployment: Streamlit app config (.streamlit/) and Vercel deployment files.
- Reports & Slides: LaTeX/Overleaf reports and presentation.
- CI: GitHub workflows for lint/test.

3. Model & Pipeline

- Text cleaning, tokenization, and vectorization for email bodies.
- Supervised classifiers (documented in notebooks) with joblib-exported pipelines.
- Evaluation tracked in notebooks and reports; artifacts stored for reuse.

4. How to Run (Local)

- pip install -r requirements.txt
- Explore/train in notebooks (Classification.ipynb / Final project.ipynb).
- Serve app: streamlit run email_pipeline.py (or follow deployment/ README).
- Verify artifacts: python check_joblib.py.

5. Highlights

- Dataset augmentation via LLM to improve coverage.
- Full transparency: notebooks document each step from data to deployment.
- Deployed Streamlit demo plus reproducible local scripts.

Project Link: https://github.com/naufalad/IS5126-Final-Project ...
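
Appendix: preprocessing sketch

The text cleaning, tokenization, and vectorization step listed under Model & Pipeline is documented in the notebooks rather than here. As a rough illustration only, the sketch below shows what such a step could look like using the Python standard library. The function names (clean_text, tokenize, bag_of_words), the regular expressions, and the placeholder tokens are all hypothetical and are not taken from the project's code; the actual pipeline may use a different vectorizer and different cleaning rules.

```python
import re
from collections import Counter
from typing import List

# Hypothetical patterns: the project's real cleaning rules live in its notebooks.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
TOKEN_RE = re.compile(r"[a-z0-9]+")


def clean_text(body: str) -> str:
    """Lowercase an email body and replace URLs and addresses with placeholders."""
    text = body.lower()
    text = URL_RE.sub(" url ", text)
    text = EMAIL_RE.sub(" emailaddr ", text)
    # Collapse runs of whitespace left behind by the substitutions.
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> List[str]:
    """Split cleaned text into lowercase alphanumeric tokens."""
    return TOKEN_RE.findall(text)


def bag_of_words(body: str) -> Counter:
    """Vectorize an email body as raw token counts (a simple stand-in for TF-IDF)."""
    return Counter(tokenize(clean_text(body)))


if __name__ == "__main__":
    sample = "Visit https://example.com NOW, or mail Bob@Example.org!"
    print(clean_text(sample))
    print(bag_of_words(sample))
```

Replacing URLs and addresses with placeholder tokens keeps the signal that a link or address was present without letting every distinct URL become its own feature. In a trained pipeline, the count-based vectorizer here would typically be swapped for whatever vectorizer the exported joblib artifact contains.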