This project demonstrates a basic Machine Learning pipeline for text classification, specifically designed to categorise customer reviews as 'positive' or 'negative'. It showcases a common project structure, automated testing (unit, regression, and integration), and a simple application runner.
Imagine you run an e-commerce platform filled with product reviews. Your TextClassifier is your smart helper, telling positive reviews apart from negative ones. To make sure this helper keeps working as you add new features and make changes, you introduce automated testing into your project.
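To make the scenario concrete, here is a minimal sketch of the kind of interface such a helper might expose. The class name `KeywordClassifier` and its `train` and `predict` methods are hypothetical stand-ins written for illustration; the real class in src/TextClassifier.py may use a different model and different method names.

```python
from collections import Counter


class KeywordClassifier:
    """Hypothetical stand-in for the project's TextClassifier (not its real API)."""

    def __init__(self):
        # One word-frequency table per label, filled in by train().
        self.word_counts = {"positive": Counter(), "negative": Counter()}

    def train(self, texts, labels):
        # Count how often each word appears under each label.
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())

    def predict(self, text):
        # Score each label by summing the training counts of the review's words.
        words = text.lower().split()
        scores = {
            label: sum(counts[word] for word in words)
            for label, counts in self.word_counts.items()
        }
        # On a tie (for example, only unseen words) max() returns the first
        # label in the dict, which is "positive" here.
        return max(scores, key=scores.get)


clf = KeywordClassifier()
clf.train(
    ["great product love it", "terrible quality broke fast"],
    ["positive", "negative"],
)
print(clf.predict("love this great phone"))  # prints "positive"
```

A word-count model like this is far too crude for real reviews, but it is small enough that every test outcome can be reasoned about by hand, which is the property that matters when practising testing.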
- Open the project in your IDE and open the integrated terminal.
- Create a virtual environment and then activate it.
  - Create with venv:

        # Create
        python -m venv venv
        # Activate on Windows
        .\venv\Scripts\activate
        # Activate on macOS/Linux
        source venv/bin/activate
  - Create with conda:

        # Create
        conda create -n text_classifier_env python=3.8  # Or your preferred Python version
        # Activate
        conda activate text_classifier_env
- Install dependencies:

      pip install -r requirements.txt
- Run the application. This demo trains the classifier on the provided CSV data and outputs predictions and evaluation results.

      python app.py
- Run the tests. The -v flag provides verbose output, showing individual test results.

      # Run all tests
      pytest -v
      # Run a specific test file
      pytest -v tests/test_TextClassifier_unit.py
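For readers new to pytest, the sketch below shows roughly how a unit test and a regression test differ. It is not taken from the project's test files: `predict_sentiment` and `NEGATIVE_WORDS` are hypothetical stand-ins defined here so the example runs on its own, whereas the real tests would import the classifier from src/TextClassifier.py.

```python
# Hypothetical stand-in so the sketch is self-contained; not the project's code.
NEGATIVE_WORDS = {"bad", "terrible", "broken", "refund"}


def predict_sentiment(review: str) -> str:
    """Label a review 'negative' if it contains any known negative word."""
    words = set(review.lower().split())
    return "negative" if words & NEGATIVE_WORDS else "positive"


def test_unit_returns_valid_label():
    # Unit test: checks one behaviour of one function in isolation.
    assert predict_sentiment("Works well") in {"positive", "negative"}


def test_regression_known_predictions_unchanged():
    # Regression test: pins outputs that were previously correct, so a
    # later change that alters them makes the test fail.
    expected = {
        "terrible battery life": "negative",
        "fast delivery and great fit": "positive",
    }
    for review, label in expected.items():
        assert predict_sentiment(review) == label
```

pytest collects any function whose name starts with `test_` from files named `test_*.py`, so saving this under tests/ and running `pytest -v` would list each function with its own pass or fail line.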
testing-mini-project/
├── data/
│   └── raw/
│       └── text-label.csv
├── src/
│   └── TextClassifier.py
├── tests/
│   ├── conftest.py
│   ├── test_TextClassifier_integration.py
│   ├── test_TextClassifier_regression.py
│   └── test_TextClassifier_unit.py
├── app.py
├── requirements.txt
└── README.md
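The training data lives in data/raw/text-label.csv. As a rough sketch of how such a file could be read with the standard library, the example below parses an in-memory sample. The column names `text` and `label` are assumptions guessed from the file name, the sample rows are made up for illustration, and `load_reviews` is a hypothetical helper rather than a function from this project.

```python
import csv
import io

# Made-up sample rows; the header names "text" and "label" are assumptions.
SAMPLE_CSV = """text,label
Great value for the price,positive
Stopped working after a week,negative
"""


def load_reviews(handle):
    """Return (texts, labels) read from a CSV file object with text/label columns."""
    texts, labels = [], []
    for row in csv.DictReader(handle):
        texts.append(row["text"].strip())
        labels.append(row["label"].strip().lower())
    return texts, labels


texts, labels = load_reviews(io.StringIO(SAMPLE_CSV))
print(len(texts), sorted(set(labels)))
```

To read the real file instead, pass `open("data/raw/text-label.csv", newline="", encoding="utf-8")` in place of the `io.StringIO` object, adjusting the column names if the actual header differs.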