ikram98ai/fakenews_detection_bert_aws_pipeline

🚀 Building a SageMaker Pipeline to Train & Deploy a RoBERTa Fake News Detection Model

A fully automated AWS SageMaker Pipeline that ingests a raw “fake news” dataset, cleans and balances it, trains a RoBERTa classifier, and evaluates its performance. If the model meets your quality gates, the pipeline packages and registers it for deployment, which happens only after human approval.
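
The quality-gate behaviour described above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the metric name (`accuracy`), the threshold value, and the `decide_registration` helper are assumptions made for the example. In a SageMaker Pipeline the same decision would typically be expressed as a condition step that reads the evaluation report. `PendingManualApproval` is the Model Registry status that waits for a human reviewer.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Assumed values: the README does not say which metric or threshold gates registration.
MIN_ACCURACY = 0.90
# Model Registry status for a model version that still needs human sign-off.
PENDING_APPROVAL = "PendingManualApproval"


@dataclass
class RegistrationDecision:
    register: bool
    approval_status: Optional[str]
    reason: str


def decide_registration(
    metrics: Dict[str, float], min_accuracy: float = MIN_ACCURACY
) -> RegistrationDecision:
    """Return whether an evaluated model should be registered (hypothetical helper)."""
    accuracy = metrics.get("accuracy")
    if accuracy is None:
        # No metric means the gate cannot be evaluated, so do not register.
        return RegistrationDecision(False, None, "no accuracy metric reported")
    if accuracy >= min_accuracy:
        # Passing the gate only registers the model; deployment waits for approval.
        return RegistrationDecision(True, PENDING_APPROVAL, "accuracy meets threshold")
    return RegistrationDecision(False, None, "accuracy below threshold")


if __name__ == "__main__":
    print(decide_registration({"accuracy": 0.93}))
    print(decide_registration({"accuracy": 0.71}))
```

Keeping the decision in one small function makes the gate easy to unit-test before wiring it into the pipeline definition.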

🏗️ Architecture

Pipeline Architecture Diagram

  1. Data Registration & Understanding
  2. Pipeline Definition
  3. Processing (clean, balance, transform, split)
  4. Training (train on train+validation)
  5. Evaluation (test the trained model's performance on the test dataset)
  6. Conditional Model Registration
  7. Human approval and SageMaker endpoint deployment
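
As a rough illustration of the processing step (clean, balance, split), the sketch below uses only the standard library. It assumes rows of `(text, label)` with label 1 for fake and 0 for real, balances by downsampling to the smallest class, and uses an 80/10/10 split; none of these choices are stated in the README, and the function names are hypothetical. The transform step (tokenisation for RoBERTa) is left out.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, int]  # (text, label); assumed encoding: 1 = fake, 0 = real


def clean(rows: List[Row]) -> List[Row]:
    """Normalise whitespace, then drop empty texts and exact duplicate texts."""
    seen = set()
    cleaned = []
    for text, label in rows:
        text = " ".join(text.split())
        if not text or text in seen:
            continue
        seen.add(text)
        cleaned.append((text, label))
    return cleaned


def balance(rows: List[Row], rng: random.Random) -> List[Row]:
    """Downsample every class to the size of the smallest class, then shuffle."""
    by_label: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    if not by_label:
        return []
    smallest = min(len(group) for group in by_label.values())
    balanced: List[Row] = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], smallest))
    rng.shuffle(balanced)
    return balanced


def split(rows: List[Row], train_frac: float = 0.8, val_frac: float = 0.1):
    """Split already-shuffled rows into train, validation, and test lists."""
    n_train = int(len(rows) * train_frac)
    n_val = int(len(rows) * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]
    return train, val, test


# Toy data: 30 fake rows, 10 real rows, one empty row, one duplicate.
raw = [(f"fake story {i}", 1) for i in range(30)]
raw += [(f"real   story {i}", 0) for i in range(10)]
raw += [("", 0), ("fake story 0", 1)]

balanced = balance(clean(raw), random.Random(42))
train, val, test = split(balanced)
print(len(train), len(val), len(test))
```

Downsampling is only one way to balance classes; oversampling or class weights are alternatives when discarding data is too costly.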

⚙️ Prerequisites

  • Python 3.8 or above
  • AWS account with permissions for SageMaker, S3, IAM, CloudWatch
  • AWS CLI v2 configured
  • Python packages: boto3, sagemaker, protobuf, transformers, pandas
pip install boto3 sagemaker protobuf transformers pandas
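
To confirm the prerequisites before running anything, a small standard-library check such as the following may help. It is a sketch, not part of the repository: `python_ok` and `missing_packages` are hypothetical helpers, and the import name `google.protobuf` is used for the `protobuf` distribution.

```python
import sys
from importlib.util import find_spec
from typing import List, Sequence, Tuple

# Import names for the packages installed by the pip command above.
REQUIRED = ["boto3", "sagemaker", "google.protobuf", "transformers", "pandas"]


def python_ok(version: Sequence[int] = sys.version_info,
              minimum: Tuple[int, int] = (3, 8)) -> bool:
    """True if the interpreter version is at least `minimum` (major, minor)."""
    return tuple(version[:2]) >= minimum


def missing_packages(names: List[str]) -> List[str]:
    """Return the names that cannot be located without fully importing them."""
    missing = []
    for name in names:
        try:
            found = find_spec(name) is not None
        except (ModuleNotFoundError, ValueError):
            # Raised when a parent package (e.g. `google`) is absent or malformed.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python 3.8+:", python_ok())
    print("Missing packages:", missing_packages(REQUIRED))
```

This only checks that the packages can be found; it does not verify AWS credentials or IAM permissions.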

About

End-to-end ML pipeline with SageMaker for detecting fake news with BERT