An AI-powered document assistant that lets users upload files, converts them into embeddings, stores them in Pinecone, and answers questions using Retrieval-Augmented Generation (RAG) powered by OpenAI's GPT models.
🚀 Built with React (frontend), Express (backend), OpenAI API, and Pinecone Vector DB using namespaces.
The RAG Document Assistant is a full-stack project that demonstrates how to combine Large Language Models (LLMs) with external knowledge bases using Retrieval-Augmented Generation (RAG).
The goal of this project is to let users chat with their own documents:
- Upload files (PDF, DOCX, TXT)
- Convert them into embeddings using OpenAI
- Store them in Pinecone for efficient semantic search
- Ask questions and get context-aware answers with source references
- Add multiple documents; the assistant identifies the relevant context for each question and retains stored documents for future queries as well.
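Context-aware answers rest on semantic search: the question and the stored chunks are compared as vectors, typically by cosine similarity. A minimal sketch with tiny made-up vectors (real OpenAI embeddings have 1536+ dimensions), not the project's actual code:

```javascript
// Cosine similarity: ~1 means same direction (semantically close), ~0 means unrelated.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings" for illustration only.
const query = [0.1, 0.9, 0.2];
const docA = [0.12, 0.85, 0.25]; // close to the query
const docB = [0.9, 0.05, 0.1];   // unrelated

console.log(cosineSimilarity(query, docA) > cosineSimilarity(query, docB)); // true
```

Pinecone performs this comparison at scale on the server side; the snippet only illustrates the ranking principle.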
- 📂 Upload PDF, DOCX, or TXT documents
- 🔍 Automatic text extraction + smart chunking
- 🧠 Embedding generation with OpenAI API
- 📦 Vector storage in Pinecone
- 💬 Ask questions and get contextual answers
- 📑 Source references for transparency
- 🎨 Clean Material-UI interface
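The "smart chunking" step above can be sketched as an overlapping character splitter, so context that straddles a chunk boundary is not lost. This is an illustrative version, not the project's implementation; the chunk size and overlap defaults are assumptions:

```javascript
// Split extracted text into overlapping chunks (sizes in characters).
// Overlap keeps sentences near a boundary present in two chunks.
function chunkText(text, chunkSize = 1000, overlap = 200) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    start += chunkSize - overlap;
  }
  return chunks;
}

// 2500 characters with 1000-char chunks and 200-char overlap → 3 chunks.
console.log(chunkText('a'.repeat(2500)).length); // 3
```

Each chunk is then embedded individually before being upserted to Pinecone, since embedding models have a bounded input size.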
- Frontend: React, Material-UI, Axios
- Backend: Express, Node.js, Multer
- AI: OpenAI GPT, OpenAI Embeddings
- Vector DB: Pinecone
- Utilities: pdfjs-dist, mammoth (for text extraction)
- Deployment: Docker / Vercel / Render (future)
- git clone https://github.com/<your-username>/RAG-document-assistant.git
- cd RAG-document-assistant
- npm install
- npm run dev
- cd server
- npm install
- npm run dev  # server running on localhost:5000
- Upload a document (PDF/DOCX/TXT).
- Wait for embeddings to be processed and stored in Pinecone.
- Ask questions in the chat interface.
- Get contextual answers with source references.
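Under the hood, the answer step assembles the chunks retrieved from Pinecone into the model prompt. A hedged sketch of that assembly; the prompt wording and the `metadata.source`/`metadata.text` fields are illustrative assumptions, and the actual OpenAI chat call is omitted:

```javascript
// Build a RAG prompt from Pinecone matches. Each match is assumed to
// carry the chunk text and its source file in metadata.
function buildRagPrompt(question, matches) {
  const context = matches
    .map((m, i) => `[${i + 1}] (${m.metadata.source})\n${m.metadata.text}`)
    .join('\n\n');
  return (
    'Answer the question using only the context below. ' +
    'Cite sources by their [number].\n\n' +
    `Context:\n${context}\n\nQuestion: ${question}`
  );
}

const prompt = buildRagPrompt('What is the refund policy?', [
  { metadata: { source: 'policy.pdf', text: 'Refunds are issued within 30 days.' } },
]);
console.log(prompt.includes('[1] (policy.pdf)')); // true
```

Numbering the chunks in the prompt is what lets the model cite `[1]`, `[2]`, … and the UI map those markers back to source references.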