This project implements a Retrieval-Augmented Generation (RAG) pipeline designed to provide comprehensive and accurate answers by combining information retrieval with large language model (LLM) generation. It allows you to query your extensive document collection and synthesize context-aware responses.
- Document Ingestion & Preparation: Handles `.docx` and `.pdf` files, converting them to `.txt` for consistent processing (a conversion sketch follows this list).
- Enhanced Intelligent Document Chunking:
  - Splits large documents into semantically meaningful chunks (approx. 512 tokens with a 100-token overlap); a sliding-window sketch appears after this list.
  - Utilizes the `intfloat/e5-base` tokenizer for accurate token estimation.
  - Incorporates semantic splitting, category-specific configurations, and enhanced keyword extraction.
- Dense Embedding Generation:
  - Converts each document chunk into a 768-dimensional numerical embedding using `intfloat/e5-base` (embedding sketch below).
  - Performed offline for efficiency.
- Vector Store (FAISS) Creation:
  - Indexes all dense embeddings in a FAISS (`IndexFlatIP`) vector store for extremely fast and accurate similarity search (cosine similarity on normalized embeddings); see the indexing sketch below.
  - Pre-computed offline.
- Dense Retrieval:
  - At runtime, embeds user queries using `intfloat/e5-base`.
  - Performs a rapid similarity search in the FAISS index to retrieve the top `K` (e.g., 4) most relevant document chunks (retrieval sketch below).
- Context Assembly & LLM Generation:
  - Compiles the retrieved chunks as context for the `LLaMA 3.2-1B` LLM.
  - Generates comprehensive, accurate, and detailed responses based only on the provided context (generation sketch below).
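A minimal sketch of the conversion step, assuming the `python-docx` and `pypdf` packages and illustrative `data/raw/` and `data/txt/` directories (the actual ingestion script may differ):

```python
from pathlib import Path

from docx import Document   # python-docx
from pypdf import PdfReader

def convert_to_txt(src: Path, out_dir: Path) -> Path:
    """Convert a single .docx or .pdf file to a plain-text copy."""
    if src.suffix.lower() == ".docx":
        text = "\n".join(p.text for p in Document(str(src)).paragraphs)
    elif src.suffix.lower() == ".pdf":
        text = "\n".join(page.extract_text() or "" for page in PdfReader(src).pages)
    else:
        raise ValueError(f"Unsupported file type: {src.suffix}")
    out_path = out_dir / f"{src.stem}.txt"
    out_path.write_text(text, encoding="utf-8")
    return out_path

# Example: convert every supported document in data/raw/ into data/txt/ (paths are illustrative).
out_dir = Path("data/txt")
out_dir.mkdir(parents=True, exist_ok=True)
for src in Path("data/raw").iterdir():
    if src.suffix.lower() in {".docx", ".pdf"}:
        convert_to_txt(src, out_dir)
```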
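A sliding-window sketch of the chunking step, using the `intfloat/e5-base` tokenizer for token counting; the semantic splitting, category-specific configurations, and keyword extraction mentioned above are omitted here:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-base")

def chunk_text(text: str, max_tokens: int = 512, overlap: int = 100) -> list[str]:
    """Split text into overlapping windows of at most max_tokens e5-base tokens."""
    token_ids = tokenizer.encode(text, add_special_tokens=False)
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        window = token_ids[start:start + max_tokens]
        chunks.append(tokenizer.decode(window))
        if start + max_tokens >= len(token_ids):
            break   # the last window already reaches the end of the document
    return chunks
```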
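One way to generate the dense embeddings offline is with `sentence-transformers`, as sketched below. E5 models expect a `passage:` prefix on indexed text, and normalizing the vectors lets inner product act as cosine similarity; the `embeddings.npy` file name is illustrative:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-base")

chunks = ["First document chunk ...", "Second document chunk ..."]  # output of the chunking step
embeddings = model.encode(
    [f"passage: {chunk}" for chunk in chunks],  # E5 convention for documents
    normalize_embeddings=True,                  # unit vectors -> inner product == cosine similarity
    convert_to_numpy=True,
).astype(np.float32)                            # FAISS expects float32

np.save("embeddings.npy", embeddings)           # pre-computed offline; shape (num_chunks, 768)
```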
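Indexing the embeddings in a FAISS `IndexFlatIP` store is then a few lines; since the vectors are normalized, inner-product search is equivalent to cosine similarity (file names are illustrative):

```python
import faiss
import numpy as np

embeddings = np.load("embeddings.npy")            # (num_chunks, 768) normalized float32 vectors

index = faiss.IndexFlatIP(embeddings.shape[1])    # exact inner-product search
index.add(embeddings)
faiss.write_index(index, "chunks.faiss")          # persisted for use at query time
```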
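At query time, the same encoder embeds the query (E5 models expect a `query:` prefix here) and the FAISS index returns the positions of the top-`K` chunks; a sketch:

```python
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-base")
index = faiss.read_index("chunks.faiss")           # illustrative file name

def retrieve(query: str, k: int = 4) -> list[int]:
    """Return the indices of the k most similar chunks for a query."""
    query_vec = model.encode(
        [f"query: {query}"],                       # E5 convention for queries
        normalize_embeddings=True,
        convert_to_numpy=True,
    ).astype("float32")
    scores, ids = index.search(query_vec, k)       # exact cosine-similarity search
    return ids[0].tolist()
```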
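Finally, the retrieved chunks are assembled into a grounded prompt for the LLM. The sketch below assumes the `meta-llama/Llama-3.2-1B-Instruct` checkpoint and the `transformers` text-generation pipeline; the prompt wording and generation settings are illustrative, not the project's exact ones:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="meta-llama/Llama-3.2-1B-Instruct")

def answer(query: str, retrieved_chunks: list[str]) -> str:
    """Assemble the retrieved chunks into a context-only prompt and generate an answer."""
    context = "\n\n".join(retrieved_chunks)
    messages = [
        {"role": "system",
         "content": "Answer using only the provided context. "
                    "If the context does not contain the answer, say so."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
    ]
    output = generator(messages, max_new_tokens=512)
    # Recent transformers versions return the chat with the new assistant turn appended.
    return output[0]["generated_text"][-1]["content"]
```

End to end, a runtime call could then look like `answer(query, [chunk_texts[i] for i in retrieve(query)])`, where `chunk_texts` holds the chunk strings saved alongside the index.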