Today, I got hands-on with a Retrieval-Augmented Generation (RAG) setup that runs entirely offline. I built a private AI assistant that can answer questions from Markdown and PDF documentation — no cloud, no API keys.
The stack:
● Ollama for local LLM & embedding
● LangChain for RAG orchestration + memory
● ChromaDB for vector storage
● Streamlit for the chatbot UI
Key features:
● Upload .md or .pdf files
● Automatic re-indexing and embedding with nomic-embed-text
● Ask natural questions to mistral (or other local LLMs)
● Multi-turn chat with memory
● Source highlighting for every answer
How This Local RAG Chatbot Works (Summary)
1) Upload Your Docs
Drag and drop .md and .pdf files into the Streamlit app. The system supports both structured and unstructured formats — no manual formatting needed.
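A minimal sketch of the upload step, assuming the langchain-community loaders (PyPDFLoader needs the pypdf package installed). The temp-file detour is there because Streamlit hands you in-memory buffers while the loaders expect file paths:

```python
import os
import tempfile

import streamlit as st
from langchain_community.document_loaders import PyPDFLoader, TextLoader

uploaded = st.file_uploader(
    "Upload documentation", type=["md", "pdf"], accept_multiple_files=True
)

docs = []
for f in uploaded or []:
    # Loaders want a path on disk, so spill each in-memory buffer to a temp file
    suffix = os.path.splitext(f.name)[1].lower()
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(f.getbuffer())
        tmp_path = tmp.name
    loader = PyPDFLoader(tmp_path) if suffix == ".pdf" else TextLoader(tmp_path, encoding="utf-8")
    docs.extend(loader.load())  # one Document per page/file, with source metadata
```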
2) Chunking + Embedding
Each document is split into small, context-aware text chunks and embedded locally using the nomic-embed-text model via Ollama.
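Roughly like this, using the classic langchain imports (newer releases moved the splitter into langchain_text_splitters); the chunk sizes are placeholders, not tuned values:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings

# Split on paragraph/sentence boundaries where possible so chunks stay coherent
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
chunks = splitter.split_documents(docs)

# Embeddings run locally through Ollama; pull the model first: `ollama pull nomic-embed-text`
embeddings = OllamaEmbeddings(model="nomic-embed-text")
```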
3) Store in Chroma Vector DB
The resulting embeddings are stored in ChromaDB, enabling fast and accurate similarity search when queries are made.
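Persisting the index is a single call through the LangChain Chroma wrapper; the persist_directory path here is a placeholder:

```python
from langchain_community.vectorstores import Chroma

# Embeds every chunk and writes the index to disk,
# so the app can reopen the same collection after a restart
vectordb = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="chroma_db",  # placeholder path
)
```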
4) Ask Natural Questions
You type a question like “What are DevOps best practices?”, and the app retrieves the most relevant chunks using semantic search.
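Under the hood, that retrieval is a plain similarity search against the vector store. A quick way to sanity-check it from a script (k=4 is an arbitrary choice):

```python
query = "What are DevOps best practices?"
hits = vectordb.similarity_search(query, k=4)  # top-4 most similar chunks

for doc in hits:
    # Each hit carries the original filename in its metadata
    print(doc.metadata.get("source", "unknown"), "->", doc.page_content[:120])
```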
5) Answer with LLM + Memory
Retrieved context is passed to mistral (or any Ollama-compatible LLM). LangChain manages session memory for multi-turn Q&A.
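A sketch of the chain wiring, assuming the classic ConversationalRetrievalChain API. The output_key="answer" detail matters: without it the memory doesn't know which output field to record once source documents are returned as well:

```python
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_community.llms import Ollama

memory = ConversationBufferMemory(
    memory_key="chat_history", return_messages=True, output_key="answer"
)

chain = ConversationalRetrievalChain.from_llm(
    llm=Ollama(model="mistral"),             # any Ollama-compatible model works
    retriever=vectordb.as_retriever(search_kwargs={"k": 4}),
    memory=memory,
    return_source_documents=True,            # keep the chunks for step 6
)

result = chain.invoke({"question": "How does Kubernetes manage pod lifecycle?"})
print(result["answer"])
```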
6) Sources Included
Each answer shows where it came from — including the filename and content snippet — so you can trust and trace every response.
The answer and its source documents are displayed together in the Streamlit chat view.
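Rendering that in Streamlit takes only a few lines; the 300-character snippet cutoff below is an arbitrary choice:

```python
st.markdown(result["answer"])

with st.expander("Sources"):
    for doc in result["source_documents"]:
        st.markdown(f"**{doc.metadata.get('source', 'unknown')}**")
        st.text(doc.page_content[:300])  # short snippet so the source is traceable
```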
Example Prompts
"What is a microservice?"
"How does Kubernetes manage pod lifecycle?"
"Give me an example Docker Compose file."
"What are DevOps best practices?"
Honestly, this was one of those projects that reminded me how far local AI tools have come. No cloud APIs, no fancy GPU rig — just a regular laptop, and I was able to build a fully working RAG chatbot that reads my docs and gives solid, contextual answers.
If you’ve ever wanted to interact with your own knowledge base — internal docs, PDFs, notes — in a more natural way, this setup is 100% worth trying. It's private, surprisingly fast, and honestly, kind of fun to put together.