Describe your data and use case. Get a complete, opinionated RAG architecture. Diagnose what's wrong with the one you already have. Estimate what it'll cost.
$ python main.py diagnose --interactive [1/5] Vector database? Pinecone [2/5] Chunking strategy? 512-token fixed [3/5] Embedding model? ada-002 [4/5] Retrieval method? dense only [5/5] Problems? (blank to finish) > misses exact clause references > hallucinates contract terms > Diagnosis — overall severity: CRITICAL chunking_strategy: critical Fixed chunks split mid-clause → switch to hierarchical chunking retrieval_method: high Dense-only misses exact terms → add BM25 hybrid + RRF merge Quick fix today: Enable 10% token overlap in fixed chunks as immediate patch
Every RAG architecture blog post ends with "it depends." Teams spend weeks evaluating options with no framework, ship the wrong choices, then discover root causes six months later — after ingesting 50,000 documents, signing a cloud contract that violates GDPR, and watching precision tank on exact-term queries.
RAG Readiness pre-scores complexity from rules, then Claude returns one specific recommendation per component. If you already have a stack, diagnose it — root causes ordered by severity, one concrete fix each. If you need to iterate, every session persists so you can refine against new constraints. Cost estimation, eval dataset generation, and implementation bundles included.
Complexity scoring before the LLM call means recommendations are calibrated to your actual constraints, not generic best practices.
From blank-slate architecture to debugging a production system — all from the same CLI and API.
git clone https://github.com/swapnanil/rag-readiness cd rag-readiness cp .env.example .env # add your ANTHROPIC_API_KEY docker-compose up api # or: pip install -r requirements.txt && python api.py
python main.py audit --interactive python main.py audit --file examples/usecase_legal_contracts.json --with-cost
python main.py diagnose --interactive python main.py diagnose --file examples/diagnosis_pinecone_fixed.json
python main.py multi-audit examples/multi_usecase_lexvault.json python main.py sessions python main.py refine <session-id> --feedback "Qdrant was too heavy" python main.py cost <session-id> python main.py eval-dataset <session-id> --num-questions 20
{ "existing_architecture": { "vector_database": "Pinecone", "chunking_strategy": "512-token fixed", "embedding_model": "ada-002", "retrieval_method": "dense", "observed_problems": [ "misses clause references", "hallucinates terms" ] } }
overall_severity: critical chunking_strategy critical Fixed chunks split mid-clause in long legal documents Fix: parent-child hierarchical chunking, 512-token child nodes retrieval_method high Dense-only misses exact terms like dollar amounts in clauses Fix: hybrid BM25+dense + RRF quick_fix: Enable 10% token overlap today
Each tool is a standalone CLI + REST API solving a real enterprise problem with Claude.