Blog — LLMs, AI Agents & Context Engineering

20 July 2026 · 19 min read

Your MCP Tool Works. The Model Still Won't Call It.

A controlled study of MCP tool adoption: zero calls in 63 turns despite 30 mentions of the tool, zero calls in 114 turns with the answers already in the store, and the harness injection that flipped it.

mcp ai-agents tool-adoption agent-memory hooks claude-code

19 July 2026 · 20 min read

I Deleted Every Ranking Heuristic From My Code Search Engine

3,817 lines of query-side if/else, deleted in one commit, with the measured evidence for why keyword branches can never rank code search results, and the two index-time priors that separated canonical from look-alike.

code-search retrieval ranking pagerank technical-debt vectr

18 July 2026 · 21 min read

Embedding Dilution: Why Semantic Code Search Misses the Answer

The target function's docstring paraphrased my query almost word for word, yet it ranked below 200 look-alikes because everything else in the chunk averaged it away. A measured post-mortem, and the dual-vector fix that shipped because of it.

embeddings semantic-search rag code-retrieval dense-retrieval vectr

17 July 2026 · 28 min read

What Actually Survives /compact in Claude Code: An Empirical Map

We forced 100+ compactions across two instrumented runs and graded, fact by fact, what the boundary keeps and what silently dies — with the survival curve, the confabulation, and the honest costs.

claude-code context-compaction agent-memory llm-agents hooks context-window

11 July 2026 · 22 min read

Vectr 1.1.0: Team Mode, and the Seven-Agent Test That Gated It

Team mode, API-key auth, and encryption at rest — gated by a test where seven AI agents built and reviewed a real product coordinating only through shared working memory.

vectr mcp multi-agent encryption developer-tools llm-agents

8 July 2026 · 21 min read

Vectr v1.0.0: The Release Gate, the Bugs It Caught, and the Numbers I Didn't Round Up

Vectr v1.0.0 shipped after a first-person dogfood gate caught two release-blocking bugs — plus the honest cost numbers, wins and losses, that shipped with it.

vectr mcp developer-tools ai-code-editors release-engineering llm-agents

6 July 2026 · 24 min read

The Four Families of Context Relief for LLM Coding Agents

Eviction, offload-and-recall, retrieval-over-stuffing, and subagent isolation — the four ways to keep a coding agent from drowning in its own context, and why they only work when composed.

llm-agents context-window prompt-caching context-editing agent-architecture mcp

4 July 2026 · 26 min read

Claude Code Hooks: A Practical Deep-Dive on Deterministic Agent Behavior

How Claude Code hooks work — events, the settings.json contract, exit codes — and how I use them to inject working memory into an agent deterministically.

claude-code hooks llm-agents mcp developer-tools agent-memory

14 June 2026 · 36 min read

Building Vectr, Part 3: What the Benchmark Numbers Actually Mean

How I benchmarked an AI code editor tool without fooling myself — the research vs implementation distinction that makes a +19% headline misleading, the 5 of 6 CPython tasks where re-discovery dropped, and the limitations that decide whether any of it applies to you.

benchmarking ai-code-editors developer-tools llm-evaluation mcp cpython

11 June 2026 · 42 min read

Building Vectr, Part 2: What /compact Destroys and How to Survive It

Why /compact kills precision, what the KV cache actually is, and how I built a working memory layer that survives session boundaries — with the bugs that shaped the final design.

working-memory llm-context kv-cache mcp developer-tools

9 June 2026 · 38 min read

Building Vectr, Part 1: Why grep Fails When You Don't Know the Keywords

How I built a local semantic code indexer for AI editors — covering AST chunking, hybrid vector+BM25 search, symbol graphs, and why naive approaches break on real codebases.

code-search semantic-search ast-parsing embeddings developer-tools mcp

26 May 2026 · 40 min read

Why LLM Context Windows Fill Up Faster Than You Think — and What to Do About It

The token arithmetic behind system prompts, tool schemas, RAG retrieval, and conversation history — why the effective limit is below the advertised one, and four production strategies to manage it.

LLMs · Context Windows · Production AI · Token Budget · Deep Dive

24 May 2026 · 35 min read

Why AI Code Assistants Waste Your Context Window — and How RAG Fixes It

A deep look at why attention dilution and positional bias make context stuffing counterproductive, and how AST chunking, hybrid BM25+dense retrieval, and RRF fusion fix it.

RAG · Code Assistants · LLMs · Production AI · Deep Dive

23 May 2026 · 90 min read

The Complete Guide to Text Embeddings, Vector Databases & LLMs

From tokenization and transformers to cosine similarity, HNSW graph search, RAG pipelines, and LLM training costs. The deepest guide on the web — with 7 interactive demos.

Embeddings · LLMs · RAG · Vector Search · Deep Dive

22 May 2026 · 30 min read

DPDP Act 2023 + AI/LLM: Automating Data Principal Requests in India

How AI and LLMs automate DPDP Act compliance — classifying data principal requests, drafting multilingual replies in 5 Indian languages, tracking 30-day SLAs, and generating audit-ready evidence trails.

DPDP · AI / LLM · India · Privacy Compliance · Open Source
