Blog
News, updates, and insights from the Ragtoolina team.
Introducing Ragtoolina
Meet Ragtoolina — the semantic codebase context tool that cuts your AI coding costs by 2.6x while keeping the same quality answers.
Part 5: Benchmarking and Results
A custom evaluation suite, honest numbers, and what the model gets wrong
Part 4: GGUF Conversion and Serving with llama.cpp
From HuggingFace safetensors to a 153 MB file running at 2.2ms per embedding on Apple Silicon
Part 3: Training on a Rented GPU
A 149M-parameter fine-tune on Lambda Labs A100, from SSH to finished model in under 2 hours
Part 2: Building the Training Dataset — CodeSearchNet + Synthetic Pairs
How we combined 475K open-source code pairs with 2,400 hand-crafted Swift/TypeScript examples
Part 1: Picking the Right Base Model for Code Embeddings
How we chose Alibaba-NLP/gte-modernbert-base over four other embedding models for local code search
Fine-Tuning a Code Embedding Model That Runs Entirely on Your Mac
How we trained ragtoolina-embed-v1 on a rented GPU and shipped it inside a macOS app via llama.cpp
Why Context Matters for AI Coding
AI coding assistants are only as good as the context they receive. Here's why sending the right files — not all files — makes the difference.
Setting Up the MCP Server Integration
A step-by-step guide to connecting Ragtoolina's MCP server with Claude Code, Cursor, and other AI coding tools.
Ragtoolina for Teams: Shared Context at Scale
How engineering teams use shared semantic indexes to onboard faster, answer questions more accurately, and cut AI costs across the organization.
How We Reduced AI Coding Costs by 63%
A deep dive into the benchmarks behind Ragtoolina's 2.6x cost reduction claim — methodology, results, and what it means for your budget.