Blog
News, updates, and insights from the Ragtoolina team.
Introducing Ragtoolina
Meet Ragtoolina — the semantic codebase context tool that cuts your AI coding costs by 2.6x while keeping the same quality answers.
Part 5: Benchmarking and Results
A custom evaluation suite, honest numbers, and what the model gets wrong
Part 4: GGUF Conversion and Serving with llama.cpp
From HuggingFace safetensors to a 153 MB file running at 2.2ms per embedding on Apple Silicon
Part 3: Training on a Rented GPU
A 149M-parameter fine-tune on Lambda Labs A100, from SSH to finished model in under 2 hours
Part 2: Building the Training Dataset — CodeSearchNet + Synthetic Pairs
How we combined 475K open-source code pairs with 2,400 hand-crafted Swift/TypeScript examples
Part 1: Picking the Right Base Model for Code Embeddings
How we chose Alibaba-NLP/gte-modernbert-base over four other embedding models for local code search
Fine-Tuning a Code Embedding Model That Runs Entirely on Your Mac
How we trained ragtoolina-embed-v1 on a rented GPU and shipped it inside a macOS app via llama.cpp
Why Context Matters for AI Coding
AI coding assistants are only as good as the context they receive. Here's why sending the right files — not all files — makes the difference.
Setting Up the MCP Server Integration
A step-by-step guide to connecting Ragtoolina's MCP server with Claude Code, Cursor, and other AI coding tools.
Ragtoolina for Teams: Shared Context at Scale
How engineering teams use shared semantic indexes to onboard faster, answer questions more accurately, and cut AI costs across the organization.
How We Reduced AI Coding Costs by 63%
A deep dive into the benchmarks behind Ragtoolina's 2.6x cost reduction claim — methodology, results, and what it means for your budget.