RAG Pipeline Optimization for Tender Writing Platform

Consulting engagement to optimize a RAG pipeline, reducing costs while improving answer quality through better chunking, evaluation frameworks, and model selection.

Completed February 2026

Key features

RAG Optimization · Cost Reduction · Evaluation Frameworks · Document Chunking · Prompt Engineering

A tender-writing AI platform was bleeding money and producing inconsistent answers. Its RAG pipeline was functional but inefficient: poor chunking strategies, no evaluation framework, and suboptimal model selection. Through systematic optimization, costs dropped significantly while answer quality and consistency improved.

Overview

This consulting engagement focused on auditing and optimizing every component of the RAG pipeline. The process was systematic and data-driven.

First, I met with the client to understand their current setup, pain points, and business constraints. Then I dove into their codebase to understand how the RAG pipeline actually worked: document ingestion, chunking strategy, embedding generation, retrieval logic, and response generation.
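
To make the audit concrete, here is a minimal sketch of the kind of fixed-size chunking such a pipeline typically starts with. The Chunk type, function names, and 500-character size are illustrative assumptions, not the client's actual code.

    # Illustrative baseline: naive fixed-size chunking. All names and sizes
    # here are assumptions for the sketch, not the client's implementation.
    from dataclasses import dataclass

    @dataclass
    class Chunk:
        doc_id: str
        text: str

    def fixed_size_chunks(doc_id: str, text: str, size: int = 500) -> list[Chunk]:
        # Splits every `size` characters, even mid-sentence, which is
        # exactly how context gets lost at chunk boundaries.
        return [Chunk(doc_id, text[i:i + size])
                for i in range(0, len(text), size)]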

With a clear picture of the system, I recommended a prioritized list of optimizations based on impact and implementation effort. This included improving the document chunking strategy (semantic chunking vs. fixed-size), implementing an evaluation framework to measure answer quality objectively, optimizing model selection for different query types, and refining the retrieval process.
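
One of those optimizations, semantic chunking, can be sketched roughly as follows: group sentences into a chunk until adjacent sentences diverge semantically, then start a new chunk. The embed() function below is a toy bag-of-words stand-in for a real embedding model, and the 0.3 threshold is an assumed value to be tuned against the evaluation framework.

    # Hedged sketch of semantic chunking. embed() is a toy stand-in for a
    # real embedding model; the threshold is an assumption to be tuned.
    import math
    import re
    from collections import Counter

    def embed(sentence: str) -> Counter:
        # Toy bag-of-words vector; swap in a real embedding model in practice.
        return Counter(re.findall(r"\w+", sentence.lower()))

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def semantic_chunks(text: str, threshold: float = 0.3) -> list[str]:
        # Start a new chunk when adjacent sentences diverge semantically,
        # so each chunk stays topically coherent.
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        chunks: list[str] = []
        current = [sentences[0]]
        for prev, sent in zip(sentences, sentences[1:]):
            if cosine(embed(prev), embed(sent)) < threshold:
                chunks.append(" ".join(current))
                current = []
            current.append(sent)
        chunks.append(" ".join(current))
        return chunks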

Finally, I set up an iterative testing structure so the team could measure improvements quantitatively. This allowed them to validate each change and continue optimizing after the engagement ended.
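
In miniature, that testing structure amounts to a fixed golden set of questions with expected terms, plus a scorer that any pipeline variant must pass before shipping. The golden-set contents, the pipeline signature, and the containment-based scoring rule below are all illustrative assumptions.

    # Hedged sketch of the iterative testing loop: score every pipeline
    # change against a fixed golden set so improvements are measured,
    # not guessed. All specifics here are illustrative assumptions.
    from typing import Callable

    GOLDEN_SET = [
        {"question": "What is the submission deadline?",
         "must_include": ["deadline"]},
        {"question": "Who is the contracting authority?",
         "must_include": ["authority"]},
    ]

    def evaluate(pipeline: Callable[[str], str]) -> float:
        # Fraction of golden questions whose answer contains the expected terms.
        hits = 0
        for case in GOLDEN_SET:
            answer = pipeline(case["question"]).lower()
            if all(term in answer for term in case["must_include"]):
                hits += 1
        return hits / len(GOLDEN_SET)

    # Usage: compare variants before shipping a change.
    #   baseline = evaluate(baseline_pipeline)
    #   candidate = evaluate(candidate_pipeline)
    #   ship only if candidate >= baseline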

Key Features

  • RAG Pipeline Audit & Optimization: Comprehensive review of document processing, embedding, retrieval, and generation components.
  • Document Chunking Strategy: Redesigned chunking approach from fixed-size to semantic chunking for better context preservation.
  • Evaluation Framework Implementation: Built objective evaluation system to measure answer quality, relevance, and consistency across query types.
  • Model Selection & Cost Optimization: Optimized model selection for different query types, reducing costs while maintaining or improving quality (a routing sketch follows this list).
  • Iterative Testing Structure: Established testing framework for ongoing optimization and validation of pipeline changes.
  • Answer Quality Improvement: Improved consistency and accuracy of AI-generated responses through prompt engineering and retrieval optimization.
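
The model selection item above can be reduced to a simple, testable heuristic: send a query to a stronger, costlier model only when it demands synthesis or carries a large retrieved context. The model identifiers, keyword markers, and 6,000-character cutoff below are assumptions for illustration, not the client's configuration.

    # Hedged sketch of cost-aware model routing. Identifiers, markers,
    # and the cutoff are illustrative assumptions.
    CHEAP_MODEL = "small-fast-model"
    STRONG_MODEL = "large-capable-model"

    SYNTHESIS_MARKERS = ("compare", "summarize", "why", "draft", "rewrite")

    def select_model(query: str, retrieved_chars: int) -> str:
        # Route to the strong model only when the query needs synthesis
        # or the retrieved context is large; default to the cheap model.
        q = query.lower()
        if any(m in q for m in SYNTHESIS_MARKERS) or retrieved_chars > 6000:
            return STRONG_MODEL
        return CHEAP_MODEL

The point is not this particular heuristic but that, once the evaluation framework is in place, routing decisions become measurable: any rule that keeps the golden-set score flat while cutting spend is a win.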