The Dirty Problem of RAG

If you work with generative AI, you’ve heard of RAG (Retrieval-Augmented Generation). It’s the technique that allows AI to consult your company’s documents to give precise answers.

The concept is brilliant:

  • The AI doesn’t need to memorize everything
  • It searches for information in your documents
  • It combines search + text generation
  • It answers based on real data

But traditional RAG has a “dirty” problem:

It chunks your documents into pieces, which often makes the model lose the thread or ignore tables and important context.

How Traditional RAG Works (and Fails)

The Standard Process

Step 1: Chunking

100-page document

Divided into 500 chunks of ~200 words

Each chunk becomes a mathematical vector

Stored in vector database

Step 2: Search

User asks: "What was net profit?"

Question becomes vector

Searches similar chunks (semantic similarity)

Returns 3-5 most relevant chunks

Step 3: Generation

AI receives chunks + question

Generates answer based on chunks
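The three steps above can be sketched in a few lines. This is a deliberately toy version: a bag-of-words counter stands in for a real neural embedding model, and the sample document, chunk size, and function names are all illustrative, not from any specific library.

```python
import math
import re
from collections import Counter

def chunk(text, size=200):
    """Step 1: split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy stand-in for a neural embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Semantic similarity between two vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=3):
    """Step 2: return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

doc = ("the quarterly review covers many topics " * 30
       + "net profit reached 200 million dollars "
       + "the weather section is irrelevant here " * 30)
top = retrieve("What was net profit?", chunk(doc, size=40), k=3)
```

Step 3 would then hand `top` plus the question to the LLM. Notice that the retriever has no idea what a section, table, or cross-reference is; it only sees similarity scores.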

What Goes Wrong?

Problem 1: Broken Context

Chunk 237: "...as shown in Table 5"
Chunk 238: [Table 5 was here, but ended up in another chunk]
Chunk 239: "Based on this data..."

Result: AI doesn't see Table 5 when needed

Problem 2: Scattered Information

Document says:
Page 12: "Revenue: $500M"
Page 87: "Costs: $300M"
Page 143: "Profit: $200M"

Question: "What's the profit margin?"
Traditional RAG: May grab only 1 or 2 chunks
Answer: Incomplete or wrong

Problem 3: Destroyed Tables

Original table:
| Product | Q1  | Q2  | Q3  |
|---------|-----|-----|-----|
| A       | 100 | 150 | 200 |
| B       | 50  | 75  | 90  |

After chunking:
Chunk X: "| Product | Q1  | Q2"
Chunk Y: "| Q3  | |---------|-----|"
Chunk Z: "75  | 90  |"

AI: "I can't understand this table"
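You can reproduce the table destruction with any blind, fixed-size chunker. A minimal sketch (the 45-character chunk size is arbitrary, chosen only to show the effect):

```python
def naive_chunk(text, size):
    """Fixed-size character chunking, blind to document structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]

table = (
    "| Product | Q1  | Q2  | Q3  |\n"
    "|---------|-----|-----|-----|\n"
    "| A       | 100 | 150 | 200 |\n"
    "| B       | 50  | 75  | 90  |\n"
)
pieces = naive_chunk(table, 45)
# No single piece still contains a complete, parseable table:
# the header, separator, and data rows end up in different chunks.
```

Real chunkers split on words or tokens rather than characters, but the failure mode is the same: the boundary doesn't respect rows, so no retrieved chunk carries the whole table.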

Problem 4: Structure Loss

Document has:
- Section 1: Introduction
  - 1.1 Context
  - 1.2 Objectives
- Section 2: Methodology
  - 2.1 Approach
  - 2.2 Data

Traditional RAG: Ignores hierarchy
AI doesn't know 1.1 and 1.2 are related

PageIndex: The Radical 2026 Solution

The 2026 scenario proposes a radical solution: PageIndex (or RAG without vectors).

From “Search by Similarity” to “Structured Reasoning”

In common RAG:

  • AI searches for similar words
  • Mathematical vectors
  • No structure understanding

In PageIndex:

  • More human approach
  • Understands logical structure
  • Navigates like you’d read an index

How PageIndex Works

1. Content Tree

Instead of chopping text, AI reads the entire document and creates a tree structure, like an ultra-detailed table of contents (in JSON format) that stays in the model’s “working memory”.

Example of generated tree:

{
  "document": "Annual Report 2025",
  "sections": [
    {
      "id": "1",
      "title": "Executive Summary",
      "page_range": [1, 5],
      "subsections": [
        {
          "id": "1.1",
          "title": "Financial Highlights",
          "page": 2,
          "content_summary": "Revenue $500M, profit $200M",
          "has_table": true,
          "table_ref": "Table_1_Financial_Summary"
        }
      ]
    }
  ],
  "tables": [
    {
      "id": "Table_1_Financial_Summary",
      "location": "page 2",
      "columns": ["Metric", "2024", "2025"],
      "referenced_in": ["1.1", "3.2"]
    }
  ]
}

The AI created a “mind map” of the document.
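Because the tree is plain JSON, "consulting the index" is just a tree search. A minimal sketch, using an abridged version of the tree above (only the fields the search touches; `find_section` is a hypothetical helper, not part of any published PageIndex API):

```python
import json

# Abridged version of the content tree shown above.
tree = json.loads("""
{
  "sections": [
    {"id": "1", "title": "Executive Summary", "page_range": [1, 5],
     "subsections": [
       {"id": "1.1", "title": "Financial Highlights", "page": 2,
        "has_table": true, "table_ref": "Table_1_Financial_Summary"}
     ]}
  ]
}
""")

def find_section(sections, keyword):
    """Depth-first search: the 'consult the index' step."""
    for sec in sections:
        if keyword.lower() in sec["title"].lower():
            return sec
        hit = find_section(sec.get("subsections", []), keyword)
        if hit:
            return hit
    return None

sec = find_section(tree["sections"], "financial")
```

Here `sec` comes back as section 1.1, complete with its `table_ref` pointer, so the model knows exactly which pages and which table to read next.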

2. Intelligent Navigation

When you ask a question, the AI doesn’t hunt for loose keywords.

Reasoning process:

Question: "What was revenue growth?"

AI thinks:
1. "This is about finance"
2. "Probably in Executive Summary or Financial Analysis"
3. Consults tree → identifies section 1.1
4. "Section 1.1 has a financial table"
5. Goes directly to Table_1_Financial_Summary
6. Reads relevant data
7. Calculates: ($500M - $400M) / $400M = 25%
8. Answers: "25% growth"

It looks at the table of contents, reasons about which section should have the answer (e.g., “this should be in Section 4”) and goes straight to the point.
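Steps 6 and 7 of the walkthrough reduce to reading two cells and one division. A sketch, assuming `Table_1_Financial_Summary` holds the revenue figures used above (the table contents here are hypothetical, reconstructed from the walkthrough's numbers):

```python
# Hypothetical contents of Table_1_Financial_Summary, matching the
# ["Metric", "2024", "2025"] columns declared in the tree.
table_1 = {
    "columns": ["Metric", "2024", "2025"],
    "rows": [["Revenue", 400, 500]],
}

def growth_of(table, metric):
    """Steps 6-7: read prior and current values, then compute growth."""
    for name, prev, curr in table["rows"]:
        if name == metric:
            return (curr - prev) / prev
    raise KeyError(metric)

rate = growth_of(table_1, "Revenue")  # ($500M - $400M) / $400M
```

`rate` is 0.25, the 25% growth from the walkthrough, computed from the actual table cells rather than from whatever chunks happened to score well.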

3. Cross-Reference

Problem solved:

Text says: "As shown in Table 3..."

Traditional RAG:
- Doesn't know where Table 3 is
- Ignores the reference

PageIndex:
- Sees reference to Table 3
- Consults tree
- Finds: table_ref: "Table_3_Market_Share"
- Navigates to table
- Connects information

If the text says “see table 3”, AI can navigate the tree, find the table, and connect information.
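Resolving a reference is a lookup against the tree's table index. A minimal sketch; the index, table entry, and `resolve_reference` helper are illustrative assumptions, not a documented API:

```python
import re

# Hypothetical index built from the tree's "tables" list,
# keyed by the number each table carries in prose.
tables_by_number = {
    "3": {"id": "Table_3_Market_Share", "location": "page 41"},
}

def resolve_reference(sentence, index):
    """Turn 'As shown in Table 3...' into a pointer into the content tree."""
    m = re.search(r"[Tt]able\s+(\d+)", sentence)
    return index.get(m.group(1)) if m else None

ref = resolve_reference("As shown in Table 3, market share grew.",
                        tables_by_number)
```

`ref` now points at `Table_3_Market_Share`, so the model can navigate to page 41 and connect the text to the data; a vector retriever has no equivalent of this lookup.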

The Result: Crushing Precision

Real Benchmarks

In financial benchmark tests, this approach achieved 98% precision, far surpassing traditional RAG.

Comparison:

| Metric               | Traditional RAG | PageIndex |
|----------------------|-----------------|-----------|
| Precision            | 73%             | 98%       |
| Recall (finds all)   | 65%             | 95%       |
| Tables               | 45%             | 97%       |
| Cross-references     | 20%             | 92%       |
| Cost per query       | $0.02           | $0.15     |
| Latency              | 2s              | 8s        |

For companies dealing with complex contracts or annual reports of hundreds of pages, this changes the game.

Perfect Use Cases

✅ Excellent for:

  • Complex legal contracts
  • Annual financial reports
  • Structured technical documentation
  • Manuals with many tables/references
  • M&A due diligence
  • Compliance and auditing

❌ Not worth it for:

  • Simple FAQs
  • Short documents (<10 pages)
  • Search across thousands of documents
  • Cases where speed > precision

The “Price” of Intelligence

It’s not all roses. This technology faces two real challenges:

1. Cost and Latency

The problem:

Since AI needs to make multiple “calls” to navigate the content tree, the process is slower and more expensive than a simple search.

Navigation example:

Call 1: Create document tree ($0.05)
Call 2: Analyze question and decide section ($0.02)
Call 3: Read specific section ($0.03)
Call 4: Search referenced table ($0.02)
Call 5: Synthesize answer ($0.03)

Total: $0.15 per query (vs $0.02 traditional RAG)
Time: 8 seconds (vs 2 seconds)

Trade-off:

  • 7.5x more expensive
  • 4x slower
  • But 25 percentage points more accurate

Worth it? Depends on the use case.
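The arithmetic behind the trade-off is simple enough to check directly. Using the per-call costs from the breakdown above (the call names are illustrative):

```python
# Per-call costs (USD) from the navigation example above.
calls = [
    ("create_tree", 0.05),
    ("route_question", 0.02),
    ("read_section", 0.03),
    ("fetch_table", 0.02),
    ("synthesize", 0.03),
]
total = round(sum(cost for _, cost in calls), 2)
ratio = round(total / 0.02, 1)  # vs. one vector-RAG query at $0.02
```

`total` comes out to $0.15 and `ratio` to 7.5, confirming the 7.5x figure. Note that tree creation is the single largest cost; in practice it can be done once per document and amortized across queries.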

2. Memory Limit

The problem:

The tree structure needs to fit in the AI’s context window.

Real numbers:

Claude 3.5 Sonnet: 200k context tokens

100-page document:
- Text: ~50k tokens
- JSON tree: ~20k tokens
- Space for answer: ~10k tokens
Total used: ~80k tokens
✅ Works!

Library with 50 documents:
- 50 × 50k = 2.5M tokens
❌ Doesn't fit!

Trying to apply this to an entire document library is not yet viable.
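The budget check behind those numbers is a single inequality. A sketch, using the token estimates from above (the 200k window matches the Claude figure cited; all counts are rough):

```python
def fits_in_context(doc_tokens, tree_tokens, answer_tokens, window=200_000):
    """Everything PageIndex needs must share one context window."""
    return doc_tokens + tree_tokens + answer_tokens <= window

single_doc = fits_in_context(50_000, 20_000, 10_000)    # one 100-page report
library = fits_in_context(50 * 50_000, 20_000, 10_000)  # 50-document library
```

`single_doc` passes comfortably; `library` fails because the raw text alone (2.5M tokens) blows past the window, which is exactly why the layered approaches below exist.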

Partial solutions:

  • More compact trees (summaries)
  • Layered hierarchy (search document first, then detail)
  • Models with larger context (Gemini 1.5: 1M tokens)

The Evolution of the AI Professional

The Orchestrator in Action

This reinforces our thesis of the AI Orchestrator.

The successful professional is not just someone who “installs” RAG, but someone who understands:

When to use Traditional RAG (vectorial):

Scenario: Product FAQ
- 1000 common questions
- Short answers
- Speed matters
- Cost matters
Decision: Vectorial RAG (fast and cheap)

When to use PageIndex:

Scenario: $10M contract analysis
- 200-page document
- Needs 98% precision
- Error can cost millions
- Client expects 1-day analysis
Decision: PageIndex (slow but precise)

When to use Hybrid:

Scenario: Technical support system
- 80% simple questions → Vectorial RAG
- 15% medium questions → RAG + human validation
- 5% complex questions → PageIndex
Decision: Intelligent routing
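The routing logic in those three scenarios can be sketched as a small dispatcher. The thresholds and signals below are illustrative assumptions (a real router might classify question complexity with a cheap model call rather than word counts):

```python
def route(question: str, pages: int, critical: bool = False) -> str:
    """Toy router: pick a retrieval strategy from coarse signals."""
    if critical:
        return "pageindex"   # precision first, the cost is justified
    if pages < 10 or len(question.split()) <= 6:
        return "vector_rag"  # fast and cheap covers most traffic
    return "hybrid"          # vector filter first, PageIndex to refine

simple = route("What is the return policy?", 3)
contract = route("anything about the indemnification clauses", 200,
                 critical=True)
```

Here the FAQ question routes to `vector_rag` and the critical contract question to `pageindex`; everything in between falls through to the hybrid path.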

The New Skills

❌ No longer enough:

  • Knowing how to install RAG library
  • Running LangChain tutorial
  • Applying same solution to everything

✅ Necessary:

  • Understanding trade-offs (cost vs precision vs speed)
  • Architecting hybrid solutions
  • Measuring what matters (not just “works”)
  • Optimizing costs without sacrificing quality
  • Knowing when new technology is worth the investment

2026-2027: Three Approaches Coexisting

Level 1: Vectorial RAG (commodity)

  • Simple cases
  • High scale
  • Low cost
  • 70-80% precision

Level 2: Hybrid RAG (emerging standard)

  • Vectorial for initial filter
  • PageIndex for refinement
  • 85-92% precision
  • Medium cost

Level 3: Pure PageIndex (premium)

  • Critical cases
  • Maximum precision (95-98%)
  • High cost justified
  • Acceptable latency

The right choice depends on context, not fashion.

Conclusion

RAG without vectors (PageIndex) is not the replacement for traditional RAG.

It’s an additional tool in the AI professional’s arsenal.

Main lessons:

  1. New technology ≠ Always better

    • PageIndex is more accurate
    • But also more expensive and slower
    • Not always worth it
  2. Context is king

    • Simple FAQ? Vectorial RAG
    • Critical contract? PageIndex
    • Hybrid? Probably
  3. Orchestration is the skill

    • Knowing which tool when
    • Optimizing cost without losing quality
    • Measuring real impact
  4. Precision has a price

    • Going from 73% to 98% precision costs 7.5x more
    • Sometimes worth it (legal analysis)
    • Sometimes not (email search)
  5. The professional evolves

    • From installer to architect
    • From executor to orchestrator
    • From technical to strategist

What Do You Prefer?

A fast AI that “guesses” based on similarity or a slightly slower AI that understands your document’s logical structure with 98% precision?

Does precision compensate for cost in your area?

How would you decide between the two approaches?

Share your opinion:

The future isn’t about having the newest technology. It’s about using the right technology for the right problem.

