
Understanding RAGs – From Chunking to Vectorization (With Real-Life Examples)

A beginner-friendly explanation of RAGs using real-life analogies — covering indexing, vectorization, chunking, and overlapping with easy examples.


Ever heard of RAGs (Retrieval-Augmented Generation) and felt like it was rocket science? Don’t worry — in this blog, we’ll break it down using real-world analogies so even a non-techie can understand.

We'll explain:

  • What is Indexing?
  • Why Vectorization Matters
  • Why RAGs Exist
  • What is Chunking?
  • Why Do We Overlap Chunks?

Whether you're a student, content creator, or developer — this one's for you.

---

What is Indexing?

Analogy:

Imagine running a mobile accessories shop in Bengaluru.

Every product — chargers, cables, cases — is labeled and placed neatly on shelves.

That’s how you find items quickly.

In AI:

  • Indexing helps the system quickly find the most relevant information.
  • It's like creating a catalog for fast lookup.

Without indexing, AI would have to “scan the entire shop” every time you ask something.
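To make this concrete, here's a minimal sketch of one classic kind of index — an inverted index — in plain Python. The product descriptions are made up for illustration:

```python
# A tiny inverted index: maps each word to the set of documents containing it.
docs = {
    1: "fast charger with usb-c cable",
    2: "phone case for samsung",
    3: "usb-c cable 2 metre",
}

index = {}
for doc_id, text in docs.items():
    for word in text.lower().split():
        index.setdefault(word, set()).add(doc_id)

# Lookup is now a dictionary hit instead of scanning every document.
print(index["usb-c"])  # documents 1 and 3 mention "usb-c"
```

Vector databases do the same thing for embeddings instead of words, but the idea is identical: pre-organize the data so lookups are fast.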

---

Why Do RAGs Exist?

Generative AI models like GPT are powerful, but they don't know your private data, such as:

  • PDFs
  • Website content
  • Helpdesk FAQs
  • Internal documents

That's where RAG comes in.

**RAG = Retrieval + Generation**

  1. **Retrieves** the most relevant chunks from your data
  2. **Generates** an answer using GPT with that info as context

RAG = AI with your company’s brain + GPT’s intelligence
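Those two steps can be sketched in a few lines. The `embed`, `search`, and `llm` functions below are placeholders for whatever embedding model, vector store, and LLM you actually use — this is a sketch of the pattern, not a specific library's API:

```python
def answer(question, embed, search, llm, top_k=3):
    """Minimal RAG loop: retrieve relevant chunks, then generate with them as context."""
    query_vector = embed(question)           # 1. vectorize the question
    chunks = search(query_vector, k=top_k)   # 2. retrieve the top-k matching chunks
    context = "\n\n".join(chunks)
    prompt = (
        f"Answer using only this context:\n{context}\n\n"
        f"Question: {question}"
    )
    return llm(prompt)                       # 3. generate a grounded answer
```

Everything else in this post — indexing, vectorization, chunking, overlap — exists to make steps 1 and 2 work well.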

---

Why Do We Perform Vectorization?

Analogy:

A customer walks in and says:

“Give me the best 5G phone under ₹15,000 with good battery.”

Your assistant understands:

  • Budget
  • 5G requirement
  • Battery preference

AI needs this semantic understanding too.

Vectorization converts text → numerical meaning (vectors)

This enables semantic search, not just keyword search.

Example:

from sentence_transformers import SentenceTransformer

# Load a small sentence-embedding model and encode two phrases
model = SentenceTransformer('all-MiniLM-L6-v2')
vectors = model.encode(["good battery", "long-lasting power"])
print(vectors)  # two 384-dimensional vectors, one per phrase

“good battery” ≈ “long-lasting power” in vector space — because they *mean* the same.
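"Close in vector space" is usually measured with cosine similarity. Here it is in plain Python — the three-dimensional "embeddings" below are made up for illustration (real embedding vectors have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 = similar, near 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings for illustration only
good_battery = [0.8, 0.1, 0.2]
long_lasting = [0.7, 0.2, 0.2]
cheap_case = [0.1, 0.9, 0.1]

print(cosine_similarity(good_battery, long_lasting))  # high: similar meaning
print(cosine_similarity(good_battery, cheap_case))    # low: different meaning
```

Retrieval in a RAG system is essentially this comparison, run between your question's vector and every chunk's vector.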

---

Why Do We Perform Chunking?

Problem:

LLMs like GPT have token limits — they cannot read 200 pages at once.

Solution:

Break documents into smaller pieces called chunks.

Analogy:

When reading a new mobile catalog, you don’t read the entire book at once.

You split it into:

  • Samsung Phones
  • iPhones
  • Chargers
  • Earphones

That’s chunking!

---

Why Do We Overlap Chunks?

Sometimes important info sits at the boundary:

  • End of one paragraph
  • Start of the next

Strict splitting may cause AI to lose context.

Example:

Chunk 1: Info A, B, C

Chunk 2: Info C, D, E

The shared C preserves meaning.

def chunk_with_overlap(text, chunk_size=200, overlap=50):
    # overlap must be smaller than chunk_size, or the loop never advances
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # step forward by (chunk_size - overlap) so consecutive chunks share `overlap` characters
        start += chunk_size - overlap
    return chunks

Overlap helps the AI understand continuity.
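To see the overlap concretely, here's a quick run on a 50-character string (the function is repeated, with a safety check added, so the snippet runs on its own):

```python
def chunk_with_overlap(text, chunk_size=200, overlap=50):
    """Split text into fixed-size chunks that share `overlap` characters at each boundary."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

text = "ABCDEFGHIJ" * 5  # 50 characters
chunks = chunk_with_overlap(text, chunk_size=20, overlap=5)
print(chunks)
# The last 5 characters of each chunk reappear at the start of the next,
# e.g. chunks[0][-5:] == chunks[1][:5]
```

Each chunk starts 15 characters (chunk_size − overlap) after the previous one, so no boundary information is lost.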

---

Summary Table

| Concept       | Analogy in Mobile Shop           | Purpose in RAGs                |
|---------------|----------------------------------|--------------------------------|
| Indexing      | Labeling shelves                 | Fast retrieval of chunks       |
| Vectorization | Understanding customer intent    | Semantic search                |
| RAGs          | Staff + catalog + answers        | Combine private data + GPT     |
| Chunking      | Splitting catalog into sections  | Fit data into LLM limits       |
| Overlap       | Repeating edge info              | Preserve meaning across chunks |

---

✅ Final Thought

RAGs are not complicated.

They're simply a smart way of making AI feel more like a real expert — one who:

  • Understands your data
  • Remembers context
  • Provides accurate answers

Start simple.

Play with chunking + vectorization.

Soon you’ll be building powerful knowledge-enhanced AI apps!


Written by Tech Swamy Kannada

Full-stack developer with 4+ years of experience building modern web applications with Next.js, React, TypeScript, and Node.js. Passionate about sharing knowledge through tutorials and open source.