RAG
Notes
What is it
The problem we face when using public pre-trained LLMs is that they know nothing about our private data sources, but training your own LLM is usually not affordable for most companies/people.
RAG is a way to boost an LLM's ability to understand and answer questions about your private data.
How to create your own RAG
1. prepare your private database
First, you need to prepare your own private data in a database.
a. Start by chunking your private data into meaningful sentences/paragraphs
b. Convert them into numerical representations (vectors); this is done by an embedding language model.
c. Store them in a vector database, a kind of database that's specialized in comparing/matching vectors.
An example of a database record
{
  "id": "chunk_1",
  "vector": [0.15, -0.46, 0.3, 0.2026, ...], // hundreds or thousands of numbers representing the meaning of the text chunk
  "text": "Phone bill reimbursement 101. You need to first create an account in ...",
  "metadata": {
    "source": "www.wiki.company.com/internal/expense/reimbursement",
    "page": 5
  }
}
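Steps a–c can be sketched as a small ingestion pipeline. Everything here is hypothetical: embed() is a toy word-hashing stand-in for a real embedding model, chunking is a naive paragraph split, and the "database" is just a Python list of records shaped like the example above.

```python
import hashlib

def embed(text, dim=8):
    # Toy embedding: hash each word into one of `dim` buckets, count,
    # then normalize to unit length. A real system calls an embedding model.
    vec = [0.0] * dim
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]

def chunk(document):
    # Naive chunking: split on blank lines (real pipelines are smarter).
    return [p.strip() for p in document.split("\n\n") if p.strip()]

def ingest(document, source):
    # Produce records shaped like the example above: id, vector, text, metadata.
    return [
        {
            "id": f"chunk_{i + 1}",
            "vector": embed(text),
            "text": text,
            "metadata": {"source": source},
        }
        for i, text in enumerate(chunk(document))
    ]

doc = "Phone bill reimbursement 101.\n\nYou need to first create an account."
records = ingest(doc, "www.wiki.company.com/internal/expense/reimbursement")
```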
2. Retrieve relevant information when a user queries
For example, when a user asks “How can I expense my phone bills?”, it will be converted to a vector like [1.5, 2.7, -1.02 …].
The vector will then be used to find and retrieve the relevant information in the vector database.
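A minimal brute-force retrieval sketch, with a made-up in-memory database and hand-picked vectors; the dot product serves as the similarity score (on unit-length vectors it equals cosine similarity).

```python
# Made-up records standing in for a real vector database.
database = [
    {"id": "chunk_1", "vector": [0.9, 0.1, 0.0],
     "text": "Phone bill reimbursement 101. You need to first create an account in ..."},
    {"id": "chunk_2", "vector": [0.0, 0.2, 0.9],
     "text": "Office seating map for the 5th floor."},
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_vector, top_k=1):
    # Rank every record by similarity to the query and keep the best top_k.
    ranked = sorted(database, key=lambda r: dot(query_vector, r["vector"]),
                    reverse=True)
    return ranked[:top_k]

# Hypothetical vector for "How can I expense my phone bills?"
results = retrieve([0.8, 0.2, 0.1])
```

Real vector databases avoid this full scan with an index, but the ranking idea is the same.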
3. Talking to the LLM
The original prompt will be augmented, becoming something like
"UserQuestion": "How can I expense my phone bills?"
"Context": "Phone bill reimbursement 101. You need to first create an account in ..." // your private data
The LLM will then be able to answer your question using data it has never been trained on.
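A sketch of this augmentation step; call_llm is a hypothetical stand-in for whatever chat/completions API you actually use.

```python
def build_prompt(question, retrieved_chunks):
    # Paste the retrieved private text in front of the user's question.
    context = "\n\n".join(chunk["text"] for chunk in retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

chunks = [{"text": "Phone bill reimbursement 101. You need to first create an account in ..."}]
prompt = build_prompt("How can I expense my phone bills?", chunks)
# The augmented prompt is then sent to the model, e.g. call_llm(prompt).
```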
Other topics
Can the data in the database get stale?
Yes. It's a challenging but well-known problem for any task that requires preprocessing, for example keeping Google's search index up to date.
Usually people just do an async update every once in a while, depending on how fresh you want your data to be.
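One simple way to decide when a source needs re-embedding, sketched with hypothetical names and timestamps:

```python
import time

last_refreshed = {}  # source URL -> epoch seconds of the last embedding run

def needs_refresh(source, last_modified, max_age=24 * 3600, now=None):
    # Re-embed if the source changed after our last run, or if the data is
    # older than the freshness budget we chose (default: one day).
    now = time.time() if now is None else now
    refreshed = last_refreshed.get(source, 0.0)
    return last_modified > refreshed or now - refreshed > max_age
```

An async job can loop over all sources periodically, calling this check and re-running ingestion for anything stale.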
How to find relevant information given a query’s vector
Apparently we want to find the few chunks that are most similar to our query.
To measure the similarity between 2 vectors,
Cosine similarity is the standard way in RAG (it measures the angle between the vectors).
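Cosine similarity in a few lines: the dot product of two vectors divided by the product of their lengths.

```python
import math

def cosine_similarity(a, b):
    # 1.0 = same direction (same meaning), 0.0 = orthogonal (unrelated).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0: identical direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0: orthogonal
```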
Scanning through the whole database is going to be slow.
A vector index is the key to making it fast enough for real-time apps.
The core idea is to group the similar vectors together.
Hierarchical Navigable Small World (HNSW) is one of the most popular solutions.
A graph of nodes is built. Each node (vector) is connected to its most similar nodes.
- A query can start at a random or a designated entry node
- Check if any of the neighbors are closer to the target (query)
- Move to the closest such neighbor
- Repeat until no improvement
Instead of scanning everything, we walk towards the target.
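The walk above can be sketched on a single layer (real HNSW adds a hierarchy of layers to pick good entry points); the graph and vectors here are made up for illustration.

```python
import math

# Toy 2-D vectors and a hand-built neighbor graph.
vectors = {
    "A": [0.0, 0.0], "B": [1.0, 0.0], "C": [2.0, 0.5],
    "D": [3.0, 1.0], "E": [0.5, 2.0],
}
neighbors = {
    "A": ["B", "E"], "B": ["A", "C"], "C": ["B", "D"],
    "D": ["C"], "E": ["A"],
}

def greedy_search(query, entry="A"):
    current = entry
    while True:
        # Check if any neighbor is closer to the query than where we are.
        best = min(neighbors[current],
                   key=lambda n: math.dist(vectors[n], query))
        if math.dist(vectors[best], query) < math.dist(vectors[current], query):
            current = best   # move towards the target
        else:
            return current   # no improvement: stop

closest = greedy_search([2.8, 0.9])  # walks A -> B -> C -> D
```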
How is it different from fine-tuning?
Fine-tuning changes the weights in the LLM. It requires retraining or continued training, which can be costly.
But it also has some advantages over RAG; for example, you can change the weights so that the LLM always answers like a lawyer.
There are trade-offs, but choose RAG when
- Updates frequently: training is too costly
- Factual accuracy is important: RAG helps by grounding answers in real, retrieved text
- Traceability: this is actually really important; you have real data in your database, so you can trace the source of the LLM's answer.
- Fast iteration: change your database records and the answer from LLM will be different immediately.
Summary
So basically the LLM in a RAG architecture doesn't change at all; its job is still generating outputs given inputs.
There's just one extra step before that to add more context to the LLM's input.