Most retrieval-augmented generation (RAG) pipelines still think like file utilities.
We read a file, extract text, split it into chunks, generate embeddings, and push those chunks into a vector database. That workflow is good enough for many systems today, which is exactly why it has become the default. It is simple, practical, and easy to explain.
But I do not think it is the right long-term abstraction.
If we want the next generation of RAG systems to behave less like glorified semantic search and more like working memory for software, products, and organizations, then we need to rethink data ingestion from the ground up. In Go terms, I think future RAG needs more than os.ReadFile.
TL;DR: Current RAG pipelines are optimized to read files, extract text, and chunk it later. That is good enough for many systems today, but it is not enough for future RAG. The next generation of ingestion needs to capture structure, provenance, relationships, and memory-ready context at read time, not after meaning has already been flattened away.
os.ReadFile Is a Great API for the Problem It Solves
The Go documentation for os.ReadFile is beautifully straightforward. It reads the named file and returns its contents as []byte. That is the whole point.
```go
func ReadFile(name string) ([]byte, error)
```
This is a very good abstraction for traditional software.
If your program needs the contents of a config file, a Markdown document, a JSON payload, or a source file, os.ReadFile gives you exactly what you asked for: bytes. Nothing more. Nothing less. It is intentionally not trying to understand the meaning of those bytes. Interpretation happens later, in the layers above it.
That design is correct.
So this is not an argument that os.ReadFile is broken. It is an argument that the abstraction we use for AI ingestion is still too close to os.ReadFile thinking. We are still saying, “Give me the raw content first, and I will try to reconstruct meaning afterward.”
That worked well in the pre-AI world.
I do not think it will be enough in the AI world.
Reading Bytes Is Not the Same as Building Memory
The core problem is simple:
reading content is not the same as forming memory from content.
Future RAG systems will not just need to retrieve strings. They will need to retrieve context with structure, lineage, permissions, relationships, time, and meaning preserved.
That changes the job of ingestion.
Today, ingestion is often treated like a preprocessing step. Read the file. Clean the text. Chunk it. Embed it. Store it. In many systems, the actual “understanding” is deferred until retrieval time, when the model tries to make sense of whatever chunks it got back.
That is a weak place to do most of the thinking.
If the system throws away too much structure during ingestion, retrieval is forced to recover meaning from fragments. It is like photocopying a book into hundreds of disconnected slips of paper and then expecting a model to rebuild the author’s mental model on demand.
A Person With Photographic Memory Does Not Read With Pen and Paper
The analogy that keeps coming back to me is this:
If a book is given to a person with photographic memory, they do not need to read it with pen and paper in hand just to preserve the important parts. The act of reading is already creating memory.
Our current RAG pipelines often behave like the opposite.
They read first, then try to create memory later by cutting the material into chunks and attaching vectors to it. That is closer to reading with a stack of index cards than reading with deep memory formation.
The deeper point is not about compression. It is about when meaning gets captured.
If an AI system is going to treat a document, codebase, conversation, policy, or spreadsheet as usable memory, then the act of ingestion should already be encoding the parts that matter:
- structural boundaries
- section hierarchy
- authorship and source location
- timestamps and version history
- entities and relationships
- permissions and visibility rules
- modality hints such as table, code, prose, transcript, or screenshot
By the time retrieval happens, the system should not be encountering raw text for the first time. It should be working with memory objects that were created deliberately.
Current Ingestion Pipelines Throw Away Too Much
This is where I think today’s techniques are too shallow for where RAG is going.
The default pipeline usually looks something like this:
- Read the file into memory.
- Convert it to plain text.
- Split it into fixed or semi-fixed chunks.
- Generate embeddings.
- Store those chunks somewhere searchable.
That pipeline is attractive because it is general. It works across many file types and many vector databases. But it also strips away exactly the information that future systems will care about most.
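To make the critique concrete, here is a minimal sketch of that default pipeline. The fixed chunk size and the file name are placeholders, and a real system would call an embedding model where the comment indicates; what matters is that structure never enters the picture.

```go
package main

import (
	"fmt"
	"os"
)

// chunk splits text into fixed-size pieces, ignoring headings,
// function boundaries, and every other structural signal.
func chunk(text string, size int) []string {
	var chunks []string
	runes := []rune(text)
	for start := 0; start < len(runes); start += size {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
	}
	return chunks
}

func main() {
	// Step 1: read the file into memory.
	data, err := os.ReadFile("spec.md")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	// Steps 2-5: treat it as plain text, slice it, then (in a real
	// system) embed each slice and store it somewhere searchable.
	for i, c := range chunk(string(data), 512) {
		fmt.Printf("chunk %d: %d chars\n", i, len(c))
	}
}
```

By the time the chunks exist, the source's section hierarchy and provenance are already gone.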
For example, if you ingest a source file this way, you often lose the distinction between:
- package-level intent
- import relationships
- function boundaries
- comments versus implementation
- public API surface versus internal helper logic
If you ingest a product spec this way, you often lose:
- which section defines requirements
- which paragraph contains constraints
- which text is a future idea versus a committed decision
- who authored the change
- what changed between versions
If you ingest a handbook or policy document this way, you can lose the document’s real operational shape. The heading structure, exception rules, effective dates, and scope boundaries often matter more than the raw text alone.
This is not just a philosophical complaint. AWS’s RAG guidance explicitly calls out lack of structure and metadata in raw documents as a major source-data problem for retrieval systems, and Microsoft’s RAG chunking guidance makes the same point from another angle: chunk quality depends on preserving semantically coherent units rather than blindly slicing text.
Future RAG Is a Memory Architecture Problem
I think the industry still talks about RAG as if it is mainly a retrieval problem.
It is not.
The future version of RAG is really a memory architecture problem.
The retrieval step still matters, of course. But retrieval quality is downstream of what the system decided to preserve during ingestion. If the system never captured structure, provenance, or relationships in the first place, no retriever can invent them later without guessing.
That matters more as RAG systems become:
- more agentic
- more real-time
- more multimodal
- more organization-specific
- more tightly connected to operational workflows
In that world, the knowledge base is not just a place to store chunks. It starts to look more like a memory layer for software.
And memory layers need richer inputs than raw bytes.
We Need an AI-Era Ingestion Library
I do not mean os.ReadFile itself should become intelligent. That would be the wrong place for this logic.
What I mean is that we need a library one layer above it, designed for AI-native systems the same way other abstractions were designed for earlier computing eras. os.ReadFile gave us a clean interface for file contents. database/sql gave us a cleaner interface for structured persistence. Future RAG needs a similarly natural abstraction for knowledge ingestion.
That library should not merely return []byte or even plain text. It should expose a way to create memory objects while reading.
Conceptually, it could look more like this:
```go
type KnowledgeSink interface {
	Upsert(ctx context.Context, memory MemoryUnit) error
}

type MemoryUnit struct {
	ID          string
	Content     string
	Summary     string
	SourcePath  string
	SectionPath []string
	Entities    []string
	Relations   []string
	UpdatedAt   time.Time
	Visibility  []string
}

type IngestionReader interface {
	ReadIntoMemory(ctx context.Context, path string, sink KnowledgeSink) error
}
```
The exact interface does not matter as much as the shift in thinking.
The point is that reading should already be knowledge-aware.
When a Markdown document is read, the system should know where the headings are, what section a paragraph belongs to, and what source anchor should be used for citation. When code is read, the system should know symbols, imports, package boundaries, and docstrings. When a spreadsheet is read, the system should understand tabs, headers, and cell neighborhoods. When a policy changes, the system should understand diffs, not just re-embed another bag of text.
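For the Markdown case, a heading-aware reader in this spirit might look like the sketch below. The types are pared down from the interfaces above for brevity, and the emit callback stands in for a real knowledge sink; the section-path tracking is the point.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// MemoryUnit is a stripped-down version of the struct sketched above.
type MemoryUnit struct {
	Content     string
	SectionPath []string
}

// readMarkdown walks the document line by line, maintaining the current
// heading stack so every paragraph is emitted with its section lineage
// instead of as an anonymous chunk.
func readMarkdown(src string, emit func(MemoryUnit)) {
	var path []string // current heading hierarchy
	var buf []string  // lines of the paragraph being accumulated
	flush := func() {
		if len(buf) > 0 {
			emit(MemoryUnit{
				Content:     strings.Join(buf, "\n"),
				SectionPath: append([]string(nil), path...),
			})
			buf = nil
		}
	}
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.HasPrefix(line, "#"):
			flush()
			level := len(line) - len(strings.TrimLeft(line, "#"))
			title := strings.TrimSpace(strings.TrimLeft(line, "#"))
			if level-1 < len(path) {
				path = path[:level-1] // pop back to the parent level
			}
			path = append(path, title)
		case strings.TrimSpace(line) != "":
			buf = append(buf, line)
		default:
			flush() // blank line ends the paragraph
		}
	}
	flush()
}

func main() {
	doc := "# Policy\n\nIntro text.\n\n## Exceptions\n\nContractors are excluded."
	readMarkdown(doc, func(m MemoryUnit) {
		fmt.Printf("%v -> %q\n", m.SectionPath, m.Content)
	})
}
```

Each emitted unit already knows which section it belongs to, so a retriever can cite "Policy > Exceptions" instead of guessing from a bare string.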
That is a very different worldview from “read everything, strip it down, and hope chunking saves us.”
Why Connecting the Reader to the Knowledge Base Matters
If the read layer can expose an interface directly to the knowledge base, a lot of downstream problems become easier to solve.
Not because it magically solves retrieval, but because it prevents information loss at the earliest possible moment.
Instead of forcing every pipeline to go through the same crude sequence of text extraction, chunking, and embedding, the ingestion layer could:
- preserve provenance from the beginning
- emit structured memory units instead of anonymous chunks
- decide whether something should be embedded, graphed, summarized, diffed, or linked
- update memory incrementally when the source changes
- retain multiple retrieval shapes for the same source
That last point is important.
One source may need to exist in several forms at once:
- a dense embedding chunk for semantic retrieval
- a structured representation for rule-based lookup
- a graph edge for entity relationships
- a citation anchor for grounded answers
- a diff-aware history record for temporal questions
Current ingestion pipelines often flatten all of that into one representation too early.
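The "multiple retrieval shapes" idea can be sketched as a sink that fans one memory unit out to several stores. The stores here are in-memory placeholders, and the field names are illustrative; a real system would write to a vector database, a graph store, and so on.

```go
package main

import "fmt"

type MemoryUnit struct {
	ID       string
	Content  string
	Entities []string
}

// MultiSink routes a single upsert to every registered shape, so one
// read of the source produces several retrieval representations.
type MultiSink struct {
	Embeddings map[string]string   // stand-in for an embedded-chunk store
	GraphEdges map[string][]string // entity -> IDs of units mentioning it
}

func NewMultiSink() *MultiSink {
	return &MultiSink{
		Embeddings: map[string]string{},
		GraphEdges: map[string][]string{},
	}
}

func (s *MultiSink) Upsert(m MemoryUnit) error {
	// Shape 1: a dense chunk for semantic retrieval (content stands in
	// for a vector here).
	s.Embeddings[m.ID] = m.Content
	// Shape 2: graph edges for entity-relationship lookup.
	for _, e := range m.Entities {
		s.GraphEdges[e] = append(s.GraphEdges[e], m.ID)
	}
	return nil
}

func main() {
	sink := NewMultiSink()
	sink.Upsert(MemoryUnit{
		ID:       "spec#3",
		Content:  "Latency must stay under 200ms.",
		Entities: []string{"latency"},
	})
	// The same read produced both an embeddable chunk and a graph edge.
	fmt.Println(sink.GraphEdges["latency"])
}
```

Because the fan-out happens inside the sink, the reader never has to pick one representation up front, which is exactly the premature flattening the text warns about.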
The Real Limitation Is Not File Reading. It Is Meaning Capture
The reason I keep coming back to os.ReadFile is that it shows the contrast so clearly.
os.ReadFile does exactly what it promises. It reads a named file and returns the contents. That is a clean and beautiful contract for systems that treat files as byte containers.
But future RAG systems cannot afford to see the world as byte containers.
They need to see sources as:
- structured knowledge
- evolving memory
- permissioned context
- linked concepts
- retrievable evidence
Once you frame the problem that way, it becomes obvious why current ingestion techniques feel insufficient. They were designed to get data into a system. Future RAG needs ingestion that helps the system understand what the data is.
That is a much bigger job.
Final Thoughts
I do not think the future of RAG will be won by whoever has the cleverest chunking heuristic alone.
Chunking still matters. Embeddings still matter. Rerankers still matter.
But the bigger shift is upstream.
The systems that win will treat ingestion as the first act of memory formation, not as a dumb pipe that moves bytes from a file system into a vector store.
That is why I think future RAG needs more than os.ReadFile.
Not because os.ReadFile is wrong, but because it is right for a different era.
The AI era needs a new layer: an ingestion interface that reads source material the way intelligent systems should read it, by preserving meaning as early as possible.
FAQ
Am I saying os.ReadFile is a bad API?
No. It is a very good API for file I/O. The problem is not the function. The problem is assuming that a byte-level read abstraction is sufficient for knowledge ingestion in AI systems.
Why is chunking alone not enough for future RAG?
Chunking helps retrieval, but chunking happens after the system has already decided what to preserve and what to discard. If structure, provenance, or relationships are lost before chunking, retrieval quality will always be capped.
What should an AI-era ingestion library capture?
At minimum, it should capture content plus structure, source anchors, timestamps, permissions, and relationships. In many cases it should also preserve symbol boundaries, diffs, and modality-specific information.
Should ingestion write directly to a vector database?
Sometimes, but not only that. A better design is to expose a knowledge sink interface so the same read can produce embeddings, graph relationships, citations, summaries, and update records depending on the system’s needs.