I’m exploring efficient approaches for RAG setups, especially whether to store document chunks alongside vectors or in independent storage, and how to manage their metadata.
IMHO, storing vectors with documents simplifies queries but can get messy. Sometimes it's better to keep them separate for solid metadata control. Depends on your project's needs.
Hmm, I truly think merging them can speed up retrieval, but sometimes flexibility suffers. Maybe a hybrid approach works? What do you reckon?
In my experience, designing a RAG system benefits from separating document storage from vector storage when metadata management is a priority. When document chunks are stored independently, it becomes easier to maintain and adjust associated metadata, and this separation also allows the vector database to focus purely on efficient similarity search. This approach provides greater flexibility and scalability, especially as project requirements evolve. Although combining them might simplify queries in some scenarios, the long-term benefits of modularity and more comprehensive data governance help maintain clarity and improve system maintainability.
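To make the separation concrete, here is a rough sketch in plain Python. It is only an illustration under my own assumptions: `VectorIndex` and `DocumentStore` are hypothetical in-memory stand-ins (not any real library's API), the embeddings are hand-written 2-D vectors, and the shared chunk ID is the only link between the two stores.

```python
import math
from dataclasses import dataclass, field


@dataclass
class VectorIndex:
    """Hypothetical toy vector store: holds only chunk IDs and embeddings."""
    vectors: dict = field(default_factory=dict)

    def add(self, chunk_id, embedding):
        self.vectors[chunk_id] = embedding

    def search(self, query, k=2):
        """Return the IDs of the k chunks most similar to the query (cosine)."""
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(
            self.vectors,
            key=lambda cid: cosine(query, self.vectors[cid]),
            reverse=True,
        )
        return ranked[:k]


@dataclass
class DocumentStore:
    """Hypothetical toy document store: chunk text plus metadata, keyed by ID."""
    chunks: dict = field(default_factory=dict)

    def put(self, chunk_id, text, **metadata):
        self.chunks[chunk_id] = {"text": text, "metadata": metadata}

    def update_metadata(self, chunk_id, **changes):
        # Metadata changes never touch the vector index.
        self.chunks[chunk_id]["metadata"].update(changes)

    def get(self, chunk_id):
        return self.chunks[chunk_id]


index = VectorIndex()
docs = DocumentStore()

# Ingest: the same chunk ID is written to both stores.
docs.put("c1", "Refunds are issued within 14 days.", source="faq.md", version=1)
docs.put("c2", "Shipping takes 3 to 5 business days.", source="faq.md", version=1)
index.add("c1", [1.0, 0.0])
index.add("c2", [0.0, 1.0])

# Adjust metadata later without re-indexing any vectors.
docs.update_metadata("c1", version=2)

# Retrieve: similarity search returns IDs, then a second lookup fetches the chunks.
hits = [docs.get(cid) for cid in index.search([0.9, 0.1], k=1)]
```

The cost of this layout is the second lookup after every search; what you get in return is that metadata edits never require touching or rebuilding the vector side.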
I'm kind of leaning towards separate storage for clearer metadata control, though keeping them close might be beneficial for speed, depending on your use case. Modularity usually wins for me.
Hey, I'm thinking a mix can work, but be careful with metadata mishaps. Having separate storage helps with scaling and eases management, though sometimes co-location speeds things up. Anyone tried a hybrid approach? Curious how it played out in your cases.
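For what it's worth, one way a hybrid could look: keep only the small, filterable metadata fields next to each vector, and leave the full text and richer metadata in a separate store. This is just a sketch under my own assumptions; `vector_rows`, `doc_store`, and `retrieve` are made-up names, the data is invented for illustration, and a plain dot product stands in for a real similarity search.

```python
# Hypothetical hybrid layout: the vector side carries one filter field ("lang"),
# while the document side holds the full text and the richer metadata.
vector_rows = [
    {"id": "c1", "embedding": [1.0, 0.0], "lang": "en"},
    {"id": "c2", "embedding": [0.9, 0.1], "lang": "de"},
    {"id": "c3", "embedding": [0.0, 1.0], "lang": "en"},
]

doc_store = {
    "c1": {"text": "Refund policy", "lang": "en", "owner": "support", "version": 3},
    "c2": {"text": "Rückgaberichtlinie", "lang": "de", "owner": "support", "version": 1},
    "c3": {"text": "Shipping times", "lang": "en", "owner": "logistics", "version": 2},
}


def dot(a, b):
    """Dot product used as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query, lang, k=1):
    """Filter on the co-located field, rank by similarity, then fetch full chunks."""
    candidates = [row for row in vector_rows if row["lang"] == lang]
    candidates.sort(key=lambda row: dot(query, row["embedding"]), reverse=True)
    return [doc_store[row["id"]] for row in candidates[:k]]


results = retrieve([1.0, 0.0], lang="en", k=2)
```

The "metadata mishap" risk here is the duplicated `lang` field: it lives in both places, so any update has to be written to both or the filter and the document will disagree.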