RAG System: Store Document Fragments Together with Vectors or Separately?

For a RAG deployment on AWS, is it better to store document fragments alongside their vectors in one system, or separately in S3 or a similar object store? And how should the metadata be managed?

In my experience with RAG deployments on AWS, storing document fragments and their vector embeddings in a single database simplifies retrieval considerably, provided the metadata lives in the same records. This approach usually gives lower latency because a query does not need a second lookup against a separate data source. However, if the volume of document fragments is very high or cost is a major concern, storing the larger document bodies on S3 while keeping vectors and critical metadata in a fast lookup database may be more appropriate. The trade-off is query performance against storage complexity.
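To make that trade-off concrete, here is a minimal sketch of a record layout that supports both options. Everything in it is an assumption for illustration: the `FragmentRecord` fields, the `build_record` helper, the 4 KB threshold, and the plain dict standing in for an S3 bucket.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

# Assumed cut-off; the right value depends on your database and workload.
INLINE_LIMIT_BYTES = 4096


@dataclass
class FragmentRecord:
    fragment_id: str
    embedding: list[float]
    metadata: dict = field(default_factory=dict)
    text: Optional[str] = None        # set when the body is stored inline
    object_key: Optional[str] = None  # set when the body lives in object storage


def build_record(fragment_id, text, embedding, metadata, object_store,
                 limit=INLINE_LIMIT_BYTES):
    """Store small fragments inline and offload large ones.

    `object_store` is a dict used as a stand-in for an S3 bucket.
    """
    size = len(text.encode("utf-8"))
    if size <= limit:
        return FragmentRecord(fragment_id, embedding, dict(metadata), text=text)
    key = f"fragments/{fragment_id}.txt"  # hypothetical key scheme
    object_store[key] = text
    return FragmentRecord(fragment_id, embedding, dict(metadata), object_key=key)
```

Small fragments come back in the same query as the vector hit, and only the large ones cost an extra fetch.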

hey all, i’ve been mulling over this idea too. maybe putting the heavy docs on s3 while keeping vectors and metadata together in a fast db could work well for performance. what about handling retrieval failures? any experiences with that?

imho, storing vectors and fragments in one db simplifies retrieval if the dataset isn't huge, though for big data, offloading full docs to s3 while indexing essential metadata in a fast store is smart. it's all about balancing speed and cost.

Based on my experience designing retrieval systems, a hybrid storage model often gives the best balance of cost and performance. Storing vectors and metadata in a specialized, fast-access database keeps similarity searches efficient, while keeping the full documents in a low-cost storage service such as S3 reduces overall spend. The key is tight integration between the two: every index record should carry a reference to its source document, and writes and deletes need to be applied to both stores. With that in place, metadata management stays simple and retrieval times stay low even at large data volumes. The final design should still be tailored to the specific workload.
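A rough sketch of how the hybrid lookup could work, including one way to cope with a failed document fetch. The record fields (`embedding`, `snippet`, `object_key`), the `retrieve` function, and the dict used in place of S3 are all assumptions for illustration, and the brute-force cosine ranking is a stand-in for whatever vector database you actually use.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec, index, object_store, top_k=3):
    """Rank index records by similarity, then fetch full text by key.

    `index` is a list of dicts; `object_store` is a dict standing in for S3.
    If the full document is missing, fall back to the short snippet kept
    in the index and flag the result as degraded.
    """
    ranked = sorted(index, key=lambda r: cosine(query_vec, r["embedding"]),
                    reverse=True)
    results = []
    for rec in ranked[:top_k]:
        body = object_store.get(rec["object_key"])
        results.append({
            "id": rec["id"],
            "metadata": rec["metadata"],
            "text": body if body is not None else rec["snippet"],
            "degraded": body is None,
        })
    return results
```

Keeping a short snippet next to the vector costs a little storage but means a missing or slow object fetch degrades the answer instead of failing the request.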

hey, i’ve found that keeping vectors and metadata in a fast db, while placing full docs on s3, works well. syncing issues can arise, but test small first to catch any hiccups. it’s not flawless but it works in my case.
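for the syncing part, a small drift check run on a test batch can catch most hiccups. this is only a sketch: `find_sync_drift` and the key names are made up, and in practice the two key lists would come from your vector db and from listing the bucket.

```python
def find_sync_drift(index_keys, stored_keys):
    """Compare object keys referenced by the index with keys actually stored.

    Returns keys the index points at that are missing from storage
    ("dangling") and stored objects that no index record references
    ("unindexed").
    """
    index_keys = set(index_keys)
    stored_keys = set(stored_keys)
    return {
        "dangling": sorted(index_keys - stored_keys),
        "unindexed": sorted(stored_keys - index_keys),
    }


report = find_sync_drift(
    index_keys=["docs/a.txt", "docs/b.txt"],
    stored_keys=["docs/b.txt", "docs/c.txt"],
)
print(report)
```

dangling keys mean a retrieval would fail to load the doc, unindexed keys mean you're paying to store something nobody can find.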