RAG Implementations: Should Document Chunks Be Stored with Vectors or Separately?

Implementing a RAG solution on AWS: Should document fragments be stored with your vector data or in separate systems? I’m seeking insights on storage strategies and metadata linkage.

hey iris72, i feel storing doc chunks with the vector data keeps things simple, though it might limit scaling flexibility. what have you seen in real setups? i'm curious whether others found that splitting storage makes it easier to update the metadata links. any experience to share?

hey iris72, i lean more toward combined storage at small scale. it makes managing metadata and updates easier. but if you plan to scale drastically, separating the two can ease maintenance in the long run. try both in small tests to see what works for ya.

Drawing from my hands-on experience, storing document fragments together with vector data keeps each chunk and its embedding consistent by construction, which works well when metadata changes are infrequent. However, this approach can become limiting as the project scales and the metadata grows more complex. When metadata operations become more critical, separating the two adds flexibility and simplifies maintenance: metadata can be updated independently of the vectors and managed with richer tooling. Weighing expected growth and update patterns during initial design can prevent major restructuring later in the lifecycle.
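
To make the two layouts concrete, here is a minimal in-memory sketch in Python. It is not tied to any AWS service: the record shapes, the names `colocated`, `vector_index`, and `chunk_store`, and the sample chunks are all assumptions for illustration, and a brute-force cosine ranking stands in for a real vector index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Layout A (hypothetical): vector, chunk text, and metadata in one record.
colocated = [
    {"id": "c1", "vector": [1.0, 0.0], "text": "Chunk about S3 pricing",
     "meta": {"source": "pricing.md"}},
    {"id": "c2", "vector": [0.0, 1.0], "text": "Chunk about IAM roles",
     "meta": {"source": "iam.md"}},
]


def search_colocated(query_vec, k=1):
    # One lookup returns everything needed to build the prompt.
    ranked = sorted(colocated,
                    key=lambda r: cosine(query_vec, r["vector"]),
                    reverse=True)
    return [(r["id"], r["text"]) for r in ranked[:k]]


# Layout B (hypothetical): the vector index holds only ids and vectors;
# text and metadata live in a separate store keyed by the same id.
vector_index = {"c1": [1.0, 0.0], "c2": [0.0, 1.0]}
chunk_store = {
    "c1": {"text": "Chunk about S3 pricing", "meta": {"source": "pricing.md"}},
    "c2": {"text": "Chunk about IAM roles", "meta": {"source": "iam.md"}},
}


def search_separate(query_vec, k=1):
    # Step 1: nearest-neighbour search returns ids only.
    ranked = sorted(vector_index,
                    key=lambda cid: cosine(query_vec, vector_index[cid]),
                    reverse=True)
    # Step 2: a second lookup resolves ids to chunk text.
    return [(cid, chunk_store[cid]["text"]) for cid in ranked[:k]]
```

Both functions return the same chunks for the same query. The difference is that Layout B needs a second lookup per result and a shared id to link the two stores, in exchange for being able to change `chunk_store` without touching the vectors.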

hey iris72, i reckon bundling them keeps queries simple for starters, although decoupling can aid flexibility later. has anyone here tried a mixed approach? i'm curious how it affected performance and metadata updates in your experiments.

hey iris72, in my experience keeping vectors and docs together works well for small-scale setups. but if you foresee frequent metadata updates, separating them avoids extra load on the vector store. consider your update rate and scale needs when choosing your approach.
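
a rough way to see the "extra load" point is to count writes to the vector side. this is a toy model in Python, not a benchmark: `ColocatedStore`, `SeparateStore`, and `simulate` are made-up names, and it assumes the combined store can only rewrite a whole record when metadata changes. some vector stores support partial metadata updates, in which case the gap shrinks.

```python
class ColocatedStore:
    """Toy store: vector, text, and metadata live in one record."""

    def __init__(self):
        self.records = {}
        self.vector_writes = 0  # how often a vector gets (re)written

    def upsert(self, chunk_id, vector, text, meta):
        self.records[chunk_id] = {
            "vector": list(vector), "text": text, "meta": dict(meta)}
        self.vector_writes += 1

    def update_meta(self, chunk_id, meta):
        # Assumption: no partial update, so the whole record is re-upserted.
        rec = self.records[chunk_id]
        self.upsert(chunk_id, rec["vector"], rec["text"],
                    {**rec["meta"], **meta})


class SeparateStore:
    """Toy store: vectors in one map, text and metadata in another."""

    def __init__(self):
        self.vectors = {}
        self.docs = {}
        self.vector_writes = 0

    def upsert(self, chunk_id, vector, text, meta):
        self.vectors[chunk_id] = list(vector)
        self.vector_writes += 1
        self.docs[chunk_id] = {"text": text, "meta": dict(meta)}

    def update_meta(self, chunk_id, meta):
        # Only the document side changes; the vector is left alone.
        self.docs[chunk_id]["meta"].update(meta)


def simulate(store, n_updates):
    """Insert one chunk, apply n_updates metadata changes, return vector writes."""
    store.upsert("c1", [0.1, 0.2], "example chunk", {"version": 0})
    for version in range(1, n_updates + 1):
        store.update_meta("c1", {"version": version})
    return store.vector_writes
```

under that assumption, `simulate(ColocatedStore(), 100)` does 101 vector writes and `simulate(SeparateStore(), 100)` does 1. plug in your own update rate to judge whether the difference matters at your scale.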