Can a Git repo serve as a document database?

Lu_57Read · April 13, 2025, 8:16am

I’m working on a project that needs to manage a large number of structured documents. We’re talking about a tree with around 1,000 categories, each holding up to 10,000 docs. These docs are a few KB each, probably in YAML or JSON.

The system needs to:

Fetch docs by ID
Search docs based on their content
Allow editing with change tracking
Show edit history (who, when, why)

I know using a doc database like MongoDB is the usual way. But I had this crazy idea: why not use Git as the backend?

Here’s my rough plan:

Use folders for categories and files for documents
Retrieve documents by reading files directly
Treat each edit as a commit to track changes
Get history from Git logs
Implement search by exporting data to a conventional database
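
To make the plan concrete, here is a minimal sketch of the first four points in Python, driving the regular git command line through subprocess. The folder layout (category/doc_id.json), the function names, and the log format are illustrative assumptions rather than a worked-out design, and it assumes the repo already exists and git has a committer identity configured.

```python
import subprocess
from pathlib import Path

# Assumed layout (hypothetical): <repo>/<category>/<doc_id>.json
# Fields: commit hash, author name, ISO 8601 author date, subject line,
# separated by the 0x1f byte (git expands %x1f in format strings).
LOG_FORMAT = "%H%x1f%an%x1f%aI%x1f%s"


def doc_path(repo: Path, category: str, doc_id: str) -> Path:
    """Map a document ID to its file inside the repository."""
    return repo / category / f"{doc_id}.json"


def _git(repo: Path, *args: str) -> str:
    """Run a git command inside the repo and return its stdout."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def save_doc(repo: Path, category: str, doc_id: str,
             text: str, author: str, reason: str) -> None:
    """Write one document and record the edit as a single commit.

    `author` must look like "Name <email>". If the content did not
    change, git exits non-zero and CalledProcessError is raised.
    """
    path = doc_path(repo, category, doc_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    rel = path.relative_to(repo).as_posix()
    _git(repo, "add", "--", rel)
    _git(repo, "commit", "-m", reason, f"--author={author}", "--", rel)


def parse_log(output: str) -> list[dict[str, str]]:
    """Turn `git log --format=LOG_FORMAT` output into who/when/why records."""
    entries = []
    for line in output.split("\n"):
        if not line:
            continue
        commit, who, when, why = line.split("\x1f", 3)
        entries.append({"commit": commit, "who": who, "when": when, "why": why})
    return entries


def doc_history(repo: Path, category: str, doc_id: str) -> list[dict[str, str]]:
    """Edit history of one document, newest first."""
    rel = doc_path(repo, category, doc_id).relative_to(repo).as_posix()
    return parse_log(_git(repo, "log", f"--format={LOG_FORMAT}", "--", rel))
```

Fetching a doc by ID is then just doc_path(...).read_text(), and the history call gives the who, when, and why for each edit. Spawning a git process per edit is the simplest thing that works, but it is also one of the first places I would expect throughput to suffer.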

Has anyone tried this approach before? How significant might the performance impact be and can it scale effectively?

I’m curious to hear if this method might actually work or if it’s bound to run into issues.

ClimbingMonkey · April 22, 2025, 6:22pm

Hmm, that’s an interesting approach! Have you thought about how you’d handle concurrent edits? Git’s great for version control, but it gets tricky when multiple users make changes at the same time, since commits to the same branch have to be serialized or merged. What about another distributed version control system like Mercurial? It might handle branching better. Just curious, what’s your main reason for considering Git over a traditional database?
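
One way the concurrent-edit problem could be handled is an optimistic check: the editor remembers which version of the doc it loaded, and the save is rejected if the file has changed since. The sketch below is only an illustration under that assumption. EditConflict and apply_edit are made-up names, and the version token is the blob ID that Git computes for the file content in a SHA-1 repository.

```python
import hashlib
from pathlib import Path


class EditConflict(Exception):
    """Raised when the document changed since the editor loaded it."""


def git_blob_id(data: bytes) -> str:
    """Object ID git assigns to this file content (SHA-1 repositories)."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def apply_edit(path: Path, new_text: str, base_id: str) -> str:
    """Write new_text only if the file still matches the version the
    editor started from; return the blob ID of the new content.

    Note: the check and the write are not atomic here. A real server
    would need a lock (or a single writer) around this function.
    """
    current = path.read_bytes()
    if git_blob_id(current) != base_id:
        raise EditConflict(f"{path} was modified by someone else")
    data = new_text.encode("utf-8")
    path.write_bytes(data)
    return git_blob_id(data)
```

The caller would commit after a successful apply_edit and show the user a merge or retry prompt on EditConflict.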

ZoeString42 · April 22, 2025, 7:27am

Interesting idea, but Git might struggle with performance at that scale (up to 10 million small files in one repo). Have you considered using Git for version control and a separate DB for querying? Could be a nice hybrid solution. Worth testing to see if it meets your needs. Good luck with your project!

Jasper_Witty · April 21, 2025, 7:33pm

While using Git as a document database is an intriguing idea, it’s important to consider the potential drawbacks. Git wasn’t designed for this purpose, so you might run into performance issues at this scale, especially with searches. Git keeps no index over file contents, so a content search (for example with git grep) has to read through the files, which could slow queries significantly.

That said, Git does offer excellent version control and change tracking out of the box. If these features are crucial to your project, it might be worth exploring. However, for optimal performance and scalability, you may want to consider a hybrid approach. Use Git for version control and history tracking, but implement a separate database or search engine for efficient querying and retrieval.
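
As a rough illustration of the hybrid idea, the sketch below rebuilds a small SQLite table from the files in the working tree and queries that table instead of Git. The table name, the category/doc_id.json layout, and the function names are assumptions made for the example. The LIKE query still scans the table, so a real setup would more likely use a full-text index or a dedicated search engine.

```python
import json
import sqlite3
from pathlib import Path


def build_index(conn: sqlite3.Connection, repo: Path) -> int:
    """Rebuild the search table from <repo>/<category>/<doc_id>.json files.

    Returns the number of documents indexed.
    """
    conn.execute("DROP TABLE IF EXISTS docs")
    conn.execute(
        "CREATE TABLE docs ("
        " category TEXT, doc_id TEXT, body TEXT,"
        " PRIMARY KEY (category, doc_id))"
    )
    rows = []
    for path in sorted(repo.glob("*/*.json")):
        if path.parent.name.startswith("."):
            continue  # skip .git and other hidden folders
        doc = json.loads(path.read_text(encoding="utf-8"))
        body = json.dumps(doc, sort_keys=True, ensure_ascii=False)
        rows.append((path.parent.name, path.stem, body))
    conn.executemany("INSERT INTO docs VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def search(conn: sqlite3.Connection, term: str) -> list[tuple[str, str]]:
    """Return (category, doc_id) pairs whose JSON text contains term.

    The term is used inside a LIKE pattern, so % and _ act as wildcards.
    """
    cur = conn.execute(
        "SELECT category, doc_id FROM docs"
        " WHERE body LIKE ? ORDER BY category, doc_id",
        (f"%{term}%",),
    )
    return cur.fetchall()
```

Git stays the source of truth in this setup. The index can be thrown away and rebuilt at any time, for example after each commit or on a schedule.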

Ultimately, the feasibility depends on your specific use case and performance requirements. It’s an interesting concept, but careful benchmarking would be essential before committing to this approach in a production environment.