GraphRAG sounds elegant in theory: build a knowledge graph from your documents, traverse it intelligently, get better answers than vector search alone.
Then you look at the compute requirements.
Processing a single 100-page PDF means thousands of API calls, millions of similarity computations, and hours of wall-clock time. For a corpus of 10,000 documents, the infrastructure requirements balloon: parallel execution across heterogeneous compute (PDF parsing needs memory, embedding calls are I/O-bound, concept extraction needs CPU), durable checkpointing so failures don't restart everything, and job orchestration that actually understands dependencies.
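To make the scale concrete, here is a back-of-envelope sketch; the stage profile and per-page call multipliers are assumptions for illustration, not measurements of any real pipeline.

```python
# Rough resource profile behind a GraphRAG ingestion pipeline.
# Stage names and multipliers are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    bottleneck: str       # dominant resource for this stage
    calls_per_page: int   # rough API-call multiplier

STAGES = [
    Stage("pdf_parse", "memory", 1),         # layout models hold pages in RAM
    Stage("chunk_embed", "network I/O", 3),  # batched embedding API calls
    Stage("concept_extract", "CPU", 10),     # LLM entity/relation extraction
]

def estimate_calls(pages: int) -> int:
    """Back-of-envelope API-call count for one document."""
    return sum(s.calls_per_page for s in STAGES) * pages

print(estimate_calls(100))           # ~1,400 calls for one 100-page PDF
print(estimate_calls(100) * 10_000)  # ~14M across a 10,000-document corpus
```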
The DIY tax
Building this yourself means assembling Kubernetes for container orchestration, Celery and Redis for task queuing, Spark for the actual parallel compute, and Postgres for job state. Each system was built around different assumptions. Kubernetes manages containers, not computations. Celery distributes tasks but doesn't understand data locality or aggregation. Spark wants to own the entire pipeline. The result: hundreds of lines of glue code bridging systems that don't integrate cleanly.
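A taste of that glue, using Celery's real fan-out primitives but a job table invented for illustration: results ship through Redis, and job state gets mirrored into Postgres by hand, because neither system knows about the other.

```python
# Sketch of DIY glue: Celery fans out per-document work over Redis,
# then a callback manually syncs job state into Postgres.
import psycopg2
from celery import Celery, chord

app = Celery("graphrag",
             broker="redis://localhost:6379/0",
             backend="redis://localhost:6379/1")

@app.task
def parse_pdf(doc_id: str) -> str:
    # Memory-heavy parsing elided; routing this to big-RAM workers
    # takes separate queue and Kubernetes node-pool configuration.
    return doc_id

@app.task
def build_graph(parsed_ids: list[str]) -> None:
    # Aggregation step: Celery has no notion of data locality, so every
    # upstream result travels through the Redis backend first.
    with psycopg2.connect("dbname=jobs") as conn:  # assumed job-state table
        conn.cursor().execute(
            "UPDATE jobs SET status = 'parsed' WHERE id = ANY(%s)",
            (parsed_ids,),
        )

# chord = fan out the header, run the callback once every parse finishes
chord(parse_pdf.s(d) for d in ["doc-1", "doc-2"])(build_graph.s())
```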
The timing problem compounds this. Kubernetes Cluster Autoscaler checks for unschedulable pods every 10 seconds, then provisions nodes that take 30-60 seconds to come online. For a GraphRAG pipeline that needs 500 workers immediately, that's minutes of latency before work starts. The autoscaler prioritizes stability over speed, a reasonable tradeoff for web services but painful for batch processing.
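The arithmetic, under assumed round numbers (10 s scan interval, 45 s node boot, 30 s image pull, 8 worker pods per node):

```python
# Best-case cold-start latency; all constants are assumptions.
SCAN_S, BOOT_S, PULL_S, PODS_PER_NODE = 10, 45, 30, 8

def first_work_latency(workers: int) -> tuple[int, int]:
    nodes = -(-workers // PODS_PER_NODE)  # ceil: 500 workers -> 63 nodes
    return nodes, SCAN_S + BOOT_S + PULL_S

print(first_work_latency(500))  # (63, 85): ~1.5 min before any pod runs,
                                # and real scale-ups take several waves
```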
This is why most GraphRAG implementations stay as notebooks. The infrastructure tax is too high.
Object storage as compute substrate
Tensorlake claims to replace the entire stack with object-store-native compute. Their Document Ingestion API processes millions of documents monthly, treating object storage (think MinIO or S3) as both data layer and coordination mechanism. Write your workflow as if it runs on a single machine; in production, it transparently distributes across CPUs and GPUs.
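What that programming model might look like, as a hypothetical sketch: the `remote` decorator, its resource parameters, and the placeholder bodies below are invented for illustration, not Tensorlake's actual SDK.

```python
# Invented decorator standing in for a platform that ships functions to
# workers matching a resource profile and checkpoints outputs to object
# storage. Locally it is a no-op, so the pipeline reads single-machine.
from typing import Iterator

def remote(cpu: int = 1, gpu: int = 0, memory_gb: int = 1):
    def wrap(fn):
        return fn  # a real runtime would register fn for remote dispatch
    return wrap

@remote(memory_gb=8)   # parsing wants RAM
def parse(pdf_key: str) -> list[str]:
    return ["chunk-0", "chunk-1"]   # placeholder chunks

@remote(gpu=1)         # embedding wants accelerators
def embed(chunk: str) -> list[float]:
    return [0.0] * 768              # placeholder embedding

def pipeline(pdf_keys: Iterator[str]):
    # Plain loop locally; a distributed runtime would fan out each call
    # and persist results to the object store as durable checkpoints.
    for key in pdf_keys:
        for chunk in parse(key):
            yield embed(chunk)

list(pipeline(["report.pdf"]))
```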
The approach mirrors recent GraphRAG experiments: MinIO-based ingestion pipelines for parallel processing, AWS's GraphRAG Toolkit combining Neptune and OpenSearch Serverless, and lightweight forks like nano-graphrag that swap storage backends (NetworkX, Neo4j, HNSW) without rewriting logic.
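The backend-swapping pattern is straightforward to picture as a storage protocol that the graph logic codes against; the classes below are illustrative, not nano-graphrag's actual ones.

```python
# Graph logic codes against a protocol; backends are interchangeable.
from typing import Protocol
import networkx as nx

class GraphStore(Protocol):
    def add_edge(self, src: str, dst: str, relation: str) -> None: ...
    def neighbors(self, node: str) -> list[str]: ...

class NetworkXStore:
    """In-process backend for development; a Neo4j- or HNSW-backed class
    with the same two methods could replace it without touching callers."""
    def __init__(self) -> None:
        self.g = nx.MultiDiGraph()

    def add_edge(self, src: str, dst: str, relation: str) -> None:
        self.g.add_edge(src, dst, relation=relation)

    def neighbors(self, node: str) -> list[str]:
        return list(self.g.successors(node))

def ingest(store: GraphStore, triples: list[tuple[str, str, str]]) -> None:
    for src, rel, dst in triples:
        store.add_edge(src, dst, rel)

store = NetworkXStore()
ingest(store, [("GraphRAG", "builds_on", "knowledge graph")])
print(store.neighbors("GraphRAG"))  # ['knowledge graph']
```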
The skepticism is warranted. Official GraphRAG implementations are notoriously difficult to customize. Database-backed systems (Neo4j for entities, vector stores for embeddings) remain standard for enterprise scale. Pure object storage for GraphRAG is unproven beyond development workloads.
But the problem Tensorlake identified is real: the infrastructure gap between "works in a notebook" and "ships to production" kills more GraphRAG projects than algorithmic limitations. Whether object storage solves it or just shifts complexity remains to be seen.