GraphRAG sounds elegant in theory: build a knowledge graph from your documents, traverse it intelligently, get better answers than vector search alone.
Then you look at the compute requirements.
Processing a single 100-page PDF means thousands of API calls, millions of similarity computations, and hours of wall-clock time. For a corpus of 10,000 documents, the infrastructure requirements balloon: parallel execution across heterogeneous compute (PDF parsing needs memory, embedding calls are I/O-bound, concept extraction needs CPU), durable checkpointing so failures don't restart everything, and job orchestration that actually understands dependencies.
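To make the scale concrete, here is a back-of-envelope sketch; the stage profile and per-page call multipliers are assumptions for illustration, not measurements of any real pipeline.

```python
# Rough resource profile behind a GraphRAG ingestion pipeline.
# Stage names and multipliers are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    bottleneck: str       # dominant resource for this stage
    calls_per_page: int   # rough API-call multiplier

STAGES = [
    Stage("pdf_parse", "memory", 1),         # layout models hold pages in RAM
    Stage("chunk_embed", "network I/O", 3),  # batched embedding API calls
    Stage("concept_extract", "CPU", 10),     # LLM entity/relation extraction
]

def estimate_calls(pages: int) -> int:
    """Back-of-envelope API-call count for one document."""
    return sum(s.calls_per_page for s in STAGES) * pages

print(estimate_calls(100))           # ~1,400 calls for one 100-page PDF
print(estimate_calls(100) * 10_000)  # ~14M across a 10,000-document corpus
```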
The DIY tax
Building this yourself means assembling Kubernetes for container orchestration, Celery and Redis for task queuing, Spark for the actual parallel compute, and Postgres for job state. Each system was built around different assumptions. Kubernetes manages containers, not computations. Celery distributes tasks but doesn't understand data locality or aggregation. Spark wants to own the entire pipeline. The result: hundreds of lines of glue code bridging systems that don't integrate cleanly.
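A taste of that glue, using Celery's real fan-out primitives but a job table invented for illustration: results ship through Redis, and job state gets mirrored into Postgres by hand, because neither system knows about the other.

```python
# Sketch of DIY glue: Celery fans out per-document work over Redis,
# then a callback manually syncs job state into Postgres.
import psycopg2
from celery import Celery, chord

app = Celery("graphrag",
             broker="redis://localhost:6379/0",
             backend="redis://localhost:6379/1")

@app.task
def parse_pdf(doc_id: str) -> str:
    # Memory-heavy parsing elided; routing this to big-RAM workers
    # takes separate queue and Kubernetes node-pool configuration.
    return doc_id

@app.task
def build_graph(parsed_ids: list[str]) -> None:
    # Aggregation step: Celery has no notion of data locality, so every
    # upstream result travels through the Redis backend first.
    with psycopg2.connect("dbname=jobs") as conn:  # assumed job-state table
        conn.cursor().execute(
            "UPDATE jobs SET status = 'parsed' WHERE id = ANY(%s)",
            (parsed_ids,),
        )

# chord = fan out the header, run the callback once every parse finishes
chord(parse_pdf.s(d) for d in ["doc-1", "doc-2"])(build_graph.s())
```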
The timing problem compounds this. Kubernetes Cluster Autoscaler checks for unschedulable pods every 10 seconds, then provisions nodes that take 30-60 seconds to come online. For a GraphRAG pipeline that needs 500 workers immediately, that's minutes of latency before work starts. The autoscaler prioritizes stability over speed, a reasonable tradeoff for web services but painful for batch processing.
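The arithmetic, under assumed round numbers (10 s scan interval, 45 s node boot, 30 s image pull, 8 worker pods per node):

```python
# Best-case cold-start latency; all constants are assumptions.
SCAN_S, BOOT_S, PULL_S, PODS_PER_NODE = 10, 45, 30, 8

def first_work_latency(workers: int) -> tuple[int, int]:
    nodes = -(-workers // PODS_PER_NODE)  # ceil: 500 workers -> 63 nodes
    return nodes, SCAN_S + BOOT_S + PULL_S

print(first_work_latency(500))  # (63, 85): ~1.5 min before any pod runs,
                                # and real scale-ups take several waves
```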
This is why most GraphRAG implementations stay as notebooks. The infrastructure tax is too high.
Object storage as compute substrate
Tensorlake claims to replace the entire stack with object-store-native compute. Their Document Ingestion API processes millions of documents monthly, treating object storage (think MinIO or S3) as both data layer and coordination mechanism. Write your workflow as if it runs on a single machine; in production, it transparently distributes across CPUs and GPUs.
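What that programming model might look like, as a hypothetical sketch: the `remote` decorator, its resource parameters, and the placeholder bodies below are invented for illustration, not Tensorlake's actual SDK.

```python
# Invented decorator standing in for a platform that ships functions to
# workers matching a resource profile and checkpoints outputs to object
# storage. Locally it is a no-op, so the pipeline reads single-machine.
from typing import Iterator

def remote(cpu: int = 1, gpu: int = 0, memory_gb: int = 1):
    def wrap(fn):
        return fn  # a real runtime would register fn for remote dispatch
    return wrap

@remote(memory_gb=8)   # parsing wants RAM
def parse(pdf_key: str) -> list[str]:
    return ["chunk-0", "chunk-1"]   # placeholder chunks

@remote(gpu=1)         # embedding wants accelerators
def embed(chunk: str) -> list[float]:
    return [0.0] * 768              # placeholder embedding

def pipeline(pdf_keys: Iterator[str]):
    # Plain loop locally; a distributed runtime would fan out each call
    # and persist results to the object store as durable checkpoints.
    for key in pdf_keys:
        for chunk in parse(key):
            yield embed(chunk)

list(pipeline(["report.pdf"]))
```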
The approach mirrors recent GraphRAG experiments: MinIO-based ingestion pipelines for parallel processing, AWS's GraphRAG Toolkit combining Neptune and OpenSearch Serverless, and lightweight forks like nano-graphrag that swap storage backends (NetworkX, Neo4j, HNSW) without rewriting logic.
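The backend-swapping pattern is straightforward to picture as a storage protocol that the graph logic codes against; the classes below are illustrative, not nano-graphrag's actual ones.

```python
# Graph logic codes against a protocol; backends are interchangeable.
from typing import Protocol
import networkx as nx

class GraphStore(Protocol):
    def add_edge(self, src: str, dst: str, relation: str) -> None: ...
    def neighbors(self, node: str) -> list[str]: ...

class NetworkXStore:
    """In-process backend for development; a Neo4j- or HNSW-backed class
    with the same two methods could replace it without touching callers."""
    def __init__(self) -> None:
        self.g = nx.MultiDiGraph()

    def add_edge(self, src: str, dst: str, relation: str) -> None:
        self.g.add_edge(src, dst, relation=relation)

    def neighbors(self, node: str) -> list[str]:
        return list(self.g.successors(node))

def ingest(store: GraphStore, triples: list[tuple[str, str, str]]) -> None:
    for src, rel, dst in triples:
        store.add_edge(src, dst, rel)

store = NetworkXStore()
ingest(store, [("GraphRAG", "builds_on", "knowledge graph")])
print(store.neighbors("GraphRAG"))  # ['knowledge graph']
```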
The skepticism is warranted. Official GraphRAG implementations are notoriously difficult to customize. Database-backed systems (Neo4j for entities, vector stores for embeddings) remain standard for enterprise scale. Pure object storage for GraphRAG is unproven beyond development workloads.
But the problem Tensorlake identified is real: the infrastructure gap between "works in a notebook" and "ships to production" kills more GraphRAG projects than algorithmic limitations. Whether object storage solves it or just shifts complexity remains to be seen.