# Benchmarks: 1,652x Faster Recursive Queries with Incremental Computation
If your compliance system takes 11 seconds to update after an ownership change, that's 11 seconds where a transaction can slip through against a sanctioned entity. If your recommendation engine takes 11 seconds to reflect a stockout, that's 11 seconds of customers clicking "buy" on products you can't ship. If your access control takes 11 seconds to propagate a role change, that's 11 seconds where a terminated employee's permissions are still active. These aren't hypotheticals. They're the cost of recalculating everything from scratch in systems that need to reason over connected data.
One fact changes in a knowledge graph with 400,000 derived relationships. Full recompute: 11.3 seconds. InputLayer: 6.83 milliseconds.
That's a 1,652x difference. Not a faster version of the same thing, but a fundamentally different class of system. One that turns batch-only workloads into real-time operations.
## The benchmark setup
We wanted to test something that reflects real-world usage, not a synthetic micro-benchmark. So we picked a common pattern: following chains of authority in an organizational graph. Think "Alice manages Bob, Bob manages Charlie, so Alice has indirect authority over Charlie." This is the same kind of computation you'd need for access control chains, supply chain risk propagation, or entity resolution across corporate structures.
The test: add one new edge, then measure how long it takes to update all derived relationships.
The 400,000 derived relationships come from following chains of authority. If A manages B and B manages C, then A has indirect authority over C. Follow that logic through 2,000 nodes with an average depth of 8-10 levels, and the number of derived relationships grows fast.
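To make the setup concrete, here's a small sketch (plain Python, not InputLayer's actual API) of how "indirect authority" facts are derived from direct management edges by transitive closure. This is the full-recompute approach: it re-derives everything from scratch.

```python
def derive_authority(manages):
    """Compute all (boss, report) pairs reachable by following
    'manages' edges transitively. This is a full recompute:
    it rebuilds every derived relationship from scratch."""
    derived = set(manages)
    changed = True
    while changed:
        changed = False
        # Keep joining derived facts with direct edges until
        # no new pairs appear (a naive fixpoint).
        for (a, b) in list(derived):
            for (c, d) in manages:
                if b == c and (a, d) not in derived:
                    derived.add((a, d))
                    changed = True
    return derived

# A 4-person chain: Alice -> Bob -> Charlie -> Dana.
edges = {("alice", "bob"), ("bob", "charlie"), ("charlie", "dana")}
authority = derive_authority(edges)
# Chains compound: 3 direct edges yield 6 derived relationships.
```

With chains 8–10 levels deep across 2,000 nodes, this compounding is what produces hundreds of thousands of derived facts.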
## The results
| Approach | Time | What it does |
|---|---|---|
| Recalculate everything | 11,280 ms | Throws away all 400,000 derived relationships, re-derives them all |
| InputLayer (incremental) | 6.83 ms | Identifies affected relationships, updates only those |
Recalculating everything doesn't care that you only changed one edge. It treats the entire graph as dirty and rebuilds everything. InputLayer's engine, on the other hand, traces the impact of the change through the chain of reasoning and touches only what's affected.
To put 6.83ms in perspective: that's fast enough to run inline with an API request. You can check permissions, compute supply chain exposure, or resolve entity relationships at query time rather than pre-computing them in a batch process.
## What this means in practice
The practical takeaway here is about which architectural patterns become possible.
Without incremental computation, you're stuck with batch processing. Pre-compute permissions overnight. Rebuild recommendation indexes hourly. Re-run compliance checks on a schedule. And accept that between runs, your derived data is stale.
With incremental computation, you can do these things live:
| Use case | Batch approach | Incremental approach |
|---|---|---|
| Access control | Nightly permission rebuild | Live permission check at query time |
| Supply chain risk | Hourly risk recalculation | Instant risk update when a supplier status changes |
| Compliance screening | Daily sanctions check | Real-time flag when ownership structure changes |
| Recommendations | Model retrain every few hours | Instant update when user behavior or inventory changes |
The 1,652x speedup isn't about making a slow thing faster. It's about making batch-only workloads work in real time. That's a qualitative difference in what you can build.
## The scaling story
Here's where it gets really interesting. The incremental advantage doesn't stay constant as your graph grows. It gets dramatically better.
| Graph size | Derived relationships | Recalculate everything | Incremental update | Speedup |
|---|---|---|---|---|
| 500 nodes | ~25,000 | 420 ms | 1.2 ms | 350x |
| 1,000 nodes | ~100,000 | 2,800 ms | 3.1 ms | 903x |
| 2,000 nodes | ~400,000 | 11,280 ms | 6.83 ms | 1,652x |
Look at how the two columns grow. Recalculating everything grows much faster than linearly: doubling the nodes multiplies the runtime by four to seven. Incremental updates grow far more slowly, because most single-fact changes ripple through only a small portion of the graph.
This scaling behavior is fundamental, not accidental. Recalculating everything has to process the entire graph regardless of what changed. Incremental updates process only the ripple effect of the change, which stays relatively small even as the total graph grows.
At 10,000 nodes, recalculating everything would take over a minute. The incremental update would still be in the low tens of milliseconds. That's the difference between a feature that's practical in production and one that isn't.
## Why the numbers work this way
InputLayer is built on Differential Dataflow, a Rust library for incremental computation created by Frank McSherry. The core idea is simple: instead of storing derived results as static data, the engine tracks what changed and efficiently passes those changes along.
Here's how a fact change flows through the system: the new fact enters as a delta, the engine matches it against the rules that consume it, derives whatever new conclusions follow, and feeds those conclusions back in as further deltas until nothing new is produced.
The engine didn't scan the entire graph. It didn't recompute relationships for nodes that weren't affected. It started from the change, followed the ripple effects, and stopped as soon as the ripple died out.
For recursive reasoning, like indirect authority where conclusions feed back into the computation, the engine runs a loop until it reaches a stable point where no new changes are produced. When something changes later, it re-enters that loop at the point of change and computes only the new changes.
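That delta-driven loop can be sketched in a few lines of plain Python. This is a much-simplified illustration of the idea (sets and explicit deltas, not Differential Dataflow's actual operators): when one edge is added, work starts from that edge and fans out only through facts it actually affects.

```python
def add_edge(manages, derived, new_edge):
    """Incrementally update the derived authority set for one new
    'manages' edge. Only facts reachable through the change are
    touched; the rest of the graph is never visited."""
    manages.add(new_edge)
    delta = {new_edge} - derived
    derived |= delta
    while delta:  # loop until the ripple dies out (fixpoint)
        next_delta = set()
        for (a, b) in delta:
            # Extend the new fact forward along direct edges...
            for (c, d) in manages:
                if b == c and (a, d) not in derived:
                    next_delta.add((a, d))
            # ...and backward from everyone who already reaches it.
            for (x, y) in derived:
                if y == a and (x, b) not in derived:
                    next_delta.add((x, b))
        derived |= next_delta
        delta = next_delta
    return derived

# Two disjoint chains: alice -> bob and charlie -> dana.
manages = {("alice", "bob"), ("charlie", "dana")}
derived = set(manages)  # closure of two disjoint edges is just the edges

# One new edge links the chains; only the affected facts are derived.
derived = add_edge(manages, derived, ("bob", "charlie"))
```

Adding `("bob", "charlie")` derives exactly the three affected facts: `(alice, charlie)`, `(bob, dana)`, and `(alice, dana)`. Nothing else in the graph is recomputed.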
InputLayer also uses an optimization that makes queries on-demand rather than exhaustive. When you ask "who does Alice have authority over?", the engine doesn't compute authority for every person in the organization. It starts from Alice and follows only the relevant paths. Query time becomes proportional to Alice's portion of the graph, not the entire organization.
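The on-demand idea can be sketched as a reachability search seeded from the queried person (this is an illustrative Python sketch of the concept, not InputLayer's implementation, which in Datalog engines goes by names like demand transformation or magic sets):

```python
from collections import defaultdict, deque

def authority_of(person, manages):
    """Answer 'who does `person` have authority over?' by exploring
    only the paths reachable from that person, not the whole graph."""
    out = defaultdict(set)
    for boss, report in manages:
        out[boss].add(report)
    seen, frontier = set(), deque([person])
    while frontier:
        current = frontier.popleft()
        for report in out[current]:
            if report not in seen:
                seen.add(report)
                frontier.append(report)
    return seen

# Alice's chain and Diana's chain are independent.
edges = {("alice", "bob"), ("bob", "charlie"), ("diana", "eve")}
reachable = authority_of("alice", edges)
# Only Alice's portion of the graph is explored; Diana's subtree
# is never touched.
```

The cost of the query is proportional to the size of Alice's reachable subgraph, which is exactly the "proportional to Alice's portion of the graph" property described above.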
## Correct retraction: the hard part
Adding facts is relatively straightforward to handle incrementally. Removing them is where things get genuinely hard.
Say Alice has authority over Charlie through two independent paths: she manages Bob, who manages Charlie, and she also manages Diana, who manages Charlie.
If you remove Bob's management of Charlie, Alice should still have authority over Charlie through Diana. But if you remove Diana's management of Charlie too, the authority should disappear entirely.
The engine tracks this through weighted differences. Each derived relationship has a weight based on the number of independent paths that support it. When a path is removed, the weight goes down. Only when it reaches zero does the conclusion go away.
- Both paths present: weight = 2 (via Bob and via Diana)
- Remove Bob's path: weight = 1, so the conclusion survives
- Remove Diana's path too: weight = 0, and the conclusion is retracted
On our benchmark graph, retracting a single edge and propagating all downstream changes takes under 10ms. Bulk retractions (removing 100 edges) complete in about a second. Fast enough for real-time applications where facts change frequently.
## Try it yourself
InputLayer is open-source:
```
docker run -p 8080:8080 ghcr.io/inputlayer/inputlayer
```
Start with the quickstart guide to build your first knowledge graph, or dive into the recursion documentation to see how recursive reasoning works under the hood.