# Benchmarks: 1,652x Faster Recursive Queries with Incremental Computation
If your compliance system takes 11 seconds to update after an ownership change, that's 11 seconds where a transaction can slip through against a sanctioned entity. If your recommendation engine takes 11 seconds to reflect a stockout, that's 11 seconds of customers clicking "buy" on products you can't ship. If your access control takes 11 seconds to propagate a role change, that's 11 seconds where a terminated employee's permissions are still active. These aren't hypotheticals. They're the cost of recalculating everything from scratch in systems that need to reason over connected data.
One fact changes in a knowledge graph with 400,000 derived relationships. Full recompute: 11.3 seconds. InputLayer: 6.83 milliseconds.
That's a 1,652x difference. Not a faster version of the same thing, but a fundamentally different class of system. One that turns batch-only workloads into real-time operations.
## The benchmark setup
We wanted to test something that reflects real-world usage, not a synthetic micro-benchmark. So we picked a common pattern: following chains of authority in an organizational graph. Think "Alice manages Bob, Bob manages Charlie, so Alice has indirect authority over Charlie." This is the same kind of computation you'd need for access control chains, supply chain risk propagation, or entity resolution across corporate structures.
The test: add one new edge, then measure how long it takes to update all derived relationships.
The 400,000 derived relationships come from following chains of authority. If A manages B and B manages C, then A has indirect authority over C. Follow that logic through 2,000 nodes with an average depth of 8-10 levels, and the number of derived relationships grows fast.
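To make the setup concrete, here's a small sketch (plain Python, not InputLayer's actual API) of how "indirect authority" facts are derived from direct management edges by transitive closure. This is the full-recompute approach: it re-derives everything from scratch.

```python
def derive_authority(manages):
    """Compute all (boss, report) pairs reachable by following
    'manages' edges transitively. This is a full recompute:
    it rebuilds every derived relationship from scratch."""
    derived = set(manages)
    changed = True
    while changed:
        changed = False
        # Keep joining derived facts with direct edges until
        # no new pairs appear (a naive fixpoint).
        for (a, b) in list(derived):
            for (c, d) in manages:
                if b == c and (a, d) not in derived:
                    derived.add((a, d))
                    changed = True
    return derived

# A 4-person chain: Alice -> Bob -> Charlie -> Dana.
edges = {("alice", "bob"), ("bob", "charlie"), ("charlie", "dana")}
authority = derive_authority(edges)
# Chains compound: 3 direct edges yield 6 derived relationships.
```

With chains 8–10 levels deep across 2,000 nodes, this compounding is what produces hundreds of thousands of derived facts.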
## The results
| Approach | Time | What it does |
|---|---|---|
| Recalculate everything | 11,280 ms | Throws away all 400,000 derived relationships, re-derives them all |
| InputLayer (incremental) | 6.83 ms | Identifies affected relationships, updates only those |
Recalculating everything doesn't care that you only changed one edge. It treats the entire graph as dirty and rebuilds everything. InputLayer's engine, on the other hand, traces the impact of the change through the chain of reasoning and touches only what's affected.
To put 6.83ms in perspective: that's fast enough to run inline with an API request. You can check permissions, compute supply chain exposure, or resolve entity relationships at query time rather than pre-computing them in a batch process.
## What this means in practice
The practical takeaway here is about which architectural patterns become possible.
Without incremental computation, you're stuck with batch processing. Pre-compute permissions overnight. Rebuild recommendation indexes hourly. Re-run compliance checks on a schedule. And accept that between runs, your derived data is stale.
With incremental computation, you can do these things live:
| Use case | Batch approach | Incremental approach |
|---|---|---|
| Access control | Nightly permission rebuild | Live permission check at query time |
| Supply chain risk | Hourly risk recalculation | Instant risk update when a supplier status changes |
| Compliance screening | Daily sanctions check | Real-time flag when ownership structure changes |
| Recommendations | Model retrain every few hours | Instant update when user behavior or inventory changes |
The 1,652x speedup isn't about making a slow thing faster. It's about making batch-only workloads work in real time. That's a qualitative difference in what you can build.
## The scaling story
Here's where it gets really interesting. The incremental advantage doesn't stay constant as your graph grows. It gets dramatically better.
| Graph size | Derived relationships | Recalculate everything | Incremental update | Speedup |
|---|---|---|---|---|
| 500 nodes | ~25,000 | 420 ms | 1.2 ms | 350x |
| 1,000 nodes | ~100,000 | 2,800 ms | 3.1 ms | 903x |
| 2,000 nodes | ~400,000 | 11,280 ms | 6.83 ms | 1,652x |
Look at how the two columns grow. Recalculating everything grows much faster than linearly: doubling the nodes multiplies the runtime by four to seven. Incremental updates grow far more slowly, because most single-fact changes ripple through only a small portion of the graph.
This scaling behavior is fundamental, not accidental. Recalculating everything has to process the entire graph regardless of what changed. Incremental updates process only the ripple effect of the change, which stays relatively small even as the total graph grows.
At 10,000 nodes, recalculating everything would take over a minute. The incremental update would still be in the low tens of milliseconds. That's the difference between a feature that's practical in production and one that isn't.
## Why the numbers work this way
InputLayer is built on Differential Dataflow, a Rust library for incremental computation created by Frank McSherry. The core idea is simple: instead of storing derived results as static data, the engine tracks what changed and efficiently passes those changes along.
Here's how a fact change flows through the system: the new fact enters as a delta, the engine matches it against the rules that consume it, derives whatever new conclusions follow, and feeds those conclusions back in as further deltas until nothing new is produced.
The engine didn't scan the entire graph. It didn't recompute relationships for nodes that weren't affected. It started from the change, followed the ripple effects, and stopped as soon as the ripple died out.
For recursive reasoning, like indirect authority where conclusions feed back into the computation, the engine runs a loop until it reaches a stable point where no new changes are produced. When something changes later, it re-enters that loop at the point of change and computes only the new changes.
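That delta-driven loop can be sketched in a few lines of plain Python. This is a much-simplified illustration of the idea (sets and explicit deltas, not Differential Dataflow's actual operators): when one edge is added, work starts from that edge and fans out only through facts it actually affects.

```python
def add_edge(manages, derived, new_edge):
    """Incrementally update the derived authority set for one new
    'manages' edge. Only facts reachable through the change are
    touched; the rest of the graph is never visited."""
    manages.add(new_edge)
    delta = {new_edge} - derived
    derived |= delta
    while delta:  # loop until the ripple dies out (fixpoint)
        next_delta = set()
        for (a, b) in delta:
            # Extend the new fact forward along direct edges...
            for (c, d) in manages:
                if b == c and (a, d) not in derived:
                    next_delta.add((a, d))
            # ...and backward from everyone who already reaches it.
            for (x, y) in derived:
                if y == a and (x, b) not in derived:
                    next_delta.add((x, b))
        derived |= next_delta
        delta = next_delta
    return derived

# Two disjoint chains: alice -> bob and charlie -> dana.
manages = {("alice", "bob"), ("charlie", "dana")}
derived = set(manages)  # closure of two disjoint edges is just the edges

# One new edge links the chains; only the affected facts are derived.
derived = add_edge(manages, derived, ("bob", "charlie"))
```

Adding `("bob", "charlie")` derives exactly the three affected facts: `(alice, charlie)`, `(bob, dana)`, and `(alice, dana)`. Nothing else in the graph is recomputed.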
InputLayer also uses an optimization that makes queries on-demand rather than exhaustive. When you ask "who does Alice have authority over?", the engine doesn't compute authority for every person in the organization. It starts from Alice and follows only the relevant paths. Query time becomes proportional to Alice's portion of the graph, not the entire organization.
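The on-demand idea can be sketched as a reachability search seeded from the queried person (this is an illustrative Python sketch of the concept, not InputLayer's implementation, which in Datalog engines goes by names like demand transformation or magic sets):

```python
from collections import defaultdict, deque

def authority_of(person, manages):
    """Answer 'who does `person` have authority over?' by exploring
    only the paths reachable from that person, not the whole graph."""
    out = defaultdict(set)
    for boss, report in manages:
        out[boss].add(report)
    seen, frontier = set(), deque([person])
    while frontier:
        current = frontier.popleft()
        for report in out[current]:
            if report not in seen:
                seen.add(report)
                frontier.append(report)
    return seen

# Alice's chain and Diana's chain are independent.
edges = {("alice", "bob"), ("bob", "charlie"), ("diana", "eve")}
reachable = authority_of("alice", edges)
# Only Alice's portion of the graph is explored; Diana's subtree
# is never touched.
```

The cost of the query is proportional to the size of Alice's reachable subgraph, which is exactly the "proportional to Alice's portion of the graph" property described above.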
## Correct retraction: the hard part
Adding facts is relatively straightforward to handle incrementally. Removing them is where things get genuinely hard.
Say Alice has authority over Charlie through two independent paths: she manages Bob, who manages Charlie, and she also manages Diana, who manages Charlie.
If you remove Bob's management of Charlie, Alice should still have authority over Charlie through Diana. But if you remove Diana's management of Charlie too, the authority should disappear entirely.
The engine tracks this through weighted differences. Each derived relationship has a weight based on the number of independent paths that support it. When a path is removed, the weight goes down. Only when it reaches zero does the conclusion go away.
- Both paths present: weight = 2 (via Bob and via Diana)
- Remove Bob's path: weight = 1, so the conclusion survives
- Remove Diana's path too: weight = 0, and the conclusion is retracted
On our benchmark graph, retracting a single edge and propagating all downstream changes takes under 10ms. Bulk retractions (removing 100 edges) complete in about a second. Fast enough for real-time applications where facts change frequently.
## Try it yourself
InputLayer is open-source:
```
docker run -p 8080:8080 ghcr.io/inputlayer/inputlayer
```
Start with the quickstart guide to build your first knowledge graph, or dive into the recursion documentation to see how recursive reasoning works under the hood.