Query Profiling
InputLayer provides per-stage timing breakdowns for queries, letting you see exactly where time is spent in the execution pipeline. The profiling data is embedded in query results - no separate profiling tool needed.
Enabling Profiling
Profiling is controlled by the timing_mode setting in your server config under [storage.performance]:
[storage.performance]
# "summary" - stage-level timing (default, ~200ns overhead)
# "detailed" - stage-level + per-rule + optimizer/IR builder sub-timing
# "off" - no timing collection, zero overhead
timing_mode = "summary"
Summary mode is a good default for production. Switch to Detailed when you need per-rule granularity or optimizer iteration counts. Off mode skips all Instant::now() calls entirely.
Pipeline Stages
When timing is enabled, every query result includes a timing_breakdown with these stages (all times in microseconds):
| Field | Stage | What it measures |
|---|---|---|
| parse_us | Parse | Source code parsing and AST construction |
| sip_us | SIP Rewriting | Sideways information passing for recursive queries |
| magic_sets_us | Magic Sets | Demand-driven query adornment |
| ir_build_us | IR Build | Intermediate representation construction |
| optimize_us | Optimize | Optimizer passes (filter pushdown, fusion, identity elimination) |
| shared_views_us | Shared Views | Common subexpression pre-computation |
| total_us | Total | End-to-end engine time |
Per-rule execution timing appears in the rules array (Detailed mode only). Each entry includes the rule head name, execution time, whether it ran recursively, and worker count.
In Detailed mode, you also get optimizer_detail (iteration count, rules vs fusion time) and ir_builder_detail (per-phase: scans, joins, computed columns, filters, antijoins, projection).
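As a sketch of how you might consume these fields, here is a hypothetical helper (not part of any SDK) that finds which pipeline stage dominates a breakdown, assuming the breakdown has already been parsed into a dict with the field names from the table above:

```python
def dominant_stage(tb: dict) -> tuple[str, float]:
    """Return the pipeline stage field with the largest value,
    as (field_name, fraction_of_total_us)."""
    stages = ["parse_us", "sip_us", "magic_sets_us",
              "ir_build_us", "optimize_us", "shared_views_us"]
    total = tb["total_us"] or 1
    name = max(stages, key=lambda s: tb.get(s, 0))
    return name, tb.get(name, 0) / total

# Illustrative breakdown (values made up for the example)
tb = {"total_us": 1250, "parse_us": 85, "sip_us": 12, "magic_sets_us": 8,
      "ir_build_us": 120, "optimize_us": 95, "shared_views_us": 0}
stage, share = dominant_stage(tb)
# Here ir_build_us is the largest stage field: 120us, 9.6% of total
```

Note that rule execution time is not one of the stage fields; it lives in the rules array, so a low stage total with a high total_us usually means execution dominates.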
Practical Example
Here's a typical workflow for identifying a bottleneck. Start by inserting some data and defining a recursive rule:
+edge[(1, 2), (2, 3), (3, 4), (4, 5)]
+tc(X, Y) <- edge(X, Y)
+tc(X, Y) <- tc(X, Z), edge(Z, Y)
?tc(X, Y)
With timing_mode = "detailed", the query result includes a breakdown like:
{
  "timing_breakdown": {
    "total_us": 1250,
    "parse_us": 85,
    "sip_us": 12,
    "magic_sets_us": 8,
    "ir_build_us": 120,
    "optimize_us": 95,
    "shared_views_us": 0,
    "rules": [
      { "rule_head": "tc", "execution_us": 890, "is_recursive": true, "workers": 1 }
    ],
    "optimizer_detail": { "iterations": 2, "rules_us": 78, "fusion_us": 17 },
    "ir_builder_detail": { "scans_us": 45, "joins_us": 38, "filters_us": 0, ... }
  }
}
From this you can see that execution dominates (890us out of 1250us total), the tc rule is recursive, and the optimizer converged in 2 iterations. If the rule were slow, you'd look at whether adding an index or restructuring the join order helps.
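That kind of triage can be automated. The helper below is a hypothetical sketch (not an SDK function) that flags any rule from a Detailed-mode breakdown whose execution time exceeds a chosen fraction of total engine time:

```python
def slow_rules(tb: dict, threshold: float = 0.5) -> list[str]:
    """Return the heads of rules whose execution time exceeds
    `threshold` as a fraction of end-to-end engine time."""
    total = tb["total_us"] or 1
    return [r["rule_head"] for r in tb.get("rules", [])
            if r["execution_us"] / total > threshold]

# Using the numbers from the example breakdown above:
tb = {"total_us": 1250,
      "rules": [{"rule_head": "tc", "execution_us": 890,
                 "is_recursive": True, "workers": 1}]}
print(slow_rules(tb))  # 890/1250 = 71% of engine time -> ['tc']
```

A threshold around 0.5 is a reasonable starting point; anything above it is worth a look at indexes or join order before tuning elsewhere.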
GUI
The Performance tab in query results shows a waterfall bar chart with stages color-coded, a table with exact timings, and a collapsible per-rule breakdown (Detailed mode).
In the Explorer, clicking a view and opening the Performance tab shows timing sub-tabs:
- Overview - evaluation cost summary with a mini waterfall
- Timing - full stage breakdown and per-rule detail
- Inputs - dependency graph showing which base relations feed the view, their fact counts, and recent mutations
- Query Plan - the optimizer pipeline trace
- Rule - the Datalog definition
SDK Access
JavaScript / TypeScript
const il = new InputLayer({
  url: "ws://localhost:8080/ws",
  username: "admin",
  password: "admin",
});
await il.connect();
const kg = il.knowledgeGraph("default");
const rs = await kg.execute("?tc(X, Y)");
if (rs.timingBreakdown) {
  console.log(`Engine time: ${rs.timingBreakdown.total_us}us`);
  console.log(`Parse: ${rs.timingBreakdown.parse_us}us`);
  console.log(`Optimize: ${rs.timingBreakdown.optimize_us}us`);
  // Per-rule timing (Detailed mode)
  for (const rule of rs.timingBreakdown.rules ?? []) {
    console.log(`  ${rule.rule_head}: ${rule.execution_us}us${rule.is_recursive ? " (recursive)" : ""}`);
  }
}
Python
from inputlayer import InputLayer
async with InputLayer("ws://localhost:8080/ws", username="admin", password="admin") as il:
    kg = il.knowledge_graph("default")
    rs = await kg.execute("?tc(X, Y)")
    if rs.timing_breakdown:
        tb = rs.timing_breakdown
        print(f"Engine time: {tb['total_us']}us")
        print(f"Parse: {tb['parse_us']}us")
        print(f"Optimize: {tb['optimize_us']}us")
        # Per-rule timing (Detailed mode)
        for rule in tb.get("rules", []):
            rec = " (recursive)" if rule["is_recursive"] else ""
            print(f"  {rule['rule_head']}: {rule['execution_us']}us{rec}")
Note: in the Python SDK, timing_breakdown is a dict[str, Any], so fields are accessed with bracket notation.
Explainability Queries
The .why and .why_not commands also carry timing breakdowns that separate query execution from proof/explanation work:
.why ?tc(1, 3)
The breakdown includes two entries in the rules array:
- query_execution - time running the underlying query and collecting context
- proof_construction - time building backward-chaining proof trees
For .why_not, the entries are query_execution and explanation.
This timing respects timing_mode - when set to "off", no breakdown is produced for .why/.why_not either.
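A quick way to compare the two phases is to pull both entries out of the rules array. This sketch assumes the .why entries use the same rule_head/execution_us fields as regular per-rule timing (the helper itself is hypothetical):

```python
def explain_split(tb: dict) -> dict:
    """Map each explainability timing entry (query_execution,
    proof_construction for .why; query_execution, explanation for
    .why_not) to its time in microseconds."""
    return {r["rule_head"]: r["execution_us"] for r in tb.get("rules", [])}

# Illustrative values: proof construction at half the query cost
tb = {"rules": [{"rule_head": "query_execution", "execution_us": 300},
                {"rule_head": "proof_construction", "execution_us": 150}]}
print(explain_split(tb))
# -> {'query_execution': 300, 'proof_construction': 150}
```

If proof_construction dominates, the explanation machinery rather than the query itself is the cost; restructuring the query won't help much.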
Prometheus Metrics
Timing histograms are exported at /metrics/prometheus in standard Prometheus text format:
| Metric | Description |
|---|---|
| inputlayer_query_parse_seconds | Parse stage duration |
| inputlayer_query_optimize_seconds | Optimization stage duration |
| inputlayer_query_execute_seconds | DD computation duration |
| inputlayer_query_total_seconds | Total query duration |
Bucket boundaries: 1ms, 5ms, 10ms, 50ms, 100ms, 500ms, 1s, 5s, +Inf. Each metric includes _bucket, _sum, and _count suffixes.
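To turn those cumulative _bucket counts into a latency quantile, you can apply the same linear-interpolation scheme PromQL's histogram_quantile() uses. A minimal sketch, with the bucket boundaries from above and illustrative (made-up) counts:

```python
def estimate_quantile(buckets: list[tuple[float, int]], q: float) -> float:
    """Approximate quantile q from cumulative Prometheus histogram buckets.
    buckets: (upper_bound_seconds, cumulative_count) pairs, sorted ascending,
    ending with float('inf'). Interpolates linearly inside the target bucket."""
    total = buckets[-1][1]
    rank = q * total
    lower, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if bound == float("inf"):
                return lower  # can't interpolate into the +Inf bucket
            span = count - prev_count
            frac = (rank - prev_count) / span if span else 0.0
            return lower + (bound - lower) * frac
        lower, prev_count = bound, count
    return lower

# Boundaries 1ms, 5ms, 10ms, 50ms, 100ms, +Inf; counts are illustrative
buckets = [(0.001, 40), (0.005, 70), (0.010, 90), (0.050, 99),
           (0.100, 100), (float("inf"), 100)]
p95 = estimate_quantile(buckets, 0.95)  # ~0.032s for this distribution
```

In practice you would let Prometheus do this server-side, e.g. histogram_quantile(0.95, rate(inputlayer_query_total_seconds_bucket[5m])).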
Performance Impact
| Mode | Overhead | What you get |
|---|---|---|
| Off | Zero | Nothing |
| Summary | ~200-350ns/query | Stage totals |
| Detailed | ~500ns-1us/query | Stage totals + per-rule + optimizer/IR builder sub-timing |
For production, Summary is the right choice. The overhead is invisible for any real workload, and having timing data available without restarting the server pays for itself the first time you need to debug a slow query.