perf-compare

perf-compare compares two sets of performance test results and determines whether a regression has occurred, using either a fixed percentage threshold or a statistical significance test. It integrates with perf-results-db to pull run data automatically.

Tier          Method                                             Price
Community     --method simple (percentage threshold)             Free
Professional  --method statistical (Mann-Whitney U + Cohen’s d)  £99/year

Install:

npm install -g @martkos-it/perf-compare
# or without global install:
npx @martkos-it/perf-compare --help

The simple method compares two runs by percentage difference against a threshold:

perf-compare \
  --url http://localhost:4000 \
  --project your-project-uuid \
  --method simple \
  --threshold 0.10 # 10% degradation = regression

Output:

Comparison: simple (threshold: 10.0%)
─────────────────────────────────────
Metric      Baseline  Current  Delta   Status
p50_ms      142       149      +5.0%   OK
p95_ms      387       431      +11.4%  REGRESSION
p99_ms      612       598      -2.3%   OK
error_rate  0.1%      0.2%     +100%   REGRESSION
throughput  312       308      -1.3%   OK
Result: FAIL (2 regressions detected)

Exit codes: 0 pass, 1 regression, 2 tool error.
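To make the threshold check concrete, here is a minimal Python sketch of a percentage-threshold comparison applied to the sample values above. It is an illustration, not perf-compare's implementation: the `Row` type, the `higher_is_worse` flag, and the handling of throughput (where a drop counts as degradation) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Row:
    metric: str
    baseline: float
    current: float
    higher_is_worse: bool = True  # assumption: latency and error rate degrade upward


def degradation(row: Row) -> float:
    """Relative change in the 'worse' direction; positive means degraded."""
    if row.baseline == 0:
        raise ValueError(f"{row.metric}: a baseline of 0 has no relative change")
    change = (row.current - row.baseline) / row.baseline
    return change if row.higher_is_worse else -change


def is_regression(row: Row, threshold: float = 0.10) -> bool:
    # Flag a metric when it degrades by more than the threshold fraction.
    return degradation(row) > threshold


rows = [
    Row("p50_ms", 142, 149),
    Row("p95_ms", 387, 431),
    Row("p99_ms", 612, 598),
    Row("error_rate", 0.1, 0.2),
    Row("throughput", 312, 308, higher_is_worse=False),  # hypothetical direction flag
]

regressions = [r.metric for r in rows if is_regression(r)]
# Mirror the documented exit codes: 0 pass, 1 regression.
exit_code = 1 if regressions else 0
print(regressions, exit_code)  # ['p95_ms', 'error_rate'] 1
```

With a threshold of 0.10, only p95_ms (about +11.4%) and error_rate (+100%) exceed the limit, which matches the two regressions in the sample output.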

The statistical method uses the Mann-Whitney U test (non-parametric) and Cohen’s d (effect size), so regression detection accounts for run-to-run variance in the data:

# --baseline 10: use last 10 runs as baseline
# --current 5:   compare against last 5 runs
# --alpha 0.05:  significance level (default: 0.10)
perf-compare \
  --url http://localhost:4000 \
  --project your-project-uuid \
  --method statistical \
  --baseline 10 \
  --current 5 \
  --alpha 0.05

Output:

Comparison: statistical (α=0.05)
─────────────────────────────────────────────────────────────────
Metric      Baseline μ  Current μ  Δ%      p-value  Effect  Status
p50_ms      142.3       149.1      +4.8%   0.031    small   REGRESSION
p95_ms      387.4       430.8      +11.2%  0.004    medium  REGRESSION
p99_ms      611.7       597.2      -2.4%   0.412    —       OK
error_rate  0.10%       0.22%      +120%   0.001    large   REGRESSION
Result: FAIL (3 regressions at α=0.05)

The p-value tells you whether the difference is statistically significant (a value below α is flagged). Cohen’s d tells you whether it is practically significant, judged by magnitude: below 0.5 is small, 0.5 up to 0.8 is medium, and 0.8 or above is large.
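The two statistics can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the tool's code: it uses the normal approximation with tie and continuity corrections for the Mann-Whitney p-value (perf-compare may use an exact method), a pooled standard deviation for Cohen's d, and made-up sample latencies. The function names are hypothetical.

```python
import math
from statistics import mean, variance


def mann_whitney_u(a, b):
    """Return (U for sample a, two-sided p-value from the normal approximation)."""
    n1, n2 = len(a), len(b)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    ordered = sorted((value, idx) for idx, value in enumerate(list(a) + list(b)))
    ranks = [0.0] * (n1 + n2)
    tie_term = 0.0
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][0] == ordered[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # tied values share the average 1-based rank
        for k in range(i, j + 1):
            ranks[ordered[k][1]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = max((abs(u1 - mu) - 0.5) / sigma, 0.0)  # continuity correction
    return u1, math.erfc(z / math.sqrt(2))


def cohens_d(baseline, current):
    """Standardized mean difference; needs at least two values per sample."""
    n1, n2 = len(baseline), len(current)
    pooled = math.sqrt(
        ((n1 - 1) * variance(baseline) + (n2 - 1) * variance(current)) / (n1 + n2 - 2)
    )
    diff = mean(current) - mean(baseline)
    if pooled == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / pooled


def effect_label(d):
    # Bins as documented above: small < 0.5, medium < 0.8, large >= 0.8.
    size = abs(d)
    if size < 0.5:
        return "small"
    return "medium" if size < 0.8 else "large"


# Made-up p50 latencies (ms) for illustration only.
baseline = [140, 142, 141, 143, 144, 142, 141, 143, 142, 145]
current = [149, 150, 148, 151, 149]
u, p = mann_whitney_u(baseline, current)
print(u, p < 0.05, effect_label(cohens_d(baseline, current)))
```

Checking both values guards against two failure modes: a tiny shift that is statistically detectable but irrelevant in practice, and a large-looking shift that is indistinguishable from noise.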

Provide the license key through the PERF_COMPARE_LICENSE_KEY environment variable or the --license flag:

export PERF_COMPARE_LICENSE_KEY=your-license-key
# or
perf-compare --license your-key --method statistical ...

The license is validated against updates.martkos-it.co.uk, with a 24-hour file cache and a 7-day offline grace period.
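One way to read the cache and grace-period rule is as a small decision function. The sketch below is an assumption about how such a policy could be structured, not a description of perf-compare's internals: the function name, the returned labels, and the choice to measure the grace period from the last successful validation are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

CACHE_TTL = timedelta(hours=24)    # cached validation is treated as fresh
OFFLINE_GRACE = timedelta(days=7)  # tolerated window when the server is unreachable


def license_decision(
    now: datetime, last_validated: Optional[datetime], server_reachable: bool
) -> str:
    """Decide what to do with a license given the time of the last successful check."""
    if last_validated is not None and now - last_validated < CACHE_TTL:
        return "use-cache"      # fresh cache: no network call needed
    if server_reachable:
        return "revalidate"     # cache is stale or missing: ask the server
    if last_validated is not None and now - last_validated < OFFLINE_GRACE:
        return "offline-grace"  # offline, but validated recently enough
    return "reject"


now = datetime(2024, 1, 10, 12, 0)
print(license_decision(now, now - timedelta(days=2), server_reachable=False))
# offline-grace
```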

Configuration can live in perf-ecosystem.yml, which is auto-discovered by walking up the directory tree to .git:

services:
  perf_results_db:
    url: "http://localhost:4000"
    api_key: "${PERF_RESULTS_DB_API_KEY}"
    project_id: "${PERF_RESULTS_DB_PROJECT_ID}"

When this file is present, --url and --project can be omitted.
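The discovery rule can be pictured as an upward directory walk. The following Python sketch shows one plausible reading, assuming the search starts at a given directory and stops at the first directory that contains .git; the `find_config` helper is hypothetical and the tool's exact behaviour may differ.

```python
from pathlib import Path
from typing import Optional

CONFIG_NAME = "perf-ecosystem.yml"


def find_config(start: Path) -> Optional[Path]:
    """Walk up from start and return the first perf-ecosystem.yml found.

    The walk stops at the directory that contains .git (the assumed repo root).
    """
    current = start.resolve()
    for directory in [current, *current.parents]:
        candidate = directory / CONFIG_NAME
        if candidate.is_file():
            return candidate
        if (directory / ".git").exists():
            break  # reached the repository root without finding the file
    return None


config = find_config(Path.cwd())
print(config if config else "no perf-ecosystem.yml found; pass --url and --project")
```

Stopping at .git keeps the search from picking up an unrelated configuration file in a parent directory outside the repository.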

Alternatively, set environment variables:

export PERF_RESULTS_DB_API_KEY=prdb_your_key
export PERF_RESULTS_DB_URL=http://localhost:4000

Pass --json (or -j) for machine-readable output:

perf-compare --method statistical --json

Output:

{
  "result": "fail",
  "method": "statistical",
  "alpha": 0.05,
  "regressions": [
    {
      "metric": "p95_ms",
      "baselineMean": 387.4,
      "currentMean": 430.8,
      "deltaPercent": 11.2,
      "pValue": 0.004,
      "effectSize": "medium"
    }
  ]
}
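A report in this shape is straightforward to post-process. The Python sketch below parses the sample JSON shown above and produces one summary line per regression. It relies only on the fields visible in that sample; the `summarize` helper is hypothetical, and real reports may carry additional fields.

```python
import json

# The sample report from above, inlined so the example is self-contained.
sample = """
{
  "result": "fail",
  "method": "statistical",
  "alpha": 0.05,
  "regressions": [
    {
      "metric": "p95_ms",
      "baselineMean": 387.4,
      "currentMean": 430.8,
      "deltaPercent": 11.2,
      "pValue": 0.004,
      "effectSize": "medium"
    }
  ]
}
"""


def summarize(report_json: str) -> list:
    """Return one human-readable line per regression in the report."""
    report = json.loads(report_json)
    return [
        f'{reg["metric"]}: {reg["deltaPercent"]:+.1f}% '
        f'(p={reg["pValue"]}, effect={reg["effectSize"]})'
        for reg in report.get("regressions", [])
    ]


for line in summarize(sample):
    print(line)  # p95_ms: +11.2% (p=0.004, effect=medium)
```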

GitHub Actions:

- name: Check for regressions
  env:
    PERF_RESULTS_DB_API_KEY: ${{ secrets.PERF_RESULTS_DB_API_KEY }}
    PERF_COMPARE_LICENSE_KEY: ${{ secrets.PERF_COMPARE_LICENSE_KEY }}
    PERF_COMPARE_CONFIG_DIR: /tmp/perf-compare-cache
  run: |
    npx @martkos-it/perf-compare \
      --url ${{ vars.PERF_RESULTS_DB_URL }} \
      --project ${{ vars.PERF_RESULTS_DB_PROJECT_ID }} \
      --method statistical \
      --baseline 10 --current 3

The PERF_COMPARE_CONFIG_DIR env var sets the license cache directory (useful in ephemeral CI environments).
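CI wrappers sometimes need to treat a regression differently from a tool failure, for example to retry on a tool error but fail the build on a regression. A minimal Python sketch, assuming the documented exit codes (0 pass, 1 regression, 2 tool error) and that perf-compare is on the PATH; the helper names are hypothetical.

```python
import subprocess

# Exit codes as documented: 0 pass, 1 regression, 2 tool error.
EXIT_MEANINGS = {0: "pass", 1: "regression", 2: "tool error"}


def describe_exit(code: int) -> str:
    """Map a perf-compare exit code to a short label."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_perf_compare(args: list) -> str:
    """Run perf-compare with the given flags and label the outcome.

    Raises FileNotFoundError if the perf-compare binary is not installed.
    """
    completed = subprocess.run(["perf-compare", *args], check=False)
    return describe_exit(completed.returncode)


# Example call (not executed here):
# run_perf_compare(["--method", "statistical", "--baseline", "10", "--current", "3"])
print(describe_exit(1))  # regression
```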

GitLab CI:

regression-check:
  script:
    - npx @martkos-it/perf-compare
      --url $PERF_RESULTS_DB_URL
      --project $PERF_RESULTS_DB_PROJECT_ID
      --method statistical
      --baseline 10 --current 3 --alpha 0.05

Flag                           Default      Description
--url <url>                    from config  perf-results-db base URL
--project <uuid>               from config  Project UUID
--method <simple|statistical>  required     Comparison method
--threshold <float>            0.1          Simple: maximum degradation as a fraction (0.1 = 10%)
--baseline <n>                 10           Statistical: number of baseline runs
--current <n>                  5            Statistical: number of current runs
--alpha <float>                0.10         Statistical: significance level
--json / -j                    false        JSON output
--license <key>                env var      License key override