Perf Compare
Statistical performance regression detector for JMeter, K6, and Gatling. Free tier included.
Community free · Professional £99/year
Community tier is free — register to get your license key and install updates. No payment required.
What's Included
Overview
Is that slowdown a real regression, or just noise? Perf Compare answers the question statistically.
The free Community tier (--method simple) compares the latest run against a percentage threshold. The paid Professional tier (--method statistical) runs a Mann-Whitney U test to tell you whether the difference is statistically significant — not just whether the number went up.
Works directly with perf-results-db and integrates into any CI/CD pipeline with a single command.
How It Works
Perf Compare calls the perf-results-db regression endpoint and reports the result:
# Community — percentage threshold (free)
perf-compare --url https://perf-db.example.com --project <uuid> --method simple
# Professional — statistical regression detection
perf-compare --url https://perf-db.example.com --project <uuid> --method statistical
Exit codes work directly as CI/CD pass/fail signals:
| Exit Code | Meaning |
|---|---|
| 0 | No regression detected — build passes |
| 1 | Regression detected — build fails |
| 2 | Error (misconfiguration, network failure, insufficient data) |
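A minimal sketch of a CI wrapper consuming these exit codes. The stand-in command below simply exits with code 1, as if a regression had been detected, so the example runs without the perf-compare binary installed; a real pipeline would invoke perf-compare itself.

```python
import subprocess
import sys

# Stand-in for the perf-compare binary: exit with code 1 for illustration.
result = subprocess.run([sys.executable, "-c", "raise SystemExit(1)"])

if result.returncode == 0:
    print("no regression, build passes")
elif result.returncode == 1:
    print("regression detected, failing build")
    # a real wrapper would propagate the failure, e.g. sys.exit(1)
else:
    print("perf-compare error, check configuration and network")
```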
Methods
Simple (Free)
Compares the latest run’s metrics against the prior results using a percentage threshold. If any metric has degraded by more than the threshold, it’s flagged as a regression.
perf-compare \
  --url https://perf-db.example.com \
  --project my-project \
  --method simple \
  --threshold 0.1   # flag anything >10% worse
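The threshold check amounts to a simple relative comparison. A minimal sketch, assuming "worse" means a higher value (e.g. response time in ms) and the comparison is per-metric against a baseline value; the names here are illustrative, not the tool’s internals:

```python
def is_regression(baseline_value: float, latest_value: float, threshold: float) -> bool:
    """Flag when the latest run is more than `threshold` (a fraction,
    e.g. 0.1 for 10%) worse than the baseline value for that metric."""
    return (latest_value - baseline_value) / baseline_value > threshold

# 230 ms p95 vs. a 200 ms baseline is 15% worse, above a 10% threshold
print(is_regression(200.0, 230.0, threshold=0.1))  # True
print(is_regression(200.0, 215.0, threshold=0.1))  # False (7.5% worse)
```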
Statistical (Pro)
Uses Mann-Whitney U (a non-parametric rank-sum test) to determine whether the distribution of recent runs is significantly different from the baseline. Reports both a p-value and Cohen’s d effect size.
# --baseline  number of baseline runs to compare against
# --current   number of recent runs to compare
# --alpha     significance threshold
perf-compare \
  --url https://perf-db.example.com \
  --project my-project \
  --method statistical \
  --baseline 10 \
  --current 5 \
  --alpha 0.05
A regression is flagged only when the result is both statistically significant (p < alpha) and practically meaningful (Cohen’s d ≥ medium). This filters out false positives caused by natural run-to-run variance.
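A minimal sketch of this dual criterion, assuming response times in milliseconds and using the normal approximation for the Mann-Whitney p-value; the tool’s actual implementation may differ:

```python
import math
from statistics import mean, stdev

def mann_whitney_p(a, b):
    """Two-sided p-value via the normal approximation (ties get averaged
    ranks; no tie correction in the variance, so this is approximate)."""
    n1, n2 = len(a), len(b)
    combined = sorted(a + b)
    # assign each distinct value the average of the ranks it occupies
    rank_of, i = {}, 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        rank_of[combined[i]] = (i + 1 + j) / 2
        i = j
    r1 = sum(rank_of[v] for v in a)
    u1 = r1 - n1 * (n1 + 1) / 2
    z = (u1 - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def cohens_d(a, b):
    """Pooled-standard-deviation effect size; positive when b is slower."""
    sp = math.sqrt(((len(a) - 1) * stdev(a) ** 2 +
                    (len(b) - 1) * stdev(b) ** 2) / (len(a) + len(b) - 2))
    return (mean(b) - mean(a)) / sp

def flag_regression(baseline, current, alpha=0.05, min_d=0.5):
    """Mirror the dual criterion: significant AND at least a medium effect."""
    return (mann_whitney_p(baseline, current) < alpha
            and cohens_d(baseline, current) >= min_d)

baseline = [100, 102, 98, 101, 99, 100, 103, 97, 101, 100]   # ms, 10 runs
print(flag_regression(baseline, [130, 128, 132, 131, 129]))  # True
print(flag_regression(baseline, [101, 99, 100, 102, 98]))    # False
```

The second call returns False even though the samples differ slightly: the effect size is far below the medium cutoff, which is exactly the variance-filtering behaviour described above.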
Pricing
| Feature | Community | Professional |
|---|---|---|
| Simple threshold comparison | ✓ | ✓ |
| CI/CD exit codes (0/1/2) | ✓ | ✓ |
| JSON output mode | ✓ | ✓ |
| perf-ecosystem.yml config | ✓ | ✓ |
| Mann-Whitney U significance test | — | ✓ |
| Cohen's d effect size | — | ✓ |
| Configurable baseline window | — | ✓ |
| Price | Free forever | £99 per year |
Installation
Register for a free Community key or purchase a Pro license at the top of this page, then download the binary for your platform from the download page.
# Linux — extract and add to PATH
unzip perf-compare-1.0.0-linux-x64.zip -d /usr/local/bin
chmod +x /usr/local/bin/perf-compare
CI/CD Integration
GitHub Actions
- name: Check for performance regressions
  run: |
    perf-compare \
      --url ${{ vars.PERF_DB_URL }} \
      --project ${{ vars.PERF_DB_PROJECT_ID }} \
      --method simple
  env:
    PERF_RESULTS_DB_API_KEY: ${{ secrets.PERF_DB_API_KEY }}
    PERF_COMPARE_LICENSE_KEY: ${{ secrets.PERF_COMPARE_LICENSE_KEY }}
    PERF_COMPARE_CONFIG_DIR: /tmp/perf-compare-cache
perf-ecosystem.yml (config file)
Drop a perf-ecosystem.yml in your project root to avoid repeating flags:
services:
  perf_results_db:
    url: https://perf-db.example.com
    api_key: ${PERF_DB_API_KEY}
    project_id: your-project-uuid
  perf_compare:
    license_key: ${PERF_COMPARE_LICENSE_KEY}
Then run with no flags at all:
perf-compare --method statistical
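The ${VAR} references in the config imply environment-variable substitution at load time. How perf-compare actually expands them isn’t documented here, but the pattern looks roughly like this sketch:

```python
import os
import re

def expand_env(value: str) -> str:
    """Replace each ${VAR} with the environment variable's value ('' if unset)."""
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), value)

os.environ["PERF_COMPARE_LICENSE_KEY"] = "demo-key"
print(expand_env("${PERF_COMPARE_LICENSE_KEY}"))  # demo-key
```

Keeping secrets in environment variables rather than in the file itself means perf-ecosystem.yml can be committed to version control safely.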
FAQ
Does the Community tier require a license key?
Yes, but the Community key is free. Register at the top of this page and you’ll receive your key by email immediately. This allows us to notify you of updates and gives you a path to upgrade without losing your configuration.
What does the statistical method actually test?
It runs a Mann-Whitney U test (a non-parametric rank-sum test that makes no assumptions about the distribution of response times) across the baseline and current run windows. It reports a p-value and Cohen’s d effect size. A regression is flagged only when both the p-value is below alpha and the effect size is at least medium (d ≥ 0.5).
How does it get performance data?
Perf Compare doesn’t collect data itself. It calls the regression analysis endpoint on your perf-results-db instance, which stores historical run data uploaded by the perf-results-db CLI uploader.
Can I use it without perf-results-db?
Not currently. Perf Compare is designed as a companion to perf-results-db, which provides the historical run storage and regression analysis API.