Performance · 10 April 2024 · 6 min read

Building a Performance Testing Strategy: From Requirements to Metrics

A comprehensive guide to developing a performance testing strategy, from gathering requirements to defining success criteria and reporting results.

Mark, Performance Testing Expert

Performance testing without a strategy is just running tools. A proper strategy aligns testing activities with business objectives, defines clear success criteria, and provides actionable results. Here’s how to build one.

Starting with Business Requirements

Performance requirements should trace back to business needs, not arbitrary technical targets. Ask stakeholders:

  • What user experience are we promising?
  • What happens if the system is slow? Lost revenue? Reputation damage?
  • What growth do we expect over the next 12 months?
  • Are there regulatory or contractual SLAs?

Convert business requirements to technical requirements:

| Business Requirement | Technical Requirement |
|----------------------|------------------------|
| “Pages should feel instant” | Page load < 2 seconds at p95 |
| “Support 10,000 concurrent users” | System handles 10,000 active sessions |
| “Handle Black Friday traffic” | 5x normal throughput for 6 hours |
| “99.9% uptime SLA” | Error rate < 0.1% under load |
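
To keep these targets from drifting, it helps to record them in one machine-readable place that test scripts and CI gates can import. A minimal sketch in Python (the names and structure are illustrative; the values come from the table above):

```python
# performance_requirements.py
# Illustrative, machine-checkable version of the table above.
REQUIREMENTS = {
    "page_load_p95_seconds": 2.0,     # "Pages should feel instant"
    "concurrent_sessions": 10_000,    # "Support 10,000 concurrent users"
    "peak_throughput_multiplier": 5,  # "Handle Black Friday traffic"
    "peak_duration_hours": 6,
    "max_error_rate": 0.001,          # "99.9% uptime SLA"
}
```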

Defining Workload Models

A workload model describes how users interact with your system. Base it on production data when possible:

User Distribution

Typical E-commerce Workload:
- 60% Browse products (read-heavy)
- 25% Search (compute-intensive)
- 10% Add to cart (write operations)
- 4% Checkout (complex transactions)
- 1% Account management
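
In a load-testing tool such as Locust (used here purely as an illustration), this distribution maps directly onto task weights; the endpoint paths and payloads below are placeholders:

```python
# locustfile.py — illustrative workload model; endpoints are placeholders.
from locust import HttpUser, task, between

class ShopperUser(HttpUser):
    # Think time between actions (per-persona values are in the next section).
    wait_time = between(10, 30)

    @task(60)
    def browse_products(self):   # 60% read-heavy browsing
        self.client.get("/products")

    @task(25)
    def search(self):            # 25% compute-intensive search
        self.client.get("/search", params={"q": "laptop"})

    @task(10)
    def add_to_cart(self):       # 10% write operations
        self.client.post("/cart", json={"product_id": 42, "qty": 1})

    @task(4)
    def checkout(self):          # 4% complex transactions
        self.client.post("/checkout")

    @task(1)
    def manage_account(self):    # 1% account management
        self.client.get("/account")
```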

Think Time and Session Length

| User Type | Avg Session | Think Time | Actions/Session |
|-----------|-------------|------------|-----------------|
| Browser | 5 min | 15-30 sec | 8-12 |
| Buyer | 12 min | 10-20 sec | 20-30 |
| Power User | 30 min | 5-10 sec | 100+ |

Peak vs Normal Load

Define multiple load profiles:

Normal: 500 concurrent users, 50 requests/sec
Peak: 2,000 concurrent users, 200 requests/sec
Stress: 5,000 concurrent users, 500 requests/sec
Break: Increase until failure
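
If your tool supports custom load shapes, these profiles can be stepped through in a single run. A sketch using Locust's `LoadTestShape` (stage durations and spawn rate are illustrative):

```python
# load_shape.py — step through the normal, peak and stress profiles above.
from locust import LoadTestShape

STAGES = [
    # (stage duration in seconds, target concurrent users)
    (600, 500),    # Normal
    (600, 2_000),  # Peak
    (600, 5_000),  # Stress
]

class SteppedLoad(LoadTestShape):
    def tick(self):
        run_time = self.get_run_time()
        elapsed = 0
        for duration, users in STAGES:
            elapsed += duration
            if run_time < elapsed:
                return users, 50  # (user count, spawn rate per second)
        return None  # End the test after the last stage.
```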

Test Types and When to Use Them

| Test Type | Purpose | Duration | Load Level |
|-----------|---------|----------|------------|
| Smoke | Verify basic functionality | 5-10 min | Minimal |
| Load | Validate requirements | 30-60 min | Expected peak |
| Stress | Find breaking points | 15-30 min | Beyond peak |
| Soak/Endurance | Memory leaks, degradation | 4-24 hours | Normal load |
| Spike | Sudden traffic bursts | 15-30 min | Rapid changes |
| Scalability | Capacity planning | Variable | Incremental |

Schedule these appropriately:

  • Smoke tests: Every deployment
  • Load tests: Weekly or before releases
  • Stress tests: Monthly or quarterly
  • Soak tests: Before major releases
  • Spike tests: Before events (sales, launches)

Success Criteria

Define unambiguous pass/fail criteria before testing:

Response Time Thresholds

API Endpoints:
- p50 < 100ms
- p95 < 500ms
- p99 < 1000ms

Page Loads:
- Time to First Byte < 200ms
- First Contentful Paint < 1.5s
- Largest Contentful Paint < 2.5s
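
These thresholds can be checked directly against raw response-time samples; a minimal sketch using only the Python standard library (the threshold values are the API numbers above):

```python
# check_thresholds.py — pass/fail checks on response-time percentiles.
from statistics import quantiles

def percentile(samples_ms, pct):
    """Approximate the pct-th percentile of a list of samples (in ms)."""
    # quantiles(n=100) returns the 99 cut points p1..p99.
    return quantiles(samples_ms, n=100)[pct - 1]

def check_api_thresholds(samples_ms):
    return {
        "p50 < 100ms": percentile(samples_ms, 50) < 100,
        "p95 < 500ms": percentile(samples_ms, 95) < 500,
        "p99 < 1000ms": percentile(samples_ms, 99) < 1000,
    }
```

In practice the samples would come from your load tool's results export rather than being collected by hand.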

Throughput Requirements

Minimum sustainable throughput:
- API: 1,000 requests/second
- Database: 5,000 queries/second
- Message queue: 10,000 messages/second

Error Budget

Acceptable error rate: < 0.1%
- 4xx errors: < 0.05% (client errors shouldn't increase under load)
- 5xx errors: < 0.05% (server errors indicate capacity issues)
- Timeouts: < 0.01%
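
Evaluating the budget is then a matter of counting; a small sketch (the counter names are illustrative):

```python
# error_budget.py — check request counters against the budget above.
def within_error_budget(total, client_errors, server_errors, timeouts):
    all_errors = client_errors + server_errors + timeouts
    return (
        client_errors / total < 0.0005      # 4xx < 0.05%
        and server_errors / total < 0.0005  # 5xx < 0.05%
        and timeouts / total < 0.0001       # timeouts < 0.01%
        and all_errors / total < 0.001      # overall < 0.1%
    )
```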

Resource Utilisation

Under peak load:
- CPU: < 70% average, < 90% peak
- Memory: < 80% with headroom for spikes
- Disk I/O: < 80% of provisioned capacity
- Network: < 70% of bandwidth limit

Environment Strategy

Test Environment Parity

| Aspect | Production | Performance Test |
|--------|------------|------------------|
| Instance type | c5.4xlarge | c5.4xlarge (same) |
| Instance count | 10 | 5 (scaled ratio) |
| Database | r5.2xlarge | r5.2xlarge (same) |
| Data volume | 500GB | 100GB (subset) |
| CDN | Yes | Yes or bypassed |

Perfect parity is often impractical. Document the differences and adjust expectations accordingly: with 5 of the 10 production instances, for example, it is reasonable to target roughly half the production throughput requirement, assuming near-linear horizontal scaling.

Data Preparation

Performance test data should:

  • Match production volume ratios
  • Include edge cases and variety
  • Be anonymised if from production
  • Support the required concurrent user count

Example data sizing:
- 100,000 user accounts (10x concurrent target)
- 1,000,000 products (realistic catalog)
- 10,000,000 orders (historical data for queries)
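
If anonymised production data isn't available, synthetic data generated at these ratios is the usual fallback. A hedged sketch using the Faker library for the user accounts (the schema, file name, and counts are assumptions; products and orders would follow the same pattern):

```python
# seed_test_data.py — generate synthetic user accounts at the sizing above.
# Requires: pip install faker
import csv
from faker import Faker

fake = Faker()

def write_users(path="users.csv", count=100_000):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "name", "email", "created_at"])
        for user_id in range(count):
            writer.writerow([
                user_id,
                fake.name(),
                fake.unique.email(),
                fake.date_time_this_decade().isoformat(),
            ])

if __name__ == "__main__":
    write_users()
```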

Monitoring and Observability

What to Measure

Application metrics:

  • Response time percentiles (p50, p95, p99)
  • Throughput (requests/second)
  • Error rates by type
  • Active users/sessions

Infrastructure metrics:

  • CPU, memory, disk I/O
  • Network throughput and latency
  • Database connections and query times
  • Cache hit rates

Business metrics:

  • Conversion rate under load
  • Cart abandonment timing
  • Search result latency

Baseline Establishment

Before testing changes, establish baselines:

Baseline Test (v1.5.2, 2024-04-01):
- Peak throughput: 850 req/s
- p95 response time: 245ms
- Error rate: 0.02%
- CPU at peak: 65%

Compare all future tests against this baseline.
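
That comparison is worth automating so regressions fail a build rather than being spotted by eye. A minimal sketch using the baseline figures above and a hypothetical 10% tolerance:

```python
# compare_to_baseline.py — flag regressions against the recorded baseline.
BASELINE = {  # v1.5.2, 2024-04-01
    "peak_throughput_rps": 850,
    "p95_ms": 245,
    "error_rate": 0.0002,
}

def regressions(current, tolerance=0.10):
    """Return the names of metrics that regressed beyond the tolerance."""
    failed = []
    if current["peak_throughput_rps"] < BASELINE["peak_throughput_rps"] * (1 - tolerance):
        failed.append("throughput")
    if current["p95_ms"] > BASELINE["p95_ms"] * (1 + tolerance):
        failed.append("p95 latency")
    if current["error_rate"] > BASELINE["error_rate"] * (1 + tolerance):
        failed.append("error rate")
    return failed
```

An empty result means the run is at least as good as the baseline within the tolerance.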

Reporting and Communication

For Technical Teams

Detailed metrics and analysis:

  • Full percentile distribution
  • Resource utilisation timelines
  • Specific bottleneck identification
  • Reproduction steps for issues

For Stakeholders

Business-focused summary:

  • Did we meet the requirements? (Yes/No)
  • What’s the capacity headroom?
  • What are the risks?
  • What’s the recommendation?

Report Template

## Performance Test Summary

**Test Date:** 2024-04-08
**Version:** 2.1.0
**Environment:** Staging (scaled 1:2)

### Results vs Requirements

| Requirement | Target | Actual | Status |
|-------------|--------|--------|--------|
| Response time p95 | < 500ms | 312ms | PASS |
| Throughput | 1000 rps | 1,247 rps | PASS |
| Error rate | < 0.1% | 0.03% | PASS |
| CPU utilisation | < 70% | 68% | PASS |

### Key Findings
1. Database connection pool saturates at 1,100 rps
2. Memory usage stable over 4-hour soak test
3. No degradation during simulated failover

### Recommendations
1. Increase connection pool size before go-live
2. Add monitoring alert for connection pool usage
3. Approved for production deployment

Continuous Performance Testing

Integrate performance testing into CI/CD:

# On every commit
smoke_test:
  - 10 users, 2 minutes
  - Basic functionality check
  - Fail build if p95 > 1s

# On merge to main
load_test:
  - 100 users, 10 minutes
  - Compare against baseline
  - Alert if regression > 10%

# Weekly scheduled
full_load_test:
  - Production-like load
  - 1 hour duration
  - Full report generation

Common Pitfalls

Testing in isolation: Performance depends on real infrastructure, network, and data. Synthetic environments give synthetic results.

Testing too late: Finding performance issues in production is expensive. Test early and often.

Ignoring variability: Single test runs aren’t statistically significant. Run multiple iterations and report ranges.

Optimising prematurely: Measure first. Don’t guess where bottlenecks are.

Focusing only on averages: Averages hide problems. Always look at percentiles and tail latency.

A performance testing strategy isn’t a document that sits on a shelf. It’s a living framework that evolves with your application and guides decision-making throughout the development lifecycle. Start with clear requirements, measure consistently, and communicate results in terms stakeholders understand.

Tags:

#strategy #methodology #performance-testing #requirements
