Azure Performance Testing: App Service and AKS Optimization
Practical guidance for performance testing applications on Microsoft Azure, covering App Service scaling, AKS configuration, and Azure-specific considerations.
Mark
Performance Testing Expert
Performance testing on Azure requires understanding the platform’s specific behaviours around scaling, networking, and managed services. Here’s what I’ve learned from testing applications across Azure App Service, AKS, and related services.
App Service Performance
Tier Selection
App Service tiers directly impact performance capabilities:
| Tier | CPU | Memory | Max Instances | Use Case |
|---|---|---|---|---|
| B1 | 1 core | 1.75 GB | 3 | Development |
| S1 | 1 core | 1.75 GB | 10 | Light production |
| P1v3 | 2 cores | 8 GB | 30 | Production |
| P3v3 | 8 cores | 32 GB | 30 | High performance |
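To compare tiers between test runs, the App Service plan SKU can be switched from the CLI. A minimal sketch, assuming a plan named myplan (a placeholder):
# Move the App Service plan to a different tier before the next run
# (plan name is a placeholder)
az appservice plan update --name myplan --resource-group myrg --sku P1V3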
During load testing, monitor for tier-specific limitations:
# Check current instance count
az webapp show --name myapp --resource-group myrg --query "siteConfig.numberOfWorkers"
# View autoscale settings
az monitor autoscale show --name myautoscale --resource-group myrg
Always On Setting
Without “Always On” enabled, App Service unloads idle applications, causing cold start latency:
# Enable Always On
az webapp config set --name myapp --resource-group myrg --always-on true
First request latency without Always On can be 10-30 seconds. Include warm-up time in your test design or ensure this setting is enabled for production.
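If Always On stays off in a test environment, send a handful of warm-up requests before the measured phase begins so cold starts don't skew results. A minimal sketch using curl (URL is a placeholder):
# Warm up the app before the measured load phase starts
# (a few requests help ensure every instance is warm)
for i in $(seq 1 5); do
  curl -s -o /dev/null -w "warm-up $i: %{time_total}s\n" "https://myapp.azurewebsites.net/api/test"
  sleep 2
done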
Autoscale Configuration
App Service autoscale responds to metrics like CPU and memory:
{
"profiles": [{
"capacity": {
"minimum": "2",
"maximum": "10",
"default": "2"
},
"rules": [{
"metricTrigger": {
"metricName": "CpuPercentage",
"operator": "GreaterThan",
"threshold": 70,
"timeAggregation": "Average",
"timeWindow": "PT5M"
},
"scaleAction": {
"direction": "Increase",
"type": "ChangeCount",
"value": "1",
"cooldown": "PT5M"
}
}]
}]
}
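The equivalent scale-out rule can also be created from the CLI; a sketch, assuming the autoscale setting myautoscale from the earlier example already targets the App Service plan:
# Add a scale-out rule: +1 instance when average CPU > 70% over 5 minutes
az monitor autoscale rule create \
  --resource-group myrg \
  --autoscale-name myautoscale \
  --condition "CpuPercentage > 70 avg 5m" \
  --scale out 1 \
  --cooldown 5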
During load tests, measure:
- Time from threshold breach to new instance serving traffic
- Request queue depth during scaling events
- Performance degradation during scale-out
Typical scale-out time: 2-5 minutes for new instances to start serving.
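To measure this for your own app, log the instance count at a fixed interval during the test and line the timestamps up against your load profile. A rough sketch reusing the earlier query:
# Log the reported worker count every 30 seconds while the test runs
while true; do
  count=$(az webapp show --name myapp --resource-group myrg \
    --query "siteConfig.numberOfWorkers" -o tsv)
  echo "$(date +%T) instances=$count"
  sleep 30
done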
AKS Performance Testing
Node Pool Sizing
AKS node pool configuration affects pod scheduling and performance:
# Create a performance-optimized node pool
az aks nodepool add \
--resource-group myrg \
--cluster-name mycluster \
--name perfpool \
--node-count 3 \
--node-vm-size Standard_D4s_v3 \
--enable-cluster-autoscaler \
--min-count 3 \
--max-count 10
VM size recommendations for performance testing:
| Workload Type | Recommended VM | Notes |
|---|---|---|
| General | Standard_D4s_v3 | Balanced CPU/memory |
| CPU intensive | Standard_F8s_v2 | Higher CPU ratio |
| Memory intensive | Standard_E4s_v3 | More memory per core |
| High IOPS | Standard_L8s_v2 | Local NVMe storage |
Pod Resource Limits
Test with realistic resource limits to identify constraints:
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
Monitor for:
- CPU throttling (pod is pinned at its CPU limit while it needs more)
- OOM kills (memory limit exceeded)
- Pending pods (insufficient cluster resources)
# Check pod resource usage
kubectl top pods -n production
# Check for OOM events
kubectl get events --field-selector reason=OOMKilled
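CPU throttling and pending pods are not covered by those two commands. A couple of extra checks, with namespace and pod name as placeholders (the cgroup path assumes cgroup v2 nodes and differs on v1):
# List pods stuck in Pending (usually insufficient cluster resources)
kubectl get pods -n production --field-selector=status.phase=Pending
# Inspect throttling counters inside a running container
# (cgroup v2 path shown; on cgroup v1 nodes use /sys/fs/cgroup/cpu/cpu.stat)
kubectl exec -n production mypod -- cat /sys/fs/cgroup/cpu.stat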
Horizontal Pod Autoscaler
Configure and test HPA behaviour:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: myapp-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: myapp
minReplicas: 2
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
Measure scaling responsiveness during load tests:
# Watch HPA during load test
kubectl get hpa myapp-hpa -w
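For a timestamped record that can be correlated with load-test phases afterwards, a small polling loop works just as well:
# Record HPA status (targets, replica counts) every 15 seconds with a timestamp
while true; do
  kubectl get hpa myapp-hpa --no-headers | awk -v t="$(date +%T)" '{print t, $0}'
  sleep 15
done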
Azure SQL Database
DTU vs vCore Models
Azure SQL performance varies significantly by tier:
| Tier | DTUs/vCores | Max Sessions | Use Case |
|---|---|---|---|
| Basic | 5 DTU | 30 | Development |
| S3 | 100 DTU | 200 | Small production |
| P1 | 125 DTU | 200 | Production |
| GP_Gen5_4 | 4 vCores | 3200 | High concurrency |
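Results are only comparable within a tier, so confirm the current service objective before each run; a sketch with placeholder server and database names:
# Show the database's current service objective (e.g. S3, P1, GP_Gen5_4)
az sql db show --resource-group myrg --server myserver --name mydb \
  --query "currentServiceObjectiveName" -o tsv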
During load testing, monitor:
-- Check current DTU usage
SELECT
end_time,
avg_cpu_percent,
avg_data_io_percent,
avg_log_write_percent,
avg_memory_usage_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
Connection Pool Limits
Azure SQL enforces connection limits. Test for connection exhaustion:
// Connection string with explicit pool size
"Server=tcp:myserver.database.windows.net;Database=mydb;Max Pool Size=100;..."
Monitor connection usage:
SELECT
COUNT(*) as connection_count,
login_name,
program_name
FROM sys.dm_exec_sessions
GROUP BY login_name, program_name;
Application Gateway
WAF Impact
If using Application Gateway with WAF, measure the overhead:
// k6 test comparing direct vs WAF paths
import http from 'k6/http';
import { Trend } from 'k6/metrics';
const directLatency = new Trend('direct_latency');
const wafLatency = new Trend('waf_latency');
export default function () {
// Direct to App Service
const direct = http.get('https://myapp.azurewebsites.net/api/test');
directLatency.add(direct.timings.duration);
// Through Application Gateway + WAF
const waf = http.get('https://myapp.mydomain.com/api/test');
wafLatency.add(waf.timings.duration);
}
WAF typically adds 5-20ms latency depending on rule complexity.
Backend Pool Health
Monitor backend health during load tests:
az network application-gateway show-backend-health \
--name myappgw \
--resource-group myrg \
--query "backendAddressPools[].backendHttpSettingsCollection[].servers[]"
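A one-off check can miss transient failures, so during a sustained test it helps to sample health repeatedly. A simple sketch that queries the per-server health field (interval is arbitrary):
# Sample backend health every 60 seconds during the test run
while true; do
  date +%T
  az network application-gateway show-backend-health \
    --name myappgw --resource-group myrg \
    --query "backendAddressPools[].backendHttpSettingsCollection[].servers[].health"
  sleep 60
done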
Azure Cache for Redis
Cache Hit Rates
During load tests, monitor cache effectiveness:
# Retrieve the primary access key
az redis list-keys --name myredis --resource-group myrg --query "primaryKey" -o tsv
# Azure Cache for Redis uses TLS on port 6380 by default
redis-cli -h myredis.redis.cache.windows.net -p 6380 --tls -a <key> INFO stats
Key metrics:
- keyspace_hits vs keyspace_misses (cache hit rate)
- connected_clients vs maximum connections
- Memory usage vs tier limit
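The hit rate itself isn't reported as a single number; it can be derived from the INFO stats output, for example with a quick awk pass (same placeholder hostname and key as above):
# Compute the cache hit rate from keyspace_hits and keyspace_misses
redis-cli -h myredis.redis.cache.windows.net -p 6380 --tls -a <key> INFO stats \
  | awk -F: '/keyspace_hits/{h=$2} /keyspace_misses/{m=$2} END{if (h+m>0) printf "hit rate: %.2f%%\n", 100*h/(h+m)}'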
Tier Performance
| Tier | Memory | Connections | Bandwidth |
|---|---|---|---|
| C0 | 250 MB | 256 | 5 Mbps |
| C1 | 1 GB | 1000 | 100 Mbps |
| C3 | 6 GB | 5000 | 500 Mbps |
| P1 | 6 GB | 7500 | 1 Gbps |
Premium tiers (P*) offer clustering and geo-replication for higher throughput.
Azure Monitor Integration
Set up comprehensive monitoring during tests:
# Query the Log Analytics workspace for request performance
# (workspace-based Application Insights stores requests in the AppRequests table)
az monitor log-analytics query \
  --workspace myworkspace \
  --analytics-query "
  AppRequests
  | where TimeGenerated > ago(1h)
  | summarize
      avg(DurationMs),
      percentile(DurationMs, 95),
      percentile(DurationMs, 99),
      count()
    by bin(TimeGenerated, 1m)
  | order by TimeGenerated desc
  "
Application Insights
Enable Application Insights for detailed performance data:
// k6 with Application Insights correlation
import http from 'k6/http';
export default function () {
const headers = {
'Request-Id': `|${__VU}-${__ITER}.`,
};
http.get('https://myapp.azurewebsites.net/api/test', { headers });
}
Query performance in Application Insights:
requests
| where timestamp > ago(1h)
| summarize
avg(duration),
percentiles(duration, 50, 95, 99),
count()
by name
| order by count_ desc
Cost Considerations
Azure performance testing incurs costs. To keep them in check:
- Use spot instances for AKS load generator nodes
- Scale down after tests - don’t leave high-tier resources running
- Use Azure Load Testing service for managed infrastructure
- Set budget alerts before running large-scale tests
# Create budget alert
az consumption budget create \
--budget-name perf-testing-budget \
--amount 500 \
--time-grain Monthly \
--category Cost
Azure Load Testing Service
Azure’s managed load testing service simplifies infrastructure:
# load-test.yaml
version: v0.1
testId: myloadtest
testPlan: tests/load-test.jmx
engineInstances: 5
az load test create \
--name myloadtest \
--resource-group myrg \
--load-test-config-file load-test.yaml
This eliminates the need to manage load generator VMs while providing integrated Azure Monitor dashboards.
Performance testing on Azure requires understanding both your application and the platform’s scaling behaviours. Start with baseline tests in lower tiers, then validate performance at production scale before go-live.