Load Balancing
API Gateway supports multiple load balancing algorithms to distribute traffic across backend service instances.
Overview
Load balancing allows you to:
- Distribute traffic across multiple backend instances
- Improve availability and fault tolerance
- Scale horizontally by adding more instances
- Use different algorithms based on your needs
Configuration
Single Server Mode (Backward Compatible)
The existing single-server configuration continues to work without modification:
routes:
  - name: single-service
    path: /api
    backend:
      service:
        name: api.example.com
        port: 8080
        protocol: https

Multi-Server Mode
To enable load balancing, configure multiple servers using the servers field:
routes:
  - name: load-balanced-service
    path: /api
    backend:
      service:
        algorithm: round-robin
        servers:
          - name: server1.example.com
            port: 8080
          - name: server2.example.com
            port: 8080
          - name: server3.example.com
            port: 8080

Load Balancing Algorithms
Round-Robin
Distributes requests evenly across all healthy servers in rotation.
backend:
  service:
    algorithm: round-robin
    servers:
      - name: server1.example.com
        port: 8080
      - name: server2.example.com
        port: 8080
      - name: server3.example.com
        port: 8080

Use Cases:
- When all servers have similar capacity
- For stateless services
- When you want even distribution
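The rotation described above can be sketched in a few lines of Python. This is an illustrative model only, not the gateway's implementation: the `RoundRobin` class and the `healthy` key are hypothetical names used to stand in for the gateway's internal state.

```python
from itertools import count


class RoundRobin:
    """Illustrative round-robin picker; not the gateway's actual code."""

    def __init__(self, servers):
        # Each server is a dict such as {"name": ..., "port": ...}.
        # The optional "healthy" key is a hypothetical stand-in for
        # whatever health state the gateway tracks internally.
        self.servers = servers
        self._ticket = count()

    def pick(self):
        # Only healthy servers take part in the rotation.
        pool = [s for s in self.servers if s.get("healthy", True)]
        if not pool:
            return None
        return pool[next(self._ticket) % len(pool)]


servers = [
    {"name": "server1.example.com", "port": 8080},
    {"name": "server2.example.com", "port": 8080},
    {"name": "server3.example.com", "port": 8080},
]
balancer = RoundRobin(servers)
order = [balancer.pick()["name"] for _ in range(6)]
```

In this sketch, six consecutive picks visit each of the three servers twice, in configuration order.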
Weighted Round-Robin
Distributes requests based on server weights. Servers with higher weights receive more traffic.
backend:
  service:
    algorithm: weighted
    servers:
      - name: server1.example.com
        port: 8080
        weight: 1 # Receives 25% of traffic
      - name: server2.example.com
        port: 8080
        weight: 2 # Receives 50% of traffic
      - name: server3.example.com
        port: 8080
        weight: 1 # Receives 25% of traffic

Use Cases:
- When servers have different capacities
- For gradual traffic migration
- For A/B testing scenarios
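To make the relationship between weights and traffic share concrete, the sketch below computes the expected share for each server and then simulates picks. It assumes a "smooth" weighted round-robin scheme (the approach popularized by nginx); the gateway may use a different scheme internally, and the function names here are hypothetical.

```python
from collections import Counter


def expected_shares(weights):
    """Fraction of traffic each server should receive: weight / sum of weights."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def smooth_weighted_picks(weights, n):
    """Return n picks using smooth weighted round-robin (assumed scheme)."""
    total = sum(weights.values())
    current = {name: 0 for name in weights}
    picks = []
    for _ in range(n):
        # Every server gains its weight, the leader is chosen,
        # and the leader is then penalized by the total weight.
        for name, w in weights.items():
            current[name] += w
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


weights = {
    "server1.example.com": 1,
    "server2.example.com": 2,
    "server3.example.com": 1,
}
shares = expected_shares(weights)
counts = Counter(smooth_weighted_picks(weights, 400))
```

With weights 1, 2, and 1, the shares are 25%, 50%, and 25%, matching the comments in the configuration above.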
Least Connections
Routes requests to the server with the fewest active connections.
backend:
  service:
    algorithm: least-connections
    servers:
      - name: server1.example.com
        port: 8080
      - name: server2.example.com
        port: 8080

Use Cases:
- For long-lived connections
- When request processing times vary
- For WebSocket connections
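A minimal sketch of the idea follows, assuming the balancer keeps a per-server counter of in-flight connections. The `LeastConnections` class and its `acquire`/`release` methods are hypothetical names for illustration, not part of the gateway's API.

```python
class LeastConnections:
    """Illustrative least-connections picker; not the gateway's actual code."""

    def __init__(self, names):
        # Number of currently active connections per server.
        self.active = {name: 0 for name in names}

    def acquire(self):
        # Choose the server with the fewest active connections.
        # Ties go to the server listed first.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        # Call when the connection to `name` closes.
        self.active[name] -= 1


lc = LeastConnections(["server1.example.com", "server2.example.com"])
first = lc.acquire()   # both idle, so the first server is chosen
second = lc.acquire()  # server1 is busy, so server2 is chosen
lc.release(first)      # server1 becomes idle again
third = lc.acquire()   # server1 now has fewer connections
```

Because the counter only drops when a connection closes, long-lived connections such as WebSockets keep a server "busy" in this model for as long as they stay open.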
IP Hash
Routes requests based on the client IP address, so that a given client reaches the same server for as long as the set of available servers does not change.
backend:
  service:
    algorithm: ip-hash
    servers:
      - name: server1.example.com
        port: 8080
      - name: server2.example.com
        port: 8080

Use Cases:
- For session persistence
- When you need sticky sessions
- For caching optimization
Note: IP Hash uses the client IP from X-Forwarded-For or X-Real-IP headers if available, otherwise falls back to RemoteAddr.
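The lookup order in the note can be sketched as follows. Several details are assumptions made for illustration: that X-Forwarded-For is checked before X-Real-IP, that the first entry of a comma-separated X-Forwarded-For list is used, that RemoteAddr has the form `host:port`, and that the hash is CRC32 with a plain modulo. The gateway's actual hash function and precedence may differ.

```python
import zlib


def client_ip(headers, remote_addr):
    """Extract the client IP (assumed precedence: XFF, X-Real-IP, RemoteAddr)."""
    forwarded = headers.get("X-Forwarded-For", "").strip()
    if forwarded:
        # Assumption: the first entry is the original client.
        return forwarded.split(",")[0].strip()
    real_ip = headers.get("X-Real-IP", "").strip()
    if real_ip:
        return real_ip
    # Assumption: remote_addr looks like "host:port" or "[v6host]:port".
    host, _, _ = remote_addr.rpartition(":")
    return (host or remote_addr).strip("[]")


def pick_server(servers, ip):
    """Map an IP to a server with a stable hash (CRC32 is an assumption)."""
    return servers[zlib.crc32(ip.encode("utf-8")) % len(servers)]


servers = ["server1.example.com", "server2.example.com"]
ip = client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.0.1"}, "10.0.0.1:54321")
chosen = pick_server(servers, ip)
```

With a plain modulo, as in this sketch, adding or removing a server changes the mapping for many clients, which is worth keeping in mind when relying on IP Hash for session persistence.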
Server Configuration
Basic Server Fields
servers:
  - name: server1.example.com # Required: Server hostname or IP
    port: 8080                # Required: Server port
    protocol: https           # Optional: Protocol (http/https), inherits from base if not set
    weight: 1                 # Optional: Weight for weighted algorithm (default: 1)
    disabled: false           # Optional: Disable server (default: false, server is enabled by default)

Server-Level Configuration Override
You can override global configuration at the server level:
backend:
  service:
    algorithm: round-robin
    servers:
      - name: server1.example.com
        port: 8080
        # Uses global configuration
      - name: server2.example.com
        port: 8080
        # Override request headers
        request:
          headers:
            X-Instance: server-2
        # Override health check path
        health_check:
          path: /custom-health
    # Global configuration (applied to all servers unless overridden)
    request:
      headers:
        X-Service: my-service
    health_check:
      path: /health

Health Checks
Health checks automatically remove unhealthy servers from the load balancing pool.
Global Health Check
backend:
  service:
    algorithm: round-robin
    servers:
      - name: server1.example.com
        port: 8080
      - name: server2.example.com
        port: 8080
    health_check:
      method: GET
      path: /health
      status: [200]
      interval: 30 # Check interval in seconds
      timeout: 5   # Request timeout in seconds

Server-Level Health Check
servers:
  - name: server1.example.com
    port: 8080
    health_check:
      path: /custom-health # Overrides global health check path
      interval: 60         # Overrides global interval

Health Check Options
| Field | Type | Default | Description |
|---|---|---|---|
| enable | bool | false | Enable health checking |
| method | string | GET | HTTP method for health check |
| path | string | /health | Health check endpoint path |
| status | array | [200] | Valid HTTP status codes |
| interval | int | 30 | Check interval in seconds |
| timeout | int | 5 | Request timeout in seconds |
| ok | bool | false | Always consider healthy (skip checks) |
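As a rough illustration of how these options could interact, the sketch below evaluates a single probe against the defaults in the table. The `is_healthy` function and the `fetch_status` callback are hypothetical, and treating a server as healthy when `enable` is false is an assumption rather than documented behavior.

```python
# Defaults copied from the table above.
DEFAULTS = {
    "enable": False,
    "method": "GET",
    "path": "/health",
    "status": [200],
    "interval": 30,
    "timeout": 5,
    "ok": False,
}


def is_healthy(check, fetch_status):
    """Evaluate one probe; fetch_status(method, path, timeout) returns a status code."""
    cfg = {**DEFAULTS, **check}
    if cfg["ok"]:
        return True  # ok: always considered healthy, probe skipped
    if not cfg["enable"]:
        return True  # assumption: no checking means the server stays in the pool
    try:
        status = fetch_status(cfg["method"], cfg["path"], cfg["timeout"])
    except OSError:
        return False  # connection errors and timeouts count as unhealthy
    return status in cfg["status"]


def always_200(method, path, timeout):
    return 200


def timing_out(method, path, timeout):
    raise TimeoutError("probe timed out")


result_ok = is_healthy({"enable": True}, always_200)
result_timeout = is_healthy({"enable": True}, timing_out)
```

Passing the probe as a callback keeps the sketch free of network access; a real probe would issue the HTTP request with the configured method, path, and timeout.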
Examples
Complete Example: Round-Robin with Health Checks
routes:
  - name: api-service
    path: /api
    backend:
      service:
        algorithm: round-robin
        servers:
          - name: api1.example.com
            port: 8080
          - name: api2.example.com
            port: 8080
          - name: api3.example.com
            port: 8080
        request:
          headers:
            X-Service: api-service
            X-API-Version: v1
        response:
          headers:
            X-Powered-By: api-gateway
        health_check:
          method: GET
          path: /health
          status: [200, 201]
          interval: 30
          timeout: 5

Example: Weighted Distribution
routes:
  - name: weighted-api
    path: /api/weighted
    backend:
      service:
        algorithm: weighted
        servers:
          - name: api-small.example.com
            port: 8080
            weight: 1 # 20% of traffic
          - name: api-medium.example.com
            port: 8080
            weight: 2 # 40% of traffic
          - name: api-large.example.com
            port: 8080
            weight: 2 # 40% of traffic

Example: Session Persistence with IP Hash
routes:
  - name: session-service
    path: /session
    backend:
      service:
        algorithm: ip-hash
        servers:
          - name: session1.example.com
            port: 8080
          - name: session2.example.com
            port: 8080

Behavior
Server Selection
- Only healthy and enabled servers are considered
- If no healthy servers are available, the gateway responds with 503 Service Unavailable
- Health checks run asynchronously and don't block requests
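The selection rules above can be modeled as a filter followed by the configured algorithm. In this sketch the `healthy` key, the `NoHealthyServers` exception, and the `choose` function are hypothetical names; only the `disabled` field and the 503 status come from this document.

```python
class NoHealthyServers(Exception):
    """Raised when the pool is empty; a caller would map this to HTTP 503."""

    status_code = 503


def eligible(servers):
    """Keep servers that are enabled and currently marked healthy."""
    return [
        s for s in servers
        if not s.get("disabled", False) and s.get("healthy", True)
    ]


def choose(servers, pick):
    """Apply the algorithm `pick` to the eligible pool only."""
    pool = eligible(servers)
    if not pool:
        raise NoHealthyServers("no healthy servers available")
    return pick(pool)


servers = [
    {"name": "server1.example.com", "disabled": True},
    {"name": "server2.example.com", "healthy": False},
    {"name": "server3.example.com"},
]
selected = choose(servers, pick=lambda pool: pool[0])

try:
    choose(servers[:2], pick=lambda pool: pool[0])
    status = None
except NoHealthyServers as exc:
    status = exc.status_code
```

Here the health state is read from a flag rather than probed inline, which mirrors the point that health checks run separately from request handling.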
Configuration Merging
- Global configuration is applied to all servers
- Server-level configuration overrides global configuration
- Merged configurations include:
  - Request headers and query parameters
  - Response headers
  - Authentication settings
  - Health check settings
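One way to picture the merge is a recursive dictionary overlay in which server-level keys win. Whether the gateway merges nested maps key by key, as assumed here, or replaces them wholesale is not stated in this document, and `merge_config` is a hypothetical helper.

```python
def merge_config(global_cfg, server_cfg):
    """Overlay server-level settings on global ones (assumed key-by-key merge)."""
    merged = dict(global_cfg)
    for key, value in server_cfg.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested sections such as request.headers are merged recursively.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars and lists from the server level replace the global value.
            merged[key] = value
    return merged


global_cfg = {
    "request": {"headers": {"X-Service": "my-service"}},
    "health_check": {"path": "/health"},
}
server2_cfg = {
    "request": {"headers": {"X-Instance": "server-2"}},
    "health_check": {"path": "/custom-health"},
}
effective = merge_config(global_cfg, server2_cfg)
```

Under this assumption, server2 sends both the global X-Service header and its own X-Instance header, and is probed at /custom-health instead of /health.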
Backward Compatibility
- Existing single-server configurations work without modification
- Single-server mode is automatically converted to multi-server mode internally
- All existing features continue to work as before
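The internal conversion might look roughly like the sketch below, which wraps a single-server `service` block in a one-element `servers` list. This is a guess at the shape of the transformation for illustration only; `normalize_service` is a hypothetical name and the gateway's internal representation may differ.

```python
SERVER_FIELDS = ("name", "port", "protocol")


def normalize_service(service):
    """Return a multi-server view of a service block (illustrative only)."""
    if "servers" in service:
        return service  # already in multi-server form
    # Move the single server's fields into a one-element servers list.
    server = {k: service[k] for k in SERVER_FIELDS if k in service}
    rest = {k: v for k, v in service.items() if k not in SERVER_FIELDS}
    return {**rest, "servers": [server]}


single = {"name": "api.example.com", "port": 8080, "protocol": "https"}
normalized = normalize_service(single)
```

After this step, a single-server route can go through the same selection path as a load-balanced one, with a pool of exactly one server.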
Best Practices
- Health Checks: Always enable health checks for production deployments
- Algorithm Selection: Choose the algorithm based on your use case:
  - Use round-robin for even distribution
  - Use weighted for servers with different capacities
  - Use least-connections for long-lived connections
  - Use ip-hash for session persistence
- Server Weights: Set appropriate weights based on server capacity
- Monitoring: Monitor server health and traffic distribution
- Gradual Rollout: Use weighted distribution for gradual traffic migration
Troubleshooting
No Healthy Servers
If all servers are unhealthy:
- Check health check configuration
- Verify health check endpoints are accessible
- Review health check logs
Uneven Traffic Distribution
- For weighted algorithm, verify weights are set correctly
- Check if some servers are being marked as unhealthy
- Verify that no servers are unintentionally disabled
Session Issues with IP Hash
- Ensure X-Forwarded-For or X-Real-IP headers are set correctly
- Verify client IPs are being extracted properly
- Check if load balancer is preserving client IPs
Next Steps
- Configuration - Complete configuration reference
- Health Check - Health check configuration
- Examples - More configuration examples