Rate limiting plugin
Package: github.com/go-zoox/api-gateway/plugin/ratelimit
The rate limiting plugin is registered when either the global rate_limit.enable flag is true or at least one route sets rate_limit.enable. It runs in OnRequest before traffic reaches backends and can return 429 Too Many Requests when a client exceeds its quota.
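The registration rule above can be sketched in a few lines of Go. This is only an illustration: the `Config`, `Route`, and `RateLimit` types and the `shouldRegister` function are hypothetical names chosen for this example, not the gateway's actual types.

```go
package main

import "fmt"

// RateLimit mirrors only the enable flag of a rate_limit block (hypothetical type).
type RateLimit struct {
	Enable bool
}

// Route is a hypothetical route entry with an optional rate_limit override.
type Route struct {
	Path      string
	RateLimit RateLimit
}

// Config is a hypothetical, trimmed-down gateway config.
type Config struct {
	RateLimit RateLimit
	Routes    []Route
}

// shouldRegister reports whether the plugin would be registered:
// the global flag is true, or at least one route enables rate limiting.
func shouldRegister(cfg Config) bool {
	if cfg.RateLimit.Enable {
		return true
	}
	for _, r := range cfg.Routes {
		if r.RateLimit.Enable {
			return true
		}
	}
	return false
}

func main() {
	routeOnly := Config{Routes: []Route{{Path: "/v1/user", RateLimit: RateLimit{Enable: true}}}}
	fmt.Println(shouldRegister(routeOnly)) // true
	fmt.Println(shouldRegister(Config{}))  // false
}
```

A config with the global flag off still registers the plugin as soon as one route opts in, which is why the second call is the only one that returns false.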
Features
- Keys: IP (with X-Forwarded-For / X-Real-IP support), user id (Bearer / X-User-ID), API key (X-API-Key, Authorization: ApiKey …, or api_key query), client id (X-Client-ID or client_id query), or a custom header.
- Algorithms: token-bucket, leaky-bucket, fixed-window.
- Counters: Stored only via zoox.Application.Cache() (set top-level cache in YAML for Redis; otherwise the framework's in-memory KV).
- Scope: Global defaults plus per-route overrides.
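To make the key sources concrete, here is a minimal Go sketch of the ip and header lookups, following the fallback order documented under key_type. The function names (`ipKey`, `headerKey`, `newRequest`) are hypothetical, and falling back to the IP when the configured header is missing from the request is an assumption of this sketch; the documentation only states the fallback for an empty header name.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// newRequest builds a bare request for the examples (helper for this sketch only).
func newRequest(remoteAddr string, headers map[string]string) *http.Request {
	r := &http.Request{Header: http.Header{}, RemoteAddr: remoteAddr}
	for k, v := range headers {
		r.Header.Set(k, v)
	}
	return r
}

// ipKey follows the documented order: first X-Forwarded-For hop,
// then X-Real-IP, then the connection's RemoteAddr.
func ipKey(r *http.Request) string {
	if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
		if first := strings.TrimSpace(strings.Split(xff, ",")[0]); first != "" {
			return first
		}
	}
	if realIP := strings.TrimSpace(r.Header.Get("X-Real-IP")); realIP != "" {
		return realIP
	}
	// RemoteAddr is usually "host:port"; keep the raw value if it does not parse.
	if host, _, err := net.SplitHostPort(r.RemoteAddr); err == nil {
		return host
	}
	return r.RemoteAddr
}

// headerKey uses the configured header's value and falls back to ipKey when the
// header name is empty or (assumption) the request does not carry that header.
func headerKey(r *http.Request, name string) string {
	if name != "" {
		if v := r.Header.Get(name); v != "" {
			return v
		}
	}
	return ipKey(r)
}

func main() {
	r := newRequest("203.0.113.7:4321", map[string]string{
		"X-Forwarded-For": "192.0.2.10, 10.0.0.1",
		"X-Tenant-ID":     "acme",
	})
	fmt.Println(ipKey(r))                    // 192.0.2.10
	fmt.Println(headerKey(r, "X-Tenant-ID")) // acme
	fmt.Println(headerKey(r, ""))            // 192.0.2.10
}
```

The user, apikey, and clientid key types would follow the same pattern: try each documented source in order and fall back to the IP key when none is present.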
Configuration
Global

    cache:
      host: redis.example.com
      port: 6379
      # ...
    rate_limit:
      enable: true
      algorithm: token-bucket
      key_type: ip
      limit: 100
      window: 60
      burst: 20
      message: "Rate limit exceeded"

Per route
Route-level rate_limit overrides the global policy for matching paths.

    routes:
      - name: user-service
        path: /v1/user
        rate_limit:
          enable: true
          algorithm: token-bucket
          key_type: user
          limit: 10
          window: 60
          burst: 5
        backend:
          service:
            name: user-service

Field reference (rate_limit)
YAML keys use snake_case (e.g. key_type). Required? is whether you must set the field for an effective rate-limit policy; Default is the value when the field is omitted (from struct tags / zero values). The Summary column is short; each field has a dedicated section under Field details.
| Field | Required? | Default | Summary |
|---|---|---|---|
| limit | Yes | (none) | Max requests counted per active window for this policy. |
| window | Yes | (none) | Window length in seconds; drives refill/leak behaviour with limit. |
| enable | No | false | Turns the plugin on for global and/or route scope. |
| algorithm | No | token-bucket | Which limiter implementation runs (token-bucket, leaky-bucket, fixed-window). |
| key_type | No | ip | How the per-client rate-limit key is derived from the request. |
| key_header | No | (empty) | Header name when key_type is header. |
| burst | No | 0 | Token-bucket bucket capacity; other algorithms usually ignore it. |
| message | No | Too Many Requests | Plain-text body returned with HTTP 429 when blocked. |
| headers | No | (empty map) | Extra HTTP headers attached only to 429 responses. |
Where to put fields: Use the top-level rate_limit: block for defaults; add rate_limit: under a route to override for paths that match that route (see Route matching precedence).
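The defaults and the "required for an effective policy" rule from the table can be expressed as a small Go sketch. The `Policy` struct and the `withDefaults` and `effective` helpers are hypothetical names for illustration; the gateway's own struct and its tags may differ.

```go
package main

import "fmt"

// Policy is a hypothetical mirror of the rate_limit YAML fields.
type Policy struct {
	Enable    bool
	Algorithm string
	KeyType   string
	KeyHeader string
	Limit     int64
	Window    int64 // seconds
	Burst     int64
	Message   string
	Headers   map[string]string
}

// withDefaults fills omitted optional fields with the documented defaults.
func withDefaults(p Policy) Policy {
	if p.Algorithm == "" {
		p.Algorithm = "token-bucket"
	}
	if p.KeyType == "" {
		p.KeyType = "ip"
	}
	if p.Message == "" {
		p.Message = "Too Many Requests"
	}
	return p
}

// effective reports whether the policy would actually limit traffic:
// it must be enabled, and limit and window must both be positive.
// Otherwise the policy is skipped and requests pass through.
func effective(p Policy) bool {
	return p.Enable && p.Limit > 0 && p.Window > 0
}

func main() {
	p := withDefaults(Policy{Enable: true, Limit: 100, Window: 60})
	fmt.Println(p.Algorithm, p.KeyType, p.Message, effective(p))
	fmt.Println(effective(Policy{Enable: true, Limit: 100})) // false: window missing
}
```

The second call shows why limit and window are listed as required: with either one missing or non-positive, the policy is skipped rather than rejected.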
Field details
Each subsection describes one YAML field: meaning, default, usage, and an example snippet.
limit
- Meaning: Maximum number of requests allowed within one policy window for a single rate-limit key (see key_type). The exact semantics depend on algorithm, but it always acts as the primary quota number (e.g. sustained rate or fixed-window cap).
- Default: No default; the field is required for an effective policy. If set to zero or negative, the policy is skipped (fail-open).
- Usage: Set higher limits on trusted routes or admin APIs; tighten per-route overrides for expensive endpoints. Pair with window to express "N requests per window seconds".
- Example: Allow at most 100 counted requests per active window (with window defining the window length):

      rate_limit:
        enable: true
        limit: 100
        window: 60

window
- Meaning: Length of the policy time window in seconds (integer). With token-bucket, refill rate is limit / window. With fixed-window, counts reset when the stored window expires. With leaky-bucket, leak rate uses the same ratio.
- Default: No default; required. If zero or negative, the policy is skipped.
- Usage: Short windows react quickly to bursts; long windows smooth traffic over minutes or hours. Keep the window consistent with how clients retry (see Retry-After).
- Example: A 60-second window with limit: 100 caps at 100 requests per minute (behaviour varies slightly by algorithm):

      rate_limit:
        enable: true
        limit: 100
        window: 60

enable
- Meaning: Enables the plugin for the scope where it appears: the root rate_limit block (global defaults) or a route's rate_limit (override for matching paths).
- Default: false. The plugin is registered only if at least one rate_limit.enable: true exists (global or any route).
- Usage: Turn on globally and disable specific routes only if your schema supports it; or leave global off and enable only sensitive routes.
- Example: Enable only on one route (global rate_limit omitted or enable: false):

      routes:
        - path: /v1/expensive
          rate_limit:
            enable: true
            limit: 20
            window: 60

algorithm
- Meaning: Which limiter implementation runs. Values: token-bucket (default), leaky-bucket, fixed-window; see Algorithms (summary). Unknown values fall back to token-bucket in the factory.
- Default: token-bucket when omitted.
- Usage: Use fixed-window for simple counting in cache; token-bucket for refill + burst; leaky-bucket for smoothing (rate-like behaviour).
- Example: Use a simple fixed window for an IP-keyed public API:

      rate_limit:
        enable: true
        algorithm: fixed-window
        key_type: ip
        limit: 50
        window: 60

key_type
- Meaning: How the per-client rate-limit key is derived. Values: ip, user, apikey, clientid, header. Any other string is treated like ip.
- Default: ip when omitted.
- Details:
  - ip: first X-Forwarded-For hop, then X-Real-IP, then RemoteAddr.
  - user: Authorization: Bearer token value, then X-User-ID; else falls back like ip.
  - apikey: X-API-Key, then Authorization: ApiKey …, then query api_key; else IP.
  - clientid: X-Client-ID (wins if set), else query client_id; else IP.
  - header: uses key_header; if empty, falls back like ip.
- Usage: Choose ip for anonymous traffic; user or apikey for authenticated quotas; clientid for first-class client ids; header for tenancy or other custom dimensions.
- Example (API key): header X-API-Key (and fallbacks) as the key:

      rate_limit:
        enable: true
        key_type: apikey
        limit: 1000
        window: 3600

- Example (clientid): X-Client-ID, or query client_id if the header is absent:

      rate_limit:
        enable: true
        key_type: clientid
        limit: 200
        window: 60

key_header
- Meaning: HTTP header name (not value) when key_type: header. The rate-limit key includes that header's value.
- Default: Empty string. With key_type: header and an empty name, extraction falls back like ip.
- Usage: Set to stable tenant or client identifiers (e.g. X-Tenant-ID). Avoid highly volatile headers unless intentional.
- Example: One quota bucket per tenant id carried in X-Tenant-ID:

      rate_limit:
        enable: true
        key_type: header
        key_header: X-Tenant-ID
        limit: 500
        window: 60

burst
- Meaning: For token-bucket, maximum bucket capacity (burst size). Refill rate stays limit / window. If burst ≤ 0 or omitted, capacity defaults to limit. Leaky-bucket / fixed-window may ignore this field.
- Default: 0 (meaning "use limit as bucket capacity" for token-bucket).
- Usage: Set burst greater than limit only when you want to allow a short-term spike larger than the sustained limit / window rate alone permits.
- Example: Sustain ~10 req/s (limit: 10, window: 1) but allow bursts of up to 50 requests:

      rate_limit:
        enable: true
        algorithm: token-bucket
        limit: 10
        window: 1
        burst: 50

message
- Meaning: Response body when the gateway returns 429 Too Many Requests for rate limiting.
- Default: Too Many Requests when omitted or empty (depending on gateway handling of empty strings).
- Usage: Use plain text or a JSON string your clients parse consistently with other errors.
- Example: Return a small JSON payload (quote the whole value in YAML):

      rate_limit:
        enable: true
        limit: 60
        window: 60
        message: '{"error":"too_many_requests","retry":true}'

headers
- Meaning: Extra response headers sent only on 429, in addition to X-RateLimit-* / Retry-After when the writer is available.
- Default: Empty map; no extra headers.
- Usage: Add policy names, hints, or correlation ids. Never include secrets.
- Example: Attach a stable policy label for monitoring:

      rate_limit:
        enable: true
        limit: 100
        window: 60
        headers:
          X-Rate-Policy: standard-tier

Route matching precedence
When multiple routes define rate limits:
- Candidate routes are sorted by path length (longest first).
- The request path must exactly match a route path, or match as a prefix where the next character is / (so /users matches /users/123 but not /users-extra).
Longer paths win over shorter prefixes (for example /api/v1 wins over /api for /api/v1/foo).
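The matching rules above can be sketched as follows. `matches` and `pickRoute` are hypothetical helper names; the sketch works on plain path strings and only illustrates the longest-first ordering and the "/" boundary check, not the gateway's actual routing code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// matches reports whether reqPath equals routePath or extends it at a "/" boundary.
func matches(routePath, reqPath string) bool {
	if reqPath == routePath {
		return true
	}
	// HasPrefix plus inequality guarantees reqPath is longer, so the index is safe.
	return strings.HasPrefix(reqPath, routePath) && reqPath[len(routePath)] == '/'
}

// pickRoute returns the longest matching route path, or "" and false when none match.
func pickRoute(routePaths []string, reqPath string) (string, bool) {
	sorted := append([]string(nil), routePaths...) // copy so the caller's slice is untouched
	sort.SliceStable(sorted, func(i, j int) bool { return len(sorted[i]) > len(sorted[j]) })
	for _, p := range sorted {
		if matches(p, reqPath) {
			return p, true
		}
	}
	return "", false
}

func main() {
	routes := []string{"/api", "/api/v1", "/users"}
	fmt.Println(pickRoute(routes, "/api/v1/foo")) // /api/v1 true
	fmt.Println(pickRoute(routes, "/users/123"))  // /users true
	fmt.Println(pickRoute(routes, "/users-extra")) //  false
}
```

Sorting by length first means the loop can return on the first match, since any later match would necessarily be a shorter prefix.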
Algorithms (summary)
| Algorithm | Behaviour |
|---|---|
| token-bucket | Allows bursts up to burst (or limit when burst is unset); refills at a rate of limit / window. |
| leaky-bucket | Smooth throughput; burst is not used as extra capacity in the same way as token bucket. |
| fixed-window | Simple counting within a fixed window that resets when the stored window expires (backed by Application.Cache()). |
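As an illustration of the token-bucket row, here is a minimal in-memory sketch using the burst example values (limit: 10, window: 1, burst: 50). It is not the plugin's implementation: the `bucket` type is hypothetical, time is passed in as plain seconds to keep the example deterministic, and starting with a full bucket is an assumption of this sketch.

```go
package main

import "fmt"

// bucket is an in-memory token bucket for one rate-limit key (hypothetical type).
type bucket struct {
	tokens   float64
	capacity float64
	rate     float64 // tokens added per second = limit / window
	last     float64 // time of the last refill, in seconds
}

// newBucket uses burst as capacity, or limit when burst is zero or negative.
// Assumption: the bucket starts full.
func newBucket(limit, window, burst int64, now float64) *bucket {
	capacity := float64(burst)
	if burst <= 0 {
		capacity = float64(limit)
	}
	return &bucket{
		tokens:   capacity,
		capacity: capacity,
		rate:     float64(limit) / float64(window),
		last:     now,
	}
}

// allow refills for the time elapsed since the last call,
// then spends one token if at least one is available.
func (b *bucket) allow(now float64) bool {
	if elapsed := now - b.last; elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// countAllowed tries n requests at the same instant and returns how many pass.
func countAllowed(b *bucket, n int, now float64) int {
	allowed := 0
	for i := 0; i < n; i++ {
		if b.allow(now) {
			allowed++
		}
	}
	return allowed
}

func main() {
	b := newBucket(10, 1, 50, 0)
	fmt.Println(countAllowed(b, 60, 0)) // 50: the initial burst
	fmt.Println(countAllowed(b, 20, 1)) // 10: one second of refill
}
```

After the initial burst is spent, throughput settles at limit / window, which is the distinction between burst capacity and sustained rate that the table describes.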
Cache / KV backend
Counters live in zoox.Application.Cache() — the same cache.Cache instance the framework builds from Config.Cache (cache.New). Gateway prepare() writes Config.Cache when your YAML sets cache (e.g. Redis). If you omit cache, zoox still exposes Application.Cache() using its default in-memory KV engine (not a separate plugin-owned map).
Counters use newCacheStorage(app.Cache()) only (no alternate backend selector in config).
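To show what "counters in a KV store" can look like, here is a hedged sketch of a fixed-window counter over a tiny in-memory store with per-key expiry. The `memStore` type and its method are hypothetical stand-ins written for this example; they do not reproduce the cache.Cache interface or newCacheStorage.

```go
package main

import (
	"fmt"
	"sync"
)

// entry is one stored counter with its expiry time (Unix seconds).
type entry struct {
	count     int64
	expiresAt int64
}

// memStore is a hypothetical stand-in for a KV cache with per-key expiry.
type memStore struct {
	mu   sync.Mutex
	data map[string]entry
}

func newMemStore() *memStore {
	return &memStore{data: make(map[string]entry)}
}

// fixedWindowAllow counts one request for key and reports whether it is within limit.
// A new window starts when no unexpired counter exists for the key.
func (s *memStore) fixedWindowAllow(key string, limit, window, now int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	e, ok := s.data[key]
	if !ok || now >= e.expiresAt {
		e = entry{count: 0, expiresAt: now + window}
	}
	if e.count >= limit {
		s.data[key] = e
		return false
	}
	e.count++
	s.data[key] = e
	return true
}

func main() {
	s := newMemStore()
	// limit 2 per 60-second window; the counter resets once the window expires.
	for _, now := range []int64{0, 1, 59, 60} {
		fmt.Println(now, s.fixedWindowAllow("ip:192.0.2.10", 2, 60, now))
	}
}
```

The run prints true, true, false, true: the third request exceeds the limit inside the first window, and the fourth starts a fresh window.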
Response headers
Responses to allowed requests, and many other responses, include:
- X-RateLimit-Limit
- X-RateLimit-Remaining
- X-RateLimit-Reset (Unix timestamp)
When returning 429, the gateway also sets Retry-After (seconds) when possible, plus any custom keys from headers.
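A sketch of how these headers might be assembled for a blocked request, using Go's standard net/http header type. The `build429Headers` function is a hypothetical name, and computing Retry-After as the number of seconds between now and the reset timestamp is an assumption of this example rather than documented behaviour.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
)

// build429Headers returns the headers for a blocked request.
// reset and now are Unix timestamps in seconds; extra holds the custom headers map.
func build429Headers(limit, reset, now int64, extra map[string]string) http.Header {
	h := http.Header{}
	h.Set("X-RateLimit-Limit", strconv.FormatInt(limit, 10))
	h.Set("X-RateLimit-Remaining", "0") // nothing left once the client is blocked
	h.Set("X-RateLimit-Reset", strconv.FormatInt(reset, 10))

	// Assumption: Retry-After is the time left until the reset, never negative.
	retry := reset - now
	if retry < 0 {
		retry = 0
	}
	h.Set("Retry-After", strconv.FormatInt(retry, 10))

	// Custom headers from the policy's headers map are attached only on 429.
	for k, v := range extra {
		h.Set(k, v)
	}
	return h
}

func main() {
	h := build429Headers(100, 1700000060, 1700000000, map[string]string{
		"X-Rate-Policy": "standard-tier",
	})
	fmt.Println(h.Get("X-RateLimit-Limit"), h.Get("Retry-After"), h.Get("X-Rate-Policy"))
}
```

Clients can use either X-RateLimit-Reset (absolute time) or Retry-After (relative delay) to schedule their next attempt.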
If the algorithm or cache layer returns an error, the plugin allows the request (fail-open) and logs the error.
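The fail-open behaviour can be sketched as a thin wrapper around any limiter. The `limiter` interface, `allowFailOpen`, and `stubLimiter` are hypothetical names used only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// limiter is a hypothetical interface for any algorithm backed by the cache.
type limiter interface {
	Allow(key string) (bool, error)
}

// allowFailOpen returns the limiter's decision, but lets the request through
// (and logs the error) when the limiter or its cache fails.
func allowFailOpen(l limiter, key string) bool {
	ok, err := l.Allow(key)
	if err != nil {
		log.Printf("ratelimit: allowing request for key %q after error: %v", key, err)
		return true
	}
	return ok
}

// errBackend simulates a cache failure in the examples.
var errBackend = errors.New("cache unavailable")

// stubLimiter returns fixed results; it exists only to exercise the sketch.
type stubLimiter struct {
	ok  bool
	err error
}

func (s stubLimiter) Allow(string) (bool, error) { return s.ok, s.err }

func main() {
	fmt.Println(allowFailOpen(stubLimiter{ok: false}, "ip:192.0.2.10"))                  // false: over quota
	fmt.Println(allowFailOpen(stubLimiter{ok: false, err: errBackend}, "ip:192.0.2.10")) // true: fail-open
}
```

Failing open favours availability: a cache outage degrades rate limiting instead of blocking all traffic, so it is worth alerting on the logged errors.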