
Rate limiting

Disabled by default. Add a rate_limit: block to a route, or to global: to apply it to all routes, to cap the request rate per client.

routes:
  - match:
      paths: [/login]
    upstream: http://app:8000
    rate_limit:
      rps: 2
      source:
        type: ip

A request that exceeds the budget is rejected with 429 Too Many Requests and a Retry-After: 1 header. Allowed requests pass through with no measurable overhead.

How it works

Each client is allowed rps requests in any 1-second window. Requests beyond that are blocked until older ones fall out of the window. The "client" is whatever source is set to — usually an IP address, but it can also be the value of a header (an API key, a user ID, a forwarded IP).

A worked example: protecting a login page

A common use is throttling /login to stop credential-stuffing attacks — an attacker trying thousands of username/password combinations in a tight loop. A real user submits the form once or twice; an attacker submits hundreds of times per second.

routes:
  - match:
      paths: [/login]
    upstream: http://app:8000
    rate_limit:
      rps: 2
      source:
        type: ip

Two login attempts per second, keyed by client IP. Now suppose an attacker hammers /login from a single IP:

Attempt  Time          Attempts from this IP in the last second  Result
1        12:00:00.000  none                                      200 OK (reaches app)
2        12:00:00.300  1                                         200 OK (reaches app)
3        12:00:00.600  2                                         429 Too Many Requests
4        12:00:00.900  2                                         429 Too Many Requests

Attempts 3 and 4 never reach your application — Barbacana rejects them before the login form is ever processed. The attacker is capped at 2 attempts per second per IP, which makes a brute-force run impractical.

Meanwhile, a legitimate user logging in from a different IP gets the full budget — each IP has its own independent counter. And a real user who mistypes their password and tries again twice is well within the limit; they never see a 429.

Rate limiting runs as the first stage of the pipeline, before request validation and rule evaluation. A blocked request never reaches the upstream.

What happens when a rate is hit

Response:

HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: application/json

{"error":"blocked","request_id":"01HXYZ..."}

RFC 6585 §4 allows a 429 response to carry a Retry-After header, and Barbacana always sends one. The fixed value of 1 second matches the sliding window — by the time the client retries, at least one slot will have freed up.

The block is recorded in the audit log with protection: rate-limit and CWE references CWE-400 (uncontrolled resource consumption) and CWE-770 (allocation of resources without limits). See Logs & SIEM.

Identifying the client

source.type selects how the rate-limit key is derived from the request.

ip — client IP from the connection

rate_limit:
  rps: 100
  source:
    type: ip

The key is r.RemoteAddr with the port stripped. Correct only when Barbacana terminates TLS directly (edge deployment). Behind a load balancer or reverse proxy, every request appears to come from the proxy and the limiter degrades to a single shared bucket — use header source instead.
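
Stripping the port matters because r.RemoteAddr is a host:port pair and the port changes with every connection; keying on the full string would hand the same client a fresh bucket per connection. A sketch of how the key could be derived (an assumption about the mechanics, not Barbacana's internals):

```go
package main

import (
	"fmt"
	"net"
)

// clientKey derives a rate-limit key from the connection address by
// stripping the ephemeral port, so all connections from one IP share
// a single counter.
func clientKey(remoteAddr string) string {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return remoteAddr // no port present; use the address as-is
	}
	return host
}

func main() {
	fmt.Println(clientKey("203.0.113.7:54912")) // 203.0.113.7
	fmt.Println(clientKey("[2001:db8::1]:443")) // 2001:db8::1
}
```

net.SplitHostPort also handles bracketed IPv6 literals, which a naive split on ":" would mangle.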

header — value of a named header

Use this when Barbacana sits behind a proxy or load balancer: the connection IP is the proxy's, so to throttle the real client you have to read the header the proxy injects (typically X-Forwarded-For) instead.

rate_limit:
  rps: 100
  source:
    type: header
    key: X-Forwarded-For

The key is the value of the named header. Typical uses:

  • X-Forwarded-For (or X-Real-IP) behind a trusted reverse proxy that injects the client IP.
  • Authorization to rate-limit per API token.
  • X-Api-Key to rate-limit per tenant key.

If the configured header is absent on the request, the limiter falls back to the connection IP and emits a warning log line. The proxy in front of Barbacana is responsible for ensuring the header is present and trustworthy — Barbacana does not strip a client-supplied header, so an upstream-trusted header must be sanitised by the proxy before it reaches Barbacana.

Trust the source you're identifying

A client that controls the header value can rotate it to bypass the limit. Use header source only when the value is set by infrastructure you control (a load balancer, an authenticating proxy, an API gateway).

Field reference

rate_limit:
  rps: 100                      # required
  source:
    type: ip                    # required: "ip" or "header"
    key: X-Forwarded-For        # required when type == "header"
  backend:                      # optional; defaults below
    type: memory
    max_keys: 100000
    ttl: 10m
Path                         Type      Default  Validation
rate_limit.rps               int       —        required; >= 1
rate_limit.source.type       enum      —        required; one of ip, header
rate_limit.source.key        string    —        required when source.type is header
rate_limit.backend.type      enum      memory   currently must be memory
rate_limit.backend.max_keys  int       100000   >= 1; LRU eviction once the cap is reached
rate_limit.backend.ttl       duration  10m      >= 1s; idle keys are evicted after this period

State, restarts, and horizontal scaling

The memory backend — the only backend available today — keeps all counters in process memory. Two consequences worth knowing before you size a deployment:

Counters reset on restart. A new deploy, a crash, or a restart empties the cache. A client that was at its limit when the process went down starts the next second with a full budget. This is rarely a problem for the use case Barbacana is built for (slowing down attackers) — even an attacker who manages to time a restart only buys a single 1-second window before the new process starts counting them again.

Each instance counts independently. With N replicas behind a load balancer, a single client can use up to N × rps in the worst case. Usually fine for slowing attackers, not a true shared budget. A Redis backend is on the roadmap — the backend: block already accepts the type, so the YAML won't change when it lands.
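
The N × rps worst case can be made concrete with a small simulation (a sketch under the assumption of perfectly even round-robin load balancing):

```go
package main

import "fmt"

// admittedWorstCase models N replicas that each run an independent
// in-memory counter: a client spreading a 1-second burst evenly across
// them is admitted up to replicas * rps times, not rps times.
func admittedWorstCase(replicas, rps, burst int) int {
	admitted := 0
	perReplica := make([]int, replicas)
	for i := 0; i < burst; i++ {
		r := i % replicas // round-robin across the load balancer
		if perReplica[r] < rps {
			perReplica[r]++
			admitted++
		}
	}
	return admitted
}

func main() {
	// 3 replicas with rps: 50, hit by a 1-second burst of 500 requests:
	fmt.Println(admittedWorstCase(3, 50, 500)) // 150 admitted, not 50
}
```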

All this means that rate limiting is not a hard quota system. Don't use it for billing or to enforce contractual API limits.

Global default and per-route override

A rate_limit: block is accepted at both global: and routes[]:. The route-level block replaces the global block entirely; there is no field-level merging. A route with no rate_limit: block inherits the global one. There is no syntax to opt a single route out of a global default, so if some routes should stay unlimited, attach rate_limit: to the routes that need it rather than to global:.

global:
  rate_limit:
    rps: 50                     # fleet-wide default
    source:
      type: ip

routes:
  - id: app
    match: { paths: [/*] }
    upstream: http://app:8000
    # inherits the global rate_limit (50/s)

  - id: login
    match: { paths: [/login] }
    upstream: http://app:8000
    rate_limit:                 # replaces the global block — not merged
      rps: 2
      source:
        type: ip

Detect-only mode

When the route is in detect_only mode (or the whole instance via global.mode: detect_only), a request that would have been blocked is forwarded to the upstream and an audit entry is emitted with action: detected. Use this when you need to measure traffic against a candidate rps value before turning blocking on. See Detect-only mode.

Examples

Login throttle at the edge

Barbacana terminates TLS, so r.RemoteAddr is the real client IP. Two login attempts per second per IP — enough for a real user who mistypes, far too few to brute-force a password.

version: v1alpha1
host: app.example.com

routes:
  - match: { paths: [/login] }
    upstream: http://login:8000
    rate_limit:
      rps: 2
      source:
        type: ip

Login throttle behind a load balancer

Barbacana runs behind a reverse proxy. Using source.type: ip would key on the load balancer's address (a single bucket for every login attempt from every user), so read the forwarded header instead.

version: v1alpha1
port: 8080

routes:
  - match: { paths: [/login] }
    upstream: http://login:8000
    rate_limit:
      rps: 2
      source:
        type: header
        key: X-Forwarded-For

The proxy must strip any client-supplied X-Forwarded-For before appending its own — otherwise an attacker can forge the header and rotate the value to bypass the limit.

Stricter limit on /login, looser default everywhere else

A fleet-wide default protects every route from runaway clients; /login gets a much tighter cap because credential stuffing is the threat there.

global:
  rate_limit:
    rps: 50                     # plenty of headroom for normal browsing
    source:
      type: ip

routes:
  - id: app
    match: { paths: [/*] }
    upstream: http://app:8000
    # inherits the 50/s global default

  - id: login
    match: { paths: [/login] }
    upstream: http://login:8000
    rate_limit:                 # overrides the global block — not merged
      rps: 2
      source:
        type: ip

Per-API-key limit

A different shape: rate-limit each authenticated tenant regardless of which IP they call from. Useful for an API where you publish per-customer quotas.

version: v1alpha1
port: 8080

routes:
  - id: api
    match: { paths: ["/v1/*"] }
    upstream: http://api:8000
    rate_limit:
      rps: 20
      source:
        type: header
        key: X-Api-Key