API Rate Limiting Guide | Algorithms, Redis Implementation, and Best Practices

TL;DR

Rate limiting protects your API from abuse, ensures fair use, and prevents cascading failures. This guide covers the three main algorithms, Redis-backed implementation, and production patterns for Node.js APIs.

Why Rate Limiting Matters

Without rate limiting:

  • A single script can exhaust your database connections
  • One angry user can DoS your API
  • Credential stuffing attacks run unhindered
  • You have no way to enforce pricing tiers

Rate Limiting Algorithms

1. Fixed Window

Count requests in a fixed time window (e.g., 100 requests per minute).

Minute 0:00-1:00 → 100 requests allowed
Minute 1:00-2:00 → counter resets, 100 more allowed

Problem: A user can make 100 requests at 0:59 and 100 more at 1:01 — 200 requests in 2 seconds.
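The boundary burst is easy to see with a toy in-memory counter (illustration only, no Redis; the names here are made up for the demo):

```javascript
// Illustration only: an in-memory fixed-window counter showing how the
// window boundary lets 2x the limit through in a short burst.
function makeFixedWindow(limit, windowSeconds) {
  const counts = new Map()  // window index -> request count
  return function allow(nowSeconds) {
    const window = Math.floor(nowSeconds / windowSeconds)
    const count = (counts.get(window) ?? 0) + 1
    counts.set(window, count)
    return count <= limit
  }
}

// 100 requests at t=59s (window 0) and 100 more at t=61s (window 1):
const allow = makeFixedWindow(100, 60)
let allowed = 0
for (let i = 0; i < 100; i++) if (allow(59)) allowed++
for (let i = 0; i < 100; i++) if (allow(61)) allowed++
// allowed === 200 — double the intended per-minute rate
```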

// Redis implementation
async function fixedWindowLimit(key, limit, windowSeconds) {
  const now = Math.floor(Date.now() / 1000)
  const windowKey = `ratelimit:${key}:${Math.floor(now / windowSeconds)}`

  const count = await redis.incr(windowKey)
  if (count === 1) await redis.expire(windowKey, windowSeconds)

  return { allowed: count <= limit, count, limit }
}

2. Sliding Window Log

Track exact timestamps of each request. Count requests in the last N seconds.

async function slidingWindowLog(key, limit, windowSeconds) {
  const now = Date.now()
  const windowStart = now - windowSeconds * 1000
  const logKey = `ratelimit:log:${key}`

  const pipeline = redis.pipeline()
  pipeline.zremrangebyscore(logKey, 0, windowStart)  // remove old entries
  pipeline.zadd(logKey, now, `${now}-${Math.random()}`)  // add current
  pipeline.zcard(logKey)                             // count in window
  pipeline.expire(logKey, windowSeconds)
  const results = await pipeline.exec()

  const count = results[2][1]  // [err, result] pair from the ZCARD step
  return { allowed: count <= limit, count, limit }
}

Accurate but memory-intensive — stores one entry per request.
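The counting logic is the same without Redis; this self-contained sketch (illustration only) also makes the memory cost concrete — one entry per request in the window, exactly like one ZSET member per request above:

```javascript
// Illustration only: an in-memory sliding window log. Note a rejected
// request still lands in the log (as in the Redis version above), so
// hammering a full window keeps it full.
function makeSlidingWindowLog(limit, windowSeconds) {
  const log = []  // request timestamps in ms, oldest first
  return function allow(nowMs) {
    const windowStart = nowMs - windowSeconds * 1000
    while (log.length && log[0] <= windowStart) log.shift()  // drop expired entries
    log.push(nowMs)
    return log.length <= limit
  }
}

// 3 requests per 60s: the first three at t=0 pass, the fourth is rejected
const allow = makeSlidingWindowLog(3, 60)
allow(0); allow(0); allow(0)
```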

3. Token Bucket

A bucket fills with tokens at a constant rate. Each request consumes one token. Allows bursts up to bucket capacity.

// Token bucket with Redis
async function tokenBucket(key, capacity, refillRate, refillSeconds) {
  const now = Date.now() / 1000  // seconds
  const bucketKey = `ratelimit:bucket:${key}`

  // Lua script for atomicity
  const script = `
    local key = KEYS[1]
    local capacity = tonumber(ARGV[1])
    local refill_rate = tonumber(ARGV[2])
    local refill_seconds = tonumber(ARGV[3])
    local now = tonumber(ARGV[4])

    local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
    local tokens = tonumber(bucket[1]) or capacity
    local last_refill = tonumber(bucket[2]) or now

    -- Refill tokens based on elapsed time
    local elapsed = now - last_refill
    local new_tokens = math.min(capacity, tokens + (elapsed * refill_rate / refill_seconds))

    if new_tokens < 1 then
      redis.call('HMSET', key, 'tokens', new_tokens, 'last_refill', now)
      redis.call('EXPIRE', key, refill_seconds * 2)
      return {0, math.ceil((1 - new_tokens) * refill_seconds / refill_rate)}
    end

    redis.call('HMSET', key, 'tokens', new_tokens - 1, 'last_refill', now)
    redis.call('EXPIRE', key, refill_seconds * 2)
    return {1, 0}
  `

  const [allowed, retryAfter] = await redis.eval(
    script, 1, bucketKey, capacity, refillRate, refillSeconds, now
  )

  return { allowed: allowed === 1, retryAfter }
}

// Usage: 100 requests per minute, burst up to 20
await tokenBucket('user:123', 20, 100, 60)
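The refill math in the Lua script can be unit-tested locally with a plain-JS version (illustration only — it is not safe across processes; the Lua script exists precisely to make check-and-decrement atomic in Redis):

```javascript
// Illustration only: the token bucket refill math from the Lua script,
// in plain JS for local testing. Single-process, not production-safe.
function makeTokenBucket(capacity, refillRate, refillSeconds) {
  let tokens = capacity
  let lastRefill = null
  return function take(nowSeconds) {
    if (lastRefill === null) lastRefill = nowSeconds
    // Refill tokens based on elapsed time, capped at capacity
    const elapsed = nowSeconds - lastRefill
    tokens = Math.min(capacity, tokens + (elapsed * refillRate) / refillSeconds)
    lastRefill = nowSeconds
    if (tokens < 1) {
      return { allowed: false, retryAfter: Math.ceil(((1 - tokens) * refillSeconds) / refillRate) }
    }
    tokens -= 1
    return { allowed: true, retryAfter: 0 }
  }
}

// Same parameters as the usage above: burst of 20, 100 tokens/minute
const take = makeTokenBucket(20, 100, 60)
```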

Express Middleware

Using express-rate-limit (simple)

npm install express-rate-limit rate-limit-redis

import rateLimit from 'express-rate-limit'
import { RedisStore } from 'rate-limit-redis'
import { redisClient } from './redis'

// Global rate limit
const globalLimit = rateLimit({
  windowMs: 15 * 60 * 1000,  // 15 minutes
  max: 1000,
  standardHeaders: true,     // draft-standard RateLimit-* headers
  legacyHeaders: false,      // disable legacy X-RateLimit-* headers
  store: new RedisStore({ sendCommand: (...args) => redisClient.sendCommand(args) }),
  message: { error: 'Too many requests', retryAfter: 900 },
})

// Strict limit for auth endpoints
const authLimit = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 10,
  skipSuccessfulRequests: true,  // only count failed attempts
  message: { error: 'Too many login attempts' },
})

app.use(globalLimit)
app.post('/auth/login', authLimit, loginHandler)
app.post('/auth/register', authLimit, registerHandler)

Custom middleware with per-user limits

// Different limits per subscription tier
const TIER_LIMITS = {
  free: { requests: 100, window: 3600 },     // 100/hour
  pro: { requests: 1000, window: 3600 },     // 1000/hour
  enterprise: { requests: 10000, window: 3600 }, // 10000/hour
}

async function rateLimitMiddleware(req, res, next) {
  const user = req.user  // set by auth middleware
  const tier = user?.subscriptionTier ?? 'free'
  const { requests, window } = TIER_LIMITS[tier]

  const key = user ? `user:${user.id}` : `ip:${req.ip}`
  // slidingWindowLog (defined above) doesn't return retryAfter; the ?? fallbacks below cover that
  const { allowed, count, retryAfter } = await slidingWindowLog(key, requests, window)

  // Set standard headers
  res.setHeader('X-RateLimit-Limit', requests)
  res.setHeader('X-RateLimit-Remaining', Math.max(0, requests - count))
  res.setHeader('X-RateLimit-Reset', Math.floor(Date.now() / 1000) + window)

  if (!allowed) {
    res.setHeader('Retry-After', retryAfter ?? window)
    return res.status(429).json({
      error: 'Rate limit exceeded',
      limit: requests,
      window: `${window}s`,
      retryAfter: retryAfter ?? window,
    })
  }

  next()
}

Per-Endpoint Rate Limiting

// Different limits for different operations
const limits = {
  // Expensive operations
  '/api/generate-report': { requests: 5, window: 3600 },    // 5/hour
  '/api/export': { requests: 10, window: 3600 },             // 10/hour

  // API key endpoints
  '/api/v1/': { requests: 10000, window: 3600 },            // 10k/hour

  // Public endpoints
  '/api/posts': { requests: 500, window: 3600 },             // 500/hour
}

function getLimit(path) {
  // First prefix match wins — order patterns from most to least specific
  for (const [pattern, limit] of Object.entries(limits)) {
    if (path.startsWith(pattern)) return limit
  }
  return { requests: 1000, window: 3600 }  // default
}

Handling Rate Limit Responses (Client Side)

// JavaScript fetch client with automatic retry
async function fetchWithRetry(url, options = {}, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options)

    if (response.status === 429) {
      const retryAfter = parseInt(response.headers.get('Retry-After') ?? '60')
      console.warn(`Rate limited. Retrying in ${retryAfter}s`)
      await new Promise(r => setTimeout(r, retryAfter * 1000))
      continue
    }

    return response
  }
  throw new Error('Max retries exceeded')
}

// Axios interceptor
axios.interceptors.response.use(null, async (error) => {
  if (error.response?.status === 429) {
    const retryAfter = Number(error.response.headers['retry-after'] ?? 60)
    await new Promise(r => setTimeout(r, retryAfter * 1000))
    return axios.request(error.config)
  }
  return Promise.reject(error)
})
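When the server omits Retry-After entirely, a common fallback (not shown in the snippets above) is exponential backoff with jitter. A sketch of the delay calculation, with illustrative base and cap values:

```javascript
// Fallback delay when no Retry-After header is present: exponential
// backoff with full jitter, capped. baseMs/capMs are illustrative.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000) {
  const exp = Math.min(capMs, baseMs * 2 ** attempt)  // 1s, 2s, 4s, ... capped at 30s
  return Math.floor(Math.random() * exp)              // full jitter in [0, exp)
}
```

Plugging this into fetchWithRetry when the header is missing spreads retries out instead of having every client retry on the same schedule.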

Production Patterns

IP-based vs user-based

function getRateLimitKey(req) {
  // Authenticated users: rate limit by user ID (fair per account)
  if (req.user?.id) return `user:${req.user.id}`

  // API keys: rate limit by key
  if (req.headers['x-api-key']) return `apikey:${req.headers['x-api-key']}`

  // Unauthenticated: rate limit by IP
  // Note: use X-Forwarded-For carefully behind proxies
  const ip = req.headers['x-forwarded-for']?.split(',')[0].trim() ?? req.ip
  return `ip:${ip}`
}

Bypass for trusted clients

const TRUSTED_IPS = new Set(['10.0.0.1', '10.0.0.2'])

function rateLimitMiddleware(req, res, next) {
  if (TRUSTED_IPS.has(req.ip)) return next()
  if (req.user?.isAdmin) return next()
  // ... apply rate limiting
}

Cloudflare / Nginx rate limiting (infrastructure layer)

# nginx.conf — coarse rate limiting before app
limit_req_zone $binary_remote_addr zone=api:10m rate=100r/m;

location /api/ {
  limit_req zone=api burst=20 nodelay;
  limit_req_status 429;
  proxy_pass http://app;
}

Response Headers Reference

X-RateLimit-Limit: 1000          # requests allowed in window
X-RateLimit-Remaining: 847       # requests remaining
X-RateLimit-Reset: 1713312000    # Unix timestamp when limit resets
Retry-After: 3600                # seconds until client can retry (on 429)
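On the client side, these headers can be read with a small helper (hypothetical, not part of any library; works with anything exposing a `.get(name)` method, like fetch's `Headers`):

```javascript
// Hypothetical helper: parse rate-limit headers from a Headers-like object.
function parseRateLimitHeaders(headers) {
  const get = (name) => headers.get(name)
  return {
    limit: Number(get('X-RateLimit-Limit') ?? 0),
    remaining: Number(get('X-RateLimit-Remaining') ?? 0),
    resetAt: new Date(Number(get('X-RateLimit-Reset') ?? 0) * 1000),  // Unix seconds -> Date
    retryAfterSeconds: get('Retry-After') ? Number(get('Retry-After')) : null,
  }
}
```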

Key Takeaways

Algorithm           | Best for
--------------------|---------------------------------------------
Fixed window        | Simple, high-traffic counters
Sliding window log  | Accurate limits, lower traffic
Token bucket        | APIs that allow bursts (recommended default)

Pattern             | Recommendation
--------------------|---------------------------------------------
State store         | Redis (shared across servers)
Rate by             | User ID > API key > IP
Status code         | 429 with Retry-After header
Infrastructure      | Nginx/Cloudflare for DDoS, app for per-user
Different limits    | Per endpoint, per tier, per operation type

Implement rate limiting early — retrofitting it onto a production API is painful. Start with express-rate-limit + Redis for simplicity, then implement custom token bucket logic if you need tier-based limits or burst control.