Tiered API Rate Limits in Rails 8.2 with Dynamic rate_limit
Note: This feature is coming in Rails 8.2. It’s merged to main but not yet released. You can view the source on GitHub or try it by pointing your Gemfile at the main branch.
Since Rails 7.2, you’ve had a rate_limit macro in your controllers. The catch was that to: and within: had to be hardcoded numbers, which made tiered pricing awkward. Rails 8.2 lets both options take procs or method names, so you can finally do “Free 100/min, Pro 1k/hour, Enterprise unlimited” without bringing in Rack::Attack.
Why rate_limit was static before Rails 8.2
Before 8.2, rate_limit was static-only:
class Api::BaseController < ApplicationController
rate_limit to: 100, within: 1.minute
end
Every user got the same limit. If you wanted per-plan tiers before, your options were:
- Skip rate_limit and reach for Rack::Attack, where the per-user lookup happens before Rails has loaded current_user.
- Hand-roll a before_action that counted requests in the cache and rendered 429 yourself.
- Define one controller per tier and route subclasses based on the plan.
All three duplicated logic Rails already had. The piece that was missing was the ability to compute the limit from request context.
How dynamic rate_limit works in Rails 8.2
In Rails 8.2 (PR #56128), to: and within: accept procs and method names alongside static values, just like by: already did. The rules are simple:
- A symbol is dispatched as a no-argument method on the controller (send(:max_requests)).
- A proc or lambda runs as if it were a method on the controller, so current_user, params, request, and session are all in scope. Procs take no arguments.
That’s it. The rest of the post is what you can build with it.
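To make those two dispatch rules concrete, here is a plain-Ruby sketch of the semantics. This is an illustration, not Rails’ actual implementation; FakeController and resolve_option are hypothetical names invented for the example.

```ruby
# Sketch of how a dynamic option could be resolved against a controller.
# Assumption: this mirrors the dispatch rules described above, not Rails' internals.
class FakeController
  def max_requests = 100

  def resolve_option(option)
    case option
    when Symbol then send(option)           # no-arg method on the controller
    when Proc   then instance_exec(&option) # runs in controller scope
    else option                             # static values pass through unchanged
    end
  end
end

controller = FakeController.new
controller.resolve_option(:max_requests)           # => 100
controller.resolve_option(-> { max_requests / 2 }) # => 50
controller.resolve_option(250)                     # => 250
```

The proc sees max_requests because it is evaluated in the controller’s binding, which is exactly why current_user and params work inside the real macro’s procs.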
Pattern 1: Free / Pro / Enterprise Tiers
If you’re building multi-tenant SaaS, this is the pattern you’ve been waiting for. Here’s how it looks:
class Api::BaseController < ApplicationController
before_action :authenticate_api_user
rate_limit to: :max_requests,
within: :rate_window,
by: -> { current_user.id }
private
def max_requests
case current_user.plan
when "enterprise" then 10_000
when "pro" then 1_000
else 100
end
end
def rate_window
current_user.plan == "free" ? 1.minute : 1.hour
end
end
On every request, Rails calls those two methods on the controller, so the limit reflects whatever plan the user is on right now. The by: proc keys the counter on the user’s ID, giving each user their own bucket.
Testing tiered limits
Testing rate limit configurations comes down to two things: clear the cache between runs, and time-travel across the window.
require "test_helper"
class Api::BaseControllerTest < ActionDispatch::IntegrationTest
include ActiveSupport::Testing::TimeHelpers
setup { Rails.cache.clear }
test "free users hit 100 requests per minute" do
sign_in_as users(:free_user)
100.times { get api_widgets_url }
assert_response :success
get api_widgets_url
assert_response :too_many_requests
travel 1.minute do
get api_widgets_url
assert_response :success
end
end
end
For parallel test runs, give each worker its own cache, or swap to :null_store and assert the macro is configured rather than asserting on the count.
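The per-worker cache option can be wired up in parallelize_setup. A sketch, assuming the standard Rails parallel test setup in test_helper.rb:

```ruby
# test_helper.rb — a sketch. Each worker gets its own in-memory cache,
# so rate-limit counters from one worker never bleed into another.
class ActiveSupport::TestCase
  parallelize(workers: :number_of_processors)

  parallelize_setup do |worker|
    Rails.cache = ActiveSupport::Cache::MemoryStore.new
  end
end
```

MemoryStore is per-process, which is a liability in production but exactly what you want for isolated test workers.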
Pattern 2: Burst + Sustained Limits
1,000 requests an hour sounds generous until a user fires all of them in three seconds. Layer a burst limit on top of the sustained one:
class Api::BaseController < ApplicationController
before_action :authenticate_api_user
rate_limit to: :sustained_limit,
within: 1.hour,
name: "sustained",
by: -> { current_user.id }
rate_limit to: :burst_limit,
within: 10.seconds,
name: "burst",
by: -> { current_user.id }
private
def sustained_limit = current_user.plan == "pro" ? 1_000 : 100
def burst_limit = current_user.plan == "pro" ? 50 : 10
end
Once you have more than one rate_limit in a controller, name: becomes required. That’s what tells Rails which counter is which in the cache.
Pattern 3: Anonymous vs Authenticated
If your endpoint is public, there’s no current_user to key on. Drop down to IP and a lower ceiling. The by: proc handles both cases in one shot, and the same trick works for API keys:
class Api::PublicController < ApplicationController
rate_limit to: -> { current_user ? 1_000 : 20 },
within: 1.minute,
by: -> { current_user&.id || request.remote_ip }
end
class Api::KeyAuthedController < ApplicationController
rate_limit to: 5_000,
within: 1.hour,
by: -> { request.headers["X-Api-Key"] }
end
A gotcha to be aware of: if you’re behind Cloudflare or another proxy, request.remote_ip will be the proxy’s IP unless config.action_dispatch.trusted_proxies is set. Skip that and every anonymous request hashes to the same bucket, which makes your rate limit fairly useless.
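Configuring trusted proxies looks roughly like this. A sketch: the CIDR range below is an illustrative placeholder (TEST-NET-3), not a real Cloudflare range; use your provider’s published list, and note that setting the option replaces Rails’ defaults unless you include them yourself.

```ruby
# config/environments/production.rb — a sketch with a placeholder range.
config.action_dispatch.trusted_proxies = [
  ActionDispatch::RemoteIp::TRUSTED_PROXIES, # keep Rails' default private ranges
  IPAddr.new("203.0.113.0/24")               # placeholder; substitute your proxy's ranges
].flatten
```

With this in place, request.remote_ip walks past the trusted proxies and returns the client’s real address, so the by: proc buckets users correctly again.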
Pattern 4: Friendly 429 Responses (with Instrumentation)
By default, hitting the limit raises ActionController::TooManyRequests, which Action Dispatch returns as a plain 429. That’s fine for HTML, but API clients usually want JSON and a Retry-After header. While you’re customizing the response, the same with: block is a good place to fire an ActiveSupport::Notifications event for your APM, since the macro doesn’t emit one on its own.
class Api::BaseController < ApplicationController
rate_limit to: :max_requests,
within: :rate_window,
by: -> { current_user.id },
with: -> {
ActiveSupport::Notifications.instrument(
"rate_limit.exceeded",
user_id: current_user.id,
plan: current_user.plan,
controller: self.class.name
)
response.headers["Retry-After"] = rate_window.to_i.to_s
render json: {
error: "rate_limit_exceeded",
plan: current_user.plan,
retry_after: rate_window.to_i
}, status: :too_many_requests
}
end
Subscribe to rate_limit.exceeded in an initializer and you’ve got a feed for Datadog, Honeycomb, or wherever your traffic dashboards live. Clients (or your background jobs using smart retry strategies) can read Retry-After and back off accordingly.
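A minimal subscriber sketch, assuming the rate_limit.exceeded event name from the controller above; swap the logger call for your APM client’s API:

```ruby
# config/initializers/rate_limit_instrumentation.rb — a sketch.
ActiveSupport::Notifications.subscribe("rate_limit.exceeded") do |event|
  payload = event.payload
  Rails.logger.warn(
    "[rate_limit] user=#{payload[:user_id]} plan=#{payload[:plan]} " \
    "controller=#{payload[:controller]}"
  )
end
```

A single-argument block receives an ActiveSupport::Notifications::Event, which also carries timing data if you want to record it.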
Rails rate_limit caveats and edge cases
A few things the macro doesn’t tell you up front:
- Cache store choice. rate_limit uses Rails.cache. :memory_store is per-process, so two Puma workers won’t share counts; :file_store won’t work across servers. In production, use :solid_cache_store, :redis_cache_store, or :mem_cache_store. With any of those, cache_store.increment is atomic, so concurrent requests across multiple app servers stay correct without extra coordination.
- Fixed-window, not sliding. Counters reset on window boundaries. A user can fire limit requests at 12:59:59 and another limit at 13:00:00. If that bothers you, layer a tighter burst limit (Pattern 2).
- Plan upgrades take effect on the next window. The limit is recomputed each request, but the in-flight window’s count carries over. A free user at 99/100 who upgrades to Pro will still get a 429 on their next request that minute. Including current_user.plan in the by: key sidesteps this if you really need an instant upgrade.
- Procs run in the controller’s binding. current_user, params, and request work; local variables defined where you wrote rate_limit do not. Stick to controller methods and instance state.
- Symbol methods take no arguments. They’re dispatched via send(name), so just return the value. Don’t try to accept the request as a parameter.
- Cross-controller shared limits. Pass scope: to make multiple controllers share a bucket. Handy for “all writes count against one limit” setups.
- Rack::Attack still has a place. Keep it for IP allow/blocklists, fail2ban-style abuse rules, and anything that needs to run before Rails has booted the request. rate_limit is the right tool for per-controller, per-user, plan-aware policy where you already have controller context.
- Performance. A method call or proc invocation per request is in the microsecond range. Don’t waste time worrying about it.
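The fixed-window caveat is easiest to see in code. A plain-Ruby sketch of fixed-window counting, illustrating the boundary behavior rather than Rails’ actual implementation (FixedWindow is a hypothetical class for this example):

```ruby
# Fixed-window counter sketch: each (key, window index) pair gets its own
# count, and the count resets the instant the window index ticks over.
class FixedWindow
  def initialize(limit:, window:)
    @limit  = limit
    @window = window        # window length in seconds
    @counts = Hash.new(0)   # (key, window index) => request count
  end

  # True while the caller is still under the limit for the current window.
  def allow?(key, at: Time.now)
    bucket = [key, at.to_i / @window] # integer division picks the window
    (@counts[bucket] += 1) <= @limit
  end
end

w = FixedWindow.new(limit: 2, window: 60)
t = Time.at(0)
w.allow?("user:1", at: t)      # => true
w.allow?("user:1", at: t + 1)  # => true
w.allow?("user:1", at: t + 59) # => false (same window, over the limit)
w.allow?("user:1", at: t + 60) # => true  (new window, counter reset)
```

The back-to-back true at t + 59 rejected and t + 60 allowed is exactly the 12:59:59 / 13:00:00 double-burst described above.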
FAQ
How do I rate-limit Rails API requests by user plan?
Pass a method name or proc to the to: and within: options of rate_limit and have it return values based on current_user.plan. Rails 8.2 evaluates these on every request, so the limit reflects the user’s current plan.
Can I have multiple rate_limit declarations in one controller?
Yes. Add a unique name: to each one (for example name: "burst" and name: "sustained"). Without a name, Rails raises when you declare more than one in the same controller.
Does Rails rate_limit work across multiple application servers?
Yes, as long as the cache store is shared. Solid Cache, Redis, and Memcached all support atomic increments, so concurrent requests across servers stay correctly counted. The default :memory_store is per-process and won’t work across servers or even multiple Puma workers.
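Switching the shared store is one line of configuration. A sketch; pick whichever backend your stack already runs:

```ruby
# config/environments/production.rb — a sketch.
config.cache_store = :solid_cache_store
# or, with a shared Redis:
# config.cache_store = :redis_cache_store, { url: ENV["REDIS_URL"] }
```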
What’s the difference between Rails rate_limit and Rack::Attack?
rate_limit runs inside the Rails request cycle, so it has access to current_user, controller methods, and routing context. Rack::Attack runs at the Rack layer before Rails boots the request, which makes it the right tool for IP allow/blocklists and abuse rules. Use both: Rack::Attack for low-level IP filtering, rate_limit for per-user, plan-aware policy.
How do I send a Retry-After header from rate_limit?
Use the with: option to provide a custom callback that sets response.headers["Retry-After"] and renders your 429 response. The default behavior raises ActionController::TooManyRequests, which produces a plain 429 with no header.
Tiered rate limits in production
This change makes rate_limit a lot more useful. What used to be a middleware concern (or a custom before_action) now fits into a few lines on your Api::BaseController. Tiered pricing, burst plus sustained limits, anonymous fallbacks, API-key buckets, observability hooks, all of it just lives in the macro now.
The implementation lives in PR #56128 if you want to read the source. If you’re shipping a tiered API for real, pair this with smart retry strategies so your clients respect the Retry-After headers you’re now returning. And if you’re keying off API tokens instead of session users, bearer token authentication works well with this.