Rate Limit Use Cases

Discover when requests to a tenant are rate-limited

There are several ways to determine whether a customer product is being rate limited by Auth0. The sections below describe each detection method.

Tenant logs

Tenant logs provide more information on request limits. An api_limit event is written to the logs immediately after a rate limit is exceeded. If the rate limit is still being exceeded after an hour, a second log entry is created. This enables rate limit detection in both Auth0 and customer-built applications.

To learn more about tenant logs, read Logs.
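As a minimal sketch of detecting rate limiting from logs, the following assumes that log entries have already been retrieved as dictionaries, that each entry exposes its event code under a `type` key, and that rate limit events carry the api_limit code mentioned above. The function name, entry shape, and sample data are illustrative assumptions, not a documented export format.

```python
from typing import Iterable


def find_rate_limit_events(entries: Iterable[dict]) -> list[dict]:
    """Return the log entries whose event type is api_limit."""
    # Assumption: each entry exposes its event code under the "type" key.
    return [entry for entry in entries if entry.get("type") == "api_limit"]


# Hypothetical sample entries, for illustration only.
sample_logs = [
    {"type": "s", "date": "2023-02-03T19:29:58.000Z"},
    {"type": "api_limit", "date": "2023-02-03T19:30:00.000Z"},
]
rate_limit_events = find_rate_limit_events(sample_logs)
```

A customer-built application could run a filter like this over each batch of logs it receives and raise an alert when the result is not empty.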

API responses

Auth0 APIs return an HTTP 429 (Too Many Requests) response when a rate limit is exceeded. This lets customers observe rate limit enforcement in real time. However, this is only useful for custom-built customer applications that interact directly with the Auth0 API.

SDK error handling

If you are using an SDK, refer to the Management API SDK libraries error pages.

Error pages

The error page response is sent for endpoints that render HTML content to the end user. If your tenant is configured to use generic (Auth0 hosted) pages, Auth0 renders the error page instead of the expected content when you exceed the rate limit. If your tenant is configured to use custom error pages, the user is redirected to the custom error page URL with the relevant error in the error_description query string parameter. For more information, see Affected endpoints and the JSON Error descriptions.
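A custom error page can read the error_description parameter from the URL it was redirected to. The sketch below uses only the Python standard library; the function name, host, path, and message text in the sample URL are made up for illustration and are not values Auth0 is known to send.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_error_description(redirect_url: str) -> Optional[str]:
    """Return the error_description query parameter, or None if absent."""
    query = parse_qs(urlparse(redirect_url).query)
    values = query.get("error_description")
    return values[0] if values else None


# Hypothetical redirect URL; the host, path, and message text are assumptions.
url = "https://example.com/error?error_description=Rate%20limit%20exceeded"
description = extract_error_description(url)
```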

Find out why a tenant is being rate limited

If you believe tenant requests are rate limited and need assistance to understand why, open a request via the Support Center.  As part of your request, please include the full raw log where the issue was seen.

Predict when requests to a tenant will be rate limited 

Auth0 reports up-to-date information on the current status of your rate limits using HTTP response headers from endpoints that have rate limit policies configured. This status is communicated as follows:

  • x-ratelimit-limit: Maximum number of requests available until the bucket is refilled with additional requests.

  • x-ratelimit-remaining: Number of remaining requests available until the bucket is refilled with additional requests.

  • x-ratelimit-reset: UNIX timestamp, in milliseconds, of the expected time when additional requests will be added to the bucket.

For example:

An API has the following rate limit:

  • Burst Limit: 1000

  • Sustained Rate Limit: 100 requests per second (on a sliding window)

From this information, you can derive:

  • The sustained rate limit is 100 requests per second on a sliding window

  • 1000 milliseconds = 1 second 

  • Due to the sliding window, a new request is granted every 10 milliseconds (1000 milliseconds divided by 100 requests = 10 milliseconds for a request to be granted).
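The interval arithmetic above can be sketched as a small helper. The function name and parameters are illustrative; the only assumption is the one the example already makes, namely that the sustained rate is spread evenly over the window.

```python
def refill_interval_ms(requests_per_window: int, window_ms: int = 1000) -> float:
    """Milliseconds between grants when a rate is spread evenly over a window."""
    return window_ms / requests_per_window


# 100 requests per second, as in the example above.
interval = refill_interval_ms(100)
```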

If you receive the following x-headers in your API response:

  • x-ratelimit-limit: 1000

  • x-ratelimit-remaining: 50

  • x-ratelimit-reset: 1675452600000

You now know that:

  • Your tenant has used 950 of the 1000 requests allowed for that API and has only 50 requests remaining until additional requests are added.

  • New requests will be added at 1675452600000, or 7:30:00.000 PM UTC on February 3, 2023.

  • 1 new request will be added at that time, because the sliding window grants one request every 10 milliseconds.

Therefore, if you are making requests at a rate greater than what is described above, you should expect to be rate limited. How soon you will be rate limited depends on the burst limit and on how far you are exceeding the sustained limit.
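The header interpretation above can be sketched in a few lines. This is a minimal illustration, assuming the response headers are available as a dictionary with lowercase keys and string values; the function name and the returned field names are hypothetical.

```python
from datetime import datetime, timezone


def summarize_rate_limit(headers: dict) -> dict:
    """Interpret the x-ratelimit-* headers of a single API response."""
    limit = int(headers["x-ratelimit-limit"])
    remaining = int(headers["x-ratelimit-remaining"])
    reset_ms = int(headers["x-ratelimit-reset"])
    return {
        "used": limit - remaining,
        "remaining": remaining,
        # The reset value is treated as milliseconds since the UNIX epoch.
        "reset_at": datetime.fromtimestamp(reset_ms / 1000, tz=timezone.utc),
    }


summary = summarize_rate_limit({
    "x-ratelimit-limit": "1000",
    "x-ratelimit-remaining": "50",
    "x-ratelimit-reset": "1675452600000",
})
```

With the header values from the example, `summary["used"]` is 950 and `summary["reset_at"]` is 19:30:00 UTC on February 3, 2023.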

Examples of how rate limits are enforced

Requests per second example

Assume Auth0 launches a new API called /ratelimitexample with the following rate limit values:

  • Burst limit: 5 requests

  • Sustained rate limit: 10 requests per second.

Example scenario:

Key points:

  • The API begins with, and will never exceed, 5 request tokens, which is equal to the burst limit.

  • 10 new request tokens are added every second, using a “sliding window”: the tokens are added at equal intervals over the course of each second. There are 1000 milliseconds in 1 second, so a new request token is added every 1000 milliseconds / 10 requests = 100 milliseconds.

 Example scenario with rate limits:  

In this scenario:

  • T0 - T100ms: The end user makes 6 requests in the first 100 milliseconds. 5 requests, equal to the burst limit, receive a 200 response. The 6th request receives a 429 error because there are no remaining request tokens.

  • T100ms - T200ms: Auth0 adds a new request token, due to the sliding window: tokens are added at a rate of 10 RPS, or 1 token every 100 ms. The 7th request succeeds but uses up that token. The 8th request therefore results in a 429 error.

  • T200ms - T300ms:  Auth0 adds a new request token, and the next request receives a 200 response.
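The scenario above can be replayed with a simplified token bucket. This is a sketch of the behavior described in this section, not Auth0's actual implementation: the class name, the integer-millisecond clock, and the request arrival times are all assumptions chosen for illustration.

```python
class TokenBucket:
    """Simplified token bucket: starts full, gains one token per interval."""

    def __init__(self, burst_limit: int, refill_interval_ms: int) -> None:
        self.capacity = burst_limit
        self.tokens = burst_limit
        self.interval = refill_interval_ms
        self.last_refill_ms = 0

    def request(self, now_ms: int) -> int:
        """Return the HTTP status a request arriving at now_ms would receive."""
        # Credit one token for every full interval elapsed since the last refill,
        # never exceeding the burst limit.
        earned = (now_ms - self.last_refill_ms) // self.interval
        if earned:
            self.tokens = min(self.capacity, self.tokens + earned)
            self.last_refill_ms += earned * self.interval
        if self.tokens > 0:
            self.tokens -= 1
            return 200
        return 429


# Burst limit 5, sustained rate 10 requests per second (1 token every 100 ms).
bucket = TokenBucket(burst_limit=5, refill_interval_ms=100)
# Hypothetical arrival times: six requests before 100 ms, two before 200 ms, one after.
arrivals_ms = [0, 10, 20, 30, 40, 50, 110, 150, 220]
statuses = [bucket.request(t) for t in arrivals_ms]
```

Under this model the replay yields five 200 responses, a 429, a 200, a 429, and a final 200, which matches the three time windows described above.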

Requests per minute example

Assume Auth0 launches a new API called /ratelimitexample2 with the following rate limit values:

  • Burst limit:  5 requests

  • Sustained Rate limit:  6 requests per minute.

Example scenario:

Key points:

  • The API begins with 5 request tokens, which is equal to the burst limit.

  • 6 new request tokens are added every minute, using a “sliding window”: the tokens are added at equal intervals over the course of each minute. There are 60 seconds in 1 minute, so a new request token is added every 60 seconds / 6 request tokens = 10 seconds.

Example scenario with rate limits:  

In this scenario:

  • T0 - T10 seconds: The end user makes 6 requests in the first second. 5 requests, equal to the burst limit, receive a 200 response. The 6th request receives a 429 error because there are no remaining request tokens.

  • T10 seconds - T20 seconds: Auth0 adds a new request token, due to the sliding window: tokens are added at a rate of 6 RPM, or 1 token every 10 seconds. The 7th request succeeds but uses up that token. The 8th request therefore results in a 429 error.

  • T20 seconds - T30 seconds: Auth0 adds a new request token, and the next request receives a 200 response.
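For a client that has just received a 429 in this scenario, the time until the next token can be estimated from the refill interval. The sketch below assumes the bucket is empty and that tokens are added at fixed marks measured from T0 (10 seconds, 20 seconds, and so on), as in the scenario; the function name and the rejection times are hypothetical.

```python
def seconds_until_next_token(now_s: float, requests_per_minute: int) -> float:
    """Seconds until the next token is added, measured from T0 = 0."""
    interval_s = 60 / requests_per_minute
    # Tokens arrive at T0 + interval, T0 + 2 * interval, and so on.
    return interval_s - (now_s % interval_s)


# Hypothetical rejection times for the 6th and 8th requests.
wait_after_first_429 = seconds_until_next_token(0.5, 6)
wait_after_second_429 = seconds_until_next_token(15, 6)
```

With these assumed times, the first rejected request would wait 9.5 seconds for a token and the second would wait 5 seconds.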

Other Scenarios

On occasion, Auth0 assigns two rate limits to a single API. This is done to configure a burst limit and a sustained rate limit that are better tailored to the needs of the service. In effect, the first rate limit becomes the effective burst limit, and the second rate limit becomes the effective sustained rate limit. In this scenario, Auth0 publishes only the effective burst and sustained rate limits, rather than the actual burst and sustained rate limits.