Class RateLimiter


  • public class RateLimiter
    extends Object
    Limits the rate of requests to the chat service. A limiter allows a fixed number of requests per time period. The first half of that number passes without delay; after that, requests are delayed if it looks like the limit would otherwise be exceeded by the end of the period. Counting starts afresh once the period has passed. RateLimiters can be chained, e.g. one for the current minute, one for the current hour, and one for the current day. Such a chain can in turn be attached to a limiter for the whole service, which could be split up similarly.

    Rate limiting is introduced for several reasons: first, OpenAI imposes limits of its own; second, we want to limit DoS attacks; and, last but not least, each request has a price, even if it is much less than a cent.

    • Field Detail

      • PATTERN_LIMIT_ERROR

        protected static final Pattern PATTERN_LIMIT_ERROR
    • Constructor Detail

      • RateLimiter

        public RateLimiter(@Nullable
                           RateLimiter parent,
                           @Nonnegative
                           int limit,
                           @Nonnegative
                           int period,
                           @Nonnull
                           TimeUnit timeUnit)
        Constructs a rate limiter, optionally with a parent.
        Parameters:
        parent - if set, the parent's limit is also observed
        limit - the number of requests allowed in the given time period. The first half of these requests is never delayed; the second half might be. If the parent is set, the limit is applied to the sum of requests from this and the parent; otherwise it is applied to the requests from this limiter only.
        period - the time period in the given time unit
        timeUnit - the time unit of the time period
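
To illustrate the chaining described for the parent parameter, here is a minimal sketch. ChainedLimiterSketch is a hypothetical stand-in that mirrors only the documented constructor shape; it counts requests and leaves out the delaying, and the limits used (2000 per day, 400 per hour, 20 per minute) are made-up numbers.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in that mirrors only the documented constructor shape. */
class ChainedLimiterSketch {
    final ChainedLimiterSketch parent; // may be null
    final int limit;
    final long periodMillis;
    int observed; // requests seen by this link of the chain

    ChainedLimiterSketch(ChainedLimiterSketch parent, int limit, int period, TimeUnit timeUnit) {
        this.parent = parent;
        this.limit = limit;
        this.periodMillis = timeUnit.toMillis(period);
    }

    /** Records one request here and in every ancestor (delaying is omitted). */
    void observe() {
        if (parent != null) {
            parent.observe();
        }
        observed++;
    }

    /** Builds a day -> hour -> minute chain; the numbers are made up. */
    static ChainedLimiterSketch perMinuteChain() {
        ChainedLimiterSketch day = new ChainedLimiterSketch(null, 2000, 1, TimeUnit.DAYS);
        ChainedLimiterSketch hour = new ChainedLimiterSketch(day, 400, 1, TimeUnit.HOURS);
        return new ChainedLimiterSketch(hour, 20, 1, TimeUnit.MINUTES);
    }
}
```

In this sketch a request made through the per-minute limiter is also seen by the hour and day limiters, so each link of the chain could apply its own limit.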
    • Method Detail

      • waitForLimit

        public void waitForLimit()
        For a synchronous call, waitForLimit() has to be called before starting the request; it makes sure the rate limit is not exceeded. This is a soft limit: the first half of the requests in the time period go through without any delay, and later requests are delayed if it looks like the limit would be exceeded by the end of the time period. That way users are served promptly if they make only a few requests, and users who make more requests do not have to wait for the whole time period to pass, but only for the minimum time needed to space the requests out evenly over the rest of the period.

        Specifically, we make sure that the user never has to wait more than 2 * periodDurationMillis / limit for the next request.
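
The soft limit can be sketched as a pure function that computes the delay for the next request. This is one possible scheduling rule that satisfies the stated bound, not necessarily what waitForLimit() does internally; the class name, the method, and all parameters are assumptions made for illustration.

```java
/** Sketch of a soft rate limit; names and scheduling rule are illustrative assumptions. */
class SoftLimitSketch {

    /**
     * Returns how long the next request should wait, in milliseconds.
     *
     * @param periodStartMillis start of the current period
     * @param periodMillis      length of the period
     * @param limit             requests allowed per period (must be positive)
     * @param requestsSoFar     requests already admitted in this period
     * @param nowMillis         current time
     */
    static long delayMillis(long periodStartMillis, long periodMillis,
                            int limit, int requestsSoFar, long nowMillis) {
        int half = limit / 2;
        if (requestsSoFar < half) {
            return 0; // first half of the budget: never delayed
        }
        // Second half: slot i (0-based) opens at periodStart + periodMillis * i / slots.
        // Slots are periodMillis / slots apart, which is at most 2 * periodMillis / limit.
        int slot = requestsSoFar - half;
        int slots = limit - half;
        long earliest = periodStartMillis + periodMillis * slot / slots;
        return Math.max(0, earliest - nowMillis);
    }
}
```

With limit = 10 and a 60-second period, the first six requests pass immediately in this sketch, and later ones are held to a 12-second grid, which equals 2 * periodDurationMillis / limit. A request beyond the limit is pushed to the end of the period, where counting would start afresh.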

      • getCurrentTimeMillis

        protected long getCurrentTimeMillis()
        Makes it possible to fake the time, for easier unit tests.
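
A minimal sketch of how such an overridable clock hook can be used in a test. ClockHookSketch and FakeClockSketch are hypothetical stand-ins; only the method name and signature of getCurrentTimeMillis() come from this documentation, and the default body shown here is an assumption.

```java
/** Hypothetical stand-in for a class that reads the clock through an overridable hook. */
class ClockHookSketch {
    protected long getCurrentTimeMillis() {
        return System.currentTimeMillis(); // assumed default
    }

    /** Milliseconds elapsed since the given start, according to the hook. */
    long elapsedSince(long startMillis) {
        return getCurrentTimeMillis() - startMillis;
    }
}

/** Test double: time only moves when the test says so. */
class FakeClockSketch extends ClockHookSketch {
    long now;

    @Override
    protected long getCurrentTimeMillis() {
        return now;
    }
}
```

A test can then set `now`, call the code under test, advance `now` by a full period, and check the behavior without sleeping.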
      • of

        @Nullable
        public static RateLimiter of(String errorbody)
        Tries to find something like "Limit: 3 / min." in errorbody and returns a RateLimiter for that.
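
The kind of parsing described here might look like the following sketch. The regular expression is an assumption, since the actual value of PATTERN_LIMIT_ERROR is not shown in this documentation; LimitMessageSketch, Parsed, and toUnit are hypothetical names, and the sketch returns a small value object instead of a RateLimiter.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LimitMessageSketch {
    // Assumed pattern; the real PATTERN_LIMIT_ERROR is not shown in this documentation.
    static final Pattern LIMIT =
            Pattern.compile("Limit:\\s*(\\d{1,9})\\s*/\\s*([A-Za-z]+)");

    /** Parsed result: `limit` requests per one `unit`. */
    static final class Parsed {
        final int limit;
        final TimeUnit unit;

        Parsed(int limit, TimeUnit unit) {
            this.limit = limit;
            this.unit = unit;
        }
    }

    /** Returns the parsed limit, or null if the text has no recognizable limit. */
    static Parsed parse(String errorBody) {
        if (errorBody == null) {
            return null;
        }
        Matcher m = LIMIT.matcher(errorBody);
        if (!m.find()) {
            return null;
        }
        TimeUnit unit = toUnit(m.group(2));
        return unit == null ? null : new Parsed(Integer.parseInt(m.group(1)), unit);
    }

    /** Maps unit abbreviations such as "min" to a TimeUnit; null if unknown. */
    static TimeUnit toUnit(String text) {
        String u = text.toLowerCase(Locale.ROOT);
        if (u.startsWith("sec")) return TimeUnit.SECONDS;
        if (u.startsWith("min")) return TimeUnit.MINUTES;
        if (u.startsWith("hour")) return TimeUnit.HOURS;
        if (u.startsWith("day")) return TimeUnit.DAYS;
        return null;
    }
}
```

The parsed values could then be passed to the constructor as limit, a period of 1, and the time unit.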