Class RateLimiter
- java.lang.Object
-
- com.composum.ai.backend.base.impl.RateLimiter
-
public class RateLimiter extends Object
This class limits the rate of requests to the chat service. We set a limit on the number of requests in a given time period. The first half of that request count is not limited, but later requests are delayed if it looks like the limit would otherwise be exceeded by the end of the period. Counting starts afresh once the period has passed. RateLimiters can be chained, e.g. one for the current minute, one for the current hour, and one for the current day. A limiter can also be chained to one for the whole service (which could be split similarly). Rate limiting is introduced for several reasons: first, OpenAI imposes limits of its own; second, we want to limit DoS attacks; and, last but not least, each request has a price, even if it is much less than a cent.
-
-
Field Summary
Modifier and Type, and Field
protected static Pattern PATTERN_LIMIT_ERROR
-
Constructor Summary
Constructor and Description
RateLimiter(RateLimiter parent, int limit, int period, TimeUnit timeUnit)
Constructs a rate limiter with a parent.
-
Method Summary
Modifier and Type, Method, and Description
protected long getCurrentTimeMillis()
Provides the possibility to fake time, for easy unit tests.
static RateLimiter of(String errorbody)
Tries to find something like "Limit: 3 / min." in errorbody and returns a RateLimiter for that.
protected void sleep(long delay)
Provides the possibility to fake time, for easy unit tests.
void waitForLimit()
For a synchronous call, waitForLimit() has to be called before starting the request; it makes sure the rate limit is not exceeded.
-
-
-
Field Detail
-
PATTERN_LIMIT_ERROR
protected static final Pattern PATTERN_LIMIT_ERROR
-
-
Constructor Detail
-
RateLimiter
public RateLimiter(@Nullable RateLimiter parent, @Nonnegative int limit, @Nonnegative int period, @Nonnull TimeUnit timeUnit)
Constructs a rate limiter with a parent.
Parameters:
parent - if set, this is also observed
limit - the number of requests allowed in the given time period. The first half of the requests are not limited; the second half might be. If the parent is set, the limit is applied to the sum of requests from this and the parent; if the parent is not set, it is applied to the requests from this only.
period - the time period in the given time unit
timeUnit - the time unit of the time period
-
-
Method Detail
-
waitForLimit
public void waitForLimit()
For a synchronous call, waitForLimit() has to be called before starting the request; it makes sure the rate limit is not exceeded. We use a soft limit here: the first half of the requests in the time period go through without any delay, but later requests are delayed if it looks like the limit would be exceeded at the end of the time period. That way users are served promptly if they make only a few requests, and if they make more, they do not have to wait for the whole time period to pass, only for a minimum time so that the requests are spaced out evenly over the rest of the period. Specifically, we make sure that the user never has to wait more than 2 * periodDurationMillis / limit for the next request.
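The soft limit described above can be illustrated with a small, self-contained calculation. This is only a sketch under stated assumptions, not the actual implementation of this class: the class name SoftLimitSketch, the method delayMillis and all of its parameters are hypothetical, and the real limiter may track its state differently.

```java
// Sketch only: a hypothetical helper, not part of com.composum.ai.
final class SoftLimitSketch {

    /**
     * Returns how long (in milliseconds) the next request should be delayed.
     *
     * @param limit         requests allowed per period
     * @param periodMillis  length of the period in milliseconds
     * @param elapsedMillis time already passed in the current period
     * @param requestsSoFar requests already made in the current period
     */
    static long delayMillis(int limit, long periodMillis, long elapsedMillis, int requestsSoFar) {
        long remainingTime = Math.max(0, periodMillis - elapsedMillis);
        if (requestsSoFar < limit / 2) {
            return 0; // first half of the budget: no delay
        }
        int remainingRequests = limit - requestsSoFar;
        if (remainingRequests <= 0) {
            return remainingTime; // budget used up: wait for the period to end
        }
        // Spread the remaining requests evenly over the remaining time.
        return remainingTime / remainingRequests;
    }
}
```

With a limit of 10 requests per 60000 ms, a request arriving right at the start of the period after five others would wait 60000 / 5 = 12000 ms in this sketch, which matches the bound 2 * periodDurationMillis / limit.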
-
getCurrentTimeMillis
protected long getCurrentTimeMillis()
Provides the possibility to fake time, for easy unit tests.
-
sleep
protected void sleep(long delay) throws InterruptedException
Provides the possibility to fake time, for easy unit tests.
Throws:
InterruptedException
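The two protected hooks, getCurrentTimeMillis() and sleep(long), allow a test to substitute a virtual clock. The following is a minimal sketch of that pattern using a hypothetical stand-in class (ClockedWorker) rather than RateLimiter itself; the pause method and the FakeClockWorker subclass are assumptions made purely for illustration.

```java
// Hypothetical stand-in exposing the same two hooks as described above.
class ClockedWorker {
    protected long getCurrentTimeMillis() {
        return System.currentTimeMillis();
    }

    protected void sleep(long delay) throws InterruptedException {
        Thread.sleep(delay);
    }

    /** Waits via the hooks and returns the elapsed time as seen through them. */
    long pause(long delay) {
        long start = getCurrentTimeMillis();
        try {
            sleep(delay);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return getCurrentTimeMillis() - start;
    }
}

// Test double: time only advances when sleep() is called, so tests run instantly.
class FakeClockWorker extends ClockedWorker {
    long now = 1_000L;

    @Override
    protected long getCurrentTimeMillis() {
        return now;
    }

    @Override
    protected void sleep(long delay) {
        now += delay; // advance the virtual clock instead of blocking
    }
}
```

A unit test can then create a FakeClockWorker, call pause(5000), and check the virtual clock without actually waiting five seconds.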
-
of
@Nullable public static RateLimiter of(String errorbody)
Tries to find something like "Limit: 3 / min." in errorbody and returns a RateLimiter for that.
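As an illustration of how a message such as "Limit: 3 / min." could be recognized, here is a hedged sketch. The regular expression, the class LimitMessageSketch, and the ParsedLimit holder are all hypothetical; the actual PATTERN_LIMIT_ERROR and the set of units it accepts are not shown on this page and may differ.

```java
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: a hypothetical parser, not the real RateLimiter.of(String).
final class LimitMessageSketch {

    // Assumed pattern; the real PATTERN_LIMIT_ERROR may look different.
    static final Pattern LIMIT = Pattern.compile("Limit: (\\d{1,9}) / (min|hour|day)");

    /** Hypothetical holder for the parsed values. */
    static final class ParsedLimit {
        final int limit;
        final TimeUnit unit;

        ParsedLimit(int limit, TimeUnit unit) {
            this.limit = limit;
            this.unit = unit;
        }
    }

    /** Returns the parsed limit, or null if the text contains no such message. */
    static ParsedLimit parse(String errorBody) {
        if (errorBody == null) {
            return null;
        }
        Matcher m = LIMIT.matcher(errorBody);
        if (!m.find()) {
            return null;
        }
        int limit = Integer.parseInt(m.group(1));
        TimeUnit unit;
        switch (m.group(2)) {
            case "min":
                unit = TimeUnit.MINUTES;
                break;
            case "hour":
                unit = TimeUnit.HOURS;
                break;
            default:
                unit = TimeUnit.DAYS;
        }
        return new ParsedLimit(limit, unit);
    }
}
```

The parsed values could then be passed to the constructor documented above, with a period of 1 in the parsed time unit.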
-
-