java.lang.Object
- com.composum.ai.backend.base.service.chat.impl.GPTInternalOpenAIHelper.GPTInternalOpenAIHelperInst
- - com.composum.ai.backend.base.service.chat.impl.GPTChatCompletionServiceImpl

All Implemented Interfaces:

GPTChatCompletionService, GPTInternalOpenAIHelper
```
public class GPTChatCompletionServiceImpl
extends GPTInternalOpenAIHelper.GPTInternalOpenAIHelperInst
implements GPTChatCompletionService, GPTInternalOpenAIHelper
```
Implements the actual access to the ChatGPT chat API.

See Also:

"https://platform.openai.com/docs/api-reference/chat/create", "https://platform.openai.com/docs/guides/chat"

Nested Class Summary

Nested Classes
Modifier and Type	Class	Description
`protected static class`	`GPTChatCompletionServiceImpl.EnsureResultFutureCallback`	Makes doubly sure that result is somehow set after the call.
`static interface`	`GPTChatCompletionServiceImpl.GPTChatCompletionServiceConfig`
`protected static class`	`GPTChatCompletionServiceImpl.RetryableException`	Thrown when we get a 429 rate limiting response.
`protected class`	`GPTChatCompletionServiceImpl.StreamDecodingResponseConsumer`

Nested classes/interfaces inherited from interface com.composum.ai.backend.base.service.chat.impl.GPTInternalOpenAIHelper
GPTInternalOpenAIHelper.GPTInternalOpenAIHelperInst

Field Summary

Fields
Modifier and Type	Field	Description
`protected GPTBackendsService`	`backendsService`
`protected org.osgi.framework.BundleContext`	`bundleContext`
`protected int`	`connectionTimeout`
`static String`	`DEFAULT_EMBEDDINGS_MODEL`
`static String`	`DEFAULT_HIGH_INTELLIGENCE_MODEL`
`static String`	`DEFAULT_IMAGE_MODEL`
`static String`	`DEFAULT_MODEL`
`protected String`	`defaultModel`
`protected static int`	`DEFAULTVALUE_CONNECTIONTIMEOUT`
`protected static int`	`DEFAULTVALUE_REQUESTS_PER_DAY`
`protected static int`	`DEFAULTVALUE_REQUESTS_PER_HOUR`
`protected static int`	`DEFAULTVALUE_REQUESTS_PER_MINUTE`
`protected static int`	`DEFAULTVALUE_REQUESTTIMEOUT`
`protected boolean`	`disabled`
`protected RateLimiter`	`embeddingsLimiter`	Rate limiter for embeddings.
`protected String`	`embeddingsModel`
`protected String`	`embeddingsUrl`
`protected com.knuddels.jtokkit.api.Encoding`	`enc`	Tokenizer used for GPT-4 variants.
`protected RateLimiter`	`gptLimiter`	If set, this reflects the rate limits of the ChatGPT API itself.
`protected static com.google.gson.Gson`	`gson`
`protected String`	`highIntelligenceModel`
`protected org.apache.hc.client5.http.impl.async.CloseableHttpAsyncClient`	`httpAsyncClient`
`protected String`	`imageModel`
`protected long`	`lastGptLimiterCreationTime`
`protected RateLimiter`	`limiter`	Rate limiter that enforces the limits imposed for financial reasons.
`protected static org.slf4j.Logger`	`LOG`
`protected Integer`	`maximumTokensPerRequest`
`protected Integer`	`maximumTokensPerResponse`
`static int`	`MAXTRIES`	The maximum number of retries.
`protected static String`	`OPENAI_EMBEDDINGS_URL`
`protected static Pattern`	`PATTERN_TRY_AGAIN`
`protected com.knuddels.jtokkit.api.EncodingRegistry`	`registry`
`protected AtomicLong`	`requestCounter`
`protected int`	`requestTimeout`
`protected ScheduledExecutorService`	`scheduledExecutorService`
`protected Map<String,GPTChatMessagesTemplate>`	`templates`
`static String`	`TRUNCATE_MARKER`

Fields inherited from interface com.composum.ai.backend.base.service.chat.GPTChatCompletionService
MARKER_DEBUG_OUTPUT_REQUEST, MARKER_DEBUG_PRINT_REQUEST

Constructor Summary

Constructors
Constructor Description

GPTChatCompletionServiceImpl()

Method Summary

Modifier and Type	Method	Description
`void`	`activate(GPTChatCompletionServiceImpl.GPTChatCompletionServiceConfig config, org.osgi.framework.BundleContext bundleContext)`
`protected void`	`checkEnabled()`
`protected void`	`checkEnabled(GPTConfiguration gptConfig)`
`protected void`	`checkTokenCount(String jsonRequest)`
`int`	`countTokens(String text)`	Counts the number of tokens for the text for the normally used model.
`protected ChatCompletionRequest`	`createExternalRequest(GPTChatRequest request)`	Creates the external request.
`void`	`deactivate()`
`protected String`	`determineModel(GPTConfiguration configuration, boolean hasImage)`
`protected static GPTChatCompletionServiceImpl.RetryableException`	`extractRetryableException(Throwable e)`
`List<float[]>`	`getEmbeddings(List<String> texts, GPTConfiguration configuration)`	Calculates embeddings for the given list of texts.
`protected List<float[]>`	`getEmbeddingsImpl(List<String> texts, GPTConfiguration configuration, long id)`
`protected List<float[]>`	`getEmbeddingsImplDivideAndConquer(List<String> texts, GPTConfiguration configuration, long id)`
`String`	`getEmbeddingsModel()`	Returns the model used for `GPTChatCompletionService.getEmbeddings(List, GPTConfiguration)`.
`GPTInternalOpenAIHelper.GPTInternalOpenAIHelperInst`	`getInstance()`	Returns a helper for implementation in this package.
`protected String`	`getModel(GPTConfiguration gptConfiguration)`
`String`	`getSingleChatCompletion(GPTChatRequest request)`	The simplest case: give some messages and get a single response.
`GPTChatMessagesTemplate`	`getTemplate(String templateName)`	Retrieves a (usually cached) chat template with that name.
`protected void`	`handleStreamingEvent(GPTCompletionCallback callback, long id, String line)`	Handle a single line of the streaming response.
`String`	`htmlToMarkdown(String html)`	Helper for preprocessing HTML so that it can easily be read by ChatGPT.
`boolean`	`isEnabled()`	Whether ChatGPT completion is enabled.
`boolean`	`isEnabled(GPTConfiguration gptConfig)`	Checks whether `GPTChatCompletionService.isEnabled()` and whether gptConfig enables executing GPT calls.
`boolean`	`isVisionEnabled()`	Returns true if vision is enabled.
`protected org.apache.hc.client5.http.async.methods.SimpleHttpRequest`	`makeRequest(String jsonRequest, GPTConfiguration gptConfiguration)`
`String`	`markdownToHtml(String markdown)`	Opposite of `GPTChatCompletionService.htmlToMarkdown(String)`.
`protected void`	`performCallAsync(CompletableFuture<Void> finished, long id, org.apache.hc.client5.http.async.methods.SimpleHttpRequest httpRequest, GPTCompletionCallback callback, int tryNumber, long defaultDelay)`	Executes a call with retries.
`protected <T> long`	`recalculateDelay(String responsebody, long delay)`	If the response body contains a string like "Please try again in 20s." (number varies) we return a value of that many seconds, otherwise just use iterative doubling.
`String`	`shorten(String text, int maxTokens)`	Helper method to shorten texts by taking out the middle if too long.
`void`	`streamingChatCompletion(GPTChatRequest request, GPTCompletionCallback callback)`	Give some messages and receive the streaming response via callback, to reduce waiting time.
`void`	`streamingChatCompletionWithToolCalls(GPTChatRequest request, GPTCompletionCallback callback)`	Give some messages and receive the streaming response via callback, to reduce waiting time.
`protected CompletableFuture<Void>`	`triggerCallAsync(long id, org.apache.hc.client5.http.async.methods.SimpleHttpRequest httpRequest, GPTCompletionCallback callback)`	Puts the call into the pipeline; the returned future will be set normally or exceptionally when it's done.
`protected void`	`waitForLimit()`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - LOG
```
protected static final org.slf4j.Logger LOG
```
  - OPENAI_EMBEDDINGS_URL
```
protected static final String OPENAI_EMBEDDINGS_URL
```
    See Also:
    
    Constant Field Values
  - PATTERN_TRY_AGAIN
```
protected static final Pattern PATTERN_TRY_AGAIN
```
  - DEFAULT_MODEL
```
public static final String DEFAULT_MODEL
```
    See Also:
    
    Constant Field Values
  - DEFAULT_IMAGE_MODEL
```
public static final String DEFAULT_IMAGE_MODEL
```
    See Also:
    
    Constant Field Values
  - DEFAULT_EMBEDDINGS_MODEL
```
public static final String DEFAULT_EMBEDDINGS_MODEL
```
    See Also:
    
    Constant Field Values
  - DEFAULT_HIGH_INTELLIGENCE_MODEL
```
public static final String DEFAULT_HIGH_INTELLIGENCE_MODEL
```
    See Also:
    
    Constant Field Values
  - DEFAULTVALUE_CONNECTIONTIMEOUT
```
protected static final int DEFAULTVALUE_CONNECTIONTIMEOUT
```
    See Also:
    
    Constant Field Values
  - DEFAULTVALUE_REQUESTTIMEOUT
```
protected static final int DEFAULTVALUE_REQUESTTIMEOUT
```
    See Also:
    
    Constant Field Values
  - DEFAULTVALUE_REQUESTS_PER_MINUTE
```
protected static final int DEFAULTVALUE_REQUESTS_PER_MINUTE
```
    See Also:
    
    Constant Field Values
  - DEFAULTVALUE_REQUESTS_PER_HOUR
```
protected static final int DEFAULTVALUE_REQUESTS_PER_HOUR
```
    See Also:
    
    Constant Field Values
  - DEFAULTVALUE_REQUESTS_PER_DAY
```
protected static final int DEFAULTVALUE_REQUESTS_PER_DAY
```
    See Also:
    
    Constant Field Values
  - TRUNCATE_MARKER
```
public static final String TRUNCATE_MARKER
```
    See Also:
    
    Constant Field Values
  - MAXTRIES
```
public static final int MAXTRIES
```
    The maximum number of retries.
    
    See Also:
    
    Constant Field Values
  - defaultModel
```
protected String defaultModel
```
  - highIntelligenceModel
```
protected String highIntelligenceModel
```
  - imageModel
```
protected String imageModel
```
  - httpAsyncClient
```
protected org.apache.hc.client5.http.impl.async.CloseableHttpAsyncClient httpAsyncClient
```
  - gson
```
protected static final com.google.gson.Gson gson
```
  - requestCounter
```
protected final AtomicLong requestCounter
```
  - limiter
```
protected RateLimiter limiter
```
    Rate limiter that enforces the limits imposed for financial reasons.
  - lastGptLimiterCreationTime
```
protected volatile long lastGptLimiterCreationTime
```
  - gptLimiter
```
protected volatile RateLimiter gptLimiter
```
    If set, this reflects the rate limits of the ChatGPT API itself.
  - registry
```
protected com.knuddels.jtokkit.api.EncodingRegistry registry
```
  - enc
```
protected com.knuddels.jtokkit.api.Encoding enc
```
    Tokenizer used for GPT-4 variants.
  - bundleContext
```
protected org.osgi.framework.BundleContext bundleContext
```
  - templates
```
protected final Map<String,GPTChatMessagesTemplate> templates
```
  - requestTimeout
```
protected int requestTimeout
```
  - connectionTimeout
```
protected int connectionTimeout
```
  - disabled
```
protected boolean disabled
```
  - scheduledExecutorService
```
protected ScheduledExecutorService scheduledExecutorService
```
  - maximumTokensPerRequest
```
protected Integer maximumTokensPerRequest
```
  - maximumTokensPerResponse
```
protected Integer maximumTokensPerResponse
```
  - embeddingsLimiter
```
protected volatile RateLimiter embeddingsLimiter
```
    Rate limiter for embeddings. Embeddings are a quite inexpensive service ($0.13 per million tokens), so for now we only apply a limit that should protect against malfunctions.
  - embeddingsUrl
```
protected String embeddingsUrl
```
  - embeddingsModel
```
protected String embeddingsModel
```
  - backendsService
```
protected GPTBackendsService backendsService
```
- Constructor Detail
  - GPTChatCompletionServiceImpl
```
public GPTChatCompletionServiceImpl()
```
- Method Detail
  - activate
```
public void activate(GPTChatCompletionServiceImpl.GPTChatCompletionServiceConfig config,
                     org.osgi.framework.BundleContext bundleContext)
```
  - deactivate
```
public void deactivate()
```
  - getSingleChatCompletion
```
public String getSingleChatCompletion(@Nonnull
                                      GPTChatRequest request)
                               throws GPTException
```
    Description copied from interface: GPTChatCompletionService
    
    The simplest case: give some messages and get a single response. If the response can be more than a few words, do consider using GPTChatCompletionService.streamingChatCompletion(GPTChatRequest, GPTCompletionCallback) instead, to give the user some feedback while waiting.
    
    Specified by:
    
    getSingleChatCompletion in interface GPTChatCompletionService
    
    Throws:
    
    GPTException
  - makeRequest
```
protected org.apache.hc.client5.http.async.methods.SimpleHttpRequest makeRequest(String jsonRequest,
                                                                                 GPTConfiguration gptConfiguration)
```
  - streamingChatCompletion
```
public void streamingChatCompletion(@Nonnull
                                    GPTChatRequest request,
                                    @Nonnull
                                    GPTCompletionCallback callback)
                             throws GPTException
```
    Description copied from interface: GPTChatCompletionService
    
    Give some messages and receive the streaming response via callback, to reduce waiting time. It possibly waits if a rate limit is reached, but otherwise returns immediately after scheduling an asynchronous call.
    
    Specified by:
    
    streamingChatCompletion in interface GPTChatCompletionService
    
    Throws:
    
    GPTException
  - streamingChatCompletionWithToolCalls
```
public void streamingChatCompletionWithToolCalls(@Nonnull
                                                 GPTChatRequest request,
                                                 @Nonnull
                                                 GPTCompletionCallback callback)
                                          throws GPTException
```
    Description copied from interface: GPTChatCompletionService
    
    Give some messages and receive the streaming response via callback, to reduce waiting time. This implementation also performs tool calls if tools are given in GPTChatRequest.getConfiguration(). It possibly waits if a rate limit is reached, but otherwise returns immediately after scheduling an asynchronous call.
    
    Specified by:
    
    streamingChatCompletionWithToolCalls in interface GPTChatCompletionService
    
    Throws:
    
    GPTException
  - handleStreamingEvent
```
protected void handleStreamingEvent(GPTCompletionCallback callback,
                                    long id,
                                    String line)
```
    Handle a single line of the streaming response.
    
    First message e.g.: {"id":"chatcmpl-xyz","object":"chat.completion.chunk","created":1686890500,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{"role":"assistant"},"index":0,"finish_reason":null}]}
    
    Data: gather {"id":"chatcmpl-xyz","object":"chat.completion.chunk","created":1686890500,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{"content":" above"},"index":0,"finish_reason":null}]}
    
    End: {"id":"chatcmpl-xyz","object":"chat.completion.chunk","created":1686890500,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{},"index":0,"finish_reason":"stop"}]}
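    The three kinds of lines above can be told apart by whether the delta carries a content fragment and whether finish_reason is set. The following is a minimal, dependency-free sketch of that idea, not the actual implementation: the class name, method names and both regular expressions are assumptions made for illustration, and a JSON library would be the more robust choice in practice (this class does hold a gson field). JSON escape sequences inside the content are not decoded here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StreamingLineSketch {

    // Hypothetical patterns: match a "content" fragment inside "delta" and a non-null finish_reason.
    static final Pattern CONTENT =
            Pattern.compile("\"delta\":\\{\"content\":\"((?:[^\"\\\\]|\\\\.)*)\"");
    static final Pattern FINISH =
            Pattern.compile("\"finish_reason\":\"([^\"]+)\"");

    /** Appends the content fragment of one line, if any; returns true once a finish_reason is set. */
    static boolean handleLine(String line, StringBuilder gathered) {
        Matcher content = CONTENT.matcher(line);
        if (content.find()) {
            gathered.append(content.group(1)); // escapes are left undecoded in this sketch
        }
        return FINISH.matcher(line).find();
    }

    /** Feeds shortened versions of the three sample lines through handleLine. */
    static String demo() {
        String[] lines = {
            "{\"choices\":[{\"delta\":{\"role\":\"assistant\"},\"index\":0,\"finish_reason\":null}]}",
            "{\"choices\":[{\"delta\":{\"content\":\" above\"},\"index\":0,\"finish_reason\":null}]}",
            "{\"choices\":[{\"delta\":{},\"index\":0,\"finish_reason\":\"stop\"}]}"
        };
        StringBuilder gathered = new StringBuilder();
        for (String line : lines) {
            if (handleLine(line, gathered)) {
                break; // the end chunk carries no content
            }
        }
        return gathered.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + demo() + "]");
    }
}
```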
  - waitForLimit
```
protected void waitForLimit()
```
  - performCallAsync
```
protected void performCallAsync(CompletableFuture<Void> finished,
                                long id,
                                org.apache.hc.client5.http.async.methods.SimpleHttpRequest httpRequest,
                                GPTCompletionCallback callback,
                                int tryNumber,
                                long defaultDelay)
```
    Executes a call with retries. The response is written to callback; when it's finished the future is set - either normally or exceptionally if there was an error.
    
    Parameters:
    
    finished - the future to set when the call is finished
    
    id - the id of the call, for logging
    
    httpRequest - the request to send
    
    callback - the callback to write the response to
    
    tryNumber - the number of the try - if it's 5, we give up.
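    The retry scheme described above (retry on a retryable error after a delay, give up at try number 5, and complete the future normally or exceptionally) can be sketched with plain java.util.concurrent classes. This is only an illustration under stated assumptions: the class RetrySketch, the Supplier standing in for the HTTP call, the doubling of the delay, the starting try number 0 and the millisecond unit are all hypothetical and not taken from the actual implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class RetrySketch {

    static final int MAX_TRIES = 5; // assumed from "if it's 5, we give up"

    /** Stand-in for the nested RetryableException (e.g. a 429 rate limiting response). */
    static class RetryableException extends RuntimeException {
        RetryableException(String message) { super(message); }
    }

    static void performCallAsync(CompletableFuture<Void> finished, Supplier<CompletableFuture<Void>> call,
                                 ScheduledExecutorService scheduler, int tryNumber, long delayMillis) {
        if (tryNumber >= MAX_TRIES) {
            finished.completeExceptionally(new IllegalStateException("Giving up after " + tryNumber + " tries"));
            return;
        }
        call.get().whenComplete((result, error) -> {
            Throwable cause = error;
            while (cause instanceof CompletionException && cause.getCause() != null) {
                cause = cause.getCause(); // unwrap wrappers added by dependent stages
            }
            if (cause == null) {
                finished.complete(null); // set normally
            } else if (cause instanceof RetryableException) {
                // retry later with a doubled delay (the doubling is an assumption of this sketch)
                scheduler.schedule(
                        () -> performCallAsync(finished, call, scheduler, tryNumber + 1, delayMillis * 2),
                        delayMillis, TimeUnit.MILLISECONDS);
            } else {
                finished.completeExceptionally(cause); // set exceptionally
            }
        });
    }

    /** Runs a call that fails the given number of times before succeeding; returns the attempts made. */
    static int runDemo(int failures) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger attempts = new AtomicInteger();
        Supplier<CompletableFuture<Void>> flaky = () -> {
            CompletableFuture<Void> future = new CompletableFuture<>();
            if (attempts.incrementAndGet() <= failures) {
                future.completeExceptionally(new RetryableException("429"));
            } else {
                future.complete(null);
            }
            return future;
        };
        CompletableFuture<Void> finished = new CompletableFuture<>();
        performCallAsync(finished, flaky, scheduler, 0, 10);
        try {
            finished.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            // the demo only reports the number of attempts, also when the call gave up
        } finally {
            scheduler.shutdown();
        }
        return attempts.get();
    }

    public static void main(String[] args) {
        System.out.println("attempts: " + runDemo(2));
    }
}
```

    Passing the same finished future through every retry means the caller only ever observes one completion, no matter how many attempts were needed.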
  - extractRetryableException
```
protected static GPTChatCompletionServiceImpl.RetryableException extractRetryableException(Throwable e)
```
  - triggerCallAsync
```
protected CompletableFuture<Void> triggerCallAsync(long id,
                                                   org.apache.hc.client5.http.async.methods.SimpleHttpRequest httpRequest,
                                                   GPTCompletionCallback callback)
```
    Puts the call into the pipeline; the returned future will be set normally or exceptionally when it's done.
  - recalculateDelay
```
protected <T> long recalculateDelay(String responsebody,
                                    long delay)
```
    If the response body contains a string like "Please try again in 20s." (number varies) we return a value of that many seconds, otherwise just use iterative doubling.
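    A minimal sketch of this rule, assuming the delay is measured in milliseconds and that a simple regular expression is enough to find the hint. The actual PATTERN_TRY_AGAIN and the unit of the delay are not shown in this documentation, so the pattern and names below are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RetryDelaySketch {

    // Hypothetical pattern for hints such as "Please try again in 20s."
    static final Pattern TRY_AGAIN = Pattern.compile("Please try again in (\\d+)s\\.");

    /** Returns the next delay in milliseconds (the unit is an assumption of this sketch). */
    static long recalculateDelay(String responseBody, long delayMillis) {
        if (responseBody != null) {
            Matcher matcher = TRY_AGAIN.matcher(responseBody);
            if (matcher.find()) {
                // the server told us how long to wait: use that many seconds
                return Long.parseLong(matcher.group(1)) * 1000L;
            }
        }
        return delayMillis * 2; // otherwise: iterative doubling
    }

    public static void main(String[] args) {
        System.out.println(recalculateDelay("Rate limit reached. Please try again in 20s.", 1000L)); // 20000
        System.out.println(recalculateDelay("Internal error", 1000L)); // 2000
    }
}
```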
  - createExternalRequest
```
protected ChatCompletionRequest createExternalRequest(GPTChatRequest request)
                                               throws com.fasterxml.jackson.core.JsonProcessingException
```
    Creates the external request. Caution: the model name still has to be processed with GPTBackendsService.getModelNameInBackend(String) - the actual model can only be determined here, but it can still carry a backend prefix.
    
    Throws:
    
    com.fasterxml.jackson.core.JsonProcessingException
  - determineModel
```
protected String determineModel(GPTConfiguration configuration,
                                boolean hasImage)
```
  - checkEnabled
```
protected void checkEnabled()
```
  - isEnabled
```
public boolean isEnabled()
```
    Description copied from interface: GPTChatCompletionService
    
    Whether ChatGPT completion is enabled. If not, calling the methods that access ChatGPT throws an IllegalStateException.
    
    Specified by:
    
    isEnabled in interface GPTChatCompletionService
  - isEnabled
```
public boolean isEnabled(GPTConfiguration gptConfig)
```
    Description copied from interface: GPTChatCompletionService
    
    Checks whether GPTChatCompletionService.isEnabled() and whether gptConfig enables executing GPT calls. (That is currently whether there is an api key either globally or in the gptConfig).
    
    Specified by:
    
    isEnabled in interface GPTChatCompletionService
    
    Specified by:
    
    isEnabled in interface GPTInternalOpenAIHelper
  - checkEnabled
```
protected void checkEnabled(GPTConfiguration gptConfig)
```
  - isVisionEnabled
```
public boolean isVisionEnabled()
```
    Description copied from interface: GPTChatCompletionService
    
    Returns true if vision is enabled.
    
    Specified by:
    
    isVisionEnabled in interface GPTChatCompletionService
  - getTemplate
```
@Nonnull
public GPTChatMessagesTemplate getTemplate(@Nonnull
                                           String templateName)
                                    throws GPTException
```
    Description copied from interface: GPTChatCompletionService
    
    Retrieves a (usually cached) chat template with that name. Mostly for backend internal use. The templates are retrieved from the bundle resources at "chattemplates/", and are cached.
    
    Specified by:
    
    getTemplate in interface GPTChatCompletionService
    
    Parameters:
    
    templateName - the name of the template to retrieve, e.g. "singleTranslation".
    
    Throws:
    
    GPTException
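    The caching behaviour described above (load from "chattemplates/" once, then serve from memory) can be illustrated with a small stand-alone sketch. Everything here is hypothetical: the class TemplateCacheSketch, the loader function standing in for the bundle resource lookup, the plain String used instead of GPTChatMessagesTemplate, and the IllegalArgumentException used instead of GPTException.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class TemplateCacheSketch {

    private final Map<String, String> templates = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // stand-in for reading a bundle resource
    final AtomicInteger loads = new AtomicInteger(); // counts real loads, for demonstration only

    TemplateCacheSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    String getTemplate(String templateName) {
        // computeIfAbsent runs the loader only when the name is not cached yet
        return templates.computeIfAbsent(templateName, name -> {
            loads.incrementAndGet();
            String content = loader.apply("chattemplates/" + name);
            if (content == null) {
                throw new IllegalArgumentException("No template named " + name);
            }
            return content;
        });
    }

    public static void main(String[] args) {
        TemplateCacheSketch cache = new TemplateCacheSketch(path -> "template from " + path);
        cache.getTemplate("singleTranslation");
        cache.getTemplate("singleTranslation");
        System.out.println(cache.loads.get()); // 1: the second call was served from the cache
    }
}
```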
  - shorten
```
@Nonnull
public String shorten(@Nullable
                      String text,
                      int maxTokens)
```
    Description copied from interface: GPTChatCompletionService
    
    Helper method to shorten texts by taking out the middle if they are too long. In texts longer than maxTokens tokens, the middle is replaced with " ... (truncated) ... ", since ChatGPT can only process a limited number of words / tokens, and the introduction and the summary probably contain the most condensed information about the text. The output then has maxTokens tokens, including the ... marker.
    
    Specified by:
    
    shorten in interface GPTChatCompletionService
    
    Parameters:
    
    text - the text to shorten
    
    maxTokens - the maximum number of tokens in the output
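    To illustrate the idea of cutting out the middle, here is a simplified sketch that treats whitespace-separated words as tokens. The actual implementation counts real model tokens with a tokenizer, so the word splitting, the marker being counted as three tokens, and the handling of null as an empty string are assumptions of this sketch only.

```java
import java.util.Arrays;

class ShortenSketch {

    static final String MARKER = " ... (truncated) ... ";
    static final int MARKER_TOKENS = 3; // "...", "(truncated)", "..." when counting words

    /** Keeps the beginning and the end of the text and replaces the middle with the marker. */
    static String shorten(String text, int maxTokens) {
        if (text == null || text.trim().isEmpty()) {
            return "";
        }
        String[] words = text.trim().split("\\s+");
        if (words.length <= maxTokens) {
            return text; // short enough, nothing to cut
        }
        int keep = Math.max(0, maxTokens - MARKER_TOKENS);
        int head = (keep + 1) / 2; // the beginning gets the extra word if keep is odd
        int tail = keep / 2;
        String start = String.join(" ", Arrays.copyOfRange(words, 0, head));
        String end = String.join(" ", Arrays.copyOfRange(words, words.length - tail, words.length));
        return start + MARKER + end;
    }

    public static void main(String[] args) {
        System.out.println(shorten("a b c d e f g h i j", 7)); // a b ... (truncated) ... i j
    }
}
```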
  - markdownToHtml
```
public String markdownToHtml(String markdown)
```
    Description copied from interface: GPTChatCompletionService
    
    Opposite of GPTChatCompletionService.htmlToMarkdown(String).
    
    Specified by:
    
    markdownToHtml in interface GPTChatCompletionService
  - countTokens
```
public int countTokens(@Nullable
                       String text)
```
    Description copied from interface: GPTChatCompletionService
    
    Counts the number of tokens for the text for the normally used model. Caution: message boundaries need some tokens, and slicing text might create a token or two as well, so do not rely on the count being exact.
    
    Specified by:
    
    countTokens in interface GPTChatCompletionService
  - checkTokenCount
```
protected void checkTokenCount(String jsonRequest)
```
  - htmlToMarkdown
```
@Nonnull
public String htmlToMarkdown(String html)
```
    Description copied from interface: GPTChatCompletionService
    
    Helper for preprocessing HTML so that it can easily be read by ChatGPT.
    
    Specified by:
    
    htmlToMarkdown in interface GPTChatCompletionService
  - getEmbeddings
```
@Nonnull
public List<float[]> getEmbeddings(List<String> texts,
                                   GPTConfiguration configuration)
                            throws GPTException
```
    Description copied from interface: GPTChatCompletionService
    
    Calculates embeddings for the given list of texts.
    
    Specified by:
    
    getEmbeddings in interface GPTChatCompletionService
    
    Throws:
    
    GPTException
  - getEmbeddingsImplDivideAndConquer
```
protected List<float[]> getEmbeddingsImplDivideAndConquer(List<String> texts,
                                                          GPTConfiguration configuration,
                                                          long id)
```
  - getEmbeddingsImpl
```
protected List<float[]> getEmbeddingsImpl(List<String> texts,
                                          GPTConfiguration configuration,
                                          long id)
```
  - getEmbeddingsModel
```
public String getEmbeddingsModel()
```
    Description copied from interface: GPTChatCompletionService
    
    Returns the model used for GPTChatCompletionService.getEmbeddings(List, GPTConfiguration).
    
    Specified by:
    
    getEmbeddingsModel in interface GPTChatCompletionService
  - getInstance
```
public GPTInternalOpenAIHelper.GPTInternalOpenAIHelperInst getInstance()
```
    Description copied from interface: GPTInternalOpenAIHelper
    
    Returns a helper for implementation in this package. We do this indirection to make it only available for this package, since otherwise everything is public in an interface.
    
    Specified by:
    
    getInstance in interface GPTInternalOpenAIHelper
  - getModel
```
protected String getModel(@Nullable
                          GPTConfiguration gptConfiguration)
```
