Class GPTTranslationServiceImpl
- java.lang.Object
-
- com.composum.ai.backend.base.service.chat.impl.GPTTranslationServiceImpl
-
- All Implemented Interfaces:
GPTTranslationService
public class GPTTranslationServiceImpl extends Object implements GPTTranslationService
Building on GPTChatCompletionService, this implements translation.
-
-
Nested Class Summary
- static interface GPTTranslationServiceImpl.Config
-
Field Summary
- protected Path cacheDir
- protected GPTChatCompletionService chatCompletionService
- protected GPTTranslationServiceImpl.Config config
- static Pattern HTML_TAG_AT_START
- static String LASTID
- static String MULTITRANSLATION_SEPARATOR_END: End of separator like `573472 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%`.
- protected static Pattern MULTITRANSLATION_SEPARATOR_PATTERN: Regexp matching separator like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%` (group "id" matches the number).
- static String MULTITRANSLATION_SEPARATOR_START: Start of separator like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%`.
- protected static Pattern PATTERN_HAS_LETTER
- protected static Pattern PATTERN_SEPARATE_WHITESPACE: Separate whitespace at the beginning and end from the non-whitespace text.
- protected Integer seed
- protected Double temperature
- static String TEMPLATE_SINGLETRANSLATION: Template for GPTChatMessagesTemplate to translate a single word or phrase.
-
Constructor Summary
- GPTTranslationServiceImpl()
-
Method Summary
- protected void activate(GPTTranslationServiceImpl.Config config)
- protected void cacheResponse(String cacheKey, GPTChatRequest request, String response)
- protected void deactivate()
- protected void ensureEnabled()
- protected static String fakeTranslation(String text): This turns the capitalization of every odd letter in each word on its head.
- protected List<String> fragmentedTranslation(List<String> texts, String targetLanguage, GPTConfiguration configuration, AtomicInteger permittedRetries, List<GPTResponseCheck> translationChecks)
- List<String> fragmentedTranslation(List<String> texts, String targetLanguage, GPTConfiguration configuration, List<GPTResponseCheck> translationChecks): We join all text fragments we have to translate into one big text separated with separators like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%` and then translate that.
- protected List<String> fragmentedTranslationDivideAndConquer(List<String> texts, String targetLanguage, GPTConfiguration configuration, AtomicInteger permittedRetries, List<GPTResponseCheck> translationChecks): We try to translate the whole lot of texts.
- protected String getCachedResponse(String cacheKey)
- protected GPTConfiguration getServiceConfiguration()
- protected static String joinTexts(List<String> texts, List<String> ids)
- protected static List<String> separateResultTexts(String response, List<String> texts, List<String> ids, String joinedtexts)
- String singleTranslation(String rawText, String sourceLanguage, String targetLanguage, GPTConfiguration configuration): Translate the text from the source language to the target language, each given either as a Java locale name or as a language name.
- void streamingSingleTranslation(String text, String sourceLanguage, String targetLanguage, GPTConfiguration configuration, GPTCompletionCallback callback): Translate the text from the source language to the target language, each given either as a Java locale name or as a language name.
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.composum.ai.backend.base.service.chat.GPTTranslationService
fragmentedTranslation
-
-
-
-
Field Detail
-
TEMPLATE_SINGLETRANSLATION
public static final String TEMPLATE_SINGLETRANSLATION
Template for GPTChatMessagesTemplate to translate a single word or phrase. Has placeholders ${sourcelanguage}, ${sourcephrase} and ${targetlanguage}.
- See Also:
- Constant Field Values
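To illustrate how the three placeholders might be filled, here is a minimal sketch. The template text, the class name `PlaceholderSketch`, and the `fill` helper are hypothetical; the actual value of TEMPLATE_SINGLETRANSLATION and the substitution logic of GPTChatMessagesTemplate are not shown on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PlaceholderSketch {
    // Hypothetical template text; only the placeholder names come from the documentation.
    static final String EXAMPLE_TEMPLATE =
            "Translate '${sourcephrase}' from ${sourcelanguage} to ${targetlanguage}.";

    // Replaces every ${key} in the template with the corresponding map value.
    static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace("${" + entry.getKey() + "}", entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("sourcephrase", "Hello");
        values.put("sourcelanguage", "English");
        values.put("targetlanguage", "German");
        System.out.println(fill(EXAMPLE_TEMPLATE, values));
    }
}
```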
-
chatCompletionService
protected GPTChatCompletionService chatCompletionService
-
config
protected GPTTranslationServiceImpl.Config config
-
temperature
protected Double temperature
-
seed
protected Integer seed
-
cacheDir
protected Path cacheDir
-
MULTITRANSLATION_SEPARATOR_START
public static final String MULTITRANSLATION_SEPARATOR_START
Start of separator like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%`.
- See Also:
- Constant Field Values
-
MULTITRANSLATION_SEPARATOR_END
public static final String MULTITRANSLATION_SEPARATOR_END
End of separator like `573472 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%`.
- See Also:
- Constant Field Values
-
MULTITRANSLATION_SEPARATOR_PATTERN
protected static final Pattern MULTITRANSLATION_SEPARATOR_PATTERN
Regexp matching separator like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%` (group "id" matches the number). The \n cannot be directly matched since at the start it's sometimes ```%%%%... We give the pattern a bit of leeway since some models get the number of % wrong.
-
LASTID
public static final String LASTID
- See Also:
- Constant Field Values
-
PATTERN_HAS_LETTER
protected static final Pattern PATTERN_HAS_LETTER
-
PATTERN_SEPARATE_WHITESPACE
protected static final Pattern PATTERN_SEPARATE_WHITESPACE
Separate whitespace at the beginning and end from the non-whitespace text.
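A pattern of this kind can be sketched as follows, presumably so that leading and trailing whitespace can be set aside while the inner text is processed. The regular expression and the class name `WhitespaceSplitSketch` are assumptions for illustration, not the actual value of PATTERN_SEPARATE_WHITESPACE.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhitespaceSplitSketch {
    // Hypothetical pattern: group 1 = leading whitespace, group 2 = inner text,
    // group 3 = trailing whitespace. DOTALL lets the inner text span several lines.
    static final Pattern SPLIT = Pattern.compile("\\A(\\s*)(.*?)(\\s*)\\z", Pattern.DOTALL);

    // Returns { leading whitespace, inner text, trailing whitespace }.
    static String[] split(String text) {
        Matcher m = SPLIT.matcher(text);
        if (!m.matches()) {
            // Not expected to happen: all three groups may be empty.
            throw new IllegalStateException("Pattern did not match");
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = split("  Hello world \n");
        System.out.println("[" + parts[1] + "]");
    }
}
```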
-
HTML_TAG_AT_START
public static final Pattern HTML_TAG_AT_START
-
-
Method Detail
-
singleTranslation
@Nullable public String singleTranslation(@Nullable String rawText, @Nullable String sourceLanguage, @Nullable String targetLanguage, @Nullable GPTConfiguration configuration)
Translate the text from the source language to the target language, each given either as a Java locale name or as a language name.
- Specified by:
singleTranslation in interface GPTTranslationService
- Parameters:
rawText - the text to translate
sourceLanguage - the language to translate from - human readable name, or null for autodetect
targetLanguage - the language to translate to - human readable name
configuration - the configuration to use
-
streamingSingleTranslation
public void streamingSingleTranslation(@Nonnull String text, @Nonnull String sourceLanguage, @Nonnull String targetLanguage, @Nullable GPTConfiguration configuration, @Nonnull GPTCompletionCallback callback) throws GPTException
Description copied from interface: GPTTranslationService
Translate the text from the source language to the target language, each given either as a Java locale name or as a language name.
- Specified by:
streamingSingleTranslation in interface GPTTranslationService
- Throws:
GPTException
-
fragmentedTranslation
@Nonnull public List<String> fragmentedTranslation(@Nonnull List<String> texts, @Nonnull String targetLanguage, @Nullable GPTConfiguration configuration, @Nullable List<GPTResponseCheck> translationChecks) throws GPTException
We join all text fragments we have to translate into one big text, separated with separators like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%`, and then translate that. Then we split the result at the separators and return the fragments. As a safety check, the ids from the fragments have to match the ids in the result.
- Specified by:
fragmentedTranslation in interface GPTTranslationService
- Parameters:
texts - the texts to translate
targetLanguage - the language to translate to - human readable name
configuration - the configuration to use
translationChecks - additional checks to verify the translation
- Returns:
- the translated texts
- Throws:
GPTException
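The join, split, and id check described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the class `SeparatorJoinSketch`, its exact separator layout, and its lenient pattern are hypothetical and do not reproduce joinTexts, separateResultTexts, or MULTITRANSLATION_SEPARATOR_PATTERN. The sketch also trims each fragment, which the real service may handle differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SeparatorJoinSketch {
    // A run of percent signs used on both sides of the id (layout is an assumption).
    static final String BAR = "%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%";
    // Hypothetical lenient pattern: tolerates a wrong number of percent signs.
    static final Pattern SEPARATOR = Pattern.compile("%{8,}\\s*(?<id>\\d+)\\s*%{8,}");

    // Puts a separator line carrying the fragment's id in front of each text.
    static String join(List<String> texts, List<String> ids) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < texts.size(); i++) {
            sb.append(BAR).append(' ').append(ids.get(i)).append(' ').append(BAR).append('\n');
            sb.append(texts.get(i)).append('\n');
        }
        return sb.toString();
    }

    // Splits a response at the separators and checks that the ids come back in order.
    static List<String> split(String response, List<String> expectedIds) {
        List<String> result = new ArrayList<>();
        Matcher m = SEPARATOR.matcher(response);
        int index = 0;
        int bodyStart = -1;
        while (m.find()) {
            if (bodyStart >= 0) {
                result.add(response.substring(bodyStart, m.start()).trim());
            }
            if (index >= expectedIds.size() || !expectedIds.get(index).equals(m.group("id"))) {
                throw new IllegalStateException("Unexpected separator id " + m.group("id"));
            }
            index++;
            bodyStart = m.end();
        }
        if (bodyStart >= 0) {
            result.add(response.substring(bodyStart).trim());
        }
        if (index != expectedIds.size()) {
            throw new IllegalStateException("Expected " + expectedIds.size() + " fragments, found " + index);
        }
        return result;
    }
}
```

In this sketch a mismatch raises an IllegalStateException; a caller could treat that as a garbled response and retry with fewer texts.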
-
fragmentedTranslationDivideAndConquer
protected List<String> fragmentedTranslationDivideAndConquer(@Nonnull List<String> texts, @Nonnull String targetLanguage, @Nullable GPTConfiguration configuration, @Nonnull AtomicInteger permittedRetries, List<GPTResponseCheck> translationChecks) throws GPTException
We try to translate the whole lot of texts. If that leads to an exception because we are out of tokens or the response was garbled, we split it into two and translate these individually. If even one text is too long, we are lost and give up.
- Throws:
GPTException
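The splitting strategy described above can be sketched as a small recursive helper. This is an illustration, not the service's code: the class `DivideAndConquerSketch`, the `translateBatch` function standing in for the real model call, and the way `permittedRetries` is decremented are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class DivideAndConquerSketch {
    // Tries the whole batch first; on failure splits it in two halves and recurses.
    // Gives up when a single text fails or the retry budget is used up.
    static List<String> translateAll(List<String> texts,
                                     Function<List<String>, List<String>> translateBatch,
                                     AtomicInteger permittedRetries) {
        try {
            return translateBatch.apply(texts);
        } catch (RuntimeException e) {
            if (texts.size() <= 1 || permittedRetries.decrementAndGet() < 0) {
                throw e; // one text alone is too long, or no retries left
            }
            int mid = texts.size() / 2;
            List<String> result = new ArrayList<>(
                    translateAll(texts.subList(0, mid), translateBatch, permittedRetries));
            result.addAll(
                    translateAll(texts.subList(mid, texts.size()), translateBatch, permittedRetries));
            return result;
        }
    }
}
```

Sharing one AtomicInteger across the recursion caps the total number of splits for the whole request rather than per branch.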
-
fragmentedTranslation
protected List<String> fragmentedTranslation(@Nonnull List<String> texts, @Nonnull String targetLanguage, @Nullable GPTConfiguration configuration, @Nonnull AtomicInteger permittedRetries, @Nullable List<GPTResponseCheck> translationChecks) throws GPTException
- Throws:
GPTException
-
separateResultTexts
protected static List<String> separateResultTexts(String response, List<String> texts, List<String> ids, String joinedtexts)
-
getServiceConfiguration
protected GPTConfiguration getServiceConfiguration()
-
fakeTranslation
protected static String fakeTranslation(String text)
This turns the capitalization of every odd letter in each word on its head. If we are in an HTML tag (that is, between a < and a >) then nothing is changed, to avoid destroying rich text. For quick and inexpensive testing, e.g. of bulk translation mechanics.
Example: "This is a test
and some Code" -> "THiS iS a tEsTaNd sOmE COdE"
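The rule can be sketched as below: within each word, the case of every second letter is flipped, and anything between < and > is copied unchanged. The class `FakeTranslationSketch` and its treatment of word boundaries (any non-letter ends a word) are assumptions, not the actual fakeTranslation code.

```java
class FakeTranslationSketch {
    // Flips the case of the 2nd, 4th, ... letter of each word; text inside <...> is untouched.
    static String flipOddLetters(String text) {
        StringBuilder out = new StringBuilder(text.length());
        boolean inTag = false;
        int indexInWord = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '<') {
                inTag = true;
            }
            if (inTag) {
                out.append(c); // copy tag content verbatim
                if (c == '>') {
                    inTag = false;
                }
                indexInWord = 0;
                continue;
            }
            if (Character.isLetter(c)) {
                if (indexInWord % 2 == 1) {
                    c = Character.isUpperCase(c) ? Character.toLowerCase(c) : Character.toUpperCase(c);
                }
                indexInWord++;
            } else {
                indexInWord = 0; // a non-letter ends the current word
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(flipOddLetters("This is a test"));
    }
}
```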
-
cacheResponse
protected void cacheResponse(String cacheKey, GPTChatRequest request, String response)
-
activate
protected void activate(GPTTranslationServiceImpl.Config config)
-
deactivate
protected void deactivate()
-
ensureEnabled
protected void ensureEnabled()
-
-