Class GPTTranslationServiceImpl
- java.lang.Object
-
- com.composum.ai.backend.base.service.chat.impl.GPTTranslationServiceImpl
-
- All Implemented Interfaces:
GPTTranslationService
public class GPTTranslationServiceImpl extends Object implements GPTTranslationService
Building on GPTChatCompletionService, this implements translation.
-
-
Nested Class Summary
Nested Classes

| Modifier and Type | Class | Description |
| --- | --- | --- |
| static interface | GPTTranslationServiceImpl.Config | |
-
Field Summary
Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| protected Path | cacheDir | |
| protected GPTChatCompletionService | chatCompletionService | |
| protected GPTTranslationServiceImpl.Config | config | |
| static Pattern | HTML_TAG_AT_START | |
| static String | LASTID | |
| static String | MULTITRANSLATION_SEPARATOR_END | End of separator like `573472 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%`. |
| protected static Pattern | MULTITRANSLATION_SEPARATOR_PATTERN | Regexp matching separator like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%` (group "id" matches the number). |
| static String | MULTITRANSLATION_SEPARATOR_START | Start of separator like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%`. |
| protected static Pattern | PATTERN_HAS_LETTER | |
| protected static Pattern | PATTERN_SEPARATE_WHITESPACE | Separate whitespace at the beginning and end from the non-whitespace text. |
| protected Integer | seed | |
| protected Double | temperature | |
| static String | TEMPLATE_SINGLETRANSLATION | Template for GPTChatMessagesTemplate to translate a single word or phrase. |
-
Constructor Summary
Constructors

| Constructor | Description |
| --- | --- |
| GPTTranslationServiceImpl() | |
-
Method Summary
Methods

| Modifier and Type | Method | Description |
| --- | --- | --- |
| protected void | activate(GPTTranslationServiceImpl.Config config) | |
| protected void | cacheResponse(String cacheKey, GPTChatRequest request, String response) | |
| protected void | deactivate() | |
| protected void | ensureEnabled() | |
| protected static String | fakeTranslation(String text) | This turns the capitalization of every odd letter in each word on its head. |
| protected List<String> | fragmentedTranslation(List<String> texts, String targetLanguage, GPTConfiguration configuration, AtomicInteger permittedRetries, List<GPTResponseCheck> translationChecks) | |
| List<String> | fragmentedTranslation(List<String> texts, String targetLanguage, GPTConfiguration configuration, List<GPTResponseCheck> translationChecks) | We join all text fragments we have to translate into one big text, separated by separators like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%`, and then translate that. |
| protected List<String> | fragmentedTranslationDivideAndConquer(List<String> texts, String targetLanguage, GPTConfiguration configuration, AtomicInteger permittedRetries, List<GPTResponseCheck> translationChecks) | We try to translate the whole lot of texts. |
| protected String | getCachedResponse(String cacheKey) | |
| protected GPTConfiguration | getServiceConfiguration() | |
| protected static String | joinTexts(List<String> texts, List<String> ids) | |
| protected static List<String> | separateResultTexts(String response, List<String> texts, List<String> ids, String joinedtexts) | |
| String | singleTranslation(String rawText, String sourceLanguage, String targetLanguage, GPTConfiguration configuration) | Translates the text from the source language to the target language; each language is given either as a Java locale name or as a language name. |
| void | streamingSingleTranslation(String text, String sourceLanguage, String targetLanguage, GPTConfiguration configuration, GPTCompletionCallback callback) | Translates the text from the source language to the target language; each language is given either as a Java locale name or as a language name. |

-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.composum.ai.backend.base.service.chat.GPTTranslationService
fragmentedTranslation
-
-
-
-
Field Detail
-
TEMPLATE_SINGLETRANSLATION
public static final String TEMPLATE_SINGLETRANSLATION
Template for GPTChatMessagesTemplate to translate a single word or phrase. Has placeholders ${sourcelanguage}, ${sourcephrase} and ${targetlanguage}.
- See Also:
- Constant Field Values
-
chatCompletionService
protected GPTChatCompletionService chatCompletionService
-
config
protected GPTTranslationServiceImpl.Config config
-
temperature
protected Double temperature
-
seed
protected Integer seed
-
cacheDir
protected Path cacheDir
-
MULTITRANSLATION_SEPARATOR_START
public static final String MULTITRANSLATION_SEPARATOR_START
Start of separator like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%`.
- See Also:
- Constant Field Values
-
MULTITRANSLATION_SEPARATOR_END
public static final String MULTITRANSLATION_SEPARATOR_END
End of separator like `573472 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%`.
- See Also:
- Constant Field Values
-
MULTITRANSLATION_SEPARATOR_PATTERN
protected static final Pattern MULTITRANSLATION_SEPARATOR_PATTERN
Regexp matching separator like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%` (group "id" matches the number). The \n cannot be directly matched since at the start it's sometimes ```%%%%... We give the pattern a bit of leeway since some models get the number of % wrong.
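The actual regular expression is not shown on this page. As a rough illustration of the idea described above, the following sketch uses a hypothetical pattern (the class name, the pattern and the minimum run of 10 percent signs are all assumptions) that captures the number in a named group "id" and tolerates a miscounted run of percent signs as well as a leading code fence:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SeparatorPatternSketch {
    // Hypothetical pattern, not the library's actual regex: any run of at least
    // 10 percent signs is accepted on either side of the numeric id, so a model
    // that emits a few signs too many or too few still produces a match.
    static final Pattern SEPARATOR =
            Pattern.compile("%{10,}\\s*(?<id>\\d+)\\s*%{10,}");

    /** Returns the id of the first separator found in the text, or null if there is none. */
    static String firstId(String text) {
        Matcher m = SEPARATOR.matcher(text);
        // find() scans anywhere in the text, so a leading ``` before the separator is tolerated
        return m.find() ? m.group("id") : null;
    }
}
```

Because the sketch uses `find()` rather than anchoring on a preceding newline, a separator that directly follows a code fence is still recognized.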
-
LASTID
public static final String LASTID
- See Also:
- Constant Field Values
-
PATTERN_HAS_LETTER
protected static final Pattern PATTERN_HAS_LETTER
-
PATTERN_SEPARATE_WHITESPACE
protected static final Pattern PATTERN_SEPARATE_WHITESPACE
Separate whitespace at the beginning and end from the non-whitespace text.
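The pattern itself is not reproduced here. One way such a separation could be done is sketched below; the regular expression, class name and group layout are assumptions for illustration only, not the library's definition:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhitespaceSplitSketch {
    // Hypothetical pattern: group 1 = leading whitespace, group 2 = core text,
    // group 3 = trailing whitespace. DOTALL lets the core text span several lines.
    static final Pattern SEPARATE_WHITESPACE =
            Pattern.compile("\\A(\\s*)(.*?)(\\s*)\\z", Pattern.DOTALL);

    /** Splits the text into { leading whitespace, core text, trailing whitespace }. */
    static String[] split(String text) {
        Matcher m = SEPARATE_WHITESPACE.matcher(text);
        if (!m.matches()) {
            // cannot happen with this pattern, since every group may be empty
            throw new IllegalStateException("Pattern did not match");
        }
        return new String[]{m.group(1), m.group(2), m.group(3)};
    }
}
```

With the three parts separated, only the core text would need to be sent for translation, and the original whitespace could be reattached around the result.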
-
HTML_TAG_AT_START
public static final Pattern HTML_TAG_AT_START
-
-
Method Detail
-
singleTranslation
@Nullable public String singleTranslation(@Nullable String rawText, @Nullable String sourceLanguage, @Nullable String targetLanguage, @Nullable GPTConfiguration configuration)
Translates the text from the source language to the target language; each language is given either as a Java locale name or as a language name.
- Specified by:
singleTranslation in interface GPTTranslationService
- Parameters:
rawText - the text to translate
sourceLanguage - the language to translate from - human readable name, or null for autodetect
targetLanguage - the language to translate to - human readable name
configuration - the configuration to use
-
streamingSingleTranslation
public void streamingSingleTranslation(@Nonnull String text, @Nonnull String sourceLanguage, @Nonnull String targetLanguage, @Nullable GPTConfiguration configuration, @Nonnull GPTCompletionCallback callback) throws GPTException
Description copied from interface: GPTTranslationService
Translates the text from the source language to the target language; each language is given either as a Java locale name or as a language name.
- Specified by:
streamingSingleTranslation in interface GPTTranslationService
- Throws:
GPTException
-
fragmentedTranslation
@Nonnull public List<String> fragmentedTranslation(@Nonnull List<String> texts, @Nonnull String targetLanguage, @Nullable GPTConfiguration configuration, @Nullable List<GPTResponseCheck> translationChecks) throws GPTException
We join all text fragments we have to translate into one big text, separated by separators like `%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 573472 %%%%%%%%%%%%%%%%`, and then translate that. Then we split the result at the separators and return the fragments. As a safety check, the ids from the fragments have to match the ids in the result.
- Specified by:
fragmentedTranslation in interface GPTTranslationService
- Parameters:
texts - the texts to translate
targetLanguage - the language to translate to - human readable name
configuration - the configuration to use
translationChecks - additional checks to verify the translation
- Returns:
the translated texts
- Throws:
GPTException
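The join, split and id check described for this method can be sketched roughly as follows. This is a simplified illustration, not the code behind `joinTexts` and `separateResultTexts`: the class, the exact separator layout and the lenient pattern are assumptions, and the call to the chat model is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FragmentJoinSketch {
    // Assumed separator layout: a run of percent signs, the id, another run of percent signs.
    static final String BAR = "%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%";
    // Hypothetical lenient pattern; also swallows the newline that follows a separator.
    static final Pattern SEPARATOR =
            Pattern.compile("%{10,}\\s*(?<id>\\d+)\\s*%{10,}\\n?");

    /** Joins the texts into one string, each text preceded by a separator carrying its id. */
    static String join(List<String> texts, List<String> ids) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < texts.size(); i++) {
            sb.append(BAR).append(' ').append(ids.get(i)).append(' ').append(BAR).append('\n');
            sb.append(texts.get(i)).append('\n');
        }
        return sb.toString();
    }

    /** Splits a response at the separators and verifies that the ids appear in the expected order. */
    static List<String> split(String response, List<String> ids) {
        List<String> result = new ArrayList<>();
        Matcher m = SEPARATOR.matcher(response);
        int index = 0;
        int fragmentStart = -1;
        while (m.find()) {
            if (fragmentStart >= 0) {
                result.add(chomp(response.substring(fragmentStart, m.start())));
            }
            // Safety check: the id in the response must be the id we sent at this position.
            if (index >= ids.size() || !ids.get(index).equals(m.group("id"))) {
                throw new IllegalStateException("Unexpected separator id " + m.group("id"));
            }
            index++;
            fragmentStart = m.end();
        }
        if (fragmentStart >= 0) {
            result.add(chomp(response.substring(fragmentStart)));
        }
        if (result.size() != ids.size()) {
            throw new IllegalStateException("Expected " + ids.size() + " fragments, got " + result.size());
        }
        return result;
    }

    /** Removes the single trailing newline that join() appended after each text. */
    private static String chomp(String s) {
        return s.endsWith("\n") ? s.substring(0, s.length() - 1) : s;
    }
}
```

A response whose ids are missing, reordered or altered is rejected with an exception instead of being silently mapped to the wrong fragments.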
-
fragmentedTranslationDivideAndConquer
protected List<String> fragmentedTranslationDivideAndConquer(@Nonnull List<String> texts, @Nonnull String targetLanguage, @Nullable GPTConfiguration configuration, @Nonnull AtomicInteger permittedRetries, List<GPTResponseCheck> translationChecks) throws GPTException
We try to translate the whole lot of texts. If that leads to an exception because we are out of tokens or the response was garbled, we split the list into two halves and translate each half individually. If even a single text is too long, we give up.
- Throws:
GPTException
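The divide-and-conquer strategy can be illustrated with a small generic sketch. Everything here is hypothetical: `translateBatch` stands in for the real call to the model, a `RuntimeException` stands in for `GPTException`, and the `permittedRetries` parameter is not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class DivideAndConquerSketch {
    /**
     * Tries to translate all texts in one batch. If the batch fails, the list is
     * split in half and each half is translated on its own, recursively.
     */
    static List<String> translateAll(List<String> texts,
                                     Function<List<String>, List<String>> translateBatch) {
        try {
            return translateBatch.apply(texts);
        } catch (RuntimeException e) {
            if (texts.size() <= 1) {
                // A single text that still fails cannot be split any further: give up.
                throw e;
            }
            int middle = texts.size() / 2;
            List<String> result =
                    new ArrayList<>(translateAll(texts.subList(0, middle), translateBatch));
            result.addAll(translateAll(texts.subList(middle, texts.size()), translateBatch));
            return result;
        }
    }
}
```

The order of the input texts is preserved because the left half is always translated and appended before the right half.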
-
fragmentedTranslation
protected List<String> fragmentedTranslation(@Nonnull List<String> texts, @Nonnull String targetLanguage, @Nullable GPTConfiguration configuration, @Nonnull AtomicInteger permittedRetries, @Nullable List<GPTResponseCheck> translationChecks) throws GPTException
- Throws:
GPTException
-
separateResultTexts
protected static List<String> separateResultTexts(String response, List<String> texts, List<String> ids, String joinedtexts)
-
getServiceConfiguration
protected GPTConfiguration getServiceConfiguration()
-
fakeTranslation
protected static String fakeTranslation(String text)
This turns the capitalization of every odd letter in each word on its head. If we are in an HTML tag (that is, between a < and a >), nothing is changed, to avoid destroying rich text. Intended for quick and inexpensive testing, e.g. of bulk translation mechanics.
Example: "This is a test
and some Code
" -> "THiS iS a tEsTaNd sOmE COdE
"
-
cacheResponse
protected void cacheResponse(String cacheKey, GPTChatRequest request, String response)
-
activate
protected void activate(GPTTranslationServiceImpl.Config config)
-
deactivate
protected void deactivate()
-
ensureEnabled
protected void ensureEnabled()
-
-