Class AutoTranslateConfigServiceImpl
- java.lang.Object
-
- com.composum.ai.aem.core.impl.autotranslate.AutoTranslateConfigServiceImpl
-
- All Implemented Interfaces:
AutoTranslateConfigService
public class AutoTranslateConfigServiceImpl extends Object implements AutoTranslateConfigService
Serves the configurations for the automatic translation service.
-
-
Field Summary
Fields (Modifier and Type, Field, Description)
protected List<Pattern> allowedAttributeRegexes
static List<String> CERTAINLY_TRANSLATABLE_PROPERTIES
List of properties that should always be translated.
protected AutoTranslateConfig config
protected List<Pattern> deniedAttributesRegexes
protected List<Pattern> deniedResourceTypes
protected com.composum.ai.backend.base.service.chat.GPTChatCompletionService gptChatCompletionService
static Pattern HTML_TAG_OR_COMMENT_PATTERN
Matches an HTML tag, end tag, or HTML comment.
protected static int HUGE_TRESHOLD
Somewhat arbitrary threshold: if a single property has more than this number of tokens, we don't try to translate it.
protected static Pattern PATTERN_HAS_LETTER
protected static Pattern PATTERN_HAS_WHITESPACE
protected static Pattern PATTERN_HAS_WORD
As an additional heuristic, the text should have at least one word with >= 5 letters.
-
Constructor Summary
Constructors (Constructor, Description)
AutoTranslateConfigServiceImpl()
-
Method Summary
Methods (Modifier and Type, Method, Description)
void activate(AutoTranslateConfig config)
void deactivate()
boolean includeExistingTranslationsInRetranslation()
If true, when retranslating a page with some changes, we also provide the existing translations of that page to the AI, as additional context with examples.
boolean includeFullPageInRetranslation()
If true, when retranslating a page with some changes, we provide not only the changed texts to the AI but the entire page, to give better context.
boolean isEnabled()
static boolean isHeuristicallyTranslatableProperty(String name, Object value)
Checks whether the property is one of jcr:title, jcr:description, title, alt, cq:panelTitle, shortDescription, actionText, accessibilityLabel, pretitle, displayPopupTitle, helpMessage, or alternatively has no colon in the name, has a String value that doesn't start with /{content,apps,libs,mnt}/, and the value has whitespace and at least one 4 letter sequence.
static boolean isHtmlButNotRichtext(String name, Object value)
Heuristic check whether this is actually HTML that shouldn't be translated.
boolean isHuge(String name, Object value)
Recognizes huge values that we shouldn't even attempt to translate, since that would very likely fail and block translation of the rest of the properties.
boolean isPocUiEnabled()
Whether the debugging UI is enabled.
boolean isTranslatableResource(org.apache.sling.api.resource.Resource resource)
boolean isUseHighIntelligenceModel()
If true, the translator will use the 'high-intelligence model' (see OpenAI config) for translation.
List<String> translateableAttributes(org.apache.sling.api.resource.Resource resource)
Returns those attributes that should be translated.
-
-
-
Field Detail
-
CERTAINLY_TRANSLATABLE_PROPERTIES
public static final List<String> CERTAINLY_TRANSLATABLE_PROPERTIES
List of properties that should always be translated.
-
PATTERN_HAS_WHITESPACE
protected static final Pattern PATTERN_HAS_WHITESPACE
-
PATTERN_HAS_WORD
protected static final Pattern PATTERN_HAS_WORD
As an additional heuristic, the text should have at least one word with >= 5 letters. That will break for source languages very different from English, I know, but this is a POC. :-)
-
PATTERN_HAS_LETTER
protected static final Pattern PATTERN_HAS_LETTER
-
HTML_TAG_OR_COMMENT_PATTERN
public static final Pattern HTML_TAG_OR_COMMENT_PATTERN
Matches an HTML tag, end tag, or HTML comment.
-
HUGE_TRESHOLD
protected static final int HUGE_TRESHOLD
Somewhat arbitrary threshold: if a single property has more than this number of tokens, we don't try to translate it. That would likely fail and block the translation, and the value very likely contains something that shouldn't be translated anyway, like a large HTML fragment.
See Also: Constant Field Values
-
config
protected AutoTranslateConfig config
-
gptChatCompletionService
protected com.composum.ai.backend.base.service.chat.GPTChatCompletionService gptChatCompletionService
-
-
Method Detail
-
activate
public void activate(AutoTranslateConfig config)
-
deactivate
public void deactivate()
-
isPocUiEnabled
public boolean isPocUiEnabled()
Description copied from interface: AutoTranslateConfigService
Whether the debugging UI is enabled.
Specified by: isPocUiEnabled in interface AutoTranslateConfigService
-
isEnabled
public boolean isEnabled()
Specified by: isEnabled in interface AutoTranslateConfigService
-
isUseHighIntelligenceModel
public boolean isUseHighIntelligenceModel()
Description copied from interface: AutoTranslateConfigService
If true, the translator will use the 'high-intelligence model' (see OpenAI config) for translation.
Specified by: isUseHighIntelligenceModel in interface AutoTranslateConfigService
-
isTranslatableResource
public boolean isTranslatableResource(@Nullable org.apache.sling.api.resource.Resource resource)
Specified by: isTranslatableResource in interface AutoTranslateConfigService
-
translateableAttributes
public List<String> translateableAttributes(@Nullable org.apache.sling.api.resource.Resource resource)
Description copied from interface: AutoTranslateConfigService
Returns those attributes that should be translated. (Of the resource, not children.)
Specified by: translateableAttributes in interface AutoTranslateConfigService
-
includeFullPageInRetranslation
public boolean includeFullPageInRetranslation()
Description copied from interface: AutoTranslateConfigService
If true, when retranslating a page with some changes, we provide not only the changed texts to the AI but the entire page, to give better context. That is a bit slower and a bit more expensive, but likely improves the result.
Specified by: includeFullPageInRetranslation in interface AutoTranslateConfigService
-
includeExistingTranslationsInRetranslation
public boolean includeExistingTranslationsInRetranslation()
Description copied from interface: AutoTranslateConfigService
If true, when retranslating a page with some changes, we also provide the existing translations of that page to the AI, as additional context with examples. That is a bit slower and a bit more expensive, but likely improves the result.
Specified by: includeExistingTranslationsInRetranslation in interface AutoTranslateConfigService
-
isHeuristicallyTranslatableProperty
public static boolean isHeuristicallyTranslatableProperty(String name, Object value)
Checks whether the property is one of jcr:title, jcr:description, title, alt, cq:panelTitle, shortDescription, actionText, accessibilityLabel, pretitle, displayPopupTitle, helpMessage, or alternatively has no colon in the name, has a String value that doesn't start with /{content,apps,libs,mnt}/, and the value has whitespace and at least one 4 letter sequence. We also exclude anything recognized by isHtmlButNotRichtext(String, Object).
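The checks described above can be sketched roughly as follows. This is only an illustration that follows the prose literally, not the actual implementation: the class name TranslatableHeuristicSketch, the three regular expressions, and the order of the checks are assumptions, and the additional isHtmlButNotRichtext exclusion is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; names and patterns are assumptions, not the real class.
class TranslatableHeuristicSketch {

    // Property names the description lists as always translatable.
    static final List<String> CERTAIN = Arrays.asList(
            "jcr:title", "jcr:description", "title", "alt", "cq:panelTitle",
            "shortDescription", "actionText", "accessibilityLabel", "pretitle",
            "displayPopupTitle", "helpMessage");

    // Assumed stand-ins for PATTERN_HAS_WHITESPACE and PATTERN_HAS_WORD.
    // The method description mentions a 4 letter sequence, the field
    // description mentions >= 5 letters; this sketch uses 4.
    static final Pattern HAS_WHITESPACE = Pattern.compile("\\s");
    static final Pattern HAS_WORD = Pattern.compile("\\p{L}{4}");

    // Values starting with one of these repository roots are treated as paths.
    static final Pattern PATH_PREFIX = Pattern.compile("^/(content|apps|libs|mnt)/");

    static boolean isHeuristicallyTranslatable(String name, Object value) {
        if (CERTAIN.contains(name)) {
            return true; // explicitly listed property name
        }
        if (name == null || name.contains(":")) {
            return false; // namespaced properties are usually technical
        }
        if (!(value instanceof String)) {
            return false; // only single String values are considered
        }
        String text = (String) value;
        if (PATH_PREFIX.matcher(text).find()) {
            return false; // looks like a repository path
        }
        // Natural language text: some whitespace and at least one "word".
        return HAS_WHITESPACE.matcher(text).find() && HAS_WORD.matcher(text).find();
    }
}
```

With these assumptions, a property such as text with the value "Welcome to our site" is accepted, while sling:resourceType or a value like /content/dam/some image.png is rejected.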
-
isHtmlButNotRichtext
public static boolean isHtmlButNotRichtext(String name, Object value)
Heuristic check whether this is actually HTML that shouldn't be translated. Richtext is acceptable for translation, but HTML with lots of attributes is not. We recognize text in which a very significant share of the content lies inside HTML tags or comments, and try to err on the safe side: it should be "very obviously" HTML, which often has many attributes in its HTML tags.
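One way to picture such a check is to measure how much of the value is covered by tags and comments. The sketch below is a guess at the general idea only: the class name, the regular expression standing in for HTML_TAG_OR_COMMENT_PATTERN, and the 50% cutoff are all assumptions, and the real method may use a different pattern and threshold.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; pattern and ratio are assumptions, not the real values.
class HtmlHeuristicSketch {

    // Assumed stand-in: an HTML comment, or an opening or closing tag.
    static final Pattern TAG_OR_COMMENT =
            Pattern.compile("<!--.*?-->|</?[a-zA-Z][^>]*>", Pattern.DOTALL);

    // Assumed cutoff: more than half of the characters are markup.
    static final double MARKUP_RATIO = 0.5;

    static boolean looksLikeHtmlNotRichtext(Object value) {
        if (!(value instanceof String)) {
            return false;
        }
        String text = (String) value;
        if (text.isEmpty()) {
            return false;
        }
        // Sum up the number of characters inside tags and comments.
        Matcher matcher = TAG_OR_COMMENT.matcher(text);
        int markupChars = 0;
        while (matcher.find()) {
            markupChars += matcher.end() - matcher.start();
        }
        return markupChars > MARKUP_RATIO * text.length();
    }
}
```

Under these assumptions, ordinary richtext with a few short tags stays below the cutoff, while a fragment dominated by attribute-heavy tags exceeds it.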
-
isHuge
public boolean isHuge(String name, Object value)
Recognizes huge values that we shouldn't even attempt to translate, since that would very likely fail and block translation of the rest of the properties. (Note: it is not impossible to translate such a value, but that would need a special implementation, which we will do if there is a real use case.)
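A size guard of this kind might look like the following sketch. It assumes a token counter is supplied from outside (here a plain function, standing in for whatever the chat completion service provides) and uses a made-up threshold; the class name, the constructor, and the value 1000 are all hypothetical and not taken from the real class.

```java
import java.util.function.ToIntFunction;

// Hypothetical sketch; the threshold and the token counter are assumptions.
class HugeValueSketch {

    // Made-up value standing in for HUGE_TRESHOLD.
    static final int THRESHOLD = 1000;

    private final ToIntFunction<String> tokenCounter;

    HugeValueSketch(ToIntFunction<String> tokenCounter) {
        this.tokenCounter = tokenCounter;
    }

    boolean isHuge(Object value) {
        if (!(value instanceof String)) {
            return false; // only String values are measured
        }
        // Skip the property if it has more tokens than the threshold.
        return tokenCounter.applyAsInt((String) value) > THRESHOLD;
    }
}
```

For a quick experiment, a crude estimate such as `s -> s.length() / 4` can serve as the counter; a real tokenizer would give different counts.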
-
-