Class AutoTranslateConfigServiceImpl
- java.lang.Object
-
- com.composum.ai.aem.core.impl.autotranslate.AutoTranslateConfigServiceImpl
-
- All Implemented Interfaces:
AutoTranslateConfigService
public class AutoTranslateConfigServiceImpl extends Object implements AutoTranslateConfigService
Serves the configurations for the automatic translation service.
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected List<Pattern> | allowedAttributeRegexes |
static List<String> | CERTAINLY_TRANSLATABLE_PROPERTIES | List of properties that should always be translated.
protected AutoTranslateConfig | config |
protected List<Pattern> | deniedAttributesRegexes |
protected List<Pattern> | deniedResourceTypes |
protected GPTChatCompletionService | gptChatCompletionService |
static Pattern | HTML_TAG_OR_COMMENT_PATTERN | Matches an HTML tag, end tag, or HTML comment.
protected static int | HUGE_TRESHOLD | Somewhat arbitrary threshold: if a single property is more than that number of tokens we don't try to translate it.
protected static Pattern | PATTERN_HAS_LETTER |
protected static Pattern | PATTERN_HAS_WHITESPACE |
protected static Pattern | PATTERN_HAS_WORD | As additional heuristic - the text should have at least one word with >= 5 letters.
-
Constructor Summary
Constructors
Constructor | Description
AutoTranslateConfigServiceImpl() |
-
Method Summary
Modifier and Type | Method | Description
void | activate(AutoTranslateConfig config) |
void | deactivate() |
String | getDefaultModel() |
boolean | includeExistingTranslationsInRetranslation() | If true, when retranslating a page with some changes, we also provide the existing translations of that page to the AI, as additional context with examples.
boolean | includeFullPageInRetranslation() | If true, we provide not only the changed texts to the AI when re-translating a page with some changes, but the entire page, to give better context.
boolean | isEnabled() |
static boolean | isHeuristicallyTranslatableProperty(String name, Object value) | Checks whether the property is one of jcr:title, jcr:description, title, alt, cq:panelTitle, shortDescription, actionText, accessibilityLabel, pretitle, displayPopupTitle, helpMessage, or alternatively has no colon in the name, has a String value that does not start with /{content,apps,libs,mnt}/, and the value contains whitespace and at least one 4 letter sequence.
static boolean | isHtmlButNotRichtext(String name, Object value) | Heuristic check whether this is actually HTML that shouldn't be translated.
boolean | isHuge(String name, Object value) | This recognizes huge values that we shouldn't even attempt to translate since that would very likely fail and block translation of the rest of the properties.
boolean | isPocUiEnabled() | Whether the debugging UI is enabled.
boolean | isTranslatableResource(org.apache.sling.api.resource.Resource resource) |
List<String> | translateableAttributes(org.apache.sling.api.resource.Resource resource) | Returns those attributes that should be translated.
-
-
-
Field Detail
-
CERTAINLY_TRANSLATABLE_PROPERTIES
public static final List<String> CERTAINLY_TRANSLATABLE_PROPERTIES
List of properties that should always be translated.
-
PATTERN_HAS_WHITESPACE
protected static final Pattern PATTERN_HAS_WHITESPACE
-
PATTERN_HAS_WORD
protected static final Pattern PATTERN_HAS_WORD
As an additional heuristic, the text should have at least one word with >= 5 letters. That will break source languages very different from English, I know, but this is a POC. :-)
-
PATTERN_HAS_LETTER
protected static final Pattern PATTERN_HAS_LETTER
-
HTML_TAG_OR_COMMENT_PATTERN
public static final Pattern HTML_TAG_OR_COMMENT_PATTERN
Matches an HTML tag, end tag, or HTML comment.
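To illustrate what such a pattern could look like, here is a minimal sketch. The regular expression, the class name HtmlTagPatternSketch, and the helper stripMarkup are assumptions made for this example; the actual value of HTML_TAG_OR_COMMENT_PATTERN is not shown on this page and may differ.

```java
import java.util.regex.Pattern;

final class HtmlTagPatternSketch {

    // Assumed regex, not the library's actual pattern: an HTML comment,
    // or an opening / closing tag such as <p>, </p> or <a href="x">.
    static final Pattern TAG_OR_COMMENT =
            Pattern.compile("<!--.*?-->|</?[a-zA-Z][^<>]*>", Pattern.DOTALL);

    /** Removes everything the assumed pattern matches and returns the remaining text. */
    static String stripMarkup(String text) {
        return TAG_OR_COMMENT.matcher(text).replaceAll("");
    }
}
```

Requiring a letter directly after the opening bracket keeps ordinary comparisons such as "1 < 2" from being treated as tags.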
-
HUGE_TRESHOLD
protected static final int HUGE_TRESHOLD
Somewhat arbitrary threshold: if a single property is more than that number of tokens we don't try to translate it. That'll likely fail and block the translation, and there is very likely something that shouldn't be translated, anyway, like a large HTML fragment.
- See Also:
- Constant Field Values
-
config
protected AutoTranslateConfig config
-
gptChatCompletionService
protected GPTChatCompletionService gptChatCompletionService
-
-
Method Detail
-
activate
public void activate(AutoTranslateConfig config)
-
deactivate
public void deactivate()
-
isPocUiEnabled
public boolean isPocUiEnabled()
Description copied from interface: AutoTranslateConfigService
Whether the debugging UI is enabled.
- Specified by:
isPocUiEnabled in interface AutoTranslateConfigService
-
isEnabled
public boolean isEnabled()
- Specified by:
isEnabled in interface AutoTranslateConfigService
-
getDefaultModel
public String getDefaultModel()
- Specified by:
getDefaultModel in interface AutoTranslateConfigService
-
isTranslatableResource
public boolean isTranslatableResource(@Nullable org.apache.sling.api.resource.Resource resource)
- Specified by:
isTranslatableResource in interface AutoTranslateConfigService
-
translateableAttributes
@Nonnull public List<String> translateableAttributes(@Nullable org.apache.sling.api.resource.Resource resource)
Description copied from interface: AutoTranslateConfigService
Returns those attributes that should be translated. (Of the resource, not children.)
- Specified by:
translateableAttributes in interface AutoTranslateConfigService
-
includeFullPageInRetranslation
public boolean includeFullPageInRetranslation()
Description copied from interface: AutoTranslateConfigService
If true, we provide not only the changed texts to the AI when re-translating a page with some changes, but the entire page, to give better context. That is a bit slower and a bit more expensive, but likely improves the result.
- Specified by:
includeFullPageInRetranslation in interface AutoTranslateConfigService
-
includeExistingTranslationsInRetranslation
public boolean includeExistingTranslationsInRetranslation()
Description copied from interface: AutoTranslateConfigService
If true, when retranslating a page with some changes, we also provide the existing translations of that page to the AI, as additional context with examples. That is a bit slower and a bit more expensive, but likely improves the result.
- Specified by:
includeExistingTranslationsInRetranslation in interface AutoTranslateConfigService
-
isHeuristicallyTranslatableProperty
public static boolean isHeuristicallyTranslatableProperty(String name, Object value)
Checks whether the property is one of jcr:title, jcr:description, title, alt, cq:panelTitle, shortDescription, actionText, accessibilityLabel, pretitle, displayPopupTitle, helpMessage, or alternatively has no colon in the name, has a String value that does not start with /{content,apps,libs,mnt}/, and the value contains whitespace and at least one 4 letter sequence. We also exclude anything for which isHtmlButNotRichtext(String, Object) is true.
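The rules in this description can be sketched roughly as follows. This is not the class's actual code: the class name TranslatableHeuristicSketch, the method looksTranslatable, and all three regular expressions are assumptions for illustration. The sketch also assumes a String value is required even for the listed property names, and it leaves out the isHtmlButNotRichtext exclusion.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class TranslatableHeuristicSketch {

    // Property names taken from the method description.
    static final List<String> CERTAIN = Arrays.asList(
            "jcr:title", "jcr:description", "title", "alt", "cq:panelTitle",
            "shortDescription", "actionText", "accessibilityLabel", "pretitle",
            "displayPopupTitle", "helpMessage");

    // Assumed patterns; the class's own PATTERN_HAS_* fields may be defined differently.
    static final Pattern HAS_WHITESPACE = Pattern.compile("\\s");
    static final Pattern HAS_WORD = Pattern.compile("\\p{L}{4}");
    static final Pattern PATH_PREFIX = Pattern.compile("^/(content|apps|libs|mnt)/");

    static boolean looksTranslatable(String name, Object value) {
        // Assumption: only String values are ever candidates for translation.
        if (!(value instanceof String)) {
            return false;
        }
        if (CERTAIN.contains(name)) {
            return true;
        }
        String text = (String) value;
        return !name.contains(":")                       // no namespaced properties
                && !PATH_PREFIX.matcher(text).find()     // not a repository path
                && HAS_WHITESPACE.matcher(text).find()   // more than a single token
                && HAS_WORD.matcher(text).find();        // at least 4 letters in a row
    }
}
```

For example, looksTranslatable("text", "Welcome to our site") passes all checks, while a value such as "/content/site/en page" is rejected because of its path prefix.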
-
isHtmlButNotRichtext
public static boolean isHtmlButNotRichtext(String name, Object value)
Heuristic check whether this is actually HTML that shouldn't be translated. Richtext is acceptable for translating, but HTML with lots of attributes is not. We recognize text that has a very significant amount of text in HTML tags / comments and try to err on the safe side: it should be "very obviously" HTML, which often has many attributes in its HTML tags.
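One way to picture such a check is to compare the number of characters inside tags and comments with the total length of the value. The sketch below does that under stated assumptions: the regular expression, the 0.5 cutoff, and the names HtmlHeuristicSketch and looksLikeHtmlNotRichtext are all hypothetical, and the real method may weigh things differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HtmlHeuristicSketch {

    // Assumed pattern for tags and comments; the real HTML_TAG_OR_COMMENT_PATTERN may differ.
    static final Pattern TAG_OR_COMMENT =
            Pattern.compile("<!--.*?-->|</?[a-zA-Z][^<>]*>", Pattern.DOTALL);

    // Hypothetical cutoff: "very obviously HTML" means more than half of the
    // characters sit inside tags or comments.
    static final double MARKUP_RATIO_CUTOFF = 0.5;

    static boolean looksLikeHtmlNotRichtext(Object value) {
        if (!(value instanceof String)) {
            return false;
        }
        String text = ((String) value).trim();
        if (text.isEmpty()) {
            return false;
        }
        Matcher m = TAG_OR_COMMENT.matcher(text);
        int markupChars = 0;
        while (m.find()) {
            markupChars += m.end() - m.start();
        }
        return (double) markupChars / text.length() > MARKUP_RATIO_CUTOFF;
    }
}
```

A paragraph of rich text wrapped in a plain p element stays well below the cutoff, whereas a fragment made up mostly of attribute-heavy tags exceeds it.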
-
isHuge
public boolean isHuge(String name, Object value)
This recognizes huge values that we shouldn't even attempt to translate since that would very likely fail and block translation of the rest of the properties. (Note: it is not impossible to translate such a thing but that would need a special implementation which we will do if there is a real use case.)
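As a rough illustration of a token-based size check, consider the sketch below. The threshold of 1000, the four-characters-per-token estimate, and the names HugeValueSketch, ASSUMED_THRESHOLD and ROUGH_TOKEN_COUNT are assumptions for this example only; the actual value of HUGE_TRESHOLD and the way tokens are counted are not given on this page.

```java
import java.util.function.ToIntFunction;

final class HugeValueSketch {

    // Hypothetical limit; the real HUGE_TRESHOLD value is not stated here.
    static final int ASSUMED_THRESHOLD = 1000;

    // Rough stand-in for a real tokenizer: about four characters per token, rounded up.
    static final ToIntFunction<String> ROUGH_TOKEN_COUNT = s -> (s.length() + 3) / 4;

    /** Returns true when the value is a String whose token count exceeds the assumed limit. */
    static boolean isHuge(Object value, ToIntFunction<String> tokenCounter) {
        if (!(value instanceof String)) {
            return false;
        }
        return tokenCounter.applyAsInt((String) value) > ASSUMED_THRESHOLD;
    }
}
```

Passing the token counter in as a parameter keeps the sketch independent of any particular tokenizer, so a more exact counter could be substituted without changing the check.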
-
-