Class AutoTranslateConfigServiceImpl

    • Field Detail

      • CERTAINLY_TRANSLATABLE_PROPERTIES

        public static final List<String> CERTAINLY_TRANSLATABLE_PROPERTIES
        List of properties that should always be translated.
      • PATTERN_HAS_WHITESPACE

        protected static final Pattern PATTERN_HAS_WHITESPACE
      • PATTERN_HAS_WORD

        protected static final Pattern PATTERN_HAS_WORD
        As an additional heuristic, the text should have at least one word with >= 5 letters. That will break for source languages very different from English, I know, but this is a POC. :-)
      • PATTERN_HAS_LETTER

        protected static final Pattern PATTERN_HAS_LETTER
      • HTML_TAG_OR_COMMENT_PATTERN

        public static final Pattern HTML_TAG_OR_COMMENT_PATTERN
        Matches an HTML tag, an HTML end tag, or an HTML comment.
      • HUGE_TRESHOLD

        protected static final int HUGE_TRESHOLD
        Somewhat arbitrary threshold: if a single property has more than that number of tokens, we don't try to translate it. That would likely fail and block the translation, and such a value very likely contains something that shouldn't be translated anyway, like a large HTML fragment.
        See Also:
        Constant Field Values
      • deniedResourceTypes

        protected List<Pattern> deniedResourceTypes
      • allowedAttributeRegexes

        protected List<Pattern> allowedAttributeRegexes
      • deniedAttributesRegexes

        protected List<Pattern> deniedAttributesRegexes
      • gptChatCompletionService

        protected com.composum.ai.backend.base.service.chat.GPTChatCompletionService gptChatCompletionService
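The pattern fields above are listed without their regular expressions. As a rough illustration of what such patterns could look like, here is a minimal sketch. All regexes, the class name PatternSketch and the helper methods are assumptions made for this example and are not taken from AutoTranslateConfigServiceImpl; the minimum word length is left as a parameter for the same reason.

```java
import java.util.regex.Pattern;

// Illustrative only: these regexes are assumptions, not the actual field values.
class PatternSketch {

    // At least one whitespace character anywhere in the text.
    static final Pattern HAS_WHITESPACE = Pattern.compile("\\s");

    // At least one Unicode letter anywhere in the text.
    static final Pattern HAS_LETTER = Pattern.compile("\\p{L}");

    // An HTML comment, or an opening / closing HTML tag.
    static final Pattern HTML_TAG_OR_COMMENT =
            Pattern.compile("<!--.*?-->|</?[a-zA-Z][^<>]*>", Pattern.DOTALL);

    // A run of at least minLetters consecutive Unicode letters.
    static Pattern hasWord(int minLetters) {
        return Pattern.compile("\\p{L}{" + minLetters + "}");
    }

    // True if the value has whitespace and at least one word of the given length.
    static boolean looksLikeText(String value, int minLetters) {
        return HAS_WHITESPACE.matcher(value).find()
                && hasWord(minLetters).matcher(value).find();
    }

    // Removes everything the tag / comment pattern matches.
    static String stripMarkup(String value) {
        return HTML_TAG_OR_COMMENT.matcher(value).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeText("Welcome to our store", 5)); // true
        System.out.println(looksLikeText("btn-primary", 5));          // false
        System.out.println(stripMarkup("<p class=\"x\">Hello<!-- note --> world</p>")); // Hello world
    }
}
```

Using \p{L} instead of [a-zA-Z] keeps the letter checks usable for non-ASCII alphabets, although a fixed minimum word length still favours languages with long words.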
    • Constructor Detail

      • AutoTranslateConfigServiceImpl

        public AutoTranslateConfigServiceImpl()
    • Method Detail

      • deactivate

        public void deactivate()
      • isHeuristicallyTranslatableProperty

        public static boolean isHeuristicallyTranslatableProperty(String name,
                                                                  Object value)
        Checks whether the property is one of jcr:title, jcr:description, title, alt, cq:panelTitle, shortDescription, actionText, accessibilityLabel, pretitle, displayPopupTitle, helpMessage, or alternatively whether it doesn't have a colon in the name, has a String value, doesn't start with /{content,apps,libs,mnt}/ in the value, and the value has a whitespace and at least one 4 letter sequence. We also exclude anything for which isHtmlButNotRichtext(String, Object) holds.
      • isHtmlButNotRichtext

        public static boolean isHtmlButNotRichtext(String name,
                                                   Object value)
        Heuristic check whether this is actually HTML that shouldn't be translated. Richtext is acceptable for translation, but HTML with lots of attributes is not. We recognize values where a very significant share of the text lies inside HTML tags / comments, and try to err on the safe side: it should be "very obviously" HTML, which often has many attributes in its HTML tags.
      • isHuge

        public boolean isHuge(String name,
                              Object value)
        This recognizes huge values that we shouldn't even attempt to translate since that would very likely fail and block translation of the rest of the properties. (Note: it is not impossible to translate such a thing but that would need a special implementation which we will do if there is a real use case.)
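The three heuristics described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual implementation: the class name, the regexes, the 50 % markup ratio, the limit of 2000 and the word-based approximation of the token count are all hypothetical. Only the list of property names and the path prefixes come from the method descriptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; thresholds and regexes are assumptions.
class TranslatableHeuristicSketch {

    // Property names listed in the isHeuristicallyTranslatableProperty description.
    static final List<String> CERTAIN = Arrays.asList(
            "jcr:title", "jcr:description", "title", "alt", "cq:panelTitle",
            "shortDescription", "actionText", "accessibilityLabel", "pretitle",
            "displayPopupTitle", "helpMessage");

    static final Pattern PATH_PREFIX = Pattern.compile("^/(content|apps|libs|mnt)/");
    static final Pattern HAS_WHITESPACE = Pattern.compile("\\s");
    // "At least one 4 letter sequence", as worded in the method description.
    static final Pattern HAS_WORD = Pattern.compile("\\p{L}{4}");
    static final Pattern TAG_OR_COMMENT =
            Pattern.compile("<!--.*?-->|</?[a-zA-Z][^<>]*>", Pattern.DOTALL);

    static final double MARKUP_RATIO = 0.5; // hypothetical share of markup characters
    static final int HUGE_LIMIT = 2000;     // hypothetical; counts words, not real tokens

    // True if more than MARKUP_RATIO of the characters sit inside tags / comments.
    static boolean isMostlyMarkup(Object value) {
        if (!(value instanceof String)) return false;
        String text = (String) value;
        Matcher m = TAG_OR_COMMENT.matcher(text);
        int markup = 0;
        while (m.find()) markup += m.end() - m.start();
        return !text.isEmpty() && markup > MARKUP_RATIO * text.length();
    }

    // Follows the description literally: a listed name is enough on its own.
    static boolean isTranslatable(String name, Object value) {
        if (CERTAIN.contains(name)) return true;
        if (name.contains(":") || !(value instanceof String)) return false;
        String text = (String) value;
        return !PATH_PREFIX.matcher(text).find()
                && HAS_WHITESPACE.matcher(text).find()
                && HAS_WORD.matcher(text).find()
                && !isMostlyMarkup(text);
    }

    // Approximates the token count by whitespace-separated words.
    static boolean isHuge(Object value) {
        return value instanceof String
                && ((String) value).trim().split("\\s+").length > HUGE_LIMIT;
    }

    public static void main(String[] args) {
        System.out.println(isTranslatable("text", "Welcome to our store"));   // true
        System.out.println(isTranslatable("link", "/content/site/en page"));  // false
        System.out.println(isMostlyMarkup(
                "<div class=\"a b\" data-x=\"1\" style=\"color:red\">Hi</div>")); // true
    }
}
```

A caller would typically skip a property when isHuge returns true before applying the other checks, so that one oversized value does not block translation of the remaining properties.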