Class HtmlToApproximateMarkdownServicePlugin
- java.lang.Object
 - com.composum.ai.backend.slingbase.impl.HtmlToApproximateMarkdownServicePlugin
 
 
- 
- All Implemented Interfaces:
 ApproximateMarkdownServicePlugin
public class HtmlToApproximateMarkdownServicePlugin extends Object implements ApproximateMarkdownServicePlugin
A plugin for the ApproximateMarkdownService that transforms the rendered HTML to markdown. This does not work for all components, but it can capture the text content of certain components more easily than the default approach of guessing it from the JCR representation.
- 
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description):
- protected static class HtmlToApproximateMarkdownServicePlugin.CapturingResponse: We wrap a response to capture the content, forwarding all but modifying methods to the original response.
- protected static interface HtmlToApproximateMarkdownServicePlugin.Config
- protected static class HtmlToApproximateMarkdownServicePlugin.EmptyRequestParameterMap
- protected class HtmlToApproximateMarkdownServicePlugin.NonModifyingRequestWrapper: Wraps the request to make sure nothing is modified.
- protected static class HtmlToApproximateMarkdownServicePlugin.UnsupportedOperationCalled: Thrown when unsupported operation was called that requires blacklisting.
Nested classes/interfaces inherited from interface com.composum.ai.backend.slingbase.ApproximateMarkdownServicePlugin
ApproximateMarkdownServicePlugin.PluginResult 
- 
Field Summary
Fields (Modifier and Type, Field, Description):
- protected Pattern allowedResourceTypePattern
- protected Map<String,Long> blacklistedResourceType: ResourceTypes we ignore since their rendering uses unsupported methods.
- protected Long blacklistedResourceTypeCleanupTime
- protected Pattern deniedResourceTypePattern
- 
Constructor Summary
Constructors Constructor Description HtmlToApproximateMarkdownServicePlugin() 
- 
Method Summary
All Methods, Instance Methods, Concrete Methods (Modifier and Type, Method, Description):
- protected void activate(HtmlToApproximateMarkdownServicePlugin.Config config)
- protected void cleanupBlacklist()
- protected void deactivate()
- @Nullable String getImageUrl(@Nullable org.apache.sling.api.resource.Resource imageResource): Retrieves the imageURL in a way useable for ChatGPT - usually data:image/jpeg;base64,{base64_image}. If the plugin cannot handle this resource, it should return null.
- protected boolean isBecauseOfUnsupportedOperation(Throwable e)
- protected boolean isIgnoredNode(org.apache.sling.api.resource.Resource resource): We start with depth 3 since the higher nodes often contain headers, navigation and such that don't help for ChatGPT.
- @NotNull ApproximateMarkdownServicePlugin.PluginResult maybeHandle(@NotNull org.apache.sling.api.resource.Resource resource, @NotNull PrintWriter out, @NotNull ApproximateMarkdownService service, org.apache.sling.api.SlingHttpServletRequest request, org.apache.sling.api.SlingHttpServletResponse response): Checks whether the resource should be handled by this plugin and if so, handles it by printing an appropriate markdown representation to the PrintWriter.
- protected String renderedAsHTML(org.apache.sling.api.resource.Resource resource, org.apache.sling.api.SlingHttpServletRequest request, org.apache.sling.api.SlingHttpServletResponse response): We render the resource into a mock response and capture and return the generated HTML.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait 
- 
Methods inherited from interface com.composum.ai.backend.slingbase.ApproximateMarkdownServicePlugin
cacheMarkdown, getMasterLinks, resourceRendersAsComponentMatching 
- 
Field Detail
- 
allowedResourceTypePattern
protected Pattern allowedResourceTypePattern
 
- 
deniedResourceTypePattern
protected Pattern deniedResourceTypePattern
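
The two pattern fields carry no description on this page. Judging only by their names, they could gate which resource types the plugin tries to render. The following is a minimal sketch under that assumption; the class `ResourceTypeFilter`, its constructor arguments, and the rule that a denial wins over an allowance are all hypothetical and not taken from the plugin.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch: allow/deny filtering of resource types by regular expression. */
class ResourceTypeFilter {
    private final Pattern allowed; // null is assumed to mean "no restriction"
    private final Pattern denied;  // null is assumed to mean "nothing denied"

    ResourceTypeFilter(String allowedRegex, String deniedRegex) {
        this.allowed = allowedRegex == null ? null : Pattern.compile(allowedRegex);
        this.denied = deniedRegex == null ? null : Pattern.compile(deniedRegex);
    }

    /** Returns true if the resource type passes both patterns; a denial wins (assumption). */
    boolean accepts(String resourceType) {
        if (resourceType == null) {
            return false;
        }
        if (denied != null && denied.matcher(resourceType).matches()) {
            return false;
        }
        return allowed == null || allowed.matcher(resourceType).matches();
    }
}
```

With the illustrative patterns `myapp/components/.*` (allowed) and `.*/navigation` (denied), a type such as `myapp/components/text` is accepted while `myapp/components/navigation` is rejected.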
 
- 
blacklistedResourceType
protected Map<String,Long> blacklistedResourceType
Resource types we ignore because their rendering uses unsupported methods. They are blacklisted for only 1h, since there might be a deployment in the meantime that changes this. Maps the resource type to the time (ms) until which it is blacklisted.
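
The description above amounts to a map from resource type to an expiry timestamp. A minimal sketch of such a time-limited blacklist follows; the class `ExpiringBlacklist` and its method names are hypothetical, and the one-hour duration is taken from the "1h" in the description rather than from the plugin's source.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a time-limited blacklist; not the plugin's actual code. */
class ExpiringBlacklist {
    /** Assumed blacklist duration of one hour, in milliseconds. */
    static final long BLACKLIST_MILLIS = 60L * 60L * 1000L;

    /** Maps the resource type to the time (ms) until which it is blacklisted. */
    private final Map<String, Long> expiryByResourceType = new ConcurrentHashMap<>();

    void blacklist(String resourceType, long nowMillis) {
        expiryByResourceType.put(resourceType, nowMillis + BLACKLIST_MILLIS);
    }

    boolean isBlacklisted(String resourceType, long nowMillis) {
        Long expiry = expiryByResourceType.get(resourceType);
        return expiry != null && nowMillis < expiry;
    }

    /** Drops every entry whose expiry time has been reached. */
    void cleanup(long nowMillis) {
        expiryByResourceType.values().removeIf(expiry -> expiry <= nowMillis);
    }

    int size() {
        return expiryByResourceType.size();
    }
}
```

Passing the current time in as a parameter, instead of calling `System.currentTimeMillis()` inside, keeps the sketch easy to test; a real implementation may well read the clock directly.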
- 
blacklistedResourceTypeCleanupTime
protected volatile Long blacklistedResourceTypeCleanupTime
 
- 
Method Detail
- 
maybeHandle
@NotNull public ApproximateMarkdownServicePlugin.PluginResult maybeHandle(@NotNull org.apache.sling.api.resource.Resource resource, @NotNull PrintWriter out, @NotNull ApproximateMarkdownService service, @Nonnull org.apache.sling.api.SlingHttpServletRequest request, @Nonnull org.apache.sling.api.SlingHttpServletResponse response)
Description copied from interface: ApproximateMarkdownServicePlugin
Checks whether the resource should be handled by this plugin and if so, handles it by printing an appropriate markdown representation to the PrintWriter.
- Specified by:
 maybeHandle in interface ApproximateMarkdownServicePlugin
- Returns:
 what is already handled by this plugin. It is possible to write to the PrintWriter in any case.
 
 
- 
cleanupBlacklist
protected void cleanupBlacklist()
 
- 
getImageUrl
@Nullable public String getImageUrl(@Nullable org.apache.sling.api.resource.Resource imageResource)
Description copied from interface: ApproximateMarkdownServicePlugin
Retrieves the imageURL in a way useable for ChatGPT - usually data:image/jpeg;base64,{base64_image}. If the plugin cannot handle this resource, it should return null.
- Specified by:
 getImageUrl in interface ApproximateMarkdownServicePlugin
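
To illustrate the `data:image/jpeg;base64,{base64_image}` form mentioned above, here is a minimal sketch that builds such a data URL from raw bytes. The class `ImageDataUrl` and the method `toDataUrl` are hypothetical; how the plugin actually obtains the image bytes and MIME type from the resource is not described on this page and is left out.

```java
import java.util.Base64;

/** Hypothetical sketch: encode image bytes as a data URL. */
class ImageDataUrl {
    /**
     * Returns a data URL such as data:image/jpeg;base64,... for the given bytes,
     * or null if either argument is missing (mirroring the "return null" contract above).
     */
    static String toDataUrl(String mimeType, byte[] imageBytes) {
        if (mimeType == null || imageBytes == null) {
            return null;
        }
        return "data:" + mimeType + ";base64," + Base64.getEncoder().encodeToString(imageBytes);
    }
}
```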
 
- 
isBecauseOfUnsupportedOperation
protected boolean isBecauseOfUnsupportedOperation(Throwable e)
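
This method has no description here. Given the nested exception UnsupportedOperationCalled ("thrown when unsupported operation was called that requires blacklisting"), one plausible reading is that it checks whether that exception appears somewhere in the cause chain of a rendering failure. The sketch below shows this technique under that assumption; the stand-in exception class and the depth guard are hypothetical.

```java
/** Hypothetical sketch: detect a marker exception anywhere in a cause chain. */
class UnsupportedOperationDetection {
    /** Stand-in for the plugin's nested UnsupportedOperationCalled exception. */
    static class UnsupportedOperationCalled extends RuntimeException {
        UnsupportedOperationCalled(String message) {
            super(message);
        }
    }

    /** Walks the cause chain; the depth limit guards against cyclic causes. */
    static boolean isBecauseOfUnsupportedOperation(Throwable e) {
        int depth = 0;
        for (Throwable t = e; t != null && depth < 100; t = t.getCause(), depth++) {
            if (t instanceof UnsupportedOperationCalled) {
                return true;
            }
        }
        return false;
    }
}
```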
 
- 
isIgnoredNode
protected boolean isIgnoredNode(@Nonnull org.apache.sling.api.resource.Resource resource)
We start with depth 3 since the higher nodes often contain headers, navigation and such that don't help for ChatGPT. 
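
A depth threshold like this can be sketched as a count of path segments. Note that this page does not say what the depth is measured relative to; the sketch below simply counts the segments of an absolute path, which is an assumption, and the class `NodeDepth` and its methods are hypothetical.

```java
/** Hypothetical sketch: ignore nodes that sit above a minimum path depth. */
class NodeDepth {
    /** Threshold taken from the "depth 3" in the description above. */
    static final int MIN_DEPTH = 3;

    /** Counts the non-empty segments of a slash-separated path. */
    static int depth(String path) {
        int depth = 0;
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                depth++;
            }
        }
        return depth;
    }

    /** Nodes shallower than MIN_DEPTH are treated as ignored. */
    static boolean isIgnored(String path) {
        return depth(path) < MIN_DEPTH;
    }
}
```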
- 
renderedAsHTML
protected String renderedAsHTML(org.apache.sling.api.resource.Resource resource, org.apache.sling.api.SlingHttpServletRequest request, org.apache.sling.api.SlingHttpServletResponse response) throws javax.servlet.ServletException, IOException
We render the resource into a mock response and capture and return the generated HTML. The response is wrapped so that the real response cannot be modified. We don't do that for the request, because that would be more complicated and probably not needed.
- Throws:
 javax.servlet.ServletException
 IOException
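
The capture-and-shield idea described above can be sketched without the servlet API. In the following, `Response` is a hypothetical, heavily reduced stand-in for a servlet response, and `CapturingResponse` only shares its name with the plugin's nested class; the real wrapper works against SlingHttpServletResponse and covers many more methods.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical sketch of a response wrapper that captures output in memory. */
class CapturingSketch {
    /** Reduced stand-in for a servlet response (assumption, not the Sling API). */
    interface Response {
        PrintWriter getWriter();
        void setStatus(int status);
        String getCharacterEncoding();
    }

    /** Captures written content; forwards reads, swallows modifications. */
    static class CapturingResponse implements Response {
        private final Response original;
        private final StringWriter buffer = new StringWriter();
        private final PrintWriter writer = new PrintWriter(buffer);

        CapturingResponse(Response original) {
            this.original = original;
        }

        @Override
        public PrintWriter getWriter() {
            return writer; // output goes to the buffer, never to the original
        }

        @Override
        public void setStatus(int status) {
            // modifying call: deliberately not forwarded to the original
        }

        @Override
        public String getCharacterEncoding() {
            return original.getCharacterEncoding(); // read-only call: forwarded
        }

        String getCaptured() {
            writer.flush();
            return buffer.toString();
        }
    }
}
```

Rendering into such a wrapper leaves the caller's real response untouched, so the captured HTML can be converted to markdown afterwards.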
 
- 
activate
protected void activate(HtmlToApproximateMarkdownServicePlugin.Config config)
 
- 
deactivate
protected void deactivate()
 