com.fatwire.crawler
Interface ResourceRewriter


public interface ResourceRewriter

Defines the contract for rewriting links inside downloaded markup. In Site Capture, each resource that the crawler downloads is represented as a WebResource.

Out of the box, Site Capture provides a default implementation, PatternResourceRewriter, which uses regular expressions to rewrite the links inside the downloaded markup.

Refer to the developer documentation for more details on PatternResourceRewriter.
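
As a rough illustration of a regular-expression-based rewriter, the sketch below implements the rewrite signature documented on this page. It is not the shipped PatternResourceRewriter. The accessor methods of the real WebResource are not listed here, so the sketch declares minimal stand-in types: the getContent() accessor, the RegexLinkRewriter class name, and the UTF-8 encoding are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical stand-in: the real com.fatwire.crawler.WebResource API is not shown on this page.
interface WebResource {
    byte[] getContent(); // assumed accessor for the downloaded bytes
}

// Mirrors the method signature documented on this page.
interface ResourceRewriter {
    byte[] rewrite(WebResource resource) throws IOException;
}

// Illustrative regex-based rewriter (hypothetical name, not PatternResourceRewriter).
class RegexLinkRewriter implements ResourceRewriter {
    private final Pattern pattern;
    private final String replacement;

    RegexLinkRewriter(String regex, String replacement) {
        this.pattern = Pattern.compile(regex);
        this.replacement = replacement;
    }

    @Override
    public byte[] rewrite(WebResource resource) throws IOException {
        byte[] content = resource.getContent();
        if (content == null) {
            throw new IOException("Resource has no content to rewrite");
        }
        // Assumption: the markup is UTF-8 encoded.
        String markup = new String(content, StandardCharsets.UTF_8);
        // Replace every match of the pattern with the replacement text.
        String rewritten = pattern.matcher(markup).replaceAll(replacement);
        return rewritten.getBytes(StandardCharsets.UTF_8);
    }
}
```

For example, constructing the rewriter with the pattern `http://www\.example\.com/` and the replacement `/` would turn absolute links to that host into root-relative ones.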


Method Summary
 byte[] rewrite(WebResource resource)
          Called automatically by the crawler framework; implement this method to supply the resource-rewriting algorithm.
 

Method Detail

rewrite

byte[] rewrite(WebResource resource) throws java.io.IOException
Called automatically by the crawler framework; implement this method to supply the resource-rewriting algorithm.

Parameters:
resource - A WebResource object containing information about the resource downloaded during the crawl session.
Returns:
A byte array containing the content of the downloaded resource after it has been rewritten by the algorithm.
Throws:
java.io.IOException - if a problem occurs while rewriting the markup.
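
Because rewrite declares java.io.IOException, any code that calls an implementation directly (a unit test, for instance) has to handle that exception. The sketch below shows one possible policy, falling back to the original bytes when rewriting fails. This is an assumption for illustration, not a description of what the crawler framework does; the RewriteCaller class, the rewriteOrFallback helper, and the empty WebResource stand-in are hypothetical.

```java
import java.io.IOException;

// Hypothetical stand-ins so the sketch compiles on its own.
interface WebResource { }

interface ResourceRewriter {
    byte[] rewrite(WebResource resource) throws IOException;
}

final class RewriteCaller {
    private RewriteCaller() { }

    // Returns the rewritten bytes, or the caller-supplied original bytes
    // if the rewriter throws an IOException.
    static byte[] rewriteOrFallback(ResourceRewriter rewriter,
                                    WebResource resource,
                                    byte[] original) {
        try {
            return rewriter.rewrite(resource);
        } catch (IOException e) {
            System.err.println("Rewrite failed, keeping original bytes: " + e.getMessage());
            return original;
        }
    }
}
```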


Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved