All Superinterfaces: Filter<URIResponse>, FlyweightPrototype<Filter<URIResponse>>, com.google.common.base.Predicate<URIResponse>, java.util.function.Predicate<URIResponse>
All Known Implementing Classes: BinaryParser, HTMLParser
public interface Parser<T> extends Filter<URIResponse>
A parser for responses. Every parser provides the following functionality:
- it acts as a Filter that is able to decide whether it can parse a certain URIResponse (e.g., based on the declared content-type header);
- it passes the URLs discovered during parsing to a Parser.LinkReceiver, which will typically accumulate them or send them to the appropriate class for processing;
- the parsing method returns a digest computed on a (possibly) suitably modified version of the document (how the document is modified and how the hash is computed are implementation-dependent and should be documented by the implementing classes).
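
The three roles listed above can be illustrated with a minimal sketch. It does not use the real BUbiNG or Apache HttpClient types: `SimpleResponse`, `SimpleLinkReceiver` and `PlainTextParser` are hypothetical stand-ins, and both the URL pattern and the choice of SHA-256 over whitespace-collapsed text are assumptions made only for illustration.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a fetched response; not the real URIResponse.
final class SimpleResponse {
    final String contentType; // declared content-type header, may be null
    final String body;

    SimpleResponse(String contentType, String body) {
        this.contentType = contentType;
        this.body = body;
    }
}

// Hypothetical one-method stand-in for Parser.LinkReceiver.
interface SimpleLinkReceiver {
    void link(URI uri);
}

// Hypothetical plain-text parser illustrating the three roles.
final class PlainTextParser {
    private static final Pattern URL = Pattern.compile("https?://[^\\s\"'<>]+");
    private int linkCount;

    // Role 1: filter on the declared content type.
    boolean apply(SimpleResponse response) {
        return response.contentType != null && response.contentType.startsWith("text/plain");
    }

    // Roles 2 and 3: report links, then return a digest of a modified document.
    byte[] parse(SimpleResponse response, SimpleLinkReceiver receiver) {
        linkCount = 0;
        Matcher m = URL.matcher(response.body);
        while (m.find()) {
            try {
                receiver.link(URI.create(m.group()));
                linkCount++;
            } catch (IllegalArgumentException e) {
                // Skip matches that are not syntactically valid URIs.
            }
        }
        // Assumed normalisation: trim and collapse whitespace runs before hashing.
        String normalised = response.body.trim().replaceAll("\\s+", " ");
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(normalised.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Result of the processing: here, the number of links reported.
    int result() {
        return linkCount;
    }
}

final class ParserSketch {
    public static void main(String[] args) {
        PlainTextParser parser = new PlainTextParser();
        SimpleResponse response = new SimpleResponse("text/plain; charset=UTF-8",
                "See http://example.com/a and\n  https://example.org/b");
        List<URI> links = new ArrayList<>();
        if (parser.apply(response)) {
            byte[] digest = parser.parse(response, links::add);
            System.out.println(links + ", digest bytes: " + digest.length);
        }
    }
}
```

Because the digest is computed on the normalised text, two documents that differ only in whitespace hash to the same value in this sketch; a real implementation may normalise differently.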
Modifier and Type | Interface | Description |
---|---|---|
static interface | Parser.LinkReceiver | A class that can receive URLs discovered during parsing. |
static interface | Parser.TextProcessor<T> | A class that can receive pieces of text discovered during parsing. |
Modifier and Type | Field | Description |
---|---|---|
static Parser.LinkReceiver | NULL_LINK_RECEIVER | A no-op implementation of Parser.LinkReceiver. |
Fields inherited from interface Filter: FILTER_PACKAGE_NAME
Modifier and Type | Method | Description |
---|---|---|
Parser<T> | copy() | This method strengthens the return type of the method inherited from Filter. |
java.lang.String | guessedCharset() | Returns a guessed charset for the document, or null if the charset could not be guessed. |
byte[] | parse(java.net.URI uri, org.apache.http.HttpResponse response, Parser.LinkReceiver linkReceiver) | Parses a response. |
T | result() | Returns the result of the processing. |
static final Parser.LinkReceiver NULL_LINK_RECEIVER
A no-op implementation of Parser.LinkReceiver.
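
A no-op receiver is useful when a caller wants the digest or the result but has no use for the links. The sketch below shows the idea with a hypothetical one-method interface `UrlSink`; the real Parser.LinkReceiver may declare different methods, and `CollectingSink` and `SinkDemo.scan` are likewise invented for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical one-method stand-in for Parser.LinkReceiver.
interface UrlSink {
    void link(URI uri);

    // No-op instance, analogous in spirit to NULL_LINK_RECEIVER.
    UrlSink NULL = uri -> { };
}

// Receiver that accumulates every URL it is given.
final class CollectingSink implements UrlSink {
    final List<URI> urls = new ArrayList<>();

    @Override
    public void link(URI uri) {
        urls.add(uri);
    }
}

final class SinkDemo {
    // Hypothetical scanner: reports each whitespace-separated token that
    // starts with http:// or https:// and returns how many were reported.
    static int scan(String text, UrlSink sink) {
        int reported = 0;
        for (String token : text.split("\\s+")) {
            if (token.startsWith("http://") || token.startsWith("https://")) {
                try {
                    sink.link(URI.create(token));
                    reported++;
                } catch (IllegalArgumentException e) {
                    // Ignore tokens that are not syntactically valid URIs.
                }
            }
        }
        return reported;
    }

    public static void main(String[] args) {
        String text = "see http://example.com/x and https://example.org/y";
        CollectingSink sink = new CollectingSink();
        scan(text, sink);          // links are accumulated
        scan(text, UrlSink.NULL);  // links are discarded
        System.out.println(sink.urls);
    }
}
```

Passing the no-op instance avoids null checks inside the scanning code: the caller always supplies a receiver, and the receiver decides whether anything happens.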
byte[] parse(java.net.URI uri, org.apache.http.HttpResponse response, Parser.LinkReceiver linkReceiver) throws java.io.IOException
Parses a response.
Parameters:
response - a response to parse.
linkReceiver - a link receiver.
Returns:
a digest, or null if no digest has been computed.
Throws:
java.io.IOException
java.lang.String guessedCharset()
Returns a guessed charset for the document, or null if the charset could not be guessed.
Returns:
a guessed charset, or null.
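
Since the guessed charset may be null, callers need a fallback. The helper below is a minimal sketch of one way to handle that: `CharsetFallback.resolve` is a hypothetical name, and treating an unknown or malformed charset name the same as a missing guess is an assumption, not documented behaviour.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetFallback {
    // Maps a possibly-null charset name to a Charset, using the fallback
    // when the name is null, malformed, or not supported by this JVM.
    static Charset resolve(String guessed, Charset fallback) {
        if (guessed == null) {
            return fallback;
        }
        try {
            return Charset.forName(guessed);
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException and UnsupportedCharsetException.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, StandardCharsets.UTF_8));
        System.out.println(resolve("ISO-8859-1", StandardCharsets.UTF_8));
    }
}
```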
T result()
Returns the result of the processing. Note that this method must be idempotent.
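
Idempotence here means that calling the method repeatedly, with no parsing in between, yields the same value and leaves the parser's state unchanged. A minimal sketch, using a hypothetical `TextAccumulator` class that is not part of the library:

```java
// Hypothetical accumulator; not a BUbiNG class.
final class TextAccumulator {
    private final StringBuilder text = new StringBuilder();

    // Hypothetically called for each piece of text found during parsing.
    void process(String piece) {
        text.append(piece);
    }

    // Idempotent: builds the result without modifying the accumulated state.
    String result() {
        return text.toString();
    }

    public static void main(String[] args) {
        TextAccumulator acc = new TextAccumulator();
        acc.process("Hello, ");
        acc.process("world");
        // Both calls observe the same state and return equal strings.
        System.out.println(acc.result().equals(acc.result()));
    }
}
```

An implementation that cleared its buffer inside `result()` would violate the requirement, because a second call would return an empty string.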