Package it.unimi.dsi.law.warc.parser
Class BinaryParser
java.lang.Object
it.unimi.dsi.law.warc.parser.BinaryParser
- All Implemented Interfaces:
com.google.common.base.Predicate<Response>, Filter<Response>, Parser, Cloneable, Predicate<Response>
public class BinaryParser extends Object implements Parser
A universal binary parser that just computes digests.
-
Nested Class Summary
Nested classes/interfaces inherited from interface it.unimi.dsi.law.warc.parser.Parser
Parser.LinkReceiver
-
Field Summary
Fields inherited from interface it.unimi.dsi.law.warc.filters.Filter
FILTER_PACKAGE_NAME
Fields inherited from interface it.unimi.dsi.law.warc.parser.Parser
NULL_LINK_RECEIVER
-
Constructor Summary
Constructor    Description
BinaryParser(String messageDigestAlgorithm)    Builds a parser for digesting a page.
BinaryParser(MessageDigest messageDigest)    Builds a parser for digesting a page.
-
Method Summary
Modifier and Type    Method    Description
boolean    apply(Response response)
Object    clone()
String    guessedCharset()    Returns a guessed charset for the document, or null if the charset could not be guessed.
byte[]    parse(Response response, Parser.LinkReceiver linkReceiver)    Parses a response.
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface com.google.common.base.Predicate
equals, test
-
Constructor Details
-
BinaryParser
Builds a parser for digesting a page.
- Parameters:
messageDigest - the digesting algorithm, or null if no digesting will be performed.
-
BinaryParser
Builds a parser for digesting a page.
- Parameters:
messageDigestAlgorithm - the digesting algorithm (as a string).
- Throws:
NoSuchAlgorithmException
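The string constructor presumably resolves the algorithm name through the standard java.security.MessageDigest lookup, which is where a NoSuchAlgorithmException would originate. The following is a minimal sketch of that lookup using only the JDK; the class DigestLookup and its helper isSupported are hypothetical names introduced for illustration and are not part of this package.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestLookup {
    /** Returns true if this JVM has a provider for the given digest algorithm name. */
    static boolean isSupported(String algorithm) {
        try {
            MessageDigest.getInstance(algorithm);
            return true;
        } catch (NoSuchAlgorithmException e) {
            // The same exception type is declared by BinaryParser(String).
            return false;
        }
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // "MD5" is one of the algorithm names every Java platform must support.
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        System.out.println(md5.getAlgorithm() + " digest length: " + md5.getDigestLength());
        System.out.println("Unknown name supported? " + isSupported("NOT-A-REAL-ALGORITHM"));
    }
}
```

With the library on the classpath, a MessageDigest obtained this way could be passed to BinaryParser(MessageDigest), or the name could be given directly to BinaryParser(String).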
-
-
Method Details
-
parse
Description copied from interface: Parser
Parses a response.
- Specified by:
parse in interface Parser
- Parameters:
response - a response to parse.
linkReceiver - a link receiver.
- Returns:
a byte digest for the page, or null if no digest has been computed.
- Throws:
IOException
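The return contract above (a digest of the page bytes, or null when no digest is configured) can be illustrated with a short JDK-only sketch. It is not the library's implementation: the class StreamDigestSketch and its digest method are hypothetical, and a plain InputStream stands in for the content of a Response.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StreamDigestSketch {
    /**
     * Feeds every byte of the stream to the digest and returns the result,
     * or null when no digest was supplied (mirroring the documented contract).
     */
    static byte[] digest(InputStream content, MessageDigest messageDigest) throws IOException {
        if (messageDigest == null) return null;
        messageDigest.reset(); // start from a clean state so the instance can be reused
        byte[] buffer = new byte[8192];
        int read;
        while ((read = content.read(buffer)) != -1) {
            messageDigest.update(buffer, 0, read);
        }
        return messageDigest.digest();
    }

    public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
        byte[] body = "example body".getBytes(StandardCharsets.UTF_8);
        byte[] d = digest(new ByteArrayInputStream(body), MessageDigest.getInstance("MD5"));
        System.out.println("digest bytes: " + d.length);
        System.out.println("without a digest: " + digest(new ByteArrayInputStream(body), null));
    }
}
```

Reading in fixed-size chunks keeps memory use constant regardless of page size, which matters for binary content of arbitrary length.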
-
apply
- Specified by:
apply in interface com.google.common.base.Predicate<Response>
-
clone
-
guessedCharset
Description copied from interface: Parser
Returns a guessed charset for the document, or null if the charset could not be guessed.
- Specified by:
guessedCharset in interface Parser
- Returns:
a charset or null.
-