Package it.unimi.dsi.law.warc.parser
Class BinaryParser
java.lang.Object
it.unimi.dsi.law.warc.parser.BinaryParser
- All Implemented Interfaces:
com.google.common.base.Predicate<Response>, Filter<Response>, Parser, Cloneable, Predicate<Response>
public class BinaryParser extends Object implements Parser
A universal binary parser that just computes digests.
-
Nested Class Summary
Nested classes/interfaces inherited from interface it.unimi.dsi.law.warc.parser.Parser
Parser.LinkReceiver
-
Field Summary
Fields inherited from interface it.unimi.dsi.law.warc.filters.Filter
FILTER_PACKAGE_NAME
Fields inherited from interface it.unimi.dsi.law.warc.parser.Parser
NULL_LINK_RECEIVER
-
Constructor Summary
Constructors
Constructor Description
BinaryParser(String messageDigestAlgorithm)
Builds a parser for digesting a page.
BinaryParser(MessageDigest messageDigest)
Builds a parser for digesting a page.
-
Method Summary
Modifier and Type Method Description
boolean
apply(Response response)
Object
clone()
String
guessedCharset()
Returns a guessed charset for the document, or null if the charset could not be guessed.
byte[]
parse(Response response, Parser.LinkReceiver linkReceiver)
Parses a response.
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface com.google.common.base.Predicate
equals, test
-
Constructor Details
-
BinaryParser
Builds a parser for digesting a page.
- Parameters:
messageDigest - the digesting algorithm, or null if no digesting will be performed.
-
BinaryParser
Builds a parser for digesting a page.
- Parameters:
messageDigestAlgorithm - the digesting algorithm (as a string).
- Throws:
NoSuchAlgorithmException
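The string-based constructor declares NoSuchAlgorithmException, which is the exception the JDK raises when an algorithm name cannot be resolved. The sketch below uses only java.security.MessageDigest to show that behaviour. It is an assumption that BinaryParser resolves the name through MessageDigest.getInstance; the class DigestSetupSketch and the helper isSupported are hypothetical names used for illustration and are not part of this library.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class; not part of it.unimi.dsi.law.
public class DigestSetupSketch {

    /** Returns true if the running JVM can provide a digest for the given algorithm name. */
    public static boolean isSupported(String algorithm) {
        try {
            MessageDigest.getInstance(algorithm);
            return true;
        } catch (NoSuchAlgorithmException e) {
            // Same exception type that the string-based constructor declares.
            return false;
        }
    }

    public static void main(String[] args) {
        // SHA-256 is one of the algorithms every Java platform must support.
        System.out.println(isSupported("SHA-256"));
        // A made-up name is rejected.
        System.out.println(isSupported("NO-SUCH-ALGORITHM"));
    }
}
```

Checking the name up front like this may be useful when the algorithm comes from configuration, since the failure then surfaces before any parser is built.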
-
-
Method Details
-
parse
Description copied from interface: Parser
Parses a response.
- Specified by:
parse in interface Parser
- Parameters:
response - a response to parse.
linkReceiver - a link receiver.
- Returns:
a byte digest for the page, or null if no digest has been computed.
- Throws:
IOException
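The contract above (a digest of the page, or null when no digest was computed) can be illustrated with plain JDK classes. This is a minimal sketch under stated assumptions, not the library's implementation: it reads from a generic InputStream rather than a Response, it ignores the link receiver, and the class StreamDigestSketch and its digest method are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-in for "a parser that just computes digests".
public class StreamDigestSketch {

    /**
     * Feeds every byte of the stream to the digest and returns the result,
     * or returns null when no digest was supplied.
     */
    public static byte[] digest(InputStream content, MessageDigest messageDigest) throws IOException {
        if (messageDigest == null) {
            return null; // Mirrors "null if no digest has been computed".
        }
        messageDigest.reset();
        byte[] buffer = new byte[8192];
        int read;
        while ((read = content.read(buffer)) != -1) {
            messageDigest.update(buffer, 0, read);
        }
        return messageDigest.digest();
    }

    public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
        byte[] body = {0x00, 0x01, 0x02, 0x03};
        byte[] result = digest(new ByteArrayInputStream(body), MessageDigest.getInstance("MD5"));
        System.out.println(result.length); // MD5 digests are 16 bytes long
        System.out.println(digest(new ByteArrayInputStream(body), null) == null);
    }
}
```

Because the bytes are never interpreted, the same routine applies to any content type, which is one plausible reading of "universal" in the class description.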
-
apply
- Specified by:
apply in interface com.google.common.base.Predicate<Response>
-
clone
-
guessedCharset
Description copied from interface: Parser
Returns a guessed charset for the document, or null if the charset could not be guessed.
- Specified by:
guessedCharset in interface Parser
- Returns:
a charset or null.
-