Package it.unimi.dsi.law.warc.tool
Class IndexWarc
java.lang.Object
it.unimi.dsi.law.warc.tool.IndexWarc
public class IndexWarc extends Object
A tool to index a WARC file.
Constructor Summary
Constructors
Constructor    Description
IndexWarc()
Method Summary
Modifier and Type    Method    Description
static void    main(String[] arg)
static void    run(FastBufferedInputStream in, boolean isGZipped, OutputStream out)    This method reads from a given input stream a sequence of WARC records and writes to a given output stream the byte offset of the read records.
Constructor Details
IndexWarc
public IndexWarc()
Method Details
run
public static void run(FastBufferedInputStream in, boolean isGZipped, OutputStream out) throws IOException, WarcRecord.FormatException
This method reads from a given input stream a sequence of WARC records and writes to a given output stream the byte offset of the read records.
Parameters:
in - the input WARC stream.
isGZipped - tells if the input stream contains compressed WARC records.
out - the output index stream.
Throws:
IOException
WarcRecord.FormatException
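The idea of recording the byte offset of each record can be illustrated with a minimal, standard-library-only sketch. It is not the IndexWarc implementation. The class RecordOffsetSketch and its methods are hypothetical. The sketch assumes uncompressed records made of a version line, header lines including Content-Length, a blank line, and a payload of that many bytes. Writing each offset as an 8-byte big-endian long is also an assumption; the actual index format produced by run is not described here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified offset indexer; NOT the IndexWarc implementation. */
class RecordOffsetSketch {

    /** Reads one line (LF- or CRLF-terminated), advancing pos[0]; returns null at end of stream. */
    static String readLine(InputStream in, long[] pos) throws IOException {
        StringBuilder sb = new StringBuilder();
        boolean any = false;
        int b;
        while ((b = in.read()) != -1) {
            pos[0]++;
            any = true;
            if (b == '\n') break;
            if (b != '\r') sb.append((char) b);
        }
        return any ? sb.toString() : null;
    }

    /** Writes the start offset of every record to out (assumed format: 8-byte longs). */
    static List<Long> index(InputStream in, OutputStream out) throws IOException {
        DataOutputStream index = new DataOutputStream(out);
        List<Long> offsets = new ArrayList<>();
        long[] pos = {0};
        while (true) {
            long start = pos[0];
            String line = readLine(in, pos);
            // Skip the blank lines that separate consecutive records.
            while (line != null && line.isEmpty()) {
                start = pos[0];
                line = readLine(in, pos);
            }
            if (line == null) break; // end of stream
            // 'line' is now the version line; scan headers up to the blank line.
            long length = -1;
            while ((line = readLine(in, pos)) != null && !line.isEmpty()) {
                int colon = line.indexOf(':');
                if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                    length = Long.parseLong(line.substring(colon + 1).trim());
                }
            }
            if (length < 0) throw new IOException("No Content-Length in record at offset " + start);
            // Skip the payload without buffering it.
            long remaining = length;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    if (in.read() == -1) throw new EOFException("Truncated record at offset " + start);
                    skipped = 1;
                }
                remaining -= skipped;
            }
            pos[0] += length;
            index.writeLong(start);
            offsets.add(start);
        }
        index.flush();
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        String record = "WARC/1.0\r\nContent-Length: 5\r\n\r\nhello\r\n\r\n";
        byte[] data = (record + record).getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(index(new ByteArrayInputStream(data), out));
    }
}
```

An index of this kind lets a reader seek straight to the n-th record instead of scanning the file from the start. Handling of compressed input, which run selects through isGZipped, is left out of the sketch.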
main
public static void main(String[] arg) throws Exception
Throws:
Exception