Package it.unimi.dsi.law.warc.tool
Class IndexWarc
java.lang.Object
it.unimi.dsi.law.warc.tool.IndexWarc
public class IndexWarc extends Object
A tool to index a WARC file.
Constructor Summary
Constructor | Description
IndexWarc() |
Method Summary
Modifier and Type | Method | Description
static void | main(String[] arg) |
static void | run(FastBufferedInputStream in, boolean isGZipped, OutputStream out) | Reads a sequence of WARC records from the given input stream and writes the byte offset of each record read to the given output stream.
Constructor Details
IndexWarc
public IndexWarc()
Method Details
run
public static void run(FastBufferedInputStream in, boolean isGZipped, OutputStream out) throws IOException, WarcRecord.FormatException
Reads a sequence of WARC records from the given input stream and writes the byte offset of each record read to the given output stream.
Parameters:
in - the input WARC stream.
isGZipped - whether the input stream contains compressed WARC records.
out - the output index stream.
Throws:
IOException
WarcRecord.FormatException
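The idea of an offset index can be illustrated with a minimal, standard-library-only sketch. It is not the IndexWarc implementation. Two things here are assumptions made purely for illustration: the input uses a hypothetical record layout (a 4-byte big-endian length followed by that many payload bytes) instead of real WARC records, and each offset is written as an 8-byte big-endian long, an encoding this page does not specify. The class name OffsetIndexSketch and its index method are likewise hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration only: NOT part of it.unimi.dsi.law.warc.tool.
public class OffsetIndexSketch {

    /**
     * Scans a stream of length-prefixed records (assumed stand-in format, not WARC)
     * and writes the starting byte offset of each record to out as a big-endian long
     * (assumed index encoding). Returns the number of records indexed.
     */
    public static int index(InputStream in, OutputStream out) throws IOException {
        DataInputStream records = new DataInputStream(in);
        DataOutputStream offsets = new DataOutputStream(out);
        byte[] buffer = new byte[8192];
        long offset = 0; // byte position of the record about to be read
        int count = 0;
        int b0;
        // A clean end of stream is only accepted at a record boundary.
        while ((b0 = records.read()) >= 0) {
            // Assemble the 4-byte big-endian length; EOFException if truncated.
            int length = (b0 << 24)
                    | (records.readUnsignedByte() << 16)
                    | (records.readUnsignedByte() << 8)
                    | records.readUnsignedByte();
            if (length < 0) {
                throw new IOException("Negative record length at offset " + offset);
            }
            offsets.writeLong(offset);
            // Skip the payload; readFully throws EOFException on a truncated record.
            int remaining = length;
            while (remaining > 0) {
                int n = Math.min(buffer.length, remaining);
                records.readFully(buffer, 0, n);
                remaining -= n;
            }
            offset += 4L + length;
            count++;
        }
        offsets.flush();
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Build two sample records in memory: payloads of 3 and 5 bytes.
        ByteArrayOutputStream data = new ByteArrayOutputStream();
        DataOutputStream writer = new DataOutputStream(data);
        writer.writeInt(3);
        writer.write(new byte[] {1, 2, 3});
        writer.writeInt(5);
        writer.write(new byte[5]);

        ByteArrayOutputStream idx = new ByteArrayOutputStream();
        int n = index(new ByteArrayInputStream(data.toByteArray()), idx);

        // Read the index back and print one offset per record.
        DataInputStream reader = new DataInputStream(new ByteArrayInputStream(idx.toByteArray()));
        for (int i = 0; i < n; i++) {
            System.out.println(reader.readLong());
        }
    }
}
```

With such an index, a reader can seek directly to the start of the k-th record instead of scanning the file from the beginning, which is presumably the purpose of the index produced by run.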
main
public static void main(String[] arg) throws Exception
Throws:
Exception