Class GrepWarc

java.lang.Object
it.unimi.dsi.law.warc.tool.GrepWarc

public class GrepWarc
extends Object
A "grep" for WARC files.
  • Constructor Details

    • GrepWarc

      public GrepWarc()
  • Method Details

    • run

      public static void run(FastBufferedInputStream in, boolean isGZipped, Filter<WarcRecord> filter, OutputStream out) throws IOException
      This method acts as a sort of "grep" for WARC files.

      It reads a sequence of (possibly compressed) WARC records from the given input stream and writes the ones accepted by the specified filter, uncompressed, to the given output stream.

      Parameters:
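      in - the input stream to read WARC records from.
      isGZipped - whether the input stream contains compressed WARC records.
      filter - the filter that decides which records are written.
      out - the output stream that receives the accepted records, uncompressed.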
      Throws:
      IOException
    • main

      public static void main(String[] arg) throws Exception
      Throws:
      Exception
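
The read-filter-write behaviour described for run can be illustrated with a minimal, standard-library-only sketch. It is not the GrepWarc implementation and does not parse real WARC records: the class GrepSketch, its grep method, and the "text blocks separated by blank lines" record format are all hypothetical stand-ins, and a java.util.function.Predicate<String> plays the role of the Filter<WarcRecord> argument. It only shows the shape of the loop: optionally decompress the input, test each record against the filter, and write the accepted ones uncompressed.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Predicate;
import java.util.zip.GZIPInputStream;

// Hypothetical stand-in for the behaviour documented for GrepWarc.run.
// Assumption: a "record" here is a block of text lines ended by a blank line,
// NOT a real WARC record.
class GrepSketch {

    // Reads records from in (gunzipping first if isGZipped is true), and writes
    // those accepted by filter, uncompressed, to out. Returns the number written.
    static int grep(InputStream in, boolean isGZipped, Predicate<String> filter, OutputStream out)
            throws IOException {
        InputStream raw = isGZipped ? new GZIPInputStream(in) : in;
        BufferedReader reader = new BufferedReader(new InputStreamReader(raw, StandardCharsets.UTF_8));
        StringBuilder record = new StringBuilder();
        int written = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isEmpty()) {
                written += emit(record, filter, out); // blank line closes the current record
            } else {
                record.append(line).append('\n');
            }
        }
        written += emit(record, filter, out); // last record may lack a trailing blank line
        out.flush();
        return written;
    }

    // Writes the buffered record followed by a blank line if the filter accepts it,
    // then clears the buffer. Returns 1 if the record was written, 0 otherwise.
    private static int emit(StringBuilder record, Predicate<String> filter, OutputStream out)
            throws IOException {
        if (record.length() == 0) return 0;
        String text = record.toString();
        record.setLength(0);
        if (!filter.test(text)) return 0;
        out.write(text.getBytes(StandardCharsets.UTF_8));
        out.write('\n');
        return 1;
    }

    public static void main(String[] args) throws IOException {
        String input = "url: http://a.example/\nbody: apple\n\nurl: http://b.example/\nbody: banana\n";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int kept = grep(new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8)),
                false, r -> r.contains("banana"), out);
        System.out.println(kept + " record(s) kept");
        System.out.print(out.toString("UTF-8"));
    }
}
```

In the sketch the filter is an ordinary lambda; with the real method, the same role is played by the Filter<WarcRecord> passed as the third argument to run.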