Package it.unimi.dsi.law.warc.tool
Command-line tools that manipulate WARC files.
Class Summary

Class                 Description
CompressWarc          A tool to compress a WARC file.
CutWarc               A class to extract specific records from a WARC file.
ExtractDigestUrls     A tool to extract digests and URLs from response records of a WARC file.
ExtractLinks          Extracts links from a WARC file.
GrepWarc              A "grep" for WARC files.
GZWarcStats           A tool to compute some statistics about a gzipped WARC file.
IndexWarc             A tool to index a WARC file.
ListGZWarcComments    A tool to list the GZip header comments contained in a compressed WARC file.
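
For readers unfamiliar with the GZip header comments that ListGZWarcComments lists, the following is a minimal sketch of where such a comment lives in a gzip member, based only on the header layout defined in RFC 1952. It is not the implementation of any class in this package: the class name GzipCommentSketch and the method readFirstComment are hypothetical, and the sketch reads only the first member of a stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of it.unimi.dsi.law.warc.tool.
final class GzipCommentSketch {
    // Header flag bits from RFC 1952, section 2.3.1.
    private static final int FEXTRA = 0x04, FNAME = 0x08, FCOMMENT = 0x10;

    /** Returns the comment field of the first gzip member, or null if it has none. */
    static String readFirstComment(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readUnsignedByte() != 0x1f || data.readUnsignedByte() != 0x8b) {
            throw new IOException("Not a gzip stream");
        }
        data.readUnsignedByte();                 // CM: compression method
        int flags = data.readUnsignedByte();     // FLG: which optional fields follow
        data.readFully(new byte[6]);             // MTIME (4 bytes), XFL, OS

        if ((flags & FEXTRA) != 0) {
            // XLEN is a little-endian 16-bit length, followed by XLEN bytes.
            int xlen = data.readUnsignedByte() | (data.readUnsignedByte() << 8);
            data.readFully(new byte[xlen]);
        }
        if ((flags & FNAME) != 0) {
            readZeroTerminated(data);            // skip the original file name
        }
        return (flags & FCOMMENT) != 0 ? readZeroTerminated(data) : null;
    }

    // Optional gzip header strings are zero-terminated ISO 8859-1 text.
    private static String readZeroTerminated(DataInputStream data) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        for (int b; (b = data.readUnsignedByte()) != 0; ) {
            buffer.write(b);
        }
        return new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        // A hand-built header carrying only a comment field.
        byte[] header = {0x1f, (byte) 0x8b, 8, FCOMMENT, 0, 0, 0, 0, 0, (byte) 0xff,
                         'h', 'e', 'l', 'l', 'o', 0};
        System.out.println(readFirstComment(new ByteArrayInputStream(header)));
    }
}
```

A compressed WARC file is typically a concatenation of many gzip members, so listing every comment would also require locating each member boundary, which this sketch does not attempt.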