This collection contains software distributed by the Laboratory for Web Algorithmics (LAW); most of it is linked to a publication. If you find our software useful while working on a scientific publication, please cite us properly, either by using the publications quoted in the documentation or by contacting us for suggestions.
We try to distribute everything under the GNU General Public License or the GNU Lesser General Public License.
Highlights
 Statistical tools to efficiently compute Kendall's τ and the weighted τ. They include tools to accurately limit the precision of the ranks involved, as the noise caused by approximation can significantly alter the computation of τ (see “Traps and pitfalls of topic-biased PageRank”, by Paolo Boldi, Massimo Santini, Roberto Posenato, and Sebastiano Vigna, in WAW 2006. Fourth Workshop on Algorithms and Models for the Web-Graph, volume 4936 of Lecture Notes in Computer Science, pages 107-116, Springer-Verlag, 2008).
 The largest publicly available set of classes and documentation related to spectral ranking. It includes a detailed explanation of theoretical formulations and of the algorithms actually implementing them. In particular, PageRankParallelGaussSeidel is our best-of-breed implementation of PageRank, whereas PageRankFromCoefficients makes it possible to compute PageRank and its derivatives for every value of the damping factor using the precomputed coefficients of PageRank's power series (using the results described in “PageRank: Functional dependencies”, by Paolo Boldi, Massimo Santini, and Sebastiano Vigna, ACM Trans. Inf. Sys., 27(4):1-23, 2009). You can also compute, for instance, the dominant eigenvector and Katz's index.
 A highly scalable implementation of the Layered Label Propagation algorithm.
 ConsistentHashFunction implements the consistent hash function used by UbiCrawler.
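The effect of limited precision on Kendall's τ mentioned in the highlights can be illustrated with a minimal, self-contained sketch. It does not use the LAW classes: the class name KendallTauSketch and the methods kendallTau and roundTo are hypothetical names introduced here, and the computation is a naive quadratic τ-b (with the usual tie correction) rather than the efficient algorithm the library provides. The score vectors are made-up values chosen so that rounding creates ties.

```java
public class KendallTauSketch {

    /**
     * Naive O(n^2) Kendall's tau-b between two score vectors.
     * Returns NaN if all elements of either vector are tied.
     */
    static double kendallTau(double[] x, double[] y) {
        if (x.length != y.length) throw new IllegalArgumentException("Length mismatch");
        long concordant = 0, discordant = 0, tiesX = 0, tiesY = 0;
        for (int i = 0; i < x.length; i++) {
            for (int j = i + 1; j < x.length; j++) {
                int cx = Integer.signum(Double.compare(x[i], x[j]));
                int cy = Integer.signum(Double.compare(y[i], y[j]));
                if (cx == 0) tiesX++;          // pair tied in x
                if (cy == 0) tiesY++;          // pair tied in y
                if (cx != 0 && cy != 0) {
                    if (cx == cy) concordant++;
                    else discordant++;
                }
            }
        }
        long pairs = (long) x.length * (x.length - 1) / 2;
        double denominator = Math.sqrt((double) (pairs - tiesX) * (double) (pairs - tiesY));
        return (concordant - discordant) / denominator;
    }

    /** Returns a copy of v with every element rounded to the given number of decimal digits. */
    static double[] roundTo(double[] v, int digits) {
        double scale = Math.pow(10, digits);
        double[] result = new double[v.length];
        for (int i = 0; i < v.length; i++) result[i] = Math.round(v[i] * scale) / scale;
        return result;
    }

    public static void main(String[] args) {
        // Made-up scores: the first three elements are ranked in opposite order.
        double[] x = { 0.101, 0.102, 0.103, 0.200 };
        double[] y = { 0.103, 0.102, 0.101, 0.200 };

        // Full precision: 3 concordant and 3 discordant pairs.
        System.out.println("tau at full precision: " + kendallTau(x, y));

        // Two decimal digits: the first three scores collapse into ties.
        System.out.println("tau at two digits:     " + kendallTau(roundTo(x, 2), roundTo(y, 2)));
    }
}
```

On this toy input τ moves from 0 at full precision to 1 after rounding to two digits, because the three disagreeing elements become ties. This is the kind of distortion that makes controlling the precision of the ranks important before comparing them.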
Package Dependencies
The LAW software requires Java ≥6; it uses the DSI utilities, WebGraph, MG4J, and three packages providing high-performance containers and algorithms, that is, fastutil 6.4 or greater, the COLT distribution, and Sux4J. Moreover, it uses JSAP for command-line parsing. The LAW software also uses a number of libraries from the Jakarta Commons project, including collections, lang, configuration and io. All logging is performed using log4j. Compiling the LAW software requires javacc.
Package                                 Description
it.unimi.dsi.law                        Basic classes.
it.unimi.dsi.law.big.graph
it.unimi.dsi.law.big.rank
it.unimi.dsi.law.big.stat
it.unimi.dsi.law.big.util
it.unimi.dsi.law.bubing.util
it.unimi.dsi.law.graph                  Graph-related classes.
it.unimi.dsi.law.io.tool                Tools manipulating and converting files.
it.unimi.dsi.law.rank                   Computation of spectral rankings and associated utilities.
it.unimi.dsi.law.stat                   Statistical tools (in particular, Kendall's τ) for large-size data.
it.unimi.dsi.law.util                   Utility classes.
it.unimi.dsi.law.vector
it.unimi.dsi.law.warc.filters           A comprehensive filtering system.
it.unimi.dsi.law.warc.filters.parser
it.unimi.dsi.law.warc.io                Provides classes performing low and high level WARC I/O (for format details, please see the ISO draft).
it.unimi.dsi.law.warc.io.examples
it.unimi.dsi.law.warc.parser            Extensions of the BulletParser.
it.unimi.dsi.law.warc.tool              Command-line tools that manipulate WARC files.
it.unimi.dsi.law.warc.util
it.unimi.dsi.law.webgraph