Class WeightedTau

java.lang.Object
it.unimi.dsi.law.stat.CorrelationIndex
it.unimi.dsi.law.stat.WeightedTau

public class WeightedTau
extends CorrelationIndex
Computes the weighted τ between two score vectors. More precisely, this class computes the formula given by Sebastiano Vigna in “A weighted correlation index for rankings with ties”, Proc. of the 24th International World Wide Web Conference, pages 1166–1176, 2015, ACM Press, using the algorithm described therein (see details below).

Given two score vectors for a list of items, this class provides a method to compute efficiently the weighted τ using an ExchangeWeigher.

Instances of this class are immutable. At creation time you can specify a weigher that turns indices into weights, and whether to combine weights additively or multiplicatively. Ready-made weighers include HYPERBOLIC_WEIGHER, which is the weigher of choice. Alternatives include LOGARITHMIC_WEIGHER and QUADRATIC_WEIGHER. Additional methods inherited from CorrelationIndex make it possible to compute the weighted τ directly between two files, to bound the number of significant digits, or to reverse the standard association between scores and ranks (by default, a larger score corresponds to a higher rank, i.e., to a smaller rank index; the largest score gets rank 0).

The weighted τ is defined as follows: consider a rank function ρ (returning natural numbers or ∞) that provides a ground truth—it tells us which elements are more or less important. Consider also a weight function w(−, −) associating with each pair of ranks a nonnegative real number. We define the rank-weighted τ by

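\[ \langle r, s \rangle_{\rho,w} = \sum_{i<j} \operatorname{sgn}(r_i - r_j)\, \operatorname{sgn}(s_i - s_j)\, w(\rho(i), \rho(j)) \]
\[ \|r\|_{\rho,w} = \langle r, r \rangle_{\rho,w}^{1/2} \]
\[ \tau_{\rho,w}(r, s) = \frac{\langle r, s \rangle_{\rho,w}}{\|r\|_{\rho,w}\, \|s\|_{\rho,w}}. \]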

The weight function can be specified by giving a weigher f (e.g., HYPERBOLIC_WEIGHER) and a combination strategy, which can be additive or multiplicative. The weight of the exchange between i and j is then f(i) ● f(j), where ● is the chosen combinator.
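The definitions above can be evaluated directly with a double loop over pairs. The following is a minimal quadratic-time sketch meant only to make the formula concrete; it is not the efficient algorithm of the paper, and the class `NaiveRankWeightedTau` and its methods are hypothetical names that are not part of this library. It uses the JDK's `IntToDoubleFunction` as a stand-in for a weigher.

```java
import java.util.function.IntToDoubleFunction;

/** Naive O(n^2) evaluation of the rank-weighted tau; an illustrative sketch only. */
final class NaiveRankWeightedTau {
    /** Hyperbolic weigher: rank x has weight 1 / (x + 1). */
    static double hyperbolic(int x) {
        return 1.0 / (x + 1.0);
    }

    /** Weight of the exchange between ranks a and b: f(a) + f(b) or f(a) * f(b). */
    static double weight(IntToDoubleFunction f, boolean multiplicative, int a, int b) {
        double fa = f.applyAsDouble(a), fb = f.applyAsDouble(b);
        return multiplicative ? fa * fb : fa + fb;
    }

    /** Weighted inner product of r and s, summed over all pairs i < j. */
    static double inner(double[] r, double[] s, int[] rho,
                        IntToDoubleFunction f, boolean multiplicative) {
        double sum = 0;
        for (int i = 0; i < r.length; i++)
            for (int j = i + 1; j < r.length; j++)
                sum += Math.signum(r[i] - r[j]) * Math.signum(s[i] - s[j])
                        * weight(f, multiplicative, rho[i], rho[j]);
        return sum;
    }

    /** Inner product divided by the product of the two norms; NaN if a vector is constant. */
    static double tau(double[] r, double[] s, int[] rho,
                      IntToDoubleFunction f, boolean multiplicative) {
        double rr = inner(r, r, rho, f, multiplicative);
        double ss = inner(s, s, rho, f, multiplicative);
        return inner(r, s, rho, f, multiplicative) / Math.sqrt(rr * ss);
    }
}
```

For instance, calling `tau` with `NaiveRankWeightedTau::hyperbolic` and `multiplicative` set to `false` corresponds to the additive hyperbolic weight h(i) + h(j) for the supplied ground-truth ranks.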

Now, consider the rank function $\rho_{r,s}$ induced by the lexicographical order by r and s (that is, ordering by r and breaking ties by s). We define

\[ \tau_w = \frac{\tau_{\rho_{r,s},w} + \tau_{\rho_{s,r},w}}{2}. \]

In particular, the (additive) hyperbolic τ is defined by the weight function h(i) = 1 / (i + 1) combined additively:

\[ \tau_h = \frac{\tau_{\rho_{r,s},h} + \tau_{\rho_{s,r},h}}{2}. \]
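To illustrate the symmetrization, here is a small quadratic-time sketch of the additive hyperbolic τ. It assumes that the lexicographic rank is obtained by sorting items by the first vector in descending order, breaking ties by the second vector (also descending), so that the largest score gets rank 0; items tied in both vectors are left in index order. The class `NaiveSymmetricHyperbolicTau` and its methods are hypothetical and are not this library's implementation.

```java
import java.util.Arrays;

/** Naive O(n^2) sketch of the symmetrized additive hyperbolic tau; illustrative only. */
final class NaiveSymmetricHyperbolicTau {
    /** Ranks induced by sorting by first, then second, both descending (assumed convention). */
    static int[] lexicographicRank(double[] first, double[] second) {
        Integer[] order = new Integer[first.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> {
            int c = Double.compare(first[b], first[a]);
            return c != 0 ? c : Double.compare(second[b], second[a]);
        });
        int[] rank = new int[first.length];
        for (int k = 0; k < order.length; k++) rank[order[k]] = k; // item order[k] gets rank k
        return rank;
    }

    /** Weighted inner product with additive hyperbolic weights h(i) + h(j), h(x) = 1 / (x + 1). */
    static double inner(double[] r, double[] s, int[] rho) {
        double sum = 0;
        for (int i = 0; i < r.length; i++)
            for (int j = i + 1; j < r.length; j++)
                sum += Math.signum(r[i] - r[j]) * Math.signum(s[i] - s[j])
                        * (1.0 / (rho[i] + 1.0) + 1.0 / (rho[j] + 1.0));
        return sum;
    }

    /** Rank-weighted tau for a given ground-truth rank; NaN if a vector is constant. */
    static double tau(double[] r, double[] s, int[] rho) {
        return inner(r, s, rho) / Math.sqrt(inner(r, r, rho) * inner(s, s, rho));
    }

    /** Average of the two rank-weighted taus, one per lexicographic ground truth. */
    static double symmetricTau(double[] r, double[] s) {
        return (tau(r, s, lexicographicRank(r, s)) + tau(r, s, lexicographicRank(s, r))) / 2;
    }
}
```

Averaging over both induced rankings makes the result independent of the order in which the two vectors are passed.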

The methods inherited from CorrelationIndex compute the formula above using the provided weigher and combination method. A ready-made instance HYPERBOLIC can be used to compute the additive hyperbolic τ. An ad hoc method can instead compute $\tau_{\rho,w}$.

A main method is provided for command-line usage.

• Field Details

• HYPERBOLIC_WEIGHER

public static final Int2DoubleFunction HYPERBOLIC_WEIGHER
A hyperbolic weigher (the default one). Rank x has weight 1 / (x + 1).

• QUADRATIC_WEIGHER

public static final Int2DoubleFunction QUADRATIC_WEIGHER
A quadratic weigher. Rank x has weight 1 / (x + 1)².
• LOGARITHMIC_WEIGHER

public static final Int2DoubleFunction LOGARITHMIC_WEIGHER
A logarithmic weigher. Rank x has weight 1 / ln(x + e).
• ZERO_WEIGHER

public static final Int2DoubleFunction ZERO_WEIGHER
A constant zero weigher.
• HYPERBOLIC

public static final WeightedTau HYPERBOLIC
A singleton instance of the symmetric hyperbolic additive τ.
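To get a feel for how quickly each weigher decays with rank, the formulas listed above can be tabulated with a few lines of plain Java. This sketch uses the JDK's `IntToDoubleFunction` rather than the `Int2DoubleFunction` type of the fields above, and the class `WeigherTable` with its constants is a hypothetical stand-in, not the library's own fields.

```java
import java.util.function.IntToDoubleFunction;

/** Stand-in weighers reproducing the documented formulas; illustrative only. */
final class WeigherTable {
    /** Rank x has weight 1 / (x + 1). */
    static final IntToDoubleFunction HYPERBOLIC = x -> 1.0 / (x + 1.0);
    /** Rank x has weight 1 / (x + 1)^2. */
    static final IntToDoubleFunction QUADRATIC = x -> 1.0 / ((x + 1.0) * (x + 1.0));
    /** Rank x has weight 1 / ln(x + e). */
    static final IntToDoubleFunction LOGARITHMIC = x -> 1.0 / Math.log(x + Math.E);
    /** Every rank has weight zero. */
    static final IntToDoubleFunction ZERO = x -> 0.0;

    public static void main(String[] args) {
        // Print the weight each stand-in assigns to a few sample ranks.
        for (int x : new int[] {0, 1, 9, 99})
            System.out.printf("rank %2d: hyperbolic=%.4f quadratic=%.4f logarithmic=%.4f%n",
                    x, HYPERBOLIC.applyAsDouble(x), QUADRATIC.applyAsDouble(x),
                    LOGARITHMIC.applyAsDouble(x));
    }
}
```

All three nonzero weighers assign weight 1 to rank 0 and differ only in how fast the weight falls off for lower-ranked items.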
• Constructor Details

• WeightedTau

public WeightedTau()
• WeightedTau

public WeightedTau​(Int2DoubleFunction weigher)
Create an additive weighted τ using the specified weigher.
Parameters:
weigher - a weigher.
• WeightedTau

public WeightedTau​(Int2DoubleFunction weigher, boolean multiplicative)
Create an additive or multiplicative weighted τ using the specified weigher and combination strategy.
Parameters:
weigher - a weigher.
multiplicative - if true, weights are combined multiplicatively, rather than additively.
• Method Details

• compute

public double compute​(double[] v0, double[] v1)
Computes the symmetrized weighted τ between two score vectors.
Specified by:
compute in class CorrelationIndex
Parameters:
v0 - the first score vector.
v1 - the second score vector.
Returns:
the symmetric weighted τ.
• compute

public double compute​(double[] v0, double[] v1, int[] rank)
Computes the weighted τ between two score vectors, given a reference rank.

Note that this method must be called with some care. More precisely, the two score vectors should be built on the fly in the method call, and not stored in variables, because the reference to the first argument array is set to null during the execution of this method to free some memory: if the array is referenced elsewhere, the garbage collector will not be able to collect it.

Parameters:
v0 - the first score vector.
v1 - the second score vector.
rank - the “ground truth” ranking used to weight exchanges, or null to use the ranking induced lexicographically by v1 and v0 as ground truth.
Returns:
the weighted τ.
• main

public static void main(String[] arg) throws NumberFormatException, IOException, JSAPException
Throws:
NumberFormatException
IOException
JSAPException