Class SpectralRanking
- Direct Known Subclasses:
DominantEigenvectorParallelPowerMethod
,KatzParallelGaussSeidel
,LeftSingularVectorParallelPowerMethod
,PageRank
,PowerSeries
public abstract class SpectralRanking extends Object
The usage pattern for concrete subclasses is as follows: first create an instance specifying the graph over which the spectral ranking should be computed. Then, modify public available attributes to fine-tune the ranking algorithm. Finally,
- either call the
init()
method, which initializes the state, and then repeatedly call thestep()
method; every call will compute the next approximation; the current approximation is contained in therank
attribute; - or call the
stepUntil(SpectralRanking.StoppingCriterion)
method, which callsinit()
and then iterates thestep()
method until a certain stopping criterion is met; aSpectralRanking.StoppingCriterion
is a class that decides whether the iteration can be stopped. TheSpectralRanking
class provides two ready-to-use implementations of stopping criteria:- one that decides to stop depending on whether
the value returned by
normDelta()
(if implemented) is smaller than a certain threshold. - another that decides to stop on the basis of the number of iterations.
Moreover, this class provides two static methods that compose stopping criteria in a conjunctive or disjunctive way.
- one that decides to stop depending on whether
the value returned by
At any time, the user may re-initialize the computation, by calling the init()
method, or call
clear()
to get rid of the large arrays that the implementing subclasses usually manage. In the latter case, the arrays
are rebuilt on the next call to init()
.
Choosing a threshold
The stopping threshold used by SpectralRanking.NormStoppingCriterion
should be set so to obtain a reasonable number
of significant digits. In some cases this requires to adapt the threshold to the graph: for instance, PageRank
is a stochastic vector, so its entries tend to be very small (of order 1/n, where n is the number of
nodes in the graph). You should be wary about digits that are not significant, as they can lead to very biased results when comparing using
Kendall's τ rankings with a significant amount of ties (see “Traps and pitfalls of topic-biased PageRank”, by
Paolo Boldi, Roberto Posenato, Massimo Santini, and Sebastiano Vigna,
WAW 2006. Fourth Workshop on Algorithms and Models for the Web-Graph, volume 4936 of Lecture Notes in Computer Science, pages 107−116, Springer, 2008).
Note that, depending on the implementation, normDelta()
might return a bound on the norm of the difference
between the current iterate and the target ranking (e.g., PageRankPowerSeries
and PageRankParallelPowerSeries
,
or any Gauß–Seidel implementation using a w-norm), a tentative estimate
of the same bound (e.g., DominantEigenvectorParallelPowerMethod
)
or simply the norm of the difference between two successive iterates. Precision up to the
chosen threshold is guaranteed only in the first case.
- Author:
- Sebastiano Vigna
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
SpectralRanking.IterationNumberStoppingCriterion
A stopping criterion that stops whenever the number of iterations exceeds a given bound.static class
SpectralRanking.NormStoppingCriterion
A stopping criterion that evaluatesnormDelta()
, and stops if this value is smaller than a given threshold.static interface
SpectralRanking.StoppingCriterion
A a strategy that decides when a computation should be stopped. -
Field Summary
Fields Modifier and Type Field Description static int
DEFAULT_MAX_ITER
Default maximum number of iterations (note that this value is used as a default by main methods).static Norm
DEFAULT_NORM
The default norm (Norm.L_INFINITY
).static double
DEFAULT_THRESHOLD
Default threshold (note that this value is used as a default by main methods).ImmutableGraph
graph
The graph.int
iteration
The current step (0 after initialization).Logger
logger
A logger defined by the implementing subclasses.int
n
The number of nodes ofgraph
, cached.double[]
rank
The current rank vector.protected static double
STOCHASTIC_TOLERANCE
The admitted tolerance in the verification that a vector is a stochastic one. -
Constructor Summary
Constructors Constructor Description SpectralRanking(ImmutableGraph graph, Logger logger)
Creates a new instance. -
Method Summary
Modifier and Type Method Description static SpectralRanking.StoppingCriterion
and(SpectralRanking.StoppingCriterion stop1, SpectralRanking.StoppingCriterion stop2)
Composes two stopping criteria, producing a single stopping criterion (the computation stops iff both conditions become true; lazy boolean evaluation is applied).protected byte[]
approximateNormVector(DoubleIterator doubleIterator)
Returns a compact logarithmic approximation of a norm vector.Properties
buildProperties(String graphBasename)
Returns aProperties
object that contains all parameters used by the computation.void
clear()
Clears all data and releases resources by nullingrank
(i.e., results we no longer be available).void
init()
Initializes the rank vector, zeroesiteration
and logs basic data.protected static boolean
isStochastic(DoubleList v)
Commodity method checking whether a vector is stochastic (nonnegative entries summing up to one withinSTOCHASTIC_TOLERANCE
).double
normDelta()
Returns the norm of an estimation of the distance to the limit of the iterative process: depending on the implementation, this can be an actual bound or, for example, just the difference between the last two approximations.static SpectralRanking.StoppingCriterion
or(SpectralRanking.StoppingCriterion stop1, SpectralRanking.StoppingCriterion stop2)
Composes two stopping criteria, producing a single stopping criterion (the computation stops iff either condition becomes true; lazy boolean evaluation is applied).abstract void
step()
Performs one computation step.void
stepUntil(SpectralRanking.StoppingCriterion stoppingCriterion)
Callsinit()
and steps until a given stopping criterion is met.
-
Field Details
-
DEFAULT_THRESHOLD
public static final double DEFAULT_THRESHOLDDefault threshold (note that this value is used as a default by main methods).- See Also:
- Constant Field Values
-
DEFAULT_MAX_ITER
public static final int DEFAULT_MAX_ITERDefault maximum number of iterations (note that this value is used as a default by main methods).- See Also:
- Constant Field Values
-
DEFAULT_NORM
The default norm (Norm.L_INFINITY
). -
STOCHASTIC_TOLERANCE
protected static final double STOCHASTIC_TOLERANCEThe admitted tolerance in the verification that a vector is a stochastic one. A stochastic vector is nonnegative and has ℓ1 norm equal to 1 ±STOCHASTIC_TOLERANCE
.- See Also:
- Constant Field Values
-
graph
The graph. -
n
public final int nThe number of nodes ofgraph
, cached. -
logger
A logger defined by the implementing subclasses. -
rank
public double[] rankThe current rank vector. -
iteration
public int iterationThe current step (0 after initialization).
-
-
Constructor Details
-
SpectralRanking
Creates a new instance.- Parameters:
graph
- the graph.logger
- a logger.
-
-
Method Details
-
and
public static SpectralRanking.StoppingCriterion and(SpectralRanking.StoppingCriterion stop1, SpectralRanking.StoppingCriterion stop2)Composes two stopping criteria, producing a single stopping criterion (the computation stops iff both conditions become true; lazy boolean evaluation is applied).- Parameters:
stop1
- a stopping criterion.stop2
- a stopping criterion.- Returns:
- a criterion that decides to stop as soon as both criteria are satisfied.
-
or
public static SpectralRanking.StoppingCriterion or(SpectralRanking.StoppingCriterion stop1, SpectralRanking.StoppingCriterion stop2)Composes two stopping criteria, producing a single stopping criterion (the computation stops iff either condition becomes true; lazy boolean evaluation is applied).- Parameters:
stop1
- a stopping criterion.stop2
- a stopping criterion.- Returns:
- a criterion that decides to stop as soon as one of the two criteria is satisfied.
-
isStochastic
Commodity method checking whether a vector is stochastic (nonnegative entries summing up to one withinSTOCHASTIC_TOLERANCE
).This method uses Kahan's summation algorithm.
- Parameters:
v
- the vector to check.- Returns:
- true if the vector is stochastic.
-
buildProperties
Returns aProperties
object that contains all parameters used by the computation.Implementing subclasses should extends this method by calling
super()
and setting additional properties on the resultingProperties
.- Parameters:
graphBasename
- basename of the graph- Returns:
- a properties object that represent all the parameters used to calculate the ranking.
-
init
Initializes the rank vector, zeroesiteration
and logs basic data. Please extend this method to handle additional attributes.- Throws:
IOException
-
step
Performs one computation step.- Throws:
IOException
-
normDelta
public double normDelta()Returns the norm of an estimation of the distance to the limit of the iterative process: depending on the implementation, this can be an actual bound or, for example, just the difference between the last two approximations.This method must be implemented by concrete subclasses if you want to use
SpectralRanking.NormStoppingCriterion
.- Returns:
- the norm of an estimation of the distance to the limit.
- Throws:
IllegalStateException
- if called before the first iteration.UnsupportedOperationException
- if it is not possible to compute a norm.
-
stepUntil
Callsinit()
and steps until a given stopping criterion is met. The criterion is checked a posteriori (i.e., after each step); this means that at least one step is performed.- Parameters:
stoppingCriterion
- the stopping criterion to be used.- Throws:
IOException
-
clear
public void clear()Clears all data and releases resources by nullingrank
(i.e., results we no longer be available). Please extend this method to handle additional attributes. -
approximateNormVector
Returns a compact logarithmic approximation of a norm vector.- Parameters:
doubleIterator
- an iterator enumerating a norm vector.- Returns:
- an array of bytes containing the opposite of a lower bound on the binary logarithm of the doubles returned by the iterator.
-