Class SpectralRanking

  • Direct Known Subclasses:
    DominantEigenvectorParallelPowerMethod, KatzParallelGaussSeidel, LeftSingularVectorParallelPowerMethod, PageRank, PowerSeries

    public abstract class SpectralRanking
    extends Object
    A base abstract class defining methods and attributes supporting computations of graph spectral rankings such as the dominant eigenvector, PageRank or Katz's index. For some elaboration on the relationships between these rankings, see “Spectral Ranking” by Sebastiano Vigna.

    The usage pattern for concrete subclasses is as follows: first create an instance specifying the graph over which the spectral ranking should be computed. Then, modify public available attributes to fine-tune the ranking algorithm. Finally,

    • either call the init() method, which initializes the state, and then repeatedly call the step() method; every call will compute the next approximation; the current approximation is contained in the rank attribute;
    • or call the stepUntil(SpectralRanking.StoppingCriterion) method, which calls init() and then iterates the step() method until a certain stopping criterion is met; a SpectralRanking.StoppingCriterion is a class that decides whether the iteration can be stopped. The SpectralRanking class provides two ready-to-use implementations of stopping criteria:
      • one that decides to stop depending on whether the value returned by normDelta() (if implemented) is smaller than a certain threshold.
      • another that decides to stop on the basis of the number of iterations.

      Moreover, this class provides two static methods that compose stopping criteria in a conjunctive or disjunctive way.

    At any time, the user may re-initialize the computation, by calling the init() method, or call clear() to get rid of the large arrays that the implementing subclasses usually manage. In the latter case, the arrays are rebuilt on the next call to init().

    Choosing a threshold

    The stopping threshold used by SpectralRanking.NormStoppingCriterion should be set so to obtain a reasonable number of significant digits. In some cases this requires to adapt the threshold to the graph: for instance, PageRank is a stochastic vector, so its entries tend to be very small (of order 1/n, where n is the number of nodes in the graph). You should be wary about digits that are not significant, as they can lead to very biased results when comparing using Kendall's τ rankings with a significant amount of ties (see “Traps and pitfalls of topic-biased PageRank”, by Paolo Boldi, Roberto Posenato, Massimo Santini, and Sebastiano Vigna, WAW 2006. Fourth Workshop on Algorithms and Models for the Web-Graph, volume 4936 of Lecture Notes in Computer Science, pages 107−116, Springer, 2008).

    Note that, depending on the implementation, normDelta() might return a bound on the norm of the difference between the current iterate and the target ranking (e.g., PageRankPowerSeries and PageRankParallelPowerSeries, or any Gauß–Seidel implementation using a w-norm), a tentative estimate of the same bound (e.g., DominantEigenvectorParallelPowerMethod) or simply the norm of the difference between two successive iterates. Precision up to the chosen threshold is guaranteed only in the first case.

    Sebastiano Vigna
    • Field Detail


        public static final double DEFAULT_THRESHOLD
        Default threshold (note that this value is used as a default by main methods).
        See Also:
        Constant Field Values

        public static final int DEFAULT_MAX_ITER
        Default maximum number of iterations (note that this value is used as a default by main methods).
        See Also:
        Constant Field Values
      • n

        public final int n
        The number of nodes of graph, cached.
      • logger

        public final Logger logger
        A logger defined by the implementing subclasses.
      • rank

        public double[] rank
        The current rank vector.
      • iteration

        public int iteration
        The current step (0 after initialization).
    • Constructor Detail

      • SpectralRanking

        public SpectralRanking​(ImmutableGraph graph,
                               Logger logger)
        Creates a new instance.
        graph - the graph.
        logger - a logger.
    • Method Detail

      • isStochastic

        protected static boolean isStochastic​(DoubleList v)
        Commodity method checking whether a vector is stochastic (nonnegative entries summing up to one within STOCHASTIC_TOLERANCE).

        This method uses Kahan's summation algorithm.

        v - the vector to check.
        true if the vector is stochastic.
      • buildProperties

        public Properties buildProperties​(String graphBasename)
        Returns a Properties object that contains all parameters used by the computation.

        Implementing subclasses should extends this method by calling super() and setting additional properties on the resulting Properties.

        graphBasename - basename of the graph
        a properties object that represent all the parameters used to calculate the ranking.
      • init

        public void init()
                  throws IOException
        Initializes the rank vector, zeroes iteration and logs basic data. Please extend this method to handle additional attributes.
      • normDelta

        public double normDelta()
        Returns the norm of an estimation of the distance to the limit of the iterative process: depending on the implementation, this can be an actual bound or, for example, just the difference between the last two approximations.

        This method must be implemented by concrete subclasses if you want to use SpectralRanking.NormStoppingCriterion.

        the norm of an estimation of the distance to the limit.
        IllegalStateException - if called before the first iteration.
        UnsupportedOperationException - if it is not possible to compute a norm.
      • stepUntil

        public void stepUntil​(SpectralRanking.StoppingCriterion stoppingCriterion)
                       throws IOException
        Calls init() and steps until a given stopping criterion is met. The criterion is checked a posteriori (i.e., after each step); this means that at least one step is performed.
        stoppingCriterion - the stopping criterion to be used.
      • clear

        public void clear()
        Clears all data and releases resources by nulling rank (i.e., results we no longer be available). Please extend this method to handle additional attributes.
      • approximateNormVector

        protected byte[] approximateNormVector​(DoubleIterator doubleIterator)
        Returns a compact logarithmic approximation of a norm vector.
        doubleIterator - an iterator enumerating a norm vector.
        an array of bytes containing the opposite of a lower bound on the binary logarithm of the doubles returned by the iterator.