Class PageRankPush


public class PageRankPush
extends PageRank
Computes strongly preferential PageRank for a preference vector concentrated on a node using the push algorithm.

The push algorithm is an incremental way of computing PageRank that is particularly useful when the preference vector is nonzero on a single node, the root (i.e., the entire probability mass of the preference vector is in a single node). It was first proposed by Glen Jeh and Jennifer Widom in “Scaling personalized web search”, Proc. of the Twelfth International World Wide Web Conference, pages 271−279, ACM Press, 2003.

This implementation, in particular, computes strongly preferential PageRank for a single root node (i.e., dangling nodes donate all their rank to the root). Since often the set of nodes involved in the computation is a small fraction, we represent implicitly such nodes using their discovery order. As a result, the SpectralRanking.rank vector does not contain the ranks, which can be recovered using the following code instead:

  double[] rank = new double[graph.numNodes()];
  for(int i = pageRank.node2Seen.size(); i-- != 0;) rank[pageRank.seen2Node[i]] = pageRank.rank[i] / pageRank.pNorm;
  

In case you are interested in the pseudorank, instead, you should use

  double[] rank = new double[graph.numNodes()];
  for(int i = pageRank.node2Seen.size(); i-- != 0;) rank[pageRank.seen2Node[i]] = pageRank.rank[i] / (1 - pageRank.backToRoot);
  

Details on the push algorithm have been given by Paolo Boldi and Sebastiano Vigna in “The Push Algorithm for Spectral Ranking”. We implement both a priority-based update rule, and a simple FIFO update rule. Moreover, we implement loop elimination at the root, as described by Pavel Berkhin in “Bookmark-coloring approach to personalized PageRank computing”, Internet Math., 3(1):41−62, 2006.

Author:
Sebastiano Vigna
  • Field Details

    • progressLogger

      public final ProgressLogger progressLogger
      A progress logger.
    • root

      public int root
      The node where the preference vector is concentrated.
    • threshold

      public double threshold
      The threshold for stopping.
    • residual

      public double[] residual
      The vector r (the rôole of p is covered by SpectralRanking.rank).
    • pNorm

      public double pNorm
      The norm of the SpectralRanking.rank.
    • node2Seen

      public Int2IntOpenHashMap node2Seen
      A map from nodes to the seen-order.
    • seen2Node

      public int[] seen2Node
      A map from seen-order to nodes.
    • backToRoot

      public double backToRoot
      The amount of ranking going back to the root.
  • Constructor Details

    • PageRankPush

      protected PageRankPush​(ImmutableGraph graph, Logger logger, boolean fifo)
      Creates a new instance.
      Parameters:
      graph - the graph on which to compute PageRank.
      logger - a logger that will be passed to super().
      fifo - whether to use a FIFO queue instead of a priority queue to choose the next node to update.
    • PageRankPush

      public PageRankPush​(ImmutableGraph graph, boolean fifo)
      Creates a new instance.
      Parameters:
      graph - the graph on which to compute PageRank.
      fifo - whether to use a FIFO queue instead of a priority queue to choose the next node to update.
  • Method Details