uk-2005

If you publish results based on this graph, please cite the references suggested on the dataset page.

This graph was obtained from a 2005 crawl of the .uk domain performed by UbiCrawler. The crawl was very shallow: it aimed at gathering a large number of hosts but only a small number of pages from each host. Note that before April 2006 the URL order was not exactly lexicographical because of locale issues (the graph itself was nevertheless correct). The current dataset has been permuted so that URLs are ordered lexicographically.
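
The ordering claim can be checked on a downloaded copy. The following is a minimal sketch, assuming that uk-2005.urls.gz holds one URL per line in node order and that "lexicographically" means plain byte-wise comparison; both are assumptions about the file, and the function names are illustrative only.

```python
import gzip
from typing import BinaryIO, Optional


def first_out_of_order(stream: BinaryIO) -> Optional[int]:
    """Return the 0-based index of the first line that sorts strictly before
    its predecessor (byte-wise), or None if the lines are nondecreasing."""
    previous = None
    for index, line in enumerate(stream):
        url = line.rstrip(b"\r\n")
        if previous is not None and url < previous:
            return index
        previous = url
    return None


def check_urls_file(path: str) -> Optional[int]:
    # Assumed layout: gzip-compressed text, one URL per line, in node order.
    with gzip.open(path, "rb") as stream:
        return first_out_of_order(stream)
```

Calling check_urls_file("uk-2005.urls.gz") streams the file once and keeps only the previous line in memory, so it should be usable even on the full list of URLs.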

Basic data
nodes                    39 459 925
arcs                     936 364 282
bits/link                1.463 (6.62%)
bits/link (transpose)    1.156 (5.23%)
average degree           23.729
maximum indegree         1 776 852
maximum outdegree        5 213
dangling nodes           11.17%
buckets                  8.99%
largest component        25 711 307 (65.16%)
average distance         15.79 (± 0.026)
reachable pairs          64.30% (± 0.624)
median distance          18 (50.79%)
harmonic diameter        23.19 (± 0.205)
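
Two of the figures above follow directly from the node and arc counts. As a small sanity check (a sketch, with the constants copied from the table and the variable names chosen here for illustration), the average degree is the number of arcs divided by the number of nodes, and the percentage next to the largest component is its share of all nodes:

```python
# Values copied from the "Basic data" table.
NODES = 39_459_925
ARCS = 936_364_282
LARGEST_COMPONENT = 25_711_307

# Average degree: arcs per node.
average_degree = ARCS / NODES

# Share of nodes that belong to the largest component, as a percentage.
largest_component_share = 100 * LARGEST_COMPONENT / NODES

print(f"average degree: {average_degree:.3f}")               # average degree: 23.729
print(f"largest component: {largest_component_share:.2f}%")  # largest component: 65.16%
```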

Random access (recommended)
Filename                 Size
uk-2005.graph            201M
uk-2005.properties       4.0K
uk-2005-t.graph          140M
uk-2005-t.properties     4.0K
uk-2005.map              438M
uk-2005.smap             438M
uk-2005.md5sums          4.0K
uk-2005.lmap             1.3G
uk-2005.fcl              1.2G
uk-2005.urls.gz          309M
uk-2005.stats            4.0K
uk-2005.indegree         3.4M
uk-2005.outdegree        12K
uk-2005.scc              151M
uk-2005.sccsizes         23M
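
Given the size of these files, it may be worth verifying downloads against uk-2005.md5sums. The sketch below assumes that file uses the usual md5sum layout (a hexadecimal digest, whitespace, then a file name on each line); this layout is an assumption, and the helper names are illustrative.

```python
import hashlib
from pathlib import Path
from typing import Dict


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sums(text: str) -> Dict[str, str]:
    """Parse lines of the assumed form '<hex digest>  <file name>'."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            checksum, name = parts
            # md5sum marks binary mode with a leading '*' on the name.
            expected[name.strip().lstrip("*")] = checksum.lower()
    return expected


def verify(directory: Path, md5sums_name: str = "uk-2005.md5sums") -> Dict[str, bool]:
    """Map each listed file present in `directory` to whether its digest matches."""
    expected = parse_md5sums((directory / md5sums_name).read_text())
    return {
        name: md5_of(directory / name) == checksum
        for name, checksum in expected.items()
        if (directory / name).exists()
    }
```

Files listed in the checksum file but not yet downloaded are skipped rather than reported as failures, so verify(Path(".")) can be run after a partial download.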

Sequential access (high compression)
Filename                   Size
uk-2005-hc.graph           164M
uk-2005-hc.properties      4.0K
uk-2005-hc-t.graph         130M
uk-2005-hc-t.properties    4.0K

Natural order (random access)
Filename                 Size
uk-2005-nat.graph        264M
uk-2005-nat.properties   4.0K
uk-2005-nat.fcl          999M
uk-2005-nat.urls.gz      294M

JGraphT serialized succinct representation
Filename                 Size
uk-2005.suxdir           5.2G
uk-2005.suxmap           1.3G

Plots
Indegree-frequency plot (with Fibonacci binning)
Outdegree-frequency plot (with Fibonacci binning)
Indegree-rank plot (cumulative)
Outdegree-rank plot (cumulative)
Distance probability mass function
Connected-components size distribution
Large connected components
Distribution of the logarithm of successor gaps
Distribution of successor gaps
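
For readers unfamiliar with the term, the following is one possible reading of Fibonacci binning, not necessarily the exact procedure used for the plots: degrees are grouped into bins whose boundaries are consecutive Fibonacci numbers, [1, 2), [2, 3), [3, 5), [5, 8), and so on, and each bin is plotted at its midpoint with the mean frequency of the degrees it covers. The sketch assumes the input is a list in which position d holds the number of nodes of degree d; whether uk-2005.indegree and uk-2005.outdegree store exactly that is an assumption.

```python
from typing import List, Sequence, Tuple


def fibonacci_bins(frequencies: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (bin midpoint, mean frequency) pairs for Fibonacci-sized bins.

    frequencies[d] is the number of nodes with degree d. Degree 0 is skipped
    because it cannot be placed on a logarithmic axis. Degrees beyond the end
    of the list count as frequency zero.
    """
    points = []
    low, high = 1, 2  # current bin covers degrees low .. high - 1
    while low < len(frequencies):
        chunk = frequencies[low:min(high, len(frequencies))]
        midpoint = (low + high - 1) / 2
        mean = sum(chunk) / (high - low)
        points.append((midpoint, mean))
        # Next boundaries: 2, 3, 5, 8, 13, ... (Fibonacci numbers).
        low, high = high, low + high
    return points
```

For example, fibonacci_bins([9, 4, 2, 1, 1]) yields three points: degree 1 and degree 2 each get their own bin, while degrees 3 and 4 share one bin plotted at 3.5.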