it-2004

If you publish results based on this graph, please cite the references suggested in the dataset page.

A fairly large crawl of the .it domain.

Basic data
nodes                    41 291 594
arcs                     1 150 725 436
bits/link                1.41 (6.43%)
bits/link (transpose)    1.143 (5.21%)
average degree           27.868
maximum indegree         1 326 745
maximum outdegree        9 964
dangling nodes           12.76%
buckets                  4.63%
largest component        29 855 421 (72.30%)
spid                     2.15 (± 0.016)
average distance         15.04 (± 0.030)
reachable pairs          73.07% (± 0.642)
median distance          16 (51.41%)
harmonic diameter        19.16 (± 0.153)
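Two of the figures above are derived from the raw counts and can be re-checked with simple arithmetic. The sketch below recomputes the average degree (arcs divided by nodes) and the share of nodes in the largest component; the variable names are illustrative and not part of the dataset.

```python
# Raw counts copied from the "Basic data" table.
nodes = 41_291_594
arcs = 1_150_725_436
largest_component = 29_855_421

# Average degree is the number of arcs per node.
average_degree = arcs / nodes

# Share of nodes that fall in the largest component, as a percentage.
largest_share = 100 * largest_component / nodes

print(f"average degree: {average_degree:.3f}")      # 27.868
print(f"largest component: {largest_share:.2f}%")   # 72.30%
```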
Random access (recommended)
Filename                 Size
it-2004.graph            250M
it-2004.properties       4.0K
it-2004-t.graph          175M
it-2004-t.properties     4.0K
it-2004.map              458M
it-2004.smap             458M
it-2004.md5sums          4.0K
it-2004.lmap             1.3G
it-2004.fcl              1.2G
it-2004.urls.gz          247M
it-2004.stats            4.0K
it-2004.indegree         2.6M
it-2004.outdegree        24K
it-2004.scc              158M
it-2004.sccsizes         26M
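Downloaded files can be checked against it-2004.md5sums. The sketch below assumes that file uses the usual md5sum layout (a hex digest, whitespace, then the filename), which this page does not confirm. The helper names md5_of, parse_md5sums and verify are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sums(text):
    """Parse lines of the ASSUMED form '<hex digest>  <filename>'."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            # md5sum marks binary mode with a leading '*' on the name.
            expected[parts[1].strip().lstrip("*")] = parts[0].lower()
    return expected


def verify(directory, md5sums_name="it-2004.md5sums"):
    """Map each listed file that exists in `directory` to True or False."""
    directory = Path(directory)
    expected = parse_md5sums((directory / md5sums_name).read_text())
    return {
        name: md5_of(directory / name) == digest
        for name, digest in expected.items()
        if (directory / name).exists()
    }
```

Calling verify(".") in the download directory would return a dictionary such as {"it-2004.graph": True, ...}; files listed in the checksum file but not yet downloaded are skipped rather than reported as failures.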
Sequential access (high compression)
Filename                 Size
it-2004-hc.graph         194M
it-2004-hc.properties    4.0K
it-2004-hc-t.graph       157M
it-2004-hc-t.properties  4.0K
Natural order (random access)
Filename                 Size
it-2004-nat.graph        382M
it-2004-nat.properties   4.0K
it-2004-nat.fcl          1010M
it-2004-nat.urls.gz      220M
Indegree-frequency plot (with Fibonacci binning)
Outdegree-frequency plot (with Fibonacci binning)
Indegree-rank plot (cumulative)
Outdegree-rank plot (cumulative)
Distance probability mass function
Connected-components size distribution
Large connected components
Distribution of the logarithm of the successor gaps
Distribution of successor gaps
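The frequency plots mention Fibonacci binning. As a rough illustration only, the sketch below assumes bins of the form [F_k, F_{k+1}) over consecutive Fibonacci numbers and plots the mean frequency per integer in each bin; the exact procedure used to produce the plots on this page is not stated here, and the function names are hypothetical. Degree 0 is left out, as it cannot appear on a logarithmic axis.

```python
def fibonacci_bins(max_value):
    """Return half-open bins [lo, hi) bounded by consecutive Fibonacci numbers."""
    bins = []
    lo, hi = 1, 2
    while lo <= max_value:
        bins.append((lo, hi))
        lo, hi = hi, lo + hi
    return bins


def bin_frequencies(freq):
    """Average the counts in `freq` (degree -> count) over each Fibonacci bin.

    Returns a list of (lo, hi, mean count per integer in [lo, hi)).
    Degrees missing from `freq` count as zero; degree 0 is ignored.
    """
    positive = {d: c for d, c in freq.items() if d >= 1}
    if not positive:
        return []
    result = []
    for lo, hi in fibonacci_bins(max(positive)):
        total = sum(positive.get(d, 0) for d in range(lo, hi))
        result.append((lo, hi, total / (hi - lo)))
    return result
```

The bin widths grow as 1, 1, 2, 3, 5, ..., so the noisy tail of a degree distribution is averaged over progressively wider ranges while the head keeps one point per degree.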