Some web characteristics

Significant duplication

High linkage

Complex graph topology

Spam

Fetterly, D., Manasse, M. and Najork, M. 2003. On the evolution of clusters of near-duplicate Web pages.
    Proc. 1st Latin-Amer. Web Congress, 37-45.