The war against spam

Quality signals - Prefer authoritative pages based on:

  • Votes from authors (linkage signals)
  • Votes from users (usage signals)
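As an illustration of linkage signals, the sketch below treats each hyperlink as a vote and computes a PageRank-style score by power iteration. This is one well-known way to turn link votes into an authority score, not necessarily the method any particular engine uses. The function name `pagerank`, the damping factor 0.85, and the iteration count are assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages from link "votes".

    links: dict mapping a page to the list of pages it links to.
    Returns a dict mapping each page to a score; scores sum to 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no out-links spreads its vote over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: "a" and "b" both vote for "c".
example_links = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = pagerank(example_links)
```

In this toy graph "c" collects two votes and ends up with the highest score, while "b", which nobody links to, ends up with the lowest. Usage signals (for example click counts) could be blended with such a score, but how to weight them is a design decision left open here.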

Policing of URL submissions

  • Anti-robot test (check that a human, not a script, is submitting the URL)

Limits on meta-keywords
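One possible reading of such a limit, sketched below: count only a capped number of distinct meta-keywords, and only those that also occur in the visible page text. The function name `usable_meta_keywords`, the cap of 10, and the "must appear in the body" rule are assumptions for illustration, not a documented policy.

```python
def usable_meta_keywords(meta_keywords, body_text, max_keywords=10):
    """Return the meta-keywords an indexer would be willing to trust.

    Keywords are lower-cased and de-duplicated, dropped if they never
    appear in the body text, and capped at max_keywords.
    """
    body_words = set(body_text.lower().split())
    kept = []
    seen = set()
    for keyword in meta_keywords:
        keyword = keyword.strip().lower()
        if not keyword or keyword in seen:
            continue  # skip blanks and repeated keywords
        seen.add(keyword)
        if keyword in body_words:
            kept.append(keyword)
        if len(kept) == max_keywords:
            break  # cap reached: ignore the rest
    return kept


# Hypothetical page: "free" is stuffed into the meta tag but not in the text.
kept = usable_meta_keywords(["Cars", "cars", "free", "engine"],
                            "Used cars and engine parts")
```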

Robust link analysis

  • Ignore statistically implausible linkage (or text)
  • Use link analysis to detect spammers (guilt by association)
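A minimal sketch of guilt by association, assuming a set of already known spam pages is available: a page is flagged as suspect when a large fraction of its out-links point at known spam. The function name `suspect_by_association` and the 0.5 threshold are hypothetical choices for the example.

```python
def suspect_by_association(links, known_spam, threshold=0.5):
    """Flag pages whose out-links mostly point to known spam.

    links: dict mapping a page to the list of pages it links to.
    known_spam: set of pages already identified as spam.
    Returns the set of newly suspected pages.
    """
    suspects = set()
    for page, targets in links.items():
        if page in known_spam or not targets:
            continue  # already known, or nothing to judge it by
        spam_links = sum(1 for t in targets if t in known_spam)
        if spam_links / len(targets) >= threshold:
            suspects.add(page)
    return suspects


# Hypothetical link graph: "x" points mostly at spam, "y" only occasionally.
example_links = {
    "x": ["s1", "s2", "ok"],
    "y": ["ok", "s1", "z", "w"],
    "s1": ["s2"],
}
suspects = suspect_by_association(example_links, known_spam={"s1", "s2"})
```

Here only "x" is flagged (two of its three links go to spam). A real system would presumably treat such a flag as one signal among several rather than as proof.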

Spam recognition by artificial intelligence

  • Training set based on known spam
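To make the training-set idea concrete, here is a small naive Bayes text classifier with add-one smoothing, trained on labelled examples. Naive Bayes is used only because it is a simple, familiar choice; the source does not say which learning method is meant. The class name `NaiveBayesSpamFilter`, the labels "spam" and "ham", and the sample texts are all assumptions.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Word-based naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one labelled example; label must be "spam" or "ham"."""
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def _log_score(self, words, label, vocab_size):
        # log P(label) + sum of log P(word | label), smoothed.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        for word in words:
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def is_spam(self, text):
        """Classify text; needs at least one example of each label."""
        words = text.lower().split()
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        spam_score = self._log_score(words, "spam", len(vocab))
        ham_score = self._log_score(words, "ham", len(vocab))
        return spam_score > ham_score


# Hypothetical training set built from known spam and known good text.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("buy cheap watches now", "spam")
spam_filter.train("meeting notes for the project", "ham")
spam_filter.train("lecture notes on link analysis", "ham")
```

With this tiny training set, `spam_filter.is_spam("buy cheap pills")` returns True and `spam_filter.is_spam("project lecture notes")` returns False.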

Family-friendly filters

  • Linguistic analysis, general classification techniques, etc.
  • For images: flesh tone detectors, source text analysis, etc.
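A flesh tone detector can be approximated very crudely by counting the fraction of pixels whose colour falls in a skin-like range. The sketch below uses a simple RGB threshold rule; the thresholds, and the function names `looks_like_skin` and `skin_fraction`, are illustrative assumptions, and a rule this simple will misjudge many images and skin tones.

```python
def looks_like_skin(r, g, b):
    """Crude RGB rule of thumb for a skin-like colour (values 0-255)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15  # not grey
        and abs(r - g) > 15
        and r > g and r > b                   # red-dominant
    )


def skin_fraction(pixels):
    """Fraction of (r, g, b) pixels that pass the skin-colour rule."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(1 for p in pixels if looks_like_skin(*p)) / len(pixels)


# Hypothetical four-pixel "image": two skin-like pixels, one blue, one dark grey.
sample_pixels = [(220, 170, 140), (210, 160, 130), (0, 0, 255), (30, 30, 30)]
fraction = skin_fraction(sample_pixels)
```

An image whose fraction exceeds some chosen cut-off could then be passed to further checks, such as analysis of the text surrounding the image.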

Editorial intervention

  • Blacklists
  • Top queries audited
  • Complaints addressed
  • Suspect pattern detection
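The blacklist item is the most mechanical of these, and can be sketched as a filter that drops results whose host is a blacklisted domain or a subdomain of one. The names `is_blacklisted` and `filter_results` and the example domains are hypothetical; how the blacklist itself is compiled is an editorial matter.

```python
from urllib.parse import urlparse


def is_blacklisted(url, blacklist):
    """True if the URL's host is a blacklisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain)
               for domain in blacklist)


def filter_results(urls, blacklist):
    """Return the URLs, in order, that are not on the blacklist."""
    return [u for u in urls if not is_blacklisted(u, blacklist)]


# Hypothetical editor-maintained blacklist and result list.
blacklist = {"spam.example"}
results = filter_results(
    [
        "http://good.example/a",
        "http://www.spam.example/x",
        "http://notspam.example/",
    ],
    blacklist,
)
```

Matching on the "." boundary keeps "notspam.example" in the results even though its name ends with the same characters as the blacklisted domain.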