koldfront

Detecting near-duplicates #heuristics

🕡︎ - 2008-08-10

A very cute way of detecting near-duplicates: look at the words right after stopwords. It is a pretty simple idea, and cool if it works.
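To make the idea concrete, here is a minimal sketch of how it could look. Everything in it is my own assumption rather than the method as published: the tiny stopword list, the choice to keep (stopword, next word) pairs as the signature, and the use of Jaccard overlap to compare two signatures. The hope is that boilerplate around an article (navigation, ads, share links) contains few stopwords, so it contributes little to the signature.

```python
import re

# Hypothetical, deliberately tiny stopword list; a real one would be longer.
STOPWORDS = {"a", "an", "the", "is", "of", "to", "and", "in", "on", "at"}


def signature(text):
    """Return the set of (stopword, following word) pairs found in text.

    Pairs where the following word is itself a stopword are skipped.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {
        (stop, nxt)
        for stop, nxt in zip(words, words[1:])
        if stop in STOPWORDS and nxt not in STOPWORDS
    }


def similarity(a, b):
    """Jaccard overlap of the two signatures (an assumed comparison measure)."""
    sa, sb = signature(a), signature(b)
    if not sa and not sb:
        return 0.0  # no stopword pairs at all: treat as not comparable
    return len(sa & sb) / len(sa | sb)


original = "The cat sat on the mat in the house."
wrapped = "Breaking: the cat sat on the mat in the house! Share this."
unrelated = "A dog ran to the park."

print(similarity(original, wrapped))
print(similarity(original, unrelated))
```

With this toy stopword list the extra words wrapped around the second text do not change its signature, so the first two texts score as identical while the third shares nothing with them. Where to put the cut-off for "near-duplicate" is something you would have to tune on real data.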
