We are still actively working on the spam issue.
reCAPTCHA is a service run by Google that both helps digitise books and prevents bots from spamming. 4chan uses it to keep bots from spamming posts or reports.
reCAPTCHA is not a silver bullet. Any sufficiently dedicated spammer can simply pay people in a poor country a few cents per CAPTCHA to solve them by hand. That being said, it's reasonably effective on most websites.
How it works
- The user's browser requests a challenge (an image with distorted text) from reCAPTCHA. reCAPTCHA gives the user a challenge and a token that identifies the challenge.
- The user fills out the web page form, and submits the result to your application server, along with the challenge token.
- Your application server forwards the user's answer and the challenge token to reCAPTCHA, which checks the answer and gives you back a response.
- If the response is true, you will generally allow the user access to some service or information, e.g. allow them to comment on a forum, register for a wiki, or get access to an email address. If it is false, you can allow the user to try again.
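The server-side half of the steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the classic verify protocol, in which the application server posts form-encoded fields to a verify URL and gets back a plain-text reply whose first line is true or false. The endpoint constant, the field names, and both helper functions are assumptions for illustration and do not come from this page.

```python
from urllib.parse import urlencode

# Assumed endpoint of the classic verify API; illustrative only.
VERIFY_URL = "http://www.google.com/recaptcha/api/verify"


def build_verify_body(private_key, remote_ip, challenge, answer):
    """Form-encode the fields the application server forwards to reCAPTCHA."""
    return urlencode({
        "privatekey": private_key,  # the site's secret key (assumed field name)
        "remoteip": remote_ip,      # address of the user who answered
        "challenge": challenge,     # token identifying the challenge
        "response": answer,         # what the user typed
    }).encode("ascii")


def parse_verify_reply(text):
    """Return (ok, error_code) from a plain-text reply.

    The first line is expected to be 'true' or 'false'; on failure an
    optional second line carries an error code.
    """
    lines = text.strip().splitlines()
    ok = bool(lines) and lines[0].strip() == "true"
    error = lines[1].strip() if (not ok and len(lines) > 1) else None
    return ok, error
```

In a real handler the body would be sent with something like urllib.request.urlopen(VERIFY_URL, data=body), and the decoded reply passed to parse_verify_reply to decide whether to accept the post or let the user try again.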
As part of their new spam detection algorithms, Google will serve considerably more difficult CAPTCHAs to users who aren't logged in to a Google account. These harder CAPTCHAs offer zero tolerance for typing mistakes, forcing you to type both test words correctly, much to the annoyance of most 4chan users, who tend to enter gibberish for the OCR word. The kinds of challenge that may be served include:
- A single-"word" (typically not an actual English word) captcha with minimal distortion.
- A house number.
- An image recognition test where the user is asked to pick images like the sample image.
- Two words, only one of which must be solved correctly, similar to classic reCAPTCHA.
- Two highly distorted words with added "ink blots" and many easily confusable m's, n's, and r's, both of which must be typed correctly.
- Starting in February 2015, two highly distorted words with the letters drawn outlined, both of which must be typed correctly.
When the new API was first introduced, some users were able to reduce the difficulty of the captchas Google serves them by setting their User-Agent header to that of an Android browser, and by forging rather than blocking the Referer header. Cookies passed to the captcha as a result of being logged in to Google services also affected its behavior, although not always for the better.
In February 2015, reCAPTCHA was updated, and setting your User-Agent header no longer has any effect. Currently, reCAPTCHA appears to require a Referer header and a login cookie before it serves an easy captcha. Initially, both could be forged, and
Referer: https://www.google.com/recaptcha/api/fallback?k=6Ldp2bsSAAAAAAJ5uyx_lx34lJeEpTLVkP5k04qc
Cookie: NID=67
was enough for some users to get a classic-style reCAPTCHA. However, sometime in March 2015, Google stopped accepting forged cookies, and now only valid Google login cookies appear to be accepted.
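For readers unfamiliar with HTTP headers, the snippet below shows what attaching those two headers to a request looks like in code. It is only a sketch of the historical behaviour described above, which no longer works. The page URL and the function name are hypothetical, and the request is built but never sent.

```python
from urllib.request import Request

# Header values quoted from the text above.
REFERER = ("https://www.google.com/recaptcha/api/fallback"
           "?k=6Ldp2bsSAAAAAAJ5uyx_lx34lJeEpTLVkP5k04qc")
COOKIE = "NID=67"


def request_with_headers(page_url):
    """Build (but do not send) a request carrying the Referer and Cookie headers."""
    req = Request(page_url)
    req.add_header("Referer", REFERER)
    req.add_header("Cookie", COOKIE)
    return req


# Hypothetical page URL, used purely for illustration.
req = request_with_headers("https://example.org/post")
for name, value in req.header_items():
    print(f"{name}: {value}")
```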
In Firefox, the Header Tool add-on is very useful for tweaking these HTTP headers on a per-site/per-page basis. To use it, install the add-on in Firefox, open its settings via Tools > Header Tool > Header Tool, and enter in the sidebar a regexp to select the applicable sites (preceded by an @), such as
followed by the headers you want to send to that site.