I recently saw a funny talk by Andrei Broder on the development of the first CAPTCHAs (you know, those distorted-text images of words that you type to prove that you’re human).
Apparently this team knew a lot about search and other technologies, but not so much about optical character recognition (OCR). Yet they knew that they wanted CAPTCHAs to be OCR-resistant. They did, luckily, have one invaluable resource in the office, in the form of a manual for an early-model Brother scanner. This manual had some sketchy bullet-point advice for how to maximize the chances of OCR success: keep the orientation consistent, don’t change typefaces, avoid background clutter, and so on. So the anti-OCR strategy in its entirety was: let’s try to do everything that Brother says we shouldn’t do, and maybe OCR programs will have trouble. (I also recently learned that the state of the art has improved a lot since then…)