One technical measure that can protect against many scrapers is robots.txt, a text file that gives instructions to Web robots, said Furtsch. It has a serious limitation, however: compliance is voluntary. A scraper must choose to fetch the file and honor its instructions for it to have any effect, and malicious bots simply ignore it.
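To make the voluntary nature of the check concrete, here is a minimal sketch of how a well-behaved crawler consults robots.txt, using Python's standard urllib.robotparser module. The site URL and user-agent string are placeholders, not from any real crawler:

```python
# Sketch: how a *compliant* crawler consults robots.txt before fetching.
# The site URL and user-agent string below are placeholders.
from urllib.robotparser import RobotFileParser

robots = RobotFileParser()
robots.set_url("https://example.com/robots.txt")
robots.read()  # fetch and parse the site's robots.txt

# A well-behaved bot asks permission before each request;
# a malicious scraper simply skips this check.
if robots.can_fetch("ExampleBot/1.0", "https://example.com/members/"):
    print("Allowed to crawl this page")
else:
    print("robots.txt disallows this page")
```

Nothing enforces the can_fetch call; it happens only if the scraper's author chooses to make it, which is exactly the limitation Furtsch describes.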
Another measure sites should take is giving users a way to delete their sensitive data whenever they choose.
Another widely used protection against scrapers is the captcha, a challenge that displays distorted letters and numbers that automated programs have difficulty deciphering. Sites ask people to type the characters during registration to prove that they're human.
Captchas should also be regularly updated, Furtsch said, as some scraping tools have been known to outsmart certain types of captchas.
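To illustrate the flow Furtsch describes, here is a hypothetical server-side sketch in Python. It issues a fresh random challenge for every registration attempt and expires old ones, so a bot cannot keep replaying a challenge it has already cracked. The image rendering step is omitted, and all names are illustrative:

```python
# Hypothetical sketch of server-side captcha handling during registration.
# Rendering the distorted image is omitted; names are illustrative only.
import secrets
import string
import time

CHALLENGE_TTL = 120  # seconds before a challenge expires and must be reissued

def new_challenge(length: int = 6) -> dict:
    """Issue a fresh random challenge; the text would be rendered
    as a distorted image and shown to the user."""
    text = "".join(secrets.choice(string.ascii_uppercase + string.digits)
                   for _ in range(length))
    return {"answer": text, "issued_at": time.time()}

def verify(challenge: dict, user_input: str) -> bool:
    """Check the user's answer; expired challenges always fail,
    forcing a new (different) challenge on every retry."""
    if time.time() - challenge["issued_at"] > CHALLENGE_TTL:
        return False
    return secrets.compare_digest(challenge["answer"],
                                  user_input.strip().upper())

# Example registration flow:
challenge = new_challenge()
print("Show distorted image of:", challenge["answer"])
print("Human verified" if verify(challenge, challenge["answer"])
      else "Try again with a new challenge")
```

Expiring and regenerating challenges on every attempt is one simple form of the regular updating Furtsch recommends; sites may also periodically swap in new captcha styles as older ones are defeated.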