Distinguishing between “good” bots and “bad” bots is key to implementing a solid security framework.
Automated web traffic is a fundamental part of the Internet. The bots that generate this traffic come from a wide variety of sources, from Google’s harmless web crawling to malicious hackers targeting government voter registration pages.
In fact, bots drove almost 40% of all collected Internet traffic in 2018. In other words, of every ten requests a website received, only about six came from a human being sitting behind a computer or peering into a smartphone.
The vast proliferation of bots is a concerning development for business leaders in almost every industry. From airlines to e-commerce, there is an ecosystem of bots carrying out a broad range of activities.
Not all of these activities are harmful. Many of them simply occupy network resources. But some of them aggressively scrape data for fraudulent purposes, and others are part of sophisticated criminal networks.
The “Good Bot/Bad Bot” Distinction
The problem with bad bots is that they cleverly mask their behavior to look like human users. Sophisticated programming lets them act in ways that even advanced firewall technology cannot directly address, and a hypervigilant firewall could easily begin restricting legitimate users, which would be bad news.
The existence of good bots also complicates matters. Any marketer with experience in search engine optimization will tell you that blocking Google’s Googlebot will hurt search rankings.
Any advanced bot mitigation solution must therefore distinguish carefully between harmless bot traffic and malicious bot traffic. This is where Imperva, in partnership with Distil Networks, comes into the picture.
What Bad Bots Can Do
Bad bots are responsible for a wide variety of malicious business practices. Some of the most well-known include:
- Price Scraping. Businesses that employ bots to scrape their competitors’ prices can automate a system that sets their own prices just $0.01 below the lowest available price. This earns them the coveted “lowest price” tag on most online marketplaces.
- Content Scraping. Businesses can use bots to copy content from authoritative competitors onto dozens of public websites. The duplicate content drags down the original site’s search rankings, hurting business and credibility.
- Credential Stuffing. Cybercriminals feed bots databases of stolen login credentials and let them try each username and password pair against a login page until they hit a match. This can lead to customer account lockouts, service tickets, and full-scale identity theft.
- Credit Card Fraud. Cybercriminals holding stolen credit card numbers can use bots to fill in the missing details. They program the bots to cycle through every combination of expiration date and CVV number until they get a hit.
- Denial of Service. Denial of service attacks are a common way for hackers to attack businesses and force their websites offline. This results in lost revenue, service blackouts, and reputational damage.
- Account Aggregation. Malicious bots can create thousands of spam accounts on any platform that allows free account creation. This distorts conversion rates and erodes users’ trust in the service.
- Denial of Inventory. If a malicious bot signs onto an e-commerce site and places an item in its shopping cart, it may be able to prevent any other user from purchasing that item. This is a serious and widespread problem in the airline and ticketing industries.
- Gift Card Balance Checking. Cybercriminals regularly employ bots to steal money from gift card accounts. It results in fraudulent purchases and eroded customer trust.
Unfortunately, no one-size-fits-all solution can guarantee the total elimination of bad bot behavior. Since the most sophisticated bots act like human users, effective mitigation demands a multi-faceted approach that goes beyond traditional firewall rules.
How Imperva Mitigates Bad Bot Behavior
One of the primary ways that websites, applications, and APIs defend against automated users with malicious intent is through reverse proxies. Where a web proxy accesses web content on a user’s behalf, a reverse proxy accesses server resources after receiving a client’s request.
In a typical denial of service attack, thousands of bots overwhelm the victim’s servers with requests. The reverse proxy is an effective solution for reducing the attack surface, offering a layer of protection from unexpected traffic spikes.
This technology goes hand-in-hand with caching. If a website stores copies of its pages on the reverse proxy server, the proxy can be configured to answer repeated requests from its cache, combining them into a single request to the end server. This way, the end server does not suffer the impact of the attack.
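To make the caching idea concrete, here is a minimal Python sketch of a proxy-side cache. It is an illustration only, not a description of how Imperva's network is built: the `CachingProxy` class, the `fetch_from_origin` callback, and the 60-second lifetime are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class CachingProxy:
    """Toy reverse-proxy cache (hypothetical, for illustration only).

    Repeated requests for the same path reach the origin server at most
    once per cache lifetime; everything else is answered from memory.
    """

    def __init__(
        self,
        fetch_from_origin: Callable[[str], str],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin   # stand-in for the real origin server
        self._ttl = ttl_seconds           # how long a cached copy stays fresh
        self._clock = clock               # injectable so the cache can be tested
        self._cache: Dict[str, Tuple[float, str]] = {}
        self.origin_hits = 0              # how many requests reached the origin

    def handle(self, path: str) -> str:
        now = self._clock()
        cached = self._cache.get(path)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]              # fresh copy: the origin is not contacted
        body = self._fetch(path)          # missing or stale: ask the origin once
        self.origin_hits += 1
        self._cache[path] = (now, body)
        return body


if __name__ == "__main__":
    proxy = CachingProxy(lambda path: f"<html>{path}</html>")
    for _ in range(1000):
        proxy.handle("/pricing")
    print(proxy.origin_hits)  # 1: the other 999 requests were served from cache
```

A production proxy would also have to honor cache-control headers, bypass the cache for personalized pages, and handle concurrent requests, all of which this sketch leaves out.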
The combination of reverse proxies and content caching allows content delivery networks to mitigate many of the most dangerous bot attacks reliably. In Imperva’s case, the fact that mirror versions of website pages are stored in disparate geographical locations also helps improve content delivery speeds for legitimate users.
Since reverse proxies act as intermediaries between backend servers and anonymous traffic coming in from every corner of the Internet, they offer an ideal position for traffic scrubbing. This is the process of validating incoming requests before sending them onwards to the origin server.
Deploying a perimeter mesh of reverse proxy servers at this stage provides a resilient defense against bot traffic. Combining this defense with cloud web application firewall (WAF) technology allows the system to weed out the malicious bots and cybercriminal requests that still make their way in.
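As a rough illustration of traffic scrubbing, the following Python sketch validates each request before it would be forwarded to the origin. The two rules shown (rejecting requests with no User-Agent header and applying a per-IP sliding-window rate limit) are assumptions chosen to keep the example short; the `Scrubber` class and its thresholds are hypothetical and do not represent Imperva's actual detection logic.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Mapping


class Scrubber:
    """Hypothetical request validator placed in front of an origin server."""

    def __init__(self, max_requests: int = 20, window_seconds: float = 10.0) -> None:
        self.max_requests = max_requests        # allowed requests per window, per IP
        self.window_seconds = window_seconds    # length of the sliding window
        self._recent: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str, headers: Mapping[str, str], now: float) -> bool:
        """Return True if the request may be forwarded to the origin."""
        # Rule 1: mainstream browsers send a User-Agent header, so treat a
        # missing or blank one as suspicious.
        if not headers.get("User-Agent", "").strip():
            return False

        # Rule 2: sliding-window rate limit. Forget timestamps that have
        # fallen out of the window, then count what remains.
        timestamps = self._recent[client_ip]
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return False

        timestamps.append(now)                  # record only forwarded requests
        return True
```

Rules this simple would stop only unsophisticated bots. They are meant to show where validation sits in the request path, not what a complete rule set looks like.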
Keeping the Good Bots In
Even the most sophisticated bot mitigation solution must let some bots in. A significant percentage of bots perform valuable, desirable functions. Hackers know this, of course, and will not hesitate to impersonate a good bot if it improves their chances.
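Bad bots posing as good ones tend to give themselves away, for example by connecting from network addresses that do not belong to the operator they claim to represent. There is only one legitimate Googlebot out there, and its behavior is hard to mimic convincingly.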
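One widely documented way to catch a fake Googlebot is the reverse-then-forward DNS check that Google itself recommends: look up the hostname for the connecting IP address, confirm it belongs to a Google domain, then resolve that hostname and confirm it points back to the same IP. The Python sketch below shows the idea. The function name, the injectable resolver parameters, and the shortened domain list are assumptions made for the example, and it covers IPv4 only.

```python
import socket
from typing import Callable, List

# Domains used for Googlebot hostnames (shortened list, assumed for this sketch).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_lookup(ip: str) -> str:
    # PTR lookup: IP address -> primary hostname.
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname: str) -> List[str]:
    # A-record lookup: hostname -> list of IPv4 addresses.
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_googlebot(
    ip: str,
    reverse_lookup: Callable[[str], str] = _reverse_lookup,
    forward_lookup: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """Return True only if the IP passes the reverse-then-forward DNS check."""
    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
        # The leading dot in each suffix prevents look-alike domains such
        # as "evilgooglebot.com" from matching.
        if not hostname.endswith(GOOGLE_SUFFIXES):
            return False
        # Anyone controlling their own reverse DNS can claim any hostname,
        # so the forward lookup must lead back to the same address.
        return ip in forward_lookup(hostname)
    except OSError:
        # Failed lookups (socket.herror, socket.gaierror) count as unverified.
        return False
```

The resolvers are passed in as parameters so the check can be exercised with canned data. A real deployment would also cache results, since two DNS lookups per request would be expensive.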
It takes years of collecting and qualifying data on good bot behavior to accurately identify which bots to whitelist. Imperva and Distil Networks constantly refine this process, leading to better identification and fewer false positives every day.