Chapter 02 | Network Defense: Understanding IP Bans, ASN Isolation, and TLS Fingerprinting

UPDATED: 2026-06-16
DIRECT SUMMARY // KEY TAKEAWAY

Explore the first line of defense in anti-scraping: network-layer controls. Learn how websites identify scrapers through IP reputation, rate limiting, TLS fingerprinting, and HTTP header analysis.

2.1 IP Banning: The Oldest and Most Effective Defense

IP banning is the first and most common line of defense for most websites. The site identifies the source IP of each request and blacklists addresses that look automated.

Types of IP Controls

| Type | Trigger Condition | Symptom |
|---|---|---|
| Single IP Ban | High request frequency from one IP | Returns 403 or 429 |
| ASN Banning | Source IP belongs to a blocked network's ranges (e.g., cloud provider ranges) | No IPs from that data center can access the site |
| Geo-Banning | Source IP maps to a blocked country or region | Rejected based on GeoIP match |
| Residential vs Data Center | Source IP is classified as data center rather than residential | Data center IPs are directly rejected |

Why is your server IP so easily banned?
IP ranges for cloud providers like AWS, GCP, and Alibaba Cloud are publicly known, either published by the provider or derivable from routing data. Websites can therefore block traffic from cloud servers with a simple rule:
if ip in DATACENTER_IP_RANGES: block()
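
The one-line rule above can be fleshed out with Python's standard ipaddress module. This is only a minimal sketch: the ranges below are reserved documentation blocks used as placeholders (an assumption, real deployments would load a provider's published list), and is_datacenter_ip is a hypothetical helper name, not part of any WAF product.

```python
import ipaddress

# Placeholder CIDR blocks (reserved documentation ranges), standing in for
# ranges that would normally be loaded from a cloud provider's published list.
DATACENTER_IP_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32")
]


def is_datacenter_ip(ip: str) -> bool:
    """Return True if the address falls inside any listed range."""
    addr = ipaddress.ip_address(ip)
    # An IPv4 address tested against an IPv6 network (or vice versa) is
    # simply "not contained", so a mixed list is safe to scan.
    return any(addr in network for network in DATACENTER_IP_RANGES)


def handle(ip: str) -> int:
    """Hypothetical request hook: 403 for listed ranges, 200 otherwise."""
    return 403 if is_datacenter_ip(ip) else 200
```

A linear scan is fine for a handful of ranges; with thousands of prefixes a production system would more likely use a prefix tree or a precomputed lookup.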

Rate Limiting

sequenceDiagram
    participant Bot as Scraper
    participant WAF as WAF/CDN
    participant Server as Origin Server

    Bot->>WAF: GET /page (1st time)
    WAF->>Server: Forward request
    Server-->>WAF: 200 OK
    WAF-->>Bot: 200 OK

    Bot->>WAF: GET /page (50th time/sec)
    WAF->>WAF: Trigger rate threshold
    WAF-->>Bot: 429 Too Many Requests

    Note over WAF: Sliding window counter<br/>or Token Bucket algorithm

    Bot->>WAF: GET /page (during block period)
    WAF-->>Bot: 403 Forbidden (IP temporarily banned)
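
To make the token bucket idea in the diagram concrete, here is a small sketch of a per-IP limiter. The rate and capacity values, the buckets dictionary, and the status_for helper are all assumptions chosen for illustration; real WAFs tune these per endpoint and usually add the temporary ban step, which this sketch leaves out.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity` tokens."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity      # start full, so short bursts are allowed
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Add tokens for the elapsed time, never exceeding capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1        # each request spends one token
            return True
        return False


buckets: Dict[str, TokenBucket] = {}


def status_for(ip: str, rate: float = 5.0, capacity: float = 10.0,
               clock: Callable[[], float] = time.monotonic) -> int:
    """Return 200 if the request is within budget, otherwise 429."""
    bucket = buckets.get(ip)
    if bucket is None:
        bucket = buckets[ip] = TokenBucket(rate, capacity, clock)
    return 200 if bucket.allow() else 429
```

With these example settings a client can burst 10 requests, then is held to about 5 per second. The clock parameter exists only so the behavior can be checked without sleeping.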

2.2 TLS Fingerprinting: Your Handshake Ritual Exposed

TLS fingerprinting is an advanced anti-scraping method that many developers are unaware of. The TLS handshake that opens every HTTPS connection can reveal whether the client is a real browser or an automation library, before any HTTP request is sent.

JA3 Fingerprinting

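JA3 is a fingerprinting algorithm that takes five fields from the TLS ClientHello message, writes them as decimal values, and hashes the resulting string. It identifies the client's TLS stack (library, version, and configuration) rather than an individual client: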

JA3 = MD5(SSLVersion, Ciphers, Extensions, EllipticCurves, EllipticCurvePointFormats)
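
The formula can be spelled out in a few lines of Python. In the JA3 format the five fields are joined with commas, the values inside each field are joined with hyphens, and GREASE placeholder values are ignored. The sketch below assumes you already have the ClientHello values as integers; the function names and the input values in the usage note are illustrative, not captured from a real client.

```python
import hashlib
from typing import Iterable, Tuple


def is_grease(value: int) -> bool:
    """GREASE values follow the pattern 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def _join(values: Iterable[int]) -> str:
    # Decimal values, hyphen-separated, with GREASE entries dropped.
    return "-".join(str(v) for v in values if not is_grease(v))


def ja3(version: int, ciphers: Iterable[int], extensions: Iterable[int],
        curves: Iterable[int], point_formats: Iterable[int]) -> Tuple[str, str]:
    """Return (ja3_string, md5_hex) for the given ClientHello fields."""
    ja3_string = ",".join([
        str(version),
        _join(ciphers),
        _join(extensions),
        _join(curves),
        _join(point_formats),
    ])
    return ja3_string, hashlib.md5(ja3_string.encode("ascii")).hexdigest()
```

For example, ja3(771, [4865, 4866], [0, 10, 11], [29, 23], [0]) builds the string "771,4865-4866,0-10-11,29-23,0" and returns it with its MD5 digest. Because the order of ciphers and extensions is part of the string, two clients offering the same values in a different order get different hashes.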

Comparison Examples:

| Client | JA3 Hash |
|---|---|
| Chrome 120 (Windows) | cd08e31494f9531f560d64c695473da9 |
| Python requests | 3b5074b1b5d032e5620f69f9f700ff0e |
| curl | 7dc465ee29f9f4cde9001c75d09b1e65 |

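Python's requests library and curl produce a stable JA3 signature for a given version and TLS library build, so the hashes above are examples rather than constants. Recent Chrome versions also randomize the order of their TLS extensions, which changes the JA3 value between connections. Services such as Cloudflare Bot Management can match known tool signatures and intercept them during the handshake phase.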

How to Detect Your TLS Fingerprint

# Check your curl fingerprint (using a dedicated detection service)
curl -s https://tls.peet.ws/api/all | python3 -m json.tool | grep ja3

# Check browser fingerprint
# Visit https://browserleaks.com/tls to view your browser's fingerprint
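
The curl command shows curl's fingerprint, not your scraper's. To see what a Python script presents, the same check can be run from Python itself. The sketch below makes no assumption about the exact shape of the JSON response: it walks whatever comes back and collects every key containing "ja3", much like the grep above. The helper names are hypothetical, and fetch_fingerprint uses the standard library's urllib, so it reports the fingerprint of Python's built-in TLS stack.

```python
import json
import urllib.request
from typing import Any, Dict


def find_ja3_fields(obj: Any, prefix: str = "") -> Dict[str, Any]:
    """Recursively collect every value whose key contains 'ja3'."""
    found: Dict[str, Any] = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            if "ja3" in str(key).lower():
                found[path] = value
            else:
                found.update(find_ja3_fields(value, path))
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            found.update(find_ja3_fields(item, f"{prefix}[{index}]"))
    return found


def fetch_fingerprint(url: str = "https://tls.peet.ws/api/all") -> Dict[str, Any]:
    """Fetch the report over HTTPS and return only its JA3-related fields."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return find_ja3_fields(json.load(response))
```

Calling fetch_fingerprint() from the environment your scraper runs in, then comparing the result with what your browser shows, tells you how far apart the two fingerprints are.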

2.3 HTTP Header Analysis

Modern anti-scraping systems meticulously analyze every HTTP request header:

# ❌ Typical scraper headers (easily identified)
headers = {
    'User-Agent': 'python-requests/2.31.0'  # Directly exposes the tool
}

# ✅ Simulating real Chrome browser headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7',
    'Accept-Encoding': 'gzip, deflate, br',  # Only advertise br if your client can decode Brotli
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-User': '?1',
    'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"macOS"',
}

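Key Detection Point: In a real browser, the User-Agent, the Sec-Fetch-* headers, and the sec-ch-ua client hints are consistent with one another (same browser version, same platform, values that match the type of navigation).
If you spoof the UA but omit these headers, send values that contradict it, or send headers in an order that doesn't follow Chrome conventions, the request is likely to be flagged.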
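
To illustrate what such a consistency check might look like on the defending side, here is a simplified sketch. Real detection systems are proprietary, so the rules below are assumptions chosen for illustration: the function name is hypothetical, only Chrome-like User-Agents are examined, and header order is not checked.

```python
import re
from typing import Dict, List

# (substring in User-Agent, expected sec-ch-ua-platform value); first match wins,
# so Android is listed before Linux because Android UAs contain both words.
PLATFORM_HINTS = [
    ("Windows NT", "Windows"),
    ("Macintosh", "macOS"),
    ("Android", "Android"),
    ("Linux", "Linux"),
]


def header_inconsistencies(headers: Dict[str, str]) -> List[str]:
    """Return a list of mismatches between the UA and its companion headers."""
    h = {name.lower(): value for name, value in headers.items()}
    problems: List[str] = []

    ua = h.get("user-agent", "")
    match = re.search(r"Chrome/(\d+)", ua)
    if not match:
        return problems  # this sketch only inspects Chrome-like User-Agents
    major = match.group(1)

    ch_ua = h.get("sec-ch-ua")
    if ch_ua is None:
        problems.append("missing sec-ch-ua")
    elif f'"Chromium";v="{major}"' not in ch_ua:
        problems.append("sec-ch-ua version differs from User-Agent")

    expected = next((p for hint, p in PLATFORM_HINTS if hint in ua), None)
    platform = h.get("sec-ch-ua-platform")
    if platform is None:
        problems.append("missing sec-ch-ua-platform")
    elif expected is not None and platform.strip('"') != expected:
        problems.append("sec-ch-ua-platform differs from User-Agent")

    for name in ("sec-fetch-dest", "sec-fetch-mode", "sec-fetch-site"):
        if name not in h:
            problems.append(f"missing {name}")
    return problems
```

Run against the full Chrome header set shown earlier, this returns an empty list; run against a dictionary that contains only the spoofed User-Agent, it reports every missing companion header.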


2.4 Chapter Review

  1. Why is setting the User-Agent to a Chrome string often still detected?
  2. What is a JA3 fingerprint, and how can a Python program generate the same JA3 as Chrome?
  3. What is the fundamental difference between a cloud server IP and a residential IP in anti-scraping detection?