429 error keeps occurring even after a long time delay #2410
Replies: 3 comments
-
|
The issue is with how you're handling the async client and the rate limiting. Here's the fixed version:

```python
import asyncio
from io import StringIO

import httpx
import pandas as pd


async def fetch_and_parse(url):
    async with httpx.AsyncClient(
        timeout=30.0,
        limits=httpx.Limits(max_connections=5),
    ) as client:
        while True:
            response = await client.get(url)
            if response.status_code == 429:
                # Check the Retry-After header first; fall back to 30 seconds.
                # This assumes the header holds a number of seconds.
                retry_after = response.headers.get("Retry-After", "30")
                wait = int(retry_after)
                print(f"Rate limited on {url}, waiting {wait}s...")
                await asyncio.sleep(wait)  # NOT time.sleep()
                continue
            response.raise_for_status()
            # read_html returns a list of DataFrames; keep the first table.
            table = pd.read_html(StringIO(response.text))[0]
            return table
```

Key fixes: honor the Retry-After header when the server sends one, and wait with asyncio.sleep() rather than time.sleep(), which would block the event loop.
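One detail in the loop above: it converts Retry-After with int(), which assumes the server sends a delay in seconds. HTTP also allows the header to carry an HTTP-date, and int() would raise on that. A minimal sketch of a more tolerant parser follows; `parse_retry_after`, its `default`, and its `now` parameter are hypothetical names for illustration, not part of httpx.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, default=30.0, now=None):
    """Return the number of seconds to wait for a Retry-After header value.

    Hypothetical helper: accepts delay-seconds ("120") or an HTTP-date
    ("Wed, 21 Oct 2015 07:28:00 GMT"); falls back to `default` otherwise.
    """
    if value is None:
        return default
    try:
        # Most common form: a whole number of seconds.
        return max(0.0, float(int(value)))
    except ValueError:
        pass
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Unparseable header: use the fallback delay.
        return default
    if when.tzinfo is None:
        # Treat a date without an offset as UTC.
        when = when.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    # A date in the past means "retry now".
    return max(0.0, (when - now).total_seconds())
```

In the loop, this would replace the int() call, for example `wait = parse_retry_after(response.headers.get("Retry-After"))`.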
Why re-running a new script works: A fresh process opens a new TCP connection from a different ephemeral source port, and the server's rate limiter may track by connection rather than by IP alone. The fix above gets the same result without restarting the script: it reuses one client for the retries and waits before each one.

If you're making many parallel requests, also consider using an asyncio.Semaphore to cap concurrency:

```python
sem = asyncio.Semaphore(3)  # max 3 concurrent


async def fetch_limited(url):
    async with sem:
        return await fetch_and_parse(url)
```
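To check that a semaphore caps concurrency the way the snippet above intends, here is a small sketch with no network calls. `run_limited` and `job` are hypothetical names, and `asyncio.sleep(0.01)` stands in for the HTTP request.

```python
import asyncio


async def run_limited(n_tasks, limit):
    """Run n_tasks dummy jobs under a semaphore; return the peak concurrency seen."""
    sem = asyncio.Semaphore(limit)  # created inside the running event loop
    active = 0
    peak = 0

    async def job():
        nonlocal active, peak
        async with sem:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for the HTTP request
            active -= 1

    await asyncio.gather(*(job() for _ in range(n_tasks)))
    return peak


if __name__ == "__main__":
    # Ten jobs, but never more than three inside the semaphore at once.
    print(asyncio.run(run_limited(n_tasks=10, limit=3)))
```

Waiting tasks are parked on the semaphore rather than on the server, so a lower limit trades throughput for fewer 429 responses.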
-
|
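Here's the fixed version:

```python
import asyncio
from io import StringIO

import httpx
import pandas as pd

# Limit concurrent requests. On Python < 3.10, create this inside main()
# instead, since a module-level semaphore can bind to a different event loop.
SEMAPHORE = asyncio.Semaphore(3)


async def fetch_and_parse(client, url):
    async with SEMAPHORE:
        for attempt in range(5):
            response = await client.get(url)
            if response.status_code == 429:
                # Fall back to 15 seconds when the server sends no Retry-After.
                retry_after = int(response.headers.get("Retry-After", 15))
                print(f"429 on {url}, waiting {retry_after}s (attempt {attempt + 1})")
                await asyncio.sleep(retry_after)  # async sleep, not time.sleep!
                continue
            response.raise_for_status()
            # read_html returns a list of DataFrames; keep the first table.
            table = pd.read_html(StringIO(response.text))[0]
            return table
        raise RuntimeError(f"Max retries exceeded for {url}")


async def main(urls):
    # One shared client for all requests, so connections are reused.
    async with httpx.AsyncClient(
        timeout=30,
        limits=httpx.Limits(max_connections=5),
        follow_redirects=True,
    ) as client:
        tasks = [fetch_and_parse(client, url) for url in urls]
        return await asyncio.gather(*tasks)
```

Key fixes: one shared AsyncClient for every request, a semaphore capping concurrency at three, at most five attempts per URL, and asyncio.sleep() instead of time.sleep().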
Note: this is the httpx Python library, not the ProjectDiscovery httpx CLI tool. They are different projects with the same name.
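Going back to the retry loop above: when the server sends no Retry-After header, it waits a fixed 15 seconds on every attempt. A common alternative is exponential backoff with jitter, which is not what the reply uses. A minimal sketch follows; `backoff_delay` and its `base`, `cap`, and `jitter` parameters are hypothetical names chosen for illustration.

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0, jitter=True):
    """Seconds to wait before retrying; `attempt` is 0-based.

    Hypothetical helper: doubles the delay on each attempt, up to `cap`.
    """
    delay = min(cap, base * (2 ** attempt))
    if jitter:
        # "Full jitter": pick uniformly in [0, delay] so that parallel
        # tasks do not all retry at the same instant.
        return random.uniform(0, delay)
    return delay
```

In the loop, this could serve as the fallback when the header is absent, for example `await asyncio.sleep(backoff_delay(attempt))`, while a Retry-After value from the server would still take priority.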
-
The code works initially, but eventually it gets a 429 error due to too many requests. I added a sleep to wait for the limit to pass, but I still get the error. If I re-run the script immediately (new instance), the error doesn't occur, so I should be able to bypass the restriction.