I am encountering a 504 Gateway Time-out error when attempting to retrieve recent CDX entries for a large website using the following URL:
http://web.archive.org/cdx/search/cdx?url=https://www.nih.gov/&from=20250301&matchType=prefix&limit=10
However, when I search using an older date (e.g., http://web.archive.org/cdx/search/cdx?url=https://www.nih.gov/&from=20240301&matchType=prefix&limit=10), or when I simply remove the from parameter, I receive results relatively quickly.
Note that there is no such problem when querying a smaller website, or when setting matchType=exact. That makes sense, since there are fewer entries to go through. The dependence on the date is harder to explain, though: the more recent from value times out while the older one does not. Is there a known issue with querying recent dates, or am I doing something wrong?
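For reference, here is a minimal sketch of how the queries above can be rebuilt and timed in Python. The helper names build_cdx_url and time_query and the 60-second client timeout are assumptions made for illustration; only the endpoint and the url, from, matchType, and limit parameters come from the queries above.

```python
import time
import urllib.error
import urllib.request
from urllib.parse import urlencode

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_url(url, from_ts=None, match_type="prefix", limit=10):
    """Build a CDX query URL; the from parameter is omitted when from_ts is None."""
    params = {"url": url}
    if from_ts is not None:
        params["from"] = from_ts
    params["matchType"] = match_type
    params["limit"] = limit
    # safe=":/" leaves the target URL unescaped, matching the queries above.
    return CDX_ENDPOINT + "?" + urlencode(params, safe=":/")


def time_query(query_url, timeout=60):
    """Return (status, seconds). status is an HTTP code or an exception name."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(query_url, timeout=timeout) as resp:
            resp.read()
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # a 504 from the gateway would land here
    except OSError as exc:
        # Covers URLError and client-side socket timeouts.
        status = type(exc).__name__
    return status, time.monotonic() - start
```

Calling time_query(build_cdx_url("https://www.nih.gov/", from_ts=ts)) with ts set to "20250301", "20240301", and None in turn should reproduce the comparison described above.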
Thank you in advance!