[codex] fix youshedubao import side effect #2
Code Review
This pull request updates the youshedubao.py script to handle escaped single quotes in the JSON string and wraps the execution entry point in an if __name__ == "__main__": block. The review feedback points out that using chained .replace() calls to unescape strings is fragile and suggests a more robust approach using codecs.escape_decode.
    uisdc_news = json.loads(
        uisdc_news.replace('\\"', '"').replace("\\\\", "\\").replace("\\'", "'")
    )
Using chained .replace() calls to unescape JS/JSON strings is fragile and can corrupt the data. For example, if the string contains an escaped backslash followed by a single quote (\\'), the second replacement turns the double backslash into a single backslash, and the third replacement then treats the resulting \' as an escaped quote and incorrectly strips the backslash.

A more robust way to unescape all JS/Python-style escape sequences (escaped quotes, backslashes, newlines, etc.) in a single pass is to use codecs.escape_decode.
    - uisdc_news = json.loads(
    -     uisdc_news.replace('\\"', '"').replace("\\\\", "\\").replace("\\'", "'")
    - )
    + import codecs
    + uisdc_news = json.loads(
    +     codecs.escape_decode(uisdc_news.encode("utf-8"))[0].decode("utf-8")
    + )
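To make the ordering problem from the review comment concrete, here is a minimal sketch; it is not the PR's code. It assumes a hypothetical payload in which the page escapes backslashes and double quotes but leaves a single quote bare, so an escaped backslash sits directly in front of a quote. The names js_escape, unescape_chained, and unescape_single_pass are illustrative. The regex-based single pass is swapped in for codecs.escape_decode, which is not part of the documented codecs API, and it handles only the same three escapes as the original chain.

```python
import json
import re

# Matches one escaped backslash, double quote, or single quote.
_ESCAPED = re.compile(r"""\\([\\"'])""")


def unescape_chained(raw: str) -> str:
    # The three independent passes from the diff under review.
    return raw.replace('\\"', '"').replace("\\\\", "\\").replace("\\'", "'")


def unescape_single_pass(raw: str) -> str:
    # Each escape is consumed exactly once, scanning left to right,
    # so the output of one rule is never re-read by another rule.
    return _ESCAPED.sub(lambda m: m.group(1), raw)


def js_escape(text: str) -> str:
    # Hypothetical stand-in for the page: escapes backslashes and double
    # quotes, but leaves single quotes alone.
    return text.replace("\\", "\\\\").replace('"', '\\"')


payload = {"title": "it\\'s"}         # the value is: it\'s
raw = js_escape(json.dumps(payload))  # {\"title\": \"it\\\\'s\"}

# The single pass recovers the original JSON text.
assert json.loads(unescape_single_pass(raw)) == payload

# The chained version leaves an invalid \' escape behind, so parsing fails.
try:
    json.loads(unescape_chained(raw))
    chained_ok = True
except json.JSONDecodeError:
    chained_ok = False
```

For an input like the Sam\'s case from the PR description, both functions return the same text; they diverge in cases like the one above, where an escaped backslash is immediately followed by a bare single quote.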
Summary
Fix the hotToday spider startup failure caused by youshedubao doing network work during module import.
Root Cause
task.py imports get_youshedubao_data, but youshedubao.py also called print(get_youshedubao_data()) at module top level. When the upstream page started returning JS-style escaped single quotes such as Sam\'s, import-time parsing raised JSONDecodeError and stopped the entire hourly spider run before other sources could update.
Changes
- Move the top-level print(get_youshedubao_data()) call under if __name__ == "__main__" so importing the module has no side effects.
Validation
- git diff --cached --check
- docker exec hottoday-spider python -m py_compile /app/youshedubao/youshedubao.py
- docker exec hottoday-spider python -c 'from youshedubao.youshedubao import get_youshedubao_data; data=get_youshedubao_data(); print(len(data["data"]))' returned 6.
- /rank/hot; API returned updated data with latest timestamp 2026-06-08 15:11:38.
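The import-time failure described under Root Cause can be illustrated with a small sketch. It is a hypothetical stand-in, not the real youshedubao.py: the module body is a placeholder that records calls instead of doing network work, and import_from_source and youshedubao_demo are names invented for the demo. It shows that code under if __name__ == "__main__" does not run when the module is imported under another name.

```python
import importlib.util
import pathlib
import tempfile

# Hypothetical stand-in for youshedubao.py; the real function does network I/O.
MODULE_SOURCE = '''
CALLS = []

def get_youshedubao_data():
    CALLS.append("fetch")  # placeholder for the request + JSON parsing
    return {"data": []}

if __name__ == "__main__":
    # Reached when the file is run as a script, skipped when it is imported.
    print(get_youshedubao_data())
'''


def import_from_source(source: str, name: str):
    """Write source to a temp file and import it under the given module name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / f"{name}.py"
        path.write_text(source, encoding="utf-8")
        spec = importlib.util.spec_from_file_location(name, str(path))
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module


module = import_from_source(MODULE_SOURCE, "youshedubao_demo")
calls_after_import = list(module.CALLS)  # nothing was fetched during import
module.get_youshedubao_data()            # the caller decides when to fetch
calls_after_call = list(module.CALLS)
```

With the guard in place, a parsing error can only surface when a caller such as task.py actually invokes the function, not while the import statement is being executed.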