A simple, browser-based tool for exporting Reddit posts and comment threads for research, analysis, and archival. This tool fetches comment data as JSON, formats it hierarchically, and offers multiple output options.
Use the hosted version at: https://dpfu.github.io/reddit-data-tool/
This is a static, client-side application. It uses native browser modules and can be served by GitHub Pages without a build step.
To run it locally:
    npm run dev

Then open http://localhost:8000.
The tool loads a public Reddit comments URL as JSON in the browser by adding /.json?raw_json=1. It does not use OAuth, a registered API client, or a backend server.
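The URL rewrite described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function name `toRedditJsonUrl` is hypothetical, and dropping the original query string and fragment is an assumption of this sketch.

```javascript
// Hypothetical helper: turn a public Reddit comments URL into its JSON form.
function toRedditJsonUrl(input) {
  const url = new URL(input);
  // Drop trailing slashes so the result contains exactly one "/.json".
  const path = url.pathname.replace(/\/+$/, "");
  // Assumption of this sketch: the original query string and fragment are discarded.
  return `${url.origin}${path}/.json?raw_json=1`;
}

console.log(toRedditJsonUrl("https://www.reddit.com/r/test/comments/abc123/example_post/"));
// → https://www.reddit.com/r/test/comments/abc123/example_post/.json?raw_json=1
```

The path is kept as-is, so a focused comment permalink stays focused on that comment. No credentials or server round trip are involved in this step.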
This makes the tool simple and transparent, but it also means:
- only public content that the browser can reach is available;
- large threads may be incomplete when Reddit returns kind: "more" placeholders;
- deleted, private, quarantined, age-gated, or otherwise restricted content may be missing;
- direct browser JSON access can be throttled, blocked, or changed by Reddit.
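To make the first limitation concrete, the sketch below walks a comment listing and counts how many comments were loaded versus how many replies sit behind "more" placeholders. It is an illustration only: `summarizeThread` is a hypothetical name, and the assumed shape (a Listing whose `data.children` holds `t1` comment nodes and `more` nodes carrying a `count`) reflects Reddit's public JSON as commonly observed and may change.

```javascript
// Sketch: walk a Reddit comment Listing and report how complete it is.
// Assumed shape: { kind: "Listing", data: { children: [{ kind: "t1" | "more", data }] } }.
function summarizeThread(listing) {
  const summary = { comments: 0, morePlaceholders: 0, hiddenReplies: 0 };
  const visit = (node) => {
    for (const child of node?.data?.children ?? []) {
      if (child.kind === "more") {
        // A placeholder stands in for replies that were not included in the response.
        summary.morePlaceholders += 1;
        summary.hiddenReplies += child.data?.count ?? 0;
      } else if (child.kind === "t1") {
        summary.comments += 1;
        // `replies` is an empty string when a comment has no replies.
        const replies = child.data?.replies;
        if (replies && typeof replies === "object") visit(replies);
      }
    }
  };
  visit(listing);
  return summary;
}

// Made-up sample data for illustration.
const sample = {
  kind: "Listing",
  data: {
    children: [
      {
        kind: "t1",
        data: {
          id: "c1",
          body: "top-level comment",
          replies: {
            kind: "Listing",
            data: {
              children: [
                { kind: "t1", data: { id: "c2", body: "reply", replies: "" } },
                { kind: "more", data: { count: 4, children: ["c3", "c4", "c5", "c6"] } },
              ],
            },
          },
        },
      },
      { kind: "more", data: { count: 10, children: ["c7"] } },
    ],
  },
};

console.log(summarizeThread(sample));
// → { comments: 2, morePlaceholders: 2, hiddenReplies: 14 }
```

A non-zero `morePlaceholders` value is a signal that an export of the thread is incomplete.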
For systematic or high-volume collection, use an approved Reddit API or data access route, and account for Reddit's deletion and retention requirements.
If live browser requests are blocked, paste a complete Reddit comments JSON response into the manual import field. Pasted JSON is parsed locally in the browser.
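A local parse of pasted JSON might look like the sketch below. It is not the tool's implementation: `parsePastedThread` and its return shape are hypothetical, and the check assumes a comments response is a two-element array of Listings (the post first, the comment tree second).

```javascript
// Hypothetical validator for text pasted into a manual import field.
function parsePastedThread(text) {
  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    return { ok: false, error: `Not valid JSON: ${err.message}` };
  }
  // Assumed shape of a comments response: [postListing, commentListing].
  const isListing = (v) => v?.kind === "Listing" && Array.isArray(v?.data?.children);
  if (!Array.isArray(parsed) || parsed.length < 2 || !parsed.slice(0, 2).every(isListing)) {
    return { ok: false, error: "Expected a two-element array of Reddit Listings." };
  }
  const [postListing, commentListing] = parsed;
  return {
    ok: true,
    post: postListing.data.children[0]?.data ?? null,
    comments: commentListing,
  };
}

// An HTML block page pasted by mistake is rejected instead of throwing.
console.log(parsePastedThread("<html>blocked</html>").ok); // → false

// Made-up minimal response for illustration.
const pasted = JSON.stringify([
  { kind: "Listing", data: { children: [{ kind: "t3", data: { title: "Example" } }] } },
  { kind: "Listing", data: { children: [] } },
]);
console.log(parsePastedThread(pasted).post.title); // → Example
```

Returning a result object rather than throwing keeps the error message easy to show next to the import field.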
Run the unit tests:
    npm test

Run syntax checks:

    npm run check

If you find this software useful in your work, please cite it as follows:
Pfurtscheller, D. (2026). Reddit Data Tool (Version 1.1.3) [Computer software]. https://doi.org/10.5281/zenodo.15024196
- Preserved focused comment permalinks such as /comments/{post}/comment/{comment_id}/.
- Generated Reddit JSON URLs with the /.json?raw_json=1 form.
- Added a local pasted-JSON import fallback for cases where Reddit blocks direct browser requests.
- Added blocked-request guidance that includes the generated .json?raw_json=1 URL.
- Made the example flow use bundled sample data so it still works when Reddit blocks direct JSON requests.
- Improved HTTP 403/429 errors for blocked or throttled Reddit requests.
- Refactored the static app into smaller browser modules.
- Added unit tests and syntax checks.
- Improved the export UI, filtering, sorting, empty state, and responsive layout.
- Reworked the Thread Map into a compact, collapsible overview for large comment structures.
- Added inline guidance about the limits of Reddit's public JSON access.