dpfu/reddit-data-tool

Reddit Data Tool

A simple, browser-based tool for exporting Reddit posts and comment threads for research, analysis, and archival. This tool fetches comment data as JSON, formats it hierarchically, and offers multiple output options.
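As a rough illustration of what "formats it hierarchically" can mean, the sketch below walks a Reddit comment Listing and renders it as indented text. The function names `buildCommentTree` and `formatTree` are hypothetical and are not taken from this repository. The sketch assumes the usual shape of Reddit's public JSON, where comments have kind "t1" and a comment's `replies` field is either a nested Listing or an empty string.

```javascript
// Sketch only: helper names are hypothetical, not the tool's actual API.

// Convert one Reddit "Listing" of comments into a nested array of plain nodes.
function buildCommentTree(listing) {
  // An empty string (no replies) or a malformed value yields an empty tree.
  if (!listing || !listing.data || !Array.isArray(listing.data.children)) {
    return [];
  }
  return listing.data.children
    .filter((child) => child.kind === "t1") // keep comments, skip "more" placeholders
    .map((child) => ({
      author: child.data.author,
      body: child.data.body,
      replies: buildCommentTree(child.data.replies),
    }));
}

// Render the tree as plain text, indenting two spaces per nesting level.
function formatTree(nodes, depth = 0) {
  return nodes
    .map((node) => {
      const line = `${"  ".repeat(depth)}${node.author}: ${node.body}`;
      const children = formatTree(node.replies, depth + 1);
      return children ? `${line}\n${children}` : line;
    })
    .join("\n");
}
```

Calling `formatTree(buildCommentTree(commentListing))` on the second element of a comments response would give one line per comment, with replies indented under their parents.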

Usage

Use the hosted version at: https://dpfu.github.io/reddit-data-tool/

This is a static, client-side application. It uses native browser modules and can be served by GitHub Pages without a build step.

To run it locally:

npm run dev

Then open http://localhost:8000.

Limits

The tool loads a public Reddit comments URL as JSON in the browser by appending /.json?raw_json=1 to it. It does not use OAuth, a registered API client, or a backend server.
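The URL rewrite described above might look like the following sketch. The function name `toRedditJsonUrl` is hypothetical, and two behaviours are assumptions rather than documented features of the tool: dropping any existing query string, and normalising a URL that already ends in .json.

```javascript
// Sketch only: a hypothetical helper, not the tool's actual implementation.
function toRedditJsonUrl(input) {
  const url = new URL(input); // throws a TypeError on malformed input

  // Assumption: strip trailing slashes and an existing ".json" suffix so the
  // suffix is never applied twice.
  const path = url.pathname
    .replace(/\/+$/, "")
    .replace(/\.json$/, "")
    .replace(/\/+$/, "");

  // Assumption: the original query string and fragment are discarded.
  return `${url.origin}${path}/.json?raw_json=1`;
}
```

Under these assumptions, a focused comment permalink keeps its full path, so only the suffix changes.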

Working without OAuth or a backend keeps the tool simple and transparent, but it also means:

  • only public content that the browser can reach is available;
  • large threads may be incomplete when Reddit returns kind: "more" placeholders;
  • deleted, private, quarantined, age-gated, or otherwise restricted content may be missing;
  • direct browser JSON access can be throttled, blocked, or changed by Reddit.
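To judge how incomplete a loaded thread is, one option is to count the kind: "more" placeholders in the response. The sketch below is an illustration, not the tool's code; `countMorePlaceholders` is a hypothetical name. It also assumes that each placeholder carries a numeric `count` field reporting how many comments it stands for, which is how Reddit's public JSON typically behaves but is not guaranteed.

```javascript
// Sketch only: a hypothetical helper for estimating how much of a thread is missing.
function countMorePlaceholders(listing) {
  const totals = { placeholders: 0, hiddenComments: 0 };
  // Replies may be an empty string instead of a Listing; treat that as empty.
  if (!listing || !listing.data || !Array.isArray(listing.data.children)) {
    return totals;
  }
  for (const child of listing.data.children) {
    if (child.kind === "more") {
      totals.placeholders += 1;
      // Assumption: `count` is the number of comments behind this placeholder.
      totals.hiddenComments += Number(child.data && child.data.count) || 0;
    } else if (child.kind === "t1") {
      // Recurse into the replies of a regular comment.
      const nested = countMorePlaceholders(child.data.replies);
      totals.placeholders += nested.placeholders;
      totals.hiddenComments += nested.hiddenComments;
    }
  }
  return totals;
}
```

A non-zero `placeholders` value is a signal that an export does not contain the whole thread and should be labelled accordingly.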

For systematic or high-volume collection, use Reddit's approved API or data access channels, and account for Reddit's deletion and retention requirements.

If live browser requests are blocked, paste a complete Reddit comments JSON response into the manual import field. Pasted JSON is parsed locally in the browser.
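A minimal sketch of how pasted JSON could be validated locally before use is shown below. `parsePastedThread` and its return shape are hypothetical. The check assumes a complete comments response is a two-element array of Listing objects, with the post first and the comments second.

```javascript
// Sketch only: a hypothetical validator for manually pasted Reddit JSON.
function parsePastedThread(text) {
  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    return { ok: false, error: `Not valid JSON: ${err.message}` };
  }

  const isListing = (value) =>
    Boolean(
      value &&
        value.kind === "Listing" &&
        value.data &&
        Array.isArray(value.data.children)
    );

  // Assumption: a comments response is [postListing, commentListing].
  if (!Array.isArray(parsed) || parsed.length < 2 || !parsed.every(isListing)) {
    return { ok: false, error: "Expected a Reddit comments response (an array of two Listings)." };
  }

  const firstChild = parsed[0].data.children[0];
  return {
    ok: true,
    post: firstChild ? firstChild.data : null, // post metadata, if present
    comments: parsed[1], // comment Listing, ready for further processing
  };
}
```

Because the function uses only JSON.parse and plain object checks, nothing in it sends the pasted text over the network.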

Development

Run the unit tests:

npm test

Run syntax checks:

npm run check

Cite

If you find this software useful in your work, please cite it as follows:

Pfurtscheller, D. (2026). Reddit Data Tool (Version 1.1.3) [Computer software]. https://doi.org/10.5281/zenodo.15024196

Release notes

v1.1.3

  • Preserved focused comment permalinks such as /comments/{post}/comment/{comment_id}/.
  • Generated Reddit JSON URLs with the /.json?raw_json=1 form.

v1.1.2

  • Added a local pasted-JSON import fallback for cases where Reddit blocks direct browser requests.
  • Added blocked-request guidance that includes the generated .json?raw_json=1 URL.

v1.1.1

  • Made the example flow use bundled sample data so it still works when Reddit blocks direct JSON requests.
  • Improved HTTP 403/429 errors for blocked or throttled Reddit requests.

v1.1

  • Refactored the static app into smaller browser modules.
  • Added unit tests and syntax checks.
  • Improved the export UI, filtering, sorting, empty state, and responsive layout.
  • Reworked the Thread Map into a compact, collapsible overview for large comment structures.
  • Added inline guidance about the limits of Reddit's public JSON access.
