💡 LEDE : A large-scale benchmark for AI-generated news detection

[AI-generated news construction pipeline]

LEDE is a large-scale benchmark dataset for AI-generated news detection, comprising over 337K articles and approximately 4.3M sentences. It addresses the limitations of existing benchmarks by providing broader generator diversity and news-specific coverage across 21 state-of-the-art LLMs, two languages, and 17 news categories. LEDE is intended to support research on AI-generated text detection, cross-model generalization, multilingual robustness, and domain-aware evaluation.

The dataset repository includes AI-generated news articles spanning multiple prompting strategies and news categories. The full dataset is available in the Hugging Face repository: https://huggingface.co/datasets/NeurIPS-2026-LEDE/LEDE-dataset

💡 Quantitative comparison of LEDE and existing AI-Gen News datasets

Dataset	Venue	Includes News	# News	# LLMs	# Categories	# Languages
M4 [paper]	EACL 2024	✓ (N%)	12,000	2	✗	3
MAGE [paper]	ACL 2024	✓ (N%)	58,391	27	✗	1
M4GT-Bench [paper]	ACL 2024	✓ (N%)	19,100	4	✗	6
RAID [paper]	ACL 2024	✓ (N%)	726,240	11	5	1
DetectRL [paper]	NeurIPS 2024 D&B	✓ (N%)	33,600	4	✗	1
Beemo [paper]	NAACL 2025	✗	--	--	--	--
M-DAIGT [paper]	RANLP 2025 Shared Task	✓ (N%)	7,000	6	✗	2
LEDE	--	✓ (100%)	337,322	21	17	2

💡 Data Description

LEDE is a large-scale multilingual benchmark for AI-generated news detection, designed to support robust evaluation across diverse LLMs, news categories, generation strategies, and languages.

📈 LEDE Dataset Statistics

AI-generated News

# of LLMs : 21
# of Languages : 2 (Eng, Kor)
# of Articles : 337,322
# of Sentences : 4,309,153
# of News Categories : 17
# of Generation Strategies : 4 (sc, ib, ng, we)
# English Sentences : 2,393,518
# Korean Sentences : 1,915,635
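
As a quick consistency check on the figures above, the per-language sentence counts can be summed and compared with the reported total. The snippet below is only a sketch; the two derived quantities (English share and mean sentences per article) are computed here for illustration and are not stated in the source.

```python
# Figures copied from the statistics list above.
articles = 337_322
total_sentences = 4_309_153
english_sentences = 2_393_518
korean_sentences = 1_915_635

# The per-language counts should add up to the reported total.
assert english_sentences + korean_sentences == total_sentences

# Derived figures (not stated in the source): language share and mean article length.
english_share = english_sentences / total_sentences
mean_sentences_per_article = total_sentences / articles
print(f"English share of sentences: {english_share:.1%}")
print(f"Mean sentences per article: {mean_sentences_per_article:.2f}")
```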

📑 Configuration of LEDE Metadata

Field	Description
`human_rid`	Identifier for the original human-written article. • AIHub datasets: uses the official AIHub dataset ID • English datasets: constructed as `{first 4 words}-{last 4 words}` from the original article
`human_fid`	Identifier for the corresponding fake/generated counterpart. • AIHub datasets: uses the official AIHub dataset ID • English datasets: constructed as `{first 4 words}-{last 4 words}` from the original article
`title`	Title of the AI-generated news article
`summary`	Summary of the AI-generated news article
`ai_article`	Full text of the AI-generated news article
`category`	News category/domain of the article (17 categories in total; e.g., politics, health, law, economy, sports)
`model`	Large Language Model (LLM) used for article generation (21 models in total)
`strategy`	Generation strategy used for article creation (sc, ib, ng, we)
`language`	Language of the generated article (Kor or Eng)
`num_sentences`	Number of sentences in the generated article
`num_words`	Number of words in the generated article
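
The fields above can be checked when loading a file. The following is a minimal sketch, assuming the data is distributed as CSV with a header row whose column names match the field names in the table; that layout is an assumption, and `read_records` and the two sample rows are hypothetical, made up purely for illustration.

```python
import csv
import io

# Field names taken from the metadata table above.
EXPECTED_FIELDS = [
    "human_rid", "human_fid", "title", "summary", "ai_article",
    "category", "model", "strategy", "language",
    "num_sentences", "num_words",
]
STRATEGIES = {"sc", "ib", "ng", "we"}


def read_records(handle):
    """Yield one dict per row, checking the header and casting counts to int."""
    reader = csv.DictReader(handle)
    missing = [f for f in EXPECTED_FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for row in reader:
        if row["strategy"] not in STRATEGIES:
            raise ValueError(f"unknown strategy: {row['strategy']!r}")
        row["num_sentences"] = int(row["num_sentences"])
        row["num_words"] = int(row["num_words"])
        yield row


# Hypothetical two-row sample; every value here is invented for illustration.
SAMPLE = io.StringIO(
    "human_rid,human_fid,title,summary,ai_article,category,model,"
    "strategy,language,num_sentences,num_words\n"
    "id-1,id-1,Title A,Summary A,Body A.,politics,model-x,sc,Eng,1,2\n"
    "id-2,id-2,Title B,Summary B,Body B.,sports,model-y,ng,Kor,1,2\n"
)

records = list(read_records(SAMPLE))
english = [r for r in records if r["language"] == "Eng"]
print(len(records), len(english))
```

To read a real file, replace `SAMPLE` with an open file handle, for example `open(path, newline="", encoding="utf-8")`.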

💡 Evaluation

1. Data preparation

1.1. Download LEDE Datasets

To access the LEDE dataset, please visit the following link.

https://huggingface.co/datasets/NeurIPS-2026-LEDE/LEDE-dataset

The LEDE dataset is available under the Creative Commons Attribution-NonCommercial 4.0 International Public License. Any violation of this license agreement may result in legal action. By downloading the LEDE dataset, the user agrees to the terms of the CC BY-NC 4.0 license.

1.2. Download Human-written News Datasets

Please download all of the following datasets and store them in the human-written/ directory.

1.3. Mapping Human-written News

Each human-written article is aligned with its corresponding AI-generated article using the `human_rid` field.

AI-Hub datasets: The original dataset ID is used directly.
English datasets: IDs are constructed in the format `{first 4 words}-{last 4 words}` from the original article.

This mapping enables direct and consistent comparison between human-written and AI-generated texts during evaluation.
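
The ID construction for English datasets could look roughly like the sketch below. The source only gives the `{first 4 words}-{last 4 words}` format; splitting on whitespace and joining the words within each half with single spaces are assumptions, and `make_human_rid` is a hypothetical helper name, not a function from the repository.

```python
def make_human_rid(article: str, n: int = 4) -> str:
    """Build '{first n words}-{last n words}' from an article body.

    Assumptions (not confirmed by the source): words are split on
    whitespace, and words inside each half are joined with single spaces.
    """
    words = article.split()
    first = " ".join(words[:n])
    last = " ".join(words[-n:])
    return f"{first}-{last}"


# Hypothetical human-written article, invented for illustration.
human_text = (
    "The city council voted on Tuesday to approve "
    "a new budget for public transit"
)

# Index human-written articles by their constructed ID.
human_index = {make_human_rid(human_text): human_text}

# A generated row would then be matched through its human_rid value.
ai_row = {"human_rid": "The city council voted-budget for public transit"}
matched = human_index.get(ai_row["human_rid"])
print(matched is not None)
```

If the IDs in the released files use a different tokenization or separator, the helper needs to be adjusted to match before any lookup will succeed.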

2. Baseline Evaluation

Run baseline model evaluation using either a single CSV file or a CSV directory. Below are sample commands for running zero-shot baseline evaluations.

$ git clone https://github.com/DSAIL-SKKU/LEDE.git
$ cd LEDE

2-1. Fast-DetectGPT

Installation

You can follow the official Fast-DetectGPT GitHub repository for installation details.
Python 3.8
PyTorch 1.10.0

Evaluate a CSV Directory

$ cd src/baselines/fast-detect-gpt
$ bash scripts/eval.sh --csv_dir /path/to/csv_dir

Each file prints metrics in the following format:

n_pairs: XXXX
ROC AUC (criterion): 0.XXXX
PR AUC (criterion): 0.XXXX

The aggregated per-file metrics are saved to ./outputs/batch_eval/roc/ by default.
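
For readers interpreting the ROC AUC value, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen AI-generated text receives a higher score than a randomly chosen human-written one, with ties counted as half. This is not the evaluation script's implementation; it assumes a higher criterion means "more likely AI-generated", and the scores are invented for illustration.

```python
def roc_auc(human_scores, ai_scores):
    """ROC AUC from its pairwise definition (ties count as half a win)."""
    if not human_scores or not ai_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for a in ai_scores:
        for h in human_scores:
            if a > h:
                wins += 1.0
            elif a == h:
                wins += 0.5
    return wins / (len(ai_scores) * len(human_scores))


# Hypothetical detector scores, one per text.
human = [0.10, 0.40, 0.35]
ai = [0.80, 0.30, 0.90]
print(f"ROC AUC: {roc_auc(human, ai):.4f}")
```

The double loop is quadratic in the number of texts, so it is meant for understanding the metric on small samples rather than for full-scale evaluation.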

2-2. Binoculars

Installation

You can follow the official Binoculars GitHub repository for installation details.
Python 3.8
PyTorch 1.10.0

Evaluate a Single CSV File

$ cd src/baselines/Binoculars/
$ bash eval.sh --csv_path /path/to/file.csv

Evaluate a CSV Directory

$ cd src/baselines/Binoculars/
$ bash eval.sh --csv_dir /path/to/csv_dir

Each file prints metrics in the following format:

[OK] <file>.csv | n=<rows> (eval=<evaluated_rows>) | ACC=0.XXXX ROC_AUC=0.XXXX PR_AUC=0.XXXX

The aggregated per-file metrics are saved to binoculars_csv_folder_metrics.csv by default.
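
If the console output is captured to a log, the per-file lines can be parsed back into numbers. The sketch below assumes lines follow exactly the format shown above; `parse_line` is a hypothetical helper and the sample line uses invented values.

```python
import re

# Pattern mirroring the documented per-file output line.
LINE_RE = re.compile(
    r"^\[OK\]\s+(?P<file>.+?\.csv)\s+\|\s+n=(?P<n>\d+)\s+"
    r"\(eval=(?P<eval>\d+)\)\s+\|\s+"
    r"ACC=(?P<acc>[\d.]+)\s+ROC_AUC=(?P<roc_auc>[\d.]+)\s+"
    r"PR_AUC=(?P<pr_auc>[\d.]+)$"
)


def parse_line(line):
    """Return a dict of metrics for one output line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "file": m["file"],
        "n": int(m["n"]),
        "eval": int(m["eval"]),
        "acc": float(m["acc"]),
        "roc_auc": float(m["roc_auc"]),
        "pr_auc": float(m["pr_auc"]),
    }


# Hypothetical log line; the file name and metric values are made up.
sample = "[OK] sample.csv | n=100 (eval=98) | ACC=0.9100 ROC_AUC=0.9500 PR_AUC=0.9400"
print(parse_line(sample))
```

Lines that do not match, such as warnings or progress messages, return `None` and can simply be skipped.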

2-3. Additional Models

In addition to the two base models described above, other AI-generated text detection models can be explored through their official GitHub repositories.

Zero-shot Models

Supervised Models

💡 License

The LEDE dataset is available under the Creative Commons Attribution-NonCommercial 4.0 International Public License: https://creativecommons.org/licenses/by-nc/4.0/. The code is released under the MIT license.
