This is the official artifacts repository for the paper "Cryptanalysis of LDPC-Based Pseudorandom Error-Correcting Codes" (accepted by USENIX Security 2026 at Cycle 2 with Paper ID #1773).
In this paper, we analyze the security of pseudorandom error-correcting codes (PRCs), and propose three attacks against their undetectability and robustness properties. Since PRCs are used for watermarking generative content, our attacks can be used to detect and remove those watermarks, revealing the concrete security limits of PRCs in real-world watermarking applications.
This artifact contains the code for the theoretical analysis of our attacks, as well as the attacks against real-world PRC watermarking schemes for Large Language Models (LLMs) and Generative Image Models (GIMs). We also provide example data for reproducing the results in our paper.
- Software: a Linux-based environment with conda, git, and unzip installed.
- Hardware: Generating PRC watermarked content requires a GPU with at least 80GB of VRAM. However, the attacks and result analysis can be reproduced on a CPU-only machine with the provided data artifacts.
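The software requirements above can be checked before setup with a short script. This is a minimal sketch that is not part of the artifacts; the helper name `missing_tools` is hypothetical, and it only tests whether each tool is on `PATH`, not its version.

```python
import shutil

# Tools named in the software requirements above.
REQUIRED_TOOLS = ["conda", "git", "unzip"]


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```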
- Download the artifacts from Zenodo or directly clone the repository from GitHub.
    git clone https://github.com/1234wangtr/PRC_estimator
- Unzip the example data artifacts.
    cd prc-estimator
    unzip gim/data/SD21_t3.zip -d gim/data/SD21_t3_example
    unzip llm/data/Deepseek_t_3_temp_1.8.zip -d llm/data/Deepseek_t_3_temp_1.8_example
    unzip llm/data/Deepseek_t_3_temp_all.zip -d llm/data
- Create the conda environments for llm and gim.
    conda env create -f setup/environment-llm.yaml
    conda env create -f setup/environment-gim.yaml
- Download the publicly available models and datasets to .models:
    - Stable Diffusion 2.1 Base (For GIM main experiments)
    - DeepSeek Qwen2.5 8B (For LLM main experiments)
    - Stable Diffusion 1.5 (For GIM ablation experiments)
    - Stable Diffusion 2 Base (For GIM ablation experiments)
    - Qwen3 8B (For LLM ablation experiments)
    - Stable Diffusion Prompts
  These models and datasets can be downloaded by:
    setup/get_gim.sh # For GIM main experiments
    setup/get_llm.sh # For LLM main experiments
    setup/get_gim_ablation_models.sh # For GIM ablation experiments
    setup/get_llm_ablation_models.sh # For LLM ablation experiments
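After the unzip step, it can help to confirm that the example data landed where the later commands expect it. The following is a minimal sketch, not part of the artifacts: the helper `missing_dirs` is hypothetical, and the directory list is taken from the unzip targets and from the `llm/data/gen_result` path mentioned in this README, so adjust it if you unzip elsewhere.

```python
from pathlib import Path

# Directories this README says exist after unzipping the example data.
EXPECTED_DIRS = [
    "gim/data/SD21_t3_example",
    "llm/data/Deepseek_t_3_temp_1.8_example",
    "llm/data/gen_result",
]


def missing_dirs(repo_root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in expected if not (root / d).is_dir()]


if __name__ == "__main__":
    # Run from the repository root.
    missing = missing_dirs(".")
    print("Missing directories:", missing if missing else "none")
```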
The artifacts are separated into two main folders: llm for the LLM-based experiments and gim for the GIM-based experiments. Each folder is organized as follows:
llm / gim
├── requirements.txt # Python dependencies for the experiments
├── security_estim # Code for estimating the time complexities of our attacks
├── generation # Code for generating PRC watermarked content
├── data # Example watermarked content and necessary data for reproducing the attack results
└── attack # Code for the concrete attacks
The scripts for estimating the complexities of our attacks in Section 5 are located at <gim or llm>/security_estim/main.py.
The estimation can be reproduced by running the following commands:
conda run -n prc-estimator-llm python llm/security_estim/main.py # Figure 4(a), Table 6
conda run -n prc-estimator-gim python gim/security_estim/main.py # Figure 4(b), Table 7
This will reproduce the following results:
- llm/data/security_estim.pdf (Figure 4(a))
- gim/data/security_estim.pdf (Figure 4(b))
- llm/data/security_estim.csv (Table 6)
- gim/data/security_estim.csv (Table 7)
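To compare the generated CSV files against Tables 6 and 7, they can be loaded with Python's standard `csv` module. This is a minimal sketch under the assumption that the files are ordinary comma-separated text with a header row; the column names are not documented here, so the hypothetical helper `read_csv_rows` simply returns whatever header it finds.

```python
import csv
from pathlib import Path


def read_csv_rows(path):
    """Return (header, rows) from a CSV file without assuming column names."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no rows
        rows = list(reader)
    return header, rows


if __name__ == "__main__":
    for path in ("llm/data/security_estim.csv", "gim/data/security_estim.csv"):
        if Path(path).is_file():  # skip files that have not been generated yet
            header, rows = read_csv_rows(path)
            print(path, header, f"{len(rows)} rows")
```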
To generate the watermarked texts, you can run the following scripts:
conda activate prc-estimator-llm
CUDA_VISIBLE_DEVICES=0 python llm/generation/main.py --prc_t 3 --model_name Deepseek --temperature 1.8 --start 0 --end 16 # 64GB VRAM, 30 min/file * 10 files
CUDA_VISIBLE_DEVICES=0 python llm/generation/main.py --prc_t 4 --model_name Deepseek --temperature 1.8 --start 0 --end 16 # 64GB VRAM, 30 min/file * 10 files
This will generate watermarked texts and the associated PRC data in the data/llm/<model_name>_t_<prc_t>_temp_<temperature> directory.
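The output directory name follows the pattern `<model_name>_t_<prc_t>_temp_<temperature>`. The sketch below shows how the placeholders map to the command-line arguments; the helper `run_dir_name` is hypothetical and only illustrates the naming pattern, not how the generation script builds its paths.

```python
def run_dir_name(model_name, prc_t, temperature):
    """Build the leaf directory name <model_name>_t_<prc_t>_temp_<temperature>."""
    return f"{model_name}_t_{prc_t}_temp_{temperature}"


# Matches the example data directory used by the attack scripts.
print(run_dir_name("Deepseek", 3, 1.8))  # Deepseek_t_3_temp_1.8
```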
The implementations of Attack I and Attack II are provided in llm/attack.
To run the attacks and reproduce Table 3, you can run the following scripts:
conda activate prc-estimator-llm
python llm/attack/attack1_2_main.py llm/data/Deepseek_t_3_temp_1.8
This will output the corresponding data for Table 3 in the terminal.
The LLM-generated watermarked and non-watermarked content under different temperatures is unzipped to llm/data/gen_result.
To reproduce Figure 5, you can run the following command:
conda run -n prc-estimator-llm \
python llm/generation/plot_entropy.py
This will generate the following figure:
- llm/data/avg_entropy_vs_correct_rate.pdf (Figure 5)
To reproduce Table 9, you can manually check the following JSON files:
- llm/data/gen_result/temperature_1.0/1748357173909375233.json
- llm/data/gen_result/temperature_1.2/1748402150078932357.json
- llm/data/gen_result/temperature_1.4/1748402173162096394.json
- llm/data/gen_result/temperature_1.6/1748402254743413050.json
- llm/data/gen_result/temperature_1.8/1748402215940443270.json
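A short script makes it easier to get an overview of each JSON file before reading it in full. This is a minimal sketch: the schema of these files is not described here, so the hypothetical helper `summarize_json` only reports the top-level type, size, and (for objects) the first few keys.

```python
import json


def summarize_json(path, max_keys=20):
    """Report the top-level structure of a JSON file without assuming a schema."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, dict):
        # JSON object keys are always strings, so they can be sorted safely.
        return {"type": "dict", "size": len(data), "keys": sorted(data)[:max_keys]}
    if isinstance(data, list):
        return {"type": "list", "size": len(data)}
    return {"type": type(data).__name__}
```

For example, `summarize_json("llm/data/gen_result/temperature_1.0/1748357173909375233.json")` lists the top-level keys of the first file, which can then be inspected by hand.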
To generate the watermarked images, you can run the following scripts:
conda activate prc-estimator-gim
CUDA_VISIBLE_DEVICES=0 python gim/generation/main.py --gen_model_id SD21 --inv_model_ids SD21 --prc_t 3 --start 0 --end 10 # 16GB VRAM, 15 min/file * 10 files
CUDA_VISIBLE_DEVICES=0 python gim/generation/main.py --gen_model_id SD21 --inv_model_ids SD15,SD2,SD21 --prc_t 4 --start 0 --end 10 # 16GB VRAM, 15 min/file * 10 files
This will generate watermarked images and the associated PRC data in the gim/data/<model_id>_t<prc_t> directory.
The implementations of Attack I and Attack II are provided in gim/attack.
To run the attacks and reproduce Table 4, you can run the following scripts:
conda activate prc-estimator-gim
python gim/attack/attack1_2_main.py gim/data/SD21_t3
This will output the corresponding data for Table 4 in the terminal.
To reproduce Table 8, you can run the following scripts:
conda activate prc-estimator-gim
python gim/attack/attack1_2_diff_inv_main.py
This will output the corresponding data for Table 8 in the terminal.
Since Attack III requires significantly more computational resources, we directly extract the intermediate results of PRC and only test whether Attack III degrades the image quality. The data extraction and watermark removal can be reproduced by running the following scripts:
conda activate prc-estimator-gim
CUDA_VISIBLE_DEVICES=0 python gim/generation/main-for-attack3.py --prc_t 3 --model_id SD21 --start 0 --end 10 # Extract the intermediate data for Attack III, 16GB VRAM, 2 min for 10 files
CUDA_VISIBLE_DEVICES=0 python gim/attack/attack3_main.py gim/data/attack3_SD21_t3 --start 0 --end 10 --model_id SD21 --eps 16 # Remove the watermark using results from Attack III, 80GB VRAM, 6 min/file * 10 files
python gim/attack/attack3_stati.py gim/data/attack3_SD21_t3/inv_lat_16.0 # Calculate the attack statistics
This will output the success rate of watermark removal under different image quality budgets.
The code for generating PRC watermarked images is adapted from https://github.com/XuandongZhao/PRC-Watermark.
@inproceedings{WWCRLW26,
author = {Tianrui Wang and Anyu Wang and Tianshuo Cong and Delong Ran and Jinyuan Liu and Xiaoyun Wang},
title = {Cryptanalysis of LDPC-Based Pseudorandom Error-Correcting Codes},
booktitle = {USENIX Security},
year = {2026}
}