فرّاس — Arabic AI Text Detector

An open-source tool for detecting AI-generated Arabic text

Live Demo  |  Model on HuggingFace  |  Research Report (v1)


Overview

Farras (فرّاس) is a detection system for identifying AI-generated Arabic text. The project went through two major iterations, each informed by the limitations of the previous one.

Version | Base Model | Training Data | Accuracy | Status
v1 | AraBERTv2 + XGBoost + N-grams | Custom Gemini-only (12,796 samples) | 93.16% (internal), 86% (external) | Archived
v2 | XLM-RoBERTa | KFUPM-JRCAI multi-generator (28,098 samples) | 97.27% | Deployed

The Journey

v1: Hybrid Ensemble (Dec 2025)

The first version explored whether stylistic and structural cues could outperform deep transformers for Arabic AI detection. Five approaches were compared:

  • Naive Bayes baseline (55.7% accuracy)
  • Character N-grams with logistic regression (87.0%)
  • Farasa morphological features with XGBoost (81.2%)
  • AraBERTv2 fine-tuning (85.2%)
  • Hybrid ensemble combining N-grams + linguistic features (93.2%)

Key finding: the hybrid model that combined surface-level patterns with linguistic features outperformed the deep transformer. The full analysis is in the research report.
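
To make the character N-gram approach concrete, here is a minimal sketch of how such features can be counted. It is an illustration only: the helper name char_ngrams, the choice of n = 3, and the sample string are assumptions, and the v1 code in hybrid_model/ may extract and weight its features differently.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in text (hypothetical helper)."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Slide a window of width n over the string; strings shorter than n
    # produce an empty counter.
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Print the three most frequent trigrams of a short Arabic phrase.
counts = char_ngrams("الذكاء الاصطناعي", n=3)
for gram, count in counts.most_common(3):
    print(repr(gram), count)
```

Counts like these would typically be turned into a feature vector and passed to a linear classifier such as logistic regression.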

Limitations identified:

  • Dataset was Gemini-only — the model had never seen GPT-4, Llama, or Jais outputs
  • AraBERT's aggressive Arabic normalization (diacritics removal, alef normalization) destroyed detection signals
  • External evaluation dropped to 86% accuracy, confirming poor generalization
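
The normalization issue can be illustrated with a small sketch. The code below is not AraBERT's actual preprocessor; it is a simplified stand-in, assuming that "diacritics removal" means stripping the tashkeel marks and "alef normalization" means mapping hamza and madda alef forms to a bare alef. The function name normalize_arabic is hypothetical.

```python
import re

# Tashkeel marks U+064B..U+0652 plus the superscript alef U+0670.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
# Alef with madda (U+0622), hamza above (U+0623), hamza below (U+0625).
ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")


def normalize_arabic(text: str) -> str:
    """Simplified normalization (assumption, not the AraBERT pipeline)."""
    text = DIACRITICS.sub("", text)
    return ALEF_VARIANTS.sub("\u0627", text)


# A fully vocalized word (alef-hamza-below, kasra, noon, shadda, fatha)
# and an unvocalized spelling with a bare alef.
VOCALIZED = "\u0625\u0650\u0646\u0651\u064E"
PLAIN = "\u0627\u0646"

# The two spellings differ before normalization but are identical after it,
# so any signal carried by the writer's choice of spelling is lost.
print(VOCALIZED == PLAIN)
print(normalize_arabic(VOCALIZED) == normalize_arabic(PLAIN))
```

If human and machine authors differ in how often they use diacritics or hamza forms, a classifier fed normalized text can no longer see that difference.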

v2: XLM-RoBERTa (Feb 2026)

Informed by the AraGenEval 2025 shared task results and the v1 limitations, the second version made three key changes:

  1. Switched to XLM-RoBERTa — the AraGenEval findings showed xlm-roberta-base outperforms AraBERT for Arabic AI detection (F1=0.770 vs ~0.618)
  2. Multi-generator training data — used KFUPM-JRCAI datasets covering 4 generators (ALLaM, Jais, Llama 3.1, GPT-4) across 2 domains (academic abstracts + social media)
  3. No Arabic normalization — following the BUSTED team's finding that text normalization destroys stylistic cues that differentiate AI from human writing

Results:

Class | Precision | Recall | F1
Human | 0.93 | 0.94 | 0.93
AI | 0.98 | 0.98 | 0.98

Overall accuracy: 97.27%
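
As a sanity check on the table, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded precision and recall values reported above; the function name f1_score is illustrative and not part of the repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded (precision, recall) pairs taken from the results table.
reported = {"Human": (0.93, 0.94), "AI": (0.98, 0.98)}

for label, (p, r) in reported.items():
    print(f"{label}: F1 = {f1_score(p, r):.2f}")
```

Both recomputed values round to the F1 scores listed in the table.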

Architecture

farras.app (Next.js on Vercel)
    ↓ API calls via @gradio/client
HuggingFace Space (Gradio backend)
    ↓ loads model at startup
HuggingFace Hub (XLM-RoBERTa weights, 1.1GB)

Repository Structure

farras/
├── v1-hybrid/                        # First iteration (archived)
│   ├── Arabic_AI_Text_Detection_Report.pdf   # Full research report
│   ├── app.py                        # Gradio app (ensemble backend)
│   ├── hybrid_model/                 # N-gram + linguistic feature code
│   └── finetuned_model/              # AraBERTv2 fine-tuning notebook
│
├── v2-xlmr/                          # Current deployed version
│   ├── app.py                        # Gradio app (XLM-RoBERTa backend)
│   ├── train_xlmr.py                 # Training script
│   └── requirements.txt
│
└── web/                              # Landing page (Next.js)

Quick Start

Run the detector locally

cd v2-xlmr
pip install -r requirements.txt
# Download model from HuggingFace Hub
python -c "from transformers import AutoModel, AutoTokenizer; AutoTokenizer.from_pretrained('Rashidbm/farras-xlmr-arabic-ai-detector'); AutoModel.from_pretrained('Rashidbm/farras-xlmr-arabic-ai-detector')"
python app.py
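
For scripted use without the Gradio app, the model can probably be called directly through transformers. The following is a sketch under assumptions: it presumes the Hub checkpoint includes a sequence-classification head and that its config carries label names in id2label. The function names and the 512-token limit are illustrative, and app.py may preprocess input differently.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(text, model_id="Rashidbm/farras-xlmr-arabic-ai-detector"):
    """Return {label: probability} for one text (hypothetical helper)."""
    # Imported lazily so softmax() can be used without torch installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # The text is passed through unchanged: no Arabic normalization.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    probs = softmax(logits)
    # Label names come from the checkpoint config (assumed to be set).
    return {model.config.id2label[i]: p for i, p in enumerate(probs)}


# Example call (downloads the weights on first use):
#   print(classify("ضع هنا النص العربي المراد فحصه"))
```

Loading the tokenizer and model inside classify keeps the sketch short; a long-running service would load them once at startup, as the architecture diagram describes for the HuggingFace Space.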

Train from scratch

cd v2-xlmr
python train_xlmr.py

Requires the KFUPM-JRCAI datasets in Datasets/KFUPM-JRCAI/.

Known Limitations

  • Struggles with short texts (<50 words) — training data averages 110-879 words per sample
  • Optimized for Modern Standard Arabic and common dialects; may underperform on less common dialects
  • Detection accuracy may degrade as LLMs improve their Arabic generation
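
Given the short-text limitation, a caller may want to flag inputs below the 50-word mark before trusting a prediction. The guard below is a sketch, not part of the repository: the names MIN_WORDS, word_count, and is_too_short are hypothetical, and whitespace splitting is only a rough approximation of Arabic word boundaries.

```python
MIN_WORDS = 50  # threshold taken from the limitation listed above


def word_count(text: str) -> int:
    """Approximate word count using whitespace splitting."""
    return len(text.split())


def is_too_short(text: str, min_words: int = MIN_WORDS) -> bool:
    """True when the text falls below the length the model handles well."""
    return word_count(text) < min_words


sample = "مرحبا بالعالم"
if is_too_short(sample):
    print(f"Only {word_count(sample)} words: treat the prediction with caution.")
```

A front end could use a check like this to show a warning instead of a confident label.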

Authors

  • Rashid Binkulaib
  • Mohammed Alomar
  • Nawaf Alwazrah

License

MIT
