ameermasood/FigLangUnderstanding

Figurative Language Understanding via Mixture-of-Adapters and Tensor-of-Cues

This repository contains the end-to-end experimental pipeline for Figurative Language Understanding based on the BESSTIE dataset.

Our work replicates standard encoder and decoder baselines and implements two extension architectures: Mixture-of-Adapters (MoA) and Variety-Aware Tensor-of-Cues (VA-ToC).


Problem Context & Objectives

Figurative language understanding (here, sarcasm detection and sentiment analysis) remains a major challenge for NLP models because it depends on implicit meaning and context. In national varieties of English (e.g., Australian en-AU, Indian en-IN, British en-UK), the expression of sarcasm and sentiment is heavily influenced by regional pragmatics, slang, and cultural cues. Models trained on general datasets often lose substantial performance when evaluated across these regional boundaries.

The key objectives are:

  1. Benchmark standard models (BERT, RoBERTa, Mistral) under variety and domain shifts.
  2. Measure generalization drops across regional English dialects (AU, IN, UK) and platforms (Google Reviews, Reddit).
  3. Develop targeted model extensions (Mixture-of-Adapters and Tensor-of-Cues) to improve model robustness.

Proposed Architecture

Our project introduces two extension frameworks targeting task-specific constraints and geographical varieties (en-AU, en-IN, en-UK):

  1. Mixture-of-Adapters (MoA): Adds low-rank adapter modules for each target task at the transformer-layer level and uses a routing/gating network to weight the adapter representations based on the text embeddings.
  2. Variety-Aware Tensor-of-Cues (VA-ToC): A structured prompting method for decoder models (Mistral) where a variety-routing classifier (trained using XLM-RoBERTa) predicts the national variety of the text. This prediction is mapped to hyperbolic variety embeddings and injected into instruction prompts to guide the language model's predictions.
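
To make the MoA idea concrete, here is a minimal, framework-free sketch of a gated mixture of low-rank adapters applied to a single hidden vector. It is an illustration under stated assumptions, not the code in src/figlang/: the class names, the ReLU bottleneck, the softmax gate, the residual connection, and the initialization scale are all hypothetical choices.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class Adapter:
    """Low-rank bottleneck: hidden -> rank -> hidden (hypothetical design)."""

    def __init__(self, hidden, rank, rng):
        self.down = [[rng.gauss(0, 0.02) for _ in range(hidden)] for _ in range(rank)]
        self.up = [[rng.gauss(0, 0.02) for _ in range(rank)] for _ in range(hidden)]

    def __call__(self, h):
        z = [max(0.0, v) for v in matvec(self.down, h)]  # ReLU in the bottleneck
        return matvec(self.up, z)


class MixtureOfAdapters:
    """Weights adapter outputs with a softmax gate computed from the input."""

    def __init__(self, hidden, rank, num_adapters, seed=0):
        rng = random.Random(seed)
        self.adapters = [Adapter(hidden, rank, rng) for _ in range(num_adapters)]
        # One gate row per adapter: gate logits are a linear map of the input.
        self.gate = [[rng.gauss(0, 0.02) for _ in range(hidden)] for _ in range(num_adapters)]

    def __call__(self, h):
        weights = softmax(matvec(self.gate, h))
        outputs = [adapter(h) for adapter in self.adapters]
        mixed = [
            sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(len(h))
        ]
        # Residual connection: the mixed adapter update is added to the input.
        return [x + d for x, d in zip(h, mixed)], weights


# Two adapters, assumed here to correspond to the sentiment and sarcasm tasks.
moa = MixtureOfAdapters(hidden=8, rank=2, num_adapters=2)
updated, gate_weights = moa([0.1] * 8)
```

The gate weights sum to one, so each adapter contributes a convex share of the update. In a real model the same computation would run per token or per pooled sentence embedding inside each transformer layer.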

Key Performance Results

Our proposed architectures show performance improvements over standard fine-tuned baselines, particularly under variety and domain shifts:

  • Zero-Shot Sarcasm Generalization (+89.6% Relative Gain): The Variety-Aware Tensor-of-Cues (VA-ToC) strategy raises zero-shot sarcasm Macro-F1 on the Mistral-7B model from 0.20 to 0.38. This suggests that structured pragmatic cues and hyperbolic variety embeddings help guide decoder predictions under zero-shot transfer.
  • Geographical Variety Robustness (+33.7% Relative Gain): In cross-variety sarcasm transfer (e.g., training on Australian English and testing on British/Indian English), the Mixture-of-Adapters (MoA) model raises RoBERTa's Macro-F1 from 0.49 to 0.65. This suggests that conditional adapter routing helps limit representation collapse under regional shifts.
  • In-Variety Adaptation (+24.0% Relative Gain): Within matched variety partitions, VA-ToC raises sarcasm Macro-F1 on Mistral from 0.54 to 0.67 and sentiment classification from 0.81 to 0.92, indicating effective local adaptation.
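
The headline percentages read as gains relative to the baseline Macro-F1. The sketch below shows that calculation on the rounded sarcasm scores quoted above; the formula and the function name are our assumptions, since the README does not state how the percentages were derived. Because the quoted scores are rounded to two decimals, recomputed values can differ slightly from the headline figures, which were presumably computed from unrounded scores.

```python
def relative_gain(baseline, improved):
    """Percentage improvement of `improved` over `baseline` (assumed formula)."""
    return (improved - baseline) / baseline * 100.0


# Rounded sarcasm Macro-F1 pairs (baseline, extension) quoted in the list above.
reported = {
    "VA-ToC, zero-shot sarcasm (Mistral-7B)": (0.20, 0.38),
    "MoA, cross-variety sarcasm (RoBERTa)": (0.49, 0.65),
    "VA-ToC, in-variety sarcasm (Mistral)": (0.54, 0.67),
}

for name, (baseline, improved) in reported.items():
    gain = relative_gain(baseline, improved)
    print(f"{name}: {baseline:.2f} -> {improved:.2f} ({gain:+.1f}%)")
```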

Tasks & Evaluation Protocol

The pipeline evaluates two binary classification tasks:

  • sentiment (positive vs. negative)
  • sarcasm (sarcastic vs. non-sarcastic)

We evaluate under the following experimental setups:

  • In-domain: Trained and tested within the same source domain (Google Reviews or Reddit).
  • Cross-domain: Trained on Google Reviews, tested on Reddit (and vice versa) for sentiment. Sarcasm follows the Reddit-only benchmark setup.
  • FULL: Trained on the combined training pool and evaluated on the full validation set.
  • Cross-variety matrix: Trained on one national variety (e.g., en-AU) and tested on each national variety (en-AU, en-IN, en-UK) to measure geographical shift.
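
The cross-variety matrix can be pictured as a 3 x 3 grid of (train variety, test variety) cells. The following sketch enumerates those cells and fills them with a caller-supplied scoring function; `build_matrix` and `dummy_score` are hypothetical names, and the placeholder scorer stands in for an actual train-and-evaluate run.

```python
from itertools import product

VARIETIES = ["en-AU", "en-IN", "en-UK"]


def cross_variety_pairs(varieties=VARIETIES):
    """All (train, test) combinations, including the matched diagonal."""
    return list(product(varieties, repeat=2))


def build_matrix(score_fn, varieties=VARIETIES):
    """Nested dict: matrix[train][test] = score returned by score_fn."""
    return {
        train: {test: score_fn(train, test) for test in varieties}
        for train in varieties
    }


def dummy_score(train, test):
    # Placeholder only: a real run would train on `train` and evaluate on `test`.
    return 1.0 if train == test else 0.0


matrix = build_matrix(dummy_score)
for train, row in matrix.items():
    print(train, row)
```

The diagonal cells correspond to matched-variety evaluation and the off-diagonal cells to geographical shift.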

All models report: Macro-F1, Accuracy, Precision, Recall, and per-variety breakdown tables.
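
For reference, the reported metrics can be computed from label lists as sketched below. This is a plain-Python illustration rather than the project's evaluation code; we assume precision and recall are macro-averaged like F1, and the function names are hypothetical.

```python
def binary_metrics(y_true, y_pred, labels=(0, 1)):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return {
        "accuracy": correct / len(y_true) if y_true else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,  # unweighted mean of per-class F1
    }


def per_variety_macro_f1(y_true, y_pred, varieties):
    """Macro-F1 computed separately for each variety tag (e.g. 'en-AU')."""
    groups = {}
    for t, p, v in zip(y_true, y_pred, varieties):
        truths, preds = groups.setdefault(v, ([], []))
        truths.append(t)
        preds.append(p)
    return {v: binary_metrics(ts, ps)["macro_f1"] for v, (ts, ps) in groups.items()}
```

Macro-F1 averages the per-class F1 scores without weighting by class frequency, so the minority class counts as much as the majority class.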


Repository Layout

  • data/: Raw source datasets and standardized processed outputs.
  • src/figlang/: Reusable Python package modules (data loading, remapping, preprocessing, training loops, visualization).
  • notebooks/: Phase-based Jupyter Notebooks used for remote Google Colab execution.
  • models/: Saved metrics, predictions, plots, and training checkpoints.
  • docs/: Project report PDF and paper documentation.
  • scripts/: Standalone Python execution scripts wrapping the package logic.

Installation & Setup

To install the project dependencies and the figlang package in editable mode locally:

# Clone the repository
git clone https://github.com/ameermasood/FigLangUnderstanding.git
cd FigLangUnderstanding

# Install package and requirements
pip install -e .

Execution Flows

The pipeline can be run in two modes: script-wise (local command-line) and notebook-wise (Google Colab).

Option A: Script-wise Run Sequence (Local CLI)

We provide a unified CLI tool flu to run pipeline stages locally. Execute these commands in order:

  1. Run sanity checks (validates notebook formatting, compiles code, and runs tests):
    flu check
  2. Run preprocessing (reads raw data, applies schemas, and splits train/validation index files):
    flu preprocess --overwrite
  3. Inspect generated indexes (validates local paths and remapped Colab configs):
    flu inspect
  4. Run baseline training scripts:
    # BERT Baseline
    flu train-bert --run-source
    
    # RoBERTa Baseline
    flu train-roberta --run-source
    
    # Mistral Baseline (requires GPU / unsloth environment)
    flu train-mistral --run-source
  5. Run extension training scripts:
    • Extension 1 (RoBERTa MoA):
      flu train-roberta-moa --run-source
    • Extension 2 (Mistral Hyperbolic-ToC): First, train the prerequisite variety routing classifier:
      flu train-variety-router --run-source
      Then run the Mistral Hyperbolic prompt training loop:
      flu train-mistral-toc --run-source
  6. Evaluate and compile metrics:
    # Compare metric files side-by-side
    flu compare --best
    flu compare --deltas
    
    # Plot Macro-F1 heatmaps
    flu plot --task sentiment --out models/heat_sentiment.png
    flu plot --task sarcasm --out models/heat_sarcasm.png
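
The sequence above can also be driven from a small Python wrapper that runs each stage in order and stops at the first failure. This is a convenience sketch, not part of the repository: `run_pipeline` is a hypothetical helper, and it only shells out to the `flu` commands listed above, defaulting to a dry run that prints them.

```python
import subprocess

# Commands copied from the run sequence above, in execution order.
STAGES = [
    ["flu", "check"],
    ["flu", "preprocess", "--overwrite"],
    ["flu", "inspect"],
    ["flu", "train-bert", "--run-source"],
    ["flu", "train-roberta", "--run-source"],
    ["flu", "train-mistral", "--run-source"],
    ["flu", "train-roberta-moa", "--run-source"],
    ["flu", "train-variety-router", "--run-source"],  # prerequisite for the ToC run
    ["flu", "train-mistral-toc", "--run-source"],
    ["flu", "compare", "--best"],
    ["flu", "compare", "--deltas"],
    ["flu", "plot", "--task", "sentiment", "--out", "models/heat_sentiment.png"],
    ["flu", "plot", "--task", "sarcasm", "--out", "models/heat_sarcasm.png"],
]


def run_pipeline(stages=STAGES, dry_run=True):
    """Print each stage; execute it unless dry_run is set.

    With dry_run=False, check=True makes subprocess raise CalledProcessError
    on a non-zero exit code, so later stages never run after a failure.
    """
    executed = []
    for cmd in stages:
        print("$", " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
        executed.append(cmd)
    return executed


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Call `run_pipeline(dry_run=False)` to execute the stages for real. The Mistral stages still need the GPU environment noted above.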

Option B: Notebook-wise Run Sequence (Jupyter / Google Colab)

If running on Colab, notebooks assume a configurable project root BASE = Path("/content/drive/MyDrive/DNLP") and mount Google Drive automatically. Run the notebooks in the following order:

  1. Phase P0 (Data Collection & Exploration):
    • notebooks/P0_01_data_collection.ipynb
    • notebooks/P0_02_data_exploration.ipynb
  2. Phase P1 (Data Preprocessing):
    • notebooks/P1_01_data_preprocessing.ipynb
  3. Phase P2 (Baselines):
    • notebooks/P2_01_baseline_bert.ipynb
    • notebooks/P2_02_baseline_roberta.ipynb
    • notebooks/P2_03_baseline_mistral.ipynb
  4. Phase P3 (Extensions):
    • Extension 1 (RoBERTa MoA): Run notebooks/P3_01_extension_roberta_mixture_of_adapters.ipynb.
    • Extension 2 (Mistral ToC): Run the prerequisite notebooks/P3_02_extension_mistral_variety_classifier.ipynb first, followed by the main notebook notebooks/P3_03_extension_mistral_hyperbolic_toc.ipynb.
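
Since the notebooks rely on a configurable BASE project root, one way to make the same cell work both on Colab and locally is sketched below. The `FLU_BASE` environment variable and both helper functions are hypothetical and do not come from the notebooks. The only external call is `google.colab.drive.mount`, which is available inside Colab runtimes.

```python
import os
from pathlib import Path

# Default project root used on Colab, as stated above.
DEFAULT_BASE = Path("/content/drive/MyDrive/DNLP")


def resolve_base(env_var="FLU_BASE"):
    """Return the project root, allowing an override via an environment variable."""
    override = os.environ.get(env_var)
    return Path(override) if override else DEFAULT_BASE


def mount_drive_if_colab(mount_point="/content/drive"):
    """Mount Google Drive when running on Colab; do nothing elsewhere."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)
    return True


BASE = resolve_base()
```

Outside Colab, setting `FLU_BASE` to a local checkout path would let the same notebook cells resolve data and model directories without editing the code.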

Authors

This project was developed by:


Acknowledgments

This project was developed as part of the Deep Natural Language Processing (DNLP) curriculum at Politecnico di Torino. We would like to express our gratitude to Professor Luca Cagliero and teaching assistants Ali Yassine and Giuseppe Gallipoli for their guidance, valuable feedback, and support throughout the project.
