KietzmannLab/DVD

[Figure: DVD pipeline overview]

Adopting a human developmental visual diet (DVD) yields robust, shape-based AI vision

A plug-and-play training curriculum that progressively “ages” each image, mimicking the evolving visual sensitivities of the human visual system, from immature newborn vision through to mature adulthood.


1 Why Developmental Visual Diet (DVD)?

  • Human vision begins with severely immature inputs, whereas current ANNs mainly train on instantaneous, high-fidelity inputs.
  • We model developmental trajectories of visual acuity, contrast sensitivity, and chromatic sensitivity from birth to 25 years.
  • A differentiable pipeline dynamically applies these sensory maturation curves on the fly during training.
  • Guiding AI vision through this Developmental Visual Diet produces models that better align with hallmark features of human visual robustness:
    1. Near-human-level shape bias
    2. Enhanced robustness to corruptions
    3. Improved resilience to adversarial attacks
    4. Abstract shape recognition beyond state-of-the-art vision-language models (e.g., ChatGPT-4o, Gemini 2.0 Flash, LLaMA-4-Scout)
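The overall shape of such maturation curves can be conveyed with a toy example: treat the acuity limit as a blur strength that decays exponentially with virtual age. The function name and the constants `sigma0` and `tau` below are illustrative placeholders, not values from the DVD paper:

```python
import numpy as np

def acuity_sigma(months: float, sigma0: float = 4.0, tau: float = 36.0) -> float:
    """Toy acuity limit: a blur strength (in px) that is large at birth
    and decays toward sharp adult vision. Constants are illustrative."""
    return sigma0 * np.exp(-months / tau)

for m in [1, 4, 16, 64, 256]:  # the same virtual ages used in the quick demo
    print(f"{m:3d} mo -> blur sigma ~ {acuity_sigma(m):.2f} px")
```

DVD itself models acuity, contrast sensitivity, and chromatic sensitivity with empirically grounded trajectories; this sketch only illustrates the general form of an age-dependent sensitivity curve.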

📈 Developmental visual trajectories

[Figure: Age-dependent visual development curves]

Note

For new projects, we recommend starting with the scale-free version in the scale_free branch.
It reformulates the DVD transformations, making behaviour more consistent across resolutions and different conditions.
The current branch remains useful for reproducing the original setup and results reported in the paper.


2 Installation

Typical install time on a standard desktop computer: ~1–3 minutes.

git clone https://github.com/KietzmannLab/DVD.git
cd DVD
pip install -e .
python - <<'PY'
import dvd, torch
print("DVD version:", dvd.__version__, "| CUDA =", torch.cuda.is_available())
PY

3 Quick demo - aging visual experience

Typical running time: ~1 minute on a single H100 GPU.

from pathlib import Path
from typing import List
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
import torch
from dvd.dvd.development import DVDTransformer, DVDConfig # DVD Data Transformer (main API)

# ------------------------------------------------------------
# Configuration
# ------------------------------------------------------------
AGES: List[int] = [1, 4, 16, 64, 256]   # ages in months
IMG_SIZE: int = 256                     # resize target (px)
CFG = DVDConfig()
# Note: if your input images are not normalized to [0, 1], consider setting by_percentile=True
# in DVDConfig() to use percentile-based thresholding, which adapts to the image's actual
# intensity distribution.

# Input / output paths
ASSETS_DIR = Path("assets/example_stimuli")
IMAGE_PATHS = [
    ASSETS_DIR / "example_1.jpeg",
    ASSETS_DIR / "example_2.jpeg",
]
OUT_DIR = Path("results/dvd_demo_output")
OUT_PATH = OUT_DIR / "dvd_demo_output.pdf"

# Helper: load an image as [1, 3, H, W] float tensor in [0, 1]
def load_tensor(fp: Path) -> torch.Tensor:
    img = Image.open(fp).convert("RGB")
    img.thumbnail((IMG_SIZE, IMG_SIZE), Image.LANCZOS)
    arr = np.asarray(img).transpose(2, 0, 1) / 255.0
    return torch.from_numpy(arr).unsqueeze(0).float()

# Main: build demo
def make_demo(paths: List[Path], outfile: Path) -> None:
    dvdt = DVDTransformer(CFG)
    tensors = [load_tensor(p) for p in paths]

    rows, cols = len(tensors), len(AGES)
    fig, ax = plt.subplots(rows, cols, figsize=(3 * cols, 3 * rows))
    ax = np.atleast_2d(ax)  # keep 2-D indexing even when only one image is given

    for r, img_t in enumerate(tensors):
        for c, age in enumerate(AGES):
            out = dvdt(img_t.clone(), months=age)              # DVD data transformation
            vis = out.squeeze(0).permute(1, 2, 0).numpy().clip(0, 1)
            ax[r, c].imshow(vis)
            ax[r, c].axis("off")
            if r == 0:
                ax[r, c].set_title(f"{age} mo", fontsize=12)

    fig.tight_layout()
    outfile.parent.mkdir(parents=True, exist_ok=True)
    fig.savefig(outfile, dpi=300)
    print(f"Saved {outfile.resolve()}")


make_demo(IMAGE_PATHS, OUT_PATH)

4 Datasets

This project makes use of several datasets:

| Dataset  | Description                                                  | Link             |
| -------- | ------------------------------------------------------------ | ---------------- |
| Ecoset   | A natural image dataset introduced in Mehrer et al., 2021     | Ecoset Website   |
| ImageNet | Our models were also trained on the initial release of ImageNet | ImageNet Website |

5 Trained checkpoints

We provide several trained checkpoints for direct use.

| Training data | Model variants                 | Notes                                                                                                    | Link                       |
| ------------- | ------------------------------ | -------------------------------------------------------------------------------------------------------- | -------------------------- |
| Ecoset        | Baseline, DVD-P, DVD-B, DVD-S  | Default models trained on Ecoset. We recommend DVD-B as the default starting point.                       | OSF checkpoint files       |
| Ecoset        | Scale-free DVD-B               | Shows similar shape bias and higher accuracy than the paper version of DVD-B; see the scale_free branch.  | OSF scale-free checkpoint  |
| ImageNet-1K   | Baseline, DVD-P, DVD-B, DVD-S  | Models trained on ImageNet-1K (without hyperparameter sweeping).                                          | OSF ImageNet-1K checkpoints |

Important

All checkpoints released here were trained with the same data augmentation pipeline, making comparisons within this repository straightforward.

For other publicly available models, training-time data augmentations may differ substantially, so results are not necessarily directly comparable.

6 Training with DVD | Example

python scripts/main.py /share/klab/datasets \
  --arch resnet50 \
  --epochs 150 \
  --dataset-name ecoset_square256 \
  --class-weights-json-path '/share/klab/datasets/optimized_datasets/lookup_ecoset_json.json' \
  --batch-size-per-gpu 512 \
  --image-size 256 \
  --warmup-epochs 0 \
  --development_strategy dvd \
  --months_per_epoch 2 \
  --contrast_amplitude_beta 1e-4 \
  --contrast_amplitude_lambda 150 \
  --seed 1
| Flag                     | Purpose                                                                                |
| ------------------------ | -------------------------------------------------------------------------------------- |
| --development_strategy   | dvd (full curriculum) or adult (control).                                              |
| --months_per_epoch       | Virtual ageing per epoch (smaller = finer). Example: DVD-P = 4, DVD-B = 2, DVD-S = 1.  |
| --time_order             | chronological or fully_random.                                                         |
| --apply_*                | Toggle acuity / colour / contrast sensitivities.                                       |
| --contrast_amplitude_*   | Control the reference amplitude threshold in the FFT.                                  |
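A quick sanity check on these flags: assuming --months_per_epoch advances the virtual age linearly (as its description suggests), the example command above (150 epochs at 2 months per epoch) spans the full birth-to-25-years range that DVD models:

```python
def virtual_age_span(epochs: int, months_per_epoch: float) -> float:
    """Total virtual age (in months) reached by the end of training,
    assuming a linear advance of months_per_epoch per epoch."""
    return epochs * months_per_epoch

months = virtual_age_span(epochs=150, months_per_epoch=2)
print(months, months / 12)  # 300 virtual months = 25.0 years
```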

7 Core API

from dvd.dvd.development import DVDTransformer, DVDConfig, generate_age_months_curve

# Initialize transformer and generate age mapping curve
dvdt = DVDTransformer(DVDConfig())
age_curve = generate_age_months_curve(
    epochs=args.epochs,
    steps_per_epoch=len(train_loader),
    months_per_epoch=args.months_per_epoch,
)

# Map current batch index to virtual age in months
step_idx = (epoch * len(train_loader)) + i
age_months = age_curve[step_idx]

# Apply age-based visual transformations to the current training batch
images_aged = dvdt(images.clone(), months=age_months, curriculum=age_curve)
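For experimenting with the schedule itself, a minimal stand-in for generate_age_months_curve under the same linear-ageing assumption might look as follows; the real implementation may differ in shape and endpoints:

```python
from typing import List

def linear_age_curve(epochs: int, steps_per_epoch: int,
                     months_per_epoch: float) -> List[float]:
    """One virtual age (in months) per training step, increasing linearly
    by months_per_epoch over each epoch. Illustrative stand-in only."""
    return [(step / steps_per_epoch) * months_per_epoch
            for step in range(epochs * steps_per_epoch)]

curve = linear_age_curve(epochs=3, steps_per_epoch=4, months_per_epoch=2.0)
print(len(curve), curve[0], curve[-1])  # 12 0.0 5.5
```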

8 Citation

@article{lu2025dvd,
  title   = {Adopting a human developmental visual diet yields robust, shape-based AI vision},
  author  = {Lu, Zejin and Thorat, Sushrut and Cichy, Radoslaw M. and Kietzmann, Tim C.},
  journal = {arXiv preprint arXiv:2507.03168},
  year    = {2025},
  doi     = {10.48550/arXiv.2507.03168},
  url     = {https://arxiv.org/abs/2507.03168}
}
