Rachel Mikulinsky*1, Morris Alper*1, Shai Gordin2, Enrique Jimenez3, Yoram Cohen1, Hadar Averbuch-Elor1,4
1Tel Aviv University, 2Ariel University, 3LMU, 4Cornell University *Equal Contribution
This is the official implementation of ProtoSnap, a method for aligning a cuneiform prototype and a corresponding sign image. ICLR 2025
Given a target image of a cuneiform sign and a corresponding prototype with an annotated skeleton, we align the skeleton with the target image.
To this end, we use diffusion features extracted from a fine-tuned Stable Diffusion model.
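To illustrate how deep features can drive an alignment like this, here is a minimal sketch of nearest-neighbor matching by cosine similarity. It is not the ProtoSnap pipeline, which lives in main.py and may differ substantially. The function names, the toy 2-D feature vectors, and the grid layout are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def match_keypoints(proto_feats, target_feats):
    """Map each prototype keypoint feature to its best target location.

    proto_feats:  list of feature vectors, one per skeleton keypoint
                  (hypothetical layout).
    target_feats: 2-D grid (list of rows) of feature vectors for the
                  target image (hypothetical layout).
    Returns a list of (row, col) positions, one per keypoint.
    """
    matches = []
    for feat in proto_feats:
        best_score, best_pos = -2.0, None  # cosine similarity is >= -1
        for r, row in enumerate(target_feats):
            for c, candidate in enumerate(row):
                score = cosine_similarity(feat, candidate)
                if score > best_score:
                    best_score, best_pos = score, (r, c)
        matches.append(best_pos)
    return matches


# Toy data: a 2x2 "feature map" and two keypoint features.
target = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [-1.0, 0.0]],
]
keypoints = [[0.0, 2.0], [-3.0, 0.1]]
print(match_keypoints(keypoints, target))  # [(0, 1), (1, 1)]
```

Cosine similarity ignores vector magnitude, so the keypoint feature [0.0, 2.0] still matches the unit vector [0.0, 1.0] exactly.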
We used this method to train a ControlNet that generates new and diverse cuneiform sign images based only on a prototype. Weights for the ControlNet are available here.
pip install -r requirements.txt

To download the weights:
gdown 'https://drive.google.com/uc?export=download&id=1x2RlD4jk3O7QFZ6z4ApkSe4RWNnJq_K_'
unzip weights.zip -d weights
rm weights.zip

To run on a single sign image:
python main.py <prompt> --target_image_path <path_to_image_dir>

Arguments:
- prompt: The name of the sign (such as A, AN, MA, etc.), used as the prompt to the SD model.
- --target_image_path: The directory where the target image is located. The image file should be named <prompt>.png. Default: target_images.
- --font_dir: The directory with the available prototypes. Default: prototypes/Santakku, corresponding to the Old Babylonian era. The Assurbanipal font, for the Neo-Assyrian era, is also available in this repo.
- --con_dir: The directory with the annotated skeletons. Default: skeletons/Santakku. Skeletons for the Assurbanipal font are available as well.
- --output_folder: None by default. If not None, the results are saved under output/<output_folder>; otherwise they are saved directly under output.
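The naming conventions above can be sketched as a small helper that computes where the target image is expected and where results would be written. This is an illustration of the documented defaults only. `resolve_paths` is a hypothetical name and is not part of this repository.

```python
from pathlib import Path


def resolve_paths(prompt, target_image_path="target_images", output_folder=None):
    """Return (expected target image, output directory) for one sign.

    Hypothetical helper mirroring the documented conventions:
    the image is <target_image_path>/<prompt>.png, and results go to
    output/<output_folder>, or directly to output when no folder is given.
    """
    target = Path(target_image_path) / f"{prompt}.png"
    out_dir = Path("output") / output_folder if output_folder else Path("output")
    return target, out_dir


image, out_dir = resolve_paths("AN")
print(image.as_posix(), out_dir.as_posix())  # target_images/AN.png output
```

Checking `image.exists()` before launching main.py is a cheap way to catch a misnamed file early.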
To run the system on a list of images:
python run_test.py --samples_df_path <samples_csv>

Arguments:
- --samples_df_path: A metadata CSV for the requested samples. Default: test_set/metadata.csv.
- --font_dir, --con_dir and --output_folder: Same as for a single image.
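When no metadata CSV is at hand, a few signs can also be processed by calling main.py once per sign. The sketch below assembles such command lines using only the flags documented for the single-image script. It is not a substitute for run_test.py, whose CSV schema is not described here. `build_main_command` is a hypothetical helper.

```python
def build_main_command(prompt, target_image_path=None, font_dir=None,
                       con_dir=None, output_folder=None):
    """Build an argv list for main.py (hypothetical helper).

    Options left as None are omitted so the script's defaults apply.
    """
    cmd = ["python", "main.py", prompt]
    options = {
        "--target_image_path": target_image_path,
        "--font_dir": font_dir,
        "--con_dir": con_dir,
        "--output_folder": output_folder,
    }
    for flag, value in options.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


# One command per sign; pass each list to subprocess.run(cmd, check=True).
for sign in ["A", "AN", "MA"]:
    print(" ".join(build_main_command(sign, output_folder="batch")))
```

Building the command as a list rather than a single string avoids shell-quoting problems when directory names contain spaces.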
To generate images using our fine-tuned ControlNet:
python gen_images_with_cn.py <sign_name> --num_of_samples <num_of_samples>

The script generates controls by taking the available skeletons and applying small augmentations to each stroke to create diversity. Each control is then used to generate an image with ControlNet.
Arguments:
- sign_name: The name of the sign to generate (such as A, AN, MA, etc.).
- --num_of_samples: Number of samples to generate. Default: 50.
- --output_path: The results are saved under <output_path>/<sign_name>/images. The controls used for generation are saved under <output_path>/<sign_name>/controls.
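The per-stroke augmentation described above can be pictured with a minimal sketch: each stroke is rotated slightly about its midpoint and shifted by a small random offset. This is an assumption about what "small augmentations" might look like, not the code in gen_images_with_cn.py. The stroke representation (two endpoints), the function names, and the default ranges are all hypothetical.

```python
import math
import random


def augment_stroke(stroke, rng, max_shift=2.0, max_angle_deg=5.0):
    """Rotate a stroke about its midpoint and shift it slightly.

    stroke: ((x0, y0), (x1, y1)) endpoints (hypothetical representation).
    rng:    a random.Random instance, so results are reproducible.
    """
    (x0, y0), (x1, y1) = stroke
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    angle = math.radians(rng.uniform(-max_angle_deg, max_angle_deg))
    dx = rng.uniform(-max_shift, max_shift)
    dy = rng.uniform(-max_shift, max_shift)
    cos_a, sin_a = math.cos(angle), math.sin(angle)

    def transform(x, y):
        # Rotate around the midpoint, then translate.
        rx = cx + (x - cx) * cos_a - (y - cy) * sin_a
        ry = cy + (x - cx) * sin_a + (y - cy) * cos_a
        return (rx + dx, ry + dy)

    return (transform(x0, y0), transform(x1, y1))


def augment_skeleton(skeleton, seed=None, **kwargs):
    """Apply an independent small perturbation to every stroke."""
    rng = random.Random(seed)
    return [augment_stroke(stroke, rng, **kwargs) for stroke in skeleton]


skeleton = [((0.0, 0.0), (10.0, 0.0)), ((5.0, 5.0), (5.0, 15.0))]
variants = [augment_skeleton(skeleton, seed=s) for s in range(3)]
print(len(variants), len(variants[0]))  # 3 2
```

Because each stroke only undergoes a rotation and a translation, its length is preserved, and the sign stays recognizable while stroke placement varies from sample to sample.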
- This research was funded by the TAU Center for Artificial Intelligence & Data Science (TAD) and by the LMU-TAU Research Cooperation Program.
- The method and the test set were developed using the cuneiform OCR dataset. The photographs of tablets are from the British Museum Digital Collections.
- This implementation uses code from the official repository of DIFT.
If you find this project useful, you may cite us as follows:
@inproceedings{
mikulinsky2025protosnap,
title={ProtoSnap: Prototype Alignment For Cuneiform Signs},
author={Rachel Mikulinsky and Morris Alper and Shai Gordin and Enrique Jim{\'e}nez and Yoram Cohen and Hadar Averbuch-Elor},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=XHTirKsQV6}
}
