Website: halluminate.ai
This repository showcases sample Supervised Fine-Tuning (SFT) data for UI/action grounding tasks and ecommerce/payment flows.
The events_flights trajectories were recorded using Halluminate's flight search simulator.
sample_sft/events_flights/
- JSONL transcripts of flight-search interactions (clicks, typing, selections) with aligned screenshots and concise rationales.
- Side-by-side .md renderings that include images and stepwise annotations.
- screenshot/ directory containing referenced PNGs.
sample_sft/payment_samples/
- JSONL transcripts of ecommerce checkout flows (search, add-to-cart, address, payment) with aligned screenshots and rationales.
- Side-by-side .md renderings and a screenshot/ directory.
Each line is a single step in an interaction sequence:
- timestamp: wall-clock string for ordering (e.g., 2025-08-10_00:04:14).
- action: atomic UI action (e.g., click (x, y), type text: <text>, scroll...).
- screenshot: relative path to the raw screenshot PNG.
- marked_screenshot: path to an annotated screenshot showing the referenced UI element/region.
- element: short label of the target UI element when available.
- rect: bounding box of the target region: { left, top, right, bottom }.
- action_description: natural language description of the intended action.
- action_description_checked: QA tag for description quality (e.g., Good).
- thought: concise rationale describing why this step is taken.
Fields may be null when a step does not target a specific element (e.g., free typing, scrolling).
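The field list above can be mirrored in a small typed record. The following is a minimal Python sketch, not part of this repository: the names Step, parse_step, and parse_action are hypothetical, and the action grammar is inferred only from the click and type examples shown in this README. Other actions (such as scroll) are passed through unparsed because their exact format is not documented here.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional

# Assumed from the README examples: "click (x, y)" and "type text: <text>".
CLICK_RE = re.compile(r"^click \((\d+),\s*(\d+)\)$")
TYPE_PREFIX = "type text: "


@dataclass
class Step:
    """One JSONL line; any field may be None when the step has no target."""
    timestamp: Optional[str] = None
    action: Optional[str] = None
    screenshot: Optional[str] = None
    marked_screenshot: Optional[str] = None
    element: Optional[str] = None
    rect: Optional[dict] = None
    action_description: Optional[str] = None
    action_description_checked: Optional[str] = None
    thought: Optional[str] = None


def parse_step(line: str) -> Step:
    """Build a Step from one JSONL line, dropping fields this sketch does not know."""
    record = json.loads(line)
    known = {name: record.get(name) for name in Step.__dataclass_fields__}
    return Step(**known)


def parse_action(action: str):
    """Split an action string into (kind, payload); unrecognized actions pass through."""
    match = CLICK_RE.match(action)
    if match:
        return ("click", (int(match.group(1)), int(match.group(2))))
    if action.startswith(TYPE_PREFIX):
        return ("type", action[len(TYPE_PREFIX):])
    return ("other", action)
```

Every field defaults to None, so steps without an element or rect load without special handling.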
Example: events_flights (first lines only)
{"timestamp": "2025-08-10_00:04:14", "action": "click (846, 680)", "screenshot": "screenshot/20250810_000414_1.png", "element": "Where to?", "rect": {"left": 753, "top": 657, "right": 1031, "bottom": 713}, "marked_screenshot": "screenshot/20250810_000414_1_marked.png", "action_description": "click <\\the \"Where to?\" input box in the center section of the window>", "action_description_checked": "Good", "thought": "To begin searching for flights ..."}
{"timestamp": "2025-08-10_00:04:15", "action": "type text: jfk", "screenshot": "screenshot/20250810_000415_2.png", "element": null, "rect": null, "marked_screenshot": "screenshot/20250810_000415_2.png", "action_description": "type text: jfk", "action_description_checked": null, "thought": "To find the cheapest direct one-way flight ..."}
Example: payment_samples (first lines only)
{"timestamp": "2025-08-16_21:05:28", "action": "click (611, 99)", "screenshot": "screenshot/20250816_210528_1.png", "element": "Search Amazon", "rect": {"left": 415, "top": 85, "right": 1437, "bottom": 123}, "marked_screenshot": "screenshot/20250816_210528_1_marked.png", "action_description": "click <\\the search bar at the top-center of the Amazon page>", "action_description_checked": "Good", "thought": "Let me click into the search bar ..."}
{"timestamp": "2025-08-16_21:05:30", "action": "type text: Airpod pro 2", "screenshot": "screenshot/20250816_210530_3.png", "element": null, "rect": null, "marked_screenshot": "screenshot/20250816_210530_3.png", "action_description": "type text: Airpod pro 2", "action_description_checked": null, "thought": "I want to search specifically for the AirPods Pro 2 ..."}
For many sequences there is a companion .md file that:
- Restates the task description and difficulty level.
- Shows each step with the corresponding annotated screenshot.
- Mirrors the JSONL content for readability.
Intended uses:
- Schema exploration for building parsers and loaders.
- Prototyping SFT data ingestion and validation.
- Training small models on UI-grounded action prediction and rationale generation.
- Demos for screenshot-conditioned agent pipelines.
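As one illustration of the validation use case, a simple consistency check is to confirm that a click coordinate falls inside the step's rect. This is a hedged Python sketch: click_inside_rect is a hypothetical helper, and the expectation that clicks land inside the bounding box is an assumption. It holds for the two click examples in this README but is not stated as a guarantee of the data.

```python
import re

# Assumed click format, taken from the README examples: "click (x, y)".
CLICK_RE = re.compile(r"^click \((\d+),\s*(\d+)\)$")


def click_inside_rect(action, rect):
    """Return True/False for click steps that have a rect, else None.

    None means the check does not apply (non-click action or null rect).
    """
    match = CLICK_RE.match(action)
    if match is None or rect is None:
        return None
    x, y = int(match.group(1)), int(match.group(2))
    return (rect["left"] <= x <= rect["right"]
            and rect["top"] <= y <= rect["bottom"])
```

For the first events_flights step, click (846, 680) with rect {left: 753, top: 657, right: 1031, bottom: 713} passes this check.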
When loading the data:
- Parse JSONL line by line. Ignore unknown fields; prefer additive parsers.
- Use screenshot/ relative paths to resolve images.
- Treat coordinates as pixel units relative to the screenshot resolution.
- Do not assume every step has a rect or element.
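The loading advice above could be sketched in Python as follows. This is an illustrative loader, not code shipped with the repository: load_steps and KNOWN_FIELDS are hypothetical names, and it assumes screenshot paths are relative to the directory that contains the JSONL file.

```python
import json
from pathlib import Path

# Field names taken from the schema in this README.
KNOWN_FIELDS = (
    "timestamp", "action", "screenshot", "marked_screenshot", "element",
    "rect", "action_description", "action_description_checked", "thought",
)


def load_steps(jsonl_path):
    """Read a JSONL transcript line by line into a list of dicts.

    Unknown fields are ignored and missing fields become None, so the
    loader keeps working if new fields are added later.
    """
    jsonl_path = Path(jsonl_path)
    base_dir = jsonl_path.parent  # assumption: image paths are relative to this
    steps = []
    with jsonl_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            step = {key: record.get(key) for key in KNOWN_FIELDS}
            # Resolve image paths without assuming they are present.
            for key in ("screenshot", "marked_screenshot"):
                if step[key] is not None:
                    step[key + "_path"] = base_dir / step[key]
            steps.append(step)
    return steps
```

Callers should still check rect and element for None before using them, since steps such as free typing carry no target.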
Notes:
- This is a sample set, not exhaustive coverage. Some tasks are abbreviated.
- File names include timestamps to help with ordering and reproducibility.
- If you publish results using this sample, please attribute this repository.
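For the timestamp-based ordering mentioned in these notes, the timestamp field can be parsed and used as a sort key. This is a minimal Python sketch: the format string is inferred from the example value 2025-08-10_00:04:14, and parse_timestamp and sort_steps are hypothetical helper names.

```python
from datetime import datetime

# Inferred from the README example "2025-08-10_00:04:14".
TIMESTAMP_FORMAT = "%Y-%m-%d_%H:%M:%S"


def parse_timestamp(value):
    """Convert a step's timestamp string into a datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT)


def sort_steps(steps):
    """Return steps ordered by timestamp.

    sorted() is stable, so steps sharing the same second keep file order.
    """
    return sorted(steps, key=lambda step: parse_timestamp(step["timestamp"]))
```

Because timestamps have one-second resolution, the original line order in the JSONL file remains the tiebreaker for steps recorded within the same second.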
