SpokenCRS is a lightweight Python API for normalizing conversational
recommendation datasets into one consistent interface. It currently supports
INSPIRED, ReDial, and ConvApparel, exposing each dataset as split-level
CRSDataFrame objects with normalized user/system turns, item context,
ground-truth items, recommendation labels, rejection labels, and dataset-specific
metadata.
Install the package in editable mode:

```powershell
git clone https://github.com/DB825/spokencrs.git
cd spokencrs
python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install -e .
```

Load each supported dataset through the same entry point:
```python
from spokencrs import load_dataset

inspired = load_dataset("inspired", data_dir="data/inspired")
redial = load_dataset("redial", data_dir="data/redial")
convapparel = load_dataset("convapparel", data_dir="data/convapparel")

train = redial["train"]
conversation = train[train.index[0]]
print(conversation.get_ground_truth_items())
print(conversation.system_turn[0].recommended_items())
print(conversation.system_turn[0].rejected_items())
```

Use list_datasets() to see registered loader names.
Raw datasets are not included in this repository. Download them from the
original sources, extract them locally, and point load_dataset at the matching
directory.
```text
data/
  inspired/
    train.tsv
    dev.tsv
    test.tsv
  redial/
    train_data.jsonl
    test_data.jsonl
  convapparel/
    ConvApparel.json
```
ConvApparel is distributed as a zipped JSON file on Hugging Face; unzip it so
ConvApparel.json is directly inside data/convapparel.
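
Before calling load_dataset, it can help to confirm that the files are where the layout above expects them. The following is a minimal sketch that uses only the standard library; `EXPECTED_FILES` and `missing_dataset_files` are hypothetical names introduced for illustration and are not part of the spokencrs API.

```python
from pathlib import Path

# Expected files per dataset, copied from the directory layout above.
EXPECTED_FILES = {
    "inspired": ["train.tsv", "dev.tsv", "test.tsv"],
    "redial": ["train_data.jsonl", "test_data.jsonl"],
    "convapparel": ["ConvApparel.json"],
}


def missing_dataset_files(data_root):
    """Return {dataset_name: [missing file names]} for incomplete datasets.

    Hypothetical helper for illustration; it only checks that files exist.
    """
    root = Path(data_root)
    missing = {}
    for name, files in EXPECTED_FILES.items():
        absent = [f for f in files if not (root / name / f).is_file()]
        if absent:
            missing[name] = absent
    return missing


if __name__ == "__main__":
    # Report anything missing under ./data before loading.
    for name, files in missing_dataset_files("data").items():
        print(f"{name}: missing {', '.join(files)}")
```

An empty result means every expected file was found; it says nothing about whether the file contents are valid.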
| Dataset | Domain | Splits | Notes |
|---|---|---|---|
| INSPIRED | Movie recommendation | train, dev, test | Uses movie_id as the ground-truth token. The dataset does not expose reliable per-turn accept/reject labels, so recommendation/rejection methods fall back to item context. |
| ReDial | Movie recommendation | train, test | Resolves @movieId tokens to titles. Recommendation and rejection labels come from respondentQuestions and are exposed on system turns. |
| ConvApparel | Apparel shopping | train | Splits each raw user/assistant turn pair into separate user and system turns. Uses per-turn recommendations, purchase-likelihood ratings, and matched which_product answers for recommendation/rejection semantics. |
- INSPIRED: "INSPIRED: Toward Sociable Recommendation Dialog Systems"
- ReDial: "Towards Deep Conversational Recommendations"
- ConvApparel: "ConvApparel: A Benchmark Dataset and Validation Framework for User Simulators in Conversational Recommenders"
load_dataset(...) returns a dictionary mapping split names to
CRSDataFrame objects. Each CRSDataFrame row represents one conversation and
contains:
- conversation_id: stable dataset-specific conversation key
- user_turn: list of user/seeker TurnWrapper objects
- system_turn: list of system/recommender TurnWrapper objects
- ground_truth_items: item identifiers used for evaluation
- raw record fields when useful, such as raw_record or raw_conversation_df
Each TurnWrapper is domain neutral:
- items: ordered item titles or labels referenced at the turn
- item_dict: {item_title: index} lookup for the turn
- recommended_item_titles: items explicitly recommended at the turn when the dataset exposes that signal
- rejected_item_titles: recommended items inferred as not accepted when the dataset exposes that signal
- has_recommendation_signal and has_rejection_signal: flags that distinguish an explicit empty label from a dataset with no available signal
- metadata: dataset-specific extras such as movie genre dictionaries, ConvApparel item descriptions, image URLs, feature tags, and turn ratings
item_context() returns the full structured context for a turn. When a loader
has reliable recommendation or rejection signals, recommended_items() and
rejected_items() return only those inferred subsets. When a dataset does not
expose the signal, these methods fall back to item_context() as placeholder
behavior.
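
The fallback rule described above can be sketched as follows. `TurnSketch` is a hypothetical stand-in written for illustration, not the library's TurnWrapper; it reuses the field names listed earlier and assumes item_context() can be simplified to a plain list of items.

```python
from dataclasses import dataclass, field


@dataclass
class TurnSketch:
    """Hypothetical stand-in for TurnWrapper; not the library's class."""

    items: list = field(default_factory=list)
    recommended_item_titles: list = field(default_factory=list)
    rejected_item_titles: list = field(default_factory=list)
    has_recommendation_signal: bool = False
    has_rejection_signal: bool = False

    def item_context(self):
        # Simplified assumption: the real method returns a fuller context.
        return list(self.items)

    def recommended_items(self):
        # With a signal, an empty list means "nothing recommended here".
        if self.has_recommendation_signal:
            return list(self.recommended_item_titles)
        # Without a signal, fall back to every item in context.
        return self.item_context()

    def rejected_items(self):
        if self.has_rejection_signal:
            return list(self.rejected_item_titles)
        return self.item_context()


labeled = TurnSketch(items=["Heat", "Alien"], has_recommendation_signal=True)
unlabeled = TurnSketch(items=["Heat", "Alien"])
print(labeled.recommended_items())    # []
print(unlabeled.recommended_items())  # ['Heat', 'Alien']
```

The two printed results show why the signal flags matter: an empty list from a labeled turn is a real label, while the same turn without a signal returns its whole item context.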
The package includes reusable validation helpers:
```python
from spokencrs import (
    load_dataset,
    summarize_loaded_datasets,
    validate_loaded_datasets,
)

loaded = {
    "INSPIRED": load_dataset("inspired", "data/inspired"),
    "ReDial": load_dataset("redial", "data/redial"),
    "ConvApparel": load_dataset("convapparel", "data/convapparel"),
}

summary = summarize_loaded_datasets(loaded)
issues = validate_loaded_datasets(loaded)
print(summary)
print(issues)
```

The companion notebook spokencrs_validation.ipynb mirrors this workflow and
includes dataset-specific spot checks.
Future loaders should:
- Convert raw records into CRSDataFrame rows with the shared columns above.
- Normalize speakers to user and system.
- Keep core item fields domain neutral: items, item_dict, recommended_item_titles, and rejected_item_titles.
- Set recommendation/rejection signal flags when those labels are available.
- Put dataset-specific annotations in TurnWrapper.metadata.
- Document ground-truth, recommendation, and rejection assumptions in the loader docstring.
- Register the loader in spokencrs/loader.py.
- Add the dataset to the validation notebook and run validate_loaded_datasets.
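
As one illustration of the speaker-normalization step, a new loader might map raw role labels onto user and system before building turns. This is a minimal sketch under stated assumptions: `SPEAKER_ALIASES`, `normalize_speaker`, and `split_turns` are hypothetical names, the alias table is a guess based on the roles mentioned in this README, and turns are represented as plain strings rather than TurnWrapper objects.

```python
# Hypothetical alias table; extend it with the raw labels of each dataset.
SPEAKER_ALIASES = {
    "user": "user",
    "seeker": "user",
    "system": "system",
    "recommender": "system",
    "assistant": "system",
}


def normalize_speaker(raw_label):
    """Map a raw speaker label to 'user' or 'system'."""
    key = raw_label.strip().lower()
    try:
        return SPEAKER_ALIASES[key]
    except KeyError:
        # Fail loudly so unknown roles are not silently misassigned.
        raise ValueError(f"Unknown speaker label: {raw_label!r}") from None


def split_turns(turns):
    """Split (raw_speaker, text) pairs into user and system utterance lists."""
    user_turn, system_turn = [], []
    for raw_speaker, text in turns:
        if normalize_speaker(raw_speaker) == "user":
            user_turn.append(text)
        else:
            system_turn.append(text)
    return user_turn, system_turn


users, systems = split_turns([
    ("SEEKER", "Any good thrillers?"),
    ("Recommender", "Have you seen Heat?"),
])
print(users)    # ['Any good thrillers?']
print(systems)  # ['Have you seen Heat?']
```

Raising on unknown labels is a design choice for this sketch; a real loader could instead log and skip such turns.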