Skip to content

test: integration tests fail under newer trl/datasets (SFTTrainer kwargs, compute_perplexity) #60

@SahilKumar75

Description

@SahilKumar75

When TUNEOS_INTEGRATION_TESTS=1 is set against a freshly-resolved dependency set, two long-standing integration tests fail due to upstream API drift (independent of #58):

  1. test_train_step_runs_and_writes_adapterSFTTrainer.__init__() got an unexpected keyword argument 'tokenizer'. Newer trl renamed tokenizerprocessing_class and moved dataset_text_field/max_seq_length into SFTConfig. trainer/finetune.py uses these same removed kwargs, so this is a latent forward-compat issue, not just a test problem.
  2. test_perplexity_is_finitecompute_perplexity does batch["input_ids"].to(model.device) but receives a Python list (dataset not set to torch format), raising AttributeError.

CI does not catch these because the integration suite is gated off by default.

Task: pin/upgrade trl deliberately and migrate SFTTrainer construction to SFTConfig (or the supported kwargs), and set the eval dataset to torch format (or batch via a DataLoader) in trainer/metrics.py:compute_perplexity.

Metadata

Metadata

Assignees

No one assigned

    Labels

    area:fine-tuningLoRA, QLoRA, training configuration, and tuning workflowsrefactorCode restructuring and cleanuptesting

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions