
Example notebook with Datalab on a text dataset with SOTA LLM & embeddings models #98

@jwmueller


Make a version of this tutorial, but using more modern ML models: https://docs.cleanlab.ai/stable/tutorials/datalab/text.html
The pred_probs can be produced by a (pretrained) LLM, and the features by a recently popular embeddings model.

We recommend using models from HuggingFace. Try to select a dataset where the detected issues are interesting, particularly one where the underperforming group issue is present.
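
As a rough starting point, here is a minimal sketch of how the pieces could fit together. It assumes the LLM yields one real-valued score per class for each example (for instance, logits for each candidate label), which are normalized into pred_probs with a softmax. The helper names `scores_to_pred_probs` and `run_datalab` are hypothetical, the toy scores are made up, and the choice of LLM, embeddings model, and dataset is left open; only the `Datalab(...)`, `find_issues(...)`, and `report()` calls come from cleanlab itself.

```python
import math


def scores_to_pred_probs(class_scores):
    """Convert per-class scores (e.g. LLM logits for each label) to probabilities.

    class_scores: list of N rows, each a list of K real-valued scores.
    Returns a list of N rows, each summing to 1 (numerically stable softmax).
    """
    pred_probs = []
    for row in class_scores:
        shift = max(row)  # subtract the row max so exp() cannot overflow
        exps = [math.exp(s - shift) for s in row]
        total = sum(exps)
        pred_probs.append([e / total for e in exps])
    return pred_probs


def run_datalab(texts, labels, features, pred_probs):
    """Audit a text dataset with Datalab (hypothetical wrapper; needs cleanlab and numpy).

    features: N x D embeddings from whichever embeddings model is chosen.
    pred_probs: N x K class probabilities, e.g. from scores_to_pred_probs().
    """
    import numpy as np
    from cleanlab import Datalab

    lab = Datalab(data={"text": texts, "label": labels}, label_name="label")
    lab.find_issues(features=np.asarray(features), pred_probs=np.asarray(pred_probs))
    lab.report()
    return lab


# Toy scores for 2 examples and 3 classes (made-up numbers, not real model output).
toy_scores = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
toy_pred_probs = scores_to_pred_probs(toy_scores)
```

In a real notebook, `features` would come from the embeddings model and the scores from the LLM; ideally the pred_probs would be out-of-sample, so that the model has not been fit on the labels being audited.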

Metadata

Assignees: No one assigned

Labels: help wanted (Extra attention is needed)
