Better support for BatchEncoding #8

@jss367

Description

Example:
from transformers import BertTokenizerFast

texts = ["this is my test text", "this is another test text"] * 100

# load a pre-trained tokenizer
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

# padding, truncation, and return_tensors are arguments of the tokenizer call,
# so they are passed here; the call returns a BatchEncoding
encoded_text = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")


Labels: enhancement (New feature or request)
