- Documentation: https://labeltext.readthedocs.io/en/latest/
After installing labeltext:
- create a TextAnnotation object (or restore one from an earlier annotation session),
- start annotating by calling the .annotate() method.
pip install labeltext

task = TextAnnotation(
    records=["Albert Einstein", "Stephen King", "Marie Curie"],
    labels=["male", "female"],
    output="scientists.csv"
)
print(task)

- records: List of text records to be annotated
- labels: List of class labels (up to 16)
- output: The CSV file where annotations will be saved (default: annotations.csv)
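Because labels accepts at most 16 classes, it can help to check the inputs before constructing the task. The following is a minimal sketch, assuming a hypothetical helper `check_inputs` that is not part of labeltext; the uniqueness check and the filtering of empty records are illustrative choices, not documented library behavior.

```python
MAX_LABELS = 16  # limit stated in the parameter description above


def check_inputs(records, labels):
    """Hypothetical helper (not part of labeltext): validate inputs up front."""
    if not labels:
        raise ValueError("at least one label is required")
    if len(labels) > MAX_LABELS:
        raise ValueError(f"got {len(labels)} labels, the limit is {MAX_LABELS}")
    if len(set(labels)) != len(labels):
        raise ValueError("labels must be unique")
    # Strip whitespace and drop empty records, keeping the original order
    cleaned = [r.strip() for r in records if isinstance(r, str) and r.strip()]
    return cleaned, list(labels)


records, labels = check_inputs(
    ["Albert Einstein", " Stephen King", "", "Marie Curie"],
    ["male", "female"],
)
print(records)
```

The cleaned `records` and `labels` lists could then be passed to TextAnnotation in place of the literal lists.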
In practice, the records will usually come from a file (for example, a CSV):
import pandas as pd
df = pd.read_csv("example.csv")
task = TextAnnotation(
records=list(df.text.values), # `text` is a column in df
labels=["male", "female"],
output="scientists.csv"
)
print(task)

task.annotate(user_name="@dataBiryani", update_freq=2)

This function starts an interactive annotation session.
- user_name (optional): A project may have multiple annotators. If not provided, the user will be asked for a user_name.
- update_freq (optional): New annotations are not immediately saved to disk. They are saved once every update_freq annotations (default 5), or if the user ends the annotation session, or if no records are left to annotate.
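The saving behavior described for update_freq can be pictured as a small write buffer. This is a sketch of the general idea only, not labeltext's actual implementation: the class `BufferedWriter`, the helper `rows_on_disk`, and the two-column (record, label) row layout are all assumptions made for illustration.

```python
import csv
import os
import tempfile


class BufferedWriter:
    """Illustrative only: hold rows in memory, append to a CSV every `update_freq` rows."""

    def __init__(self, path, update_freq=5):
        self.path = path
        self.update_freq = update_freq
        self.pending = []

    def add(self, record, label):
        self.pending.append((record, label))
        # Write to disk once the buffer reaches update_freq rows
        if len(self.pending) >= self.update_freq:
            self.flush()

    def flush(self):
        # Also called when the session ends or no records are left
        if not self.pending:
            return
        with open(self.path, "a", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(self.pending)
        self.pending.clear()


def rows_on_disk(path):
    """Count rows currently saved in the CSV (0 if the file does not exist yet)."""
    if not os.path.exists(path):
        return 0
    with open(path, newline="", encoding="utf-8") as f:
        return sum(1 for _ in csv.reader(f))


path = os.path.join(tempfile.mkdtemp(), "demo.csv")
writer = BufferedWriter(path, update_freq=2)
writer.add("Albert Einstein", "male")
after_one = rows_on_disk(path)   # buffer not full, nothing written yet
writer.add("Marie Curie", "female")
after_two = rows_on_disk(path)   # buffer reached update_freq, rows written
writer.add("Stephen King", "male")
writer.flush()                   # session ends: write whatever is left
after_end = rows_on_disk(path)
print(after_one, after_two, after_end)
```

A smaller update_freq means less work is lost if a session is interrupted, at the cost of more frequent disk writes.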
Note: The output of the annotation session will be written to a csv file that you can feed into your modeling pipeline. The current state of annotation will also be saved in a pickle file (with the same filename as the csv file, but with .pkl extension). You can use the .pkl file to continue annotation in future sessions.
task = TextAnnotation()("annotations.pkl")
task.annotate(user_name="@dataBiryani")
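Since the state file shares the CSV's filename with a .pkl extension, its location can be derived from the output path. A minimal sketch, assuming hypothetical helpers `state_path_for` and `can_resume` that are not part of labeltext:

```python
from pathlib import Path


def state_path_for(output_csv):
    """Hypothetical helper: same filename as the CSV, with a .pkl extension."""
    return Path(output_csv).with_suffix(".pkl")


def can_resume(output_csv):
    """Hypothetical helper: True if a saved state file exists next to the CSV."""
    return state_path_for(output_csv).exists()


print(state_path_for("scientists.csv"))
```

A check like `can_resume("scientists.csv")` could be used to decide between restoring an earlier session and creating a new task.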