
Feature extraction issue when series name is a number #97

@windischbauer

I have a data set in which each series is keyed by a number, and I would like to extract features from it.
I adapted the code from the tutorial to reproduce the issue; I am using v0.3.0:

import pandas as pd; import scipy.stats as ss; import numpy as np
from tsflex.features import FeatureDescriptor, FeatureCollection, FuncWrapper

# 1. -------- Get your time-indexed data --------
# Data contains 1 column; ["TMP"]
url = "https://github.com/predict-idlab/tsflex/raw/main/examples/data/empatica/"
data = pd.read_parquet(url + "tmp.parquet").set_index("timestamp")

# I renamed the column name to showcase the issue:
data.rename(columns={'TMP': 1234}, inplace=True)

# 2 -------- Construct your feature collection --------
fc = FeatureCollection(
    feature_descriptors=[
        FeatureDescriptor(
            function=FuncWrapper(func=ss.skew, output_names="skew"),
            series_name="1234",
            window="5min", stride="2.5min",
        )
    ]
)
# -- 2.1. Add features to your feature collection
# NOTE: tsflex allows features to have different windows and strides
# fc.add(FeatureDescriptor(np.min, "TMP", '0.5min', '2.5min'))

# 3 -------- Calculate features --------
fc.calculate(data=data, return_df=True)  # which outputs:

With the series name in quotation marks (series_name="1234"), this raises:
IndexError: list index out of range

With the series name given as a number (series_name=1234), it raises:
TypeError: argument of type 'int' is not iterable
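
A possible workaround, sketched here under assumptions and not verified against tsflex v0.3.0: an integer column label and its string form are different keys, and a membership test on an int raises the same class of TypeError reported above. The helper `stringify_labels` is hypothetical (it is not part of tsflex or pandas); it casts every label to a string so that the column label and a quoted series name have the same type.

```python
def stringify_labels(labels):
    """Hypothetical helper: return every label as a str (1234 -> "1234")."""
    return [str(label) for label in labels]


# An integer label and its string form are different keys.
assert 1234 != "1234"

# A membership test on an int raises TypeError, the same error class
# reported above when the series name is passed as a number.
series_name = 1234
try:
    "|" in series_name
except TypeError as exc:
    membership_error = exc
else:
    membership_error = None

labels = stringify_labels([1234, "TMP"])
print(labels)  # ['1234', 'TMP']
```

With the DataFrame from the report, this could be applied as data.columns = stringify_labels(data.columns) before calling fc.calculate, keeping series_name="1234". Whether v0.3.0 then computes the features has not been checked here.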
