Use a grouper instead of `unique_id`

In the main feature extraction loop, tsfeatures groups by the hard-coded `unique_id` column and then applies transforms to the grouped data.

https://github.com/Nixtla/tsfeatures/blob/5ce2ba79bf71c9a2cd39316fec6907010b2fb64e/tsfeatures/tsfeatures.py#L916

It would be more generic if you could pass in a `Grouper` to perform the grouping. At the moment I have to group my data myself and then create a flat column from the resulting multi-index (i.e. a column of tuples):

```
import pandas as pd

# df is a DataFrame with an 'id' column and a datetime 'time' column
# group by id and day
grouper = [pd.Grouper(key='id'), pd.Grouper(key='time', freq='1D')]
grouped_data = df.groupby(grouper, group_keys=True)

# join groups, use grouper key as new index
grouped_data = grouped_data.apply(lambda x: x.drop(columns=['id']))
grouped_data = grouped_data.droplevel(-1)

# flatten index to tuples
grouped_data.index = grouped_data.index.to_flat_index()
grouped_data.index.name = 'id'
grouped_data = grouped_data.reset_index()
```

The issue I've had with the tuple column is that I've been experimenting with Dask, and data formats like Parquet don't seem to support this column type. You can create a Dask DataFrame from a pandas DataFrame that contains tuple columns, but so far I've been unable to persist them. I know tsfeatures doesn't support Dask at this stage, but I guess it might be on the roadmap?
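
To make the request concrete, here is a rough sketch of what grouper-based extraction could look like. `features_by_group` is a hypothetical helper, not part of tsfeatures, and the value column name `y` and the two toy features are assumptions for illustration only. It accepts a list of `pd.Grouper` objects and returns each group key as its own ordinary column, so no tuple column is needed.

```python
import numpy as np
import pandas as pd


def features_by_group(df, groupers, feature_funcs, value_col="y"):
    """Hypothetical helper (not a tsfeatures API).

    groupers: list of pd.Grouper objects, each created with key=...
    feature_funcs: dict mapping feature name -> callable on a 1-D array
    """
    key_names = [g.key for g in groupers]
    records = []
    for keys, group in df.groupby(groupers):
        # Normalise to a tuple so a single grouper is handled as well
        keys = keys if isinstance(keys, tuple) else (keys,)
        values = group[value_col].to_numpy()
        record = dict(zip(key_names, keys))
        for name, func in feature_funcs.items():
            record[name] = float(func(values))
        records.append(record)
    # One row per group; group keys stay as separate scalar columns
    return pd.DataFrame(records)


df = pd.DataFrame({
    "id": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "time": pd.to_datetime([
        "2024-01-01 00:00", "2024-01-01 12:00",
        "2024-01-02 00:00", "2024-01-02 12:00",
    ] * 2),
    "y": [1.0, 3.0, 2.0, 6.0, 10.0, 10.0, 0.0, 4.0],
})

# group by id and day, as in the snippet above
groupers = [pd.Grouper(key="id"), pd.Grouper(key="time", freq="1D")]
features = features_by_group(df, groupers, {"mean": np.mean, "range": np.ptp})
print(features)
```

Because `id` stays a string column and `time` a timestamp column, the result only contains plain scalar types, which should be easier to persist than a column of tuples.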

Use a grouper instead of `unique_id` #23