Since the current iteration is relatively bad at predicting outbreaks, while the general fluctuation is predicted reasonably well, I wanted to try a few things to improve outbreak prediction.
-
Balancing data
The idea is that the dataset is unbalanced and outbreaks are relatively rare. I will try to manually identify outbreaks and replicate those observations. Since the data is a primary predictor, I'll test the model's performance without that info as well.
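A minimal sketch of what the replication step could look like, assuming the observations live in a pandas DataFrame. The column name `total_cases`, the cutoff of 100 and the number of copies are placeholders for illustration only; a simple threshold stands in here for the manual identification of outbreaks.

```python
import pandas as pd


def replicate_outbreaks(df, target_col="total_cases", threshold=100, n_copies=3):
    """Return a new frame in which every outbreak row appears n_copies extra times.

    target_col, threshold and n_copies are hypothetical defaults, not tuned values.
    """
    # Flag outbreak rows; a manual labelling could replace this threshold rule.
    is_outbreak = df[target_col] >= threshold
    outbreaks = df[is_outbreak]
    # Append n_copies duplicates of the outbreak rows to the original data.
    return pd.concat([df] + [outbreaks] * n_copies, ignore_index=True)


# Toy data: two of the eight weeks count as outbreaks under the assumed cutoff.
toy = pd.DataFrame({
    "week": range(1, 9),
    "total_cases": [5, 8, 12, 150, 210, 30, 9, 6],
})
balanced = replicate_outbreaks(toy, threshold=100, n_copies=3)
```

The replication should only be applied to the training split. If duplicated rows end up in the validation data as well, the scores will look better than they really are.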
-
Handle as Classification
Based on the data replication, I'll also try to handle this as a classification problem first. If that works reliably, I could fit two different models on the data: one on outbreaks and one on regular situations.
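A rough sketch of how the two-step setup could be wired together, assuming scikit-learn and NumPy are available. The class name `TwoStageModel`, the random forests and the threshold are my own placeholders, not a decision on the final models: a classifier first flags outbreaks, then each observation is routed to the regressor trained on its regime.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor


class TwoStageModel:
    """Hypothetical wrapper: classify outbreak vs. regular, then regress per regime."""

    def __init__(self, threshold, random_state=0):
        self.threshold = threshold  # assumed cutoff that defines an outbreak
        self.clf = RandomForestClassifier(n_estimators=50, random_state=random_state)
        self.reg_outbreak = RandomForestRegressor(n_estimators=50, random_state=random_state)
        self.reg_regular = RandomForestRegressor(n_estimators=50, random_state=random_state)

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        is_outbreak = y >= self.threshold
        # Stage 1: learn to tell outbreak observations from regular ones.
        self.clf.fit(X, is_outbreak)
        # Stage 2: one regressor per regime, each fitted on its own subset.
        self.reg_outbreak.fit(X[is_outbreak], y[is_outbreak])
        self.reg_regular.fit(X[~is_outbreak], y[~is_outbreak])
        return self

    def predict(self, X):
        X = np.asarray(X)
        flagged = self.clf.predict(X).astype(bool)
        pred = np.empty(len(X), dtype=float)
        # Route each row to the regressor that matches its predicted regime.
        if flagged.any():
            pred[flagged] = self.reg_outbreak.predict(X[flagged])
        if (~flagged).any():
            pred[~flagged] = self.reg_regular.predict(X[~flagged])
        return pred


# Synthetic data only, to show the mechanics: high values when feature 0 is large.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] > 1.0, 200 + 20 * X[:, 1], 10 + 2 * X[:, 1])

model = TwoStageModel(threshold=100).fit(X, y)
pred = model.predict(X)
```

The weak point of this design is the first stage: every outbreak the classifier misses is handed to the regular model, so the classifier's recall on outbreaks is worth checking before investing in the two regressors.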