@JoaquinAmatRodrigo I have a question. I did not use cross-validation (CV) as you did in the Data Partition section of your recent notebook. I only split the data into a train set and a test set (without a validation set), and I have plotted the result for your consideration.
- Do you think this is critical? Could the ML-based regression models I used suffer from over- or under-fitting because I did not include a validation set?
- Based on the picture, do you think I damaged the nature of the time series during the pre-processing stage (de-noising filter, filling missing sequences, detecting and replacing global outliers)? As the legend shows, the plot overlays the cleaned data, divided into train set and test set, on the raw time series.
I also did not use GridSearchCV in my pipeline, in order to save runtime. I have lots of dfs (samples) to which I need to apply the designed pipeline. Could you kindly comment on these questions separately?
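
To make the first question concrete, here is a minimal sketch of the kind of time-ordered validation I skipped. It is only an illustration under assumptions: `expanding_window_splits` is a hypothetical helper name, not a function from the notebook or from any library, and the sizes are made up. Each fold trains on everything before a cut point and validates on the block right after it, so no future observations leak into training.

```python
def expanding_window_splits(n_samples, n_folds, test_size):
    """Yield (train_indices, validation_indices) pairs in time order.

    Hypothetical helper for illustration only; not part of any library.
    """
    # Index where the first validation block starts.
    first_train_end = n_samples - n_folds * test_size
    if first_train_end <= 0:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        train_end = first_train_end + fold * test_size
        # Training window grows with each fold; validation block follows it.
        train_idx = list(range(0, train_end))
        val_idx = list(range(train_end, train_end + test_size))
        yield train_idx, val_idx


# Made-up sizes: 10 observations, 3 folds, 2 validation points per fold.
splits = list(expanding_window_splits(n_samples=10, n_folds=3, test_size=2))
for train_idx, val_idx in splits:
    print(len(train_idx), val_idx)
```

My current setup corresponds to a single split of this kind (one train block followed by one test block), with no intermediate folds.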