What happens if we don't use cross-validation (CV) and just use a train set and a test set when forecasting on the tail of time data? #887

Open
@clevilll

@JoaquinAmatRodrigo I have a question. I did not use cross-validation (CV) as you did in the Data Partition section of the recent notebook. I just split the data into a train set and a test set (without a validation set), and I plotted the result for your consideration.
[image: cleaned data split into train set and test set, plotted over the raw time data]

  1. Do you think this is critical, and could the ML-based regression models I used suffer from over- or underfitting because I did not use a CV set?
  2. Based on the picture, do you think I damaged the nature of the time data during the pre-processing stage (de-noising filter, filling missing sequences, detecting and replacing global outliers)? As the legend shows, the cleaned data is divided into a train set and a test set and plotted over the raw time data.
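
To make the first question concrete, here is a minimal sketch of the two setups being compared: a single hold-out split on the tail of the series, and time-ordered validation folds inside the training part. The helper names `tail_split` and `expanding_window_folds`, the fold sizes, and the toy series are assumptions for illustration only; they are not taken from the notebook or from my actual pipeline.

```python
from typing import Iterator, List, Sequence, Tuple


def tail_split(series: Sequence[float], test_size: int) -> Tuple[list, list]:
    """Hold out the last `test_size` points as the test set (no shuffling)."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    cut = len(series) - test_size
    return list(series[:cut]), list(series[cut:])


def expanding_window_folds(
    n_samples: int, n_folds: int, val_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, val_idx) pairs; validation always follows training in time."""
    first_val_start = n_samples - n_folds * val_size
    if first_val_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        start = first_val_start + k * val_size
        # The training window grows with each fold; the validation block slides forward.
        yield list(range(start)), list(range(start, start + val_size))


# Placeholder data (hypothetical), standing in for one cleaned series.
series = [float(i) for i in range(20)]

# Setup used in this issue: one train/test split on the tail.
train, test = tail_split(series, test_size=5)

# Setup that was skipped: several validation folds carved out of the train set.
folds = list(expanding_window_folds(len(train), n_folds=3, val_size=3))
```

With a single tail split, the only check against over- or underfitting is the gap between the train error and the test error. The folds give several such checks without touching the test set.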

I also did not use GridSearchCV within my pipeline, in order to save runtime! I have lots of dfs samples to which I need to apply the designed pipeline. Can you kindly comment on these questions separately?
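
For context on the runtime concern, below is a hedged sketch of what a small, time-aware grid search could look like with scikit-learn's `GridSearchCV` and `TimeSeriesSplit`. The synthetic series, the lag count, the `Ridge` model, and the three-value grid are all assumptions chosen to keep the example short; none of them come from my pipeline.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic stand-in for one cleaned series (hypothetical data).
rng = np.random.default_rng(0)
y = np.sin(np.arange(200) / 10.0) + 0.1 * rng.standard_normal(200)

# Lag features: row t holds y[t], ..., y[t + n_lags - 1]; the target is y[t + n_lags].
n_lags = 5
X = np.column_stack([y[i : len(y) - n_lags + i] for i in range(n_lags)])
target = y[n_lags:]

# Keep the tail as the test set, as in the split described above.
X_train, X_test = X[:-30], X[-30:]
y_train, y_test = target[:-30], target[-30:]

# A deliberately small grid with time-ordered folds: 3 candidates x 3 splits,
# plus one final refit on the whole training part.
search = GridSearchCV(
    estimator=Ridge(),
    param_grid={"alpha": [0.1, 1.0, 10.0]},
    cv=TimeSeriesSplit(n_splits=3),
    scoring="neg_mean_absolute_error",
)
search.fit(X_train, y_train)

best_alpha = search.best_params_["alpha"]
test_mae = float(np.mean(np.abs(search.predict(X_test) - y_test)))
```

The cost scales with the number of candidates times the number of splits, so a grid this small may be affordable even when the pipeline has to run over many dfs.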
