
Can't understand why the prediction interval coverage is often 1.0 across various models! #885

Open

@clevilll

Description

I'm experimenting with conformal prediction for in-sample forecasting on high-frequency time data (5-minute epochs), retrieving prediction intervals with predict_interval(). The univariate (1D) time series has shape (8640, 1), i.e. a single value column over 8640 time steps. I have applied different regressors to forecast the tail of the series (the last 7 days). I'm using ForecasterRecursive() according to the discussion with @JoaquinAmatRodrigo in issue #43.

from datetime import datetime
from skforecast.recursive import ForecasterRecursive  # skforecast >= 0.14

runtime_train = []
runtime_predict = []
for i in range(len(name_model)):
    forecaster = ForecasterRecursive(
        regressor=models[i],
        lags=lags
    )

    # Fit the forecaster and time the training
    start_time = datetime.now()
    forecaster.fit(y=data_train_long[name_columns])
    runtime_t = datetime.now() - start_time
    runtime_train.append(runtime_t)

    # Point predictions over a horizon as long as the training set
    y_train_pred = forecaster.predict(steps=len(data_train_long), last_window=None)

    # Prediction intervals over the test period (the last 7 days)
    last_window = data_test_long[name_columns].tail(steps)
    start_time = datetime.now()
    y_test_pred = forecaster.predict_interval(
        steps=steps,
        interval=[1, 99],   # 1st and 99th percentiles -> 98% nominal interval
        n_boot=100,
        last_window=last_window
    )
    y_test_pred.index = data_test_long.index
    y_test_pred.to_excel(f"{result_dir}/y_test_pred_in_{name_model[i]}.xlsx")
    y_train_pred.to_excel(f"{result_dir2}/y_train_pred_in_{name_model[i]}.xlsx")
    runtime_p = datetime.now() - start_time
    runtime_predict.append(runtime_p)

    print(f"{name_model[i]} is done.")
    print("*" * 30)

I used the following to calculate the coverage and compare the models' forecast results:

from mapie.metrics import regression_coverage_score

coverage = regression_coverage_score(
    data_test[name_columns],
    y_test_pred - std * 1.96,
    y_test_pred + std * 1.96
)
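
(For comparison, computing the coverage directly from the interval columns that predict_interval() returns, instead of from pred ± 1.96*std, would look roughly like this sketch; it assumes name_columns is a single column name and that y_test_pred has 'lower_bound'/'upper_bound' columns sharing their index with data_test_long.)

# Fraction of test points that fall inside [lower_bound, upper_bound]
y_true = data_test_long[name_columns]
inside = (y_true >= y_test_pred["lower_bound"]) & (y_true <= y_test_pred["upper_bound"])
print(f"Coverage from predict_interval bounds: {inside.mean():.3f}")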

The following are the results; the coverage is mostly 1.0 (maybe rounded up when displayed in the box):

[image: table of coverage results per model]

Yesterday, I saw a LinkedIn post that was on point for me, from @JoaquinAmatRodrigo of the skforecast group, about some improvements achieved for better coverage.

I checked the comments and the example notebook and noticed the following coverage calculation:

import numpy as np

# Predicted interval coverage (on test data)
# ==============================================================================
def empirical_coverage(y, lower_bound, upper_bound):
    """
    Calculate coverage of a given interval
    """
    return np.mean(np.logical_and(y >= lower_bound, y <= upper_bound))

coverage = empirical_coverage(
    y = data.loc[end_validation:, 'OT'],
    lower_bound = predictions["lower_bound"],
    upper_bound = predictions["upper_bound"]
)
print(f"Predicted interval coverage: {round(100 * coverage, 2)} %")

# Area of the interval
# ==============================================================================
area = (predictions["upper_bound"] - predictions["lower_bound"]).sum()
print(f"Area of the interval: {round(area, 2)}")

I also saw a post here where they use a different wrapper (statsforecast) and a different way to calculate coverage, as follows:

#!pip install statsforecast
import pandas as pd
from statsforecast import StatsForecast
from statsforecast.models import SeasonalNaive
from statsforecast.utils import ConformalIntervals

train = pd.read_csv('https://auto-arima-results.s3.amazonaws.com/M4-Hourly.csv')
test = pd.read_csv('https://auto-arima-results.s3.amazonaws.com/M4-Hourly-test.csv').rename(columns={'y': 'y_test'})
n_series = 8
uids = train['unique_id'].unique()[:n_series] # select first n_series of the dataset
train = train.query('unique_id in @uids')
test = test.query('unique_id in @uids')

models = [SeasonalNaive(25)]
sf = StatsForecast(
    models=models,
    freq=1,
    n_jobs=-1)

levels = [80] # confidence levels of the prediction intervals
forecasts = sf.forecast(df=train, h=48, level=levels)

plot_test = test.merge(forecasts, how='left', on=['unique_id', 'ds'])
train_test = pd.concat([train, test.rename(columns={'y_test':'y'})])

# Coverage: fraction of test points inside the 80% interval
((plot_test['SeasonalNaive-lo-80'] <= plot_test['y_test']) &
 (plot_test['y_test'] <= plot_test['SeasonalNaive-hi-80'])).mean()
# 0.9192708333333334

Can you help me diagnose what is wrong, and why most of the models I used with skforecast return coverage = 1.000 even when the prediction intervals are not good and don't follow the dynamics of the test data?
I'm a bit confused.
Thanks in advance.
