siddheshwarkoli/Flight-Price-prediction-Regression

Flight Price prediction Regression

PROBLEM CONTEXT

  • The goal of this project is to build a regression model that forecasts flight fares from factors such as airline, route, journey date, and number of stops.
  • Flight ticket prices are highly unpredictable and often change from day to day: the price shown for a flight today may be different for the same flight tomorrow. This makes it difficult for travelers and airlines to forecast costs.

DATASET OVERVIEW

Feature Description

  • Airline
    The airline operating the flight, such as Indigo, Jet Airways, or Air India.

  • Date_of_Journey
    The date on which the passenger's journey starts.

  • Source
    The place from which the passenger's journey starts.

  • Destination
    The place to which the passenger is traveling.

  • Route
    The route the flight takes from the source to the destination.

  • Arrival_Time
    The time at which the passenger reaches the destination.

  • Duration
    The total time the flight takes to travel from source to destination.

  • Total_Stops
    The number of stops the flight makes during the journey.

  • Additional_Info
    Information about meals, the kind of food served, and other amenities.

  • Price
    The price of the flight for the complete journey, including all expenses before boarding. This is the value the model predicts.
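
Several of the columns above are text and need to be converted to numbers before a regression model can use them. The following is a minimal sketch of that step, not the project's actual preprocessing. It assumes that Duration looks like "2h 50m", Total_Stops looks like "non-stop" or "2 stops", and Date_of_Journey uses a day/month/year format. These formats and the helper names are assumptions and should be checked against the real data.

```python
import re
from datetime import datetime


def duration_to_minutes(text):
    """Convert a duration such as '2h 50m', '19h' or '45m' to minutes."""
    hours = re.search(r"(\d+)\s*h", text)
    minutes = re.search(r"(\d+)\s*m", text)
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(minutes.group(1)) if minutes else 0
    return total


def stops_to_int(text):
    """Convert 'non-stop', '1 stop', '2 stops' to an integer count."""
    text = text.strip().lower()
    if text == "non-stop":
        return 0
    return int(text.split()[0])


def journey_date_parts(text, fmt="%d/%m/%Y"):
    """Split a journey date into day, month, and weekday (Monday = 0)."""
    parsed = datetime.strptime(text, fmt)
    return {"day": parsed.day, "month": parsed.month, "weekday": parsed.weekday()}


# Hypothetical row using the assumed formats.
row = {"Date_of_Journey": "24/03/2019", "Duration": "2h 50m", "Total_Stops": "non-stop"}
features = {
    "duration_min": duration_to_minutes(row["Duration"]),
    "stops": stops_to_int(row["Total_Stops"]),
    **journey_date_parts(row["Date_of_Journey"]),
}
print(features)
```

Categorical columns such as Airline, Source, and Destination would still need an encoding step (for example one-hot encoding) before modeling.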

TASK 1

PREPARE A COMPLETE DATA ANALYSIS REPORT ON THE GIVEN DATA.

TASK 2

CREATE A PREDICTIVE MODEL WHICH WILL HELP THE CUSTOMERS TO PREDICT FUTURE FLIGHT PRICES AND PLAN THEIR JOURNEY ACCORDINGLY.

TYPE OF MACHINE LEARNING PROBLEM.

  • It is a regression problem: given the features above, we need to estimate the price.

LIST OF ALGORITHMS USED FOR REGRESSION

  • Linear Regression
  • Support Vector Regressor
  • DecisionTreeRegressor
  • RandomForestRegressor
  • GradientBoostingRegressor
  • XGBRegressor
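
A comparison across these algorithms could be set up roughly as follows. This is a sketch under assumptions: it uses synthetic data in place of the encoded flight features, default hyperparameters, an 80/20 split, and 5-fold cross-validation on the training set, none of which are confirmed by this README. XGBRegressor is left out so the example depends only on scikit-learn and NumPy.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the encoded flight features (assumption, not the real data).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 4))
y = 3000 + 5000 * X[:, 0] + 2000 * X[:, 1] * X[:, 2] + rng.normal(0, 300, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Linear Regression": LinearRegression(),
    "SVR": SVR(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
    # xgboost.XGBRegressor exposes the same fit/score interface if xgboost is installed.
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = {
        "train_r2": model.score(X_train, y_train),  # score() returns R² for regressors
        "test_r2": model.score(X_test, y_test),
        "cv_r2": cross_val_score(model, X_train, y_train, cv=5, scoring="r2").mean(),
    }

for name, scores in results.items():
    print(f"{name:<18} train={scores['train_r2']:.2f} "
          f"test={scores['test_r2']:.2f} cv={scores['cv_r2']:.2f}")
```

Reporting train, test, and cross-validated R² side by side makes overfitting visible as a large gap between the train score and the other two.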

Result

| S.No | Algorithm                | Train R-squared | Test R-squared | Cross Val Score |
|------|--------------------------|-----------------|----------------|-----------------|
| 1    | Linear Regression        | 0.64            | 0.64           | 0.633269        |
| 2    | SVR                      | 0.01            | 0.01           | 0.006415        |
| 3    | SVR Tuning               | 0.61            | 0.61           | 0.613411        |
| 4    | Decision Tree            | 0.96            | 0.67           | 0.667105        |
| 5    | Decision Tree Tuning     | 0.75            | 0.75           | 0.724305        |
| 6    | Random Forest            | 0.95            | 0.79           | 0.793353        |
| 7    | Random Forest Tuning     | 0.89            | 0.82           | 0.809537        |
| 8    | Gradient Boosting        | 0.76            | 0.75           | 0.748531        |
| 9    | Gradient Boosting Tuning | 0.91            | 0.83           | 0.825861        |
| 10   | XGBoosting               | 0.92            | 0.83           | 0.823278        |
| 11   | XGBoosting Tuning        | 0.90            | 0.84           | 0.828162        |
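
R-squared is one minus the ratio of the residual sum of squares to the total sum of squares, so a score of 1.0 is a perfect fit and a score of 0.0 is no better than always predicting the mean price. The sketch below computes it from scratch and then uses a few train and test scores copied from the table to measure the train-test gap, a simple indicator of overfitting. The function name and the choice of rows are for illustration only.

```python
def r_squared(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# (train R², test R²) pairs copied from the results table.
scores = {
    "Decision Tree": (0.96, 0.67),
    "Decision Tree Tuning": (0.75, 0.75),
    "Random Forest": (0.95, 0.79),
    "Random Forest Tuning": (0.89, 0.82),
    "XGBoosting Tuning": (0.90, 0.84),
}

# A larger gap means the model fits the training data much better than unseen data.
gaps = {name: round(train - test, 2) for name, (train, test) in scores.items()}
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<22} gap={gap:.2f}")
```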

Summary

Best Overall Model: Tuned XGBoost

  • It has the highest test R² (0.84) and the highest cross-validation score (0.828), and its train R² of 0.90 is only 0.06 above the test score, so it generalizes well.

Top Contenders:

  • Tuned Gradient Boosting (Test R²: 0.83)
  • Tuned Random Forest (Test R²: 0.82)

Models to Avoid:

  • Untuned SVR (very poor fit: train and test R² of 0.01)
  • Untuned Decision Tree (overfits badly: train R² of 0.96 against test R² of 0.67)

Tuning Matters:

  • For every model that was tuned, tuning improved both the test and cross-validation scores. The test R² gain ranges from 0.01 (XGBoost) to 0.60 (SVR), which shows the value of hyperparameter optimization.
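
This README does not say which search method or parameter grids were used for tuning. As one possible approach, the sketch below tunes a decision tree with scikit-learn's GridSearchCV on synthetic data. The data, the grid values, and the 5-fold setting are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption, not the flight dataset).
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(400, 3))
y = 4000 + 6000 * X[:, 0] + rng.normal(0, 800, size=400)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Untuned tree: grows until every training sample is fit, which tends to overfit.
baseline = DecisionTreeRegressor(random_state=42).fit(X_train, y_train)

# Hypothetical grid that limits tree depth and leaf size.
param_grid = {"max_depth": [2, 3, 4, 6, 8], "min_samples_leaf": [1, 5, 10, 20]}
search = GridSearchCV(DecisionTreeRegressor(random_state=42), param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

print("baseline train/test R2:", baseline.score(X_train, y_train), baseline.score(X_test, y_test))
print("best params:", search.best_params_)
print("tuned train/test R2:", search.score(X_train, y_train), search.score(X_test, y_test))
```

Constraining depth and leaf size trades some training accuracy for better generalization, the same pattern seen in the table between Decision Tree (0.96 train, 0.67 test) and Decision Tree Tuning (0.75 train, 0.75 test).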
