Airline & Airport Passenger Choice Modeling

This project models passenger choice behavior for airlines and airports. It is structured as an end-to-end demonstration of data analysis, feature engineering, and modeling, and is intended to showcase academic and practical skills in data science and machine learning.

Project Overview

Goal: Analyze and model the factors influencing passenger choices between different airlines and airports, using real-world survey data.
Approach: The project follows a typical data science workflow: data cleaning, exploratory data analysis (EDA), feature engineering, and predictive modeling.
Implementation: All work is performed in Jupyter notebooks for clarity and reproducibility.

Dataset

The dataset (airport.xlsx) contains survey responses from airline passengers, including demographic information, travel details, and their choices of airline and airport.
Key columns: Airport, Airline, Age, Gender, Nationality, TripPurpose, TripDuration, ProvinceResidence, GroupTravel, Destination, and more.
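
As a minimal sketch of how the dataset might be loaded and sanity-checked, the snippet below reads the workbook with pandas and verifies that the key columns listed above are present. The helper names `missing_columns` and `load_survey` are hypothetical and are not taken from the notebooks; the real file may contain more columns than the ones checked here.

```python
import pandas as pd

# Columns named in this README; the real file may contain more.
EXPECTED_COLUMNS = [
    "Airport", "Airline", "Age", "Gender", "Nationality", "TripPurpose",
    "TripDuration", "ProvinceResidence", "GroupTravel", "Destination",
]


def missing_columns(df, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from df."""
    return [col for col in expected if col not in df.columns]


def load_survey(path="airport.xlsx"):
    """Read the survey workbook and fail early if key columns are missing."""
    df = pd.read_excel(path)  # needs an Excel engine such as openpyxl
    absent = missing_columns(df)
    if absent:
        raise ValueError(f"Missing expected columns: {absent}")
    return df
```

Calling `load_survey()` from the repository root should return a DataFrame, assuming airport.xlsx sits next to the notebooks.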

Methodology & Key Steps

Data Cleaning
- Handle missing values and remove irrelevant or incomplete records.
- Filter out 'Other' categories to focus on main analysis groups.
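
The two cleaning steps above could look roughly like the following. This is a sketch under assumptions: it treats 'Other' as a literal string label in the choice columns (the actual survey may use numeric codes instead), and `clean_choices` is a hypothetical helper name.

```python
import pandas as pd


def clean_choices(df, choice_cols=("Airport", "Airline"), other_label="Other"):
    """Drop rows with a missing choice, then rows whose choice is 'Other'."""
    cols = list(choice_cols)
    # Keep only respondents who answered every choice question.
    out = df.dropna(subset=cols)
    # Flag rows where any choice column carries the catch-all label.
    is_other = out[cols].eq(other_label).any(axis=1)
    return out.loc[~is_other].reset_index(drop=True)
```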
Exploratory Data Analysis (EDA)
- Summarize data distributions, check for imbalances, and visualize key variables.
- Calculate and interpret choice probabilities for different passenger segments.
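
One possible way to carry out these EDA checks with pandas is shown below: category shares reveal class imbalance, and a row-normalised crosstab gives the probability of each choice within a passenger segment. The function names are illustrative and are not the ones used in the notebooks.

```python
import pandas as pd


def class_shares(df, col):
    """Share of rows per category; a quick check for class imbalance."""
    return df[col].value_counts(normalize=True)


def choice_table(df, segment_col, choice_col):
    """Row-normalised crosstab, i.e. P(choice | segment)."""
    return pd.crosstab(df[segment_col], df[choice_col], normalize="index")
```

Each row of the table returned by `choice_table` sums to 1, so rows can be compared directly across segments such as TripPurpose.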
Feature Engineering
- Create new features (e.g., binary flags for Korean vs. foreign airlines, categorical encodings).
- Group and transform variables to prepare for modeling.
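
The feature-engineering steps above might be sketched as follows. The set of airlines counted as Korean is passed in by the caller because this README does not list it, and the column name `KoreanAirline` is an assumption made for illustration.

```python
import pandas as pd


def add_airline_flag(df, korean_airlines, airline_col="Airline"):
    """Add KoreanAirline = 1 if the airline is in korean_airlines, else 0."""
    out = df.copy()  # leave the caller's frame untouched
    out["KoreanAirline"] = out[airline_col].isin(korean_airlines).astype(int)
    return out


def one_hot(df, cols):
    """One-hot encode categorical columns, dropping the first level of each."""
    return pd.get_dummies(df, columns=list(cols), drop_first=True, dtype=int)
```

Dropping the first level of each encoded variable avoids perfectly collinear dummy columns, which matters when coefficients of a regression-type model are interpreted.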
Modeling
- Prepare data for classification models (e.g., logistic regression).
- Demonstrate model training, evaluation, and interpretation of results.
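
A minimal sketch of the modeling step, assuming scikit-learn is available and that the features have already been encoded numerically. It is not the notebooks' actual pipeline: the split ratio, the seed, and the helper name `fit_choice_model` are choices made here for illustration, and the sketch assumes a binary target.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def fit_choice_model(X, y, test_size=0.25, seed=0):
    """Fit a logistic regression and report held-out accuracy and coefficients."""
    # Stratify so both splits keep the same class balance as the full data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    # For a binary target, coef_ has one row; label it with feature names.
    coefs = pd.Series(model.coef_[0], index=X.columns)
    return model, accuracy, coefs
```

A positive coefficient means the feature raises the log-odds of the positive class, which is one simple route to interpreting the fitted model.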

Rigorous Approach to Groupings and Choice Analysis

A key aspect of this project is the careful, data-driven approach to defining and analyzing groupings (e.g., airline types, passenger segments):

Explicit Group Definitions:
Groupings such as “Korean vs. Foreign Airlines” are defined based on domain knowledge and project goals, with clear code and comments explaining the rationale.
Flexible, Reusable Analysis Functions:
The project includes modular functions (e.g., choiceProb) that allow for flexible analysis of choice probabilities across any grouping variable, making the workflow adaptable to different questions and datasets.
Data-Driven Grouping Decisions:
Groupings are informed by exploratory analysis: choice probabilities are calculated and visualized before group definitions are finalized, so that feature engineering is grounded in the data.
Transparency and Interpretability:
All grouping logic is clearly documented, and outputs are designed to be interpretable, supporting both reproducibility and clear communication of results.
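
To illustrate what a reusable choice-probability helper could look like, here is a hedged sketch. It is not the project's `choiceProb` function, whose signature is not shown in this README; `choice_prob` and `compare_groupings` are hypothetical stand-ins. The sketch returns group sizes alongside probabilities so that small groups can be spotted before a grouping is finalized.

```python
import pandas as pd


def choice_prob(df, group_col, choice_col):
    """Long-format P(choice | group) with respondent counts per group."""
    counts = (
        df.groupby([group_col, choice_col]).size().rename("n").reset_index()
    )
    # Total respondents in each group, repeated on every row of that group.
    counts["group_n"] = counts.groupby(group_col)["n"].transform("sum")
    counts["prob"] = counts["n"] / counts["group_n"]
    return counts


def compare_groupings(df, group_cols, choice_col):
    """Apply choice_prob to several candidate grouping variables at once."""
    return {col: choice_prob(df, col, choice_col) for col in group_cols}
```

Because the grouping column is a parameter, the same function can be pointed at Gender, Nationality, TripPurpose, or any engineered flag without modification.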

This rigorous approach demonstrates not only technical proficiency but also a strong analytical mindset and attention to best practices in data science.

Skills Demonstrated

Data cleaning and preprocessing with pandas
Exploratory data analysis and visualization
Feature engineering and transformation
Handling missing data and categorical variables
Building and evaluating machine learning models
Writing modular, readable, and reproducible code in Jupyter notebooks

How to Use

Install dependencies:
```
pip install -r requirements.txt
```
Open the Jupyter notebooks (Airline_Modeling.ipynb, Airport_Modeling.ipynb, or AirlineModel_refactored.ipynb).
Run cells step by step to follow the workflow from raw data to modeling and results.

Notes

This project is intended as a demonstration of academic and practical skills in data science and machine learning, not as a production system.
The code is modular and well-commented for clarity and learning purposes.

For questions or collaboration, please contact the project author.
