Admission Prediction

Introduction

This project builds a model that predicts a student's chance of admission to a university from their application data. Such predictions can help universities make admission decisions and help students gauge their chances of being accepted to a particular school.

The dataset consists of student admission records from a university and includes the following features:

  • GRE Scores (out of 340)

  • TOEFL Scores (out of 120)

  • University Rating (out of 5)

  • Statement of Purpose and Letter of Recommendation Strength (out of 5)

  • Undergraduate GPA (out of 10)

  • Research Experience (either 0 or 1)

The target variable is the chance of admit, which ranges from 0 to 1.

Kaggle dataset link: https://www.kaggle.com/datasets/mohansacharya/graduate-admissions

Data preprocessing

The data was preprocessed to handle missing values, scale numerical features, and encode categorical features using one-hot encoding.
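The imputation and scaling steps described above might look roughly like the following sketch. It is not the project's actual code: the sample rows are made up, mean imputation is an assumed strategy (the text does not say how missing values were handled), and one-hot encoding is left out because the sketch uses only numeric columns.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical rows with columns GRE, TOEFL, GPA (illustrative values only).
# np.nan marks a missing value.
X = np.array([
    [320.0, 110.0, 8.5],
    [300.0, np.nan, 7.0],
    [340.0, 120.0, 9.5],
    [310.0, 100.0, np.nan],
])

# Assumed strategy: fill each missing entry with its column mean,
# then standardize every column to zero mean and unit variance.
preprocess = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler())
X_scaled = preprocess.fit_transform(X)

print(X_scaled.shape)
print(X_scaled.mean(axis=0).round(6))
```

In a real run the pipeline would be fitted on the training split only and then applied to the test split, so that test-set statistics do not leak into the scaling.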

Model training and evaluation

For this project, linear regression was chosen as the model for prediction. The model was trained using 80% of the data and evaluated on the remaining 20%. The metrics used to evaluate the model were mean absolute error, mean squared error, and R2 score.
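As a minimal sketch of the 80/20 split, training, and evaluation described above, the example below uses scikit-learn on synthetic data. The generated features, the coefficients of the made-up linear relationship, and the random seeds are all assumptions for illustration; they are not the Kaggle data, so the printed metrics will not match the project's reported results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the admissions table (NOT the real dataset).
rng = np.random.default_rng(0)
n_samples = 500
X = np.column_stack([
    rng.uniform(290, 340, n_samples),   # GRE-like score
    rng.uniform(92, 120, n_samples),    # TOEFL-like score
    rng.uniform(6.8, 10.0, n_samples),  # GPA-like value
])

# Made-up linear relationship plus noise, clipped to the [0, 1] target range.
y = 0.002 * X[:, 0] + 0.003 * X[:, 1] + 0.1 * X[:, 2] - 1.05
y = np.clip(y + rng.normal(0, 0.05, n_samples), 0, 1)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit ordinary least squares on the training rows only.
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MAE={mae:.3f}  MSE={mse:.4f}  R2={r2:.2f}")
```

Fixing random_state keeps the split reproducible, which makes the metrics comparable across runs.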

The model achieved a mean absolute error of 0.05, a mean squared error of 0.005, and an R2 score of 0.82 on the test set. A scatter plot of the model's predictions against the true values was also produced to visualize the fit.
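For readers unfamiliar with the three metrics, the sketch below computes them by hand using their standard definitions. The function names and the four sample values are hypothetical and chosen only to make the arithmetic easy to follow; they are unrelated to the results reported above.

```python
def mean_abs_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_sq_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy values for illustration only.
y_true = [0.9, 0.7, 0.5, 0.3]
y_pred = [0.85, 0.6, 0.55, 0.4]

toy_mae = mean_abs_error(y_true, y_pred)
toy_mse = mean_sq_error(y_true, y_pred)
toy_r2 = r2(y_true, y_pred)
print(toy_mae, toy_mse, toy_r2)
```

Because the target lies between 0 and 1, the mean absolute error reads directly as the average miss in admission probability, while the R2 score is the share of the target's variance that the predictions account for.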

Conclusion

The linear regression model performed well on the test set with an R2 score of 0.82. Future work could include exploring alternative regression algorithms or tuning model hyperparameters (for example, the regularization strength of a regularized linear model) to further improve performance. This model can be used to predict the chance of admission to a graduate program based on the input variables.