
#### Event Type

Research Presentation

#### Academic Department

Mathematics and Statistics

#### Location

Dana Science Building, 2nd floor

#### Start Date

14-4-2023 1:30 PM

#### End Date

14-4-2023 3:00 PM

#### Description

Under the direction of Dr. Giancarlo Schrementi

Predicting loan default is an important problem for managing risk at banks. Banks have been key players in the lending market since the Industrial Revolution, and they have used collateral to minimize their risk. Loan default matters to banks because it can lead to a bank's insolvency and can have a broader impact on the economy. Hence, managing the risk of loan default is important for promoting financial stability and economic growth. Previous studies in this field have predicted the probability of loan default using logistic regression, machine learning models, and Python-based models. This study examines the use of a logistic regression model to predict the probability that a customer defaults on a loan. Logistic regression is a statistical method that predicts a binary outcome, such as yes or no, from prior observations in a data set. It is used here because loan default is a binary prediction problem (a loan either defaults or it does not), and logistic regression is commonly used for binary prediction. The dataset is taken from Kaggle, an open dataset platform, and contains a wide assortment of features, half of them categorical and half quantitative. The class proportions are highly unbalanced, as most customers do not default. The methods include exploratory data analysis, data wrangling and cleansing, feature selection, and evaluation of the resulting model.
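To make the modeling idea concrete, the following is a minimal sketch of logistic regression for an unbalanced binary outcome. It is not the study's code or data: the synthetic single-feature dataset, the function names (`fit_logistic`, `predict_proba`, `make_synthetic`), the learning rate, and the inverse-frequency class weighting are all assumptions chosen for illustration. Class weighting is one common way to handle unbalanced classes; the abstract does not say which approach the study used.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function, maps any real z into (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500, balanced=True):
    """Fit weights and bias by batch gradient descent on (weighted) log-loss."""
    n, d = len(X), len(X[0])
    n_pos = sum(y)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # Assumption: weight each class inversely to its frequency so the rare
    # "default" class is not swamped by the majority class.
    w_pos = n / (2.0 * n_pos) if balanced else 1.0
    w_neg = n / (2.0 * n_neg) if balanced else 1.0
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = (p - yi) * (w_pos if yi == 1 else w_neg)
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    # Estimated probability that the observation belongs to class 1 (default).
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def make_synthetic(n=400, default_rate=0.1, seed=0):
    # Hypothetical data: one standardized feature (think of a debt ratio)
    # that tends to be higher for the rare defaulting customers.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < default_rate else 0
        center = 1.5 if label else -0.5
        X.append([rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    w, b = fit_logistic(X, y)
    print("weight:", w[0], "bias:", b)
    print("P(default | x=2.0):", predict_proba(w, b, [2.0]))
```

With unbalanced classes, plain accuracy can be misleading because always predicting "no default" already scores well, so a model like this is usually judged with measures such as recall and precision on the default class.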

Predicting the Loan Default using Logistic Regression Model
