Data Visualisation and Analytics Assignment 3
Department of Econometrics and Business Statistics, Monash University
Due Date: 24th October 2019 at 1PM
A Implementing kNN classification (10 Marks)
This part of the assignment involves kNN classification of a dataset of 140 bank customers and must be
completed by ALL students. Note that this assignment is based on simulated data and each student has
their own personalised dataset. You must enter your student ID number before downloading your unique
dataset. The data can be downloaded here.
In the dataset, for each customer, data were collected on the following variables:
• Name: Customer name.
• Default: Whether the customer failed to pay back the loan (Default) or successfully paid it back (No Default).
• WeeklyIncome: Income per week.
• EmploymentDuration: Time spent in current job.
• WeeklySpend: Average amount of money spent per week.
• Children: Number of children.
• Age: Customer's age.
• Sample : Whether the customer is in the training sample or test sample.
The objective is to predict on the basis of Weekly Income, Employment Duration, Weekly Spend, Number of
Children and Age whether a customer will default. The training sample can be used for determining a rule
for prediction and the test sample for evaluation. You may assume that the costs of both types of incorrect
prediction are equal. All numerical variables have been standardised by subtracting the mean and dividing
by the standard deviation of the training sample. You do NOT need to standardise the data yourself.
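
The following is a minimal sketch, not a prescribed solution, of the ideas in the paragraph above: scaling with training-sample statistics, kNN prediction by majority vote, and evaluation on a test sample under equal misclassification costs. The function names (standardise, knn_predict, misclassification_rate), the use of Euclidean distance, the use of the sample standard deviation, and the two-variable toy data are all assumptions made for illustration; they do not come from the assignment dataset.

```python
from collections import Counter
from math import dist
from statistics import mean, stdev


def standardise(train_rows, rows):
    """Scale each column of `rows` using the mean and sd of `train_rows`.

    Assumption: the sample standard deviation (n - 1 divisor) is used.
    """
    cols = list(zip(*train_rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [tuple((x - m) / s for x, m, s in zip(row, mus, sds)) for row in rows]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to `query`.

    Assumption: Euclidean distance on the standardised predictors.
    """
    nearest = sorted(zip(train_x, train_y), key=lambda p: dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def misclassification_rate(train_x, train_y, test_x, test_y, k):
    """Share of test points predicted incorrectly (equal costs for both errors)."""
    wrong = sum(
        knn_predict(train_x, train_y, x, k) != y for x, y in zip(test_x, test_y)
    )
    return wrong / len(test_y)


# Hypothetical, already-standardised toy data with two predictors.
train_x = [(-1.0, -1.0), (-0.8, -1.2), (-1.1, -0.7),
           (1.0, 1.0), (0.9, 1.2), (1.2, 0.8)]
train_y = ["Default"] * 3 + ["No Default"] * 3
test_x = [(-0.9, -0.9), (1.1, 0.9)]
test_y = ["Default", "No Default"]

rate = misclassification_rate(train_x, train_y, test_x, test_y, k=3)
```

With two classes, an odd k avoids tied votes. Note that standardise is shown only to illustrate that test observations are scaled with training-sample statistics; the assignment data are already standardised.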

Once you have downloaded your data, complete the assignment by g