Stephanie Shishis LCR Assignment-1 Completed by StephanieShishis · Pull Request #1 · StephanieShishis/LCR

StephanieShishis · 2026-04-21T20:10:20Z

UofT-DSI | LCR - Assignment 1

What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)

First, I inspected the data by adding code to find the number of rows and columns, the number of predictor variables, and the variable types present. I then pre-processed the data by standardizing it and splitting it into training and testing sets. Lastly, I initialized the KNN model, ran a grid search, and fitted the model.
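The inspection step described above could look roughly like the sketch below. It assumes the data is scikit-learn's built-in wine dataset and that the response column is named "class"; neither is confirmed by this PR description, and the variable names are illustrative.

```python
from sklearn.datasets import load_wine

# Assumption: the assignment uses scikit-learn's wine dataset, with the
# response stored in a column called "class".
wine_df = load_wine(as_frame=True).frame.rename(columns={"target": "class"})

# Row and column counts come straight from the DataFrame shape.
num_rows, num_cols = wine_df.shape

# Every column except the response is treated as a predictor.
num_predictors = num_cols - 1

# Count how many columns there are of each storage type.
dtype_counts = wine_df.dtypes.astype(str).value_counts().to_dict()

print(f"Rows: {num_rows}, columns: {num_cols}")
print(f"Predictors: {num_predictors}")
print(f"Variable types: {dtype_counts}")
```

Printing the values from code keeps the written answers in step with the data if the notebook is rerun.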

What did you learn from the changes you have made?

I learned the importance of standardization for distance-based models and how grid search can be used to find the best hyperparameter.
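A small illustration of why standardization matters for distance-based models, using made-up numbers rather than the assignment data: when one feature is measured in the thousands and another in tenths, the Euclidean distance is driven almost entirely by the large-scale feature until both are rescaled.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: column 0 is in the thousands, column 1 in tenths.
X = np.array([
    [1000.0, 0.1],
    [1010.0, 0.9],
    [1500.0, 0.1],
])

def first_feature_share(a, b):
    """Fraction of the squared Euclidean distance contributed by column 0."""
    squared_diffs = (a - b) ** 2
    return float(squared_diffs[0] / squared_diffs.sum())

# Before scaling, column 0 accounts for nearly all of the distance
# between the first two points.
share_raw = first_feature_share(X[0], X[1])

# After standardizing (mean 0, standard deviation 1 per column),
# column 1 is no longer drowned out.
X_scaled = StandardScaler().fit_transform(X)
share_scaled = first_feature_share(X_scaled[0], X_scaled[1])

print(f"Share of distance from column 0, raw: {share_raw:.3f}")
print(f"Share of distance from column 0, standardized: {share_scaled:.3f}")
```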

Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?

For Question 2, I could have passed a single array to train_test_split instead of two separate arrays. With one array, it would have returned two outputs (train, test). However, I would then have had to separate the predictors and the response variable again before running the grid search.
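The two approaches can be compared side by side. This is a sketch only: the dataset, the "class" column name, test_size, and random_state are assumptions, not values taken from the assignment.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

# Assumption: wine dataset with the response in a "class" column.
wine_df = load_wine(as_frame=True).frame.rename(columns={"target": "class"})

# Approach 1: pass predictors and response separately -> four outputs.
X = wine_df.drop(columns="class")
y = wine_df["class"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=123
)

# Approach 2: pass the whole DataFrame -> two outputs, in (train, test) order.
train_df, test_df = train_test_split(wine_df, test_size=0.25, random_state=123)

# The extra step: split predictors and response again before the grid search.
X_train_alt = train_df.drop(columns="class")
y_train_alt = train_df["class"]

print(f"Training rows: {len(train_df)}, testing rows: {len(test_df)}")
```

With the same random_state, both approaches select the same rows, so the choice is mostly about which form is more convenient downstream.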

Were there any challenges? If so, what issue(s) did you face? How did you overcome it?

My biggest issues were syntax errors and using incorrect variable names. I also had to go back to the live coding scripts to remember the correct formatting.

How were these changes tested?

The changes were tested by running each of the code blocks and ensuring no error was thrown.

A reference to a related issue in your repository (if applicable)

N/A

Checklist

[YES] I can confirm that my changes are working as intended

PatelVishakh

Assignment 1: Complete. Great work! Suggested changes:

When answering questions, rather than commenting after manually reading the output, you should automate it. For example:

# Number of observations (rows)
num_observations = wine_df.shape[0]
print(f"Number of observations: {num_observations}")

Q1)III) The type of variable is categorical. In a data science setting, this question is asking whether the variable is numerical or categorical (integer, continuous, and ordinal are finer distinctions) in order to assess whether classification or regression methods should be used. The complete statement should be: Class is a categorical variable, represented here as integers (0, 1, 2) and stored in our DataFrame as int64.
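One way to check this from code rather than by eye is sketched below. It assumes the wine dataset and a response column named "class"; the conversion to a pandas categorical at the end is optional and is not something the assignment necessarily asks for.

```python
import pandas as pd
from sklearn.datasets import load_wine

# Assumption: wine dataset with the response in a "class" column.
wine_df = load_wine(as_frame=True).frame.rename(columns={"target": "class"})

# A handful of distinct integer labels points to a categorical variable,
# even though the storage type is an integer.
class_values = sorted(wine_df["class"].unique().tolist())
storage_dtype = wine_df["class"].dtype
print(f"Class takes the values {class_values} and is stored as {storage_dtype}")

# Optional: make the categorical nature explicit in the DataFrame.
wine_df["class"] = wine_df["class"].astype("category")
print(f"After conversion the dtype is {wine_df['class'].dtype}")
```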

Q4)I) When using results from other code sections, rather than commenting after manually reading the output, you should automate it, specifically when using the best n_neighbors.
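A minimal sketch of reading the best n_neighbors from the fitted grid search instead of typing it in by hand. The dataset, split settings, search range, and number of folds are all assumptions for illustration; here the scaler is fitted on the training set only, which may differ from the order used in the assignment.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Assumption: wine dataset; test_size and random_state are illustrative.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=123
)

# Fit the scaler on the training data, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Assumed search range: n_neighbors from 1 to 30 with 5-fold cross-validation.
param_grid = {"n_neighbors": list(range(1, 31))}
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
grid.fit(X_train_scaled, y_train)

# Read the best value from the search result rather than hard-coding it.
best_k = grid.best_params_["n_neighbors"]
final_knn = KNeighborsClassifier(n_neighbors=best_k).fit(X_train_scaled, y_train)
test_accuracy = final_knn.score(X_test_scaled, y_test)

print(f"Best n_neighbors: {best_k}")
print(f"Test accuracy: {test_accuracy:.3f}")
```

If the grid or the data changes, best_k updates automatically and the final model stays consistent with the search.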

Vishakh Patel [LS]

assignment-1 completed

1fb9245

PatelVishakh approved these changes Apr 24, 2026


Suggested changes made

eb7fa5f

StephanieShishis wants to merge 2 commits into main from assignment-1
