Stephanie Shishis LCR Assignment-1 Completed #1

Open
StephanieShishis wants to merge 2 commits into main from assignment-1

@StephanieShishis

Owner

UofT-DSI | LCR - Assignment 1

What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)

I first inspected the data by adding code to find the number of rows and columns, the number of predictor variables, and the variable types present. I then pre-processed the data by standardizing it and splitting it into training and testing sets. Lastly, I initialized the model, ran a grid search, and fitted the KNN model.

What did you learn from the changes you have made?

I learned the importance of standardization for distance-based models and how grid search can be used to find the best hyperparameter value.
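A minimal sketch of why standardization matters for a distance-based model such as KNN. It assumes scikit-learn's built-in wine data (`load_wine`) as a stand-in for the assignment's dataset, and the choice of `n_neighbors=5` and `cv=5` is purely illustrative, not taken from the assignment.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's wine data stands in for the assignment's dataset.
X, y = load_wine(return_X_y=True)

# KNN on the raw features: columns with large numeric ranges dominate the distances.
knn_raw = KNeighborsClassifier(n_neighbors=5)

# KNN on standardized features: every column contributes on a comparable scale.
# The pipeline refits the scaler on the training part of each CV fold.
knn_scaled = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))

raw_accuracy = cross_val_score(knn_raw, X, y, cv=5).mean()
scaled_accuracy = cross_val_score(knn_scaled, X, y, cv=5).mean()

print(f"Mean CV accuracy without scaling: {raw_accuracy:.3f}")
print(f"Mean CV accuracy with scaling:    {scaled_accuracy:.3f}")
```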

Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?

For Question 2, I could have passed one array to train_test_split instead of two separate arrays. With a single array, it would have returned two outputs (train, test). However, I would then have had to separate the predictors and the response variable again before running the grid search.
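The two approaches can be compared side by side. This is a sketch under assumptions: `wine_df` is rebuilt here from scikit-learn's `load_wine`, the response column is renamed to `class` for illustration, and `test_size` and `random_state` are arbitrary values rather than the ones used in the assignment.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

# Assumption: the assignment's DataFrame resembles scikit-learn's wine data,
# with the response column named "class" (renamed here from "target").
wine_df = load_wine(as_frame=True).frame.rename(columns={"target": "class"})

# Option A: pass predictors and response separately; four outputs come back.
X = wine_df.drop(columns="class")
y = wine_df["class"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=123
)

# Option B: pass the whole DataFrame; two outputs come back (train first, then test).
train_df, test_df = train_test_split(wine_df, test_size=0.25, random_state=123)

# The predictors and response still have to be separated afterwards.
X_train_b = train_df.drop(columns="class")
y_train_b = train_df["class"]

print(f"Option A training rows: {len(X_train)}")
print(f"Option B training rows: {len(X_train_b)}")
```

With the same `random_state` and the same number of rows, both options select the same rows, so the choice is mostly about convenience.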

Were there any challenges? If so, what issue(s) did you face? How did you overcome it?

My biggest challenges were syntax errors and using incorrect variable names. I also had to go back to the live coding scripts to remember the correct formatting.

How were these changes tested?

The changes were tested by running each of the code blocks and ensuring no error was thrown.

A reference to a related issue in your repository (if applicable)

N/A

Checklist

  • [YES] I can confirm that my changes are working as intended

@PatelVishakh left a comment


Assignment 1: Complete. Great work! Suggested changes:

When answering questions, automate the answer in code rather than commenting it in after manually reading the output. For example:

# Number of observations (rows)
num_observations = wine_df.shape[0]
print(f"Number of observations: {num_observations}")

Q1)III) The variable type is categorical. In a data science setting, this question asks whether the variable is numerical or categorical (integer, continuous, and ordinal are finer distinctions) in order to assess whether classification or regression methods should be used. The complete statement should be: Class is a categorical variable, represented here as integers (0, 1, 2) and stored in our DataFrame as int64.
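One way to back up that statement in code is to look at both the stored dtype and the distinct values of the response column. This is only a sketch: it assumes the data matches scikit-learn's `load_wine`, and the column name `class` is a placeholder for whatever the assignment's DataFrame uses.

```python
from sklearn.datasets import load_wine

# Assumption: the response column is called "class" (renamed here from "target").
wine_df = load_wine(as_frame=True).frame.rename(columns={"target": "class"})

# The storage type is an integer dtype, but that alone does not make it numerical.
class_dtype = wine_df["class"].dtype

# A small set of distinct labels signals a categorical variable.
class_values = sorted(wine_df["class"].unique().tolist())

print(f"Stored dtype of class: {class_dtype}")
print(f"Distinct class labels: {class_values}")

# Optionally make the categorical nature explicit in pandas.
class_as_category = wine_df["class"].astype("category")
print(f"Categories: {list(class_as_category.cat.categories)}")
```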

Q4)I) When using results from other code sections, automate it rather than commenting the value in after manually reading the output. Specifically, use the best n_neighbors in code.
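A hedged sketch of what that could look like, reading the best value straight from the fitted grid search instead of typing it in by hand. The dataset (`load_wine`), the parameter range, `cv=10`, and the variable names are assumptions for illustration, not the assignment's actual code.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's wine data stands in for the assignment's dataset.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=123, stratify=y
)

# Fit the scaler on the training set only, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Search over candidate values of n_neighbors with cross-validation.
param_grid = {"n_neighbors": list(range(1, 31))}
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10)
grid_search.fit(X_train_scaled, y_train)

# Read the winning value programmatically rather than copying it from the output.
best_k = grid_search.best_params_["n_neighbors"]

final_knn = KNeighborsClassifier(n_neighbors=best_k)
final_knn.fit(X_train_scaled, y_train)
test_accuracy = final_knn.score(X_test_scaled, y_test)

print(f"Best n_neighbors: {best_k}")
print(f"Test accuracy: {test_accuracy:.3f}")
```

If the data or the split changes, `best_k` updates automatically and the final model stays consistent with the search.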

Vishakh Patel [LS]
