# Model Selection And Validation

For this assignment, you are to determine which model is best for prediction and report the selected hyperparameters and the resulting accuracy for the Digit Recognition data set that was used in the previous assignment.

As before, create a PDF of your notebook showing your steps, and include the completed results table described below. Attach both the Jupyter notebook and the PDF version. Also, be sure to include your name at the start of the notebook and at the top of the PDF.

Specifically, you are to test the following models:

| Model | Hyperparameter | Testing range | Notes |
| --- | --- | --- | --- |
| Support Vector Machine | gamma (size of the kernel); C (slack variable) | $10^{-x}$ for x = -5 to 5 | Use the 'rbf' (radial basis function) kernel. |
| K-nearest neighbors | k (number of neighbors) | 1, 3, 5, 7, 9 | Use the sklearn function. |
| Decision Trees | min_samples_split | 3, 5, 7, 9 (1 was removed) | Use the defaults for the other hyperparameters. |
| Logistic Regression | C (inverse of the regularization strength; smaller = more regularization) | $10^{-x}$ for x = -5 to 5 | Use the L1 penalty (Lasso). |
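The testing ranges in the table can be written down as search grids before any model is trained. The following is a minimal sketch in plain Python. The dictionary layout and the names `powers_of_ten`, `param_grids`, and `grid_size` are illustrative assumptions, and it assumes that the SVM's gamma and C both use the same $10^{-x}$ range.

```python
# Candidate values 10^(-x) for x = -5, ..., 5: eleven values from 1e5 down to 1e-5.
powers_of_ten = [10.0 ** (-x) for x in range(-5, 6)]

# Hypothetical layout: one grid per model, keyed by scikit-learn parameter names.
# Assumption: gamma and C for the SVM share the same range.
param_grids = {
    "svm": {"kernel": ["rbf"], "gamma": powers_of_ten, "C": powers_of_ten},
    "knn": {"n_neighbors": [1, 3, 5, 7, 9]},
    "tree": {"min_samples_split": [3, 5, 7, 9]},
    "logreg": {"penalty": ["l1"], "C": powers_of_ten},
}


def grid_size(grid):
    """Count the hyperparameter combinations an exhaustive search would try."""
    size = 1
    for values in grid.values():
        size *= len(values)
    return size
```

Calling `grid_size(param_grids["svm"])` shows why the two-parameter SVM search is the most expensive one: every gamma value is paired with every C value.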

Steps are as follows:

1. Separate your data into training and testing sets. We will use cross-validation over the training set to select the right hyperparameters.

a. Use train_test_split to create a separate training and test set.

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.20)

b. For the training set, you have two choices for hyperparameter selection.

i. Use cross-validation to evaluate each model variant and select the best hyperparameters (standard practice, most recommended).

ii. Create a hold-out validation set: train on one portion of the data and use the accuracy on the hold-out validation set to pick the right hyperparameters (also valid).
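The split and the two selection options can be sketched as follows. This is an illustration under stated assumptions, not the required solution: scikit-learn's bundled `load_digits` data stands in for the course's Digit Recognition data set, and the 5 folds, the 25% hold-out fraction, the fixed `random_state`, and the k-NN model with k = 3 are arbitrary choices made for the example. Note that `stratify` takes the label array, not a boolean.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: load_digits is a stand-in for the course's Digit Recognition data.
X, y = load_digits(return_X_y=True)

# Step 1a: stratified 80/20 split; the test set is set aside until the very end.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.20, random_state=0
)

# One model variant to score (k = 3 is an arbitrary example value).
model = KNeighborsClassifier(n_neighbors=3)

# Option i: 5-fold cross-validation over the training set only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
cv_accuracy = cv_scores.mean()

# Option ii: carve a hold-out validation set out of the training set.
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, stratify=y_train, test_size=0.25, random_state=0
)
holdout_accuracy = model.fit(X_fit, y_fit).score(X_val, y_val)

print(f"cross-validated accuracy: {cv_accuracy:.3f}")
print(f"hold-out accuracy:        {holdout_accuracy:.3f}")
```

Either number would be compared across hyperparameter values to pick the best variant. Cross-validation uses every training sample for both fitting and validation, which is why it usually gives the less noisy estimate.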

2. Steps to turn in for the assignment

a. Train the four models with their default parameters. Report the resulting validation accuracy of each model using the default parameters.

b. For each of the four models, find the hyperparameters giving the highest accuracy on the validation set by performing an exhaustive grid search. Report the hyperparameter values and the accuracy on the validation set.

i. Consider using sklearn.model_selection.GridSearchCV.

ii. For the models with two hyperparameters, you will need to search both simultaneously to find the optimum combination.

c. Now apply the highest-accuracy trained models to the test set. Report the test accuracy of each model.
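Steps a through c above can be sketched with one helper that is applied to each model in turn. The code below is a hedged illustration, not a reference solution. Its assumptions: `load_digits` stands in for the course data; the helper name `evaluate`, the `candidates` dictionary, and the 5-fold setting are made up for the example; "default parameters" is interpreted as the model with the table's fixed settings and untuned hyperparameters; and `solver="liblinear"` is chosen because the L1 penalty needs a solver that supports it, in a scikit-learn version that accepts `penalty="l1"`.

```python
from sklearn.base import clone
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: load_digits is a stand-in for the course's Digit Recognition data.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.20, random_state=0
)

powers_of_ten = [10.0 ** (-x) for x in range(-5, 6)]

# Hypothetical registry: model name -> (estimator, search grid).
candidates = {
    # Two hyperparameters in one grid: GridSearchCV tries every (gamma, C) pair.
    "svm": (SVC(kernel="rbf"), {"gamma": powers_of_ten, "C": powers_of_ten}),
    "knn": (KNeighborsClassifier(), {"n_neighbors": [1, 3, 5, 7, 9]}),
    "tree": (DecisionTreeClassifier(random_state=0),
             {"min_samples_split": [3, 5, 7, 9]}),
    "logreg": (LogisticRegression(penalty="l1", solver="liblinear"),
               {"C": powers_of_ten}),
}


def evaluate(model, grid, X_train, y_train, X_test, y_test, cv=5):
    """Return default and tuned validation accuracy, best params, test accuracy."""
    # Step a: cross-validated accuracy of the untuned model.
    default_acc = cross_val_score(clone(model), X_train, y_train, cv=cv).mean()
    # Step b: exhaustive grid search, scored by cross-validated accuracy.
    search = GridSearchCV(model, grid, cv=cv, scoring="accuracy")
    search.fit(X_train, y_train)
    # Step c: the best model, refit on the full training set, scored on the test set.
    return {
        "default_val_acc": default_acc,
        "tuned_val_acc": search.best_score_,
        "best_params": search.best_params_,
        "test_acc": search.score(X_test, y_test),
    }


# Only the two fast models are run here to keep the sketch quick;
# looping over all of `candidates` covers the full assignment.
results = {
    name: evaluate(*candidates[name], X_train, y_train, X_test, y_test)
    for name in ("knn", "tree")
}
for name, row in results.items():
    print(name, row)
```

The test set is touched exactly once per model, after the hyperparameters are fixed, so the reported test accuracy is not biased by the search.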

Fill in the following table with the information.

| Model | Default validation accuracy | Tuned validation accuracy | Selected hyperparameters | Final test set accuracy |
| --- | --- | --- | --- | --- |
| SVM | | | | |
| k-NN | | | | |
| Decision Trees | | | | |
| Logistic Regression | | | | |
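Once the accuracies have been measured, the table rows can be generated rather than typed by hand. The helper below is a small sketch in plain Python. The function names `format_row` and `results_table` and the three-decimal rounding are assumptions made for illustration, and no accuracy values are supplied here because they must come from your own runs.

```python
HEADER = [
    "Model",
    "Default validation accuracy",
    "Tuned validation accuracy",
    "Selected hyperparameters",
    "Final test set accuracy",
]


def format_row(model_name, default_acc, tuned_acc, best_params, test_acc):
    """Format one model's measured results as a table row."""
    # Sort the hyperparameter names so the output is stable between runs.
    params = ", ".join(f"{k}={v}" for k, v in sorted(best_params.items()))
    return (f"| {model_name} | {default_acc:.3f} | {tuned_acc:.3f} "
            f"| {params} | {test_acc:.3f} |")


def results_table(rows):
    """Build the full table from (name, default, tuned, params, test) tuples."""
    lines = ["| " + " | ".join(HEADER) + " |", "|" + " --- |" * len(HEADER)]
    lines += [format_row(*row) for row in rows]
    return "\n".join(lines)
```

Passing one tuple per model, in the order SVM, k-NN, Decision Trees, Logistic Regression, yields a table with the same columns as the one above.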
