Model Selection And Validation

For this assignment, you are to determine which model predicts best on the Digit Recognition data set that was used in the previous assignment, and to report the selected hyperparameters and the resulting accuracy.

As before, create a PDF of your notebook showing your steps, and include the completed results table described at the end of this assignment. Attach both the Jupyter notebook and the PDF version. Also, be sure to include your name at the start of the notebook and at the top of the PDF.

Specifically, you are to test the following models:

Model: Support Vector Machine
  Hyperparameters: gamma (size of the kernel), C (slack variable)
  Testing range: 10^(-x) for x = -5 to 5
  Notes: use the 'rbf' (radial basis function) kernel

Model: K-nearest neighbors
  Hyperparameter: k (number of neighbors)
  Testing range: 1, 3, 5, 7, 9
  Notes: use the sklearn function

Model: Decision Trees
  Hyperparameter: min_samples_split
  Testing range: 3, 5, 7, 9 (1 was removed)
  Notes: use the defaults for the other hyperparameters

Model: Logistic Regression
  Hyperparameter: C (inverse of the regularization strength; smaller = more regularization)
  Testing range: 10^(-x) for x = -5 to 5
  Notes: with the L1 penalty (Lasso)
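The testing ranges listed above can be written down as search grids before any model is trained. The following is a minimal sketch, not a required layout: the dictionary name `param_grids` and its keys are made up for illustration, the same range is assumed to apply to both gamma and C for the SVM, and the `solver` entry is an assumption added because scikit-learn's default logistic regression solver does not support the L1 penalty.

```python
# Powers of ten from 1e-5 to 1e5 (11 values), i.e. 10^(-x) for x = -5 to 5.
powers_of_ten = [10.0 ** x for x in range(-5, 6)]

# Hypothetical container: one grid per model, keyed by scikit-learn parameter names.
param_grids = {
    # Assumption: gamma and C are both searched over the same range.
    "SVM": {"kernel": ["rbf"], "gamma": powers_of_ten, "C": powers_of_ten},
    "k-NN": {"n_neighbors": [1, 3, 5, 7, 9]},
    "Decision Trees": {"min_samples_split": [3, 5, 7, 9]},
    # Assumption: "liblinear" is chosen here because it supports penalty="l1";
    # "saga" is another solver that does.
    "Logistic Regression": {
        "penalty": ["l1"],
        "solver": ["liblinear"],
        "C": powers_of_ten,
    },
}

for model_name, grid in param_grids.items():
    print(model_name, grid)
```

Each dictionary maps a parameter name to the list of values to try, which is the form GridSearchCV accepts for its param_grid argument.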

Steps are as follows:

  1. Separate your data into training and testing sets. We will use cross-validation over the training set to select the right hyperparameters.

a. Use train_test_split to create a separate training and test set.

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.20)

b. For the training set, you have two choices to perform hyperparameter

selection.

i. Use cross-validation to evaluate each model variant and select the

best hyperparameters (standard practice, most recommended)

ii. Create a hold-out validation set and train on one portion of the data

and use the accuracy on the hold-out validation set to pick the right

hyperparameters (also valid)
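The two choices in step 1b can be compared side by side. The sketch below makes several assumptions that are not part of the assignment: scikit-learn's bundled digits data (load_digits) stands in for the Digit Recognition data set, k-NN with k = 3 is an arbitrary example model, 5 folds and a 25% hold-out fraction are arbitrary choices, and random_state is fixed only so the run is repeatable.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the bundled digits data stands in for the assignment's data set.
X, y = load_digits(return_X_y=True)

# Step 1a: stratified 80/20 split; the test set is not touched again until the end.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.20, random_state=0
)

model = KNeighborsClassifier(n_neighbors=3)  # arbitrary example model

# Choice i: 5-fold cross-validation over the training set only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
cv_accuracy = cv_scores.mean()

# Choice ii: carve a hold-out validation set out of the training set,
# fit on the remainder, and score on the hold-out portion.
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, stratify=y_train, test_size=0.25, random_state=0
)
holdout_accuracy = model.fit(X_fit, y_fit).score(X_val, y_val)

print("cross-validation accuracy:", cv_accuracy)
print("hold-out validation accuracy:", holdout_accuracy)
```

Cross-validation averages over several train/validation partitions, so its estimate depends less on one particular split than the single hold-out score does.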

  2. Steps to turn in for the assignment

a. Train the four models with their default parameters. Report the resulting

accuracy of each model using the default parameters.

b. For each of the four models, find the hyperparameters giving the highest

accuracy on the validation set by performing an exhaustive grid search.

Report the hyperparameter values and accuracy on the validation

set.

i. Consider using sklearn.model_selection.GridSearchCV

ii. For the models with two hyperparameters, you will need to search

both simultaneously to find the optimum combination

c. Now apply the highest-accuracy trained model of each type to the test set. Report the test set accuracy of each model.
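Steps a through c can be sketched for one model as follows. This is an illustration under stated assumptions, not the required solution: load_digits stands in for the Digit Recognition data set, the helper name `tune_and_evaluate` is hypothetical, 5-fold cross-validation is an arbitrary choice, and k-NN is used only because it runs quickly. The last lines show that a grid with two keys is expanded into every combination, which is how GridSearchCV searches two hyperparameters simultaneously.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import (
    GridSearchCV,
    ParameterGrid,
    cross_val_score,
    train_test_split,
)
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the bundled digits data stands in for the assignment's data set.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.20, random_state=0
)


def tune_and_evaluate(estimator, param_grid, cv=5):
    """Hypothetical helper returning the four values needed for one table row."""
    # Step a: cross-validated accuracy with the default hyperparameters.
    default_acc = cross_val_score(estimator, X_train, y_train, cv=cv).mean()
    # Step b: exhaustive grid search, scored by cross-validated accuracy.
    search = GridSearchCV(estimator, param_grid, cv=cv, scoring="accuracy")
    search.fit(X_train, y_train)
    # Step c: the best setting is refit on the full training set by default,
    # so scoring the search object evaluates that refit model on the test set.
    test_acc = search.score(X_test, y_test)
    return default_acc, search.best_score_, search.best_params_, test_acc


knn_row = tune_and_evaluate(KNeighborsClassifier(), {"n_neighbors": [1, 3, 5, 7, 9]})
print("k-NN (default, tuned, params, test):", knn_row)

# Two hyperparameters: every (gamma, C) pair becomes one candidate in the search.
powers_of_ten = [10.0 ** x for x in range(-5, 6)]
svm_grid = {"kernel": ["rbf"], "gamma": powers_of_ten, "C": powers_of_ten}
n_svm_candidates = len(ParameterGrid(svm_grid))
print("SVM candidates:", n_svm_candidates)  # 11 * 11 = 121
```

Passing an SVC instance together with svm_grid to the same helper would run the full two-parameter search; with 121 candidates and 5 folds that is several hundred fits, so expect it to take noticeably longer than the k-NN run.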

Fill the following table with the information.

Model                 Default validation   Tuned validation   Selected           Final test set
                      accuracy             accuracy           hyperparameters    accuracy
--------------------  -------------------  -----------------  -----------------  --------------
SVM
k-NN
Decision Trees
Logistic Regression
