ISYE 6740 Homework 3

Total 100 points.

1. Basic optimization. (30 points.)

Consider a simplified logistic regression problem. Given m training samples (x_i, y_i), i = 1, ..., m, the data x_i ∈ R (note that we only have one feature for each sample) and y_i ∈ {0, 1}. To fit a logistic regression model for classification, we solve the following optimization problem, where θ ∈ R is a parameter we aim to find:

    \max_{\theta} \; \ell(\theta),    (1)

where the log-likelihood function

    \ell(\theta) = \sum_{i=1}^{m} \left\{ -\log\left(1 + \exp\{-\theta x_i\}\right) + (y_i - 1)\,\theta x_i \right\}.

(a) (10 points) Show a step-by-step mathematical derivation for the gradient of the cost function ℓ(θ) in (1) and write pseudo-code for performing gradient descent to find the optimizer θ. This is essentially what the training procedure does. (Pseudo-code means you write down the steps of the algorithm, not necessarily in any specific programming language.)

(b) (10 points) Present a stochastic gradient descent algorithm to solve the training of the logistic regression problem (1).

(c) (10 points) We will show that the training problem in basic logistic regression is concave. Derive the Hessian matrix of ℓ(θ) and, based on this, show that the training problem (1) is concave (note that in this case, since we only have one feature, the Hessian matrix is just a scalar). Explain why the problem can be solved efficiently and why gradient descent achieves a unique global optimizer, as we discussed in class.
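For orientation on parts (a) and (b), a minimal sketch of the two training loops in Python/NumPy is below. The step size eta, the iteration counts, and the sample-by-sample update schedule are illustrative choices of ours, not given in the problem; the update direction uses the gradient dℓ/dθ = Σ_i x_i (y_i − σ(θ x_i)), with σ(z) = 1/(1 + e^{−z}), which is the form the derivation in (a) should lead to. Note that (1) is a maximization, so the loops perform gradient ascent.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def gradient(theta, x, y):
        # dl/dtheta = sum_i x_i * (y_i - sigmoid(theta * x_i))
        return np.sum(x * (y - sigmoid(theta * x)))

    def gradient_ascent(x, y, eta=0.01, n_iters=1000):
        # full-batch updates; we *add* the gradient since we maximize l(theta)
        theta = 0.0
        for _ in range(n_iters):
            theta = theta + eta * gradient(theta, x, y)
        return theta

    def stochastic_gradient_ascent(x, y, eta=0.01, n_epochs=100, seed=0):
        # same update, but using one randomly chosen sample at a time
        rng = np.random.default_rng(seed)
        theta = 0.0
        for _ in range(n_epochs):
            for i in rng.permutation(len(x)):
                theta = theta + eta * x[i] * (y[i] - sigmoid(theta * x[i]))
        return theta

A fixed step size is the simplest choice here; for the stochastic variant, a decaying step size (e.g., proportional to 1/t) is the textbook refinement.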
2. Comparing Bayes, logistic, and KNN classifiers. (30 points)

In lectures we learned three different classifiers. This question asks you to implement and compare them. We suggest using Scikit-learn, a commonly used and powerful Python library with various machine learning tools, but you may also use a similar library in another language of your choice to perform the tasks.

Part One (Divorce classification/prediction). (20 points)

This dataset is about participants who completed a personal information form and a divorce predictors scale. The data is a modified version of the dataset publicly available at https://archive.ics.uci.edu/ml/datasets/Divorce+Predictors+data+set (noise has been injected so you will not replicate the results on the UCI website). There are 170 participants and 54 attributes (or predictor variables), all real-valued. The dataset is in marriage.csv. The last column of the CSV file is the label y (1 means 'divorce', 0 means 'no divorce'). Each column is one feature (predictor variable), and each row is a sample (participant). A detailed explanation of each feature (predictor variable) can be found at the website link above. Our goal is to build a classifier using training data, such that given a test sample, we can classify (or essentially predict) whether its label is 0 ('no divorce') or 1 ('divorce').

Build three classifiers (Naive Bayes, Logistic Regression, KNN). Use the first 80% of the data for training and the remaining 20% for testing. If you use scikit-learn, you can use train_test_split to split the dataset.

Remark: Please note that, here, for Naive Bayes, this means that we have to estimate the variance of each individual feature from the training data. When estimating the variance, if the variance is zero or close to zero (meaning that there is very little variability in the feature), you can set the variance to a small number, e.g., 10^{-3}. We do not want to include zero or near-zero variances in Naive Bayes. This tip holds for both Part One and Part Two of this question.

(a) (10 points) Report the testing accuracy for each of the three classifiers. Comment on their performance: which performs the best, and make a guess why it performs the best in this setting.

(b) (10 points) Use the first two features to train three new classifiers. Plot the data points and the decision boundary of each classifier. Comment on the differences between the decision boundaries of the three classifiers. Please clearly represent the data points with different labels using different colors.
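If you take the scikit-learn route, part (a) could be set up roughly as in the sketch below. This is a minimal sketch under our own assumptions: the loading call may need adjusting to marriage.csv's actual layout (e.g., a header row), shuffle=False is our reading of "first 80% for training", and GaussianNB's var_smoothing is only one way to guard against near-zero variances (it adds a fraction of the largest feature variance rather than clamping each variance as the remark above describes).

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import KNeighborsClassifier

    data = np.genfromtxt("marriage.csv", delimiter=",")
    X, y = data[:, :-1], data[:, -1]  # last column is the label
    # "first 80% for training": keep the original order, no shuffling
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, shuffle=False)

    classifiers = {
        "Naive Bayes": GaussianNB(var_smoothing=1e-3),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "KNN": KNeighborsClassifier(n_neighbors=5),
    }
    for name, clf in classifiers.items():
        clf.fit(X_tr, y_tr)
        print(f"{name}: test accuracy = {clf.score(X_te, y_te):.3f}")

For part (b), the same three models can be refit on X[:, :2] only, after which the decision regions can be drawn by evaluating each classifier on a dense mesh grid over the two features (e.g., with matplotlib's contourf) and scattering the two classes in different colors on top.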
Part Two (Handwritten digits classification). (10 points) Repeat the above using the MNIST data from our Homework 2. Here, give digit 6 the label y = 1, and give digit 2 the label y = 0. All the pixels in each image are the features (predictor variables) for that sample (i.e., image). Our goal is to build a classifier such that, given a new test sample, we can tell whether it is a 2 or a 6. Use the first 80% of the samples for training and the remaining 20% for testing. Report the classification accuracy on the testing data for each of the three classifiers. Comment on their performance: which performs the best, and make a guess why it performs the best in this setting.

3. Naive Bayes for spam filtering. (40 points)

In this problem we will use the Naive Bayes algorithm to fit a spam filter by hand. This will enhance your understanding of the Bayes classifier and build intuition. This question does not involve any programming, only derivation and hand calculation.

Spam filters are used in all email services to classify received emails as 'Spam' or 'Not Spam'. A simple approach involves maintaining a vocabulary of words that commonly occur in 'Spam' emails and classifying an email as 'Spam' if the number of words from the dictionary that are present in the email exceeds a certain threshold. We are given a vocabulary consisting of 15 words:

    V = {secret, offer, low, price, valued, customer, today, dollar, million, sports, is, for, play, healthy, pizza}.

We will use V_i to represent the i-th word in V. As our training dataset, we are also given 3 example spam messages,

• million dollar offer
• secret offer today
• secret is secret

and 4 example non-spam messages:

• low price for valued customer
• play secret sports today
• sports is healthy
• low price pizza

Recall that the Naive Bayes classifier models the probability of an input through its features. The feature vector for each sample is defined as x^{(i)} = [x_1^{(i)}, x_2^{(i)}, ..., x_d^{(i)}]^T, i = 1, ..., m, and the class of the i-th sample is y^{(i)}. In our case the length of the input vector is d = 15, which is equal to the number of words in the vocabulary V. Each entry x_j^{(i)} is equal to the number of times word V_j occurs in the i-th message.

(a) (5 points) Calculate the class priors P(y = 0) and P(y = 1) from the training data, where y = 0 corresponds to spam messages and y = 1 corresponds to non-spam messages. Note that these class priors essentially correspond to the frequency of each class in the training sample.

(b) (10 points) Write down the feature vectors for each spam and non-spam message.

(c) (15 points) In the Naive Bayes model, assuming the keywords are independent of each other (this is a simplification), the likelihood of a sentence with feature vector x given a class c is

    P(x \mid y = c) = \prod_{k=1}^{d} \theta_{c,k}^{x_k}, \qquad c \in \{0, 1\},

where 0 ≤ θ_{c,k} ≤ 1 is the probability of word k appearing in class c, which satisfies

    \sum_{k=1}^{d} \theta_{c,k} = 1, \qquad \forall c.

Given this, the complete log-likelihood function for our training data is

    \ell(\theta_{0,1}, \ldots, \theta_{0,d}, \theta_{1,1}, \ldots, \theta_{1,d}) = \sum_{i=1}^{m} \sum_{k=1}^{d} x_k^{(i)} \log \theta_{y^{(i)},k}.

(In this example, m = 7.) Calculate the maximum likelihood estimates of θ_{0,1}, θ_{0,7}, θ_{1,1}, and θ_{1,15} by maximizing the log-likelihood function above. (Hint: We are solving a constrained maximization problem. To do this, remember that you need to introduce two Lagrange multipliers because you have two constraints.)
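To orient on the hint, the generic shape of the argument (one Lagrange multiplier λ_c per class, two in total) is sketched below; this is the standard multinomial MLE derivation, not a worked answer for the specific word counts:

    L(\theta, \lambda_0, \lambda_1) = \sum_{i=1}^{m} \sum_{k=1}^{d} x_k^{(i)} \log \theta_{y^{(i)},k}
        + \sum_{c \in \{0,1\}} \lambda_c \Big( 1 - \sum_{k=1}^{d} \theta_{c,k} \Big),

    \frac{\partial L}{\partial \theta_{c,k}}
        = \frac{\sum_{i:\, y^{(i)} = c} x_k^{(i)}}{\theta_{c,k}} - \lambda_c = 0
        \;\Longrightarrow\;
        \theta_{c,k} = \frac{1}{\lambda_c} \sum_{i:\, y^{(i)} = c} x_k^{(i)}.

The constraint \sum_k \theta_{c,k} = 1 then fixes λ_c to the total word count in class c, so each θ_{c,k} comes out as word k's count in class-c messages divided by the total number of words in class-c messages.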
(d) (10 points) Given the test message 'today is secret', use the Naive Bayes classifier that you have trained in parts (a)-(c) to calculate the posterior and decide whether it is spam or not spam.
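A sketch of the computation part (d) asks for, combining the quantities from (a) and (c) via Bayes' rule, where x here denotes the feature vector of 'today is secret':

    P(y = c \mid x) = \frac{P(y = c) \prod_{k=1}^{d} \theta_{c,k}^{x_k}}
        {\sum_{c' \in \{0,1\}} P(y = c') \prod_{k=1}^{d} \theta_{c',k}^{x_k}}, \qquad c \in \{0, 1\}.

In practice one compares \log P(y = c) + \sum_k x_k \log \theta_{c,k} for c = 0 and c = 1, and (with this problem's convention that y = 0 is spam) declares the message spam when the c = 0 score is larger.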