MIT6036L01h

Time to get one level more specific. We've been talking pretty generally about algorithms and hypotheses and so on. So now, for about the first few weeks, or at least for the first two weeks, we'll talk about linear classifiers. So let's get more specific.

A linear classifier, now, this is really a choice of hypothesis class. What we need to do, if we're doing classification, is find a way to take R to the d, big old d-dimensional space, and divide it into some subspace that we call positive and some subspace that we call negative. And the simplest way to do that is by putting a linear separator in there. Okay, so a linear classifier has this form. I'll write it and then I'll explain the whole story.

Okay, so there's a bunch of things to talk about here in what I wrote. That's easier to see. So the first thing is: what the heck is this notation? Many of you are not used to some kind of function that has a semicolon in it, so let me just talk about that first. The hypothesis class H is the class of all possible linear separators.

Let me say what a linear separator is. Here we go: what is a linear separator in two dimensions? So imagine we have two dimensions, x 1 and x 2, so d is equal to 2, and we have some positive examples and we have some negative examples. We might say, ah, here's my hypothesis: my hypothesis is this line in two dimensions. And I'm gonna think about the normal to this line, and I'm gonna say: if the point is on the side that the normal points to, then I have a positive point, and if it's on the other side, I have a negative point. So that is a linear classifier. And that line, right, what line is that? That line is the set of points for which theta dot x plus theta naught is equal to 0. That's what the line is. So what is theta? Theta is going to be a vector in R to the d, and theta naught is going to be in R.

And just to get the dimensions right: for many of you this will be your first experience with writing computer programs that do manipulations of vectors and matrices and so on, and especially at the beginning the dimensions will be a little bit of a nightmare. We're gonna get you practicing right away on using numpy and playing with representations of these things and being clear about the dimensions. But you always want to be really clear about the dimensions. We're gonna treat our data points and our theta as column vectors, so x i is going to be d by 1 and theta is d by 1. So let's just check dimensions here for a minute. If we take theta transpose, that's 1 by d, and a 1 by d thing times a d by 1 thing is a 1 by 1 thing, that is to say, a scalar. And then we can add another scalar to it, and we're gonna get out a scalar. So this quantity here is gonna be some kind of real number, and then sign of a real number is going to give us plus 1 or minus 1, depending on the sign of this thing. So that's what that means.

Okay, so back to: what is this crazy thing with the semicolon? What this means is, you give me some values for theta, so if d is equal to 2, this is like two numbers, and a value for theta naught, and that defines the line. It defines the line in not-high-school-algebra terms, right? So if you're trying to remember y equals m x plus b, just remove that from your head. We like theta 1 x 1 plus theta 2 x 2 plus theta naught equals 0. That's what we're gonna think about as the equation for a line. Why? Because it extends to high dimensions. I'm gonna draw pictures on the board in two dimensions, usually; that's what I can manage. But all of our software, all of our thinking, should be for big d. d could be a thousand or ten thousand. It's hard to think about that space. I can think about two and three, usually, but really we're gonna think about these things in high dimensions. This way of thinking about planes extends to high dimensions. So even though I draw a picture that your high school algebra applies to, don't think about it that way, think about it like this. So that's the equation for that line.

Okay, so now if you specify a theta and a theta naught, in my example here in two dimensions, that's like drawing a particular line. And once you've drawn that particular line, now you can ask the question for this point x. So now I have a new point, I have a new x here, and I want to ask the question: am I going to predict a positive value or a negative value for this x? And the way I'm going to answer that question is, I'm going to take the coordinates of this point x, take the dot product with my theta, add theta naught, and ask if it's bigger than 0 or not. Does that make sense? I'll take questions about this. There's a ton of practice exercises and problems and so on to help think about hyperplanes and normals and stuff like that.

But fundamentally, these things on this side of the semicolon define the particular hypothesis. You could say, actually, H, linear classifiers, defines the hypothesis class. Once I pick the theta and the theta naught, I've picked a particular hypothesis, a particular line, a particular hyperplane. And now, given those choices of parameters, I can ask, for this new input, would I predict that it's positive or negative? So that's the setup.
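[Editor's sketch, not part of the lecture.] The form described above, h(x; theta, theta naught) = sign(theta transpose x + theta naught), can be tried out in numpy roughly as follows. The function name `linear_classifier`, the variable name `theta_0`, and the example numbers are illustrative assumptions rather than anything shown on the board. Returning minus 1 when the quantity is exactly 0 is also an assumption, based on the "bigger than 0 or not" test described in the lecture.

```python
import numpy as np


def linear_classifier(x, theta, theta_0):
    """Predict +1 or -1 for one point x, given parameters theta and theta_0.

    x and theta are column vectors of shape (d, 1); theta_0 is a scalar.
    (Hypothetical helper written for illustration only.)
    """
    # Be explicit about dimensions: both must be d by 1 column vectors.
    assert x.shape == theta.shape and x.shape[1] == 1
    # theta.T is 1 by d and x is d by 1, so the product is 1 by 1.
    # .item() pulls out the single number so we can add the scalar theta_0.
    score = (theta.T @ x).item() + theta_0
    # Assumption: a score of exactly 0 is treated as negative.
    return 1 if score > 0 else -1


# A particular hypothesis in d = 2: the line 1*x1 - 1*x2 + 0.5 = 0.
theta = np.array([[1.0], [-1.0]])
theta_0 = 0.5

x_pos = np.array([[2.0], [1.0]])  # 2 - 1 + 0.5 = 1.5, on the side the normal points to
x_neg = np.array([[0.0], [3.0]])  # 0 - 3 + 0.5 = -2.5, on the other side

print(linear_classifier(x_pos, theta, theta_0))
print(linear_classifier(x_neg, theta, theta_0))
```

Nothing in the function depends on d being 2. The same code works for column vectors of any length, which is the reason given in the lecture for preferring this form over y equals m x plus b.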
