MIT6036L01c 1

there's an enormous catalogue I kind of wrote a long list in the introductory set of the notes about different settings of machine learning within this class will will study many of them but not all of them there's kind of too many to do an introductory class I'm not going to go through that catalogue right now what we're gonna do is just dive in and start thinking about what's probably the most most typical most generally well understood useful setting which is what people call supervised learning so in supervised learning the idea is that you're given a data set given some data set and it's organized into a set of pairs and I'll try to introduce notation that will be mostly consistent about okay so in supervised learning we're given a data set of pairs this X in each one of these pairs we can think of it as an input and the Y as an output and so the idea is that what we want to do is basically learn a mapping learn some kind of relationship so that in the future when we're given an X we can predict a Y right so this might be these are the X's the vital signs of a patient and the Y is whether or not they're having a heart attack or how many days we expect them to be in the hospital the X might be an image and the Y I might be whether it's a face or whose face is it or how face like is it or something like that so in supervised learning it's called supervised because we're given in question-answer pairs right question answer question answer question is sir and we have to figure out a way from that to answer future questions typically not always typically and for us really most of the time the X is the X I will be you can think of them as vectors of real numbers in D dimensions D could be big D could be small but vectors in D dimensions and Y can be a variety of different things what we're gonna do in the first few weeks here is consider the case where the Y is discrete in a particular to start with we're going to consider Y is in plus 1 minus 1 so when y I is in pusle and minus 1 this is called a classification problem so classification is a kind of supervised learning and in particular because there's only two elements on that set it's a binary classification problem ok so this is like the set up for for trying to decide whether this image contains the face or it doesn't right so we're given an X and we're gonna say yes or no so we often call this a positive example if the y is plus and a negative example if the Y is minus so we're given a bunch of examples XY pairs but let's talk just for one more minute about that X's and where they come from so I said you know humans had to do a bunch of work so one piece of the work that humans have to do is really come up with a representation we'll talk about this more next week but I just you know we have to kind of think about it already from the beginning so really X's might be the actual X's actual access might be patients or songs right now you can't like take a song and put it in the data set like abstractly you you can only you have to characterize it somehow so sometimes we'll talk about a feature mapping so there you might say well I have real exes real secret magic exes which are like people or songs and what I have to do is go from the person or the song into some actual feature representation and it's that feature representation that's in art of the deep right the song is not an element in a vector space the song is just a song but we have some feature representation that could take a song and give us a vector so and that's the thing that you will have to design if you're gonna go out in the world and apply machine learning to some problem you will have to do something to figure out how it is that you're gonna go from songs to some kind of representation you could put in a data set songs or people or whatever cars so we will know now that I've introduced this as a thing to worry about we're gonna mostly not talk about it well we will come back to it off and on but I'm gonna write on the board all the time that we're mapping exes in art of the D to wise but really when you think about that those X's come from some process that a human did which is to take an actual thing in the world a day or a room and map it into some vector in some fixed dimensional space lots of the current work in machine learning is about doing stuff that doesn't involve fixed dimensional vector spaces but we in this class are not going to worry about that okay so we had some domain of interest we have examples we've coded up the examples in some way so that the inputs are elements in our to the D and the outputs are plus 1 and minus 1 so that's the set up for classification that we're going to think about

Full Transcript

Need a transcript for another video?