Kathleen Creel (Northeastern University): 'Against the Pl...

Thanks so much to Jonathan for the warm introduction, and to all of you, the key and secondary members, for being here. I'll leave ambiguous my evaluation of which are key and which are secondary. But this department is one of my favorites in the whole world, so I'm just so thrilled to be here. So Liam always says that you have to say straight off what you're going to argue. So here I am. I'm against it. But what is this thing that I am in some sense against? There's been this emerging discussion in the machine learning literature about an idea that they very ambitiously call the platonic representation hypothesis. It's the idea that neural networks trained with different objectives on different data and modalities are converging to a shared statistical model of reality in their representation spaces. I'm going to go through and explain what each of those nouns means to them. But think of this as the motivating picture: the world is out there. Perhaps it's conveniently predivided into platonic forms for us. Who can say? But we're trying to recapture that external reality in our various representations and different modalities. And the question is, are those representations themselves converging? This is part of a broader family of universality and convergence hypotheses in machine learning. There's this more general idea that as models improve, they're converging. There's a nice expression of that called the Anna Karenina principle. Many people have gotten one sentence into the very long book Anna Karenina. And the first sentence is: happy families are all alike; unhappy families are unhappy in different ways. So the idea is, if models are happy, if they're successful, if they've learned something about the world, maybe they've done that in the same way. Maybe they have the same representations, as evidenced by their success.
Or a slightly more, you could say, demand-side interpretation of this idea would be: look, we ask these general-purpose models to solve a bunch of different problems. And imagine that each of those problems has perhaps a large set of possible solutions, models that could solve that problem. But as we ask the models to solve more and more problems, maybe the overlapping space of possible solutions to all those problems simultaneously becomes smaller. And so you could say, yeah, maybe they're converging because the optimization problem for a general-purpose model is just so difficult that there aren't that many ways to solve all those problems at the same time. So that would be the Cao and Yamins version of this. And this idea has really taken off in computer science. As many of us know to our chagrin, computer scientists cite more than we do. Our papers really are great. It's just that people aren't citing them enough. But in computer science, they do that quite a bit more. But it's not just that people are citing this over 400 times in the last couple of years. It's that they're using it in the title. They're using it as part of the motivation of a research program. So they're saying, you know, our research is guided by the platonic representation hypothesis. That leads me to think that this question has some stakes. It also has some big philosophical claims. In the original paper itself, they cite a bunch of people that we have some familiarity with. So they cite, you know, old-school hardcore realists: as models improve, they're converging on the true representation of reality. A bunch of philosopher names. They do explicitly phrase this in very platonic language. So they say the training data for our algorithms are the shadows on the cave wall.
Yet, we hypothesize, models are recovering ever better representations of the actual world outside the cave. Now, this idea didn't survive contact with real Plato scholars. Anyone in the room who has read Plato probably recalls that the one thing Plato says you can't do is learn more about the shadows and then somehow discover things about what's outside the cave. So I think we might want to interpret this more as a neo-Kantian idea, that we're going to somehow punch through the phenomena and grasp the noumena, and it's going to be great. And there's also this sense that, in virtue of doing this, models are going to be able to surpass humans, that they're going to be better than us: that we are struggling away here with the mere phenomena, but they are going to converge on a perfect lossless form of our underlying reality. They're going to have a better informational grasp of what's really happening. Okay. So it's fun to dunk on philosophical claims in scientific papers, but that's not the real purpose of my interest in this. I think there are actually very practical stakes for scientific research programs in whether or not this claim is true and how we would evaluate whether it's true. Because if it's true that existing convergence between models gives us evidence that they are converging on this really successful thing, we should be accelerationist about this, right? We should be trying to get models to converge faster. If we think that existing convergence is already evidence of this epistemically good thing, we should prioritize the models that have already converged over models that aren't converging. We should think those are less good in some way. So it would give us ways to choose between different architectures and modeling strategies. This would be a source of evidence for us.
And maybe, you know, if we take the strongest version of this claim, we should say: hey, if there's stuff that PRH models are really good at, maybe humans should stop spending as much effort doing it ourselves. We should just let the convergent models handle it. But if the platonic representation hypothesis isn't true, maybe we should make different resource allocation choices. We should encourage model multiplicity. We should pull models apart, or at least it would be acceptable to do that. We should say maybe it would be good if different models covered more of the logical space. We don't need them to converge. We actually need them to cover more ground. And of course, we would be allowed to keep doing human science, which would be a bit nice, I think. So this is part of a broader research program for me on transparency and interpretability, something that I know a lot of people here also work on. What do we need to know in order to have warranted trust in some kind of computational system? Algorithmic monoculture: what are the epistemic and political risks of allowing the same algorithmic system to dominate an ecosystem? And what role should our values play in automated decision-making and in science? So the methodology I'm going to try to use is one that I've used in a few of my projects and that I know is very familiar to people here. Do some philosophy, come up with an empirically testable or formally evaluable hypothesis, try to actually do that thing, and then see what we learn philosophically to feed back into our philosophical research program. Unfortunately, science takes a long time, so we're still at the hypothesis stage. So I'm going to give you my philosophical analysis, tell you the hypothesis that I'm going to try to test, and hope that by the end of the year we'll have some more formal results to bring back to you. So I want to thank my collaborator on this project.
We are working together, unfortunately not with these dogs, and with all my wonderful collaborators. Their heads will pop up on their slides, but their contribution cannot be contained by the circle in which they've been bounded. Okay. So what is the platonic representation hypothesis? I promised you an explanation of each of those words, and I will give it to you in the next five minutes. So what do they mean when they say these words: neural networks are converging to a shared statistical model of reality in their representation spaces? Okay, so think of this as: models get bigger and they perform better on our suite of benchmark tasks. In machine learning, when people say more successful or better, they almost always mean we've set up this list of tasks that are numerically evaluable, and the numbers are going up or down in the direction we consider to be positive on those tasks. So the error is going down, we're making fewer mistakes, we're being faster, whatever it might be. As the models themselves get better, they become closer in this particular way: their mutual k-nearest-neighbor scores increase. So what does that mean? All right, so let's briefly delve inside a large language model, to use a large language model's favorite word. Let's say we had some input string: "LSE philosophy is a great". Notice that I've specially created this to express my true feelings but also to pander to you, the audience. So we put in these terms, and now the model is trying to generate the most likely next word. Unfortunately, it doesn't say what I think is the most likely: "place". Instead, it says "way". It's GPT-2. Not very good. Right. So let's zoom in on the beginning here. I put in these English-language words and characters, but models of course would prefer to work with numbers. So we have to translate those words into what are called embeddings.
So we have these numerical token embeddings, and then what do we do with those? We map them to a location in high-dimensional embedding space. And all we're going to say that a representation is, is just some function that assigns a feature vector to each input in some data domain. So think of representation, here and for the rest of the talk, with the smallest possible r that you can, an r invisible from space. Representation just means this. So let's say we are visualizing embedding space. As I spend more time in the battle lab, hopefully I'll be able to imagine 238 dimensions in my mind. But until then, I'm stuck just imagining 3D space. So let's say we have one dimension that's sort of like: here are all these places, and I've mapped them in terms of the size of the place, going from England all the way down to a neighborhood in London, bigger to smaller. And here I've mapped what side of the Atlantic they're on. So these are just two hypothetical dimensions, ways I could map and separate my data. Because I've picked those two dimensions, everything else, all the other 700-plus dimensions, is collapsed into the residual, everything that's left over. But if I add something that's intuitively not a place, like Popper, all of a sudden I get more in the residual. So you can see there's more left here that hasn't been captured by my two place-based dimensions. So this is one model, and this is the location of each of these input tokens in this multi-dimensional space. But how would we compare the spaces of two different models? Let's say you were looking at this purple point in the spaces of these three different models. All the points are different words, let's think of them that way. And the words are closer together or further apart depending on how similar they are based on their initial training.
And we're trying to figure out how similar these three models are to each other with respect to where they put these words. Intuitively, you might think these ones are more similar with respect to this purple dot, and this one is a little bit more different. But when we do that intuitively, we're making a global comparison. We're taking some center point and we're measuring the distance from the center point. It turns out that in a super high-dimensional space that's not always reliable, because you can have a few points that are extreme outliers, and they pull the global center in a way that makes these kinds of global comparisons unreliable. So instead we might decide to do a local comparison. We might say, for each point, so here's Popper, what are its 10 nearest neighbors? If we were to measure in space in any direction, what are the 10 closest points to this point here? We've got Wittgenstein, Marx, Quine, Dennett, empiricism. Like I said, not the greatest model. Okay. So those are the 10 nearest neighbors of this point in this model. Now I can compare the lists of two different models. What are the 10 nearest neighbors of Popper in each of these two models? And the thing I'm going to measure is how many of their 10 nearest neighbors are the same. So here it's four. In machine learning we often like to do really simple calculations, because we know we're going to have to do them a billion times. Not an exaggeration. So here I'm just going to look at these two lists, I'm going to count the overlap between the lists, and the similarity for this point is going to be four out of 10, or 40%. That's really the fundamental methodology here. So when we say that neural networks are converging, what we mean is that as these models get bigger and better, their mutual k-nearest-neighbor scores increase. The average overlap between the nearest neighbors of each point goes up.
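The overlap count described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code from the paper under discussion: the function names `knn_indices` and `mutual_knn_overlap` are hypothetical, the random arrays stand in for real model embeddings, and the use of cosine similarity as the notion of "closest" is an assumption, since the talk does not name a distance measure.

```python
import numpy as np

def knn_indices(features, k):
    """Return, for each row, the indices of its k nearest neighbors (self excluded)."""
    # Assumption: closeness is cosine similarity; the talk does not name a metric.
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = normed @ normed.T
    np.fill_diagonal(sims, -np.inf)  # a point is not its own neighbor
    return np.argsort(-sims, axis=1)[:, :k]

def mutual_knn_overlap(feats_a, feats_b, k=10):
    """Average fraction of shared k nearest neighbors across all inputs.

    Row i of feats_a and row i of feats_b must describe the same input
    (for example the same word), embedded by two different models.
    The two models may use different numbers of dimensions.
    """
    nn_a = knn_indices(feats_a, k)
    nn_b = knn_indices(feats_b, k)
    # For each input, count how many neighbors the two lists have in common.
    per_point = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(per_point))

# Stand-in data: 200 inputs embedded by two hypothetical, unrelated "models".
rng = np.random.default_rng(0)
model_a = rng.normal(size=(200, 16))
model_b = rng.normal(size=(200, 32))

same_score = mutual_knn_overlap(model_a, model_a)
unrelated_score = mutual_knn_overlap(model_a, model_b)
print(same_score, unrelated_score)
```

Comparing a space with itself gives a score of 1.0, since every neighbor list matches. For two unrelated spaces the expected score is roughly k divided by the number of other points, which is the sense in which a small overlap can still be above chance.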
Okay, so that's the methodology. So what are we going to say about the results? Well, we have two different types of measurements here. We have same-modality, vision to vision. So we look at vision models of different sizes, different ways to make a vision model, different tasks or numbers of tasks they've been given, different kinds of data they've been given, and we can say that as they get bigger and better they start to overlap. Not coincidentally, the best-performing models have a 40% overlap. Is that good? I don't know. It's the biggest number you're going to see today, so enjoy it while it lasts. Okay. So now we have different-modality convergence. Remember, one of the cool things about this was supposed to be that we could say models trained in totally different modalities are going to be converging. So how do they do that? I think this methodology is quite nice, actually. They look at this data set of pairs from Wikipedia pages: a caption and the image that is captioned. So they'll say, for LSE, here's the image, it's the coat of arms, and let's take that as a pair: "LSE coat of arms" and this image. And now I'm going to put the LSE coat of arms image in the vision model space, and I'm going to put the text of the caption in the language model space, and I'm going to compare the similarity of their nearest neighbors. So that's how we measure different-modality convergence. And here, as the models get bigger and better, they have 16% convergence. Is that a good number? I don't know. It's about to get even smaller. So that is the basic methodology that they're using to measure convergence. And now I want to ask a more philosophical question. Does convergence uniquely support the strong realism that they are suggesting? The first thing I want to say is that we can think of this argument as a standard move in the scientific realism versus anti-realism debate.
So they are implicitly saying that we should take this convergence as evidence for realism about the phenomena that they're presenting. This is something that we see a lot in classic discussions of scientific realism. Perrin says, hey, we should believe that Avogadro's number is this specific value because there are 13 different forms of convergent evidence. I've done some of the experiments myself, but I've also correlated other people's experiments, and all these experiments, done in different ways, measuring different kinds of things, converge on that one number. And that's supposed to be different and special evidence, over and above 13 kinds of the same evidence. But of course, the participants in the realism and anti-realism debate are battle-hardened. They're not so easy to fool. Van Fraassen is not going to just take that lying down. He says, "Hey, sure, these cool convergence results provide empirical grounding for the number, but they don't have to tell us anything about the underlying reality. You can't get me that quickly." We could also try to explain it away. We could say, sure, we seem to have this convergence, we seem to have this robust result, but maybe all of the results are relying on the same artifact or the same mistaken assumptions. And so this apparent robustness actually doesn't have the evidential value that it seems to. Or it could be that the models that all seem to agree were assembled opportunistically. We weren't trying on purpose to cover the whole logical space. So sure, they happen to agree, but maybe there are other ones we could have looked at that wouldn't have agreed, and we just haven't found them yet. Or, and this is kind of like the Cao and Yamins point but in a less realist vein, maybe there is something about the world where there's something epistemically difficult about pursuing a different strategy.
And yes, models are converging, but it's all because they're converging on the same useful idealizations that allow them to avoid something difficult, like a difficult bit of physics or something like that. So yes, they're converging, but it's not on the truth per se. I want to pick up on one of these strategies: the idea that convergence is often contingent in machine learning. Other equally good but substantively different models could exist, but we can't always find them. This draws on a line of work in machine learning that I've been engaged with, which is the idea that for the same problem there exist many models that have the same accuracy but different token outputs. So take even a very accurate model, one that's 92% accurate. Of course, that means it's 8% inaccurate. And so two different models could differ on up to 16% of their points while still maintaining the same accuracy. And you can imagine rotating the mistakes, so that different models make mistakes on every different point that could be mistaken. And we can try to generate this multiplicity on purpose by making different decisions throughout the pipeline through which we create the models. Importantly for our purposes here, we know that models can have similar accuracies while relying on different internal representations. So although different models might all successfully recognize this image as being focused on a tiger, some of them would rely on orange more than others. Some of them would rely on stripes more than others. And some of them might rely on internal representations that none of the rest rely on, like claws, even though the claws are invisible here. So, in some empirical work with co-authors, we've looked at this question: can we create substantively different models from the same data? We had this task of selecting patients for a high-risk care management program at a hospital.
So it would be a good thing for the patients to be included. We had this data set, which was basically everything the hospital knows about these patients throughout their entire history of being there. So, a lot of features. But we know that past algorithms had significant racial disparities. Black patients who were selected by existing algorithms had, on average, to be much sicker in order to be recommended for the high-risk program. So can we find, out of the set of all possible models, the least discriminatory one, for example, or the one that has some other beneficial feature? We tried to create a bunch of different models using all the fancy tricks that we know. We can use different subsets of the features. We can train on the data in a different order. We can perturb the weights ourselves manually. But importantly, we still have the same data. There is no more information about this patient beyond all the information the hospital has ever collected. And what we found is that even with all our fancy techniques, we're still bound by the epistemic limitations of the data. There are some patients who, when we roll the clock forward, we learn did in fact have many serious conditions and would have benefited from this program, but at the time we're trying to do the prediction, we never found them. And so what that suggests is that, given many empirical states of the world and the data that we have gathered up to this point, there are some qualified individuals who are never found by any of the models. And so if there exists a model that would have found them, which we can train once we know the answers, that model is an unconceived alternative in the Kyle Stanford sense. And why might we not be able to find these models? Well, one reason is that we have this tradition of component sharing in machine learning that we often think leads to a lot of our success.
So we're like: if there's a best data set, we're all going to use it, because it's the best. If there's a best model, of course we're all going to flock to use that model. And in this case, all the vision models in the platonic representation hypothesis paper were trained on the best source of labeled data, ImageNet. This is an enormous collection of images, each of which is labeled with its focal object. And this is a potential contingent source of convergence that undercuts the evidential value of convergence for the PRH. It could be that models are becoming more similar because they're getting better at collecting the existing features of the same data set that they're all trained on. We can see that models sometimes learn the same errors. For example, ImageNet models will tend to use spurious cues and shortcuts, like background textures, to predict foreground objects. So here the model is supposed to say this is a squirrel. It says with high probability it's a sea lion. Here the model is supposed to say this is a dragonfly, but it says with high probability this is a manhole cover. Look at this for a second and see if you see why. It's because of the strong texture. The texture of the wet rock is similar to the texture of the skin of the sea lion. This cross-hatching on the pool chair is similar to the cross-hatching on the manhole cover. So it's not just that they're making mistakes. All models make mistakes. It's that they're making the exact same mistake because they've been trained on the same data. So I think we might say, okay, let's just get different data. And sometimes that might be the right answer. But in a lot of these large-model cases that's not that feasible, because our whole training strategy is that every model creator is trying to collect literally all the data. So your training strategy is to say: I've already scraped the entire internet.
I've already, you know, scraped all the text of all the books that have ever been digitized, and now I'm trying to digitize some more. And every model provider is trying to do that exact same thing. They're converging on the set of all possible texts that currently exist, which means they're converging on the same data set. So the mere fact of convergence can't actually distinguish between convergence due to component similarity and convergence due to landscape necessity. Sometimes people in the PRH tradition talk as if we're all converging because we're getting closer and closer to this peak. But of course it could be that we're getting closer and closer to the peak which our data currently suggests is the best one. Maybe different data would lead models to a different, equally successful space. We also can't know whether convergence is on a global or a local optimum. And I want to throw one more point in here for the Plato scholars. One note about the embedding space measurements is that they're essentially conservative of the original concepts. We have the concepts we came in with, like London. We are creating distance between the terms, and perhaps we could say that's implicitly changing their meaning as they take on new distance relationships to related terms. Maybe in some contextualist sense they have a new meaning. But we can't, using this method, create new terms. We can't, say, do conceptual engineering. And so if you thought that part of platonic representation was carving nature at its joints, you can't recarve using this method. And again, I think this is related to undercutting arguments that are made by anti-realists or by pluralist realists. Maybe there are unconceived alternatives. Maybe current evidence is misleading: it might genuinely support a theory currently, but it will later be superseded as we collect more evidence. And maybe we just shouldn't desire to converge on one true scientific ontology.
Maybe there are cases where multiple ontologies are useful. Another possibility is that there's just nothing to be explained here. The PRH proposes that a cross-domain convergence result of 16% is a phenomenon worth explaining: that they're converging and we need to figure out why. But if we don't think that number is that high, we could say, well, these cross-domain results aren't really truly separated. We're looking at vision, but the vision models are trained on images with text labels, so they already have some language in them. The images have also been collected to have focal objects that are categorized according to what we already think of as categories in the world. And then the language models are also trained in English. Of course they would overlap, because they have this point of convergence in language. So they're not truly separate modalities. We can see this, I think, when we move to true scientific data. Luckily, machine learning people work really fast. Within about a year of this paper being published, there were a bunch of follow-up studies, and one of them is trying to do PRH but with astronomy data. The idea is we have these pictures of galaxies from three different sources, so we can see if there's convergence within the domain of just images. The highest within-domain convergence is 7.2%. I told you the numbers were going to keep going down. Buckle up. They're getting lower. So this would be: as we try to put each of these galaxies into an embedding space, how similar are their locations? But then we have a cool cross-modality convergence. Here is a different type of evidence that we get from the stars. We have optical fibers. So yes, it's still the modality of light, but it looks really different. I think this looks quite different from an image of the galaxy itself, and it gives you some information that you wouldn't have from only a photograph. So we have optical fibers and we have spectra.
So now we're going to ask: what's the cross-domain convergence from image to spectra? And here it's only 0.55% convergence. Is that good? Well, it's better than chance, which would be 0.05. But it's not an enormous amount of convergence. So you might say maybe there isn't a big phenomenon to be explained here. But I don't want to leave it here at this negative point. I don't want to just be the philosopher who comes in and says, you know, I don't like your methodology, don't talk about Plato. I want to provide something constructive. And I think part of the problem here is that the metric that's being proposed in this literature is continuous. So they might, and indeed do, finish a paper and say: 0.55 convergence. We did it. Hooray. We've achieved platonic convergence. We're on our way. I might look at that and think that number is pretty small. But because this metric is continuous, there isn't a way to resolve these disputes methodologically. So I think k-nearest neighbors might just be the wrong kind of metric. The other reason I think it's the wrong kind of metric, one that's even friendlier to their research program, is that I think it penalizes them for the very differences between modalities that we're trying to abstract away from. Let's say we're thinking about the modality of color. There might be similarities between objects because they're all red. But then if we think about their haptic properties, their graspability, of course we wouldn't be able to figure out haptically which ones were similar because they're all red. But we shouldn't penalize the haptic score and the vision score for not sharing this similarity of redness. That's not the kind of thing that they could even conceivably converge on. And similarly, we can't tell by just looking at these minerals, or at least I can't, which one is the hardest, but you might be able to tell that haptically.
So, again, I think they're not setting themselves up for success by using the k-nearest-neighbor scores, because they're getting unfairly penalized. So my hope here is to draw on our philosophical resources to offer this research program a discrete functional criterion for convergence. It has to be something that will give us some kind of yes or no answer. And I would like it to be based on some kind of property. If the models really are converging, there's something they should be able to do that they couldn't if they weren't converging. Okay. So I'm going to propose that that is solving Molyneux problems. So what are Molyneux problems? Here's the part of the talk where I'm going to try to give a hypothesis. Here we have to go back to the 17th century, a good place to visit in our minds but not in reality. So William Molyneux is a physician. He writes to our friend John Locke and he says, "Hey, my wife is blind. I've been thinking a lot about blindness. Suppose there was someone who had been born blind, and he of course could distinguish between objects by touch." And then, perhaps not coincidentally, at this point the technology of cataract surgery is starting to hit the backwards northern parts of the world from the highly developed southern parts of the world. And so let's say someone's sight was miraculously restored, and you put the cube that they could distinguish by touch and the sphere on the table. You don't let them touch them, and you ask which one is the sphere and which one is the cube. So you have these objects that the person understands conceptually, and you ask, can you recognize them? Locke, the empiricist, says no, no way. When we learn to associate our representations in different modalities, it's brute force. We just smash together what it feels like to touch this thing and what it feels like to look at this thing. We hot associate them.
We learn that they're the same. But if the blind person hasn't had that experience, how could they possibly figure out which one was the sphere and which one was the cube? But Leibniz, ever sunny, ever optimistic, says, "Of course. It seems to me beyond question that the blind person could do this." They would just apply rational principles to the sensory knowledge they already have, and they would just do it. Now, this seems like an empirical question. But the debate raged on in the early modern era for quite a long time, and I think it's because they didn't have very good experimental testing conditions. I mean, there's no anesthesia. So you're not even asleep. You have the surgery, it ends, and someone says, "Sphere or cube? Sphere or cube?" You know, maybe you don't get the cleanest data. But in contemporary versions of this trial, what we find is this. Let's say subjects are presented initially with a target object. You're supposed to learn this object, and then you're presented with a second, similar but different object. And you're supposed to figure out which of these two objects is the one you were originally given. In the touch-to-touch case, where you're presented with it haptically and then with both of them haptically, of course, they're very good: 98%. Interestingly, in the vision-to-vision case, and this is the new modality for them, they're also quite good: 92%. But in the thing that the Molyneux test was supposed to test, touch to vision, they only get it right 58% of the time, which is above chance, but not by a lot. So I'm going to be licensed by this to say: hey, they're really good at touch to touch. They're really good at vision to vision. That means our models should be allowed to be trained on these separate modalities and become competent in the separate modalities separately.
And so can two unpaired models that are trained in different modalities, but one without labels, learn to align their representations sufficiently such that when they're joined they can solve a Molyneux task? So we have one model that's really good at vision, one model that's really good at haptics, but we don't let them align just by saying this is the sphere. Instead we say, can they align in some other way? So this is a bit different than our typical learning paradigm. Often we have paired data of the kind you saw. We have the red panda image and the text "The red panda looks at the camera." These are explicitly paired. Sometimes we have unpaired data but where each of the data is labeled. So this is labeled dog. This is labeled guinea pig. This is not paired, but it's also labeled guinea pig. So we could easily use the label to create a pairing. But we could also have purely unpaired data. And that's what I'm proposing here: that we have unpaired data that, due to the internal structure of the representations, learns to align in an unsupervised way to solve a task. So let's say we only give one of the models the real labels. The other model has dummy labels. So they're not able to align on the labels. Can they now solve some kind of identification task in the other modality after doing this alignment? Of course the question in the Molyneux literature is always what's a fair characterization of the Molyneux problem. So if this doesn't seem to you like a totally fair reconstruction, just think of it as a task that we might motivate for other reasons. But one reason that I think it is a potentially fair task is that we can think of it as a scientific learning problem. Can a system or perceiver reidentify the same entity or process in two different modes of perception? So here we might say we have to be able to recognize it in one modality and then we have this new modality. The known entity or process isn't categorized or labeled in it. 
The Molyneux learner has to identify the old entity in the new form of data. But we can't reach out and touch it. We can't wiggle it. We can't pick up the block from the table, because that would allow us to use space and time to hot associate in this Lockean way. We also can't intervene. And so this is what I'm going to call the Molyneux problem for science. Now, do scientists ever solve this problem? Not very often. But I did come up, with some help from my astronomy friends, with a case where they do. So, here's the very early introduction of a radio telescope. Bell Labs had said, "Hey, we're tired of all this static in the transatlantic telegraph lines. Karl Jansky, can you figure out why there's all this static?" So he gets this printout of the static over time and starts to chart this path and he says, "Oh, this is kind of interesting. Every night that we run the receiver, there's static from a source that follows the same path across the sky and we don't know what it is, but we think it's thunderstorms." So for months Jansky attributes this static, so some kind of signal in the data, to a known cause, which is thunderstorms. It's not thunderstorms, of course. Later he finds that it's the Milky Way. But he is not able to determine this from any feature of the data itself. He's only able to determine it by looking at a star map and seeing what else moves across the sky in that same pattern. So he's not really solving the Molyneux problem by using, you know, Leibniz's rational analysis of the new phenomenon to see if it has the same patterns. He's just associating them in space and time. So we have this known data modality. We already knew about the Milky Way. We already knew about a bunch of celestial objects. And we have a novel data modality in which it is observable, but we aren't yet able to observe it. So I think there's a nice one-to-one correspondence here. 
Of course, it's not that common that we would do this. It's not that usual that new data modalities are discovered. We often just assume that entities located in the same place and the same time are the same thing. And more importantly, we like to intervene. That's a whole big thing we do in science. And it's only when the objects are so far away, because they're in the sky, that we can't even have that possibility. But in an interesting way, we put the language models in the position of the observer who can't intervene. They have no way to collect their own data. They just get this static pile of data and they're being asked to solve problems and generate patterns of this data. So I think it's actually fair in this case to say, if an automated system with access only to static data could solve scientific Molyneux problems, that would actually give us much better evidence for convergence than increasing their mean k-nearest-neighbor scores. So I've talked about what the platonic representation hypothesis is. I've said it doesn't uniquely support strong realism. But then I tried to give a constructive solution, which is to say, if two models could achieve unsupervised alignment that would allow models from one modality to work with the embedding space of the other modality to identify objects in that second modality without aligning on their text labels, that would be a good measure to say that they have indeed converged. Thank you very much.
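
Editorial illustration: the yes-or-no criterion proposed in the talk can be sketched as a toy test. This is not the speaker's method, only a minimal sketch under loud assumptions. The function names (`distance_profile`, `align_unsupervised`, `molyneux_test`), the matching heuristic (comparing each item's sorted within-modality distances), the 0.9 pass threshold, and the 2D toy "touch" and "vision" spaces are all hypothetical choices made for illustration. The only idea taken from the talk is that alignment must come from the internal structure of each representation space, with no pairs and no shared labels, and that the outcome is a discrete pass or fail.

```python
import math
import random


def distance_profile(points, i):
    """Sorted distances from item i to every other item, divided by their mean.

    Uses only within-modality geometry, so it needs no labels and no pairs.
    """
    dists = sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)
    mean = sum(dists) / len(dists)
    return [d / mean for d in dists]


def align_unsupervised(space_a, space_b):
    """Guess, for each item in A, the index of its counterpart in B."""
    profiles_b = [distance_profile(space_b, j) for j in range(len(space_b))]
    guesses = []
    for i in range(len(space_a)):
        profile_a = distance_profile(space_a, i)
        best = min(range(len(space_b)),
                   key=lambda j: math.dist(profile_a, profiles_b[j]))
        guesses.append(best)
    return guesses


def molyneux_test(space_a, space_b, true_match, threshold=0.9):
    """Return (accuracy, passed). The 0.9 threshold is an arbitrary assumption."""
    guesses = align_unsupervised(space_a, space_b)
    accuracy = sum(g == t for g, t in zip(guesses, true_match)) / len(true_match)
    return accuracy, accuracy >= threshold


rng = random.Random(0)
n = 20

# Hypothetical "touch" embeddings: one 2D point per object.
touch = [(rng.random(), rng.random()) for _ in range(n)]

# Vision presents the same objects in a shuffled, unknown order.
order = list(range(n))
rng.shuffle(order)
true_match = [order.index(i) for i in range(n)]  # held out, used only for scoring

theta, scale = 0.7, 3.0


def rotate_and_scale(p):
    """Idealized convergence: same geometry up to rotation and scale."""
    x, y = p
    return (scale * (x * math.cos(theta) - y * math.sin(theta)),
            scale * (x * math.sin(theta) + y * math.cos(theta)))


vision_converged = [rotate_and_scale(touch[k]) for k in order]
vision_unrelated = [(rng.random(), rng.random()) for _ in range(n)]  # no shared structure

acc_converged, passed_converged = molyneux_test(touch, vision_converged, true_match)
acc_unrelated, passed_unrelated = molyneux_test(touch, vision_unrelated, true_match)
print(acc_converged, passed_converged)
print(acc_unrelated, passed_unrelated)
```

In this idealized setup the converged pair should be matched from geometry alone, while the unrelated pair should land near the chance rate of 1 in 20. Real embedding spaces would be far noisier and higher-dimensional, so the heuristic here stands in for whatever unsupervised alignment procedure one would actually use.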

Kathleen Creel (Northeastern University): 'Against the Platonic Representation Hypothesis'
