

Turing Award winner Yann LeCun shares his views on the future of AI at the Beijing BAAI Conference - PingWest (品玩)

PingWest news, June 9: At the BAAI Conference in Beijing on June 9, Yann LeCun delivered a keynote titled "Towards Machines That Can Learn, Reason and Plan". LeCun is a renowned computer scientist and Turing Award laureate, best known for his pioneering work in artificial intelligence, particularly in deep learning and convolutional networks. He is regarded as one of the founding fathers of modern deep learning, and his pioneering contributions continue to shape the landscape of AI research and innovation.

In the talk, LeCun discussed the future of AI. He argued that to close the gap between humans and animals on one side and the AI we can currently build on the other, systems need not only to learn, but also to reason and plan. He presented some preliminary results, but no complete system.

LeCun pointed out that, compared with humans and animals, machine learning is not particularly good. For decades we have relied on supervised learning, which requires too many labels. Reinforcement learning works well, but requires an enormous number of trials to learn anything. In recent years we have of course also been using a lot of self-supervised learning. The resulting systems, however, are somewhat specialized and brittle: they make stupid mistakes, they do not really reason or plan, and they merely react quickly.

He therefore argued that current machine learning systems perform essentially a fixed number of computational steps between input and output, which is why they cannot reason and plan the way humans and some animals do. How, then, can machines understand how the world works and predict the consequences of their actions, as animals and humans can? How can they carry out reasoning chains with an unbounded number of steps? How can they plan complex tasks by decomposing them into sequences of subtasks?

AI research will move in this direction over the next decade or so. In the talk he also discussed self-supervised learning and the enormous success it has had in machine learning over the past few years. Self-supervised learning had been advocated for seven or eight years, and it has now genuinely taken hold: many of the machine learning successes we see today, particularly in natural language processing and in text understanding and generation, are owed to it.

At the Beijing BAAI Conference, LeCun shared his views on the future direction of artificial intelligence and stressed the importance of reasoning and planning in AI systems.

(This summary was generated by ChatGPT.)

The full keynote transcript follows:

This is being provided in a rough-draft format. Communication Access Realtime Translation (CART) is provided in order to facilitate communication accessibility and may not be a totally verbatim record of the proceedings

>> MC: Hey, good morning. Good morning, Yann.

>> Yann LeCun: Good morning.

>> MC: It is really a great honor and privilege to have Professor Yann LeCun to speak at this BAAI Conference.

Professor LeCun is a renowned computer scientist best known for his groundbreaking work in the field of AI, especially in deep learning and convolutional networks. So when we talk about CNN in this field of AI, we're not referring to the TV station, but to Yann LeCun’s work: convolutional neural nets.

He is often regarded as one of the founding fathers of modern deep learning technology, having made a pioneering contribution that continues to shape the landscape of AI research and innovations.

Professor LeCun has been the recipient of numerous prestigious awards and honors throughout his career; most notably, in 2018 he received the Turing Award, what we call the Nobel Prize in computer science. Today, Professor LeCun will give a speech entitled “Toward the machines that can learn, reason and plan”. Let's welcome Professor LeCun. Yann, the floor is yours. Thank you.

>> Yann LeCun: All right, thank you very much for the introduction. I hope you can hear me fine. I'm really sorry I can't be there in person. I've not been to China for a very long time.

So I'm going to talk a little bit about the future of AI the way I see it to essentially bridge the gap between what we observe in the capacity of humans and animals and the type of artificial intelligence we are capable of producing today.

And what's missing is the capacity to not just learn but also reason and plan in AI systems. A lot of what I'm gonna talk about is kind of a direction, if you want, where I think AI research will go in the next decade or so, together with some preliminary results, but no complete systems.

The first thing I should say in colloquial English is that machine learning is not particularly good compared to humans and animals. For a few decades now, we've been using supervised learning, which requires too many labels. Reinforcement learning works well but requires an insane amount of trials to learn anything. And of course in recent years we've been using a lot of self-supervised learning. But the result is that those systems are somewhat specialized and brittle, they make stupid mistakes, they don't really reason nor plan, they sort of react really quickly. And when we compare with animals and humans, animals and humans can learn new tasks extremely quickly and understand how the world works, and can reason and plan, and they have some level of common sense that machines still don't have. And that's a problem that was identified in the early days of AI.

That's partly due to the fact that current machine learning systems have essentially a constant number of computational steps between input and output. That's why they really can't reason and plan to the same extent as humans and some animals. So, how do we get machines to understand how the world works, and predict the consequences of their actions like animals and humans can do, can perform chains of reasoning with unlimited number of steps, or can plan complex tasks by decomposing them into sequences of sub tasks?

That's the question I'm going to ask. But before I say this, I'll talk a bit about self-supervised learning and the fact that it really has taken over the world of machine learning over the last several years. This has been advocated for quite a number of years, seven or eight, and it's really happened, and a lot of the results and the successes of machine learning that we see today are due to self-supervised learning, particularly in the context of natural language processing and text understanding and generation.

So what is self-supervised learning? Self-supervised learning is the idea of capturing dependencies in an input. So we're not trying to map an input to an output. We are just being provided with an input. And in the most common paradigm we mask a part of the input and we feed it to a machine learning system, and then we reveal the rest of the input and then train the system to capture the dependency between the part that we see and the part that we don't yet see. Sometimes this is done by predicting the part that is missing, and sometimes not by predicting it entirely.

And I'll explain that in a few minutes.

That's the idea of self-supervised learning. It is called self-supervised because we essentially use supervised learning methods, but we apply them to the input itself, as opposed to matching it to a separate output that is provided by humans. So the example I show here is video prediction, where you show a short segment of video to a system, and then you train it to predict what's gonna happen next in the video. But it's not just predicting the future. It could be predicting kind of data in the middle. This type of method has been astonishingly successful in the context of natural language processing, and all the success that we see recently in large language models is a version of this idea. The basic principle here is that you take an input, a text, let's say a window over a piece of text, remove some of the parts and replace them with a blank marker, and then train the system to fill them in, which can then be used for a subsequent task.

[video call of LeCun cut]

Okay, so I was saying, this self-supervised learning technique consists in taking a piece of text, removing some of the words in that text, and then training a very large neural net to predict the words that are missing. And in the process of doing so, the neural net learns a good internal representation that can be used for a number of subsequent supervised tasks like translation or text classification or things of that type. So that's been incredibly successful. And what's been successful also is generative AI systems for producing images, videos, or text. And in the case of text, those systems are autoregressive. So the way they're trained using self-supervised learning is by not predicting random missing words, but by predicting only the last word. So you take a sequence of words, suppress the last word, and then train the system to predict the last word, the last token.

They're not necessarily words, but sub-word units. And once the system has been trained on enormous amounts of data, you can use what's called autoregressive prediction, which consists in predicting the next token, then shifting that token into the input, and then predicting the next, next token, and then shifting that into the input, and then repeating the process. So that's autoregressive LLMs, and this is what the popular models that we've seen over the last few months or years do. Some of them from my colleagues at Meta, at FAIR: BlenderBot, Galactica, and LLaMA, which is open source. Alpaca from Stanford, which is a refinement on top of LLaMA. LaMDA and Bard from Google, Chinchilla from DeepMind, and of course ChatGPT and GPT-4 from OpenAI. The performance of these systems is amazing if you train them on something like a trillion tokens or two trillion tokens. But in the end, they make very stupid mistakes. They make factual errors, logical errors, inconsistencies. They have limited reasoning abilities, they can produce toxic content, and they have no knowledge of the underlying reality because they are purely trained on text, and what that means is a large proportion of human knowledge is completely inaccessible to them. And they can't really plan their answer. There's a lot of studies about this. However, those systems are amazingly good as writing aids as well as for generating code, for helping programmers write code.
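To make the loop he describes concrete, here is a minimal sketch of autoregressive decoding in Python. It assumes a hypothetical `model` callable that maps a batch of token ids to next-token logits; it is not the API of any particular library, and greedy selection is used only for simplicity.

```python
import torch

def generate(model, prompt_tokens, max_new_tokens=50):
    """Autoregressive decoding as described in the talk: predict the next
    token, shift it into the input, and repeat. `model` is a hypothetical
    callable returning logits of shape (batch, seq_len, vocab_size)."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        x = torch.tensor(tokens).unsqueeze(0)      # shape (1, seq_len)
        logits = model(x)[0, -1]                   # logits for the next token
        next_token = int(torch.argmax(logits))     # greedy choice, for simplicity
        tokens.append(next_token)                  # shift the prediction into the input
    return tokens
```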

So you can ask them various things, to write code in various languages, and it works pretty well. It gives you a good start. You can ask them to generate text, but again, they can hallucinate or confabulate stories. So this is a joke that my colleagues played on me. They said, did you know that Yann LeCun wrote a rap album last year? We listened to it, and here is what we thought. And the system comes up with some story that somehow I produced a rap album, which of course is not true. But if you ask it to do it, it will do it. So that's entertaining, certainly, but that makes these systems not so good as information retrieval systems or as search engines, or if you just want factual information. So again, they're good for writing assistance, first draft generation, stylistic polishing, particularly if you're not a native speaker of the language that you're writing in. They're not good for producing factual and consistent answers, or for taking into account recent information, because they have to be retrained for that. And they may mimic behavior seen in the training set, but that is no guarantee that they will behave properly. And then there are issues like reasoning, planning, doing arithmetic and things like this, for which they would need to use tools such as search engines, calculators, or database queries. So that's a really hot topic of research at the moment: how to essentially get those systems to call tools. These are called augmented language models. And I co-authored a review paper on this topic with some of my colleagues at FAIR on the various techniques that are being proposed for augmented language models. We're easily fooled by their fluency into thinking that they are intelligent, but they really are not that intelligent. They're very good at retrieving memories, approximately. But again, they don't have any understanding of how the world works. And then there is kind of a major flaw, which is due to the autoregressive generation. If we imagine that the set of all possible answers, so sequences of tokens, is a tree, if you want, here represented by a circle. But it's really a tree of all the possible sequences of tokens. Within this enormous tree, there's a small subtree that corresponds to the correct answers to the prompt that was given. If we imagine that there is an average probability e that any produced token takes us outside of that set of correct answers, and that the errors are independent, then the probability that an answer of n tokens is correct is (1 - e)^n.

And what that means is that there is an exponentially diverging process that will take us out of the tree of correct answers. And this is due to the autoregressive prediction process. There's no way to fix this other than making e as small as possible. And so we have to redesign the system in such a way that it doesn't do that. And in fact, other people have pointed to the limitations of some of those systems. So I co-wrote a paper, which is actually a philosophy paper, with my colleague Jacob Browning, who is a philosopher, about the limits of training AI systems using language only.
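As a quick numerical illustration of the (1 - e)^n argument above, here is a tiny sketch; the particular values of e and n are only examples, not figures from the talk.

```python
def p_correct(e, n):
    """Probability that an n-token answer stays inside the tree of correct
    answers, if each token independently leaves it with probability e."""
    return (1 - e) ** n

# Even a small per-token error rate compounds quickly:
print(p_correct(0.01, 100))   # ~0.37
print(p_correct(0.01, 500))   # ~0.0066
```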

Those systems have no experience of the physical world, and that makes them very limited. There are a number of papers, either by cognitive scientists, such as the one on the left here from an MIT group, that basically say the kind of intelligence that these systems have is very restricted compared to what we observe in humans and animals. And there are other papers by people who come from more classical AI, if you want, so they're not machine learning based, that try to analyze the planning abilities of those systems and basically conclude that those systems cannot really plan, at least not in ways that are similar to what people were doing with more classical AI in search and planning. So how is it that humans and animals can learn so quickly? What we see is that babies learn an enormous amount of background knowledge about how the world works in the first few months of life. They learn very basic concepts like object permanence, the fact that the world is three-dimensional, the difference between animate and inanimate objects, the notion of stability, the learning of natural categories, and very basic things like gravity, the fact that when an object is not supported, it's going to fall. Babies learn this around the age of nine months, according to this chart that was put together by my colleague Emmanuel Dupoux.

So if you show a five-month-old baby the scenario here at the bottom left, where a little car is on a platform, and you push the car off the platform, and it appears to float in the air, five-month-old babies are not surprised. But ten-month-old babies will be extremely surprised and look at the scene like the little girl at the bottom here, because in the meantime they've learned that objects are not supposed to stay in the air. They're supposed to fall under gravity. So those basic concepts are learned in the first few months of life, and I think what we should reproduce with machines is this ability to learn how the world works by watching the world go by or experiencing the world. So how is it that any teenager can learn to drive a car in 20 hours of practice, and we still don't have completely reliable level 5 autonomous driving, at least not without extensive engineering and mapping and LIDARs and all kinds of sensors, right? So obviously we're missing something big. How is it that we have systems that are fluent, that can pass the law exams or medical exams, but we don't have domestic robots that can clear up the dinner table and fill up the dishwasher, right? This is something that any 10-year-old can learn in minutes, and we still don't have machines that can come anywhere close to doing this. So we're obviously missing something extremely major in the AI systems that we currently have. We are nowhere near reaching human-level intelligence.

So, how are we going to do this? In fact, I've kind of identified three major challenges for AI over the next few years. The first is learning representations and predictive models of the world using self-supervised learning.

The second is learning to reason. This corresponds to the idea from psychology, from Daniel Kahneman, for example, of System 2 versus System 1. So System 1 is the kind of human action or behavior that corresponds to subconscious computation, things that you do without thinking about them. System 2 is the things that you do deliberately, consciously, using the full power of your mind. And ChatGPT basically only does System 1 and is not very intelligent at all.

And then the last thing is learning to plan complex action sequences hierarchically, by decomposing complex tasks into simpler ones. I wrote a vision paper about a year ago that I put up on OpenReview, which you're invited to look at, which is basically my proposal for where I think AI research should go over the next 10 years: a path towards autonomous machine intelligence, which is essentially a longer version of the talk you're hearing right now.

It's based around this idea that we can organize a variety of modules into what's called a cognitive architecture, and within this system I'm proposing, the centerpiece is the world model.

So the world model is something that the system can use to basically imagine a scenario, imagine what's going to happen, perhaps as a consequence of its actions. So the purpose of the entire system is to figure out a sequence of actions that, according to its own prediction using its world model, is going to minimize a series of costs. The costs you can think of as measuring the level of discomfort of this agent. By the way, many of those modules have corresponding subsystems in the brain. The cost module would be something like the amygdala; the world model, the prefrontal cortex; short-term memory would be the hippocampus; the actor could be the premotor cortex. And of course the perception system is the back of the brain, where all the perception analysis of sensors is performed.

The way this system operates is that it perceives the state of the world, combining that with its previous idea of the world that may be stored in a memory. It then uses the world model to predict what would happen if the world just evolves, or what would happen as a consequence of actions that the agent would take, actions which come from this yellow actor module.

The actor module proposes a sequence of actions. The world model simulates the world and figures out what's gonna happen as a consequence of those actions.

And then a cost is computed. Then, what's going to happen is, the system is going to optimize the sequence of actions so as to minimize the cost. So what I should say is that whenever you see an arrow here going one way, you also have gradients going backwards. So I'm assuming that all of those modules are differentiable, and we can do this inference of the sequence of actions by back-propagating gradients so as to minimize the cost. This is not minimization with respect to parameters, which would be for learning. This is minimization with respect to latent variables. And this is taking place at inference time.

So there are really two ways to use that system. One is akin to System 1, which I call Mode-1 here, where basically it is very reactive. So you observe the state of the world, run it through a perception encoder, which gives you an idea of the state of the world, and then run this directly through a policy network, the actor, that just directly produces an action. So there's no need for a world model here, it's just a reactive policy.

Then Mode-2 is one in which you observe the world and extract a representation of the state of the world, s[0]. Then the system imagines a sequence of actions, from a[0] up to some horizon T. And then, using its world model, it imagines the result, the effect of taking those actions. Those predicted states are fed to a cost function, and the entire purpose of the system is basically to figure out the sequence of actions that will minimize the cost according to the prediction. So this world model here, which is applied repeatedly at every time step, essentially predicts the representation of the state of the world at time t+1 from the representation of the world at time t and a proposed action. This idea is very much similar to what people in optimal control call model predictive control. And there are a number of models that have been proposed in the context of deep learning that use this idea for planning: trajectory work by Mikael Henaff, myself and colleagues in 2019, a series of works by Danijar Hafner, Dreamer V1, V2, et cetera, and other work from Berkeley and various other places.
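To make the Mode-2 idea concrete, here is a minimal sketch of planning by gradient-based optimization of an action sequence through a differentiable world model, in the spirit of model predictive control. The functions `world_model` and `cost_fn`, and all the dimensions, are hypothetical placeholders rather than components of any released system.

```python
import torch

def plan(world_model, cost_fn, s0, horizon=10, steps=100, lr=0.1, action_dim=4):
    """Mode-2 planning sketch: optimize a sequence of actions so that the
    states predicted by a differentiable world model minimize a cost.
    `world_model(s, a)` returns the predicted next state representation;
    `cost_fn(s)` returns a scalar cost. Both are hypothetical stand-ins."""
    actions = torch.zeros(horizon, action_dim, requires_grad=True)
    optimizer = torch.optim.Adam([actions], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        s, total_cost = s0, 0.0
        for t in range(horizon):
            s = world_model(s, actions[t])   # imagine the consequence of action t
            total_cost = total_cost + cost_fn(s)
        total_cost.backward()                # gradients flow back through the world model
        optimizer.step()                     # inference-time optimization of latent actions
    return actions.detach()
```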

The question here is, how do we learn this world model, essentially? If we skip ahead, what we expect to build is a more complicated version of this, where we have a hierarchical system that, through a sequence of encoders, extracts more and more abstract representations of the state of the world and uses different levels of world models, of predictors, to predict the state of the world at different levels of abstraction and make predictions at different time scales. At the higher level here, for example, if I want to go from New York to Beijing, the first thing I need to do is go to the airport and then catch a plane to Beijing. So that would be sort of a high-level representation of the plan. The ultimate cost function could represent my distance from Beijing, for example. The first action would be: go to the airport, and my state would be, am I at the airport? Then the second action would be: catch a plane to Beijing. How do I go to the airport? From, let's say, my office in New York, the first thing I need to do is go down the street and catch a taxi, and tell the driver to go to the airport. How do I get down to the street? I need to stand up from my chair, go to the exit, open the door, go down the street, et cetera. And you can imagine decomposing this task all the way down to the millisecond-by-millisecond control that you need to do to accomplish it.

So all complex tasks are accomplished hierarchically in this way and this is a big problem that we don't know how to solve with machine learning today. So, this architecture that I'm showing here no one has built it yet. No one has proved that you can make it work. So I think that's a big challenge, hierarchical planning.

The cost function can be composed of two sets of cost modules, and would be modulated by the system to decide what task to accomplish at any one time. So there are two submodules in the cost. Some are intrinsic costs that are hard-wired and immutable. You can imagine that those cost functions will implement safety guardrails to ensure that the system behaves properly and is not dangerous, is not toxic, et cetera. That's a huge advantage of those architectures: you can put in costs that are optimized at inference time.

You can guarantee that those criteria, those objectives will be enforced and will be satisfied by the system’s output. This is very different from autoregressive LLMs where there is no way basically to ensure that their output is good and non-toxic and safe.

Here, it's possible to impose this, because it's basically hardwired into the system that it needs to optimize those cost functions. And then there might be other cost functions, perhaps trainable, that model the satisfaction of the task that is being accomplished. And those could be trained. Those are similar to critics that are used in reinforcement learning. I'm assuming that all of those modules are differentiable. So how are we going to build and train this world model? Again, we're going to do something like in this little example here: kind of capture the dependencies in video, perhaps predict a future segment of the video from the present or the past. And if you train a neural net to just predict videos, it doesn't work very well. You get very blurry predictions. This is old work here at the top, from 2016, where we attempted to do video prediction. And what you get are blurry predictions, because the system cannot predict in advance what is going to happen in the video, so it predicts an average of all outcomes. And that ends up being a blurry prediction. There's a similar scenario here at the bottom, where this is footage from a top-down camera looking at a highway and following a blue car, and the predicted frames from this video prediction system are blurry; again, the system doesn't know exactly what's going to happen, and so it predicts an average.

and that's a bad prediction.

So the solution I'm going to propose to basically deal with this issue of multimodality is what's called a joint embedding predictive architecture. So let me explain a little bit what this is. We have generative architectures, things like autoencoders or variational autoencoders, masked autoencoders, or the type of large language models that I was talking about earlier.

And there, what you do is you have an observed variable X, an encoder, and a predictor or a decoder that produces a prediction for the variable you want to predict, Y, and then you measure a divergence of some kind, some distance, between the predicted Y and the observed Y. And there's a flaw with this, and the flaw is that the system has to predict every single detail about Y. This is not too much of a problem when you're predicting words, because you can represent a distribution over words using a softmax, a long vector of probabilities, numbers between 0 and 1 that are normalized. So it's easy when Y is a discrete variable, but when Y is a continuous variable, like a video frame, for example, or a set of video frames, predicting all the details in a video frame is essentially impossible, and most of the details are not even useful to capture. So for this, I propose the architecture on the right, which is called a joint embedding architecture, or in this case, a joint embedding predictive architecture. Basically, both X and Y are fed through encoders, and the prediction takes place not in the input space, but in representation space, so there's a predictor network that tries to predict the representation of Y, SY, from the representation of X, SX. A form of this joint embedding architecture is something that I proposed many years ago, in the early 90s, called Siamese networks, and those kinds of architectures have become somewhat popular over the last few years for self-supervised learning in the context of images. So there are basically no generative architectures today that work properly to learn features for vision, with one exception, which is the masked autoencoder from my colleagues at FAIR, including Kaiming He.

But it doesn't work very well, in the sense that the features that are produced are not very good. You need to fine-tune the system to get good results. So if you want features that are good without fine-tuning, the generative methods are really not good. You need to use the joint embedding architectures. So there are a number of methods, like PIRL and MoCo, SimCLR, Barlow Twins, and VICReg, that essentially use this joint embedding architecture, and they do work very well for self-supervised learning and produce extremely good features.

Another method that is not listed here is Dino, which is a method by my colleague at FAIR in Paris. A slightly more sophisticated version includes a predictor that tries to predict the representation of y from that of x, as opposed to just trying to make them equal. But that would be a deterministic prediction.

On the right is a version of that system in which the predictor takes as input a latent variable, and that latent variable can be used to parametrize a set of possible predictions of SY and basically make non-deterministic predictions. So you would have a system of this type where, if you want to use this architecture to build a world model, the predictor would take two variables, a set of latent variables that represent everything you don't know about the world, all the unobserved variables that represent the state of the world that would allow you to predict what's going to happen next, and then another set of variables that would represent an action you might take.

So from an observation of the state of the world X, you extract a representation of it, then give it an action, and through maybe sampling a set of latent variables, you can make one prediction about the representation of the state of Y, and that Y variable can be run through an encoder. So the next question is, how do we train this system? And that system might include other criteria represented at the top right, like C of S, that may represent constraints that we want the system to satisfy, or cost functions that indicate what type of information we want the system to learn to extract.
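Below is a minimal sketch of the kind of JEPA-style world-model step just described: encode x and y, and predict the representation of y from the representation of x, an action, and a latent variable. The module names, dimensions, and MLPs are illustrative assumptions, not the architecture from LeCun's paper.

```python
import torch
import torch.nn as nn

class JEPAWorldModel(nn.Module):
    """Sketch of a joint embedding predictive architecture used as a world
    model: encode x and y, then predict s_y from s_x, an action a, and a
    latent z. Dimensions and layer choices are illustrative only."""
    def __init__(self, x_dim=128, rep_dim=64, action_dim=8, latent_dim=16):
        super().__init__()
        self.enc_x = nn.Sequential(nn.Linear(x_dim, rep_dim), nn.ReLU(), nn.Linear(rep_dim, rep_dim))
        self.enc_y = nn.Sequential(nn.Linear(x_dim, rep_dim), nn.ReLU(), nn.Linear(rep_dim, rep_dim))
        self.predictor = nn.Sequential(
            nn.Linear(rep_dim + action_dim + latent_dim, rep_dim), nn.ReLU(), nn.Linear(rep_dim, rep_dim))

    def forward(self, x, y, action, z):
        s_x = self.enc_x(x)                       # representation of the observation
        s_y = self.enc_y(y)                       # target representation (not the raw y)
        s_y_pred = self.predictor(torch.cat([s_x, action, z], dim=-1))
        return s_y_pred, s_y                      # prediction error lives in representation space
```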

So before I talk about how we train this, I have to talk a little bit about what's called energy-based models, which is the framework through which we can understand this, because we can't understand those systems through the traditional methods of probabilistic modeling, unfortunately. So what is an energy-based model? An energy-based model is a function, an energy function, that captures the dependency between, let's say, two sets of variables, x and y. And the way it does this is by producing a scalar output that takes low values when x and y are compatible with each other. So for example, y is a good continuation for the video clip x. And it takes higher values whenever y is not compatible with x. This is symbolized here in this diagram on the right, where, let's say, x and y are scalar variables, and the set of data points that you observed are those black dots.

So you want this energy function, which is going to be parameterized through a neural net and trained, to take low values around the data points, and then higher values outside. Basically, you can think of this as an implicit function that represents the dependency between x and y. Through learning, it's really easy to train this energy function to give low energy to the data points you trained on. But it's harder to figure out how to make sure that the energy is higher outside of the training samples. So there are two classes of methods to do so. One is called contrastive methods, and it consists in generating contrastive points, those green flashing dots here, outside of the manifold of data, and then pushing their energy up.

So you simultaneously push down the energy of the data points that you have, and then you generate contrastive data points, which must be hallucinated, if you want, and push their energy up. And so by pushing down and pushing up in the right places, the energy surface will take the right shape. But my preference goes to something called regularized methods, and they work by essentially imposing a regularizer on the energy function, so that the volume of space that can take low energy is limited or minimized. So whenever you push down on the energy of some region, the other regions have to go up, because there is only a limited volume that can take low energy.

So that's those two categories. I think there is a major limitation with contrastive methods, which is that as the space of y increases in dimension, the number of contrastive points you need to generate for the energy function to take the right shape grows exponentially with the dimension. And I think that's a flaw. Whereas regularized methods tend to be more efficient by basically minimizing the volume of space that can take low energy. So basically I'm asking you to accept four things.
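For concreteness, here is a tiny sketch of the contrastive recipe he describes for training an energy-based model: push the energy of observed pairs down and the energy of generated contrastive pairs up. The `energy_fn`, the source of `y_neg`, and the margin value are hypothetical placeholders.

```python
import torch

def contrastive_ebm_loss(energy_fn, x, y_pos, y_neg, margin=1.0):
    """Contrastive training of an energy-based model: lower the energy of
    compatible (x, y_pos) pairs and raise the energy of contrastive
    (x, y_neg) pairs, with a hinge at `margin`. `energy_fn` is any
    differentiable network returning one scalar per sample (hypothetical)."""
    e_pos = energy_fn(x, y_pos)                                 # energy of data pairs
    e_neg = energy_fn(x, y_neg)                                 # energy of contrastive pairs
    return e_pos.mean() + torch.relu(margin - e_neg).mean()     # push down on data, up on negatives
```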

The first thing is to abandon generative models.

This is not very popular at the moment, because generative models are all the rage, everybody is talking about them. What I'm saying is that if you want to have a system that can learn world models, it's not enough to train it to produce video frames, for example, because it's too complicated to make predictions in high dimensional continuous spaces.

So we have to use joint embedding architectures, which means we have to abandon the idea that we're going to have generative models. For text it's fine, but not for video. We have to abandon the idea of using probabilistic modeling and use those energy-based frameworks. The reason being that if you're going to run y through an encoder, you cannot invert this encoder function to produce a conditional probability p of y given x. This is not possible in this context. You have to abandon the idea that you're going to model a distribution. You're just going to compute this energy function, which is a proxy for something like a distribution.

I want to tell you to abandon contrastive methods in favor of those regularized methods.

And of course, for many years now, I've said that reinforcement learning should be minimized; we should minimize the use of reinforcement learning because it's extremely inefficient. So basically use planning wherever we can, and only reinforcement learning where we have no choice. So what are those regularized methods for joint embedding? This is really kind of the core of the talk, if you want, the interesting part. So here's the problem: we're gonna have a collapse issue, which is that if we just train one of these Joint Embedding Predictive Architectures (JEPA) to minimize the prediction error, which is the error you make by predicting the representation of Y from the representation of X, it's not going to work. The system is basically going to collapse, because it's going to ignore X and Y and produce Sx and Sy vectors that are constant, so that the prediction error is zero. And that model is not going to capture the dependency between X and Y, because it's going to give you zero energy for everything. Right? You will have succeeded in making the energy low on your training samples, but the energy would be zero everywhere. You have to prevent this collapse from happening. The technique I'm proposing here is to essentially have some measure of the information content of Sx and the information content of Sy and maximize them.

So find a way of maximizing the information content of both Sx and Sy, and that will prevent this collapse. In fact, this has the regularization effect that I was telling you about earlier.

Basically minimizing the volume of space that can take low energy.

If you have a latent variable for the predictor, you also need to regularize the information content of that latent variable. But there you need to minimize the information content.

If you have a latent variable that is too informative, you're also going to have a collapse: the model will give zero energy to any pair Sx, Sy.

How are we going to maximize the information content of Sx and Sy? One way in which those systems can collapse is by making Sx and Sy constant. You can prevent this from happening by having a cost function at training time that basically guarantees that the standard deviation of every component of Sx is above a certain threshold, in this case 1. So this is a hinge loss that pushes the standard deviation of every component of Sx to be larger than 1. Okay? So the system cannot output just constant outputs. And this is computed over a batch, if you want, of a few samples.

We do the same thing for both Sx and Sy. So this is called variance regularization. There are a number of papers here at the bottom that have been using this method, coming out of my group at Meta: Barlow Twins, which was the first version, VICReg, which is an improved version, and VICRegL for image segmentation. And this is similar to work by Yi Ma at Berkeley, who has actually been in Hong Kong for some time now, co-authored with Harry Shum, called MCR squared, which has a very similar idea.

So that is not sufficient to prevent the system from collapsing, because the system could choose to make all the components of Sx equal, or very correlated. So what you have to do as well is basically minimize the off-diagonal terms of the covariance matrix of Sx. So you take pairs of coordinates, measure their covariance, and for two different coordinates you try to minimize the covariance term. So basically, the cost function is the square of the covariance term for every pair of variables. So there will be no correlated variables.

And that's still not sufficient. So what we do is play a trick here. We run Sx through an expander module that produces a larger vector, Vx. Then we apply the variance-covariance criterion to this larger vector Vx. And this actually has the effect, and we have some papers on this, of making the components of Sx more independent of each other, at least pairwise independent.

So this method is called VICReg, and that means variance invariance covariance regularization. So variance is this criterion, covariance is that criterion, and then invariance is this prediction criterion. VICReg.
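Putting the three criteria together, here is a sketch of a VICReg-style loss on two batches of embeddings. The loss weights and epsilon are commonly used defaults rather than values quoted in the talk, and the expander module is omitted for brevity, so treat this as an illustration of the idea, not a reference implementation.

```python
import torch
import torch.nn.functional as F

def vicreg_loss(z_a, z_b, sim_w=25.0, var_w=25.0, cov_w=1.0, eps=1e-4):
    """VICReg-style criterion: invariance (prediction error), variance
    (hinge pushing each component's std above 1), and covariance
    (squared off-diagonal terms of the covariance matrix)."""
    n, d = z_a.shape
    invariance = F.mse_loss(z_a, z_b)                          # invariance / prediction term

    std_a = torch.sqrt(z_a.var(dim=0) + eps)
    std_b = torch.sqrt(z_b.var(dim=0) + eps)
    variance = torch.relu(1.0 - std_a).mean() + torch.relu(1.0 - std_b).mean()  # hinge at 1

    z_a_c = z_a - z_a.mean(dim=0)                              # center before covariance
    z_b_c = z_b - z_b.mean(dim=0)
    cov_a = (z_a_c.T @ z_a_c) / (n - 1)
    cov_b = (z_b_c.T @ z_b_c) / (n - 1)
    off_diag = lambda m: m - torch.diag(torch.diag(m))
    covariance = (off_diag(cov_a) ** 2).sum() / d + (off_diag(cov_b) ** 2).sum() / d

    return sim_w * invariance + var_w * variance + cov_w * covariance
```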

We're really happy with that method. It works very well. So we can train one of the joint embedding architectures with pairs of images which are distorted versions of each other. It's the same model as I was talking about earlier where you take an image and you corrupt it. And here in this case, you transform it in some ways.

You train this joint embedding architecture with VICReg. And then you chop off the expander, or the projector in this case, and use the representation produced by the system as input to a linear classifier that you train supervised, and measure the performance. And this works very well. You can get something in the mid-70 percent range correct on ImageNet with this architecture. We're using convolutional net architectures.

Something like ConvNeXt and things like that. You can then apply the features, which were pre-trained on distorted versions of ImageNet images, to other tasks. And it works pretty well, including for segmentation, but not perfectly. So you can improve that system by adding a predictor that tries to match the location of features in the two images.

That would be VICRegL, for local, and that works really well for segmentation. So that system really produces fantastic features for segmentation. I should say that another method, also a joint embedding architecture, by our colleagues at FAIR Paris, called DINO and DINOv2, also produces excellent features for segmentation. It's very similar in many ways; the collapse prevention is different and the representations are quantized, but in many ways it's a very similar idea.

Let me skip ahead a little bit. So this is a more recent work called I-JEPA. This is one of those joint embedding predictive architectures, and it's purely based on masking. So we're not doing data augmentation. We're just taking an image, running it into an encoder, producing a representation. And in this case, the encoder is a transformer. We get a complete representation of the image through the transformer. And then we run a partially masked version of the image through the same transformer, so those two share the weights. This produces partial features. Then we train a predictor to predict the representations produced from the full image from the representations obtained with the partial image.

Again, this is I-JEPA, a recent paper by Mido Assran from FAIR in Montreal in collaboration with some of us. This works amazingly well. It produces really good features. It's very fast.

It doesn't require any data augmentation, which is a huge advantage. It works so much better than, for example, the masked autoencoder, which is a generative architecture. And it really doesn't require any data augmentation. It can reach something like 81% on ImageNet, which is similar to what other methods can reach using data augmentation. So this is without using data augmentation, purely using masking, which is pretty amazing.
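Here is a minimal sketch of the masking-based training step described above: the same encoder sees the full image and a masked view, and a predictor must recover the full-image representations from the partial ones. The `encoder` and `predictor` modules, the tensor shapes, the zeroing of masked patches, and the stop-gradient on the target are simplifying assumptions, not the official I-JEPA implementation.

```python
import torch
import torch.nn.functional as F

def ijepa_step(encoder, predictor, image_patches, mask):
    """One masked-prediction training step in the spirit of I-JEPA.
    `image_patches` has shape (num_patches, patch_dim); `mask` is a boolean
    tensor over patches marking the parts hidden from the context view.
    `encoder` and `predictor` are hypothetical shape-preserving modules."""
    with torch.no_grad():                                       # stop-gradient on the target branch
        target_reps = encoder(image_patches)                    # representations of the full image
    context = image_patches * (~mask).unsqueeze(-1)             # masked view (zeroed patches), shared encoder weights
    predicted = predictor(encoder(context))                     # predict the missing representations
    return F.mse_loss(predicted[mask], target_reps[mask])       # loss only on the masked locations
```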

And of course, it works really well for transfer tasks and things like that. So, how are we going to use this? I'm coming to the end of my talk here. We're going to use those JEPA architectures to build the hierarchical world models that will be integrated into this overall architecture that will allow systems to plan. What I'm talking about here is the future, because we haven't built all of this completely. We have some demos of some subsets, but no real complete system yet.

We can use one of the JEPA architectures to extract low level representations of the input and make predictions at short term with lots of details.

Or what we can do is run it through another encoder that is going to extract a higher-level, more abstract representation, and make predictions that are longer-term, so further in the future. The lower level may involve actions that are real actions in the real world, and the higher level will involve latent variables that may represent high-level actions that do not correspond to real actions. And we could use this architecture in the context of the hierarchical planning that I was describing at the beginning of the talk.

So we built some versions of this type of hierarchical architecture for video prediction. We don't think we have a perfect recipe yet, but we have something that seems to learn pretty good features from video. It basically takes two frames from a video, runs them through encoders, and has a predictor that tries to predict the representation of the second frame from the representation of the first frame. But it does this hierarchically. And simultaneously, the same encoder is used to train from images.

It uses the VICReg criterion I was talking about earlier; both levels use the variance-covariance regularization. I'm not going to describe the details of this architecture, but basically it's a large convolutional net. The predictors have a particular structure that basically forces the representations to follow some sort of geometric warping, and the warp vector fields are predicted from top to bottom. And basically that system spontaneously, using self-supervised learning, learns the distortion between two successive frames in the video, and learns to extract a representation that captures all the information. If you train that simultaneously with a sort of global image recognition criterion, you get systems that, in the end, can estimate optical flow, but also extract representations that can be used for object recognition and segmentation. We have results, which I'm not going to bore you with, of a system that can simultaneously perform good optical flow estimation, object recognition, object segmentation, and object tracking as well, using completely self-supervised learning.

I'm coming to the end. So if we are successful in building systems of this type that are completely general, we're gonna be able to integrate them into systems that can predict what's gonna happen as a consequence of actions or latent variables, and then use the system to do planning. And to some extent, planning through energy minimization can be seen as a very general framework for reasoning. Reasoning can always be sort of formulated in terms of minimizing some energy function, so this is qualitatively much more powerful than autoregressive generation.

So the steps towards AI systems - self-supervised learning, handling uncertainty in predictions using those JEPA architectures and framework, learning world models from observation using SSL and reasoning and planning by using energy minimization with respect to action variables.

And so, this is the kind of reasoning that does not involve any symbols or any logic, it's basically just continuous optimization with vectors. Okay, is there a chance that this kind of architecture will lead us towards human-level intelligence? Perhaps.

We don't know yet because we haven't built it, and it's probably going to take quite a long time before we make it work completely. So, it's not going to happen tomorrow. But, there’s good hope that this will take us a little closer, perhaps, to human-level intelligence where systems can reason and plan. An interesting tidbit about this is that those systems will necessarily have emotions because the ability to predict an outcome, whether an outcome is going to be good or bad, which is basically implemented by the cost module, is kind of very similar to emotions in animals, right?

So, fear is caused by anticipation of a potentially negative result, and elation is produced by the prediction of a very positive result and things like that, right? So, systems that are driven by objectives like this, I actually call this objective-driven AI, will have things like emotion, it will have drives, it will be controllable because we can set goals for them through those cost functions. So, much more controllable than autoregressive models. And, by designing those cost functions appropriately, we can make sure that those systems won't want to take over the world and, on the contrary, be subservient to humans and safe. And, I'm going to stop here and thank you very much for your attention.

>> MC: Thank you very much, Yann, for the very insightful talk. I know it’s very early in the morning, but we’d still like to spend another 10 minutes to entertain some questions. I would like to invite Professor Zhu Jun who is our BAAI chief scientist in machine learning and professor from Tsinghua University to help me lead the Q&A session with you, Yann. Hopefully 10 minutes, but the question may be long.

>> ZHU Jun: Hello Professor LeCun. Nice to see you again. So I will host the Q&A session. First of all, thank you again for getting up so early to give this very thoughtful talk with so many insights. Considering the time constraint, I selected a few questions. We have many, many questions we'd like to ask, but I selected a few to ask you. So as you said in your talk, you raised a few issues about generative models, and I agree with many of them. But regarding the basic principle of these generative models, I still have a question for you. By definition, these generative models are in principle defined to output over multiple choices. Also, when we apply generative models, diversity and creativity are desirable properties, so we often apply these models to output diverse results. Does that mean that factual errors, logical errors and inconsistencies are actually unavoidable for such models? Even if you have data, because in many cases the data may have conflicting facts. You also mentioned uncertainty in the prediction. So that's my first question. What's your thought on this?

>> Yann LeCun: Right. So I don't think the issues with autoregressive prediction models, generative models, are fixable while preserving autoregressive generation. I think those systems are inherently uncontrollable. And so I think they have to be replaced by the kind of architecture I'm proposing, where at inference time you have the system optimize some cost, some criteria. That's the only way to make them controllable and steerable, and able to plan; the systems will be able to plan their answer. You know, when you're giving a speech like the one I just gave, you plan the course of the talk, right? You go from one point to another. You explain things. You're planning this in your head when you're designing the talk. You're not just improvising a talk one word after the other. Maybe at a lower level you're improvising the words, but at a higher level you're planning. So the necessity of planning is really obvious. And the fact that humans and many animals are capable of planning is, I think, an intrinsic property of intelligence. So my prediction is that within a relatively short number of years, certainly within 5 years, nobody in their right mind would be using autoregressive LLMs, okay? Those systems are going to be abandoned pretty quickly. Because they're not fixable.

>> ZHU Jun: Okay. Let's see, so I guess another issue is about control. In your design, in your framework, a key component is the intrinsic cost module, right? Which is designed to basically determine the nature of the agent's behavior. So I looked at your working paper on OpenReview, and I share a common concern with one comment online. It said that possibly this module will not work as specified. Maybe the agent finally [screen froze].

>> Yann LeCun: Designing the cost modules that guarantee the safety of the system is not gonna be a trivial task. But I think it's going to be a fairly straightforward task. Something that will require a lot of careful engineering and fine tuning, and some of those costs may have to be trained as opposed to just designed. It's very much the same thing as a critic in reinforcement learning, or what's called a reward model in the context of LLMs. So, something that goes from the internal state of the system to a cost. You can train a neural net to predict the cost. You train it by just exposing it to a lot of outputs: you have the system produce a lot of outputs and then have someone or something rate those outputs. And so that gives you a target for the cost function. You can train a small neural net on that to compute the cost, and then after you have it, you can back-propagate through it to guarantee that this property is satisfied. So yeah, I think we're going to have to move from designing architectures and costs for LLMs towards designing cost functions. Because those cost functions are going to be driving the nature and behavior of the systems, okay? And contrary to some of my colleagues who are more negative about the future, I think this problem of designing the costs so that they are aligned with human values is eminently doable. And it's not the case that if you do it wrong once, the AI systems are going to escape control and take over the world. We will have a lot of ways to engineer those things pretty well before we deploy them.

>> ZHU Jun: I agree with that. So another technical question, also related to this one, is that I notice in your design of the modules in JEPA, hierarchical JEPA, almost all modules are differentiable, right? So maybe you can use back-propagation to train them. But you know, there is also another field, symbolic logic, which represents things discretely, and which in some form could formulate the constraints we'd like to put, for example, in the intrinsic cost module. So do you have some special consideration for bridging these two, or do you simply ignore the symbolic logic field? Yeah.

>> Yann LeCun: Right. So I think, yeah, there's a whole subfield of neuro-symbolic architectures that try to combine trainable neural nets together with symbolic manipulation and things like this. I'm very skeptical of those approaches, because of the fact that symbolic manipulation is non-differentiable. So it's essentially incompatible with deep learning and with gradient-based learning, and certainly with gradient-based inference of the type that I described. So I think we should make every effort to use differentiable modules everywhere, including the cost functions. Now, there are probably a certain number of situations where the costs that we could implement are not differentiable. And for this, the optimization procedure that performs the inference may have to use combinatorial optimization as opposed to gradient-based optimization. But I think that should be the last resort, because zero-order, gradient-free optimization is much less efficient than gradient-based optimization. So, if you can have a differentiable approximation of your cost function, you should use that as much as possible. To some extent we do this already. When we train a classifier, the cost function we want to minimize is the number of errors, right? But that is non-differentiable, so what we use is a proxy for that cost which is differentiable, something like the cross-entropy of the output of the system with the desired output distribution, or something like squared error or hinge loss, right? Those are all basically upper bounds on the binary loss that is not differentiable, that we cannot optimize easily. So we're very familiar with this idea that we have to use cost functions that are differentiable approximations of the cost that we actually want to minimize.

>> ZHU Jun: Okay. Yeah. So that's my next question. I was inspired by our next speaker, Professor Tegmark, who will give an onsite talk after you. Actually, we heard that you will attend a debate on the status and the future of AGI. As most of us would likely not be able to attend that, can you share some key points with us? We want to hear some of your insights about that. Yeah.

>> Yann LeCun: Okay. So this is going to be a debate with four participants. And the debate is going to be around the question of whether AI systems are an existential risk to humanity. So Max, together with Yoshua Bengio, will be on the side of "yes, it's possible that powerful AI systems may be an existential risk to humanity." And then on the side of "no" will be myself and Melanie Mitchell from the Santa Fe Institute. And our argument is not going to be that there's no risk. Our argument is going to be that those risks, while they exist, are easily mitigated or suppressed through careful engineering. My argument for this is that, you know, today, asking whether we can make superintelligent systems safe for humanity cannot be answered, because we don't have a design for a superintelligent system. So until you have a basic design for something, you cannot make it safe. It's as if, in 1930, you asked an aeronautical engineer, can you make turbojets safe and reliable? And the engineer would say, "What is a turbojet?" Because turbojets had not been invented in 1930. So we're a little bit in the same situation. It's a little premature to claim that we cannot make those systems safe because we haven't invented them yet. Once we invent them, and maybe they'll be similar to the blueprint that I proposed, then it will be worth discussing, "How can we make them safe?", and in my opinion, it will be by designing those objectives that the system minimizes at inference time. That's the way to make the system safe. Obviously, if you imagine that future superintelligent AI systems are going to be autoregressive LLMs, then of course we should be scared, because those systems are not controllable. They may escape our control and spew nonsense. But systems of the type that I describe, I think, can be made safe. And I'm pretty sure they will. It will require careful engineering. It will not be easy, the same way it has not been easy over the last seven decades to make turbojets reliable. Turbojets are now incredibly reliable. You can cross large oceans with a two-engine airplane now, and basically with incredible safety. And that requires careful engineering. And it's really difficult. Most of us don't know how turbojets are designed to be safe. So it's not crazy that figuring out how to make a superintelligent AI system safe is similarly hard to imagine right now.

>> ZHU Jun: All right. Thank you for the insight, for the feedback. As an engineer, I'm also... thank you again. Thanks a lot.

>> Yann LeCun: Thank you so much.

(applause)

>> MC: Again, thank you very much, Yann, and thank you, Zhu Jun, for the wonderful dialogue. And Yann, I'm not sure you can go back to sleep for a while. Thank you so much for getting up so early and delivering this wonderful, insightful talk. Thank you. [Chinese].

Max Tegmark. It is my great honor to introduce our next keynote speaker Professor Max Tegmark. Professor.

>> [music]

(applause)

>> MC: Allow me to introduce a little bit more. Professor Tegmark is a renowned physicist and cosmologist. So he’s not just an AI scientist, but he’s a physicist. With that background he has made quite some remarkable contributions to the understanding of our universe, and definitely lately on artificial intelligence. In the field of AI, Professor Tegmark is a passionate advocate for the responsible development and deployment of AI technologies. He is the co-founder of the Future of Life Institute. I think many of us have read his very popular book, Life 3.0, which explores the potential impact of AI on humanity and the future of life itself. Most recently, Professor Tegmark initiated a petition to pause AI research for six months that was co-signed by 1,000 scientists in the world, including a few sitting in this room. This petition has woken up many governments and agencies and institutions to the potential risk that the rapid development of AGI could bring mankind. So, today, Professor Tegmark will give a talk entitled “Keep AI under control.” Max, the floor is yours.

(applause)

>> MAX TEGMARK: [Chinese].

(applause)

[Chinese]

So I’m going to switch to English. In Yann LeCun’s beautiful presentation that you just saw he also demonstrated that we must not trust our computers too much. And I want to talk today about how concretely we can make our computers more worthy of trust. But first I want to say I’m so happy to be back in China. Many of my best students over the years at MIT where I work are Chinese (中国人). And if you are interested in maybe coming to study at MIT and work with me, please apply.

So how can we keep artificial intelligence under control? I love artificial intelligence. I’m so excited about all of the great opportunities that it gives us. And we’ve already heard this morning about so many wonderful things that we can do with artificial intelligence to help [screen froze] not just China but to help the world. For example, basically all countries on Earth agree on the United Nations sustainable development goals. We published a recent paper showing that artificial intelligence can make us reach almost all of these goals faster if we do it right and keep it under control. And I think we will be able to even go quite far beyond these sustainable development goals, like ultimately curing all diseases and helping create a future where everybody on Earth can live healthy, wealthy, and inspiring lives - IF we keep the technology under control. As you know, there has been a lot of concern recently, though, about whether we actually will be able to keep it under control. For example, here is AI godfather Geoffrey Hinton just a few weeks ago.


(video clip).

>> Are we close to computers coming up with their own ideas for improving themselves?

>> Um, yes, we might be

>> And then it could just go fast.

>> That’s an issue, right. We have to think about how to keep it under control.

>> Yeah, can we?

>> We don’t know. We haven’t been there yet. But we can try.

>> Okay. That seems kind of concerning.

>> Yes.

>> MAX TEGMARK: And here is Sam Altman, the CEO of OpenAI, which gave us ChatGPT and GPT-4.

>> Sam Altman: In the bad case, and I think this is like important to say, it’s lights out for all of us.

>> MAX TEGMARK: In the bad case, he thinks, it's lights out for all of us. All humans are dead. And these are not random people who know nothing about AI. These are some of the very leaders of the field, of course. And just on May 30th, it became world news that an amazing list of AI researchers from around the world signed a statement saying that AI poses a risk of extinction. And you can see here that, among the Turing award winners, Yann LeCun won the Turing Award with Geoff Hinton and Yoshua Bengio, who both think AI could wipe out humanity. And Demis Hassabis, CEO of DeepMind, Sam Altman, CEO of OpenAI, and many prominent Chinese researchers have also signed this. So why is it that so many people have this concern? And why are they talking about it now? Actually the concern itself is very old – as old as the idea of artificial intelligence. Alan Turing worried about things like this. Norbert Wiener worried about things like this. Irving J. Good did, even long before I was born. Nine years ago, Stuart Russell, Stephen Hawking, Frank Wilczek, and I worried about this. So the concern is old. It's kind of obvious, really, that if the most intelligent species on Earth has to deal with another entity that's more intelligent, it might lose control. This is what happened with the woolly mammoth. This is what happened with the Neanderthals. We humans have already caused the extinction of many other species that were less intelligent than us, because they lost control. Our goals were different from their goals, and we needed their resources for our projects. They couldn't stop us because we were smarter. So the idea that we need to be careful to keep control of artificial intelligence, if it becomes smarter than us, is very old. What is not old – what is new – is the urgency. Last time I was in China, three years ago, I showed this picture. I told you about this landscape of tasks where the elevation represents how difficult each task is for a machine, and the sea level represents what machines could do then. In the last three years, of course, the sea level has been rising a lot. Look how much has changed in just three years: Winograd schemas; now computers can do quite excellent programming; they can even start doing quite interesting art; many people are starting to write books with large language models. So the question of when we will reach artificial general intelligence, where AI can do all of these things, has really become much more urgent. And that raises this other question: if we get machines that can do all of our jobs – and that means our jobs here in the audience too, AI development – then in the future, after AGI, we can replace ourselves by machines that can develop AI probably much faster. So instead of having to wait one year for the next version of things, maybe we have to wait one week, or one day, or one hour. And this raises the possibility of recursive self-improvement, where we soon get AI that's not just a little bit smarter than us, but way smarter than us – as much smarter than humans as we are smarter than a cockroach, limited maybe only, ultimately, by the laws of physics. The laws of physics do limit intelligence. You can't move faster than the speed of light. You can't put too much compute in a small space or it will turn into a black hole. But there's recent work on this showing that the limit is only about a million million million times above today's state of the art. So superintelligence could be very far beyond us.

And what's basically happened is this: last time I was in China, there were still people who believed that we would not get artificial general intelligence for hundreds of years. Most AI researchers then still thought it would take decades – 30 or 50 years. But now Microsoft has a paper out saying it's already starting to happen; they're claiming sparks of artificial general intelligence already in GPT-4. And many of my AI colleagues are wondering whether we will get AGI this year, next year, or two years from now. This is why people are talking about this now, even though the idea that something more intelligent than us could be dangerous is old. Here's what Yoshua Bengio recently said.

>> Yoshua Bengio: We have basically now reached the point where our AI systems can fool humans, meaning they can pass the Turing test.

>> MAX TEGMARK: In other words, the Turing test – the idea that AI could master language – is something that, long ago, many researchers thought would only be passed very close to the time when machines could do basically everything humans can. And he's arguing that large language models already can do this. This might sound depressing, and many people are worrying, but I want to point out there's something very good about this also. I'm actually quite hopeful that now that people take this seriously, it will lead to better relationships between East and West. Because before this, when people only took seriously the idea that AI can give you power, everybody wanted to compete and go as fast as possible – get it before the other one. But once it starts to become accepted that this could end civilization for everybody on Earth, it completely changes the incentives for the superpowers, for everybody. It doesn't matter if you're American or Chinese if you're extinct. And instead of thinking of it as an arms race, you can think of it as a race where it doesn't matter who gets uncontrollable superintelligence first, because everybody dies at that point.

Even though it might sound scary, I think for the first time we actually now have a situation where both East and West have the same incentive: continue building AI and get all the benefits, but don't go so fast that we lose control. This is something we can all work together on. It's a little bit like climate change, where the problems don't respect borders and where it's very natural to have partnership rather than competition. So this, I think, is actually hopeful.

So how?

For the rest of the talk I want to discuss how we can keep AI under control, so we can build all these wonderful things and make sure they work for us, not against us. There are two main problems we need to solve. The first is alignment: how we make an individual AI actually do what its owner wants it to do. Yann LeCun's laptop was not so aligned, because in the middle of his talk it crashed, right? The second problem is how we can also align all the organizations, the companies, the people, et cetera, around the world so that their incentives are to use AI for good things and not for bad things. Because if we only solve the first problem and make very powerful AI widely available, we can have the situation where somebody – some terrorist, or anybody else who really wants to take over the world and do things that you don't want – can use the technology to do this. So we'll talk a little bit about that also at the end. This is the realm of policy.

But mostly I want to talk about the technical problem of how you align the computer and the AI to do what you want it to do. There are many, many aspects to this that we'll hear about from Stuart Russell and many speakers here. I want to focus on one specific aspect: how you can better understand what's happening inside of these systems – the so-called field of mechanistic interpretability, a very strong form of interpretability. And after you understand the system, you change it so that you can trust it better. This is a relatively young field and a very small field. I recently organized the largest conference in the world on this topic at MIT. Look how few people there were at my conference compared to here, right? But I hope to convince you that this is a very promising and very exciting subject. And I would be very excited to collaborate with many of you on research projects in this.

So mechanistic interpretability: you train a neural network – an inscrutable black box that you don't understand – to do something intelligent, and then you try to figure out how exactly it is doing it. Why would you do this? Why do you want this? There are three different levels of ambition you could have. The lowest level of ambition is just to diagnose how trustworthy it is – to understand how much or how little you should trust it. For example, if you're driving a car, even if you don't understand how it works, you would at least like to know if you should trust the brakes or not – if you should trust that when you press this thing, it's going to slow down.

The next level of ambition is to understand it better, so that you can not just measure how trustworthy it is but make it more trustworthy. And the ultimate level of ambition – and I'm very ambitious, so this is what I'm hoping for – is that we can actually extract all the knowledge that the machine learning in the black box has discovered, take it out, and reimplement it in some other system where you can actually prove it's going to do what you want. Stuart Russell might talk about proof-carrying code. It's a little bit like the opposite of a virus checker: if a virus checker on your laptop can prove that the code you're about to run is malicious, it will refuse to run it; otherwise, it will run it. Proof-carrying code is where the hardware asks the software to prove to it that it's going to do the right thing, and only if it can prove that will the hardware run it. And the vision here is to combine this with mechanistic interpretability: you first discover the knowledge automatically by training a machine learning system, and then you use other AI techniques to extract the knowledge and transform it into some form of proof-carrying code.

I'll talk more about the details. But first let me give you some examples to give you a flavor of how fast this field of mechanistic interpretability is progressing. For example, in a recent paper from my MIT group – we've been working on this in my lab for about seven years – we found that the knowledge in large language models is, through an interesting approximation, made of many little "quanta," as we call them: pieces of knowledge you can learn and study separately. We took gradients for individual tokens, took the dot products of all the gradients, normalized them, and made a similarity matrix. And then we did spectral clustering. As a result, we were able to automatically identify these quanta of knowledge that the large language model was learning. And you can see here a bunch of examples. Many of them are just facts: after a while the neural network learned some fact – it was able to predict that this person's name was so-and-so, or that this person worked at this company. And there are many quanta like that, where there's just one thing you have to learn. A small neural network is never able to learn it, so you have a high loss. But when you have a sufficiently large number of parameters, it becomes able to learn it and then the loss goes to zero. We call these monogenic tokens – it's kind of like a gene that tells you the color of your eyes, where just one thing controls the trait you're interested in. There are other skills that neural networks have which require many quanta, and those tend to be learned more gradually as you increase scale.
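To make that recipe concrete, here is a minimal sketch in Python of the gradient-clustering idea just described. The model name, the toy prompts, and the use of scikit-learn's SpectralClustering are illustrative assumptions for this sketch, not the actual setup from the paper.

    # Sketch: cluster training examples by the direction of their loss gradients.
    # Examples whose gradients point the same way plausibly exercise the same "quantum".
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from sklearn.cluster import SpectralClustering

    model_name = "distilgpt2"  # assumed small stand-in model, not the one used in the paper
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    texts = [  # toy prompts: three "fact" examples and three "numbered list" examples
        "The Eiffel Tower is located in the city of",
        "Mount Fuji is located in the country of",
        "The Statue of Liberty is located in the city of",
        "1. apples 2. oranges 3.",
        "a) red b) green c)",
        "i. first ii. second iii.",
    ]

    def normalized_gradient(text):
        """Flattened, unit-norm gradient of the language-modeling loss on one prompt."""
        ids = tok(text, return_tensors="pt").input_ids
        model.zero_grad()
        loss = model(ids, labels=ids).loss
        loss.backward()
        g = torch.cat([p.grad.flatten() for p in model.parameters() if p.grad is not None])
        return g / (g.norm() + 1e-8)

    grads = torch.stack([normalized_gradient(t) for t in texts])
    similarity = (grads @ grads.T).clamp(min=0).numpy()  # cosine similarities, clipped to >= 0

    labels = SpectralClustering(n_clusters=2, affinity="precomputed").fit_predict(similarity)
    print(labels)  # prompts that rely on the same kind of knowledge should share a cluster

On a real model one would do this per token over a large corpus and restrict the gradients to a subset of parameters for memory reasons, but the shape of the computation is the same.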

Here are some other examples of these quanta of knowledge that you automatically discover. It's not just facts. If you look at the examples on the left side, the large language model reads the text and the red or pink token is always the predicted next token. On all the examples on the left, what is it that the large language model has learned? It has learned about numbered lists – and it's the same insight whether the list is numbered with numbers, letters, or hexadecimals. On the right side it has learned something very different: people like to write lines that have roughly the same length, so it starts to predict the pink line break – hitting return – when the line gets that long. There's a very wide variety of quanta we find. And interestingly, we find that the large language models we look at always tend to learn the quanta in roughly the same order: first the most useful things, then less useful things. It's kind of like a child: first you learn to crawl, then you learn to walk, in the same order. And the more data you have, the more parameters you have, the more compute you have, the more quanta you learn in this sequence. From this we're actually able to make an interesting prediction for why we get these scaling laws. It's well known that the loss drops like a power law with respect to compute, with respect to data set size, and with respect to the number of parameters – and all three can be explained at the same time if you just assume that the frequency distribution of those quanta of skill is also a power law, like Zipf's law. And we have a formula for how these different scaling laws relate to each other.
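As a toy illustration of that argument, here is how a Zipf-like distribution over quanta turns into a power-law loss curve. The exponent, the number of quanta, and the "learn the most frequent quanta first" rule are all assumptions made up for this sketch, not the paper's actual derivation.

    # Toy model: quanta are used with Zipf-like frequencies, and a model of capacity C
    # has learned the C most frequent ones. The leftover loss then falls as ~ C**(-alpha).
    import numpy as np

    alpha = 0.5                       # assumed Zipf exponent, purely illustrative
    K = 10**6                         # total number of quanta in this toy world
    ranks = np.arange(1, K + 1)
    freq = ranks ** (-(alpha + 1.0))  # how often quantum k is needed
    freq /= freq.sum()

    def expected_loss(capacity):
        """Loss contributed by all the quanta the model has not learned yet."""
        return freq[capacity:].sum()

    capacities = np.logspace(1, 5, 9).astype(int)
    losses = np.array([expected_loss(c) for c in capacities])

    slope = np.polyfit(np.log(capacities), np.log(losses), 1)[0]
    print(f"measured scaling exponent ~ {slope:.2f}, expected ~ {-alpha:.2f}")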

A different interesting part of mechanistic interpretability is to extract how the knowledge is represented inside the model. For example, Professor David Bau's group in Boston, at Northeastern, recently discovered how GPT stores the fact that the Eiffel Tower is in Paris. They figured out how it was represented, and they were able to edit those weights in the neural network to tell it the tower is in Rome instead, not Paris. So when they asked, "Where's the Eiffel Tower?", it started explaining things about Rome. And they also discovered how the fact is stored: it is, in a way, linearly encoded. Basically you have a matrix – an MLP inside the transformer – and when you multiply it by the vector that encodes the Eiffel Tower, out comes the vector encoding Rome with a large coefficient.
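The "facts live in a matrix" picture can be sketched with a toy rank-one edit. The vectors below are random stand-ins, not anything extracted from GPT, and the real editing method involves considerably more care than this.

    # Toy sketch: W maps a key vector (the subject, e.g. "Eiffel Tower") to a value vector
    # (the object, e.g. "Paris"). A rank-one update re-points that one key to a new value
    # ("Rome") while barely moving what W does to unrelated keys.
    import numpy as np

    rng = np.random.default_rng(0)
    d = 64
    W = rng.normal(size=(d, d)) / np.sqrt(d)  # stand-in for one MLP projection matrix

    k_eiffel = rng.normal(size=d)  # key the model uses for "Eiffel Tower" (random stand-in)
    v_rome = rng.normal(size=d)    # value that would decode to "Rome" (random stand-in)

    delta = np.outer(v_rome - W @ k_eiffel, k_eiffel) / (k_eiffel @ k_eiffel)
    W_edited = W + delta           # rank-one edit

    print(np.allclose(W_edited @ k_eiffel, v_rome))  # True: the fact now reads "Rome"
    k_other = rng.normal(size=d)                     # some unrelated key
    print(np.linalg.norm((W_edited - W) @ k_other))  # comparatively small side effect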

There have been many, many nice papers in the last half year by many groups discovering how different kinds of knowledge are stored. We've done a lot of work at MIT looking at how algorithmic information gets represented. Suppose you have some kind of abstract multiplication table like this – maybe ordinary multiplication, maybe some other group operation – and you simply want to predict what happens when you combine two elements. You want to predict, for example, all the red examples, and you have a test set of held-back green examples, but with a much bigger table than just 5x5. How does that work? Here's an example we did where the operation is addition modulo 59. You take basically two images – pixels or something like that – and you send them in. And you know that any group operation can always be represented as matrix multiplication. We train an encoding into matrices, and then we train a decoder to decode it. Then we look inside: how is this represented in the high-dimensional internal space as it's being trained? And all of a sudden – eureka! – the neural network had an insight. All the points fall on a 2-dimensional plane in embedding space, as a circle. And if you think about modular addition, it's just like a clock with the numbers 1, 2, 3, up to 12, where 11:00 plus two hours is 1:00 – it starts over. It discovered this, and when it discovered this it became able to generalize and solve examples it had not seen before, because the representation it came up with automatically embeds the fact that this operation is commutative and associative, and it can exploit this to fill out the rest of the table. If you just do ordinary addition, without the modular part, you find that it represents all the entities on a line, not a circle. If it overfits, you see the data represented in some random, stupid way: it can always repeat back the training data, but it can never generalize. Grokking is the case where it takes very long to generalize. And if you train in a situation where it fails to learn anything and is always confused, the embeddings don't even change.
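A minimal sketch of that kind of experiment follows, with illustrative hyperparameters that are my own choices rather than the ones used in the actual work. With enough training, the top two principal components of the learned embeddings tend to arrange the 59 residues on a ring.

    # Learn embeddings for the residues mod p, train a small head to predict (a + b) mod p,
    # then look at the learned embeddings to see whether the "clock" structure appears.
    import torch
    import torch.nn as nn

    p, dim = 59, 32
    emb = nn.Embedding(p, dim)
    head = nn.Sequential(nn.Linear(2 * dim, 128), nn.ReLU(), nn.Linear(128, p))
    opt = torch.optim.Adam(list(emb.parameters()) + list(head.parameters()), lr=1e-3)

    a, b = torch.meshgrid(torch.arange(p), torch.arange(p), indexing="ij")
    a, b = a.flatten(), b.flatten()
    target = (a + b) % p

    for step in range(3000):
        logits = head(torch.cat([emb(a), emb(b)], dim=-1))
        loss = nn.functional.cross_entropy(logits, target)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Project the 59 embedding vectors onto their top two principal components.
    E = emb.weight.detach()
    E = E - E.mean(dim=0)
    _, _, Vh = torch.linalg.svd(E, full_matrices=False)
    coords = E @ Vh[:2].T  # if the circle representation was found, these points lie on a ring
    print(coords[:5])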

We can also use machine learning to try to force the solutions to become simpler and easier for us to understand. Normally, there are many, many different ways in which a neural network can do the same task. We want the simplest one. But it will normally learn one of the more complicated ones, because there are many more complicated solutions than simple ones.

So for example, if you look at the human brain, it is very modular: we do vision here, and here, and here, et cetera, so that the neurons that need to communicate a lot are close together. That's because in the brain it's very expensive to send signals over a long distance: it takes up more space in the head, uses more energy, and causes time delays. But in a fully connected artificial neural network, you can permute the order of all the neurons in a layer without changing the loss function at all, and it doesn't change any of the usual regularizers either. So we don't encourage our artificial neural networks to be modular. So we made a test; we had an idea. What if we add a loss function – a regularizer – a little bit like an L1 norm, where we penalize weights to make them sparse, but we also multiply this penalty by the distance between neurons? So we embed each neuron at a point in real space – two dimensions, three dimensions, whatever – and we penalize long connections, so they cost more. And look what happens: we train the neural network on the example we just looked at, and after a while it gets not just much more sparse, but you start to see structure. It has modules: this one, this one, this one. These are a bit like the quanta we talked about earlier. And interestingly, we can look at each of them and see what they do. This one discovers the circle trick we talked about. This one discovers a different way of doing it. And then they combine together at the top in a kind of error correction – a little bit like majority voting with three votes – to become even more accurate.
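Here is a minimal sketch of such a distance-weighted penalty. The layer sizes, the neuron coordinates, and the penalty strength are chosen purely for illustration; the actual training recipe described in the talk has more ingredients than this.

    # Give every neuron a fixed 2-D coordinate and penalize each weight by |w| times the
    # distance between the neurons it connects, so long-range wiring becomes expensive.
    import torch
    import torch.nn as nn

    def neuron_coords(n, layer_index):
        """Place the n neurons of a layer evenly on a line at height layer_index."""
        x = torch.linspace(-1.0, 1.0, n)
        y = torch.full((n,), float(layer_index))
        return torch.stack([x, y], dim=-1)  # shape (n, 2)

    sizes = [4, 32, 32, 2]
    layers = nn.ModuleList(nn.Linear(sizes[i], sizes[i + 1]) for i in range(len(sizes) - 1))
    coords = [neuron_coords(n, i) for i, n in enumerate(sizes)]

    def locality_penalty(lam=1e-3):
        total = 0.0
        for i, layer in enumerate(layers):
            dist = torch.cdist(coords[i + 1], coords[i])  # same shape as layer.weight
            total = total + (layer.weight.abs() * dist).sum()
        return lam * total

    def forward(x):
        for i, layer in enumerate(layers):
            x = layer(x)
            if i < len(layers) - 1:
                x = torch.relu(x)
        return x

    x, y = torch.randn(16, 4), torch.randn(16, 2)
    loss = nn.functional.mse_loss(forward(x), y) + locality_penalty()
    loss.backward()  # during training this pushes the network toward local, modular wiring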

And this worked beautifully for many examples we tried. Here, instead of modular addition, we tried permutations – the permutation group of four elements, which has 24 elements. And it made a very simple neural network. It's well known that you can think of this group as permuting the four corners of a tetrahedron, so there is a 9-dimensional representation of this group, corresponding to the 3x3 matrices that rotate the tetrahedron onto itself and flip it. Sure enough, the network discovered you only need 9 neurons to get perfect accuracy, and moreover it rediscovers the group representation theory we study in mathematics class. For example, it discovered that this neuron here is either minus one or one depending on whether the permutation is even or odd – whether it reflects the tetrahedron or not. The same holds when we look at formulas: when we train a neural network to predict functions, again penalizing long connections, it simplifies to the point where it's much easier for humans to interpret. The equation on the left, for example, outputs two different things: the first only depends on X1, X2, and X4; the second only depends on X1 and X3. And even though in the beginning it's just a big mess, at the end you can see it actually learned to simplify into two completely independent neural networks that don't talk to each other, which are much easier to understand separately. In the next one there's feature sharing: when you look at it you can see, "oh, first it's learning to square the numbers," and then it's combining them. For the third one, you can see everything combines into one neuron – that's the stuff under the square root – and then it learns to calculate the square root. The last step is plotting the activation of that one neuron on the X-axis against the output, and it's exactly the square root function. This works for larger systems also; this is what happens with MNIST. These are small examples, but what I want you to take away from this is: we humans don't have to spend so much time figuring out how neural networks work if we can train them to simplify themselves first. It gets a lot easier.

In this case, what we saw was that neural networks are often unnecessarily complicated because of this permutation symmetry: you can do any permutation of the neurons in a layer without increasing your loss. But by removing that symmetry, you get simplicity; you get modularity.

Another kind of unnecessary complication in many neural networks is continuous. If you have a feed-forward neural network that does something really intelligent and accurate, you can insert any continuous, invertible nonlinear transformation of your data with some layers, and then have later layers remove that transformation, and it will still perform the same – but it may be more complicated. So with this awesome grad student from Beida (Peking University), Ziming Liu, we were able to get some good results on this, where the network can simplify itself away from all these different continuous transformations I mentioned and discover, for example, hidden symmetries in the problem. We applied it to a very famous physics problem: black holes. When Karl Schwarzschild discovered the black hole solution in 1915, he thought that you would die when you reached the event horizon. Because if you look at the formula here – this is called the metric tensor, which describes the shape of spacetime – you divide by R - 2M. So when R = 2M, you divide by zero. That sounds terrible, and you die. No.

17 years later, Gullstrand and Painlevé discovered that you can just continuously transform the coordinate system. It was a stupid coordinate system that Schwarzschild had found; there's a better one, in which you see that nothing dangerous happens at the horizon. There's no division by R - 2M; you only have a singularity at R = 0. It took humans 17 years to discover this. Our automated AI tool discovered in one hour that there was a way of dramatically simplifying this famous physics problem.
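For readers who want the formulas behind this story, these are the standard textbook expressions (in units where G = c = 1, with dΩ² the metric on the unit sphere); nothing below comes from the AI tool itself. The Schwarzschild form of the metric is

    $$ ds^2 = -\left(1 - \frac{2M}{r}\right) dt^2 + \left(1 - \frac{2M}{r}\right)^{-1} dr^2 + r^2\, d\Omega^2 , $$

which divides by r - 2M and therefore misbehaves at the horizon r = 2M, while the Gullstrand-Painlevé form of the same geometry,

    $$ ds^2 = -\left(1 - \frac{2M}{r}\right) dT^2 + 2\sqrt{\frac{2M}{r}}\, dT\, dr + dr^2 + r^2\, d\Omega^2 , $$

has no such division; the only true singularity is at r = 0.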

Since I have the pleasure of speaking in China, I have to say a little bit about AlphaGo, AlphaZero, et cetera, which of course made lots of news when we were told they were better than humans at playing Go and chess. Here also there has been some very interesting work on mechanistic interpretability. A team at Google DeepMind took millions and millions of game positions and let Stockfish – the chess program written by humans – calculate [screen frozen]

wrote a very important paper showing that we should not trust AI too much unless we can prove it. Because you were all told that you can never defeat AI at Go, right? And you probably believed it, because even Ke Jie could not defeat it, right? But it turns out that was wrong. In this paper they discovered they can do an adversarial attack against this Go program: by placing a little trap and tricking the computer into surrounding your stones while you start to surround the computer at the same time as it's surrounding you. The computer does not notice this if the loop is big enough, and then at the end – poof – you just win. And one of the people on the team was able to beat the world's best Go program more than 50% of the time without using any computer. So don't trust AI systems that are supposed to be very powerful unless you can prove it – which is one more motivation for mechanistic interpretability. Yann LeCun mentioned in his talk the idea that maybe large language models lack a world model. I hear many people say that, but I think that's not true, and here's a really beautiful paper showing this. This is a transformer – a large language model – that's been trained to play a game much simpler than Go: the game of Othello. They gave it a bunch of games written like this – just a sequence of moves – and trained it to predict the next move, okay? They did not tell it anything about the rules of the game, or that it's played on a board, or anything about the world. But with mechanistic interpretability they nonetheless discovered that inside this transformer it had built a model of the world – an 8x8, 2-dimensional world: the board with pieces on it. And it did this simply because it turned out that having a good world model made it easier to do the task it was trained on. And I think you see the same thing when you use Baidu's Ernie Bot or GPT-4 or anything like that: it's clearly able to model different people a little bit – if you start writing in a certain style, it will try to continue in that style. So large language models can build models of what's outside of them.
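The probing step behind the Othello result can be sketched as follows. The activations and board labels below are random stand-ins for the real Othello-GPT data, so this only shows the shape of the analysis, not the result itself.

    # Fit one linear probe per board square on the transformer's hidden activations.
    # With real activations, high held-out accuracy is the evidence for an internal 8x8
    # world model; with the random stand-ins used here, accuracy stays near chance.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    n_positions, d_model, n_squares = 2000, 256, 64
    acts = np.random.randn(n_positions, d_model)                    # hidden states (stand-in)
    board = np.random.randint(0, 3, size=(n_positions, n_squares))  # 0 empty, 1 mine, 2 theirs

    X_tr, X_te, Y_tr, Y_te = train_test_split(acts, board, test_size=0.25, random_state=0)

    scores = []
    for sq in range(n_squares):
        probe = LogisticRegression(max_iter=500).fit(X_tr, Y_tr[:, sq])
        scores.append(probe.score(X_te, Y_te[:, sq]))

    print(f"mean held-out probe accuracy: {np.mean(scores):.2f}")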

One last example that I want to highlight about the progress in mechanistic interpretability has to do with one of the ultimate ways of understanding what a neural network does: replacing the whole thing by a formula – a symbolic formula. This is called symbolic regression. If I give you a spreadsheet – a table of numbers – and ask you to predict the last column from the other columns as a symbolic formula, that's symbolic regression. If it's a linear function, that's linear regression – super easy, right? You just invert a little matrix. But if it's an arbitrary nonlinear function, this problem is probably NP-hard – exponentially hard – because the number of possible formulas grows exponentially with their length and the number of symbols. We worked a lot on this in my group at MIT and found that the vast majority of the formulas we actually care about in science and engineering are modular, meaning that a function of many variables can be written as a combination of functions of fewer variables. And we discovered that if you first train a black-box neural network to fit the function, then just by studying the gradients of your black box and looking for certain algebraic properties, you can uncover this modularity – even quite complicated modularity – without knowing what the modules are. And that lets you break the complicated problem apart into many, many simpler ones that you can then solve.
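One concrete gradient test of this kind, shown here only for the special case of additive separability and with the true function standing in for the fitted network, checks whether the mixed partial derivatives between two groups of variables vanish.

    # If f(x1, x2, x3) = g(x1, x2) + h(x3), then d^2 f / dx1 dx3 = 0 everywhere.
    # In the real pipeline the same kind of test is run on a neural network fitted to data.
    import torch

    def f(x):
        x1, x2, x3 = x[0], x[1], x[2]
        return torch.sin(x1 * x2) + x3 ** 2  # separable into the modules {x1, x2} and {x3}

    def mixed_partial(fn, x, i, j):
        """Estimate d^2 fn / dx_i dx_j at the point x using autograd."""
        x = x.clone().requires_grad_(True)
        (grad,) = torch.autograd.grad(fn(x), x, create_graph=True)
        (hess_row,) = torch.autograd.grad(grad[i], x)
        return hess_row[j].item()

    x0 = torch.tensor([0.3, -1.2, 0.7])
    print(mixed_partial(f, x0, 0, 2))  # ~0: x1 and x3 belong to different modules
    print(mixed_partial(f, x0, 0, 1))  # nonzero: x1 and x2 interact inside one module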

So for example, here I gave it a data table to see if it could fit the formula for kinetic energy according to Einstein's relativity theory. I'm showing on the X-axis how complicated the formula is and on the Y-axis how bad an approximation it is. So we're most interested in the convex corners, which are simple and accurate. It discovered the correct formula, but it also discovered the approximation you probably all learned in high school or earlier – Newton's approximation: mass times speed squared over two. And we tried it on a large number of equations from all sorts of physics textbooks, and we actually got state-of-the-art performance on this task of symbolic regression, simply by leveraging the incredible power that a black-box neural network has to learn. But we didn't stop with that. That was step one. Once you have the neural network that has discovered the knowledge in the data, you do these other steps to extract the knowledge.
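For reference, the two formulas in those convex corners are, in standard notation with mass m, speed v, and the speed of light c,

    $$ E_{\mathrm{kin}} = mc^2 \left( \frac{1}{\sqrt{1 - v^2/c^2}} - 1 \right) \;\approx\; \frac{1}{2} m v^2 \quad \text{for } v \ll c , $$

where the right-hand side is the Newtonian approximation mentioned above.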

So in summary, I hope I've convinced you that mechanistic interpretability is a very fun field, a very promising field. And let me also try to convince you that it will continue to be fun and promising. It's a lot like neuroscience, where you have a system – your brain – which can do very intelligent things but which is a black box that we don't understand, right? Except this is much easier than neuroscience. When you do normal neuroscience, here at Tsinghua University or Beida, you can only measure maybe a thousand neurons at a time out of tens of billions, you have to get special permission from an ethics board to do experiments on humans, and there's noise. When you do mechanistic interpretability, you can measure every neuron and every weight, all the time. You have no ethics board. You can even build your own organisms and try things. And as a result, even though very few people have worked in this field, and only for a very short time, it has progressed much, much faster than traditional neuroscience. This gives me hope that if more of us join this field, we can make very rapid progress moving forward and go far beyond the first ambition level of just figuring out whether to trust things or not, and beyond making them a little bit better. I really think that the third and most ambitious option – something that's guaranteed trustworthy – will be possible.

Let me just say one more thing to persuade you that it's not hopeless to have something much more intelligent than us that we can prove is safe. First of all, we don't have to write the proof ourselves. We can delegate that to an AGI or a superintelligence, because it is much easier to check that a proof is correct than to discover the proof, right?

So if a machine gives you a very, very long proof, you can write your own proof-checking program in Python – one that is so short that you completely understand it – and it checks the whole proof: yep, that's right, I trust it, I can run it. And similarly, you might think, "oh, if it takes one graduate student one year to write one paper about something, this does not scale." But almost all the tools I showed you here in mechanistic interpretability are automated: you can make a machine do all the hard work. You can train the neural network to simplify itself first, and then you can use other automated tools to figure out what it has learned and how it is doing it, and come up with something that's just as intelligent but where you can actually guarantee safety. And this to me is very hopeful, because I will never trust something much more intelligent than us unless I can prove it. But if I can prove it, I will trust it – because no matter how smart it is, it cannot do the impossible.
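As a toy illustration of that asymmetry (the proof format here is invented for this sketch and has nothing to do with any real proof-carrying-code system), a checker only has to verify each step, never search for one.

    # Proofs here are chains of modus ponens over a fixed set of implications.
    # Checking a candidate proof is a single linear pass; finding one could require search.
    def check_proof(axioms, implications, proof, goal):
        known = set(axioms)
        for claim in proof:
            follows = claim in known or any((a, claim) in implications for a in known)
            if not follows:
                return False  # one unjustified step invalidates the whole proof
            known.add(claim)
        return goal in known

    axioms = {"A"}
    implications = {("A", "B"), ("B", "C")}
    print(check_proof(axioms, implications, ["A", "B", "C"], "C"))  # True
    print(check_proof(axioms, implications, ["A", "C"], "C"))       # False: C does not follow yet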

So this offers an opportunity to really keep control over systems that are much more intelligent than us: we use these techniques to make them prove their own trustworthiness, and then we only run the ones for which we actually have a proof.

To finish, let me just say a few words about the second part. If we have these machines that will obey the person who controls them, how do we make sure that people or companies don't do bad things with their own machines? This is of course a question of policy. So let's listen again to Yoshua Bengio.

>> Yoshua Bengio: That is both very exciting, because of all the benefits we can bring with socially positive applications of AI, but I'm also concerned that powerful tools can have negative uses, and that society is not ready to deal with that. And that's essentially why we are saying: let's slow down. Let's make sure that we develop better guardrails.

>> MAX TEGMARK: So this letter saying "let's pause," which was mentioned earlier – I want to be clear: it does not say we should pause AI. It does not say we should pause almost anything that we have heard about here so far in this conference. We should continue doing almost all the wonderful research that you're all doing. It just says that we should pause development of systems more powerful than GPT-4. So this would mainly be a pause for some Western companies right now. And the reason is that these are precisely the systems that can lead us the fastest towards losing control – super powerful systems that we don't understand well enough. The purpose of the pause is just to make artificial intelligence a bit more like biotech. In biotech, if you're a company, you cannot just say, "Hey, I have a new medicine I've discovered; I'm going to start selling it in every supermarket in Beijing tomorrow." No. First you have to persuade experts from the Chinese government or the American government that this is a safe drug, that its benefits outweigh the harms. There's a review process, and then you can do it. If we establish guardrails like this for those AI systems that have such great power that they really could cause harm, then we will become like biotech, right? There's a level playing field, not a race to the bottom. We go a little bit slower, but as a result everything we have is safe. Will this destroy innovation? Well, let me ask you this: did biotech innovation get destroyed by the fact that you have to get approval for your medicines? Of course not. We live in a golden age of biotech. In fact, it's precisely the fields with too-weak regulation of dangerous things that often destroy innovation – like civilian nuclear power. Investment in that basically collapsed in both the East and the West after Fukushima, for example, right? We would have been much better off if we had been careful and not had Fukushima or Chernobyl, so we could have all the benefits. Let's not make that mistake. Let's become more like biotech with our most powerful systems, not like Fukushima and Chernobyl. I want to end by saying I feel China really has a unique opportunity to contribute to the safe and wise governance of artificial intelligence. Both the upsides and the risks of very powerful AI, like superintelligence, should of course be shared by all humanity, so no country can tackle this alone. And China, I think, is really uniquely positioned, because China is a world-leading science and technology power, so it can help lead research not only on how to make AI powerful, but also on how to make it robust and trustworthy – which is what I talked about, right?

And since China's international influence is growing, together with its ability to inspire and its power to shape the global AI agenda, including AI regulation, China's voice is really, really important. I also think that an advantage [screen frozen] China has over the West is that China has one of the oldest surviving civilizations, right? In China, 200 years is a short time, not a long time like it is in America, and there's a stronger tradition in China of thinking long-term for that reason. There's also quite a tradition of being safety-minded. For example, I was so happy last week, I think, to see China's Global Security Initiative explicitly talking about preventing AI risk. And I really welcome this kind of Chinese wisdom and leadership, so that we can have a wonderful future with AI by keeping it under control. Thank you. [Chinese]

(applause)

>> MC: Thank you so much Max for the very thought provoking talk. We do want to keep you there for another 10 minutes. So let me invite my old friend and professor from Tsinghua University to have a dialogue with you. Please. [Chinese]

>> ZHANG YAQIN: Nice to see you again. That was certainly a very informative and enlightening talk. I want to first thank you for your leadership and championship in raising the awareness of AI risks and challenges

>> MAX TEGMARK: [Chinese].

>> ZHANG YAQIN: We talked about two types of wisdom. One is the wisdom to invent new things – new technologies, creativity, including AI. The other is the wisdom to control it, and to make sure that it is steered in a direction that benefits humans and society. And obviously, in your opinion, we are kind of behind – No. 2 is behind. I remember you organized a number of events, and of course the organization, the Future of Life Institute, and also the 23 principles from the Asilomar AI Conference. That was 2017.

>> MAX TEGMARK: Yes

>> ZHANG YAQIN: My question is: compared with back in 2017, do you think the gap between No. 1 and No. 2 has expanded or narrowed?

>> MAX TEGMARK: I think unfortunately it has expanded. Compared to five years ago, or even to 2020 when I was in China last time, what we've seen is that the power of AI has grown faster than many people expected. Most of my colleagues in 2020 did not expect that we would pass the Turing test in 2023. And the wisdom has grown a little more slowly than we would have liked. This is precisely why there's more interest now in pausing the most risky work: exactly to give the wisdom a chance to catch up.

>> ZHANG YAQIN: Your book Life 3.0 – I'm a big fan of your book, and so is my son. You talk about three stages of life. Life 1.0 is the most primitive form of life, which started 4 billion years ago. Life 2.0 is where we are: Homo sapiens – human, but with civilization, language, and all the systems we have right now. And Life 3.0 is AGI – machine intelligence, also called superintelligence. And that of course involves the thoughts, the culture, the software, and also the hardware. Where do you think we are? How close are we? And do you want to change your definition of Life 3.0 to a mixture of carbon and silicon?

>> MAX TEGMARK: We humans – we're already Life 2.1, 2.2, because we can put in a pacemaker, cochlear implants; maybe some of our children are so connected to their phones. But it's important to remember that, on one hand, Life 3.0 is a very inspiring idea, right? The idea that after spending the first 200,000 years or so of human existence really being disempowered – just trying not to get eaten, trying not to starve to death, having very little control over our future – we now, because of science and technology, become the captain of our own ship, the master of our own destiny. That's exciting, I think. And if we continue this trend with better AI, closer to Life 3.0, we can do so much more, instead of having silly fights on this little spinning ball in space about who should have this little territory or that. We have this incredible cosmos with so many resources where we can flourish for billions of years. All of those things are very hopeful and motivating to me. But it's important to remember that Life 3.0 is not one single thing, not just the next step. The space of possible Life 3.0 forms – the space of artificial minds – is vast, and we have a lot of choice in what kind of Life 3.0 we go toward. I personally love humanity – human beings. I would like to steer towards a future with Life 3.0 that has compassion, has emotions, has feelings, and cares about some of the things we care about – not some kind of zombie, totally unconscious machines where all they care about is something we consider very banal. Those are all very real possibilities. I would like us to be empowered, and not just sit here eating popcorn, waiting to see what comes, but ask ourselves, "What do we want to come?" and then steer in that direction.

>> ZHANG YAQIN: So it would be an ideal combination of human intelligence and machine intelligence for the best of the world

>> MAX TEGMARK: It’s all possible. But it’s not happening automatically. It will happen because the people here think hard about where we want to go and we work hard to steer in that direction.

>> ZHANG YAQIN: Max, you were born in Sweden and grew up in Europe; you live and teach in the US; you travel around the world and talk with a lot of Chinese colleagues. So let me ask you a question about the state of AI governance. If you rate the level of AI governance on a scale of 1 to 10, 10 being the best, how would you rate China, Europe, and the United States?

>> MAX TEGMARK: [Laughing]. I love this question, by the way. One of my favorite quotes, which I identify with, is from the Dutch renaissance scientist Christiaan Huygens, who said that "science is my religion and the world is my country." And that's why, if we look at the world itself, I think first of all that if you give a grade to all of humanity, it's pretty low. I think we can do much better. I think China has so far done the most in terms of regulating AI. Europe is No. 2 – we've worked a lot at the Future of Life Institute on educating policymakers in Europe, now on finishing the EU AI Act, for example. And America is in third place.

I think that if we can help Europe come up with really sensible regulations, it's likely America will follow. We saw this happen with GDPR, for example. Americans didn't really want to do much about privacy. But then Americans started to get much less spam after the European Union passed these laws, and then it started to get more popular in America also.

>> ZHANG YAQIN: Max, your career spans mathematics, physics, neuroscience, and of course artificial intelligence. In the future, obviously, we will depend increasingly on interdisciplinary capabilities and knowledge. We have a lot of graduate students and a lot of young people here. What is your advice to young people in terms of how to make their career choices...

>> MAX TEGMARK: My advice first of all…

>> ZHANG YAQIN: …in the age of AI.

>> MAX TEGMARK: Focus on the basics. Because the economy and the job market will change ever more rapidly, we're moving away from the paradigm where you study things for 12 or 20 years and then do the same thing for the rest of your life. It's not going to be like that. It's much more important that you're strong on the basics and very good at creative, open-minded thinking, so you can be nimble and go with the flow. Of course, pay attention to what's happening all across AI – not just in your own field. Because in the job market, the first thing that will happen is not people getting replaced by machines, but people who do not work with AI being replaced by people who do work with AI.

Can I add one more thing – I see the time is blinking there. But I just want to say something optimistic. I think Yann LeCun was teasing me a little bit; he called me a doomer. Look at me – I'm pretty happy and cheerful, and more optimistic than Yann LeCun about our ability to understand future AI systems. I think that's very hopeful. I think if we go full steam ahead and give ever more control away from humans to machines that we don't understand, it will end very badly. But we don't have to do that. I think if we work hard on mechanistic interpretability and many of the other technical topics you'll hear about today, we can actually make sure that all this ever more powerful intelligence works for us, and use it to create an even more inspiring future than science fiction writers used to dream of.

>> ZHANG YAQIN: Talking about Yann LeCun – I have huge respect for Yann. He has done a lot of pioneering work. But there is one thing I disagree with, just to cause a little bit of debate. He talked about System 1 and System 2, and said that AI has already passed System 1 and that we should really try to make System 2 work, in his taxonomy. Actually I have a different view.

I think System 2 is probably something AI can achieve in the foreseeable future. System 1 is actually the hardest, because that is the instinctive capability where we don't even know how we do the inference, the reasoning. It's just like self-driving – I've spent a lot of time working on self-driving cars. You can learn how to drive a car, but once you learn it, it is muscle memory, instinct – System 1 – that actually gives you the edge. And that's why we have issues and troubles with our AI systems. Also, if you read the book "In Search of Memory" by Eric Kandel, it talks about explicit and implicit decisions, corresponding roughly to System 2 and System 1. Would you agree with me or with Yann?

>> MAX TEGMARK: I think you make a very good point there. This is something I think we'd probably both agree with Yann LeCun about: the architectures will change again. Now everybody is like, "oh, transformers, transformers, transformers." I'm quite confident that when we get artificial general intelligence or superintelligence, it will be nothing like transformers – rather, some of the things we're developing now may at best be components in other systems. You already have things like AutoGPT, where people put loops around large language models and complement what they can do with a lot more traditional machine learning. One thing which is also very obvious from almost all the work in mechanistic interpretability so far is that every time we figure out how a large language model does something, the reaction is, "oh my God, that's so stupid – I can do it much better." It kind of discovers the minimum viable solution: there are three algorithms, and it does error correction and makes them vote, or whatever. So I think mechanistic interpretability will not only give us AI that we can trust because we can ultimately prove it, but it will also be more efficient.

>> ZHANG YAQIN: So it’s quanta-agnostic, the algorithms or models?

>> MAX TEGMARK: Yes. More broadly, I envision that in the future we use these powerful black-box networks as a sort of mine for algorithms, and discover all these different algorithms…

>> ZHANG YAQIN: But the definition is independent...

>> MAX TEGMARK: And then you put them together in an optimal way. And then you start getting systems that are both much more efficient and more reliable and trustworthy. We kind of know it's going to end this way, because even though Ernie Bot and ChatGPT and the different flagship models we heard about here are all great, look how many watts of energy they use. Your brain can do so much more at 20 watts.

>> ZHANG YAQIN: In the interest of time, we have to end the dialogue. But let me just say this: superpowerful AI – it is timely and necessary to put it under control. Let me use your words to end this: let us get empowered by AI, not overpowered by AI. Thank you.

>> MAX TEGMARK: [Xiexie ni].

>> MC: Thank you both very much. One last question for Max. Earlier it was mentioned there will be a debate between you and Yann LeCun. When will that happen?

>> MAX TEGMARK: June 22 in Toronto.

>> [Chinese] on the future of AI.

>> MAX TEGMARK: It will be fun. [Chinese]

(applause)

>> MC: [Chinese].

