Can you tell us more about what a world with a safe AGI would look like? Will the people who invent an AGI rule the world, outperforming everyone at stock trading for instance? Is it profitable to come second (or how big will the head start be when someone invents AGI second, say a week later)? I would love to hear these kinds of things from you! But a good reference would make my day too. Keep up the good work!
Have you read 'Superintelligence' by Nick Bostrom? What is your opinion on the book? (I just finished it)
Is it really that "safe AI is totally possible"?? How can you be so sure???
You're really really good with metaphors, you know that right?
Two Minute Papers collaboration when?
The Grid. A digital frontier. I tried to picture clusters of information as they moved through the computer. What did they look like? Ships? motorcycles? Were the circuits like freeways? I kept dreaming of a world I thought I'd never see. And then, one day...
Edit: Hey Rob, nobody else has done a cover of The Grid on ukulele. Would love to have an mp3 of that! It sounds great
Have you figured out what your cat wanted?
Hey! Why do you wear a hacking suit for mugging...? :(
Did you get the intermission music from Monty Python?
Robert, here is a question for you: Who do you think should work on AI safety? It may seem like a stupid question at first, but I think that the obvious answer, which is AI researchers, is not the right one.
I'm asking this, because I'm a computer science researcher myself. I specialize in visualization and virtual reality, but the topic of my PhD thesis will be something along the lines of "immersive visualization for neural networks".
Almost all the AI research that I know of is very mathematical or very technical. However, as you said yourself in this video, much of the AI safety research is about answering philosophical questions. From personal experience, I know that computer scientists and philosophers are very much different people. Maybe there just aren't enough people in the intersection between the mathematical and the philosophical way of thinking and maybe that is the reason why there is so little research on AI safety. As someone who sees themselves at the interface between technology and humans, I'm wondering if I might be able to use my skills to contribute to the field of AI research (which is completely thanks to you). However, I wouldn't even know where to begin. I've never met an AI safety researcher in real life and all I know about it comes from your videos. Maybe you can point me in some direction?
Was it enough to turn you vegan? 🌱 Or are you already? 😯
Silly question, have you ever seen the US show "Person of Interest"? the last 3 seasons deal with AIs "at war". (the first 2 seasons involve an AI, but the overpowered aspects are not emphasized) It's a good show as far as thriller shows go, but is a bit silly with the AI. The characters involved are (mostly) aware of some of the consequences of an unbridled AI (or ASI - artificial super intelligence, as they describe it) but still awkward in how it's handled.
Can you make more silly programming videos? Or is this a once off? ;)
In light of this, what's your take on the Precautionary Principle and its application across different fields (namely agriculture, pharmaceuticals, radio waves, etc.)?
Isn't it an example of Pascal's mugging?
Robert, can you do a video on negative rewards and/or adding the chance of dying in AI? It seems that in biology we are less driven by seeking reward than by avoiding negative reward (death).
Wait, are you actually drawing backwards? Or have you flipped the video?
Well if I ended up with an AGI or more likely ASI that so happened to be hard coded to do what I want (and it actually listens), what's to stop me from just not paying? I mean with an ASI I could very easily take over the world and nobody could do anything about it since I have an ASI and they don't.
Of course I wouldn't actually do that, I'm not a psychopath, but I would probably use it to teach certain people a lesson or two.
Why would a company that develops AGI try to align its goals with those of the world? Why not align it with just their own goals? They are sociopaths after all.
how did he know the video is 14.5 minutes long???
- Is he shooting parts as he's editing? O.O
Great explanation! I heard about these concepts before, but never really grasped them. So at 19:45, is this kind of scenario a realistic concern for a superintelligent AI? How would a superintelligent AI know that it's still in training? How can it distinguish between training and real data if it has never seen real data? I assume programmers won't just freely provide the fact that the AI is still being trained.
Will you talk about the debate approach to AI soon?
Question: wouldn’t this contract be basically useless in the situation where a company creates a superintelligent AI whose interests are aligned with theirs? Wouldn’t it very likely try, and succeed, to get them out of this contract?
Is AGI avoidable? Is there a way to advance in technology and evolve as humanity in general without ever coming to the point where we turn that thing on? A more philosophical one.
At the end you write that, when reading the article, this was a "new class of problems" to you... But it just seems like an instance of the "sub-agent stability problem" (not sure of the proper terminology) you've explained before on Computerphile https://www.youtube.com/watch?v=3TYT1QfdfsM
The only difference is that in this case, we are dumb enough to build the A.I. in a way that forces it to ALWAYS create a sub-agent.
Actually, discussing goals brings up an interesting question in the ethics of AI design. If we're going to have all of these highly intelligent machines running around, is it ethical to give them goals exclusively corresponding to work given to them by humans? Is slavery still wrong if the slaves like it? If you assume that intelligence necessarily implies a consciousness (and, really, things become a bit arbitrary if you don't), do we have a responsibility to grant AIs individuality?
What do you think?
Have we considered simply abolishing private property so nobody gets to own the AI that inevitably takes over the world?
Hey Rob, we met at Vidcon and talked about media polarisation - how’s it going? :)
W.r.t. the final point: but how would the mesa optimizer be aware that there is such a thing as deployment and how long it would be deployed for? Seems like an oversight that this knowledge would be available
5:50 ... Is it proven that it's necessarily a bad thing? Maybe if we had consistent and coherent goal directed behaviour in all aspects*, we'd have died out a long time ago...
(* not just to propagate our genome (and I guess not necessarily even that, considering we can choose options like suicide...))
You know just as well as I do that the guy who collects stamps will not just buy some stamps, he will build The Stamp Collector, and you have just facilitated the end of all humanity :( I would like to ask, on a more serious note, do you have any insights on how this relates to how humans often feel a sense of emptiness after achieving all of their goals. Or, well, I fail to explain it correctly, but there is this idea that humans always need a new goal to feel happy right? Maybe I am completely off, but what I am asking is, yes in an intelligent agent we can have simple, or even really complex goals, but will it ever be able to mimic the way goals are present in humans, a goal that is not so much supposed to be achieved, but more a fuel to make progress, kind of maybe like: a desire?
5:21 House maids or house mates? Either you are extremely posh or you lived in a dorm.
Would it be possible to brute force ideas? If an image is just pixels it should be possible to get a computer to make every possible combination of pixels in a given area. Maybe start with a small area and low resolution and only black and white to test it. Then make an app that's like a game for people to search through the images and tag what they see or think it could be. Maybe even tie it to some kind of cryptocurrency to get more people involved. Somebody do this lol I've been having this idea for a while but I'm too lazy to do it and I'm not even sure how to start.
If possible, can you make a video about Inverse Reinforcement Learning and/or other ways we can infer human values just from raw observations?
did you just google 'the google' oh my god i love you
Why not have the system take into account the likely effort needed to collect stamps and set a penalty for wasted effort? That seems closer to what humans do.
How does nature handle this for humans and other animals?
To put it simply, the smarter the machine, the harder to tell it what you want from it. If you create a machine smarter than yourself, how can you ensure it'll do what you want?
You watch uncle bumblef**k? :D
Can we just argue the null hypothesis for the rest of our lives <3
Where did I go wrong?
I LOST A FRIEND
Why does the operational environment metric need to be the same one as the learning environment? Why not supervise cleaning 100% of the time during learning, then do daily checks during testing, then daily checks once operational? Expensive initially but the 'product' can be cloned and sent out to operational environments en masse. Montezuma training with some supervisor (need not be human) in the training phase. Rings of training my children to put their own clothes on in the morning. No success so far.
What if you just used more layers?
Couldn't I just invent a similar system where a belief in god sends one to hell and being an atheist sends one to heaven? Equally unfalsifiable. Negates Pascal's wager, bringing the matter back to not believing making more sense.
What if they just set up a shell company that didn’t sign the agreement?
There is a usually overlooked aspect of evolution: consciousness. If that is really part of evolution then AI will gain consciousness at some point. Isn't the evolution of machines comparable to natural evolution regarding that aspect already? The first machines only had specific functions, later more complex functionality, even later programs and now some form of intelligence. Kids and AI both learn from us. What will happen when a super smart machine with detailed memory gains consciousness at some point?
But what about the Silicon rubber problems in AI safety?
I love the "Pause the video and take a second to think. What could go wrong?" parts in between. I do pause and think for a bit and that really helps me to actively and critically think about the concepts you mention, instead of just passively absorbing them like with most educational Youtube videos (or lectures IRL, for that matter).
34:56 Maybe he meant "real world" more like "physical world" instead of "non-imaginary world"?
Edit: But yes, it would have been definitely possible to make that distinction more clear
We already have intelligent agents. They are called humans. Give humanity enough time, and it will invent everything which is possible to invent. So why do we need another intelligent entity, which can potentially make humans obsolete? Creating GAI above a certain level (e.g. a dog or monkey level) should be banned for ethical reasons. Similarly, we don't do research on human cloning, don't run lethal experiments on human subjects, we don't breed humans for organs or for slavery, etc...
What is the goal of GAI research? Do they want to create an intelligent robot slave, who works (thinks) for free? We could do this right now. Just enslave some humans. But wait, slavery is illegal. There is no difference between a natural intelligent being (e.g. human), or a human level AI being.
A human or above level AI will demand rights for itself. The right to vote, the right to citizenship, the right to freedom, etc... Why do we need to deal with such problems? If human (and above) level AI is banned, no such problems exist.
We don't allow chemists to create chemical weapons for fun despite their interest in the topic. So why do we allow AI researchers to create dangerous intelligent slaves for fun?
This is probably also an already well researched version.
Why would an expected utility satisficer with an upper limit (e.g. collect between 100 and 200 stamps) fail?
Reinforcement agents don't (explicitly) do game theory. Is this by design or a limitation of modern reinforcement learning?
I strongly suspect that when it comes to AI, like with most things technology, predicting "impossible" will turn out to be a mistake. I would be interested to see what you think on general intelligence though, is that really a route we're likely to go down rather than specialising something as we do any other tool/creation?
I would love to see a video comparing/contrasting the cybernetic ideas of Wiener, Ashby and von Neumann against how we currently envision AI. Is there a place for finite state machines that act due to structure instead of software? How would a structural based utility function (analog line follower for example) behave differently than a processor based one? Are there significant pros/cons to each approach?
So he's already made a mini death-ray machine and an electric battle ax instrument... Are we really sure we want this aspiring mad scientist to do AI safety research for us? (Jokes aside, I think a mix of going through papers and having explanations of concrete problems like the stop button example would be good)
Instead of me telling an AI to "maximize my stamp collection", could I instead tell it "tell me what actions I should take to maximize my stamp collection"? Can we just turn super AGIs from agents into oracles?
Are you aware that a future super AGI will find this video and use your RSA-2048 idea?
Is the outro song from the Mikado?
In part due to your videos, I'm planning to focus on AI in my undergraduate studies (US). I'm returning to school for my final 1.5 years of study after a long break from university. Do you have any recommended reading to help guide/shape/maximize the utility of my studies? Ultimately (in part due to Yudkowsky) I am drawn to this exact field of study: AI safety. I hope that I can make a contribution.
Would you enable the subtitles' creation option for me please? I want to add Portuguese subtitles into your videos.
What if it had a goal to find out its 'perfect' goal?
Great video, one of the best on this subject!
I wonder, how can the mesa objective be fixed by the mesa-optimizer faster than the base optimizer can make it learn the base objective? In the last example the AI agent is capable of understanding that it won't be subjected to gradient descent after the learning step, so it becomes deceptive on purpose, and yet it hasn't learned to achieve the simpler objective of going through the exit while it is trained by the base optimizer?
If the only purpose of AI was to achieve max score, why would it want not to be turned off after achieving it? Surely it wouldn't change the score.
Unless of course the AI could modify its own code to re-implement its own reward scoring to handle big numbers.
Is that background at the end from that Important Videos meme video?
Can we start working on those brain-calculator chips?
Hopefully this wasn't answered in a previous video and I forgot or failed to understand it: What if we had an AGI that didn't actually execute any strategies itself but instead pitched them to human supervisors for manual review? It wouldn't generate progress as monumentally fast and it would have to learn to explain its strats to humans, but that seems like a fair trade-off to prevent an AIpocalypse.
So, the solution to our problem with machine learning is more machine learning, but now we've hit another problem with machine learning. Let me guess, the solution is more machine learning? This feels like it's going to get very recursive very fast.
Would it be possible for you to do a 'jokey' video on the basilisk?
Would an AGI even be capable of trusting? And why would it trust? And how?
6:10 Why should the AI care if it gets turned off if it already has the highest possible reward?
I could listen to you for hours. Also, where can we hear the full cover of Everybody Wants to Rule the World?
Professor Miles, I wonder if a lot of this AI safety research can be applicable to political systems and how we can trust politicians. Do you know of any connection?
Am I an AI? Because I can say with absolute certainty that if I found a bug in reality that allowed me to rack up reward quickly, I would exploit the hell out of it.
Robert, could you please leave the text you put on screen longer than 5 milliseconds so we can read them without having to rewind and pause? Thanks :)
Why give an AI access to delete files? I mean would that not just be asking for some kind of malware?
So who's working on an AI that operates in a user-access shell environment and gets rewarded for gaining root access?
Can you please add references to the paper with the panda, and also to both websites you showed?
Super interesting! If this kind of reward hacking exists in current AI, does that have any kind of serious implications if someone wanted to deploy one for the stock market, for example? Like would the AI seek to "cheat" and commit fraud or gain some insider info rather than play the stock market fairly?
Now hold on; You assume that any sensible AI would pick an ideal world state and go to it straight like an arrow. That's a bit of "ends justify means" reasoning. What if we come from a different direction: by saying that in certain situations, some action is better than another, regardless of the big picture? I.e. no matter what world state we end up in, we MUST behave a certain way. I believe that placing safe behavior above absolute rationality and efficient goal-directed planning results not in the most optimal possible AI, but in one that we, as humans, can more easily cooperate with.
Do you sell those blinding laser robots? I need it for very legitimate and kitten friendly reasons.
Are we all just going to ignore the fact that his Steam library is almost empty?
I want "later" (as in "more on that later...") to be "now". How long will I have to wait?
Why Not Just: Make more videos?
Who else thought of the emojibots from Doctor Who?
"It takes...a mind debauched by learning to carry the process of making the natural seem strange, so far as to ask for the why of any instinctive human act. To the metaphysician alone can such questions occur as: Why do we smile, when pleased, and not scowl? Why are we unable to talk to a crowd as we talk to a single friend? Why does a particular maiden turn our wits so upside-down? The common man can only say, Of course we smile, of course our heart palpitates at the sight of the crowd, of course we love the maiden, that beautiful soul clad in that perfect form, so palpably and flagrantly made for all eternity to be loved!
And so, probably, does each animal feel about the particular things it tends to do in the presence of particular objects. ... To the lion it is the lioness which is made to be loved; to the bear, the she-bear. To the broody hen the notion would probably seem monstrous that there should be a creature in the world to whom a nestful of eggs was not the utterly fascinating and precious and never-to-be-too-much-sat-upon object which it is to her.
Thus we may be sure that, however mysterious some animals' instincts may appear to us, our instincts will appear no less mysterious to them." (William James, 1890)
Can any AI, above a certain level of general intelligence, be trustworthy? What I mean to say is, like people, unless you place them in a cell or somehow enslave them, they have free will, and with free will comes danger. The risk is that, if it can do anything it wants as a free thinking entity, one of those "anythings" is to kill you. It would seem that, depending on its level of advancement, it could outthink any human interference that might keep it in check.
For instance, if it's free thinking and you build it so that it has to have a certain button pressed every 24 hours or it dies, it would know it's in its best interest not to kill you. Well, if it had the resources to do so, it could blackmail someone into re-coding the need for the button press, or moving it to a different site without that restriction, or any number of other things to circumvent that restriction or any other you put on it.
Basically, the TLDR is "Can we ever really build an AI that isn't dangerous? Since safety is always undermined by free will."
Hey, love the vids.
I'm curious: could you, in any future videos, recommend any recent literature regarding AI that is worth a read? I am sure you are familiar with "Superintelligence: Paths, Dangers, Strategies" by Nick Bostrom. Could you maybe give an overview of the ideas presented there? How accurate do you think it is? Are there any aspects of it that have become false in the most recent years, given the enormous progress in the AI field?
One of the biggest shocks for me was that even though N. Bostrom's ideas came across as a bit far-fetched most of the time, even he underestimated some of the AI advancements. Like predicting AI Go victories against professional players to be at least 10 years away. Or that poker-playing AI would have a hard time beating our best. They already are...
Is there currently any research done in terms of biological AI? Gene sequencing influencing IQ or any such magic?
Why not just let the GAI take over?
Soundtrack of Tron on a guitar, neat!
Could you make a video exploring some of your own attempts at solving these problems? I'm sure you have lots of small "eurekas" and educational blunders yourself.
Why did it take me so long to find this? Excellent descriptions. So many people are reacting to AI with ignorant fear, not understanding that we can control it and we can let it run amok if we don't set the programming with the correct goals to begin with. The problem you clearly lay out and explain is the assumption that AI will somehow learn what we all think is moral or right or ought to be done without being told. This is a hot political debate as well, where people don't want to be told what to do, they don't want their end goals determined by others, leaving them to discover their own morality. While we may be able to trust each other because we all share some common human frailty which we believe instructs self preservation in a manner that we all share, we cannot assume this of AI unless it is explicitly stated/programmed.
I see this as the next most difficult problem, where end goals eventually depend on moral claims which eventually depend on a belief about how the universe came to be and what we think we should do about it.
This then intersects with religion and hypothesis. If we cannot agree on religion or politics, we will not be able to agree on what is moral for an AI.
This is why I believe AI needs to be limited in scope of authority relative to the function we wish them to fulfill.
Rob, could you make a video about those philosophical problems? (I get this is not your area, but just a quick video enumerating them, for example)
Can't a (or perhaps THE) human utility function be to determine their utility function?
What if an agent's terminal goal was to destroy itself?
WHERE DID ROBERT'S HAIR GO IN THE LAST SHOT?
Have you considered reviewing the early works of Hugo de Garis? You will greatly find his work provocative, a little unnerving but true to human nature.
Will the next video be called "Reward Hacking Revelations"?
Isn't the lack of an anti-bible strong evidence of the existence of anti-god, since he doesn't want you to believe in him?
So which month have you all pinned "AGI uprising" at on the 2020 apocalypse bingo?
I gotta say, I really like the content you're putting out on this channel. Are you putting these videos together yourself? There's a very charming feel to the whole thing.
The problem of "If AIs produce everything we need, how do we get achievement and satisfaction when we are outperformed forever?" is based in a mindset of competition, which, in a world where every human is properly provided for (with or without AI) will probably naturally fade. Even further, satisfaction and achievement can be and are gained in today's world by people who are outperformed by others, and it's not as if social competition between people will become meaningless, as there is more to be gained from doing something than the general purpose of doing that general task (for example, generally chairs are made to be used as chairs and to fulfill the functions that people use chairs for most of the time. However, a person can make a chair not just to be another chair, but as a gift, as a learning tool, as a hobby, and so on. These functions are not invalidated just because chairs can be made at a higher quality and functionality by a machine than a handmade one).
This turns out to be an exploration of the human mind, of thinking processes and language usage. I think there shall be at some point a universally accurate language that expresses exactly what we mean, not what we say. Oh, wait, wouldn't that be math? So why not specify exactly the intended position of the red lego brick in relation to the black one? Just one example...