Category:Canonical questions

From Stampy's Wiki

Canonical questions are questions we have checked are in scope and not duplicates, so we want answers to them. They may be edited to represent a class of question more broadly, rather than keeping all their idiosyncrasies. Once they are answered canonically, Stampy will serve them to readers.

There are 394 canonical questions; 284 of them are answered, and 220 are canonically answered.

All canonical questions

A lot of concern appears to focus on human-level or “superintelligent” AI. Is that a realistic prospect in the foreseeable future?
AIs aren’t as smart as rats, let alone humans. Isn’t it far too early to be worrying about this kind of thing?
Any AI will be a computer program. Why wouldn't it just do what it's programmed to do?
Are AI researchers trying to make conscious AI?
Are Google, OpenAI, etc. aware of the risk?
Are any major politicians concerned about this?
Are expert surveys on AI safety available?
Are there any AI alignment projects which governments could usefully put a very large amount of resources into?
Are there any courses on technical AI safety topics?
Are there any plausibly workable proposals for regulating or banning dangerous AI research?
Are there promising ways to make AI alignment researchers smarter?
Are there risk analysis methods which may help to make the risk more quantifiable or clear?
Are there types of advanced AI that would be safer than others?
Aren't robots the real problem? How can AI cause harm if it has no ability to directly manipulate the physical world?
Aren’t there some pretty easy ways to eliminate these potential problems?
At a high level, what is the challenge of alignment that we must meet to secure a good future?
Can AI be creative?
Can an AI really be smarter than humans?
Can humans stay in control of the world if human- or superhuman-level AI is developed?
Can people contribute to alignment by using proof assistants to generate formal proofs?
Can we add "friendliness" to any artificial intelligence design?
Can we constrain a goal-directed AI using specified rules?
Can we ever be sure that an AI is aligned?
Can we get AGI by scaling up architectures similar to current ones, or are we missing key insights?
Can we program the superintelligence to maximize human pleasure or satisfaction of human desires?
Can we teach a superintelligence a moral code with machine learning?
Can we tell an AI just to figure out what we want and then do that?
Can we test an AI to make sure that it’s not going to take over and do harmful things after it achieves superintelligence?
Can you give an AI a goal which involves “minimally impacting the world”?
Can you stop an advanced AI from upgrading itself?
Can't we just tell an AI to do what we want?
Can’t we just program the superintelligence not to harm us?
Considering how hard it is to predict the future, why do we think we can say anything useful about AGI today?
Could AI have basic emotions?
Could I contribute by offering coaching to alignment researchers? If so, how would I go about this?
Could an AGI have already been created and currently be affecting the world?
Could divesting from AI companies without good safety culture be useful, or would this be likely to have a negligible impact?
Could emulated minds do AI alignment research?
Could we build provably beneficial AI systems?
Could we get significant biological intelligence enhancements long before AGI?
Could we program an AI to automatically shut down if it starts doing things we don’t want it to?
Could we tell the AI to do what's morally right?
Could weak AI systems help with alignment research?
Do AIs suffer?
Do you need a PhD to work on AI Safety?
Does it make sense to focus on scenarios where change is rapid and due to a single actor, or slower and dependent on getting agreements between several relevant actors?
Does the importance of AI risk depend on caring about transhumanist utopias?
Even if we are rationally convinced about the urgency of existential AI risk, it can be hard to feel that emotionally because the danger is so abstract. How can this gap be bridged?
How can I be a more productive student/researcher?
How can I collect questions for Stampy?
How can I contact the Stampy team?
How can I contribute in the area of community building?
How can I contribute to Stampy?
How can I convince others and present the arguments well?
How can I get hired by an organization working on AI alignment?
How can I join the Stampy dev team?
How can I support alignment researchers to be more productive?
How can we interpret what all the neurons mean?
How close do AI experts think we are to creating superintelligence?
How could an intelligence explosion be useful?
How could an international treaty on the development of AGI look?
How could general intelligence be programmed into a machine?
How could poorly defined goals lead to such negative outcomes?
How difficult should we expect alignment to be?
How do I add content from LessWrong / Effective Altruism Forum tag-wikis to Stampy?
How do I form my own views about AI safety?
How do I format answers on Stampy?
How do I know whether I'm a good fit for work on AI safety?
How do I stay motivated and productive?
How do I stay updated about AI progress?
How do organizations do adversarial training and red teaming?
How do the incentives in markets increase AI risk?
How does AI taking things literally contribute to alignment being hard?
How does MIRI communicate their view on alignment?
How does superintelligence reliably go from controlling the internet to controlling the physical reality?
How does the current global microchip supply chain work, and who has political power over it?
How does the field of AI Safety want to accomplish its goal of preventing existential risk?
How does the stamp eigenkarma system work?
How doomed is humanity?
How fast will AI takeoff be?
How good is the world model of GPT-3?
How hard is it for an AGI to develop powerful nanotechnology?
How important is research closure and OPSEC for capabilities-synergistic ideas?
How is "intelligence" defined?
How is AGI different from current AI?
How is Beth Barnes evaluating LM power seeking?
How is OpenAI planning to solve the full alignment problem?
How is metaethics relevant to AI alignment?
How is the Alignment Research Center (ARC) trying to solve Eliciting Latent Knowledge (ELK)?
How likely are AI organizations to respond appropriately to the risks of their creations?
How likely is an "intelligence explosion"?
How likely is it that AGI will first be developed by a large established organization, rather than a small startup, an academic group or a government?
How likely is it that an AI would pretend to be a human to further its goals?
How likely is it that governments will play a significant role? What role would be desirable, if any?
How long will it be until superintelligent AI is created?
How might "acausal trade" affect alignment?
How might AGI interface with cybersecurity?
How might AGI kill people?
How might a real-world AI system that receives orders in natural language and does what you mean look?
How might a superintelligence socially manipulate humans?
How might a superintelligence technologically manipulate humans?
How might an "intelligence explosion" be dangerous?
How might an AI achieve a seemingly beneficial goal via inappropriate means?
How might humanity decide what we want the future to be like once we have an aligned superintelligence?
How might non-agentic GPT-style AI cause an "intelligence explosion" or otherwise contribute to existential risk?
How might things go wrong with AI even without an agentic superintelligence?
How might we get from Artificial General Intelligence to a Superintelligent system?
How might we reduce the chance of an AI arms race?
How might we reduce the diffusion of dangerous AI technology to insufficiently careful actors?
How much can we learn about AI with interpretability tools?
How many resources did the processes of biological evolution use to evolve intelligent creatures?
How possible (and how desirable) is it to change which path humanity follows to get to AGI?
How powerful will a mature superintelligence be?
How quickly could an AI go from the first indications of problems to an unrecoverable disaster?
How quickly would the AI capabilities ecosystem adopt promising new advances in AI alignment?
How should I change my financial investments in response to the possibility of transformative AI?
How should I decide which quality level to attribute to a proposed question?
How should I personally prepare for when transformative AI arrives?
How software- and/or hardware-bottlenecked are we on AGI?
How successfully have institutions managed risks from novel technology in the past?
How tractable is it to get governments to play a good role (rather than a bad role) and/or to get them to play a role at all (rather than no role)?
How would I know if AGI were imminent?
How would we align an AGI whose learning algorithms / cognition look like human brains?
How would we know if an AI were suffering?
How would you explain the theory of Infra-Bayesianism?
I want to help out AI alignment without necessarily making major life changes. What are some simple things I can do to contribute?
I want to work on AI alignment. How can I get funding?
I'm interested in working on AI safety. What should I do?
If AGI comes from a new paradigm, how likely is it to arise late in the paradigm when it is already deployed at scale, versus early on when only a few people are exploring the idea?
If AI takes over the world how could it create and maintain the infrastructure that humans currently provide?
If I only care about helping people alive today, does AI safety still matter?
If an AI became conscious, how would we ever know?
If we solve alignment, are we sure of a good future?
In "aligning AI with human values", which humans' values are we talking about?
In what ways are real-world machine learning systems different from expected utility maximizers?
Is AI alignment possible?
Is AI safety research racing against capability research? If so, how can safety research get ahead?
Is expecting large returns from AI self-improvement just following an exponential trend line off a cliff?
Is it already too late to work on AI alignment?
Is it likely that hardware will allow an exponential takeoff?
Is it possible to block an AI from doing certain things on the Internet?
Is it possible to code into an AI to avoid all the ways a given task could go wrong, and would it be dangerous to try that?
Is large-scale automated AI persuasion and propaganda a serious concern?
Is merging with AI through brain-computer interfaces a potential solution to safety problems?
Is the UN concerned about existential risk from AI?
Is the focus on the existential threat of superintelligent AI diverting too much attention from more pressing debates about AI in surveillance and the battlefield, and its potential effects on the economy?
Is the question of whether we're living in a simulation relevant to AI safety? If so, how?
Is there a Chinese AI safety community? Are there safety researchers working at leading Chinese AI labs?
Is there a danger in anthropomorphizing AIs and trying to understand them in human terms?
Is this about AI systems becoming malevolent or conscious and turning on us?
Isn't it hard to make a significant difference as a person who isn't going to be a world-class researcher?
Isn't it too soon to be working on AGI safety?
Isn't the real concern AI being misused by terrorists or other bad actors?
Isn't the real concern AI-enabled totalitarianism?
Isn't the real concern autonomous weapons?
Isn't the real concern technological unemployment?
Isn’t AI just a tool like any other? Won’t it just do what we tell it to?
Isn’t it immoral to control and impose our values on AI?
I’d like to get deeper into the AI alignment literature. Where should I look?
Might an "intelligence explosion" never occur?
Might an aligned superintelligence force people to "upload" themselves, so as to more efficiently use the matter of their bodies?
Might an aligned superintelligence force people to have better lives and change more quickly than they want?
Might an aligned superintelligence immediately kill everyone and then go on to create a "hedonium shockwave"?
Might attempting to align AI cause a "near miss" which results in a much worse outcome?
Might humanity create astronomical amounts of suffering when colonizing the universe after creating an aligned superintelligence?
Might trying to build a hedonium-maximizing AI be easier and more likely to work than trying for eudaimonia?
OK, I’m convinced. How can I help?
Once we notice that a superintelligence given a specific task is trying to take over the world, can’t we turn it off, reprogram it or otherwise correct the problem?
Should I engage in political or collective action like signing petitions or sending letters to politicians?
Should we expect "warning shots" before an unrecoverable catastrophe?
Superintelligence sounds like science fiction. Do people think about this in the real world?
This all seems rather abstract. Isn't promoting love, wisdom, altruism or rationality more important?
To what extent are there meaningfully different paths to AGI, versus just one path?
We already have psychopaths who are "misaligned" with the rest of humanity, but somehow we deal with them. Can't we do something similar with AI?
We’re going to merge with the machines so this will never be a problem, right?
What AGI safety reading lists are there?
What about AI concerns other than misalignment?
What about having a human supervisor who must approve all the AI's decisions before executing them?
What actions can I take in under five minutes to contribute to the cause of AI safety?
What alignment strategies are scalably safe and competitive?
What approaches are AI alignment organizations working on?
What are "coherence theorems" and what do they tell us about AI?
What are "human values"?
What are "scaling laws" and how are they relevant to safety?
What are "selection theorems" and can they tell us anything useful about the likely shape of AGI systems?
What are Encultured working on?
What are OpenAI Codex and GitHub Copilot?
What are Scott Garrabrant and Abram Demski working on?
What are alternate phrasings for?
What are brain-computer interfaces?
What are language models?
What are likely to be the first transformative applications of AI?
What are mesa-optimizers?
What are plausible candidates for "pivotal acts"?
What are some AI alignment research agendas currently being pursued?
What are some good books about AGI safety?
What are some good podcasts about AI alignment?
What are some good resources on AI alignment?
What are some helpful AI policy ideas?
What are some important examples of specialised terminology in AI alignment?
What are some objections to the importance of AI alignment?
What are some of the leading AI capabilities organizations?
What are some of the most impressive recent advances in AI capabilities?
What are some open research questions in AI alignment?
What are some practice or entry-level problems for getting into alignment research?
What are some problems in philosophy that are related to AI safety?
What are some specific open tasks on Stampy?
What are the "win conditions"/problems that need to be solved?
What are the differences between AGI, transformative AI and superintelligence?
What are the differences between “AI safety”, “AGI safety”, “AI alignment” and “AI existential safety”?
What are the different possible AI takeoff speeds?
What are the different versions of decision theory?
What are the editorial protocols for Stampy questions and answers?
What are the ethical challenges related to whole brain emulation?
What are the leading theories in moral philosophy and which of them might be technically the easiest to encode into an AI?
What are the main sources of AI existential risk?
What are the potential benefits of AI as it grows increasingly sophisticated?
What are the style guidelines for writing for Stampy?
What assets need to be protected by/from the AI? Are "human values" sufficient for it?
What beneficial things would an aligned superintelligence be able to do?
What can I do to contribute to AI safety?
What can we expect the motivations of a superintelligent machine to be?
What convinced people working on AI alignment that it was worth spending their time on this cause?
What could a superintelligent AI do, and what would be physically impossible even for it?
What does Elon Musk think about AI safety?
What does Evan Hubinger think of Deception + Inner Alignment?
What does MIRI think about technical alignment?
What does Ought aim to do?
What does a typical work day in the life of an AI safety researcher look like?
What does alignment failure look like?
What does generative visualization look like in reinforcement learning?
What does the scheme Externalized Reasoning Oversight involve?
What evidence do experts usually base their timeline predictions on?
What external content would be useful to the Stampy project?
What harm could a single superintelligence do when it took so many humans to build civilization?
What if technological progress stagnates and we never achieve AGI?
What if we put the AI in a box and have a second, more powerful, AI with the goal of preventing the first one from escaping?
What is "Do What I Mean"?
What is "HCH"?
What is "agent foundations"?
What is "biological cognitive enhancement"?
What is "coherent extrapolated volition"?
What is "evidential decision theory"?
What is "friendly AI"?
What is "functional decision theory"?
What is "greater-than-human intelligence"?
What is "hedonium"?
What is "logical decision theory"?
What is "metaphilosophy" and how does it relate to AI safety?
What is "narrow AI"?
What is "superintelligence"?
What is "transformative AI"?
What is "whole brain emulation"?
What is AI Safety via Debate?
What is AI safety?
What is Aligned AI / Stuart Armstrong working on?
What is Anthropic's approach to LLM alignment?
What is Artificial General Intelligence and what will it look like?
What is Conjecture's Scalable LLM Interpretability research agenda?
What is Conjecture's epistemology research agenda?
What is Conjecture, and what is their team working on?
What is David Krueger working on?
What is Dylan Hadfield-Menell's thesis on?
What is GPT-3?
What is Goodhart's law?
What is John Wentworth's plan?
What is MIRI’s mission?
What is Refine?
What is Stampy's copyright?
What is a "pivotal act"?
What is a "quantilizer"?
What is a "value handshake"?
What is a canonical question on Stampy's Wiki?
What is a duplicate question on Stampy's Wiki?
What is a follow-up question on Stampy's Wiki?
What is a verified account on Stampy's Wiki?
What is an "agent"?
What is an "intelligence explosion"?
What is an "s-risk"?
What is an example of AGI going wrong that doesn't sound like sci-fi?
What is artificial general intelligence safety / AI alignment?
What is causal decision theory?
What is everyone working on in AI alignment?
What is interpretability and what approaches are there?
What is meant by "AI takeoff"?
What is neural network modularity?
What is the "control problem"?
What is the "long reflection"?
What is the "orthogonality thesis"?
What is the "universal prior"?
What is the "windfall clause"?
What is the Center for Human Compatible AI (CHAI)?
What is the Center on Long-Term Risk (CLR) focused on?
What is DeepMind's safety team working on?
What is the Stampy project?
What is the difference between inner and outer alignment?
What is the general nature of the concern about AI alignment?
What is the probability of extinction from misaligned superintelligence?
What kind of a challenge is solving AI alignment?
What kind of questions do we want on Stampy?
What links are especially valuable to share on social media or other contexts?
What milestones are there between us and AGI?
What organizations are working on technical AI alignment?
What plausibly happens five years before and after AGI?
What research is being done to align modern deep learning systems?
What safety problems are associated with whole brain emulation?
What should I read to learn about decision theory?
What should be marked as a "related" question on Stampy's Wiki?
What should be marked as a canonical answer on Stampy's Wiki?
What should the first AGI systems be aligned to do?
What sources of information can Stampy use?
What subjects should I study at university to prepare myself for alignment research?
What technical problems are MIRI working on?
What technological developments could speed up AI progress?
What training programs and courses are available for AGI safety?
What would a "warning shot" look like?
What would a good future with AGI look like?
What would a good solution to AI alignment look like?
What would a world shortly before AGI look like?
What would be physically possible and desirable to have in an AI-built utopia?
What's especially worrisome about autonomous weapons?
What's meant by calling an AI "agenty" or "agentlike"?
What’s a good AI alignment elevator pitch?
When should I stamp an answer?
When will an intelligence explosion happen?
When will transformative AI be created?
Where can I find all the features of Stampy's Wiki?
Where can I find mentorship and advice for becoming a researcher?
Where can I find people to talk to about AI alignment?
Where can I find questions to answer for Stampy?
Where can I learn about AI alignment?
Where can I learn about interpretability?
Which country will AGI likely be created by, and does this matter?
Which military applications of AI are likely to be developed?
Which organizations are working on AI alignment?
Which organizations are working on AI policy?
Which university should I study at if I want to best prepare for working on AI alignment?
Who created Stampy?
Who is Nick Bostrom?
Who is Sam Bowman?
Who is Stampy?
Why can't we just make a "child AI" and raise it?
Why can't we just turn the AI off if it starts to misbehave?
Why can't we simply stop developing AI?
Why can’t we just use Asimov’s Three Laws of Robotics?
Why can’t we just use natural language instructions?
Why can’t we just “put the AI in a box” so that it can’t influence the outside world?
Why can’t we just…
Why do some AI researchers not worry about alignment?
Why do we expect that a superintelligence would closely approximate a utility maximizer?
Why do you like stamps so much?
Why does AI need goals in the first place? Can’t it be intelligent without any agenda?
Why does AI takeoff speed matter?
Why does there seem to have been an explosion of activity in AI in recent years?
Why don't we just not build AGI if it's so dangerous?
Why is AGI dangerous?
Why is AGI safety a hard problem?
Why is AI a severe threat to humanity?
Why is AI alignment a hard problem?
Why is AI safety important?
Why is safety important for smarter-than-human AI?
Why is the future of AI suddenly in the news? What has changed?
Why might a maximizing AI cause bad outcomes?
Why might a superintelligent AI be dangerous?
Why might contributing to Stampy be worth my time?
Why might people try to build AGI rather than stronger and stronger narrow AIs?
Why might we expect a fast takeoff?
Why might we expect a moderate AI takeoff?
Why might we expect a superintelligence to be hostile by default?
Why should I worry about superintelligence?
Why should we prepare for human-level AI technology now rather than decades down the line when it’s closer?
Why think that AI can outperform humans?
Why work on AI safety early?
Why would great intelligence produce great power?
Why would we only get one chance to align a superintelligence?
Will AGI be agentic?
Will AI learn to be independent from people or will it always ask for our orders?
Will an aligned superintelligence care about animals other than humans?
Will superintelligence make a large part of humanity unemployable?
Will there be a discontinuity in AI capabilities? If so, at what stage?
Will we ever build a superintelligence?
Won’t AI be just like us?
Would "warning shots" make a difference and, if so, would they be helpful or harmful?
Would AI alignment be hard with deep learning?
Would an AI create or maintain suffering because some people want it?
Would an aligned AI allow itself to be shut down?
Would donating small amounts to AI safety organizations make any significant difference?
Would it improve the safety of quantilizers to cut off the top few percent of the distribution?
Would we know if an AGI was misaligned?
Wouldn't a superintelligence be smart enough not to make silly mistakes in its comprehension of our instructions?
Wouldn't a superintelligence be smart enough to know right from wrong?
Wouldn't it be a good thing for humanity to die out?
Wouldn't it be safer to only build narrow AIs?

Pages in category "Canonical questions"

The following 200 pages are in this category, out of 372 total.
