Is recursive self-improvement possible?

The possibility of recursive self-improvement is debated within the field of AI, because the idea rests on several contested assumptions.

  • For example, a common misconception is that self-improvement requires the AI to be able to modify its entire codebase directly. This is not necessary: many pathways to self-improvement (e.g., creating sub-agents, partial self-modification, or simply increasing scale, among others) do not depend on that ability.

  • Among the underlying assumptions of self-improvement, one of the most common disagreements is over whether a self-improving AI will eventually hit diminishing returns, and where that point might lie. For AIs built within the modern deep learning paradigm, some have argued that self-improvement may be infeasible and not cost-effective because of the complexity and computational demands of these models. If this is true, then such AIs would not satisfy the criteria for being a seed AI, which was imagined as a system that is designed rather than selected for by search-like processes, understands its own source code, and can make goal-preserving modifications to itself.

A common response to this objection is that the difficulty of self-improvement under deep learning may simply be an artifact of this specific paradigm: the history of AI is replete with examples of a radical change in approach making previously infeasible problems easy. Moreover, "improving source code" in the context of deep learning does not just mean a neural network editing each of its weights directly (although steering a network in this way may well be possible); it also covers things like the code defining the architecture, the code for collecting training data, and so on.
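As a rough illustration of this broader notion of "source code", here is a minimal Python sketch. Every name, the configuration fields, and the scoring rule are hypothetical stand-ins, not a real training pipeline: the point is only that an improvement step can be an edit to the configuration and code that *produce* a model, followed by retraining, rather than an edit to individual weights.

```python
# Minimal sketch (all names hypothetical): "self-improvement" as editing
# the code/config that generates a model, not the weights themselves.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class TrainingConfig:
    n_layers: int = 12         # stands in for architecture-defining code
    hidden_dim: int = 768
    data_filter: str = "none"  # stands in for data-collection code

def train_model(cfg: TrainingConfig) -> float:
    """Stub for an expensive training run; returns a benchmark score.
    The scoring rule below is a toy, purely for illustration."""
    score = cfg.n_layers * 0.1 + cfg.hidden_dim * 0.001
    if cfg.data_filter == "dedup":
        score += 1.0
    return score

def propose_improvement(cfg: TrainingConfig) -> TrainingConfig:
    """A system proposing a change to its own training setup:
    a deeper architecture and better data filtering."""
    return replace(cfg, n_layers=cfg.n_layers + 2, data_filter="dedup")

cfg = TrainingConfig()
improved = propose_improvement(cfg)
print(train_model(cfg), "->", train_model(improved))
```

In this framing, each round of self-improvement is an edit-and-retrain loop over the generating process, which is much closer to how progress in deep learning actually happens than direct weight edits would be.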

Another common question is whether an AGI would want to self-improve in the first place.

Ultimately, the danger of self-improvement does not lie in hypothetical infinite recursion, but in pushing an AGI further down a path toward AI takeover: even fairly modest, concrete, and reasonable improvements may push an AGI beyond the point of controllability. Given that some trends in modern AI already provide steps along the path to self-improvement, the question of its long-term feasibility remains a crucial one. Furthermore, self-improvement is not even necessary to get outcomes that look like FOOM.