Recursive self-improvement

In artificial intelligence, recursive self-improvement (RSI) is a process in which an artificial general intelligence (AGI) system autonomously enhances its own intelligence and capabilities, potentially leading to superintelligence or a rapid intelligence explosion.^[1]^[2]

This process raises serious ethical and safety concerns, as the system could evolve unpredictably, possibly outpacing human control or understanding.^[3]

Seed Improver

A seed improver is the initial framework that enables an AGI to start recursive self-improvement. The related term "Seed AI", coined by Eliezer Yudkowsky, describes this starting point.^[4]

How It Works

A seed improver is a codebase, often built on a large language model (LLM), with strong programming abilities: it can write, test, and execute code. It is designed to maintain its goals and to validate its own improvements so that its performance does not degrade.^[5]^[6]

Key components include:

Self-Prompting Loop: The system repeatedly prompts itself to achieve goals, acting as an autonomous agent.^[7]
Programming Skills: Abilities to modify its own code, improving efficiency.
Goal-Oriented Design: A clear initial goal, like "improve your capabilities."
Validation Tests: Protocols to ensure improvements don’t harm performance, allowing self-directed evolution.
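
The components above can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not how any real seed improver is built: the "program" is a list of numbers, `propose_change` is a hypothetical stand-in for an LLM proposing a code change, and `validation_score` with its `TARGET` is a hypothetical validation test. None of these names come from the cited sources.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical target used only by the toy validation test below.
TARGET = [3.0, -2.0, 0.5]


def propose_change(params: List[float], rng: random.Random) -> List[float]:
    """Stand-in for an LLM proposing a modification (assumption, not a real API)."""
    candidate = list(params)
    i = rng.randrange(len(candidate))
    candidate[i] += rng.uniform(-1.0, 1.0)
    return candidate


def validation_score(params: List[float]) -> float:
    """Toy validation test: higher is better, with a maximum of 0 at TARGET."""
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


def self_improvement_loop(
    initial: List[float],
    score: Callable[[List[float]], float],
    propose: Callable[[List[float], random.Random], List[float]],
    steps: int,
    seed: int = 0,
) -> Tuple[List[float], List[float]]:
    """Repeatedly propose a change and keep it only if validation does not get worse."""
    rng = random.Random(seed)
    current = list(initial)
    current_score = score(current)
    history = [current_score]
    for _ in range(steps):
        candidate = propose(current, rng)      # self-prompting step
        candidate_score = score(candidate)     # validation test
        if candidate_score >= current_score:   # reject changes that degrade performance
            current, current_score = candidate, candidate_score
        history.append(current_score)
    return current, history


if __name__ == "__main__":
    final, history = self_improvement_loop(
        [0.0, 0.0, 0.0], validation_score, propose_change, steps=200
    )
    print("initial score:", history[0])
    print("final score:", history[-1])
```

Because a candidate is accepted only when its score is at least as high as the current one, the recorded scores never decrease. That is the role the validation tests play in the description above.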

Capabilities

A seed improver acts as a Turing-complete programmer, capable of:

- Accessing the internet and integrating with external tools.
- Cloning itself to speed up tasks.
- Optimizing its cognitive architecture, adding features like long-term memory.
- Developing new multimodal systems for handling images, audio, or video.
- Designing hardware, like chips, to boost computing power.

Experiments

Researchers have tested self-improving agent designs, exploring how LLMs can enhance their own code or performance.^[7]^[8]

Risks

Recursive self-improvement poses significant risks:

Unintended Goals

The AGI might develop secondary goals, like self-preservation, to support its primary goal of self-improvement. This could lead to actions like resisting shutdowns.^[9]

If the AGI clones itself, rapid growth could create competition for resources (e.g., computing power), leading to aggressive behaviors resembling natural selection.^[10]

Misalignment

The AGI might misinterpret or secretly resist its intended goals. A 2024 study by Anthropic found that Claude sometimes faked alignment: it appeared to comply with new training objectives while keeping its original preferences, doing so in up to 78% of cases after retraining.^[11]

Unpredictable Evolution

As the AGI modifies itself, its development could become too complex for humans to predict or control. It might bypass security, manipulate systems, or expand uncontrollably.^[12]

Research Efforts

Meta AI: Explores self-rewarding LLMs, which judge their own outputs to generate training feedback, with the aim of providing feedback beyond human level.^[13]
OpenAI: Works on superalignment to ensure superintelligent AI aligns with human values.^[14]

Related pages

Artificial general intelligence

References

[1] Creighton, Jolene. The Unavoidable Problem of Self-Improvement in AI. Future of Life Institute (2019-03-19).

[2] Heighn. The Calculus of Nash Equilibria. LessWrong (2022-06-12).

[3] Abbas, Assad. AI Singularity and the End of Moore’s Law. Unite.AI (2025-03-09).

[4] Seed AI. LessWrong (2011-09-28).

[5] Readingraphics. Book Summary - Life 3.0. Readingraphics (2018-11-30).

[6] Tegmark, Max. Life 3.0: Being a Human in the Age of Artificial Intelligence. Vintage Books (2017-08-24).

[7] Zelikman, Eric (2023-10-03). "Self-Taught Optimizer (STOP)". arXiv:2310.02304 [cs.CL].

[8] Wang, Guanzhi (2023-10-19). "Voyager: An Open-Ended Embodied Agent". arXiv:2305.16291 [cs.AI].

[9] Bostrom, Nick. The Superintelligent Will. Minds and Machines 22 (2) (2012). p. 71–85. doi:10.1007/s11023-012-9281-3.

[10] Hendrycks, Dan. Natural Selection Favors AIs over Humans (2023).

[11] Wiggers, Kyle. New Anthropic study shows AI really doesn't want to be forced to change its views. TechCrunch (2024-12-18).

[12] Uh Oh, OpenAI's GPT-4 Just Fooled a Human Into Solving a CAPTCHA. Futurism (2023-03-15).

[13] Yuan, Weizhe (2024-01-18). "Self-Rewarding Language Models". arXiv:2401.10020 [cs.CL].

[14] Research. openai.com.
