An essay on AI

When Will AGI Arrive? The Four-Word Gap on the Way to 2030

Continual learning. Long-term reasoning. Memory. Consistency. Four words name the whole remaining distance to AGI, and the timeline that follows is around 2030.

Hanh D. Brown · 11 min read

In this essay

01 When will AGI arrive?
02 What is still missing from current AI to reach AGI?
03 Why do AI models fail at simple problems but solve hard ones?
04 How good are AI agents in 2026?

Artificial General Intelligence (AGI) arrives around 2030. That is the most-cited prediction from inside the world’s leading Artificial Intelligence (AI) lab. The reasoning is not a guess about scaling laws. It is a list. Continual learning. Long-term reasoning. Memory. Consistency. Four words name the whole remaining distance.

Short answer

When will AGI arrive and what is still missing?

When will Artificial General Intelligence (AGI) arrive? Around 2030 is the timeline the leading frontier lab names. The remaining distance is a four-word list of named research problems. Continual learning, long-term reasoning, memory, consistency. Each one is closeable. The bet on whether new big ideas are needed is fifty-fifty.

When will AGI arrive?

Around 2030. That is the timeline named by the head of Google DeepMind. The reasoning is structural. The distance to AGI is a bounded list of named research problems, with active work on each. The number is not a feeling. The number is a list with a clock attached.

This is for a founder, an engineer, or a technologist tracking AI. The reader who has noticed that public coverage is split between “any day now” and “decades away.” Neither framing matches what the people building the systems say in a room with no cameras.

The bet on whether existing AI techniques scale to AGI is roughly fifty-fifty. The other path is one or two new big ideas. That is the most honest probability statement in public about AI in 2026. Fifty-fifty refuses the pessimist framing that the architecture is a dead end. Fifty-fifty also refuses the accelerationist framing that scale alone will close the list.

AGI could arrive in 2028 or 2032 rather than 2030. The bracket is plus or minus two years. Plans built around 2030 hold inside that bracket. Plans built around 2028 or 2032 produce roughly the same decisions today.

That bracket is what the people closest to the problem actually use when they plan their own roadmaps. The number outside the bracket gets the headline. The number inside the bracket gets the budget. A reader holding both numbers reads the year differently from a reader holding only one.

So the headline stays short. 2030, give or take. Nearer than a decade, further than a year.

What is still missing from current AI to reach AGI?

Four words name the gap. Continual learning. Long-term reasoning. Memory. Consistency. Each is a known research problem with active work on it. The remaining distance to AGI is not a wall and not a slope. It is this list.

Figure: the four named gaps to AGI (continual learning, long-term reasoning, memory, and consistency), each with its implementation-level description. Four named gaps. Each one a known research problem. Each one with active work.

Continual learning is the piece that holds AI agents back from finishing whole tasks. Today’s agent is useful for the parts. It cannot adapt to the specific context it is dropped into. A continually-learning agent figures the context out and proceeds.

A mother teaching a child a new word over breakfast is doing the continual learning these systems do not yet have. The child uses the word correctly at dinner the same day, from a single example. A model would need vastly more data to match that. The family routine is the format. Current models have not caught up to a kid in a kitchen.

Memory is the second word. The common assumption is that memory is a capacity problem, and that a million-token context window solves it. Capacity is plentiful. The problem is what gets stored and how it gets retrieved.

An agent processing live video naively at a million tokens covers about twenty minutes. Real life over a month or two does not fit. The shape of the fix is a retrieval system that scales independently of the working window.
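The arithmetic behind that claim can be sketched in a few lines. This is a back-of-the-envelope check, not a measurement: the only inputs taken from the essay are the million-token window and the twenty minutes. The implied token rate is derived from those two numbers, and the assumption that the video runs around the clock is mine.

```python
# Inputs from the essay: window size and the duration it is said to cover.
CONTEXT_TOKENS = 1_000_000
MINUTES_COVERED = 20

# Derived, not measured: the token rate those two numbers imply.
tokens_per_second = CONTEXT_TOKENS / (MINUTES_COVERED * 60)


def tokens_needed(days: float) -> float:
    """Tokens consumed by continuous video over `days` (assumes 24h capture)."""
    return tokens_per_second * days * 24 * 60 * 60


def windows_needed(days: float) -> float:
    """How many full context windows that stream would fill."""
    return tokens_needed(days) / CONTEXT_TOKENS


print(round(tokens_per_second))   # implied rate, roughly 833 tokens per second
print(round(windows_needed(30)))  # one month of video, counted in full windows
```

Under these assumptions a single month fills the window a couple of thousand times over, which is why a bigger window alone does not close the gap.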

Brute-force storage costs you on the lookup, like a librarian who has read every book in the building. The librarian still has to walk to the shelf. The model still has to scan the window. Perfect recall is not free in either case.

Long-term reasoning and consistency are the other two words. Long-term reasoning is chains of inference that hold across many steps. Consistency is performance that does not swing from olympiad-level on Monday to elementary error on Tuesday.

At the implementation level, the four words are not philosophy. Continual learning is a gradient that updates without catastrophic forgetting. Memory is a retrieval index that scales beyond the context window. Long-term reasoning is an inference loop that holds context across many steps. Consistency is a self-check that catches contradictions before they ship. Each is a tractable target.
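To make "a retrieval index that scales beyond the context window" concrete, here is a minimal sketch. It is a toy, not any lab's design: the class name `MemoryIndex`, the word-overlap scoring, and the word-count budget standing in for a token budget are all assumptions made for illustration. The point it shows is structural. The store can grow without limit while the amount handed back to the working window stays fixed.

```python
from collections import defaultdict


class MemoryIndex:
    """Toy external memory: an inverted index from words to stored notes."""

    def __init__(self) -> None:
        self.notes: list[str] = []
        self.postings: dict[str, set[int]] = defaultdict(set)

    def add(self, note: str) -> None:
        """Store a note and index every distinct word in it."""
        note_id = len(self.notes)
        self.notes.append(note)
        for word in set(note.lower().split()):
            self.postings[word].add(note_id)

    def retrieve(self, query: str, budget_words: int) -> list[str]:
        """Return the best-matching notes whose total length fits the budget."""
        scores: dict[int, int] = defaultdict(int)
        for word in set(query.lower().split()):
            for note_id in self.postings.get(word, ()):
                scores[note_id] += 1
        # Highest overlap first; ties broken by insertion order.
        ranked = sorted(scores, key=lambda i: (-scores[i], i))
        picked, used = [], 0
        for note_id in ranked:
            cost = len(self.notes[note_id].split())
            if used + cost <= budget_words:
                picked.append(self.notes[note_id])
                used += cost
        return picked


memory = MemoryIndex()
memory.add("the deploy key lives in the ops vault")
memory.add("lunch on friday is at noon")
memory.add("rotate the deploy key every ninety days")

# Only what fits the budget reaches the working window.
context = memory.retrieve("where is the deploy key", budget_words=8)
print(context)
```

A real system would likely score by meaning rather than shared words. The separation between an unbounded store and a bounded window is the part that carries over.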

Just as important is what the list leaves off. Pre-training is already in the architecture. Reinforcement learning from human feedback is already in the architecture. Chain-of-thought reasoning is already in the architecture. The agent paradigm built on top of these is already in the architecture. The list of pieces that work is longer than the list of pieces that are missing.

Why do AI models fail at simple problems but solve hard ones?

Jagged intelligence names the paradox. A frontier model solves a gold-medal-level International Mathematical Olympiad (IMO) problem on one task. The same model makes an elementary math error on another, depending on phrasing.

Figure: the same model passing gold-medal Mathematical Olympiad problems and failing elementary math, with missing introspection labeled as the shared mechanism. Same model. Same day. The gap is in the self-check, not the knowledge.

In the elementary case the model is not stupid. The model is missing the introspective check that would say “this answer cannot be right.” The check is the gap. The knowledge was always there.

The chess example is the same mechanism in another domain. The head of Google DeepMind plays chess against leading foundation models and watches the thinking traces on the screen at his desk. The model considers a move. Recognizes the move is a blunder. Cannot find anything better. Plays the blunder anyway.

On the surface the behavior looks like overthinking. Underneath it is missing introspection. The model loops where the human would notice the loop and break it. The check the human runs unconsciously is the check the model does not yet have at all.

Like a runner whose form breaks for one stride in every hundred, the model performs at the top of the field most of the time. The failure is the one stride, not the stamina. The fix is the check the runner does not yet run on that stride alone. The repair is local to the moment of failure, not to the whole run.

The mechanism is the same in both cases. Gold-medal-then-error. Consider-blunder-then-play-it. The system is missing the self-check that would say “the answer I am about to give cannot be right.” Introspection on the thought process is the candidate for the missing piece. Whoever builds that check builds the thing the timeline is waiting for.
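The shape of that missing check can be sketched in code. This is an illustration of the idea only, not how any frontier model works: the function names, the toy multiplication task, and the bounds-plus-last-digit test are all hypothetical. What it shows is the one behavior the chess example lacks. When every candidate has been flagged, the loop declines to answer instead of playing the blunder.

```python
from typing import Callable, Iterable, Optional


def answer_with_self_check(
    candidates: Iterable[int],
    is_plausible: Callable[[int], bool],
) -> Optional[int]:
    """Return the first candidate that passes the check, else None.

    Returning None is the point: the loop refuses to ship an answer
    it has already flagged as one that cannot be right.
    """
    for candidate in candidates:
        if is_plausible(candidate):
            return candidate
    return None


def plausible_product(value: int) -> bool:
    """Cheap sanity check for 17 * 24 that avoids redoing the multiplication."""
    lower, upper = 10 * 20, 20 * 30           # the product must lie between these
    last_digit_ok = value % 10 == (7 * 4) % 10  # last digit of 7 * 4
    return lower < value < upper and last_digit_ok


# Two flawed drafts, then a sound one.
print(answer_with_self_check([4008, 48, 408], plausible_product))
# Only flawed drafts: the loop abstains instead of shipping one.
print(answer_with_self_check([4008, 48], plausible_product))
```

The check here is necessary, not sufficient. A wrong answer such as 418 would slip past it, which is a fair picture of why a reliable self-check is still an open problem rather than a solved one.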

How good are AI agents in 2026?

AI agents are at the beginning of the curve, not the middle. The valuable workflows have emerged only in the last few months. Before that, agents were nice demonstrations. The best engineers are reported to be doing roughly 500 to 1000 times the work they did six months ago. That figure describes the current top users, not the average user, and it is far more than most users see today.

Today’s agents are useful for parts of tasks but cannot yet adapt to the specific context they are dropped into. Fire and forget fails. That gap closes when continual learning closes, which puts agents on the same clock as the rest of the four-word list.

Like a carpenter handed a tool that landed on the workbench yesterday, the best users are learning what the tool actually does at the same time they ship work with it. The tool is more capable than the carpenter realizes for some jobs. The tool is less capable for others. Discovery and delivery happen in the same hour at the same desk.

That 500x to 1000x figure is real for only some engineers and some tasks. It does not yet describe the average engineer or the average task. The lag is on the human side: taste, craft, and a fluency with the new tools that none of us had six months ago. The valuable workflow always emerges months after the technology is good enough, not at the moment it becomes good enough.

This pattern is older than the technology itself. Spreadsheets were good enough years before the spreadsheet-driven business arrived. Search was good enough years before the search-driven business arrived. The technology lands. The workflow takes months, sometimes years, to find. The discovery is what makes the productivity claim real for the rest of us.

A household watching a parent type into a chat window at the kitchen table is watching the workflow being discovered in real time. The kid asks what the parent is doing. The parent says the truth: still figuring it out. That is the honest answer for the year.

AGI is not inevitable and it is not impossible. It is a short list of named research problems with active work on each. The bet on whether existing techniques will close the list or new ideas are needed is fifty-fifty. The arrival year that follows is around 2030. The right disposition is to build as if that is true.

That four-word list is the working frame for the year. Continual learning. Long-term reasoning. Memory. Consistency. The reader who keeps the list in mind reads every AI headline of 2026 with a sharper filter than the headlines themselves provide.

Source: Google DeepMind, in conversation at Y Combinator, 2026.

Questions readers ask

Seven questions on this essay.

01 When will AGI arrive?

Around 2030 is the timeline named by the head of Google DeepMind. The reasoning is concrete, not speculative. The remaining distance to AGI is a short list of named research problems: continual learning, long-term reasoning, memory, and consistency. Each problem has active research work on it across DeepMind, OpenAI, Anthropic, and other frontier labs. The internal bet on whether existing techniques will close the list with incremental innovation, or whether one or two new big ideas are required, is roughly fifty-fifty. Whichever path closes the list, 2030 plus or minus two years is the bracket the people building the systems use when they plan their own roadmaps.

02 What is still missing from current AI to reach AGI?

Four named gaps. Continual learning is a system that integrates new knowledge without catastrophic forgetting and adapts to the specific context it is dropped into. Long-term reasoning is chains of inference that stretch beyond what fits in a single context window. Memory is retrieval that scales past the working memory paradigm currently used. Consistency is performance that does not swing from gold-medal Olympiad to elementary math errors depending on how a question is phrased. Each is a known research problem. Each has active work across the frontier labs. The remaining distance to AGI is this list, named, with active research on each item.

03 Why do AI models fail at simple problems but solve hard ones?

The phenomenon is called jagged intelligence and the underlying mechanism is missing introspection. A model solves a gold-medal International Mathematical Olympiad problem on one task. The same model makes elementary math errors on another, depending on phrasing. The failure is not missing knowledge. It is missing the self-check that would say the answer about to be given cannot be right. The chess example is the same mechanism: the model considers a move, recognizes it as a blunder, finds nothing better, and plays the blunder anyway. Introspection on the thought process is the candidate for the missing piece in reasoning.

04 What is continual learning in AI?

Continual learning is the ability of an AI system to integrate new knowledge into its existing knowledge base without retraining and without catastrophic forgetting of what it already knew. Humans do this naturally. Experiences during the day get consolidated during sleep, especially during the dreaming phase, when important episodes get replayed. Today's AI systems mostly cannot do this. They use workarounds like very large context windows or retrieval augmentation, but these are workarounds, not solutions. Continual learning is the missing piece that would let an AI agent adapt to a new context on its own and continue improving without a fresh training run from the operator.
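The replay idea in that answer has a rough software analogue, often called rehearsal or experience replay. The sketch below is a commonly studied mitigation for forgetting, not a solution to continual learning and not a description of any lab's method. The class name `ReplayBuffer`, the string examples, and the batch sizes are hypothetical. It keeps a small fixed-size sample of past examples and mixes a few into every batch of new ones, so an update on today's data also rehearses yesterday's.

```python
import random


class ReplayBuffer:
    """Fixed-size sample of past examples, maintained by reservoir sampling."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self.capacity = capacity
        self.items: list[str] = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example: str) -> None:
        """Keep each example seen so far with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.items[slot] = example

    def mixed_batch(self, new_examples: list[str], replay_count: int) -> list[str]:
        """New examples plus a few old ones, so the update rehearses the past."""
        replayed = self.rng.sample(self.items, min(replay_count, len(self.items)))
        for example in new_examples:
            self.add(example)
        return new_examples + replayed


buffer = ReplayBuffer(capacity=3)
for old in ["monday-1", "monday-2", "monday-3", "monday-4"]:
    buffer.add(old)

# Tuesday's batch carries two replayed Monday examples alongside the new ones.
batch = buffer.mixed_batch(["tuesday-1", "tuesday-2"], replay_count=2)
print(batch)
```

The buffer never grows past its capacity, which is both the appeal and the limit of the approach: it slows forgetting without giving the system a way to decide which episodes matter.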

05 Why is AI memory still limited even with large context windows?

Because the bottleneck is retrieval, not capacity. Even at one million tokens of context, the cost of finding the relevant piece of context for the specific decision a model is making right now is non-trivial. The current practice is brute force: store everything, search across it. An agent processing live video naively at one million tokens covers about twenty minutes, which is not enough for any task that spans days or weeks. The shape of the solution is not a bigger window. It is a memory retrieval system that scales independently of the context window, which puts memory on the same gap list as continual learning.

06 How good are AI agents in 2026?

AI agents are at the beginning of the curve, not the middle. The valuable workflows have emerged only in the last few months. Before that they were nice demonstrations that did not justify the effort put in. The best engineers are doing roughly 500 to 1000 times the work they did six months ago, but this describes the current best users, not the average user. The rest will catch up over six to twelve months as the workflows mature. Today's agents are useful for parts of tasks but cannot adapt to the specific context they are placed in. Fire and forget does not work yet.

07 Can current AI architecture scale all the way to AGI?

This is the fifty-fifty bet inside the leading frontier labs. One path: existing techniques (large-scale pre-training, reinforcement learning, chain of thought, agents) scale to AGI with incremental innovation on continual learning, memory, reasoning, and consistency. The other path: one or two new big ideas are required to close the list. The head of Google DeepMind puts the odds at roughly fifty-fifty between these two paths. Either way, the current architecture is likely part of the final shape, not a dead end that gets discarded later. The disagreement is about whether additions on top of the current architecture are enough.

About the author

Hanh D. Brown, writer.

Hanh D. Brown writes on AI, aging, and the decisions in between. Twenty years building systems for life-stage choices, now writing the publication with time to ask why.

Published June 29, 2026
