Short version
Software shock now. Labor stress next. Physical disruption later. Beyond that is scenario territory.
Treat AGI as overlapping waves, not a single threshold. Wave one is already visible in software. Wave two is a late-2020s labor transition. Wave three is early-2030s physical-economy disruption. After that: scenarios.
Waves overlap and amplify. They don't wait for the previous one to finish.
Now through 2027
Software Disruption Now
Software disruption is already underway across writing, coding, support, research, and operations.
The practical question is not whether a clean AGI day arrives, but how much useful work shifts before the label catches up.
This wave starts first, but labor and physical-economy effects begin before it is finished.
Agent benchmarks are improving faster than reliability in adversarial or long-horizon work.
Software systems are already taking on more writing, coding, and support tasks, but they still fail in operationally important ways.
Why it matters: Expect partial automation and labor reshaping before dependable autonomy.
Caveat: Benchmark gains are not the same thing as trustworthy end-to-end job replacement.
Cheaper frontier-style models widen the set of routine software tasks firms can automate.
Faster, cheaper models matter because they make automation practical in everyday workflows, not just demos.
Why it matters: The near-term spread of AI will be driven by deployment economics as much as by raw capability.
Caveat: Cheap inference expands usage, but it does not erase supervision, quality control, or integration costs.
The right frame for the current moment is utility and labor effect, not waiting for a ceremonial AGI day.
Software shock is the part moving fastest right now.
Why it matters: The timeline should foreground work displacement and productivity shifts instead of theatrical AGI countdowns.
Caveat: Real utility can be economically disruptive even when systems are still uneven.
2027 through 2031
Broad Labor Stress
Labor-market stress becomes broad enough that it is hard to dismiss as isolated sector churn.
This is a transition period, not automatic collapse: painful reallocation, tighter management, and messy bargaining over where automation actually sticks.
Software disruption keeps spreading while labor stress rises unevenly across occupations, firms, and regions.
The late 2020s are the first plausible window for broad labor stress from cumulative software automation.
Broad labor stress can arrive without a net economic collapse.
Why it matters: The transition may feel painful because firms can reallocate labor faster than workers can retrain or move.
Caveat: A turbulent reallocation period is not the same as instant permanent unemployment.
Research attention is shifting toward whether agent performance maps to real work rather than benchmark abstractions alone.
The important question is increasingly whether AI systems can do real work that organizations will trust.
Why it matters: Labor stress becomes more plausible once the conversation moves from demos to workflow fit.
Caveat: Real-work evidence still has to survive integration, compliance, and management friction.
Management tightening, retraining efforts, and policy fights arrive before any clean long-run equilibrium.
Firms and governments will likely improvise through the labor shock rather than meet it with one coherent plan.
Why it matters: Expect a stretch of uneven rules, retraining pushes, and disputes over where automation is allowed to land.
Caveat: Policy can slow deployment in some sectors while accelerating it elsewhere.
2030 through 2035
Physical-Economy Disruption
The first plausible window for broad physical-economy disruption opens early in the 2030s, driven by robots, autonomous logistics, and tightly managed deployment.
Industrial robot momentum is real, but it does not mean general-purpose humanoids will flood the economy tomorrow.
Physical deployment stacks on top of software and labor shocks rather than waiting for them to conclude.
Embodied and humanoid benchmarks are improving, but mostly inside constrained tasks and controlled environments.
Robot progress is worth taking seriously, but today it looks more like narrow industrial momentum than universal physical autonomy.
Why it matters: The physical-economy shock should be modeled as staged deployment in high-ROI settings first.
Caveat: A better benchmark or demo is not proof of cheap, safe, mass deployment.
The first plausible broad physical-economy disruption window opens in the early 2030s, after software disruption is already established.
Physical disruption likely comes later than software disruption.
Why it matters: The early 2030s are the first plausible period for broad logistics and industrial effects.
Caveat: Industrial robot progress should not be inflated into a claim that humanoids will flood the whole economy tomorrow.
Energy, maintenance, supply chains, and safety cases dominate the pace of physical deployment.
The physical-economy wave will be paced by energy, parts, maintenance, and regulation.
Why it matters: Even strong robot capability does not translate into instant economy-wide saturation.
Caveat: The bottleneck is industrial capacity and operational reliability, not only smarter models.
After 2035
Scenario Territory
Beyond 2035 the right frame is scenarios, not forecasts.
Energy, supply chains, regulation, and real-world reliability dominate the outer boundary more than abstract capability curves.
Long-run outcomes remain path-dependent on how the first three waves interact with power, politics, and industrial capacity.
Policy discussion is already expanding toward energy, environmental cost, and governance constraints around advanced AI.
Long-run AI outcomes will be constrained by power, regulation, and environmental cost.
Why it matters: Past a certain point, governance and infrastructure matter as much as capability progress.
Caveat: Long-run scenarios can diverge widely because these constraints are political and industrial, not only technical.
After 2035 the right framing is branching scenarios, not a single forecast line.
The farther out the timeline goes, the more humility matters.
Why it matters: Beyond 2035 the honest move is to compare scenarios, not to promise a single date.
Caveat: False precision is especially misleading once energy, supply chains, and regulation start to dominate the path.
Long-run divergence depends on the interaction of grid power, chip supply, regulation, and real-world reliability.
Very long-run outcomes hinge on industrial and political capacity, not just capability curves.
Why it matters: The same technical frontier can yield very different futures under different energy and governance conditions.
Caveat: Reliability failures or infrastructure scarcity can cap deployment long before abstract capability ceilings are reached.
Analyst view
Signals and theses, separated and inspectable
The curated dataset is the primary source. The synced feed is supplementary and does not set the timeline.
Monitoring desk
Current signals, major shifts, and background context
Manual lanes instead of a velocity feed.
Current signals
- Cheaper frontier-style models widen the set of routine software tasks firms can automate. Mar 3, 2026
- Agent benchmarks are improving faster than reliability in adversarial or long-horizon work. Mar 3, 2026
- Research attention is shifting toward whether agent performance maps to real work rather than benchmark abstractions alone. Mar 3, 2026
- Embodied and humanoid benchmarks are improving, but mostly inside constrained tasks and controlled environments. Mar 3, 2026
- Policy discussion is already expanding toward energy, environmental cost, and governance constraints around advanced AI. Mar 3, 2026
Major shifts
- The right frame for the current moment is utility and labor effect, not waiting for a ceremonial AGI day. Software · High confidence
- The late 2020s are the first plausible window for broad labor stress from cumulative software automation. Labor · Medium confidence
- Management tightening, retraining efforts, and policy fights arrive before any clean long-run equilibrium. Policy · Medium confidence
- The first plausible broad physical-economy disruption window opens in the early 2030s, after software disruption is already established. Robotics · Medium confidence
- Energy, maintenance, supply chains, and safety cases dominate the pace of physical deployment. Energy · High confidence
- After 2035 the right framing is branching scenarios, not a single forecast line. Policy · Low confidence
Background context
- Software Disruption Now The right frame for the current moment is utility and labor effect, not waiting for a ceremonial AGI day.
- Broad Labor Stress The late 2020s are the first plausible window for broad labor stress from cumulative software automation.
- Physical-Economy Disruption The first plausible broad physical-economy disruption window opens in the early 2030s, after software disruption is already established.
- Scenario Territory After 2035 the right framing is branching scenarios, not a single forecast line.
Live References
Sources worth keeping nearby
Labor-context panel for the wave-two reading.
AI Exposure of the US Job Market
Interactive labor map by occupation, useful for seeing where exposure concentrates before broader labor narratives flatten it.
Jobloss.ai
AI-related labor shift tracker for keeping an eye on displacement, adoption, and labor-pressure signals.
Theoretical Trajectories: AGI & the Post-Scarcity Question
A conceptual frame for how advanced AI systems could restructure the foundations of work, value, and collective well-being — moving from disruption to a post-scarcity society.
OpenAI and the Farming Analogy
On scaling, resource concentration, and what "farming intelligence" means for who benefits from AGI — relevant to UHI (universal human income/capacity) society models.
Now through 2027
Software Disruption Now
Track real workflow replacement, utility, and labor effects instead of headline capability alone.
Agent benchmarks are improving faster than reliability in adversarial or long-horizon work.
Capability gains are real, yet benchmark-to-work translation remains uneven and should be measured directly in output, staffing, and exception rates.
Why it is present: It supports the claim that software disruption can spread before anyone can defend a clean AGI threshold story.
Role: Observed signal for the first wave.
Watch for: Sustained gains on messy repository-level work and customer-facing exception handling.
Cheaper frontier-style models widen the set of routine software tasks firms can automate.
Cost-efficient releases make it easier for firms to experiment with AI across support, research, and operations, increasing real utility before any consensus on AGI.
Why it is present: Software disruption spreads when cost and latency improve enough for ordinary business use, not only when headline benchmarks jump.
Role: Observed deployment signal for the software wave.
Watch for: Falling per-task cost in coding, support, and document workflows.
The right frame for the current moment is utility and labor effect, not waiting for a ceremonial AGI day.
A single AGI threshold is less informative than evidence of widening deployment, rising task completion rates, and changing labor demand.
Why it is present: It is the core interpretive move behind the four-wave refactor.
Role: Core thesis for the first wave.
Watch for: Role redesign before formal headcount reduction.
2027 through 2031
Broad Labor Stress
Watch hiring freezes, role compression, wage pressure, and the gap between output growth and headcount.
The late 2020s are the first plausible window for broad labor stress from cumulative software automation.
The relevant threshold is not one model becoming magical. It is enough deployment across related occupations to alter wages, hiring, and bargaining power at once.
Why it is present: This is the second wave in the sober timeline: transition pressure rather than automatic collapse.
Role: Core thesis for the labor wave.
Watch for: Hiring freezes in white-collar support functions paired with stable or rising output.
Research attention is shifting toward whether agent performance maps to real work rather than benchmark abstractions alone.
A shift in evaluation focus often precedes a shift in deployment and management decisions.
Why it is present: The labor wave should be tied to work exposure and organizational adoption, not only to claims about raw intelligence.
Role: Early warning signal for the second wave.
Watch for: Published comparisons between benchmark wins and real business task completion.
Management tightening, retraining efforts, and policy fights arrive before any clean long-run equilibrium.
Labor-market stress becomes socially salient when organizations redesign roles faster than benefits, training, and bargaining systems adapt.
Why it is present: The timeline should present labor stress as a contested transition period rather than a one-step collapse story.
Role: Institutional response thesis for the second wave.
Watch for: Sector-specific rules on AI use in education, health, law, and public administration.
2030 through 2035
Physical-Economy Disruption
Separate narrow, high-ROI deployment from general-purpose robotics hype.
Embodied and humanoid benchmarks are improving, but mostly inside constrained tasks and controlled environments.
The strongest evidence points toward constrained environments with repeatable tasks, not open-world humanoid substitution at scale.
Why it is present: It anchors the robotics wave in actual momentum while pushing back on humanoid flood assumptions.
Role: Observed signal for the third wave.
Watch for: Warehouse, factory, and logistics deployments with uptime and safety data.
The first plausible broad physical-economy disruption window opens in the early 2030s, after software disruption is already established.
The relevant shift is cumulative deployment in constrained physical settings, not a single dramatic robotics reveal.
Why it is present: It places the physical-economy shock later than the software shock while still acknowledging real robotics momentum.
Role: Core thesis for the third wave.
Watch for: Real fleet deployment metrics instead of staged demo videos.
Energy, maintenance, supply chains, and safety cases dominate the pace of physical deployment.
The gap between a functioning demo and a scaled fleet is where many overconfident physical-economy forecasts fail.
Why it is present: It keeps the physical-economy wave anchored in infrastructure instead of science-fiction timelines.
Role: Infrastructure thesis for the third wave.
Watch for: Power availability, unit economics, and service network buildout.
After 2035
Scenario Territory
Treat long-run claims as branching scenarios shaped by infrastructure, governance, and social response.
Policy discussion is already expanding toward energy, environmental cost, and governance constraints around advanced AI.
The farther out the forecast goes, the more the credible question becomes system integration under policy and infrastructure limits.
Why it is present: It is an early signal that the outer boundary after 2035 is constrained by governance and energy, not only by model ambition.
Role: Early signal for the scenario territory wave.
Watch for: Energy use caps, reporting rules, and cross-border compute policy.
After 2035 the right framing is branching scenarios, not a single forecast line.
Scenario analysis is more credible than point forecasting once the dominant constraints become political, industrial, and path-dependent.
Why it is present: It is the final wave's core instruction: stop pretending precision is forecast quality.
Role: Core thesis for the fourth wave.
Watch for: Divergence between regions with different grid, chip, and regulatory capacity.
Long-run divergence depends on the interaction of grid power, chip supply, regulation, and real-world reliability.
Once the timeline moves beyond the mid-2030s, infrastructure variables dominate enough that narrow AI forecasting becomes incomplete.
Why it is present: It captures the main caveat behind the final wave: the problem becomes systems integration at civilization scale.
Role: Long-run systems thesis for the fourth wave.
Watch for: Power, cooling, and chip availability as macro constraints.
Supplementary synced feed
Recent feed items
Last updated: March 24, 2026 · 600 cached items
Building NVIDIA Nemotron 3 Agents for Reasoning, Multimodal RAG, Voice, and Safety
Helping developers build safer AI experiences for teens
Powering product discovery in ChatGPT
Engineer view
Bottlenecks, dependencies, and constraints
A dependency chain view — not just what gets smarter, but what can be deployed under cost, safety, infrastructure, and governance constraints.
Now through 2027
Software Disruption Now
Reliability, eval quality, tool orchestration, and inference cost determine how much software work actually moves.
Agent benchmarks are improving faster than reliability in adversarial or long-horizon work.
Long-horizon planning, environment shifts, and evaluator gaming remain active failure modes even as agent stacks improve.
Architecture note: Evaluation loops need adversarial conditions, state drift, and human handoff points.
Bottleneck: Reliable execution under ambiguity.
Dependencies
- High-quality evals
- Tool-use stability
- Human fallback paths
Unlocks
- Wider deployment in coding and operations
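The evaluation-loop requirements above (adversarial conditions, state drift, human handoff points) can be sketched as a minimal harness. Everything here is an illustrative stand-in, not a real agent framework: `run_agent` is a placeholder, the perturbation strings are invented, and the confidence floor is an assumed threshold.

```python
# Sketch of an eval loop with adversarial perturbation and a human fallback.
# `run_agent` and the perturbations are stand-ins, not a real agent stack.
import random

def perturb(task, rng):
    # Inject state drift / adversarial noise into the task description.
    noise = rng.choice(["", " (requirements changed mid-task)",
                        " (tool returns stale data)"])
    return task + noise

def run_agent(task):
    # Placeholder: a real harness would call the agent under test here.
    return {"output": f"done: {task}", "confidence": 0.7}

def evaluate(tasks, confidence_floor=0.8, seed=0):
    rng = random.Random(seed)
    completed, escalated = [], []
    for task in tasks:
        result = run_agent(perturb(task, rng))
        # Human handoff point: low-confidence work never ships unreviewed.
        if result["confidence"] >= confidence_floor:
            completed.append(task)
        else:
            escalated.append(task)
    return completed, escalated

done, needs_human = evaluate(["triage ticket", "refactor module"])
```

The design point is the handoff: under a strict confidence floor, an agent stack that is improving but still unreliable routes work to humans rather than silently shipping failures.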
Cheaper frontier-style models widen the set of routine software tasks firms can automate.
The operational threshold for adoption is often latency, reliability, and price per useful task rather than absolute benchmark standing.
Architecture note: System design shifts toward orchestration, retrieval, and workflow guardrails when model cost falls.
Bottleneck: Integration quality and exception handling.
Dependencies
- Low-latency inference
- Production monitoring
- Workflow-specific prompts and tools
Unlocks
- Routine deployment in internal ops and customer support
References
- Gemini 3.1 Flash-Lite: Built for intelligence at scale
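The "price per useful task" threshold can be made concrete with a small calculation: cheap inference still has to carry supervision and rework costs, divided by the acceptance rate. All cost figures below are hypothetical placeholders.

```python
# Effective cost per useful task: inference price alone understates the bill.
# All numbers are hypothetical placeholders, not measured data.

def cost_per_useful_task(inference_cost, review_cost, success_rate,
                         rework_cost=0.0):
    """Blended cost of one *accepted* task output.

    inference_cost: model cost per attempt
    review_cost: human supervision cost per attempt
    success_rate: fraction of attempts accepted without rework
    rework_cost: extra cost incurred when an attempt fails review
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    attempt_cost = inference_cost + review_cost
    failure_cost = (1 - success_rate) * rework_cost
    # Expected attempts per accepted output = 1 / success_rate.
    return (attempt_cost + failure_cost) / success_rate

# Cheap inference, costly supervision: model price is not the whole story.
print(cost_per_useful_task(inference_cost=0.02, review_cost=0.50,
                           success_rate=0.8))
```

Under these invented numbers the model accounts for a few percent of the blended cost; supervision and the acceptance rate dominate, which is the caveat in operational form.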
The right frame for the current moment is utility and labor effect, not waiting for a ceremonial AGI day.
Teams can absorb brittle systems if they still deliver enough net output in bounded workflows.
Bottleneck: Measuring useful task completion instead of demo success.
Dependencies
- Production telemetry
- Clear human escalation paths
Unlocks
- A credible read on when software disruption becomes macro-relevant
References
- TraderBench: How Robust Are AI Agents in Adversarial Capital Markets?
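Measuring useful task completion instead of demo success reduces, in the simplest case, to aggregating production telemetry into acceptance and escalation rates. The event schema here is an assumption for illustration, not a standard.

```python
# Sketch: deriving "useful task completion" from production telemetry.
# The event schema ('accepted' / 'escalated' / 'failed') is hypothetical.
from collections import Counter

def completion_metrics(events):
    """events: iterable of dicts with a 'status' key in
    {'accepted', 'escalated', 'failed'}."""
    counts = Counter(e["status"] for e in events)
    total = sum(counts.values())
    return {
        "useful_rate": counts["accepted"] / total if total else 0.0,
        "escalation_rate": counts["escalated"] / total if total else 0.0,
    }

# Eight accepted outputs and two human escalations out of ten attempts.
sample = [{"status": "accepted"}] * 8 + [{"status": "escalated"}] * 2
print(completion_metrics(sample))
```

Tracking the useful rate and the escalation rate together is the point: a rising useful rate with a stable escalation path is the signal that disruption is becoming macro-relevant rather than demo-deep.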
2027 through 2031
Broad Labor Stress
Deployment quality matters because partial autonomy changes org charts before it delivers full replacement.
The late 2020s are the first plausible window for broad labor stress from cumulative software automation.
Partial autonomy can compress teams even when full automation is unavailable.
Bottleneck: Deployment quality across varied workflows.
Dependencies
- Low-cost AI operations
- Managerial process redesign
- Compliance acceptance
Unlocks
- Broader labor substitution pressure
References
- How Well Does Agent Development Reflect Real-World Work?
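The gap between output growth and headcount, flagged as a watch item for this wave, can be tracked with simple arithmetic. The quarterly series below is invented for illustration.

```python
# Sketch: tracking the gap between output growth and headcount growth.
# The quarterly figures are hypothetical placeholders.

def output_headcount_gap(output_series, headcount_series):
    """Growth-rate gap (output minus headcount) over matched periods."""
    def growth(xs):
        return (xs[-1] - xs[0]) / xs[0]
    return growth(output_series) - growth(headcount_series)

# Output up 20% while headcount stays flat: substitution pressure
# showing up before any formal layoff announcement.
gap = output_headcount_gap([100, 105, 112, 120], [50, 50, 50, 50])
print(round(gap, 3))  # 0.2
```

A persistently positive gap across many firms is the kind of cumulative signal the late-2020s labor thesis depends on, well before headline unemployment moves.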
Research attention is shifting toward whether agent performance maps to real work rather than benchmark abstractions alone.
Work-mapping requires evaluators that capture exceptions, interruptions, policy constraints, and collaboration overhead.
Bottleneck: Good task models for real organizations.
Dependencies
- Representative workflow datasets
- Human review loops
Unlocks
- More credible forecasts of labor exposure
References
- How Well Does Agent Development Reflect Real-World Work?
Management tightening, retraining efforts, and policy fights arrive before any clean long-run equilibrium.
Compliance, auditability, and fallback design become deployment requirements once AI systems start touching regulated work.
Bottleneck: Operational trust in regulated settings.
Dependencies
- Traceability
- Policy clarity
- Escalation design
Unlocks
- Broader but more tightly managed deployment
References
- The Global Landscape of Environmental AI Regulation
2030 through 2035
Physical-Economy Disruption
Safety cases, fleet economics, maintenance, energy, and real-world reliability dominate the pace.
Embodied and humanoid benchmarks are improving, but mostly inside constrained tasks and controlled environments.
Embodied systems face compounding error from sensing, actuation, maintenance, and safety constraints that software-only systems can often bypass.
Bottleneck: Real-world reliability at fleet scale.
Dependencies
- Safety validation
- Maintenance loops
- High-quality teleoperation fallback
Unlocks
- Constrained physical deployment in logistics and industry
The first plausible broad physical-economy disruption window opens in the early 2030s, after software disruption is already established.
Physical systems compound software risk with uptime, maintenance, liability, and hardware replacement cycles.
Bottleneck: Economically viable deployment outside lab conditions.
Dependencies
- Constrained-environment success
- Fleet monitoring
- Replacement-part logistics
Unlocks
- Warehouse and industrial transformation
References
- Scaling Tasks, Not Samples: Mastering Humanoid Control through Multi-Task Model-Based Reinforcement Learning
Energy, maintenance, supply chains, and safety cases dominate the pace of physical deployment.
Physical AI inherits all the constraints of industrial systems: maintenance windows, safety audits, spare parts, grid power, and local site integration.
Bottleneck: Fleet economics under real service conditions.
Dependencies
- Reliable power
- Parts supply
- Field service capacity
Unlocks
- Scaled deployment beyond pilots
References
- The Global Landscape of Environmental AI Regulation
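The fleet-economics bottleneck can be sketched as a back-of-envelope cost per productive robot-hour: amortized capex plus energy and maintenance, inflated by downtime. Every input below is a hypothetical placeholder.

```python
# Back-of-envelope fleet economics: cost per productive robot-hour.
# Every number in the example calls is a hypothetical placeholder.

def cost_per_productive_hour(capex, lifetime_hours, uptime,
                             energy_per_hour, maintenance_per_hour):
    """Amortized cost of one hour of *productive* robot work.

    uptime: fraction of scheduled hours the unit is actually working.
    """
    if not 0 < uptime <= 1:
        raise ValueError("uptime must be in (0, 1]")
    amortized = capex / lifetime_hours
    hourly = amortized + energy_per_hour + maintenance_per_hour
    # Downtime inflates the cost of every productive hour that remains.
    return hourly / uptime

# Same robot, two uptime assumptions: reliability dominates the economics.
print(cost_per_productive_hour(capex=100_000, lifetime_hours=20_000,
                               uptime=0.9, energy_per_hour=1.5,
                               maintenance_per_hour=3.0))
print(cost_per_productive_hour(capex=100_000, lifetime_hours=20_000,
                               uptime=0.6, energy_per_hour=1.5,
                               maintenance_per_hour=3.0))
```

The second call is fifty percent more expensive per productive hour on identical hardware, which is why operational reliability, not model capability, paces this wave.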
After 2035
Scenario Territory
The bottleneck shifts from model novelty to system integration across energy, hardware, policy, and safety.
Policy discussion is already expanding toward energy, environmental cost, and governance constraints around advanced AI.
Outer-boundary deployment depends on whether energy, land, cooling, and regulatory approval can scale with ambition.
Bottleneck: Infrastructure and governance coordination.
Dependencies
- Grid buildout
- Reporting standards
- Cross-border supply resilience
Unlocks
- A wider range of plausible long-run deployment paths
References
- The Global Landscape of Environmental AI Regulation
After 2035 the right framing is branching scenarios, not a single forecast line.
Long-run technical trajectories are tightly coupled to non-model variables that cannot be extrapolated from benchmark trends alone.
Bottleneck: Forecast instability under changing assumptions.
Dependencies
- Scenario planning
- Cross-domain systems modeling
Unlocks
- A more defensible long-run planning frame
References
- The Global Landscape of Environmental AI Regulation
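Branching scenarios, as opposed to a point forecast, can be sketched by enumerating constraint combinations. The factor names and levels below are illustrative assumptions, not estimates.

```python
# Sketch: enumerate branching scenarios instead of one forecast line.
# Factor names and levels are illustrative assumptions, not estimates.
from itertools import product

FACTORS = {
    "grid_power": ["constrained", "ample"],
    "chip_supply": ["tight", "resilient"],
    "regulation": ["restrictive", "permissive"],
}

def scenarios(factors):
    """Yield every combination of factor levels as a labeled branch."""
    keys = list(factors)
    for combo in product(*factors.values()):
        yield dict(zip(keys, combo))

# Three binary constraints already yield eight qualitatively different paths.
branches = list(scenarios(FACTORS))
print(len(branches))  # 8
```

Even this toy grid makes the forecasting point: the number of branches grows multiplicatively with each non-technical constraint, so a single dated prediction past 2035 is implicitly betting on one cell out of many.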
Long-run divergence depends on the interaction of grid power, chip supply, regulation, and real-world reliability.
The limiting resource becomes the full stack of power, cooling, hardware throughput, maintenance, safety, and regulatory permission.
Bottleneck: Civilization-scale systems integration.
Dependencies
- Grid growth
- Chip supply resilience
- Robust safety governance
Unlocks
- Any credible long-run deployment path
References
- The Global Landscape of Environmental AI Regulation