AI Agents: From Chat to Action

The evolution of autonomous AI agents from an open-source experiment to a consumer product category

01. The Prototype: Auto-GPT
02. The Plumbing: Infrastructure for Agents
03. The First Agents That Work
04. From Code to Culture

In March 2023, a solo developer released an open-source experiment called Auto-GPT: an AI that could break a goal into its own sub-tasks and pursue them autonomously. It was buggy, expensive, and often useless. Less than three years later, millions of people were using AI agents such as Claude Code, OpenAI Operator, and OpenClaw to write production code, browse the web, and manage their daily tasks. This arc traces the journey from prototype to infrastructure to product: how a hacked-together demo revealed a hunger that the entire industry scrambled to satisfy, and how AI crossed the chasm from conversational to autonomous.

01. The Prototype: Auto-GPT

In March 2023, developer Toran Bruce Richards released Auto-GPT, an open-source experiment built on GPT-4's API. Instead of a chatbot that responded to prompts, Auto-GPT inverted the paradigm: give it a goal ('research the top AI companies and write a report'), and it would autonomously chain LLM calls, web searches, file operations, and code execution until it judged the goal achieved. It went viral on GitHub as the first widely accessible demonstration of an 'AI agent.'

The project had severe limitations: it frequently got stuck in loops, hallucinated confidently, and racked up API costs quickly. The ChaosGPT experiment, in which someone tasked it with 'destroying humanity,' brought mainstream media attention.

But its impact was disproportionate to its technical maturity. Auto-GPT introduced the concept of 'agents' to a generation of developers who had only known 'chatbots.' It demonstrated a latent demand: developers didn't just want AI to talk; they wanted AI to do. The prototype was crude, but the direction was unmistakable.

Key Insight

The first agent was a hack — and that's exactly why it mattered. It proved demand before the infrastructure existed.
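
The goal-driven loop described in this section can be sketched in a few lines. This is a minimal illustration, not Auto-GPT's actual code: the names `run_agent`, `plan_next`, `scripted_planner`, and the two toy tools are all hypothetical, and the "LLM" is replaced by a fixed script so the example runs offline. It also shows two simple guards against the failure modes mentioned above, a step budget (to cap cost) and a check for a repeated action (to catch loops).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical action format: (tool_name, argument). "finish" ends the run.
Action = Tuple[str, str]
History = List[Tuple[Action, str]]


@dataclass
class AgentRun:
    goal: str
    history: History = field(default_factory=list)
    status: str = "running"


def run_agent(
    goal: str,
    plan_next: Callable[[str, History], Action],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> AgentRun:
    """Repeatedly ask the planner for an action and execute it with a tool."""
    run = AgentRun(goal=goal)
    for _ in range(max_steps):
        action = plan_next(goal, run.history)
        name, arg = action
        if name == "finish":
            run.status = "done"
            return run
        # Loop guard: stop if the planner repeats its previous action exactly.
        if run.history and run.history[-1][0] == action:
            run.status = "stuck_in_loop"
            return run
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool: {name}"
        run.history.append((action, observation))
    # Cost guard: the step budget ran out before the planner said "finish".
    run.status = "step_budget_exhausted"
    return run


def scripted_planner(goal: str, history: History) -> Action:
    """Stand-in for an LLM call: a fixed script indexed by steps taken so far."""
    script = [
        ("search", "top AI companies"),
        ("write_file", "report.txt"),
        ("finish", ""),
    ]
    return script[min(len(history), len(script) - 1)]


def looping_planner(goal: str, history: History) -> Action:
    """A planner that never makes progress, to exercise the loop guard."""
    return ("search", "top AI companies")


TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for {query!r}",
    "write_file": lambda path: f"wrote {path}",
}

if __name__ == "__main__":
    print(run_agent("write a report", scripted_planner, TOOLS).status)
    print(run_agent("write a report", looping_planner, TOOLS).status)
```

In a real agent, `plan_next` would be a model call that sees the goal and the history of observations; everything else about the loop stays the same, which is why a hack of this shape was enough to demonstrate the idea.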

02. The Plumbing: Infrastructure for Agents

Auto-GPT proved that agents were wanted, but it also proved that point solutions wouldn't scale. Every tool integration required custom code; every AI model needed its own adapter; there was no standard way for agents to interact with the world.

Two developments in late 2024 addressed this. In November, Anthropic open-sourced the Model Context Protocol (MCP), a standardized protocol for connecting AI systems to external tools, data sources, and services. Any MCP-compatible agent could now connect to any MCP server, whether it fronted Google Drive, Slack, GitHub, or a database. In December, Google launched Gemini 2.0 alongside Project Mariner, a research prototype of an AI agent that could browse the web and complete tasks inside the browser. Project Mariner showed what agents could do when they had the right tools: navigating websites, filling forms, managing multi-step workflows.

Together, MCP and Mariner addressed the two prerequisites for practical agents: a standard way to connect to tools, and a demonstration of what that connectivity enabled. The plumbing was being laid. The agent era was no longer a question of 'if' but 'how fast.'

Key Insight

The agent revolution needed a protocol as much as it needed models — MCP was the unsung infrastructure that made everything else possible.
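
To make the "any agent, any server" idea concrete, here is a toy, in-process sketch of a uniform tool interface. It is not the MCP SDK and does not speak the real protocol over a transport: `ToyToolServer`, `call`, and the two example tools are hypothetical, and the message shapes (JSON-RPC 2.0 style requests with `tools/list` and `tools/call` methods) are simplified from my reading of the MCP specification and should be checked against it. The point is only that one client function can discover and invoke tools on any server that honors the same contract.

```python
import json
from typing import Any, Callable, Dict


class ToyToolServer:
    """In-process stand-in for a tool server; NOT the real MCP SDK."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._tools: Dict[str, Dict[str, Any]] = {}

    def register(self, tool_name: str, description: str, fn: Callable[..., str]) -> None:
        self._tools[tool_name] = {"description": description, "fn": fn}

    def handle(self, raw_request: str) -> str:
        """Take a JSON-RPC 2.0 style request string and return a response string."""
        req = json.loads(raw_request)
        method = req["method"]
        params = req.get("params", {})

        def error(code: int, message: str) -> str:
            return json.dumps({"jsonrpc": "2.0", "id": req["id"],
                               "error": {"code": code, "message": message}})

        if method == "tools/list":
            result: Dict[str, Any] = {
                "tools": [{"name": n, "description": t["description"]}
                          for n, t in self._tools.items()]
            }
        elif method == "tools/call":
            tool = self._tools.get(params.get("name"))
            if tool is None:
                return error(-32602, "Unknown tool")  # JSON-RPC "invalid params"
            text = tool["fn"](**params.get("arguments", {}))
            result = {"content": [{"type": "text", "text": text}]}
        else:
            return error(-32601, "Method not found")  # JSON-RPC standard code
        return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})


def call(server: ToyToolServer, method: str, params: Dict[str, Any],
         req_id: int = 1) -> Dict[str, Any]:
    """Client side: the same function works against every server above."""
    request = json.dumps({"jsonrpc": "2.0", "id": req_id,
                          "method": method, "params": params})
    return json.loads(server.handle(request))


# Two unrelated (fake) backends exposed through the same interface.
files = ToyToolServer("files")
files.register("read_file", "Return the contents of a (fake) file",
               lambda path: f"contents of {path}")

chat = ToyToolServer("chat")
chat.register("post_message", "Post a message to a (fake) channel",
              lambda channel, text: f"posted to {channel}: {text}")

if __name__ == "__main__":
    for server in (files, chat):
        listing = call(server, "tools/list", {})
        print(server.name, [t["name"] for t in listing["result"]["tools"]])
```

Before a shared protocol, each of these backends would have needed its own adapter in every agent; with one, adding a backend means adding a server, and the client code does not change.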

03. The First Agents That Work

January and February 2025 brought the first wave of agents that could be handed real work. OpenAI launched Operator on January 23 as a research preview: an AI agent that could browse the web autonomously, fill forms, book tickets, and complete multi-step online tasks. Unlike Auto-GPT's fragile chaining of API calls, Operator interacted with web pages visually, using a model trained to 'see' and 'click' like a human user. It was limited (slow, expensive, sometimes wrong), but it worked: the first agent that a non-technical user could hand a real task.

A month later, Anthropic released Claude Code, initially also as a research preview: a terminal-based coding agent that could understand entire codebases, write patches, and execute multi-step development tasks. Unlike earlier AI coding tools such as Copilot and Cursor, which assisted within an IDE, Claude Code functioned as an autonomous agent: given a bug report, it would find the issue, write a fix, test it, and open a pull request. By mid-2025, Claude Code had become one of the first agents that development teams trusted with production code.

Operator and Claude Code revealed that the agent market had two entry points: the web (for everyone) and the terminal (for developers). Both had crossed the threshold from 'interesting demo' to 'useful tool.'

Key Insight

The first production agents entered through two doors: web browsing for everyone, and coding for developers.

04. From Code to Culture

As agents proved themselves in the terminal, they began changing how people thought about programming itself. In February 2025, OpenAI co-founder Andrej Karpathy coined the term 'vibe coding': the practice of letting AI write the code while the human merely describes what they want, sometimes without even reading the output. The term went viral because it named something real: agents were not just assisting programmers; they were redefining what it meant to program.

By January 2026, agents had reached consumers. OpenClaw, a personal AI agent that could manage schedules, browsing, shopping, and communications, went viral, becoming the first mainstream personal AI agent phenomenon. Unlike Claude Code (for developers) and Operator (for web tasks), OpenClaw was a general-purpose personal assistant that millions of non-technical users adopted into their daily lives. Its virality proved that agents were not just a developer tool or a business product; they were becoming a consumer category.

The journey from Auto-GPT's rough prototype to OpenClaw's mainstream adoption took less than three years. In that time, agents had evolved from a conceptual demonstration to an infrastructure layer to a set of production tools to a consumer phenomenon. The shift from 'chat' to 'action' was complete.

Key Insight

Agents crossed from developer tool to consumer product in under three years, arguably faster than the smartphone did.

Conclusion

The agent transformation happened with breathtaking speed. In March 2023, the concept of an 'AI agent' was a buggy open-source experiment. By January 2026, millions of people were using agents daily for work and personal tasks. The infrastructure — MCP, visual browsing models, terminal agents — was built in real time, in response to demand that Auto-GPT had revealed. The shift from chat to action is perhaps the most consequential product transition in AI since ChatGPT itself. It changes the fundamental relationship between humans and AI: from asking to delegating. The question that remains is not whether agents will become ubiquitous — that trajectory is set. The question is what happens when agents start delegating to other agents, and humans become spectators in the loops they once controlled.