The Fate of AI Agents

Question for ChatGPT:

During the development of new media in the 1980s, Videotex was expected to be the leading new medium, but that did not come to pass. During the multimedia boom around 1990, interactive media using CD-i and CD-ROM were developed in droves, but they disappeared within a few years. It can be said that this vision was ultimately realized in the form of the World Wide Web (WWW) with the commercialization of the Internet starting in 1994. In the world of technology, it is not uncommon for initial ideas to be realized only two or three generations later. AI itself, having gone through two booms in the 1950s and 1980s, has finally reached a practical level in this third boom. However, it is unlikely that everything we see in the current AI boom will endure. I believe agents are at the very core of this question, and I would like to sort out what is real and what is an illusion.

ChatGPT’s response:

You are exactly right. I think the best way to understand this is through the pattern that “the idea itself is real, but the first implementation form disappears.”

To state the conclusion first, the core that will remain from the current AI boom is not simply “agents,” but rather an execution intelligence layer that can receive human intent, refer to organization-specific context, use external tools, record its intermediate process, and advance work in a form that humans can verify.

On the other hand, much of what is now being discussed — such as “autonomous agents will do all company work on their own” or “if multiple agents gather, advanced intelligence will naturally emerge” — is largely fantasy.

1. Historically, what fails first is not the “idea,” but the “form”

Videotex had a fairly correct idea: electronic information services for households. However, terminals, communication speed, display capability, content distribution, billing, and standardization were all immature. Britannica describes videotex as an early end-user information system that delivered information to home terminals via telephone lines or cable.

CD-i and CD-ROM multimedia were similar. The idea of interactive educational, entertainment, and information content was valid. But production costs were high, content could not be easily updated, distribution depended on physical media, and network effects were weak. In 1991, CD-i was anticipated as an interactive multimedia device for households, but it ultimately failed to create a major market.

Later, with the emergence of the WWW, the same concept of an “interactive information space” was finally realized through open standards, browsers, URLs, links, search, low-cost publishing, continuous updating, and network effects. Mosaic made it easier for general users to access web pages and image libraries in 1993, and the arrival of Netscape in 1994 popularized the Web.

In other words, what failed was not “new media” itself. What failed was a closed, heavy, non-updatable implementation form that could not bring users, creators, and distributors on board at the same time.

2. The same thing will happen with AI agents

The current agent boom is probably the same.

The broad direction — “AI thinks, investigates, judges, and acts on behalf of humans” — is real. However, many of the agent products we see today are probably not the final winning form. They are more like the videotex phase or the CD-ROM phase.

In fact, Gartner stated in 2025 that AI agents and AI-ready data were among the fastest-moving technologies in AI, while also placing them at the “peak of inflated expectations.” Gartner also predicted that by the end of 2027, more than 40% of agentic AI projects would be canceled due to rising costs, unclear business value, or insufficient risk control.

McKinsey also notes that although AI adoption is spreading, only a minority of companies have deeply embedded advanced capabilities such as agents into business workflows. In its 2026 AI trust survey, McKinsey identified security and risk as the biggest barriers to scaling agentic AI, with inaccuracy and cybersecurity among the main risks.

Industry research also shows that many companies are still at the stage of “AI assistants” or “AI support,” while full-scale multi-agent orchestration remains rare. The central issue is not only capability, but the lack of mechanisms for verifying outputs and safely integrating them into production workflows.

3. What will remain as “real”

I think the following five things will remain.

1. Natural language interfaces

This is unquestionably real.
Humans will no longer directly operate programs, search queries, or business screens. Instead, they will express intent in natural language. This is the next major UI shift after the GUI, the Web, and smartphones.

However, the chat screen itself may not be the final form. In the future, natural language operation will be embedded into business applications, email, CRM, IDEs, internal wikis, BI, accounting, legal systems, and so on.

2. Tool-using AI

The shift from “AI that answers” to “AI that operates” is real.
Searching, reading files, creating tables, writing code, adjusting schedules, drafting emails, calling APIs — all of these clearly have practical value.

However, this will probably settle not as a general-purpose autonomous agent, but as business-specific agents with limited permissions and clearly defined scope.
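
To make “limited permissions and clearly defined scope” concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `ScopedAgent` class, the `TOOL_REGISTRY` dictionary, and the two toy tools are hypothetical names, not part of any real agent framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


# Hypothetical tools; real ones would wrap actual business APIs.
def draft_email(to: str, subject: str) -> str:
    return f"DRAFT to={to} subject={subject}"


def send_payment(amount: int) -> str:
    return f"PAID {amount}"


# Hypothetical registry of every tool that exists in the organization.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    "draft_email": draft_email,
    "send_payment": send_payment,
}


@dataclass(frozen=True)
class ScopedAgent:
    """An agent that may call only the tools it was explicitly granted."""
    name: str
    allowed_tools: FrozenSet[str]

    def call(self, tool: str, **kwargs) -> str:
        # Check the permission before touching the registry at all.
        if tool not in self.allowed_tools:
            raise PermissionError(f"{self.name} may not use {tool!r}")
        return TOOL_REGISTRY[tool](**kwargs)


# A support agent that can draft emails but can never move money.
support_agent = ScopedAgent("support", frozenset({"draft_email"}))
result = support_agent.call(
    "draft_email", to="customer@example.com", subject="Your inquiry"
)
```

In this sketch, a call such as `support_agent.call("send_payment", amount=100)` raises `PermissionError`, because the scope is enforced outside the model rather than by asking the model to behave.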

3. Context layers and memory layers

This is the most important point.
The difference in AI capability will be determined not only by the model itself, but by what context is given to it, what memory is retained, what history is referenced, and what judgment criteria it possesses.

Current RAG is still mostly search-oriented. In the future, what will matter is not just document search, but a layer that preserves organizational decision history, conceptual structures, policies, constraints, failure cases, and tacit knowledge. This is highly consistent with the direction of ThinkNavi and ConceptMiner.
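
As a rough illustration of how such a layer differs from plain document search, the sketch below stores decisions together with their rationale, related concepts, and outcome, and retrieves them by concept overlap. This is only an assumed toy design: `DecisionRecord`, `ContextLayer`, and the sample records are hypothetical, and this is not a description of how ThinkNavi or ConceptMiner work.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class DecisionRecord:
    """One past decision, kept with the context needed to reuse it."""
    decision: str
    rationale: str
    concepts: FrozenSet[str]
    outcome: str  # "success", "failure", or "open"


class ContextLayer:
    """Hypothetical store of decision history, queried by concept."""

    def __init__(self) -> None:
        self._records: List[DecisionRecord] = []

    def add(self, record: DecisionRecord) -> None:
        self._records.append(record)

    def relevant(self, concepts: Iterable[str]) -> List[DecisionRecord]:
        # Rank records by how many concepts they share with the query.
        query = frozenset(concepts)
        hits = [r for r in self._records if r.concepts & query]
        return sorted(hits, key=lambda r: len(r.concepts & query), reverse=True)

    def failure_cases(self, concepts: Iterable[str]) -> List[DecisionRecord]:
        # Past failures are context too, not noise to be filtered out.
        return [r for r in self.relevant(concepts) if r.outcome == "failure"]


layer = ContextLayer()
layer.add(DecisionRecord("Adopt usage-based pricing",
                         "Matches customer cost structure",
                         frozenset({"pricing", "enterprise"}), "success"))
layer.add(DecisionRecord("Launch self-serve onboarding",
                         "Reduce sales cost",
                         frozenset({"onboarding", "enterprise"}), "failure"))
layer.add(DecisionRecord("Sponsor developer meetup",
                         "Community building",
                         frozenset({"marketing"}), "open"))

related = layer.relevant({"pricing", "enterprise"})
lessons = layer.failure_cases({"enterprise"})
```

Here `related` lists the pricing decision first and the onboarding decision second, and `lessons` contains only the failed onboarding launch. A real layer would need far richer similarity than set overlap, but the point is that the unit of retrieval is a decision with its reasoning, not a document chunk.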

4. Human-in-the-loop execution workflows

Rather than complete autonomy, what will remain for the time being is a form in which “AI advances the work, and humans approve at key points.”

Research, summarization, comparison, hypothesis generation, document creation, code generation, checklist creation, task extraction from meeting minutes, and drafting customer response proposals are all strong use cases.

What matters is not just the AI’s output, but whether the intermediate process, grounds, reference information, reasoning, and change history remain visible.
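
The pattern of “AI advances the work, and humans approve at key points” can be sketched as follows. This is a minimal sketch under stated assumptions: the `Step` and `Workflow` classes are hypothetical, the lambdas stand in for model calls, and the `approve` callback stands in for a human reviewer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    name: str
    run: Callable[[str], str]      # takes the previous output, returns a new one
    needs_approval: bool = False   # True marks a human approval point


@dataclass
class Workflow:
    steps: List[Step]
    trace: List[Dict[str, object]] = field(default_factory=list)

    def execute(self, task: str,
                approve: Callable[[str, str], bool]) -> Optional[str]:
        current = task
        for step in self.steps:
            output = step.run(current)
            approved = approve(step.name, output) if step.needs_approval else True
            # Record every intermediate result so the process stays visible.
            self.trace.append({"step": step.name, "input": current,
                               "output": output, "approved": approved})
            if not approved:
                return None  # stop here; later steps never run
            current = output
        return current


def make_workflow() -> Workflow:
    return Workflow([
        Step("research", lambda text: f"notes on {text}"),
        Step("draft", lambda text: f"proposal based on {text}", needs_approval=True),
        Step("send", lambda text: f"sent: {text}"),
    ])


accepted = make_workflow()
final = accepted.execute("pricing inquiry", approve=lambda step, output: True)

rejected = make_workflow()
halted = rejected.execute("pricing inquiry", approve=lambda step, output: False)
```

When the reviewer rejects the draft, `halted` is `None` and `rejected.trace` holds two entries, so the “send” step never runs and the reason for stopping remains inspectable afterward.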

5. Evaluation, auditing, and governance

If agents execute business work, “verifiability” becomes more important than model performance.

Research has also pointed out that agents can hallucinate at each stage of task execution, making detection, mitigation, and monitoring necessary.

In other words, the real market for AI agents is not “clever conversation,” but a business execution infrastructure that can be safely entrusted with work.

4. What is likely to disappear as fantasy

1. The fantasy of the “fully autonomous employee”

The idea that AI can behave like a human employee — understanding vague instructions, grasping context, considering stakeholders, judging risks, handling exceptions, and completing work responsibly — is still far away.

AI is extremely strong at partial tasks. But the difficulty of company work lies not only in knowledge, but also in responsibility, priorities, politics, exception handling, tacit agreements, and explanation when things go wrong.

2. The fantasy that “multi-agent systems automatically become intelligent”

When multiple AIs are given roles and made to hold discussions, the result can look sophisticated.

However, without shared context, evaluation criteria, role boundaries, stopping conditions, and verification mechanisms, they merely consume a large number of tokens.

The essence of multi-agent systems is not the number of agents. It is shared memory, role design, task decomposition, evaluation functions, consensus formation, and audit logs.
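
A toy sketch of this point: the loop below gives each agent a fixed role, scores every draft with an explicit evaluation function, stops as soon as a threshold is met or a round limit is reached, and writes an audit log. The function `run_discussion`, the two lambda “agents,” and the scoring rule are all hypothetical stand-ins; real agents would be model calls and the evaluation would be far less trivial.

```python
from typing import Callable, Dict, List, Tuple

Agent = Callable[[str], str]


def run_discussion(agents: Dict[str, Agent], task: str,
                   evaluate: Callable[[str], float],
                   threshold: float,
                   max_rounds: int) -> Tuple[str, List[dict]]:
    """Pass a shared draft between roles until it is good enough or time runs out."""
    audit_log: List[dict] = []
    draft = task
    for round_no in range(1, max_rounds + 1):
        for role, agent in agents.items():
            draft = agent(draft)
            score = evaluate(draft)
            audit_log.append({"round": round_no, "role": role,
                              "draft": draft, "score": score})
            if score >= threshold:       # stopping condition: quality reached
                return draft, audit_log
    return draft, audit_log              # stopping condition: round budget spent


# Toy roles: each one just appends a marker to the shared draft.
agents: Dict[str, Agent] = {
    "writer": lambda d: d + " +detail",
    "reviewer": lambda d: d + " +checked",
}


def score(draft: str) -> float:
    # Toy evaluation function: three contributions count as "done".
    return draft.count("+") / 3


result, log = run_discussion(agents, "plan", score, threshold=1.0, max_rounds=5)
```

In this run the loop ends in the second round, after three contributions, instead of consuming all five rounds. Without the `threshold` and `max_rounds` arguments, the same agents would simply keep appending text, which is the token-burning behavior described above.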

3. The fantasy that “prompts alone become products”

Prompt collections, wrapper-style chatbots, and general-purpose agent builders may sell in the short term, but they are likely to become commodities in the long term.

That is because their functions will be absorbed by the model platforms themselves.

What will remain is not prompts, but business data, proprietary workflows, customer context, evaluation data, execution permissions, and accumulated decision history.

4. The fantasy of an “agent marketplace”

A business model that lists general-purpose agents side by side for sale may survive to some extent, but it is unlikely to become the main battlefield.

That is because value does not arise from the agent alone. It arises when the agent is connected to a specific company, specific business process, and specific data.

This resembles how many CD-ROM titles were created, but most were eventually absorbed into the Web. Generic agents may also ultimately be absorbed into operating systems, browsers, business SaaS, and LLM platforms.

5. The real target is not “agents,” but the “agency stack”

I see the structure that will remain as follows.

| Layer | What will remain | What is likely to disappear |
| --- | --- | --- |
| Model | High-performance LLMs and multimodal models | Small, undifferentiated proprietary models |
| UI | Natural language operation and AI embedded in business tools | Mere chat windows |
| Memory | Organization-specific context, decision history, conceptual structures | Long chat histories left as-is |
| Execution | APIs, tools, and workflow integration | Demo-style automatic operation |
| Management | Permissions, approvals, logs, audits | Black-box autonomous execution |
| Evaluation | Testing, evidence checking, reproducibility | “The AI said so” |
| Business integration | Embedding into specific business processes | Standalone general-purpose agents |

In other words, what is real is not the agent itself, but the infrastructure that makes agency possible.

6. From the perspective of ThinkNavi and ConceptMiner

Seen from this perspective, it would be stronger to position ThinkNavi and ConceptMiner not as “AI agent products” head-on, but as follows:

ThinkNavi = a workspace where humans and AI think strategically together
ConceptMiner = a conceptual structure, long-term memory, and context-modeling layer that agents can refer to

What is lacking in the world of agents is not simply LLM capability. What is lacking is “structured context” that answers questions such as:

What assumptions should decisions be based on?
Which past decisions should be referenced?
Which concepts are close to each other?
Where are the gaps in thinking?
Which hypotheses are weak?
Why did we reach this conclusion?
What should be verified next?

If ConceptMiner’s GNG+MST, concept maps, structural extraction from chat histories, FactCollector, and LLM Wiki are connected to this layer, the result is an agent-oriented context infrastructure that operates at a higher level than ordinary RAG.

7. A practical checklist for distinguishing what is real

When evaluating AI-agent-related technologies or businesses, the following questions are useful.

Closer to real

Is it deeply connected to the execution of a specific business process?
Does it have proprietary data, proprietary context, and proprietary workflows?
Are human approval points designed into the process?
Are evidence and operation logs preserved?
Can causes be traced when failure occurs?
Does evaluation data accumulate?
Does value remain even if the model changes?
Does the customer feel that time, cost, or risk has been reduced, rather than merely finding it “interesting”?

Closer to fantasy

It looks impressive in demos, but there is no explanation of production operation.
It emphasizes “full autonomy” too much.
It differentiates only through prompts and UI.
Its explanations of auditing, permissions, and security are weak.
It merely lines up many general-purpose agents.
Responsibility in case of failure is unclear.
It seems easily replaceable by a model update.
Customer-specific memory and context do not accumulate.

8. My conclusion

What will remain from the current AI boom is not simply the AI agent, but the system of context, memory, execution, verification, and auditing that allows work to be entrusted to AI.

Just as the essence of 1980s new media was “interactive access to information from the household,”
and just as the essence of 1990s multimedia was “an interactive knowledge experience integrating text, images, audio, and video,”
the essence of today’s AI agents is probably this:

Transforming human intent into verifiable action in light of organization-specific context.

In that sense, today’s “agents” are still in the CD-ROM phase.

What will truly remain is the equivalent of the WWW: an open and connectable execution infrastructure — in other words, a common layer through which AI connects to tools, knowledge bases, business systems, and human decision-making processes.