Key Points

  • Agentic AI requires a complete redesign of enterprise workflows to unlock its full potential, with a focus on reliable solutions to avoid compounding complexity, according to Sohrab Rahimi, a partner at McKinsey & Company who leads AI initiatives at the firm's AI division, QuantumBlack.

  • CIO News spoke with Sohrab about the attribution problem in multi-agent systems, where failures are hard to trace back to their origin.

  • He emphasized the importance of defining success metrics and prioritizing linear processes for effective agentic AI implementation.

Agentic AI’s full potential isn't unlocked by simply dropping agents into legacy workflows. It demands a wholesale rethinking of how enterprises design and run their core processes. Every application has to be scrutinized, every risk weighed, and only the most reliable agentic solutions woven into the foundation of rearchitected workflows. A rollout with any less structure risks compounding complexity instead of creating clarity.

We spoke with Sohrab Rahimi, a leading voice in enterprise AI strategy, Partner at McKinsey & Company, and head of AI initiatives at the firm's AI division, QuantumBlack. Drawing on his experience architecting large-scale AI systems and following up on McKinsey's Seizing the Agentic Advantage report, Sohrab argued that the challenge often begins and ends with proper attribution, traceability, and well-planned orchestration.

  • The attribution problem: "One of the primary issues of fully autonomous agents is cascading failure. If one agent fails, it triggers a failure in the next one, then another, and it trickles down. It becomes very hard to understand where it started and where it ended." One of the biggest risks, he explained, is how fragile these systems can be once multiple agents are strung together, particularly when bolted onto legacy systems retroactively. In a recent LinkedIn post, Sohrab laid out the weak spots and inherent risks in User-to-Agent, Agent-to-Agent, and Agent-to-Environment implementations. "When something goes wrong, it's very hard for the developer to judge who's at fault. LLM-based agents hallucinate in stealthy ways. Even with the best and latest models, that problem has not been solved."

Even with rigorous tracking across the three pillars of observability (metrics, logs, and traces), Sohrab argued that the black-box nature of multi-agent orchestration makes true observability elusive without the proper frameworks and guardrails in place.
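The attribution problem above is, at its core, a tracing problem: every agent hop needs to be recorded under one correlation ID so a cascading failure can be walked back to its origin. The sketch below is a minimal illustration of that idea, not any particular framework's API; all names (`Trace`, `run_step`, the toy agents) are hypothetical.

```python
import uuid
import logging
from dataclasses import dataclass, field

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("agent-trace")

@dataclass
class Trace:
    """Carries a single correlation ID through every agent hop."""
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])
    hops: list = field(default_factory=list)

def run_step(trace, agent_name, fn, payload):
    """Run one agent step, recording its outcome under the shared trace ID."""
    try:
        result = fn(payload)
        trace.hops.append((agent_name, "ok"))
        log.info(f"[{trace.trace_id}] {agent_name}: ok")
        return result
    except Exception as exc:
        trace.hops.append((agent_name, f"failed: {exc}"))
        log.info(f"[{trace.trace_id}] {agent_name}: FAILED ({exc})")
        raise

# Toy pipeline: the second agent fails, and the trace pinpoints it.
trace = Trace()
try:
    text = run_step(trace, "retriever", lambda p: p + " + context", "query")
    run_step(trace, "summarizer", lambda p: 1 / 0, text)  # simulated failure
except ZeroDivisionError:
    pass

print(trace.hops)  # the failed hop is attributable by name and trace ID
```

Even this much structure only tells you *which* agent failed; as Sohrab notes, it cannot explain what happened inside the model that failed.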

  • Three types of agents: "You have three different high-category aspects of agents. One is agents as sensors, which simply observe and don't actually perform tasks. The second type would be agents that do tasks, like leveraging Google Maps to calculate a distance. The third one is agents as decision-makers, which is the trickiest part. It handles the planning and orchestration of different workflows." It is this third category that compounds the attribution challenges and forces enterprises to decide which tasks are even worth attempting with agents in the first place.
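One way to make the taxonomy concrete is to encode it directly, along with the attribution cost each category carries. This is a hypothetical encoding of Sohrab's three categories, not an established schema; the names and heuristic strings are assumptions.

```python
from enum import Enum, auto

class AgentRole(Enum):
    SENSOR = auto()          # observes state, performs no tasks
    TASK = auto()            # executes a bounded tool call (e.g., a distance lookup)
    DECISION_MAKER = auto()  # plans and orchestrates workflows across other agents

def attribution_difficulty(role: AgentRole) -> str:
    """Rough heuristic: the more an agent decides, the harder failures are to trace."""
    return {
        AgentRole.SENSOR: "low: read-only, easy to audit",
        AgentRole.TASK: "medium: one tool call, one observable result",
        AgentRole.DECISION_MAKER: "high: failures can originate in the planning itself",
    }[role]

print(attribution_difficulty(AgentRole.DECISION_MAKER))
```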

  • Defining success: "Sometimes success is subjective. An agent plans something and you would say maybe it’s good, maybe it’s not. But other times the definition of success is very clean: did the agent achieve this particular metric or not?" In cases of subjectivity, he advises leaders to seriously consider which applications of agentic systems are truly appropriate, since the absence of clear metrics can make both accountability and improvement nearly impossible. Put differently, AI implementation for the sake of AI implementation is not a strategy.

Once a solid use case for agents is defined and viability tests have been passed, Sohrab's advice shifted to the opposite end of the spectrum: he strongly cautioned against underutilizing agents. Their highest value, he said, lies in the information they gather while working, and more data is usually better in the race to build technical moats.

  • Data as the differentiator: "We shouldn’t think of these agents only as doers of tasks and automation, but also as collectors of information." Sohrab stressed that the long-term advantage doesn’t come from the agent’s raw capability, but from the unique information it gathers while working. In a customer service scenario, for example, a task-focused agent might simply authenticate a caller or resolve a billing issue. But a more advanced agent could also capture sentiment, context, and recurring frustrations. This data compounds into a deeper institutional understanding over time. "It’s no longer about capability alone. Data is the differentiator."
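The customer-service example above suggests a simple structural pattern: separate the task result from the side-channel observations the agent collects. The sketch below illustrates that split with a stubbed billing agent; all names and the toy sentiment rule are assumptions, not a real implementation.

```python
from dataclasses import dataclass, field

@dataclass
class AgentOutcome:
    """Separates the task result from the data the agent collected along the way."""
    result: str
    observations: dict = field(default_factory=dict)

def handle_billing_call(transcript: str) -> AgentOutcome:
    # Task portion: resolve the billing issue (stubbed here).
    result = "billing issue resolved"
    # Collector portion: capture context that compounds into institutional knowledge.
    observations = {
        "sentiment": "frustrated" if "again" in transcript else "neutral",
        "recurring_issue": "again" in transcript,
    }
    return AgentOutcome(result, observations)

outcome = handle_billing_call("this charge appeared again")
print(outcome.observations)  # {'sentiment': 'frustrated', 'recurring_issue': True}
```

The design choice is that the observations travel with every outcome by default, so the "collector" role costs nothing extra to adopt later.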

  • Connecting the org: "Think about doing this in every part of your organization—internal functions, external functions—and connecting these pieces of information. That’s where inefficiencies disappear. Information from one business unit can prevent another from repeating the same mistakes."

We shouldn’t think of these agents only as doers of tasks and automation, but also as collectors of information.

Sohrab Rahimi

Partner and AI Division Lead for QuantumBlack

McKinsey

Sohrab applies a "SMART framework" to assess agent fit, taking into account the scope and structure, metrics and measurement, access and accountability, risks and reliability, and finally the temporal length of the application. "Is the process linear and defined? Is there sufficient volume and quantifiable ROI? Are tools and APIs integrated and callable? Can failures be logged, audited, and contained? Is the task short, self-contained, and episodic?" Applied consistently, this framework provides enterprises with a disciplined method for determining where agentic AI is both suitable and responsible to deploy.
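The five SMART questions can be read as a gating checklist: a use case qualifies only if every answer is yes. The snippet below sketches that reading; the dictionary keys and the all-or-nothing rule are my assumptions about how one might operationalize the framework, not Sohrab's implementation.

```python
# The five SMART questions, keyed by pillar (hypothetical names).
SMART_QUESTIONS = {
    "scope_and_structure": "Is the process linear and defined?",
    "metrics_and_measurement": "Is there sufficient volume and quantifiable ROI?",
    "access_and_accountability": "Are tools and APIs integrated and callable?",
    "risks_and_reliability": "Can failures be logged, audited, and contained?",
    "temporal": "Is the task short, self-contained, and episodic?",
}

def smart_fit(answers: dict) -> bool:
    """A use case qualifies only when every SMART question is answered yes."""
    return all(answers.get(key, False) for key in SMART_QUESTIONS)

candidate = {k: True for k in SMART_QUESTIONS}
candidate["scope_and_structure"] = False  # a branching, ill-defined process
print(smart_fit(candidate))  # False: not a good agentic candidate
```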

  • Prioritize linear processes: "In a thousand tasks, only maybe 500 of them are good candidates for agentic AI automation. Linear processes are always preferred to those where you have to make decisions with many different pieces of information at different times, interacting with lots of agents." The design solution, he said, is to prioritize simplicity. "Prioritize single agents linearly connected to one another where possible. If you must use a multi-agent framework, it has to be in a very controlled way. The goal is to package the whole thing into a single component so that if it fails, you have a backup plan. That Plan B should be a simplified mitigation strategy, like routing the failed task to a human agent."
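The "single agents linearly connected, packaged with a Plan B" pattern can be sketched as an ordinary pipeline with one fallback path. This is an illustrative toy under assumed names (`run_pipeline`, the stub agents), not a production design.

```python
def run_pipeline(steps, payload, human_fallback):
    """Run single agents linearly; on any failure, route to the Plan B handler."""
    try:
        for step in steps:
            payload = step(payload)
        return payload
    except Exception as exc:
        # The whole chain is packaged as one component with one mitigation path.
        return human_fallback(payload, exc)

def classify(ticket):
    return {"ticket": ticket, "category": "billing"}

def draft_reply(state):
    if state["category"] == "billing":
        raise RuntimeError("policy lookup unavailable")  # simulated failure
    return state

result = run_pipeline(
    [classify, draft_reply],
    "charge dispute",
    human_fallback=lambda state, exc: {"routed_to": "human agent", "reason": str(exc)},
)
print(result)  # {'routed_to': 'human agent', 'reason': 'policy lookup unavailable'}
```

Because the fallback sees both the last good state and the error, the handoff to a human arrives with context rather than a bare failure.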

  • The persistent black box: Even as newer foundation models like GPT-5, Gemini, and Claude hit the market, Sohrab cautioned against viewing models alone as a silver bullet. Stronger models reduce the frequency of failure, but they don’t resolve the attribution problem at the heart of observability. "Better models will reduce the chance of failure, but they won't increase your ability to observe what happened inside the black box. You might go from a 70% chance of success to 90%, but you still don't know what happened in that 10%. That fundamental problem remains."

Looking ahead, Sohrab believes agentic systems will transform organizations in the same way cars transformed transportation. "We already see it in contact centers—automating simple portions of calls lets human agents spend more time on hard problems like tech support. That improves customer experience right away and creates more demand for human expertise."

By automating simpler tasks, human workers are freed to focus on higher-value challenges, creating a cycle of productivity and innovation. "When the car was invented, the people that used to ride horses professionally became drivers. These things happen over and over in civilizations, and innovation always proves to be a force for expansion. Think about how many more drivers we have today compared to the number of horse riders we had before the automobile."