Transforming a $20B Data Tool into an Agentic OS

Director of Product Design at CommerceIQ. 2024 to 2026. Leading a platform through its hardest reinvention, and the leadership calls that made it ship.

Every transformation starts with a question someone finally has the nerve to ask out loud. Ours was this: what if the best version of our product is the one the user never has to open?

CommerceIQ runs ecommerce for some of the world’s most iconic consumer brands. Johnson & Johnson. Nestlé. Newell Brands. $20B+ in annual sales across 450+ retailers in 41 countries. We had the data, the scale, and the reputation. What we didn’t have was growth.

This is the story of how we rebuilt the product from a data utility into a collaborative agentic operating system.

The crisis

In 2023, after four straight years of 100% YoY growth, we hit a wall. Growth slowed. Churn climbed. Executive adoption had flatlined at 26%.

Through a rigorous end-of-year review spanning customer exit interviews, churn analysis, and prospect conversations, three patterns kept surfacing.

No stickiness. When the one person at a brand who knew how to use the tool left for another job, the subscription went with them. We had built tool proficiency, not organizational value.

Persona mismatch. Our buyers had moved up to VP level. The product couldn’t answer their one question: am I going to hit my number, and if not, why? So it got filed as a tactical nice-to-have, not a strategic investment.

Manual toil. Users logged in, drilled through three dashboards for 45 minutes, screenshotted what they needed, and pasted it into Excel. That was the workflow. A $20B data engine whose primary output was a screenshot.

We had built a cockpit with 500 blinking lights. No autopilot. No navigation system.

The reframe

The findings went to leadership with a single reframe: this isn’t a feature gap, it’s an identity problem. We built a data utility tool. We need to become an operating system.

And not just any operating system. The end state was clear from day one: AI agents take actions autonomously. Users partner with them through visibility and control. The product runs the business. The human approves, intervenes, and steers.

That was the destination. Nobody in the category had built it. Most weren’t even talking about it yet.

But we couldn’t jump there. Neither could our customers. Tesla didn’t ship Full Self-Driving on day one. They shipped Autopilot. Then Enhanced Autopilot. Then supervised FSD. Each stage earned the trust and built the foundation for the next one. We had to do the same.

Just because agents can do the work doesn’t mean enterprise users are ready to hand it over. There’s real change management in between. Real business people. Real money on the line. And the agents themselves needed a foundation underneath them: clean data, narrative intelligence, proven workflows. Without those, autonomy means very little.

So the mental model became a ladder. Each stage mapped to a level of autonomy, and to a level of trust the user had to grow into.

Comparing stages of autonomy with Tesla

Stage 1
CommerceIQ: Pull data. User does everything. System shows a number.
Tesla: Cruise control. Fixed speed. No awareness of context.

Stage 2
CommerceIQ: Play with data. Filters, slices, schema builders. User still drives.
Tesla: Adaptive cruise control. Speed adjusts to traffic. Light awareness.

Stage 3
CommerceIQ: Get answers. System synthesizes and narrates. User steers.
Tesla: Autopilot. Lane keeping plus adaptive cruise. Hands on wheel.

Stage 4
CommerceIQ: Take actions. System executes. User reviews and approves.
Tesla: Enhanced Autopilot. Lane changes, navigation, parking.

Stage 5
CommerceIQ: Manage outcomes. System runs workflows. User supervises multiple agents.
Tesla: Supervised FSD. Full point-to-point. Driver responsible but rarely steering.

Stage 6
CommerceIQ: Hand off to agents. Agents run the business. User sets goals and guardrails.
Tesla: Unsupervised FSD. True autonomy. No driver needed.

The transformation ran across four streams. Three targeting specific problems in parallel, building immediate value and laying the foundation. The fourth tying them all together as the agentic umbrella that the first three made possible.

Stream 1: Data Foundation. Fix trust. Build enterprise stickiness. Give users a single source of truth they would actually stake decisions on.

Stream 2: Narrative Intelligence. Give executives a reason to care. Automate the most painful manual workflow in the category and surface answers in plain language.

Stream 3: AI-Native Persona Workflows. Rebuild core workflows from the ground up. End-to-end, AI-native experiences for every persona across Sales, Media, Content, and Inventory.

Stream 4: The Agentic OS. The umbrella. A collaborative operating system where every agent, workflow, insight, and person on the team operates as one system toward a single goal.

The principles

Four principles governed every decision across every pod. They weren’t aspirational. They were guardrails for a transformation where the temptation to over-design or over-explain was constant.

Trust is designed, not declared. Enterprise users won’t hand over real money to a system they can’t audit. Every AI recommendation had to carry its reasoning. Every surface had to be inspectable. The “why” was always more valuable than the “what,” because the why was what made the what believable. Trust wasn’t a feature. It was the foundation everything else stood on.

Restraint is the hardest design choice. The Gap-to-Plan view shipped with one visualization on the entire screen. The agentic OS shipped on a Kanban board. In a category that competes on dashboard density, the discipline was to remove, not add. Most enterprise design fails at this. The bet that paid off every time was the simpler one.

Human in the loop as a feature, not a caveat. Graceful handoffs. Moments where the AI recognizes a high-stakes decision and pauses to ask for approval. Not a safety net bolted on after launch. The thing that made customers comfortable giving us the keys.

Design the path, not just the destination. The end state was clear from day one. Autonomous agents, supervising humans, minimal interaction with the UI. Just a notification that the work was done. Each stream earned the trust the next one required. The destination was bold. The path was patient. “No UI” is the reward for getting the rest right.

The team

Seven designers, four parallel streams, two years. Each workstream structured as a pod: one or two designers, one or two PMs, a few engineers. Each pod had real ownership of their problem space.

My role cut across all of it. Working closely with each designer and PM. Staying close to customers. Pushing engineering on what was technically possible. Managing resources across four parallel tracks. Setting and defending the product vision. Partnering with marketing on rebranding the new agentic space. And staying hands-on in the work where it mattered most.

Data foundation

Data was spread across the product with accuracy and trust issues that ran deep. Users questioned numbers in customer calls. CS teams worked around the platform rather than through it.

We built a data foundry. A unified clean data layer with direct integrations into brand systems. But the more important reframe happened before a single screen was designed. This wasn’t a data delivery problem. It was a user autonomy problem.

Discovery kept surfacing the same thing. Users didn’t just want better data. They wanted to stop depending on internal teams to get it. That insight shifted the direction toward a flexible schema builder that let non-technical users manipulate dimensions on the fly, removing the gatekeeper friction entirely.

Impact: Data delivery reduced from days to near-instant. 50% reduction in manual reporting overhead. A primary driver of executive-level churn neutralized.

Narrative intelligence

Executives were drowning in dashboards but starving for meaning. CS managers and NAMs were spending the majority of their week building decks for weekly, monthly, and quarterly business reviews. Manual synthesis work that should have taken minutes. The data was there. The story wasn’t.

The reframe came out of a conversation with Shubham, the product lead across the transformation. We’d built a strong partnership over the years. Challenging each other, learning from each other, enjoying the work. Shubham had a deep technical background, which mattered more than I realized at the time. Pairing with him was how I learned to think clearly about what AI could actually do on top of structured data.

We were talking about automating C3 review decks. I suggested reusing the reporting foundation we already had on the platform to build a custom reporting module. Shubham took the conversation a step further: what if we added a narrative engine on top?

It was a lightbulb moment.

That single thread opened up the experience. We sat with it for a while, sketched out where it could go, mapped out different ways it could play out across personas, and built conviction together. We pitched it to leadership, validated it with CS and customers, and got buy-in. The narrative layer became the spine of Stream 2.

The vision was a morning coffee report. Every view topped with a plain-language executive summary. A Progressive Storytelling framework where the insight came first and the data supported it, not the other way around.

The work required bringing together UX, Data Science, and Product to ground AI summaries in a proprietary ecommerce data model. A Truth-First UX let users peek under the hood at any point, building credibility in AI-generated insights rather than asking users to just trust the output.

Impact: 60% increase in executive logins. A Tier 1 beverage brand scaled from dozens of manual decks to hundreds of automated ones monthly. The platform repositioned from tactical tool to strategic advisor for C-suite personas.

AI-native workflows

With a trusted data foundation and a clear narrative layer in place, the focus shifted to execution. Every insight needed a direct path to resolution, removing the friction between knowing and doing.

Each persona got one workflow, solved end-to-end, AI-native. Not AI-assisted. The AI as the primary actor, the human as the approver.

Omni Command Center

Persona: National Account Manager.

NAMs were toggling across disparate retailers and product modules manually, trying to piece together why actual sales were deviating from targets. The product was producing metric overflow. Too much information, no clear direction.

The pivot was to a single north star: Gap-to-Plan. Instead of 50 charts, the VP saw one number. How far behind they were, and three agents actively working to close it. The UI became a decision engine, not a data repository.

We made a bold bet on the visual layer. No charts. No graphs. No traditional data viz at all. The only visualization on the entire screen was a single Gap-to-Plan speedometer that anchored the user. Everything else was status of actions, alerts to act on, top insights, and outcomes. For a product category built on dashboards, this was a hard line to hold. The speedometer had to carry the weight of the entire experience.

The design problem was this: how do we represent both underperforming and overperforming territory on the same canvas while keeping the goal itself constant, all inside a fixed width? The widget had to stay compact so it could be reused elsewhere in the product. It also had to be interesting to look at. As the only visualization on the screen, a bar chart was never going to carry that weight.

We tried more variants than I want to admit before landing on something that worked. The final design held in production and became one of the most recognizable elements of the new product.

When mid-project leadership changes threatened team motivation and direction, we protected the original vision. We held the line on solving the Plan-to-Action problem first, before anything else was considered.

Impact: $1.3M in recovered revenue for a global pet brand in a single quarter. Now the mandated daily view for sales teams at SharkNinja, Colgate-Palmolive, and Newell Brands.

Goal-Based Campaign Optimizer

Persona: Media Specialist.

Brands’ ad investments were underperforming because budget and campaign systems operated in silos. Media teams were setting campaign-level targets in isolation rather than working toward holistic brand-level goals.

The reframe was from KPI maximization to true incrementality, measured as incremental return on ad spend (iRoAS). The question shifted from “how do we spend this budget?” to “what outcomes are we actually driving?” PhD-level complexity got compressed into a one-click optimization setup without losing the sophistication underneath.

When data accuracy issues started creating suboptimal automated actions, we slowed feature expansion. The actions the system suggested were the real foundation of the product, more than the experience wrapped around them. Fixing the data underneath came before shipping anything new on top of it.

Impact: 144% growth in ad sales for early adopters. 55% increase in iRoAS. Named a Top 3 Finalist at the Walmart Connect Partner Awards 2025 for Technology and Innovation.

Content Compliance, SEO, and AEO

Persona: Content Manager.

54% of content employees were lost in manual toil across fragmented tools. Every PDP update took 35 minutes, meaning brands couldn’t keep pace with real-time retailer algorithms. Organic rank suffered for it.

The insight that changed direction: product pages were no longer just billboards for human shoppers. In the agentic era, a PDP is a knowledge base for a retailer’s AI. Amazon’s Rufus doesn’t just surface products, it cites them. Brands that weren’t optimizing for that were invisible in a channel that was growing fast. The platform needed to help brands win on both SEO and Answer Engine Optimization (AEO), structuring content to be the definitive answer for conversational AI agents, not just search bots.

The design challenge was making a complex pipeline feel simple. Under the hood, the system was scraping Rufus, running claim-reality analysis against customer reviews, mapping SKUs to shopper intent prompts, and generating optimized content recommendations with a reasoning document for every change. The user needed none of that complexity. They needed to know which products were invisible on Rufus, why, and what to do about it.

Claim-reality analysis raised a harder question before it raised an easier one. Can a customer review actually be a source of truth? We debated pulling reviews from across the internet rather than just Amazon, but even that signal didn’t hold up strongly. For v1, we made a pragmatic call and built the POC against Amazon reviews, with the intent to broaden the source as the signal got sharper.

The primary working surface was a prioritized SKU table sorted by revenue impact. Highest-value products with the lowest AEO readiness score at the top. One click opened a review canvas showing current content alongside AI recommendations, with visible reasoning for every suggested change. Which keyword drove an addition. Which claim was removed because it contradicted customer review signals. Which shopper prompt the new phrasing was designed to answer. Accept, edit, or reject per attribute. One publish action to push changes through to PIM and auto-syndicate to Amazon.
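One way to picture that prioritization is the small sketch below. It is illustrative only, not the production ranking: the field names, the 0-to-1 readiness scale, and the opportunity score (revenue impact multiplied by how unready the content is) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkuRow:
    sku_id: str
    revenue_impact: float  # assumed: revenue at stake for this SKU, in dollars
    aeo_readiness: float   # assumed: 0.0 (invisible to the answer engine) to 1.0 (ready)


def opportunity(row: SkuRow) -> float:
    """Assumed priority score: revenue at stake scaled by how unready the content is."""
    return row.revenue_impact * (1.0 - row.aeo_readiness)


def prioritize(rows: list[SkuRow]) -> list[SkuRow]:
    """Return rows with the largest opportunity first."""
    return sorted(rows, key=opportunity, reverse=True)


# Hypothetical rows, invented for the example.
rows = [
    SkuRow("SKU-A", revenue_impact=100_000, aeo_readiness=0.9),
    SkuRow("SKU-B", revenue_impact=80_000, aeo_readiness=0.2),
    SkuRow("SKU-C", revenue_impact=20_000, aeo_readiness=0.1),
]
ranked = [row.sku_id for row in prioritize(rows)]
print(ranked)
```

In this sketch a high-revenue SKU that is already well optimized (SKU-A) drops below a slightly smaller SKU with poor readiness (SKU-B), which is the behavior the table was meant to produce.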

Post-release, the team caught a design paradox. Surfacing every individual agent action in real time was meant to build trust, but it spiked cognitive load and pushed time to value the wrong way. We shifted to a system-status model that signaled processing was in flight without forcing users to watch every step. Trust held. The UI stopped fighting the user.

Impact: A global consumer brand achieved 100% PIM compliance across thousands of SKUs in 60 days. 150 bps boost in organic Share of Voice. 2.1% sales lift.

The agentic OS

Triggering skills and workflows from chat

The final evolution was the shift from tools to AI teammates. This stream is currently in its initial stages. The bet underneath it: the newest technology doesn’t need the most complex interface. It needs the most human one.

The design problem is real. How do you make AI agents feel like coworkers and not like opaque systems? How does a person sitting at their desk on Monday morning hand off work to an agent, see what it’s doing, jump in when needed, and trust the output enough to walk away from it? Every existing pattern in enterprise software felt either too mechanical, too chatty, or too magical.

We explored a lot of variants. Conversational interfaces. Workflow builders. Agent dashboards. Action queues. Each had a flavor of what we wanted but also brought baggage that didn’t belong in the agent era.

The answer that landed was Kanban. The simplest, most intuitive way humans have ever managed work together. Users or agents create tasks and assign them to each other. Tasks move through a standard flow: To Do, In Progress, Review. The system also generates tasks proactively, an out-of-stock alert or a promo anomaly, routed to the right agent automatically.

Every task card carried three things. What the task was about. Why it was being done. And a dynamic island that continuously updated its status. Clicking into any card revealed the full detail, with a thread to quickly chat with the AI or human teammate and resolve blockers. AI work became transparent, manageable, and collaborative rather than opaque and intimidating.
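As a rough illustration of that card model, here is a minimal sketch. The class and field names are hypothetical, chosen for the example rather than taken from the product; only the three stages and the what / why / status trio come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    TO_DO = "To Do"
    IN_PROGRESS = "In Progress"
    REVIEW = "Review"


# The standard flow described above, in order.
FLOW = [Stage.TO_DO, Stage.IN_PROGRESS, Stage.REVIEW]


@dataclass
class TaskCard:
    what: str                # what the task is about
    why: str                 # why it is being done
    assignee: str            # hypothetical: an agent or a human teammate
    status: str = "Queued"   # the continuously updated status line
    stage: Stage = Stage.TO_DO
    thread: list[str] = field(default_factory=list)  # chat to resolve blockers

    def advance(self, status: str) -> None:
        """Move the card one stage forward and refresh its status line."""
        position = FLOW.index(self.stage)
        if position == len(FLOW) - 1:
            raise ValueError("Task is already in Review")
        self.stage = FLOW[position + 1]
        self.status = status


# A system-generated task routed to an agent (names are made up).
card = TaskCard(
    what="Replenish an out-of-stock SKU",
    why="Out-of-stock alert",
    assignee="inventory-agent",
)
card.advance("Drafting replenishment order")
card.advance("Waiting for human approval")
print(card.stage.value, "|", card.status)
```

The card stops at Review on purpose in this sketch: nothing moves past that stage without a person stepping in, which mirrors the human-in-the-loop principle.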

It was beautiful how, after exploring so many variants, we landed on something this simple. A pattern people had been using on whiteboards for decades, repurposed as the interface for managing the most advanced technology on the roadmap.

Agentspace sat alongside the board, a centralized hub to manage the agent workforce, defined by roles and autonomy levels.

The simplest possible experience for the most complex technology on the roadmap.

What made it hard

The team under pressure.

The second half of the transformation overlapped with a harder stretch for the company. Leadership changed multiple times and attrition climbed.

The hardest calls weren’t about product. They were about people.

Some of my best designers weren’t aligned with the culture outside our team, and it wasn’t serving them well. When better opportunities came their way, I let them go with pride in what they had built.

Getting new headcount approved was a struggle. I restructured and leaned on AI to stay efficient with a three-member team.

The agent identity bet.

Our CEO held a strong opinion: the agents should feel unique, with human personification and human names. Sally for Sales, Marty for Marketing, Cathy for Category. It came from a fair commercial read: VPs were evaluating the tool’s budget against their own team’s, and a named teammate was easier to price.

I had a strong opinion too, but stayed open to his view. My take: double down on unified intelligence. A Single Brain, while holding space for personality and skill sets that could still be marketed as distinct entities worth paying for.

I built both paths as real product experiences. The first version landed on Nexus, which proved too abstract. Marketing pushed back, and they were right. Enterprise buyers buy teammates, not ideas.

We landed on Ally. One unified intelligence with personified skill sets for each function.

The bet that mattered held. The bet that didn’t evolved.

The outcome

72% adoption growth. 54% efficiency gains. 10+ at-risk accounts recovered. The highest new ARR in company history, anchored by a $50M expansion with Target.

The platform went from something users visited to pull a report into something that ran continuously on their behalf.

The best version of enterprise software isn’t a better dashboard. It’s the one you never have to open, because it already handled it.

What I took with me

The best ideas come from strong opinions held with real respect for the other side. The narrative engine grew out of Shubham and me sharpening each other. Ally grew out of holding the thesis while making room for the CEO and marketing.

Doing speaks louder than saying. When spinning up a concept is cheap, the case you build beats the case you argue.

Not everything is yours to control. Headcount doesn’t always come back, people leave, culture shifts. Do what you can with what’s in front of you, and let the rest go.
