Customer Support AI Agent for E-Commerce

Built a production-grade customer support agent that handles the long tail of refund, shipping, and order-status tickets, with a supervisor-enforced policy layer and human-in-the-loop approval on any write action above a configured threshold.

The Challenge

A mid-market e-commerce retailer faced severe operational bottlenecks from a high volume of repetitive support tickets: order tracking, return requests, and shipping-delay inquiries. These transactional tickets made up over 70% of the inbound queue, tying up human agents and inflating support costs. Earlier attempts with keyword-matching FAQ chatbots failed because those systems could only retrieve static text and could not access live order or customer data. Automating the work required a secure agentic system that could read and modify customer records in Shopify and HubSpot. That access introduced serious business risks, including leakage of personally identifiable information (PII) and fraudulent or out-of-policy refunds executed without the retailer's standard controls.

Pain points we set out to solve

× 70%+ of inbound tickets were repetitive and deflectable
× Prior chatbot could only retrieve static FAQ text; it could not query or act on live systems
× No guardrails meant every AI refund was a compliance risk
× Agents burned hours copy-pasting order data between 4 tools

Objectives

01 Deflect at least 35% of inbound support volume within 90 days
02 Zero unauthorized writes - every refund must respect policy limits
03 Sub-5-second agent response time on 95% of tickets
04 Full audit trail for every action the agent takes
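
The latency objective is a percentile target, so it helps to be explicit about how it could be checked. A minimal sketch, assuming per-ticket response times are collected in seconds; the function names and the nearest-rank method are illustrative choices, not the project's actual monitoring code.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct (1-100): the smallest
    sample with at least pct% of all samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer ceiling division avoids floating-point rounding in the rank.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def meets_latency_objective(latencies_s, threshold_s=5.0, pct=95):
    """True when pct% of tickets were answered in under threshold_s."""
    return percentile(latencies_s, pct) < threshold_s


# Hypothetical sample: 19 fast responses and one slow outlier.
sample = [2.0] * 19 + [9.0]
```

With this sample the single slow ticket falls outside the 95th percentile, so the objective is met; two slow tickets out of twenty would fail it. A median (P50) and a P95 answer different questions, which is why the check runs on the full sample rather than on a single summary number.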

Approach

How we delivered — phased, with clear checkpoints and evidence at each step.

Week 1-2
Discovery and data audit
Mapped ticket taxonomy from 6 months of Zendesk history, identified the 8 ticket types that covered 82% of volume, and defined policy guardrails with the Ops team.
Week 3-5
Agent architecture
Designed a LangGraph supervisor graph with specialized sub-agents - an intent classifier, a read-only order-lookup agent, and a write-action agent - each with scoped tool permissions.
Week 6-8
Tool integration and guardrails
Wired tools into Shopify, HubSpot, and the internal RMA service. Built a policy enforcement node that validates every proposed write against refund limits, return windows, and customer lifetime value.
Week 9-10
HITL, evals and launch
Added human-in-the-loop approval for any refund above 150 dollars. Built an eval harness with 120 golden tickets and shipped behind a feature flag to 10% of traffic, then ramped.
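
A percentage rollout like the 10% launch is commonly done by hashing a stable identifier into a bucket, so a given customer keeps the same experience as the percentage ramps. The sketch below assumes a customer ID is available for each ticket; `in_rollout` and the salt string are hypothetical names, not the project's actual flag system.

```python
import hashlib


def in_rollout(customer_id: str, percent: int, salt: str = "support-agent-v1") -> bool:
    """Deterministically place a customer in one of 100 buckets and
    enable the agent for buckets below `percent`."""
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent
```

Because the bucket depends only on the salt and the ID, raising `percent` from 10 to 50 keeps every customer who was already enrolled and adds new ones, rather than reshuffling who sees the agent.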

The Solution

The solution is a multi-agent orchestration graph built on LangGraph, using a supervisor-worker pattern to coordinate specialized nodes. When a support ticket arrives, a lightweight intent-classifier agent parses the query and routes it to a dedicated worker node, such as the read-only Shopify agent or the CRM data-retrieval agent. To contain compliance and financial risk, a supervisor policy node intercepts every write operation proposed by the action agents. The policy node validates refund amounts, return windows, and customer lifetime value (CLV) against pre-configured rules implemented in code rather than in prompts. Refunds exceeding 150 dollars are automatically paused and routed to a human-in-the-loop (HITL) approval queue. Every run logs execution traces to LangSmith, giving compliance teams a complete, searchable audit trail.
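
The control flow described above can be sketched without any framework. This is plain Python rather than LangGraph code, and the intent names, worker functions, and keyword-based classifier are placeholders (the real classifier is an LLM agent). It only illustrates the routing order: classify, dispatch to a worker, and pass any proposed write through a policy check before it executes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProposedWrite:
    action: str   # e.g. "refund"
    amount: float


@dataclass
class WorkerResult:
    reply: str
    proposed_write: Optional[ProposedWrite] = None  # workers propose, never execute


def classify_intent(text: str) -> str:
    """Stand-in for the LLM intent classifier (keyword match for illustration)."""
    lowered = text.lower()
    if "refund" in lowered:
        return "refund_request"
    if "where" in lowered or "status" in lowered:
        return "order_status"
    return "unknown"


def order_status_worker(text: str) -> WorkerResult:
    return WorkerResult(reply="Looking up your order.")


def refund_worker(text: str) -> WorkerResult:
    return WorkerResult(reply="Refund requested.",
                        proposed_write=ProposedWrite("refund", 40.0))


WORKERS: dict[str, Callable[[str], WorkerResult]] = {
    "order_status": order_status_worker,
    "refund_request": refund_worker,
}


def supervise(text: str, policy_allows: Callable[[ProposedWrite], bool]) -> str:
    """Route a ticket and gate any proposed write behind the policy callback."""
    worker = WORKERS.get(classify_intent(text))
    if worker is None:
        return "escalate_to_human"
    result = worker(text)
    if result.proposed_write is None:
        return "answered"
    return "write_executed" if policy_allows(result.proposed_write) else "write_blocked"
```

The point of the shape is that a worker can only return a `ProposedWrite`; nothing in a worker function touches a production system, so the supervisor is the single place where a write can be released.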

Supervisor-enforced policy layer

No write action executes without clearing configurable refund limits, return-window rules, and CLV-based escalation thresholds.
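
Keeping policy in code means the check is an ordinary function whose result does not depend on model output. A minimal sketch, assuming three rules of the kind listed above; the specific limits (`MAX_REFUND`, `RETURN_WINDOW_DAYS`, `CLV_ESCALATION_THRESHOLD`) are made-up values, and the way CLV triggers escalation is an assumption rather than the retailer's actual rule.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Illustrative limits only; real values would come from configuration.
MAX_REFUND = 500.00
RETURN_WINDOW_DAYS = 30
CLV_ESCALATION_THRESHOLD = 5_000.00


@dataclass(frozen=True)
class RefundRequest:
    amount: float
    delivered_on: date
    requested_on: date
    customer_clv: float


def check_refund(req: RefundRequest) -> tuple[str, str]:
    """Return (decision, reason); decision is 'allow', 'deny', or 'escalate'."""
    if req.amount <= 0 or req.amount > MAX_REFUND:
        return "deny", "amount outside refund limit"
    if req.requested_on - req.delivered_on > timedelta(days=RETURN_WINDOW_DAYS):
        # Assumption: out-of-window requests from high-CLV customers go to a
        # human instead of being denied automatically.
        if req.customer_clv >= CLV_ESCALATION_THRESHOLD:
            return "escalate", "outside return window, high-value customer"
        return "deny", "outside return window"
    return "allow", "within policy"
```

Returning a reason alongside the decision gives the audit trail something concrete to record for each blocked or escalated write.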

Scoped tool permissions per agent

The read agent can only query. The write agent can only act after supervisor approval. No single agent holds both capabilities.
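
One way to enforce this separation in code is an allowlist consulted on every tool call, so an agent cannot invoke a tool it was not granted even if the model asks for it. The agent names, tool names, stub tools, and the `approved` flag below are hypothetical.

```python
class ToolPermissionError(Exception):
    """Raised when an agent calls a tool outside its scope."""


# Hypothetical scopes: no agent holds both read and write tools.
AGENT_SCOPES = {
    "order_lookup_agent": frozenset({"get_order", "get_shipment"}),
    "write_action_agent": frozenset({"issue_refund"}),
}
WRITE_TOOLS = frozenset({"issue_refund"})


def call_tool(agent: str, tool: str, tools: dict, *, approved: bool = False, **kwargs):
    """Dispatch a tool call only if the agent's scope (and, for writes,
    supervisor approval) permits it."""
    if tool not in AGENT_SCOPES.get(agent, frozenset()):
        raise ToolPermissionError(f"{agent} may not call {tool}")
    if tool in WRITE_TOOLS and not approved:
        raise ToolPermissionError(f"{tool} requires supervisor approval")
    return tools[tool](**kwargs)


# Stub tool implementations for illustration.
TOOLS = {
    "get_order": lambda order_id: {"order_id": order_id, "status": "shipped"},
    "get_shipment": lambda order_id: {"order_id": order_id, "carrier": "example"},
    "issue_refund": lambda order_id, amount: {"order_id": order_id, "refunded": amount},
}
```

An unknown agent name resolves to an empty scope, so the default is deny rather than allow.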

Human-in-the-loop on high-risk actions

Refunds over 150 dollars, account merges, and address changes route to a human queue with full context pre-loaded.
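
The routing rule itself is small: actions above the threshold, or of a listed type, are parked with their context instead of executing. A sketch in which an in-memory queue stands in for the real approval queue; `route_action` and the action-type names are illustrative, and only the 150-dollar threshold comes from the text above.

```python
from collections import deque
from dataclasses import dataclass, field

HITL_REFUND_THRESHOLD = 150.00  # threshold stated in the case study
ALWAYS_REVIEW = frozenset({"account_merge", "address_change"})


@dataclass
class Action:
    kind: str                 # "refund", "account_merge", "address_change", ...
    amount: float = 0.0
    context: dict = field(default_factory=dict)  # ticket, order, customer history


approval_queue: deque = deque()  # stand-in for a durable queue


def route_action(action: Action) -> str:
    """Return 'auto' if the action may proceed; otherwise enqueue it for
    human review and return 'human'."""
    needs_review = action.kind in ALWAYS_REVIEW or (
        action.kind == "refund" and action.amount > HITL_REFUND_THRESHOLD
    )
    if needs_review:
        approval_queue.append(action)
        return "human"
    return "auto"
```

The comparison is strict, matching "over 150 dollars": a refund of exactly 150 proceeds automatically in this sketch, which is a boundary worth confirming with the Ops team in a real deployment.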

Full tracing and audit

LangSmith traces every node run, and action logs mirror to the warehouse so compliance can reconstruct any decision.
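
An audit entry that lets someone reconstruct a decision needs, at minimum, what was proposed, which node handled it, what the policy decided, and when. The sketch below emits one JSON line per action. The field names are assumptions, and it does not use the LangSmith API, which handles tracing separately.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(ticket_id: str, node: str, action: str, decision: str,
                 details: dict, now: Optional[datetime] = None) -> str:
    """Serialize one agent action as a JSON line suitable for appending
    to a log or loading into a warehouse table."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    entry = {
        "ts": timestamp,
        "ticket_id": ticket_id,
        "node": node,
        "action": action,
        "decision": decision,
        "details": details,
    }
    # sort_keys gives a stable field order, which keeps log lines comparable.
    return json.dumps(entry, sort_keys=True)
```

Passing `now` explicitly makes the function easy to test; in normal use it defaults to the current UTC time.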

Technology stack

Picked for latency, cost, and long-term maintainability — not for novelty.

AI / Agent

LangGraph, LangChain, OpenAI GPT-4o, Claude 3.5 Sonnet

Tools & Integrations

Shopify Admin API, HubSpot, Internal RMA service

Observability

LangSmith, OpenTelemetry, Datadog

Infra

FastAPI, Postgres, Redis, AWS ECS

Results

40% Reduction in human support load (first 90 days)

0 Unauthorized refunds in the first 90 days

62% Auto-deflection rate on order and refund tickets

3.2s P50 agent response time

Business impact

The support team shifted from clearing a queue to handling edge cases and VIP customers. CSAT held steady at 4.6/5, and the retailer redirected two FTEs from ticket triage to proactive retention work.

Key takeaways

Supervisor graphs beat monolithic agents for anything that touches production systems
Policy belongs in code, not in the prompt - LLMs will confidently violate soft constraints
An eval harness catches regressions that human spot-checks miss, especially on long-tail tickets
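
A golden-ticket harness reduces to replaying fixed inputs, comparing outcomes against expected labels, and flagging any drop in pass rate. A minimal sketch: `GoldenTicket`, `run_evals`, and the toy agents are hypothetical, and a real harness would call the deployed agent instead of a local function.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldenTicket:
    text: str
    expected: str   # expected outcome label, e.g. "refund"


def run_evals(agent: Callable[[str], str], tickets: list[GoldenTicket]):
    """Return (pass_rate, failures) for an agent over a non-empty golden set."""
    failures = [t for t in tickets if agent(t.text) != t.expected]
    pass_rate = 1 - len(failures) / len(tickets)
    return pass_rate, failures


def is_regression(baseline_rate: float, candidate_rate: float,
                  tolerance: float = 0.0) -> bool:
    """Flag a candidate whose pass rate falls below baseline by more than tolerance."""
    return candidate_rate < baseline_rate - tolerance


# Toy golden set and agents for illustration.
GOLDEN = [
    GoldenTicket("Where is my order?", "order_status"),
    GoldenTicket("Please refund order 12", "refund"),
    GoldenTicket("My parcel is late and I want my money back", "refund"),  # long-tail phrasing
    GoldenTicket("Has my package shipped?", "order_status"),
]


def baseline_agent(text: str) -> str:
    lowered = text.lower()
    return "refund" if ("refund" in lowered or "money back" in lowered) else "order_status"


def candidate_agent(text: str) -> str:
    # Simulated regression: the "money back" phrasing is no longer recognized.
    return "refund" if "refund" in text.lower() else "order_status"
```

Here the candidate still handles the common phrasings and only misses the unusual one, which is the kind of failure a handful of manual spot-checks would be unlikely to sample.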
