Engineering a Retail Virtual Assistant in the Age of AI
Our path to retail started in digital storytelling. In early 2024 the start‑up Know‑me asked us to refine a prototype that lets users interview Holocaust survivors and other historical figures through realistic video calls and chat. High latency, drifting personas, and missing safety checks threatened user trust. We re-architected the stack—reducing response time to under three seconds and adding automated guardrails enough to impress investors.
A grocery innovation director who saw that demo asked: Could the same reliability guide shoppers to safe, relevant products in‑store? That question became the Retail Virtual Assistant (RVA).
Six months later, the RVA we unveiled during an internal Tech Talk can tell a customer where the oat-milk lives, explain why yesterday’s allergy alert is a false alarm and suggest a dairy-free carbonara recipe—all in the time it takes a colleague to find the stockroom key. The RVA, which is accessible on a customer’s phone or tablet, can be used in stores to help customers with their queries. It uses Generative AI to create a conversational narrative, combining inventory and other information to construct informed answers to questions. It's about helping customers by leveraging empathy, intelligence, and information.
How We Built It
Our stack:
-
Front‑end – A React + Node progressive web app (PWA) launched via in‑store QR code and usable on any modern phone or tablet.
-
Back‑end – A Python / Django API backed by PostgreSQL.
-
Conversational Core – GPT‑4o accessed through the OpenAI SDK with tool use enabled so the model retrieves authoritative data rather than hallucinating facts.
-
Voice Layer – The browser’s WebSpeech API delivers speech‑to‑text, while 11 Labs provides optional text‑to‑speech for kiosk scenarios.
-
Infrastructure & Operations – Terraform provisions AWS resources—ECS, RDS, S3, Cognito, and CloudWatch—supporting one‑click, multi‑region deployments with full observability.
Tool‑Use Pipeline
-
Natural‑language request → LLM receives system, persona, and user messages.
-
Product search → LLM invokes a search, returning a list of product IDs.
-
Reason & compose → LLM crafts a concise answer citing prices, stock level, or recipe suggestions.
-
JSON schema validation → Response passes through a schema check to prevent missing fields or policy violations.
Prompt Engineering:
We keep responses helpful, on-brand, and safe through a three-layer prompt:
-
Guard-rails – persona, tone, and “don’t go there” topics.
-
Few-shot – real Q&A from store logs.
-
Dynamic context – live allergens, promos, basket contents.
Every change runs through an Evaluation Bot that regression-tests for truth and empathy before we push to production.
This is how our demo RVA looks like.
What’s in It for Customers and the Business?
Giving the customer additional and easier ways of getting the information they want, can help to attract and retain customers. Personalising the experience for the customer also helps to build customer confidence and build the brand. “Dinner for two in 20 minutes, under £10?” feels like advice, not upsell. Offering customers benefits from providing their data incentivises them to provide more zero-party data; this is easier and cheaper for the retailer to obtain and provides valuable and accurate information. Additionally, freeing up colleagues also allows staff to focus on human interactions, not aisle hunts. Not to mention the bigger baskets, context-aware cross-sell can boost the average order value.
Dragons We’re Still Slaying
-
Token limits bite. 1024-char tool prompts cap giant product lists.
-
Maths is hard. “3 for 2” promos still trip the model.
-
Long-term memory. GPT-4o’s baby memory is promising but not yet retail-grade.
Engineering in an AI‑First Era
The RVA illustrates where software engineering is heading with LLMs: orchestrating reliable data pipelines, safety guardrails, and evaluation harnesses around powerful generative models. As the discipline aligns with standards common to mature engineering fields, human creativity can focus on the genuinely novel challenges that still demand it.
Grab a coffee, fire up the demo at aws.deepchatdemo.dae.mn/shop to have a play with what we have built.
Username: ChatbotTemp
Password: Daemon24
If your organisation is ready to explore an AI‑first approach to customer service, we would welcome a conversation about where a virtual assistant—shaped by rigorous engineering—could add measurable value.