Our client is a mid-sized ecommerce company operating in the competitive online fashion retail space. With over 700 apparel and accessories products across categories like womenswear, menswear, and footwear, they sell multiple brands targeted at style-conscious 20-35 year old consumers.
As a digital-native vertical retailer lacking the resources of larger competitors, our client needed an innovative strategy to stand out. Their goal was to increase engagement, conversions, and revenue by providing customized recommendations and promotions tailored to each individual shopper based on preferences and history. However, performing individualized personalization manually across their extensive product catalog and highly diverse customer base was not operationally feasible.
With hundreds of products and limited visibility into individual customer nuances, tailoring unique recommendations posed a significant challenge. Rather than solely relying on generalized recommendations, our client wanted to provide customized friendly suggestions based on each active customer's interests to reduce abandonment rates. The main challenge was to either upsell customers on multiple items they were interested in with discounted combos or reminders, or provide personalized discounts real-time for individual items a high risk abandonment customer was actively viewing.
Furthermore, our objective was to personalize the messaging itself for each customer, with the goal of enhancing resonance and amplifying engagement. This required tailoring the language, tone and offers in real-time popups to align with that individual's preferences and context.
To meet the complex personalization needs of this project, we designed an integrated system optimized for capability, scalability and low-latency.
At the core, we leverage a dedicated instance of Meta's LLAMA-2 natural language model hosted on Azure. After evaluating alternatives like GPT-4, LLAMA-2 was selected due to its optimal balance of conversational ability and high throughput performance. By deploying the mid-size "13B chat" version on Azure, we can achieve strong contextual language generation while maintaining millisecond response times required for real-time use cases.
Critically, the fully managed Azure deployment allows us to maintain control and tailor LLAMA-2 for our specific inference performance needs. We also apply additional content safety wrappers to ensure brand safety.
LLAMA-2 is combined with additional purpose-built components for functions like customer browsing analysis and risk of abandonment prediction, real-time pricing optimization, and personalized popup messaging. Azure Kubernetes Service handles orchestration of these elements into an integrated workflow.
Together, this ensemble approach leverages the strengths of LLAMA-2, cloud scale, and specialized components to analyze and predict customer behavior. It enables responding contextually to each individual across every visit with personalized language. This integrated system provides comprehensive real-time personalization capabilities powered by an ideal language model.
The system consists of four key components working together:
Our personalized solution powered by LLAMA-2 delivered significant uplift during a 3-week A/B test versus control, validating the enterprise value of large language models deployed at scale. The integration of predictive machine learning and LLAMA-2's conversational capabilities strongly resonated with customers.
Given the pilot's success, our client has deployed the solution company-wide. We are collaborating on expanded personalization initiatives like tailored onboarding, customized product descriptions, and data-driven re-engagement messaging. This project has unlocked further opportunities to transform experiences across touchpoints, powered by impactful AI-driven personalization.