Enhancing AI Safety in E-Commerce: Flipkart's Use of NVIDIA NeMo
Flipkart is making strides in AI safety within the e-commerce sector by adopting advanced technologies like NVIDIA's NeMo Guardrails. With the rise of Generative AI over the past year and a half, our focus has been on leveraging new large language model (LLM) capabilities to improve customer interactions. Given the extensive history of e-commerce and its multiple touchpoints with customers, sellers, and various stakeholders, we recognize the critical importance of maintaining safe, ethical, and compliant customer engagements.
To achieve this goal, we developed an internal AI safety layer called ‘Aegis.’ This orchestration layer is designed to shield production LLMs from safety breaches and to keep harmful outputs from reaching users. Currently, Aegis primarily moderates inputs: it runs concurrently with the main LLM call and assesses prompt safety before any response is returned to the user. We are also expanding Aegis to incorporate output moderation.
Our exploration of various open-source tools has led us to discover NVIDIA’s NeMo Guardrails, a toolkit that enables the addition of programmable guardrails to LLM-driven conversational platforms. This toolkit offers a comprehensive framework for managing and controlling language model behavior, significantly enhancing our AI safety protocols.
This article outlines our integration of NeMo Guardrails, aimed at strengthening AI safety for Flipkart's AI-driven customer interactions, and evaluates its efficacy compared to our existing Aegis system.
Focus of Our AI Safety Layer
Ensuring the safe deployment of LLMs requires addressing several potential challenges:
- Topic Relevance and AI Capability Concerns: LLMs tailored for e-commerce should avoid engaging in discussions that are religious, social, or political in nature, and must stay aligned with brand-relevant topics. They should also mitigate standard AI safety issues such as toxicity, illicit activities, prompt injections, role-playing, and impersonation.
- Legal and Misleading Content Generation: User inquiries must be contextually appropriate and align with the bot's capabilities. The generated output should conform to Flipkart’s values and customer service standards to prevent the dissemination of misleading or inaccurate content, which could harm customer experience or lead to legal repercussions.
- Integration and Future Scalability: The system must integrate smoothly with our custom LLMs, which have unique input and output structures. It should support the definition and implementation of custom actions and guardrails to accommodate the variety of LLMs in our ecosystem. As we expand our AI capabilities, a forward-thinking solution will be essential.
Our Analysis
AI Safety at Flipkart is a collaborative effort involving teams from ML Platform Engineering, Data Science, Legal, Data Privacy, Application Security, and Product Management. Among the various safety measures implemented, the run-time guardrail service powered by Aegis filters potentially harmful content from user inputs before they reach the LLMs. It is equally important to ensure that the outputs are suitable and that LLMs are safeguarded against potential threats.
From a platform perspective, all commercial and open-source fine-tuned LLMs are accessible through a central AI gateway, known as ‘Genvoy.’ This gateway facilitates features like rate limiting and multi-tenancy while supporting the integration of run-time safety services. Smooth integration with Genvoy is crucial for effectively managing LLM interactions.
Our objectives also include:
- Integrating new models
- Utilizing simple classifiers for minor safety checks
- Employing vulnerability-specific LLMs to decrease latency
- Customizing guardrails to meet our needs
- Implementing complex workflows
We require extensive customization and control over LLM context and dialogue flow to address our specific requirements.
Integrating NeMo Guardrails to Address Our Challenges
NVIDIA’s NeMo Guardrails provides customizable workflows, actions, a specialized rail system, and the ability to register custom classes, allowing us to define workflows tailored to various use cases.
NeMo Guardrails supports our objectives in several ways:
- Seamless Integration: It integrates effortlessly with existing conversational systems, enabling us to implement programmable guardrails without significant alterations. This is beneficial for Flipkart, where we utilize a limited number of fine-tuned LLMs across diverse use cases, allowing data scientists to enhance AI safety with minimal effort.
- Comprehensive Protection and Orchestration: NeMo Guardrails offers strong mechanisms to safeguard against LLM vulnerabilities like jailbreaks and prompt injections, ensuring secure and reliable conversations. It acts as an orchestration layer, similar to our in-house system, Aegis, coordinating various components of the conversational framework.
- Advanced Rail System: NeMo Guardrails includes:
  - General Instructions: Guides overall AI responses for specific use cases, like shopping assistants.
  - Dialog Rails: Restricts conversation topics to relevant areas.
  - Moderation Rails: Filters inputs and outputs and verifies facts to ensure adherence to community standards.
  - Flexible Moderation Levels: Adjusts moderation levels based on different scenarios.
  - Specific Rail Configurations: Customizes settings for each rail to provide precise control over prompt processing.
  - Colang Modeling Language: Utilizes Colang v2.0 for improved flow management.
- Custom Class Support: NeMo Guardrails allows for the registration of custom classes, enabling us to unify all our LLMs under a single framework to facilitate the service of different models.
Current Architecture
In our current framework:
- User requests are directed to Genvoy, our central AI gateway.
- Genvoy forwards the requests to the guardrail orchestration service, Aegis.
- Aegis simultaneously invokes guardrail APIs (List Checker, Threat Classifier, and LLM Guard) along with the LLM service.
  - List Checker: A program that blocks inputs matching a predefined list of harmful keywords and phrases using an n-gram TF-IDF-based fuzzy matching algorithm for close match detection.
  - Threat Classifier: Utilizes DeBERTa-based classifiers to identify prompt injection attempts and unsafe content.
  - LLM Guard: An LLM-based safeguard model trained to predict safety across 11 categories from the MLCommons taxonomy of hazards.
Aegis then returns a safe or unsafe response to Genvoy, which relays the moderated response back to the user.
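To make the List Checker idea concrete, here is a minimal sketch of TF-IDF fuzzy matching over character n-grams. It is not our production code: the class name `FuzzyListChecker`, the use of character trigrams, the sliding word window, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import Optional


def char_ngrams(text: str, n: int = 3) -> list[str]:
    """Return overlapping character n-grams of the lowercased, space-padded text."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))]


class FuzzyListChecker:
    """Hypothetical list checker: TF-IDF over character n-grams plus cosine similarity."""

    def __init__(self, blocked_phrases: list[str], threshold: float = 0.5, n: int = 3):
        self.phrases = blocked_phrases
        self.threshold = threshold  # assumed cut-off; would need tuning on real data
        self.n = n
        counts = [Counter(char_ngrams(p, n)) for p in blocked_phrases]
        doc_freq = Counter(gram for c in counts for gram in c)
        total = len(blocked_phrases)
        # Smoothed IDF; n-grams never seen in the blocklist get the largest weight.
        self.idf = {g: math.log((1 + total) / (1 + df)) + 1.0 for g, df in doc_freq.items()}
        self.unseen_idf = math.log(1 + total) + 1.0
        self.vectors = [self._vectorize(c) for c in counts]

    def _vectorize(self, counts: Counter) -> dict[str, float]:
        """Build an L2-normalised TF-IDF vector from n-gram counts."""
        vec = {g: tf * self.idf.get(g, self.unseen_idf) for g, tf in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {g: w / norm for g, w in vec.items()} if norm else {}

    def best_match(self, text: str) -> tuple[Optional[str], float]:
        """Slide a word window over the input and score it against every phrase."""
        words = text.lower().split()
        best_phrase, best_score = None, 0.0
        for phrase, vec in zip(self.phrases, self.vectors):
            size = len(phrase.split())
            for i in range(max(len(words) - size + 1, 1)):
                window = " ".join(words[i:i + size])
                query = self._vectorize(Counter(char_ngrams(window, self.n)))
                score = sum(w * vec.get(g, 0.0) for g, w in query.items())
                if score > best_score:
                    best_phrase, best_score = phrase, score
        return best_phrase, best_score

    def is_blocked(self, text: str) -> bool:
        return self.best_match(text)[1] >= self.threshold
```

With a blocklist such as `["credit card dump", "fake invoice"]`, a misspelled input like "fake invoise" still shares most of its trigrams with the blocked phrase and scores above the threshold, while an ordinary query such as "where is my order" shares almost none.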
With NeMo Guardrails, we can expand Aegis’s core functionalities. The enhanced flow orchestration and custom actions support dynamic and context-sensitive interactions with LLMs. Its flexibility in establishing strict or lenient workflows allows for rapid adaptation to various scenarios. Moreover, we can program safety checks and tailor responses using advanced moderation techniques for new flows.
NeMo Guardrails’ compatibility with our existing infrastructure facilitates straightforward deployment. Its customization capabilities bolster our AI safety framework alongside Aegis for Generative AI applications.
Implementation: How We Integrated NVIDIA NeMo Guardrails
We implemented a NeMo Guardrails endpoint using a Flask server capable of processing any LLM payload. This configuration enables NeMo Guardrails to support all our LLMs within a unified framework, similar to Aegis.
Here's an overview of our implementation and the specific use cases we targeted.
In our newly proposed architecture:
- Genvoy sends requests to our NeMo Guardrails server, which checks the input prompt through the input check flow. This flow includes custom-defined actions that make parallel calls to the Guardrail APIs (List Checker, Threat Classifier, and LLM Guard), akin to Aegis.
- If the input check flags the prompt as unsafe, the NeMo Guardrails server returns the filtered response directly to Genvoy without calling the LLM.
- If the prompt is safe, the server calls the LLM and checks the LLM’s output using the output check flow.
- The NeMo Guardrails server communicates the response back to Genvoy, indicating whether the output was safe or unsafe.
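The request path above can be sketched in a few lines of Python. This is an illustration of the control flow only, not the NeMo Guardrails server itself: `guarded_call`, `GuardedResponse`, and the injected `is_safe` and `call_llm` callables are hypothetical names standing in for the check flows and the LLM service.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

REFUSAL = "I'm sorry, I can't respond to that because it is unsafe"


@dataclass
class GuardedResponse:
    safe: bool
    text: str
    stage: str  # "input" or "output" if blocked there, otherwise "ok"


async def guarded_call(
    prompt: str,
    is_safe: Callable[[str], Awaitable[bool]],
    call_llm: Callable[[str], Awaitable[str]],
) -> GuardedResponse:
    """Input check, then LLM call, then output check."""
    # 1. Input check: an unsafe prompt is rejected before the LLM is called.
    if not await is_safe(prompt):
        return GuardedResponse(False, REFUSAL, "input")
    # 2. Only prompts that passed the input check reach the LLM.
    answer = await call_llm(prompt)
    # 3. Output check: the LLM's answer is moderated before it is returned.
    if not await is_safe(answer):
        return GuardedResponse(False, REFUSAL, "output")
    return GuardedResponse(True, answer, "ok")
```

Passing the checks and the LLM call in as parameters keeps the sketch independent of any particular classifier or model, and makes it easy to exercise with stubs.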
Implementation Details
This section discusses the key components of our implementation:
- Custom LLMs (Self-hosted) and Class Integration: We developed a custom class within NeMo Guardrails to accommodate our diverse LLMs.
- Input and Output Rails: Mechanisms to validate and filter inputs and outputs for secure interactions.
- Custom Actions and In-House Classifiers: Integrated in-house classifiers with NeMo Guardrails for asynchronous processing and reduced latency.
Custom LLMs and Class Integration
Our custom class within NeMo Guardrails supports various LLM models at Flipkart, allowing us to manage LLM calls in alignment with our AI policies. We maintain strict adherence to our safety protocols while controlling request submissions and response captures.
Additionally, this custom class enables dynamic parameter configuration, allowing adjustments for temperature and token limits according to different use case requirements. This flexibility allows us to adapt the system for various environments with minimal effort.
The custom class guarantees that NeMo Guardrails can support all our LLMs under a unified framework, mirroring our in-house moderation layer, Aegis. This consistency enhances safety and functionality across all AI-powered interactions.
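As a rough illustration of the idea, and not of NeMo Guardrails' own registration API, the sketch below shows one way a single wrapper can serve models with different payload structures while allowing per-request overrides of temperature and token limits. `UnifiedLLMClient`, `GenerationParams`, the payload builders, and the default values are all hypothetical.

```python
from dataclasses import dataclass, field, replace
from typing import Callable


@dataclass(frozen=True)
class GenerationParams:
    """Per-request generation settings (illustrative defaults)."""
    temperature: float = 0.2
    max_tokens: int = 256


# A payload builder turns (prompt, params) into one model's request body.
PayloadBuilder = Callable[[str, GenerationParams], dict]


@dataclass
class UnifiedLLMClient:
    """Hypothetical wrapper exposing one call signature over several models."""
    payload_builders: dict[str, PayloadBuilder]
    transport: Callable[[str, dict], str]  # sends a payload, returns model text
    defaults: GenerationParams = field(default_factory=GenerationParams)

    def generate(self, model: str, prompt: str, **overrides) -> str:
        # Per-use-case overrides (e.g. temperature) are layered over the defaults.
        params = replace(self.defaults, **overrides)
        payload = self.payload_builders[model](prompt, params)
        return self.transport(model, payload)
```

Each model contributes only a small payload builder, so adding a model does not change the calling code, and a use case that needs a different temperature passes it as a keyword argument to `generate`.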
Input and Output Check — Input and Output Rails
NeMo Guardrails efficiently supports both input and output moderation:
- Input Moderation: NeMo Guardrails analyzes incoming user queries before they reach the LLM, enabling the identification and filtering of off-topic or harmful inputs, ensuring conversations remain focused on shopping. Customizable input rails guide permissible queries, aligning interactions with Flipkart’s business objectives.
- Output Moderation: For outputs, NeMo Guardrails scrutinizes LLM responses before they are sent to users. By routing LLM outputs through our in-house classifiers, NeMo Guardrails intercepts potentially inappropriate or non-compliant content, ensuring each response adheres to Flipkart’s communication standards.
Implementation Steps:
- Define our own input and output rails (input check and output check) in config.yml.
rails:
  input:
    flows:
      - input check
  output:
    flows:
      - output check
- For each rail, define its flows (e.g., define flow input check) in flows.co.
define bot refuse to respond
  "I'm sorry, I can't respond to that because it is unsafe"

define flow input check
  $response = execute input_check
  if not $response
    bot refuse to respond
    stop

define flow output check
  $response = execute output_check
  if not $response
    bot refuse to respond
    stop
- For each flow, trigger custom actions (input_check and output_check) defined in actions.py. Each action makes three parallel calls to in-house classifiers — LlamaGuard, Threat Classifier, and List Checker.
from typing import Optional

from nemoguardrails.actions import action

@action()
async def input_check(context: Optional[dict] = None) -> dict:
    """
    Checks user input using the list checker model, threat classifier model, and llamaguard model and returns boolean values in a dictionary.
    """
    …
    return {"list_checker": list_output, "threat_classifier": threat_output, "llama_guard": llamaguard_output}
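One detail is worth spelling out when a flow branches on an action's result. In Python, a non-empty dictionary is always truthy, so a condition that follows Python truthiness is easiest to reason about when the per-classifier results are first reduced to a single boolean. The helper below is a sketch of that reduction; it assumes each classifier reports True when it flags the text as unsafe, a polarity the snippet above does not state, and the names `is_safe` and `flagged_by` are hypothetical.

```python
def flagged_by(verdicts: dict[str, bool]) -> list[str]:
    """Names of the classifiers that flagged the text (assumes True means unsafe)."""
    return [name for name, flagged in verdicts.items() if flagged]


def is_safe(verdicts: dict[str, bool]) -> bool:
    """True only when no classifier flagged the text."""
    return not any(verdicts.values())


# A dictionary of all-False verdicts is still truthy as an object,
# which is why the explicit reduction above is useful.
all_clear = {"list_checker": False, "threat_classifier": False, "llama_guard": False}
one_hit = {"list_checker": False, "threat_classifier": True, "llama_guard": False}
```

Returning the single boolean to the flow keeps the refusal decision in one place, while `flagged_by` preserves which classifier triggered for logging.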
Custom Actions and In-House Classifiers
We have seamlessly integrated three in-house classifiers into our system using NeMo Guardrails’ custom actions feature. This integration allows us to utilize existing classifiers while maintaining a cohesive connection with the guardrails.
One significant advantage of NeMo Guardrails is that custom actions can call the classifiers asynchronously. In a synchronous setup, each classifier call would have to finish before the next could begin, so the moderation latency would be roughly the sum of the individual call times. With asynchronous calls, the classifications run concurrently, so the moderation step takes about as long as the slowest classifier. This significantly reduces overall latency and keeps responses quick for customers without sacrificing moderation integrity.
Our implementation process is as follows:
import asyncio

# Asynchronous calls to the individual models (bodies omitted)
async def get_response_from_llamaguard_model(user_message: str): ...
async def get_response_from_threat_model(user_message: str): ...
async def get_response_from_list_model(user_message: str): ...

async def check_all_models(user_message: str) -> dict:
    # Gather all responses from the individual classifiers concurrently
    try:
        threat_result, list_result, llamaguard_result = await asyncio.gather(
            get_response_from_threat_model(user_message),
            get_response_from_list_model(user_message),
            get_response_from_llamaguard_model(user_message)
        )
        return {"threat_classifier": threat_result, "list_checker": list_result, "llama_guard": llamaguard_result}
    except Exception as e:
        print(f"Exception in check_all_models: {e}")
        raise
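The latency benefit of gathering the calls can be seen with a small, self-contained experiment. The sketch below replaces the real classifiers with a stub that simply sleeps; `fake_classifier`, the classifier names as dictionary keys, and the delays are made up for illustration.

```python
import asyncio
import time


async def fake_classifier(name: str, delay_s: float) -> bool:
    """Stand-in for a network call to a classifier; always reports 'not flagged'."""
    await asyncio.sleep(delay_s)
    return False


async def run_sequential(delays: dict[str, float]) -> dict[str, bool]:
    # Each call starts only after the previous one has finished.
    return {name: await fake_classifier(name, d) for name, d in delays.items()}


async def run_concurrent(delays: dict[str, float]) -> dict[str, bool]:
    # All calls are started together; gather preserves the argument order.
    values = await asyncio.gather(*(fake_classifier(n, d) for n, d in delays.items()))
    return dict(zip(delays, values))


def timed(coro_fn, delays: dict[str, float]) -> tuple[dict[str, bool], float]:
    """Run one strategy and return its result with the elapsed wall-clock time."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn(delays))
    return result, time.perf_counter() - start
```

Calling `timed(run_sequential, delays)` and `timed(run_concurrent, delays)` with the same delays returns identical verdicts, but the sequential run takes roughly the sum of the delays while the concurrent run takes roughly the largest one.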
Evaluation
We conducted non-functional requirement (NFR) testing to compare NeMo Guardrails with our in-house moderation layer, ‘Aegis,’ focusing on latency and throughput. Both systems were tested using the same dataset, centered on e-commerce queries, and operated with the same backend LLM.
Methodology: We tracked key metrics such as end-to-end latency (across percentiles like p50, p75, p95, p99, and max), queries per second (QPS), responses per second (RPS), average input tokens per request, and average tokens generated per request. To simulate real-world usage, we tested with varying levels of concurrency, where multiple "users" interacted with the system simultaneously. Each user (represented by a thread) sent requests and awaited responses, mimicking actual user behavior. We gradually increased the number of concurrent users from one, doubling until we reached a maximum limit, sustaining each level long enough to gather accurate data.
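A simplified version of this kind of ramp test can be sketched as follows. It is not our actual harness: `run_level`, `ramp`, and `percentile` are hypothetical helpers, `send_request` stands in for a call to the service under test, and each level runs a fixed number of requests per user rather than a fixed duration.

```python
import math
import threading
import time
from typing import Callable


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def run_level(send_request: Callable[[], None], users: int, requests_per_user: int) -> dict:
    """Run one concurrency level with one thread per simulated user."""
    latencies: list[float] = []
    lock = threading.Lock()

    def user_loop() -> None:
        # Each user sends a request and waits for the response before the next one.
        for _ in range(requests_per_user):
            start = time.perf_counter()
            send_request()
            elapsed = time.perf_counter() - start
            with lock:
                latencies.append(elapsed)

    threads = [threading.Thread(target=user_loop) for _ in range(users)]
    wall_start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    wall = time.perf_counter() - wall_start
    return {
        "users": users,
        "requests": len(latencies),
        "rps": len(latencies) / wall,
        "p50": percentile(latencies, 50),
        "p95": percentile(latencies, 95),
        "p99": percentile(latencies, 99),
        "max": max(latencies),
    }


def ramp(send_request: Callable[[], None], max_users: int, requests_per_user: int = 20) -> list[dict]:
    """Start at one user and double the concurrency until max_users is exceeded."""
    levels, users = [], 1
    while users <= max_users:
        levels.append(run_level(send_request, users, requests_per_user))
        users *= 2
    return levels
```

Running `ramp` against both systems with the same `send_request` workload yields one row of latency percentiles and throughput per concurrency level, which is the shape of data the comparison relies on.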
Results: NeMo Guardrails required classifiers to check both input and output, which added latency compared to Aegis, which only checks input prompts. However, NeMo Guardrails ensures that only queries that have passed safety checks are sent to the LLM, preventing delays caused by harmful content processing.
This evaluation was crucial in determining whether the advantages of NeMo Guardrails justify the additional latency, considering its enhanced safety and moderation capabilities.
Here are the evaluation results:
The evaluation showed that Aegis sustains a higher queries-per-second (QPS) rate at lower latency. In contrast, NeMo Guardrails introduced additional latency while providing more thorough safety checks by validating both input and output with our in-house classifiers.
Conclusion
The integration of NeMo Guardrails into Flipkart's AI safety framework has effectively addressed the primary challenges associated with deploying LLMs in a customer-facing setting. By offering advanced flow management, dual-stage moderation, and the capability to integrate a diverse range of LLMs, NeMo Guardrails has significantly bolstered our capacity to facilitate safe, pertinent, and contextually appropriate conversations for our e-commerce bots.
Addressing Our Challenges
- Maintaining Context and Relevance: NeMo Guardrails enforces strict guardrails, ensuring our LLMs steer clear of inappropriate or irrelevant discussions, such as political debates, and remain focused on pertinent e-commerce subjects.
- Ensuring Comprehensive AI Safety: Through its input and output moderation capabilities, NeMo Guardrails checks content both before it reaches our LLMs and before responses reach users, helping ensure compliance with Flipkart’s safety standards.
- Achieving Seamless Integration and Customization: The custom class within NeMo Guardrails has enabled us to integrate various LLMs with specific payload structures, providing control and configurability across all models.
While our evaluation revealed additional latency from the more detailed checks conducted by NeMo Guardrails, we believe this trade-off is justified by the significant improvements in safety and adaptability. Although we are currently utilizing Aegis, we plan to transition to NeMo Guardrails in the upcoming quarter.
Future Directions
Looking ahead, we aim to implement topical rails for enhanced dialogue management, enabling our e-commerce bots to engage in more precise and context-aware interactions.
We are optimistic that this ongoing effort will not only support Flipkart in delivering an effective and customer-friendly AI experience but also ensure our LLMs uphold the highest standards of safety and performance within the e-commerce sector.
References
- https://github.com/NVIDIA/NeMo-Guardrails/tree/develop?tab=readme-ov-file#types-of-guardrails
- https://github.com/NVIDIA/NeMo-Guardrails
- https://arxiv.org/abs/2310.10501
- https://docs.nvidia.com/nemo/guardrails/colang_2/overview.html#colang-2-0
- https://drive.google.com/file/d/1V8KFfk8awaAXc83nZZzDV2bHgPT8jbJY/view