LLM Guard Services: Dataiku sets guardrails for generative AI projects

Dataiku formalizes the launch of LLM Guard Services in what the publisher calls LLM Mesh, a “common backbone” for generative AI applications.

LLM Guard Services uses functions introduced from the DSS 12 platform and reinforced in DSS 13, then in its version 13.2, available since October 3.

This solution consists of three components:

Safe Guard
Quality Guard
Cost Guard.

Safe Guard embodies DSS capabilities to evaluate requests to large language models and their responses to see if they contain personal or confidential data, and to block them if necessary.

Dataiku had already integrated Presidio, an open source SDK (under MIT license) developed by Microsoft to identify and anonymize sensitive data in texts (preferably respecting a certain structure) and images (via OCR).

Dataiku’s system detects generic entities such as bank card numbers, IBAN codes, email addresses, telephone numbers, URLs, physical and IP addresses or even names. Added to this are entities specific to certain countries, including US or Italian license identifiers, Australian and British medical data (among others).

In DSS 13.2, the company adds support for Meta PromptGuard, a classification model capable of detecting injections in prompts and malicious instructions. This integration is available in private preview for customers who have subscribed to the “Advanced LLM Mesh Early Adopter” program.

Quality Guard also hides a functionality accessible through the same modality. Dataiku has developed a recipe for evaluating the results of an LLM.

“Using LLM Quality Guard, customers can automatically calculate standard LLM assessment metrics, including LLM-as-a-Judge techniques like answer relevance, answer correctness, context accuracy, etc. ., as well as statistical techniques such as BERT, Red and Blue, and more,” Dataiku said in a statement.

A short video allows you to familiarize yourself with the recipe. Depending on the nature of the task (chatbot, translation, summary, etc.), it provides recommendations on the metrics to collect. It is also possible to add personalized evaluations: the recipe interface includes a Python notebook. Alerts can be triggered if the quality of results from an LLM integrated into an application deteriorates. The editor allows you to keep the evaluation results and the code used for its execution for comparison purposes.

Mastering and proving the value of PoCs

Cost Guard was the first of the LLM Guard services to be announced in March 2024. In addition to an audit trail, Dataiku provides a dashboard that tracks the costs and usage of LLMs, whether called by API or deployed on site. The editor also offers a way to cache the most common requests and responses.

“For several years, companies have been interested in statistics, data, machine learning, data science and analytics. All of these areas are often grouped together indiscriminately, and companies strive to exploit them. Since the arrival of ChatGPT two years ago, general management has intensified their efforts, encouraging their data teams not to miss this revolution,” summarizes Amaury Delplancq, vice-president of Southern Europe at Dataiku.

Dataiku’s historical contacts, such as chief data officers and IT departments, have been carrying out experiments with tools such as the Mistral, OpenAI and Meta models for around a year and a half to two years. “Currently, most of these initiatives are at the proof of concept (PoC) stage,” considers Amaury Delplancq.

“However, we are entering a new phase,” he notes. “In large CAC 40 companies, managers who encouraged their teams not to miss this opportunity are now taking a step back. They realize they are spending a lot of money and resources without a clear demonstration of return on investment (ROI). This creates pressure for PoCs to show concrete results.”

During its Everyday AI Week Paris event which took place from September 24 to 26, several clients, including GRDF, Malakoff Humanis, BNP Paribas and Société Générale, shared their progress in this area.

A regulatory “paradigm shift”

Added to this are European regulations. “After the introduction of the GDPR, it is crucial to understand the potential impact of the AI Act. This implies that companies must organize themselves to document their projects and prove that they control these experiments,” he insists.

Companies would be “not ready” to comply with the AI Act. “I had several meetings with large CAC40 companies. Many processes are managed by accumulated systems, some of which are only known by one or two people,” relates Amaury Delplancq. “As a result, some companies are not very comfortable with the idea of a regulatory audit. Not everyone is perfectly organized either, because the current regulations were not yet fully defined.” Now that the text has entered into force, “we are witnessing a paradigm shift”.

This is the purpose of LLM Guard services and the project evaluation module with regard to the requirements of the AI Act.

A central hub to manage business AI use cases

Many players, including cloud providers, as well as Snowflake, Databricks, Splunk, Datadog and others offer similar FinOps and security capabilities. Whether to manage the uses of generative AI or those linked to data processing.

“The difference is that we are not judges and parties,” points out Amaury Delplancq. “We do not have a model for consuming cloud resources. And that changes absolutely everything,” he assures. “This is why our major clients ask us to keep costs under control.”

Generally speaking, Dataiku intends to position itself as the control tower for all AI projects.

“Dataiku positions itself as the central hub,” says Amaury Delplancq. “We can provide an overview of all use cases within the company,” he adds. “We are even capable of managing use cases carried out outside of Dataiku, such as those on Snowflake or Databricks. Some developers use these tools, or even Google Vertex, without going through the Dataiku interface.”

The large accounts mentioned above and others would go to the publisher for this reason. “We even see that some companies, despite their great maturity and a group of 200 very productive data scientists, realize that they can only meet 10 to 20% of their organization’s needs,” says the manager. “This observation leads these companies to understand that they need a platform that can integrate many of the tasks and tools necessary to deploy analytical and AI projects.”

The publisher points out, however, that it can manage the governance of models and algorithms, not data. “The governance we put in place aims to help our clients comply with regulatory requirements. In reality, it will not be us who will have to respond directly to regulators,” specifies the vice-president of Southern Europe.

Tags Paris Malakoff

Mastering and proving the value of PoCs

A regulatory “paradigm shift”

A central hub to manage business AI use cases

Related posts