
Dataiku Version 13.2

Dataiku 13.2 delivers updates to Generative AI and the LLM Mesh, Data Preparation, AI & Machine Learning, visualization, and more. Learn more in our release notes and find instructions to update your instance.

 

Highlighted Updates

Explore many more feature updates organized by Dataiku capability below.

 

Generative AI

Explore all generative AI capabilities in Dataiku

LLM Evaluation

If you need to assess the performance and outputs of your LLM pipelines, Dataiku now offers a dedicated Evaluate LLM recipe. This recipe lets you visually evaluate the performance of your LLMs and computes standard metrics, which are saved into a model evaluation store. You can use this model evaluation store to automate processes, making it a key component of an LLMOps framework.

Updates to LLM Mesh

The latest LLM Mesh improvements introduce support for new models, connections, and features, including:

  • Guardrails*: detection for malicious prompt injection using specialized local models
  • Vertex AI connection enhancements. Added support for:
    • Gemini 1.5 Flash & 1.5 Pro / Text & image embedding models
    • Custom models
  • Bedrock connection enhancements. Added support for:
    • Mistral Large 2 / Llama 3.1 models / Titan Multimodal Embeddings G1
  • Support for Llama 3.1 models in Databricks connection
  • Support for Claude 3.5 Sonnet in Anthropic connection

*These features are currently only available as part of the Early Adopter Program.

LLM Fine-Tuning Supports Azure OpenAI and AWS Bedrock

The LLM fine-tuning recipe now supports Azure OpenAI and AWS Bedrock. You can deploy your fine-tuned LLMs to the LLM Mesh and manage saved model versions, deployment IDs, and more.

Contextual Retrieval in RAG

In Dataiku’s Advanced RAG, you now have the option to specify additional columns for contextual retrieval. Information from those columns is then retrieved along with the matching content at query time, supporting more advanced RAG approaches.

 

Universal Ops

Explore all DataOps capabilities and MLOps capabilities in Dataiku

Data Quality Updates

In 13.2, Data Quality received the following updates:

  • A new rule compares the values of two metrics, so you can check that a metric remains consistent or has not changed significantly between datasets. The rule can compare metric values within a single dataset or across multiple datasets.
  • An additional rule, named Column Values in Set, allows you to check if all the values of a column exist in a predefined set. This simplifies data validation and provides more clarity to the values present in a column.
  • You can now populate values from a sample straight into your Data Quality rules. This will reduce manual work and streamline the rule-creation process.
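
To make the first two rules concrete, here is a minimal Python sketch of the checks they describe. It is an illustration only, not Dataiku's implementation: the function names, the relative-change tolerance, and the sample values are all assumptions made for this example.

```python
def column_values_in_set(values, allowed):
    """Return the distinct values that fall outside the allowed set.

    An empty result means the hypothetical rule passes.
    """
    return sorted(set(values) - set(allowed))


def metrics_within_tolerance(metric_a, metric_b, max_relative_change):
    """Return True if metric_b differs from metric_a by at most
    max_relative_change, measured relative to metric_a."""
    if metric_a == 0:
        # Avoid dividing by zero: only an unchanged metric passes.
        return metric_b == 0
    return abs(metric_b - metric_a) / abs(metric_a) <= max_relative_change


# Hypothetical column values and allowed set.
countries = ["FR", "US", "DE", "XX"]
print(column_values_in_set(countries, {"FR", "US", "DE"}))  # ['XX']

# Hypothetical row counts from two datasets, with a 5% tolerance.
print(metrics_within_tolerance(1000, 1040, 0.05))  # True
print(metrics_within_tolerance(1000, 1200, 0.05))  # False
```

In Dataiku itself these checks are configured visually as Data Quality rules; the sketch only shows the logic being tested.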

 

Data Prep

Explore all Data Prep capabilities in Dataiku

Column-Level Data Lineage

With Data Lineage, we’re adding column-level lineage views to the Data Catalog. This includes:

  • Lineage across shared projects: Easily track data flow across projects, improving collaboration and troubleshooting.
  • Automatic lineage: Automatically generate lineage for most visual recipes and prepare processors.
  • Manual controls: For complex recipes or code-based processes, name-based matching allows for lineage mapping with the option to manually edit steps for greater accuracy.

Delete and Reconnect Recipe

The Delete and Reconnect recipe allows users to delete a recipe and its successor dataset from the flow while reconnecting the flow’s branches. It also detects schema changes and provides a way to update them, helping prevent issues downstream.

Repeating Recipes and Datasets

You can now use repeating recipes and datasets in your flows. A repeating dataset or recipe takes a secondary “driver” dataset as a parameter and runs once for each row of the driver dataset, in short succession. On each run, variables are replaced with the values in the current row of the driver dataset.

This feature is currently supported on the following Dataiku objects:

  • Create a dataset from a managed folder
  • Create a SQL dataset
  • SQL recipe
  • Download recipe
  • Export to folder recipe
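
The repeating behavior described above can be sketched in plain Python: one run per driver row, with variables filled in from that row. This is a conceptual illustration, not Dataiku's API. The `run_repeating` helper, the driver columns, and the SQL text are hypothetical, and the placeholder syntax is that of Python's `string.Template`.

```python
from string import Template


def run_repeating(query_template, driver_rows):
    """Expand the template once per driver row, substituting that row's values."""
    template = Template(query_template)
    return [template.substitute(row) for row in driver_rows]


# Hypothetical driver dataset: each row triggers one run.
driver_rows = [
    {"region": "EMEA", "year": "2023"},
    {"region": "APAC", "year": "2024"},
]

queries = run_repeating(
    "SELECT * FROM sales WHERE region = '${region}' AND year = ${year}",
    driver_rows,
)
for query in queries:
    print(query)
```

Two driver rows produce two expanded queries, one per row, in driver order.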

 

AI & Machine Learning

Explore all AI & Machine Learning capabilities in Dataiku

Free-Text Annotation in Managed Labeling

Labeling can now be done using Free Text Annotation on tabular datasets. Users can find Free Text Annotation in the Managed Labeling Suite, along with other labeling tasks for records, images, and text.

Monotonic Constraints in Tree-Based Models

Users can now set monotonic constraints on numeric features in tree-based models. A constraint forces the model's predictions to only increase, or only decrease, as the constrained feature increases, which helps models reflect known real-world relationships and improves trust in their results.
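
As a rough illustration of what such a constraint guarantees, the sketch below checks whether a model's predictions ever move in the wrong direction as one feature grows. The two toy "models" and the income grid are invented for this example, and the check is not how Dataiku enforces the constraint during training.

```python
def is_monotonic(predict, grid, increasing=True):
    """Check that predictions never move against the requested direction
    as the feature takes the successive values in grid (sorted ascending)."""
    preds = [predict(x) for x in grid]
    pairs = list(zip(preds, preds[1:]))
    if increasing:
        return all(a <= b for a, b in pairs)
    return all(a >= b for a, b in pairs)


def constrained_model(income):
    """Toy score that never decreases as income rises."""
    return min(1.0, income / 100_000)


def unconstrained_model(income):
    """Toy score with a spurious bump between 40k and 50k."""
    if 40_000 <= income < 50_000:
        return 0.9
    return min(1.0, income / 100_000)


grid = range(0, 120_001, 10_000)
print(is_monotonic(constrained_model, grid))    # True
print(is_monotonic(unconstrained_model, grid))  # False
```

The second model fails because its score drops from 0.9 to 0.5 when income goes from 40,000 to 50,000, the kind of counterintuitive behavior a monotonic constraint rules out.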

 

Visualization & Data Storytelling

Explore all visualization capabilities in Dataiku

Dataiku App Version Management

Dataiku apps now have version management. Users can update app versions directly from the app designer, automatically notify app users of any updates, and retain key settings such as instance URLs and parameter selections during the update process.

Charts Enhancements

Charts have received the following enhancements:

  • Standard deviation aggregations on numeric columns
  • Display values on line charts and mix charts
  • Improved value formatting: percentage display, parentheses for negative values, trailing zeros

 

Dataiku Answers Updates

Updates in Dataiku Answers v1.3.2 include user profile settings, support for multiple dataset retrieval, an upgrade in Python version support, and multimodal support with Anthropic Claude 3 LLMs (via AWS Bedrock). Several bugs have also been fixed.


All Release Details In Dataiku Release Notes


For previous releases

Take the Release Highlight Course

Review selected features from the latest Dataiku releases!

Take the Course in the Academy