Highlighted Updates
- Ability to evaluate and monitor LLM outputs
- Prompt injection detection guard & more LLM Mesh updates
- Azure OpenAI and AWS Bedrock LLMs supported in visual fine-tuning
- Column-level data lineage view in the Data Catalog
- Delete and reconnect recipes
- Repeating recipes and datasets from settings
Explore many more feature updates organized by Dataiku capability below.
- Generative AI
- Universal Ops
- Data Prep
- AI & Machine Learning
- Visualization & Data Storytelling
- Dataiku Answers
Generative AI
Explore all generative AI capabilities in Dataiku
LLM Evaluation
If you need to assess the performance and outputs of your LLM pipelines, Dataiku now offers a dedicated Evaluate LLM recipe. This recipe enables you to visually evaluate the performance of your LLMs and outputs standard metrics, which are saved into a model evaluation store. This model evaluation store can be utilized to automate processes and will be an essential component of your LLMOps framework.
Updates to LLM Mesh
The latest LLM Mesh improvements introduce support for new models, connections, and features, including:
- Guardrails*: detection for malicious prompt injection using specialized local models
- Vertex AI connection enhancements. Added support for:
  - Gemini 1.5 Flash & 1.5 Pro / Text & image embedding models
  - Custom models
- Bedrock connection enhancements. Added support for:
  - Mistral Large 2 / Llama 3.1 models / Titan Multimodal Embeddings G1
- Support for Llama 3.1 models in Databricks connection
- Support for Claude 3.5 Sonnet in Anthropic connection
*These features are currently only available as part of the Early Adopter Program.
LLM Fine-Tuning Supports Azure OpenAI and AWS Bedrock
The LLM fine-tuning recipe now supports Azure OpenAI and AWS Bedrock. You can deploy your fine-tuned LLMs to the LLM Mesh and manage saved model versions, deployment IDs, and more.
Contextual Retrieval in RAG
In Dataiku’s Advanced RAG, you now have the option to specify additional columns for contextual retrieval. Information from those columns is retrieved alongside the matched content at query time, supporting more advanced RAG approaches.
Universal Ops
Explore all DataOps capabilities and MLOps capabilities in Dataiku
Data Quality Updates
In 13.2, Data Quality received the following updates:
- A new rule compares the values of two metrics, so you can check that a metric remains consistent or has not changed significantly. The rule can compare metric values either within a single dataset or across multiple datasets.
- An additional rule, named Column Values in Set, allows you to check if all the values of a column exist in a predefined set. This simplifies data validation and provides more clarity to the values present in a column.
- You can now populate values from a sample straight into your Data Quality rules. This will reduce manual work and streamline the rule-creation process.
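To make the logic of these rules concrete, here is a minimal plain-Python sketch of the three checks. It is an illustration only, not Dataiku's implementation or API: the function names, the `max_relative_change` parameter, and the 10% default tolerance are all assumptions made for this example.

```python
def column_values_in_set(values, allowed):
    """Check that every value in a column belongs to a predefined set.

    Returns (passed, offending), where offending lists the distinct
    values that are not in `allowed`.
    """
    allowed = set(allowed)
    offending = sorted({v for v in values if v not in allowed})
    return (not offending, offending)


def metrics_within_tolerance(metric_a, metric_b, max_relative_change=0.1):
    """Compare two metric values (hypothetical tolerance-based comparison).

    Passes when metric_b differs from metric_a by at most
    `max_relative_change`, expressed relative to metric_a.
    """
    if metric_a == 0:
        return metric_b == 0
    return abs(metric_b - metric_a) / abs(metric_a) <= max_relative_change


def allowed_values_from_sample(sample):
    """Build a candidate set of allowed values from a sample of the column."""
    return sorted(set(sample))


# Populate the allowed set from a sample, then validate the full column.
allowed = allowed_values_from_sample(["FR", "US", "FR", "DE"])
passed, offending = column_values_in_set(["FR", "US", "XX", "DE"], allowed)
```

In this sketch, the unexpected value `"XX"` makes the check fail and is reported back, which mirrors the kind of feedback a validation rule is meant to give.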
Data Prep
Explore all Data Prep capabilities in Dataiku
Column-Level Data Lineage
With Data Lineage, we’re adding column-level lineage views to the Data Catalog. This includes:
- Lineage across shared projects: Easily track data flow across projects, improving collaboration and troubleshooting.
- Automatic lineage: Automatically generate lineage for most visual recipes and prepare processors.
- Manual controls: For complex recipes or code-based processes, name-based matching allows for lineage mapping with the option to manually edit steps for greater accuracy.
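As a rough illustration of name-based matching followed by a manual edit, consider the sketch below. It assumes exact, case-sensitive name equality and uses made-up column names; Dataiku's actual matching logic is not described in these notes, so treat the function `match_columns_by_name` as hypothetical.

```python
def match_columns_by_name(input_columns, output_columns):
    """Map each output column to the input column with the same name.

    Output columns with no same-named input are mapped to None, which
    marks them as candidates for manual lineage editing.
    (Exact-name matching is an assumption made for this sketch.)
    """
    available = set(input_columns)
    return {out: (out if out in available else None) for out in output_columns}


# Automatic pass: only identically named columns are linked.
lineage = match_columns_by_name(
    ["customer_id", "amount"],
    ["customer_id", "amount", "amount_eur"],
)

# Manual pass: a user fills in the link that name matching could not infer.
lineage["amount_eur"] = "amount"
```

The split between an automatic pass and a manual override reflects the idea that derived columns in code-based steps often need a human to state where they came from.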
Delete and Reconnect Recipe
The Delete and Reconnect recipe allows users to easily delete a recipe and its successor dataset from the flow while reconnecting the flow’s branches. Additionally, it detects schema changes and provides a way to update them, ensuring no issues happen downstream.
Repeating Recipes and Datasets
You can now use repeating recipes and datasets in your flows. A repeating dataset or recipe takes a secondary “driver” dataset as a parameter and runs once for each row of the driver dataset, in short succession. On each run, variables are replaced with the values from the current row of the driver dataset.
This feature is currently supported on the following Dataiku objects:
- Create a dataset from a managed folder
- Create a SQL dataset
- SQL recipe
- Download recipe
- Export to folder recipe
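The repeating mechanism can be pictured as a loop over the driver rows with variable substitution on each pass. The sketch below uses Python's standard `string.Template` to stand in for that substitution; the query, the table name `sales`, and the driver columns are invented for the example, and this is not how Dataiku executes recipes internally.

```python
from string import Template


def expand_for_each_row(template_text, driver_rows):
    """Render the template once per driver row.

    Each row is a dict whose keys are the variable names; every
    ${name} placeholder is replaced by that row's value.
    """
    template = Template(template_text)
    return [template.substitute(row) for row in driver_rows]


# Hypothetical driver dataset: one run per row.
driver_rows = [
    {"country": "FR", "year": "2023"},
    {"country": "US", "year": "2024"},
]

queries = expand_for_each_row(
    "SELECT * FROM sales WHERE country = '${country}' AND year = ${year}",
    driver_rows,
)
```

Two driver rows yield two rendered queries, one per run, each with that row's values filled in.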
AI & Machine Learning
Explore all AI & Machine Learning capabilities in Dataiku
Free-Text Annotation in Managed Labeling
Free Text Annotation is now available for labeling tabular datasets. Users can find it in the Managed Labeling Suite, along with other labeling tasks for records, images, and text.
Monotonic Constraints in Tree-Based Models
Users can now set monotonic constraints on numeric features in tree-based models. A constraint ensures that the model's prediction only increases (or only decreases) as the constrained feature increases, to better mimic real-world patterns and improve trust in model results.
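To show what a monotonic constraint guarantees, the sketch below checks whether a prediction function moves in only one direction as a single feature varies and everything else is held fixed. The toy model and feature names are invented for illustration, and the helper `is_monotonic_in_feature` is hypothetical; it verifies the property on a grid of values and does not train a constrained model.

```python
def is_monotonic_in_feature(predict, base_row, feature, grid, increasing=True):
    """Check that predictions never move the wrong way along one feature.

    `predict` takes a dict of feature values and returns a number.
    All features keep their `base_row` value except `feature`, which
    sweeps through the sorted `grid`.
    """
    preds = [predict({**base_row, feature: value}) for value in sorted(grid)]
    pairs = list(zip(preds, preds[1:]))
    if increasing:
        return all(later >= earlier for earlier, later in pairs)
    return all(later <= earlier for earlier, later in pairs)


def toy_price_model(row):
    """Stand-in model: price rises with surface and falls with age."""
    return 1000.0 * row["surface_m2"] - 500.0 * row["age_years"]


base = {"surface_m2": 50, "age_years": 10}
rises_with_surface = is_monotonic_in_feature(
    toy_price_model, base, "surface_m2", [20, 40, 60, 80]
)
falls_with_age = is_monotonic_in_feature(
    toy_price_model, base, "age_years", [0, 5, 20, 40], increasing=False
)
```

A model trained with an increasing constraint on a feature would pass this kind of check for any base row and any grid, not only the ones tested here.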
Visualization & Data Storytelling
Explore all visualization capabilities in Dataiku
Dataiku App Version Management
Dataiku apps now have version management. Users can update app versions directly from the app designer, automatically notify app users of any updates, and retain key settings such as instance URLs and parameter selections during the update process.
Charts Enhancements
Charts have received the following enhancements:
- Standard deviation aggregations on numeric columns
- Display values on line charts and mix charts
- Improved value formatting: percentage display, parentheses for negative values, trailing zeros
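For readers unfamiliar with these formatting options, the following sketch shows how percentage display, parentheses for negative values, and trailing zeros combine on a single number. The function and its parameter names are assumptions for illustration and do not correspond to Dataiku chart settings.

```python
def format_value(value, decimals=2, as_percentage=False, negative_parentheses=False):
    """Format a number for display.

    - `decimals` fixes the number of decimal places, keeping trailing zeros.
    - `as_percentage` multiplies by 100 and appends a percent sign.
    - `negative_parentheses` renders negatives as (x) instead of -x.
    """
    if as_percentage:
        value *= 100
    text = f"{abs(value):,.{decimals}f}"
    if as_percentage:
        text += "%"
    if value < 0:
        return f"({text})" if negative_parentheses else f"-{text}"
    return text


percentage = format_value(0.1234, decimals=1, as_percentage=True)
negative = format_value(-1500, negative_parentheses=True)
padded = format_value(2.5, decimals=3)
```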
Dataiku Answers Updates
Updates in Dataiku Answers v1.3.2 include user profile settings, support for multiple dataset retrieval, an upgrade in Python version support, and multimodal support with Anthropic Claude 3 (via AWS Bedrock) LLMs. Several bugs have also been fixed.