Dataiku August 2024 Product Updates

Dataiku 13.1 delivers Generative AI updates, additional XOps capabilities, governance enhancements, and visualization updates.
Learn more in our release notes and find instructions to update your instance.

 

Highlighted Updates

 

Explore many more feature updates organized by Dataiku capability below.

Generative AI

Generative AI capabilities in Dataiku

LLM Fine Tuning


If you need to refine your LLM to perform better on highly specific tasks, or your use case requires you to continually incorporate labeled data, Dataiku now enables you to fine-tune your model.

Users now have two options to take advantage of fine-tuning in Dataiku:

  • Dataiku’s new fine-tune recipe is a unique low/no-code approach that opens up fine-tuning to non-coders.*
  • In addition, users can fine-tune through Python code if preferred, with full flexibility and customizability when fine-tuning local LLMs from Hugging Face Hub and access to state-of-the-art techniques from the open source community, all while benefiting from the LLM Mesh.

Updates to LLM Mesh

The latest LLM Mesh improvements introduce support for new models, connections, and features, including:

  • Guardrails: specialized local models from Hugging Face for toxicity detection*
  • LLM Mesh API support for function calling, additional parameters, and connecting external tools for advanced use cases (LLM agents, etc.)
  • Support for Llama 3, Mistral (7B, 8x7B, Large), Titan Embeddings v2, and Cohere Command (R, R+) models through AWS Bedrock
  • Support for Gemma in the Hugging Face connection
  • New “Clear data” option for the Knowledge Banks handler

*These features are currently only available as part of the Early Adopter Program.

RAG Updates

Data is chunked before it is embedded in a vector store, and chunking is a key step in preparing data for use cases like RAG. Chunking techniques and parameters can greatly influence the end result for augmented chatbots.

Dataiku’s Prepare recipe now includes a “split into chunks” processor. This new processor allows users to specify separators, visualize the chunks interactively, and apply post-processing steps to ensure chunks are separated as expected.
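To illustrate why separators and post-processing matter, here is a minimal Python sketch of separator-based chunking. It is not the implementation of the “split into chunks” processor: the function name `split_into_chunks` and the `separators` and `max_chars` parameters are assumptions made for this example only.

```python
import re


def split_into_chunks(text, separators=("\n\n", "\n"), max_chars=200):
    """Split text on the given separators, then pack pieces into chunks.

    Chunks aim to stay under max_chars; a single piece longer than
    max_chars is kept whole rather than cut mid-sentence.
    """
    # Match any separator, longest first, so "\n\n" is not read as two "\n".
    ordered = sorted(separators, key=len, reverse=True)
    pattern = "|".join(re.escape(sep) for sep in ordered)

    # Post-processing: trim whitespace and drop empty pieces.
    pieces = [piece.strip() for piece in re.split(pattern, text)]
    pieces = [piece for piece in pieces if piece]

    chunks, current = [], ""
    for piece in pieces:
        candidate = f"{current} {piece}".strip()
        if current and len(candidate) > max_chars:
            # Adding this piece would overflow the chunk: start a new one.
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


sample = "Alpha beta.\n\nGamma delta.\nEpsilon."
small_chunks = split_into_chunks(sample, max_chars=20)
large_chunks = split_into_chunks(sample, max_chars=25)
```

With the smaller limit each sentence ends up in its own chunk, while the larger limit merges the first two. Inspecting the output like this is the same kind of check the interactive chunk visualization is meant to support.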

Universal Ops

DataOps capabilities and MLOps capabilities in Dataiku

Unified Monitoring Updates

Unified Monitoring provides a comprehensive view of all deployments and their overall health. Unified Monitoring for batch projects on the automation node now includes a new Govern card and status indicator. Deployment status is fetched from Dataiku Govern, so users can check the status of both batch projects and model endpoints without leaving the Unified Monitoring dashboard, which gives a complete, centralized view of ML project health.


Data Quality Updates

Since its introduction in 12.6, Data Quality has continued to receive improvements. Updates to Data Quality in 13.1 include multi-column support on all column-based rules, the ability to publish data quality statuses to dashboards, and template updates.

Data Prep

Data Prep capabilities in Dataiku

Multi-Row Formula

Users can now pass an optional offset argument to the existing functions that access a column value, which lets a formula read values from other rows. The offset argument is available in the Prepare recipe only, in all processors that support Formula. It is useful for cases such as iterative calculations or auto-increment IDs.
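The idea behind a row offset can be sketched in plain Python. This is not Dataiku formula syntax: the helper `value_at`, its parameters, and the convention that a positive offset means "that many rows earlier" are all assumptions made for this illustration.

```python
def value_at(rows, index, column, offset=0, default=None):
    """Return the value of `column` in the row `offset` rows before `index`.

    Falls back to `default` when that row does not exist (for example,
    when looking one row back from the first row).
    """
    target = index - offset
    if 0 <= target < len(rows):
        return rows[target][column]
    return default


rows = [{"sales": 10}, {"sales": 15}, {"sales": 12}]

for i, row in enumerate(rows):
    # Auto-increment ID: previous row's id plus one, starting at 1.
    row["id"] = value_at(rows, i, "id", offset=1, default=0) + 1
    # Iterative calculation: running total built on the previous row's result.
    previous_total = value_at(rows, i, "running_total", offset=1, default=0)
    row["running_total"] = previous_total + row["sales"]
```

Both derived columns depend on a value computed for the previous row, which is what a single-row formula cannot express.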

Build Flow Zones from Scenario

Users can now build everything within a Flow zone in a single Scenario step, rather than building individual items within that zone.

Visualization & Data Storytelling

Visualization capabilities in Dataiku

Dashboard Enhancements

Dashboards gain UX enhancements such as page and title settings, as well as faster loading of visible tiles.

Charts Enhancements

Charts have received the following enhancements:

  • Median/percentile aggregation for numeric columns 
  • New gauge chart type
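As a reminder of what a median or percentile aggregation computes per group, here is a small sketch using Python's standard `statistics` module. The sample records, the `percentile` helper, and the choice of the "inclusive" interpolation method are assumptions for this example; the charts may use a different percentile definition.

```python
from collections import defaultdict
from statistics import median, quantiles


def percentile(values, p):
    """Return the p-th percentile (1-99) with linear interpolation.

    statistics.quantiles with n=100 returns the 99 cut points between
    percentiles, so the p-th percentile is at index p - 1.
    It needs at least two values.
    """
    return quantiles(values, n=100, method="inclusive")[p - 1]


# Hypothetical (region, amount) records; "north" contains one outlier.
records = [("north", 10), ("north", 20), ("north", 90), ("south", 5), ("south", 7)]

groups = defaultdict(list)
for region, amount in records:
    groups[region].append(amount)

medians = {region: median(values) for region, values in groups.items()}
p90 = {region: percentile(values, 90) for region, values in groups.items()}
```

The median for "north" is 20 even though its mean is 40, which is why a median aggregation is often preferred when a column contains outliers.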

Governance

Governance capabilities in Dataiku

Dataiku Govern Improvements

Dataiku Govern now has enhanced auditability with a global instance timeline, a centralized view of all events on Dataiku Govern items. This global timeline is accessible to all users and can be filtered on multiple conditions. In addition, new custom filters let users filter on more metadata, with conditional formatting and nested and/or conditions.

Dataiku Solutions

New Dataiku Solutions have been added:

  • Clinical Site Intelligence: Leverage insights from clinical trial studies around the world to facilitate new study competitive intelligence and site analytics.
  • Store Segmentation: Group stores with similar characteristics based on demographic data and/or category sales data in an effort to develop a bespoke approach to optimizing operations.
  • Customer Satisfaction Reviews: Analyze your customer-rated reviews. Extract valuable insights from a large amount of text data. Uses the LLM Mesh.
  • Survival Analysis Plugin: Survival analysis is an advanced statistical technique. This plugin creates new recipes to support statistical tests and survival probability estimation.
  • Dataiku Answers Updates: Updates to the Dataiku Answers plugin v1.2.4 include UI updates, automatic knowledge bank usage, the ability to document metadata context, WT1 events specific to Answers, and bug fixes.


Find all details in our release notes.


For previous releases

Take the Release Highlight Course

Review selected features from the latest Dataiku releases!

Take the Course on the Academy