AI and Machine Learning With Dataiku

Build and evaluate advanced machine learning models using AutoML and the latest AI techniques.

 

Feature Engineering

To speed up feature engineering, data scientists at every level, from citizen data scientists to experts, can use automatic feature generation, or discover reference feature sets in Dataiku’s feature store and import them into their projects.

AutoML in Dataiku transparently applies handling strategies for feature selection and reduction, missing values, variable encoding, and rescaling based on data type. Accept the default settings or easily modify any part for your specific objectives.
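To make those handling strategies concrete, here is a minimal, library-free Python sketch of three of them: mean imputation for missing values, one-hot encoding for a categorical variable, and standard rescaling for a numeric one. It is an illustration only, not Dataiku's implementation; the function names and the toy `age` and `plan` columns are hypothetical.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no signal; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Encode a categorical column as one 0/1 column per distinct category."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical toy columns, made up for illustration.
age = [30, None, 50]
plan = ["basic", "pro", "basic"]

age_filled = impute_mean(age)        # missing value handling
age_scaled = standardize(age_filled) # rescaling
plan_encoded = one_hot(plan)         # variable encoding
```

Which strategy suits a column depends on its type and distribution, which is why being able to override the defaults matters.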

 

Integrations with Generative AI Services

Dataiku integrates with leading Generative AI services such as OpenAI’s ChatGPT, with more coming soon. Because Dataiku is technology- and algorithm-agnostic, your team can choose the Generative AI services and models that best balance agility, performance, and cost for your business needs.

 

Delivering More Models with AutoML

Dataiku augments the model development process with a guided methodology, built-in guardrails, and white-box explainability so data scientists and analysts alike can build and compare multiple production-ready models.

Dataiku AutoML offers algorithms from leading frameworks for prediction, clustering, time series forecasting, causal ML, and computer vision tasks to help people across the business generate the best results, all in an easy-to-use interface.

 

Custom ML

Advanced data scientists can extend the visual ML interface by programmatically developing custom models using Python, R, Scala, Julia, PySpark, and other languages, or by importing models developed with MLflow.

To ensure external efforts are visible and interpretable to the rest of the team, Dataiku captures the details of MLflow experiments or Cloud ML models and automatically provides model comparisons and explainability reports.

Regardless of where a model is developed, Dataiku remains the central platform for deployment, monitoring, and governance.

 

Prompt Engineering

Build large language model (LLM)-augmented projects with prompt engineering using Dataiku’s Prompt Studios. Design, compare, and operationalize high-performing, programmatic, and reusable prompts. (coming soon)

 

Model Validation and Evaluation

Dataiku AutoML provides numerous features for validating and evaluating models, from design to deployment. During the experimentation phase, data scientists can take advantage of k-fold cross-testing and automatic diagnostics, and use model assertions and prediction overrides as sanity checks.
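As a rough illustration of the idea behind k-fold testing, the Python sketch below splits row indices into k folds so that every row is held out for testing exactly once. It is a generic, unshuffled version written for this page, not Dataiku's implementation; `k_fold_indices` is a hypothetical helper name.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_idx, test_idx) pairs; each row is tested exactly once."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    indices = list(range(n_rows))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Example: 10 rows split into 3 folds.
folds = list(k_fold_indices(n_rows=10, k=3))
```

In practice, rows are usually shuffled (or grouped, or ordered by time) before splitting; the sketch leaves that out to keep the mechanics visible.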

An extensive battery of interactive performance and interpretation reports, including fairness analysis, what-if analysis, and stress tests, gives teams the tools they need to explain results and responsibly deliver reliable, accurate models.

 

Time Series Analysis and Forecasting

Dataiku provides a suite of tools for time series exploration and statistical analysis, along with preparation tasks such as resampling, imputation, decomposition, and extrema and interval extraction.
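For readers unfamiliar with resampling, the Python sketch below shows one common variant: projecting an irregularly sampled series onto a regular grid and filling the new points by linear interpolation. It is a simplified illustration under stated assumptions (numeric, strictly increasing timestamps), not Dataiku's implementation; `resample_linear` is a hypothetical helper name.

```python
def resample_linear(timestamps, values, step):
    """Resample an irregular series onto a regular grid by linear interpolation.

    Assumes timestamps are strictly increasing numbers (for example, seconds)
    and that step is positive.
    """
    if len(timestamps) != len(values) or len(timestamps) < 2:
        raise ValueError("need at least two aligned points")
    if step <= 0:
        raise ValueError("step must be positive")
    grid, out = [], []
    t = timestamps[0]
    j = 0  # index of the left neighbour of the current grid point
    while t <= timestamps[-1]:
        # Advance until timestamps[j] <= t <= timestamps[j + 1].
        while timestamps[j + 1] < t:
            j += 1
        t0, t1 = timestamps[j], timestamps[j + 1]
        v0, v1 = values[j], values[j + 1]
        out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
        grid.append(t)
        t += step
    return grid, out


# Example: observations at t = 0, 10, 30, resampled every 5 time units.
grid, resampled = resample_linear([0, 10, 30], [0.0, 10.0, 50.0], step=5)
```

Linear interpolation is only one way to fill the new grid points; forward fill or a constant value may be more appropriate depending on the signal.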

Business specialists and data scientists alike can easily develop, deploy, and manage statistical or deep learning forecasting models using Dataiku’s visual ML interface.

 

Visual and Code-Based Deep Learning

Dataiku’s familiar model design, deployment, and governance experience makes it easy to include deep learning as part of data projects and business applications.

Define custom deep learning architectures with Keras and TensorFlow, or take advantage of pretrained models, transfer learning, and no-code interfaces for computer vision tasks such as image classification and object detection.

 

Scale with Managed Spark on Kubernetes

For large computation or model training jobs, teams can automatically and efficiently scale workloads with on-demand, elastic resources powered by Spark and Kubernetes on their cloud of choice.

Pre-configured and fully managed clusters hide the complexity of containerized infrastructure from data scientists, so they spend more time on modeling and less time setting up backend resources.
