What does it take to empower everyone in an organization to work with data, from process control engineers to project managers and everyone in between?
We sat down with Khuram Pervez, Head of AI and Data Science at Emirates Global Aluminium (EGA), who is doing just that. By using Dataiku to bring non-data experts and data experts together on top of Databricks as the data processing and storage layer, Khuram and his team are putting curated, trusted data in the hands of citizen data scientists from across the company. The result? More problems solved across the organization.
With Dataiku and Databricks, EGA is better equipped to tackle analytical problems that improve its core business: how it manufactures aluminum, including reinforcing safety. It also has the capacity to optimize other business areas, such as sales, marketing, and supply chain.
Breaking Down Data Silos
Prior to adopting Dataiku and Databricks, EGA, like many companies, had data living in different places and in different business teams. These data silos limited accountability for and ownership of data.
That’s why, before diving into data science or analytical problems, EGA had to open up its ecosystem. With Dataiku and Databricks, the team was able to easily get visibility into what data they have, where it lives, and what it means, and to democratize the use of that data for people within the business, especially those with little or no coding experience. That’s because with Dataiku, users can build sophisticated visual, no-code workflows that use Databricks as the underlying computation engine.
Building Up a Robust Citizen Data Science Practice
While still early in their journey, Khuram and EGA are working on developing both citizen data science and data ambassador programs. These programs will help formalize the collaboration between lines of business and the central data team by identifying data champions around the business and giving them the visibility and connections they need to work on the highest-value use cases.
They’re taking the time to build out this vision because, as Khuram says, “We can’t become a bottleneck in solving problems with data.” Khuram and his team are working on putting larger, more complex, high-priority use cases into production in collaboration with the business, but in parallel, they also want to keep a flow of ideas coming from the business. Giving business users access to curated data that they trust, so they can form hypotheses and experiment, is the best way to do that.
The programs are accessible to anyone, from those already working with tools like spreadsheets to those who don’t currently do formal data work but have ideas for how data can help improve processes (e.g., people on the shop floors).
Diving Into Generative AI & the Future
When it comes to Generative AI, Khuram’s team is already experimenting and building use cases. The Dataiku LLM Mesh enables users enterprise-wide to securely connect to large language models (LLMs), including models hosted by Databricks Mosaic AI, allowing for PII detection, toxicity moderation, and cost tracking when working with Generative AI and LLMs.
Khuram and his team are looking in particular at Generative AI use cases for teams with large volumes of policies or technical documents, where LLMs can summarize and search content and ultimately save time.
However, given the investment, they are also working out how to quantify the value of these use cases and articulate it clearly, so that they have a convincing ROI story. Their other challenge, as with many organizations, is prioritizing Generative AI use cases against their main backlog of more traditional machine learning and analytics use cases.