The Ocean Cleanup is at the forefront of tracking vast quantities of plastic debris across the oceans. Effective cleanup operations depend on accurate data on the movement and concentration of plastics, gathered with tools such as GPS devices, AutoNauts for environmental monitoring, and X-band radar. The harder problem comes afterward: cleaning, integrating, and analyzing that data to turn it into actionable insights.
Addressing Data Challenges to Enhance Efficiency and Collaboration
The Ocean Cleanup faced obstacles in data pipeline management due to slow updates, computational inefficiencies, and data inconsistencies. The management of diverse data types and formats from various sources compounded these issues, and the absence of a centralized data platform hindered effective collaboration and advanced analytics. A versatile platform was crucial to analyze extensive data on plastic locations and address the monumental task of cleaning up 1.8 trillion pieces of plastic floating at the surface of the Great Pacific Garbage Patch.
Empowering Data Science for Environmental Good
In response to these challenges, The Ocean Cleanup partnered with Dataiku’s social impact initiative, Ikig.AI. This collaboration provided access to Dataiku’s platform and expert support, enhancing their data management capabilities and significantly accelerating data analysis processes.
Through this partnership, The Ocean Cleanup has achieved remarkable milestones, including removing over 450,000 kg of plastic from the Great Pacific Garbage Patch and 17,000,000 kg from rivers worldwide. Additionally, the creation of one of the largest beach cleanup databases has improved understanding of shore cleaning processes, driving global collaboration and optimization.
Advanced Data Pipeline Management Revolutionizes Workflow Efficiency
The Dataiku platform changed how The Ocean Cleanup manages its data pipelines, making it easier to monitor progress and apply insights from past projects to future initiatives. Within the first year, the team replicated complex data workflows that had previously been spread across SQL databases, Excel, MATLAB, and Python, centralizing operations and significantly reducing manual effort. This integration freed team members to focus on developing high-value features and on deeper innovation.
Comprehensive Data Handling and Centralized Management Empowers Decision-Making
The centralized platform has been pivotal in how The Ocean Cleanup manages and analyzes environmental data across various types and formats. Automated workflows now handle tasks such as correcting 306,225 rows that lacked country information and computing weights for nearly 100,000 rows, improving both data accuracy and efficiency. By bringing technical and non-technical stakeholders together on a single platform, Dataiku’s visual interface has accelerated decision-making. This has enabled quicker campaign evaluations, real-time adjustments, and a more strategic approach to addressing environmental challenges.
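The two cleaning tasks mentioned above, filling in missing country values and computing missing weights, can be illustrated with a minimal sketch. This is not The Ocean Cleanup's actual workflow, which the source does not describe. The field names (`site`, `country`, `category`, `count`, `weight_kg`), the lookup tables, and the rule of estimating weight as item count times an average unit weight are all assumptions made for illustration.

```python
# Hypothetical lookup from a cleanup site name to its country (illustrative values).
SITE_TO_COUNTRY = {
    "Kamilo Beach": "United States",
    "Kuta Beach": "Indonesia",
}

# Hypothetical average weight per item in kilograms, by item category.
AVG_ITEM_WEIGHT_KG = {
    "bottle": 0.025,
    "fishing_net": 2.5,
}


def fill_country(row):
    """Return a copy of the row with a missing country filled from the site name."""
    fixed = dict(row)
    if not fixed.get("country"):
        # Unknown sites stay None so they can be reviewed manually later.
        fixed["country"] = SITE_TO_COUNTRY.get(fixed.get("site"))
    return fixed


def fill_weight(row):
    """Return a copy of the row with a missing weight estimated from the item count."""
    fixed = dict(row)
    if fixed.get("weight_kg") is None:
        unit_weight = AVG_ITEM_WEIGHT_KG.get(fixed.get("category"))
        if unit_weight is not None:
            fixed["weight_kg"] = fixed["count"] * unit_weight
    return fixed


def clean(rows):
    """Apply both corrections to every row without mutating the input."""
    return [fill_weight(fill_country(row)) for row in rows]
```

Each function returns a new dictionary rather than editing the input, so the raw records remain available for auditing how many rows were corrected.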
Geospatial Analysis and Data Integration
Enhanced geospatial analysis capabilities allow for precise tracking of debris movements and identification of plastic hotspots. Automated data pipelines keep the database continually updated, which supports strategic decision-making. Dataiku accommodates a range of data sources, from one of the largest beach cleanup databases to underwater cameras, and integrates them to provide a detailed view of how plastics move within aquatic environments.
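One simple way to picture hotspot identification is to bin observed positions into a latitude/longitude grid and flag the cells with the most sightings. The sketch below assumes that approach purely for illustration. The source does not say which method The Ocean Cleanup uses, and the function names, the one-degree cell size, and the `min_count` threshold are hypothetical.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=1.0):
    """Map a position to the index of the grid cell that contains it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def find_hotspots(observations, cell_deg=1.0, min_count=3):
    """Return (cell, count) pairs with at least min_count sightings, densest first.

    observations is an iterable of (latitude, longitude) pairs in degrees.
    """
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in observations)
    hotspots = [(cell, n) for cell, n in counts.items() if n >= min_count]
    return sorted(hotspots, key=lambda item: item[1], reverse=True)
```

A fixed-degree grid is coarse, and its cells shrink in physical area toward the poles, so a production analysis would more likely use an equal-area grid or a density estimate. The sketch only shows the basic count-and-threshold idea.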
Dataiku’s Interface and Learning Resources Boost Collaboration
Dataiku’s platform democratizes data science, making it accessible to all team members through intuitive visual recipes for data wrangling and visualization. Previously, onboarding new team members required extensive manual training on disparate tools, such as Python scripts and Excel models. Now, Dataiku’s user-friendly interface and learning resources help new users get up to speed quickly. This accessibility has broadened engagement, enabling five times more individuals across The Ocean Cleanup to contribute their expertise. It has also improved donor outreach, with the team optimizing the timing of social media posts to boost funding.
Additionally, Dataiku’s learning resources have shortened time-to-insight by allowing both data scientists and non-technical users to draw actionable insights directly from visual workflows. For example, directors now access and analyze project performance without relying on technical support, which streamlines decision-making. By fostering citizen data scientists, Dataiku has extended its reach across the organization, turning tasks like fundraising analysis and social media optimization into efficient, data-driven processes.