One Acre Fund empowers smallholder farmers across East Africa to combat hunger and poverty, but operating across many countries with varying programs created challenges. This decentralization posed issues, particularly for smaller countries striving to uphold stringent data management standards.
For the Ethiopia team, manually comparing large datasets, along with other data management challenges, hurt payment timeliness and customer satisfaction.
Further, climate change made it difficult to determine optimal planting times. This problem was compounded by the difficulty of reaching farmers in remote areas.
Read on to see how Dataiku streamlined One Acre Fund’s data management, automated formerly manual processes, and helped the country teams reach farmers in remote areas of Africa.
Difficulty Managing and Analyzing Large Datasets
One Acre Fund’s operations in Ethiopia faced a challenge with data management. With 230,000 farmers who received 20 million seedlings of different species, the amount of data to track was significant. Once recorded, the data had to be compared with other datasets, and doing this manually in spreadsheets was difficult at that scale.
This process was not only time-consuming but also prone to human error, making it challenging to gain meaningful insights from the heaps of data. Sharing data between teams was also a challenge, preventing valuable information from aiding essential research on smallholder farmers.
Uncertainty Around Planting Times
Climate change made it difficult for farmers to know when rain would come and therefore when to plant seeds. If seeds were planted too early, the farmers risked germination failure due to insufficient rainfall. If seeds were planted too late, there was insufficient water for development during the growing season. Further, it was difficult to reach the farmers to communicate prime planting dates and other important information.
After: Automated Processes Lead to Faster Insights
Quick Setup in Dataiku
The One Acre Fund team quickly imported their different data sources into Dataiku through the API and Google Sheets plugins. They were then able to run recipes easily to get the insights they needed.
The global data team focused on enhancing the data management infrastructure by establishing a data warehouse in Snowflake and integrating it with the Dataiku framework. One Acre Fund teams were trained to use Dataiku to manage country-specific data.
How One Acre Fund’s Kenya Team Utilized Dataiku Applications
One Acre Fund’s Kenya team uses several Dataiku applications, most notably, a system for real-time credit scoring. The credit scoring flow utilizes OCR technology, image processing algorithms, and machine learning (ML) models to extract relevant data from scanned or captured images of ID documents.
This data is verified against a national database by calling its API endpoint. Additional farmer and purchase information is then supplied, and the credit scoring ML model calculates a credit score. Based on the resulting credit score, an array of options is presented to the farmers. The credit scoring model is continuously monitored in Dataiku for data drift, performance, and potential biases to make sure it best serves the organization’s and, more importantly, the farmers’ needs.
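To make the idea of monitoring for data drift concrete, here is a minimal sketch of one common drift measure, the Population Stability Index (PSI). It is an illustration only: the source does not say which drift metric is used in Dataiku or by One Acre Fund, and the function name, bin count, and sample data are assumptions.

```python
import math


def population_stability_index(reference, recent, n_bins=10):
    """PSI between a reference sample and a recent sample of one numeric feature.

    Bins are equal-width over the range of the reference sample (an assumption;
    quantile bins are another common choice).
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all reference values identical; avoid division by zero

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        # Floor each share at a tiny value so the logarithm is always defined.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_shares = bin_shares(reference)
    new_shares = bin_shares(recent)
    return sum(
        (new - ref) * math.log(new / ref)
        for ref, new in zip(ref_shares, new_shares)
    )


# Hypothetical feature values: a reference sample and a shifted recent sample.
reference_sample = [i / 100 for i in range(100)]
shifted_sample = [0.5 + i / 200 for i in range(100)]

psi_same = population_stability_index(reference_sample, reference_sample)
psi_shifted = population_stability_index(reference_sample, shifted_sample)
```

A PSI near zero means the recent distribution matches the reference. A frequently quoted rule of thumb treats values above roughly 0.2 as a shift worth investigating, though the right threshold depends on the feature and the model.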
Greater Insights Through Automation
The teams in Ethiopia, Zambia, and Malawi have moved from working manually in Google Sheets to a more sophisticated process utilizing KoboToolbox, Commcare, Snowflake, and Dataiku. With this new method, a field officer gathers farmer information and the resulting dataset encompasses GPS coordinates, field boundary polygons, farmer demographics, purchase details, and potentially other pertinent data.
This information is integrated into Dataiku through the API plugin, where it is then enriched with payment data from an MSSQL database through join recipes. Once this central dataset is established, a series of validations are undertaken, including but not limited to:
- Geographical Overlap Verification: Utilizing the geojoin recipe, checks are performed to identify any overlaps between distinct fields.
- Deforestation Zone Assessment: Through a custom Python plugin integrated with Google Earth Engine, determination is made whether a field lies within a deforested area.
- Field Geometry Evaluation: Both visual inspections utilizing Dataiku charts and automated assessments within a prepare recipe are employed to validate that field shapes are plausible.
- Data Completeness Examination: Join recipes are employed to identify instances where required data is absent in certain datasets, with unmatched data being retrieved and addressed.
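The logic behind some of these validations can be sketched in plain Python. This is not the team's implementation (the checks above run as Dataiku recipes): the column name `farmer_id`, the helper names, and the sample coordinates are all hypothetical, and the overlap test uses bounding boxes only, which is a coarse pre-filter rather than a true polygon intersection.

```python
def unmatched_ids(central_rows, other_rows, key="farmer_id"):
    """Completeness check: keys in the central dataset with no match elsewhere."""
    other_keys = {row[key] for row in other_rows}
    return [row[key] for row in central_rows if row[key] not in other_keys]


def shoelace_area(vertices):
    """Planar polygon area via the shoelace formula (units follow the input)."""
    area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0


def is_plausible_shape(vertices, min_area=1e-9):
    """Geometry check: at least three vertices and a non-degenerate area."""
    return len(vertices) >= 3 and shoelace_area(vertices) > min_area


def bounding_box(vertices):
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def boxes_overlap(a, b):
    """Coarse overlap check: do the bounding boxes of two fields intersect?"""
    ax0, ay0, ax1, ay1 = bounding_box(a)
    bx0, by0, bx1, by1 = bounding_box(b)
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


# Hypothetical sample data.
field_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
field_b = [(1, 1), (3, 1), (3, 3), (1, 3)]
field_c = [(10, 10), (11, 10), (11, 11), (10, 11)]
sliver = [(0, 0), (1, 1), (2, 2)]  # collinear points, zero area

farmers = [{"farmer_id": 1}, {"farmer_id": 2}, {"farmer_id": 3}]
payments = [{"farmer_id": 1}, {"farmer_id": 3}]
missing_payments = unmatched_ids(farmers, payments)
```

Pairs flagged by the bounding-box test would still need an exact geometric intersection check, which is the step a geospatial join performs.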
The culminating datasets are then made available within Snowflake and are frequently exported to Google Sheets using sync recipes. These workflow schedules are tailored by the country teams according to their preferences, and updates on these processes are communicated through Slack or email channels. As a result, data refreshing occurs multiple times throughout the day to ensure accuracy and relevancy.
One Acre Fund’s Agriculture Research Team (ART) provides insights and recommendations related to agriculture. Their recommendations are based on data collected in the field and on publicly available geospatial information. The data is analyzed, and the optimized recommendations generated by ML models are exposed as API endpoints. Individual country teams can then integrate these API endpoints in a way that works best for them.
Dataiku users have access to geospatial insights through a tailor-made Streamlit application. This dynamic application draws upon internal data sourced from the data warehouse, which is then augmented with freely available satellite imagery retrieved from Google Earth Engine. Within this framework, country teams gain the capability to access a dashboard-driven viewer, leveraging its capabilities to craft maps that align precisely with their specific needs. Since the country teams are using Dataiku, their data is already stored in the data warehouse. This inherently facilitates the incorporation of localized geospatial information, streamlining the process and enhancing the utility of the insights generated.
How a Chatbot Helps Farmers Plant at the Right Time
Optimizing planting dates is critical to improving yields and reducing food insecurity. With Dataiku, One Acre Fund was able to connect to different sources of data (internal data, satellite data, and survey information) and create a project to clean the satellite data as well as a project to synchronize the chatbot with the usable datasets.
To communicate with farmers, One Acre Fund created a chatbot using USSD technology to send them direct messages with important information. Using the AutoML features in Dataiku, One Acre Fund was able to create a forecasting model tailored to every farmer. Through the Dataiku API, the data and model were connected to the chatbot, allowing every farmer to access their personal planting date recommendation.
Achieving Data Democratization and So Much More
Data collection, which previously relied on laborious methods like Google Sheets or handwritten records, has transitioned to a streamlined form-based approach that integrates seamlessly with Dataiku. This transition has yielded multiple benefits: data integrity has been secured through proper backups, users receive automated feedback via data checks, and processing speeds have improved significantly, particularly for large datasets that were previously cumbersome for Google Sheets to handle. With time freed up by automated processes, the One Acre Fund country teams are now able to take on more complex tasks and projects.
For those processing data, the global tech team offered a training session in Dataiku. After this half-day training, the team not only knew how to manage their data in Dataiku, but their data was also stored in the data warehouse and automatically processed in Dataiku. As a result, this information became readily accessible organization-wide with appropriate user permissions in place. Thanks to the democratization of data in Dataiku, everyone has access to data, saving time and increasing efficiency.
The One Acre Fund Ethiopia team has experienced improved customer satisfaction due to more timely payments, thanks to automating processes with Dataiku. The team in Zambia has experienced a 90% reduction in the workforce required to process data, allowing employees to focus on other needs. In addition, they can now furnish farmers with precise planting recommendations and have automated satellite checks in place.
By receiving information from a forecasting model in Dataiku at the right time through the chatbot, farmers are able to plant seeds at the optimal moment to maximize their output. In the long term, based on the results of pilot studies (which showed an income increase of $10.60 per adopter), the initiative is projected to generate an average of $5.58 per farm annually, based on a 52.6% adoption rate.
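The two figures are consistent with a simple expected-value calculation: the gain per adopter multiplied by the share of farmers who adopt. The source does not spell out the formula, so the sketch below is an inference from the reported numbers, and the variable names are illustrative.

```python
# Figures reported from the pilot studies.
income_gain_per_adopter = 10.60  # USD per adopting farmer
adoption_rate = 0.526            # 52.6% of farmers adopt

# Assumed formula: average gain per farm = gain per adopter x adoption rate.
average_gain_per_farm = income_gain_per_adopter * adoption_rate

print(round(average_gain_per_farm, 2))  # 5.58
```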
In addition to the aforementioned teams, several other groups within the organization also leverage Dataiku’s capabilities. For instance, the Rwanda team employs Dataiku to provide crucial support for their recently established stores. Meanwhile, the supply chain management team harnesses Dataiku’s power to make accurate forecasts of fertilizer prices, utilizing time series models to aid in this process.
In total, 22 teams with over 100 active users work with Dataiku at One Acre Fund. These diverse teams can independently conceptualize, develop, and manage intricate operations. This autonomy allows them to operate with agility and efficiency while upholding stringent standards of data quality, privacy, and performance.