Don’t waste it, save it
This project was carried out as part of the TechLabs “Digital Shaper Program” in Münster (winter term 2021/22).
Abstract
The project “Don’t waste it, save it” addresses the challenge of reducing food waste. Because foodsharing.de is one of the biggest initiatives targeting that problem, we built our project on their data. Our goal is to raise awareness by showing the daily progress of saved food in the greater Münster area and to make it accessible to the public, so that the incentive to participate increases. The process was divided into three big steps: building a scraper and scraping the data, cleaning it, and visualizing it.
Introduction
As our aim was to raise awareness of food waste, the first step was to find a reliable source from which to collect our data. For that purpose we decided to collaborate with FoodSharing. It is important to mention that FoodSharing does not publish daily numbers on how much food has been saved, only the total amount saved since the platform was launched. Our challenge was thus to retrieve and store that total on a daily basis, to develop code that extracts the amount of food saved each day, and finally to visualize that number.
Methodology and results
To share code and information we mostly used Google Colab and Google Drive (as well as the chat during Zoom calls, where most of the collaborative work took place). InfluxDB was used to create our database. We furthermore used Visual Studio Code with PlatformIO for programming the microcontroller. As our code and information were thus quite decentralized towards the end of the project, one of our team members gathered everything on GitLab for easier access and a better overview.
Our first step was to create a scraper and to write the gathered data via the InfluxDB API for Node.js to our InfluxDB time series database.
The script uses Nightmare (Node.js) (with proton / electron) to connect to foodsharing.de via CSRF token-forgery (with permission of foodsharing.de). It requests the total amount in kg of every business in the region of Münster (via the district, or Bezirk, ID 109) and runs every six hours. To write to the database it uses API tokens together with prebuilt client libraries. The first data was written to our database on 2021-12-03. We selected InfluxDB version 2.0 or later so that we could use its GUI to visualise and analyse data directly in the database interface. Another benefit is that the API is convenient to use from multiple platforms. We created multiple accounts to access the database.
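To make the write step more concrete, here is a minimal sketch of how one scraped total could be serialised into InfluxDB line protocol, the text format that InfluxDB 2.x accepts for writes. It is written in plain Python rather than in the Node.js of our scraper, and the measurement name foodsharing, the tag bezirk and the sample value are assumptions chosen for illustration, not the names used in our database.

```python
from datetime import datetime, timezone


def escape_tag(text: str) -> str:
    """Escape commas, equals signs and spaces, which are special in tag and field keys."""
    for char in (",", "=", " "):
        text = text.replace(char, "\\" + char)
    return text


def to_line_protocol(measurement: str, tags: dict, fields: dict, when: datetime) -> str:
    """Build one line: measurement,tag=value field=value timestamp_in_ns.

    The measurement name is used as given, so it must not contain commas or spaces.
    """
    tag_part = "".join(
        f",{escape_tag(k)}={escape_tag(str(v))}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(f"{escape_tag(k)}={float(v)}" for k, v in fields.items())
    nanoseconds = int(when.timestamp()) * 1_000_000_000
    return f"{measurement}{tag_part} {field_part} {nanoseconds}"


# Hypothetical reading: the names and the 123456 kg total are made up for illustration.
line = to_line_protocol(
    "foodsharing",
    {"bezirk": 109},
    {"value_KG": 123456},
    datetime(2021, 12, 3, tzinfo=timezone.utc),
)
print(line)
```

In practice a client library builds and sends these lines for you; the sketch only shows what ends up in the database for each scrape.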
We then proceeded to clean the data.
We imported the required dataset directly into a pandas DataFrame called raw_df by using the InfluxDB API with token-based access. We then visualised and analysed the completeness of the given date range to make sure that there are no missing values (no gaps of more than 24 hours between entries). We determined 2022-02-13T00:00:00Z as the starting point for our Flux query.
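The completeness check can be sketched as follows. This is an illustrative version in plain Python rather than the pandas code we ran; the function name find_gaps and the sample timestamps are assumptions.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_gap=timedelta(hours=24)):
    """Return (previous, next) pairs of scrape times that are more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


# Hypothetical scrape times: every six hours, with one 30-hour hole.
start = datetime(2022, 2, 13)
scrapes = [start + timedelta(hours=6 * i) for i in range(4)]
scrapes.append(scrapes[-1] + timedelta(hours=30))

gaps = find_gaps(scrapes)
print(gaps)
```

An empty result means the date range has no hole longer than 24 hours, so every day has at least one reading to work with.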
We then started to clean raw_df by renaming columns and removing unnecessary ones such as “result”, “table”, “_start” and “_stop”. The resulting DataFrame cleaned_df has the scrape_time as index and the total amount saved at that time (value_KG). Because we want to visualise the saved food for each weekday, we needed to process the data further. A quick inspection of cleaned_df revealed that foodsharing.de probably refreshes the dataset just once a day, so we kept only one value_KG per day and dropped the rest. After that we calculated the difference between consecutive days to obtain the amount saved per day as food_kg_diff. Here we noticed a bug that we like to call the -1 bug. On a regular Sunday the shops are closed and we would expect to see 0 KG saved; instead we always seem to get -1 KG. This seems to be related to how foodsharing.de handles the dataset. It is possible that this also happens on every other day (unnoticed). We corrected for the bug by subtracting -1 (that is, adding 1 KG) for each weekday. Besides adding food_kg_diff_corrected, we also added weekday_name for each weekday to our dataframe.
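The cleaning steps described above can be sketched roughly like this. It is a dependency-free Python illustration, not our pandas code. Keeping the first reading of each day and attributing each difference to the later of the two days are assumptions, and the sample totals are invented.

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def daily_differences(readings, correction_kg=1):
    """Turn (scrape_time, value_KG) totals into per-day differences.

    correction_kg=1 mirrors the "-1 bug" fix: subtracting -1 adds 1 KG per day.
    """
    # Keep only the first reading of each calendar day (assumption).
    per_day = {}
    for scrape_time, value_kg in sorted(readings):
        per_day.setdefault(scrape_time.date(), value_kg)

    days = sorted(per_day)
    rows = []
    for previous, current in zip(days, days[1:]):
        diff = per_day[current] - per_day[previous]
        rows.append({
            "date": current,
            "weekday_name": WEEKDAYS[current.weekday()],
            "food_kg_diff": diff,
            "food_kg_diff_corrected": diff + correction_kg,
        })
    return rows


# Hypothetical totals around a Sunday that shows the -1 bug.
readings = [
    (datetime(2022, 2, 12, 6), 1000),   # Saturday
    (datetime(2022, 2, 12, 12), 1000),  # same day, dropped
    (datetime(2022, 2, 13, 6), 999),    # Sunday: total fell by 1
    (datetime(2022, 2, 14, 6), 1149),   # Monday
]
for row in daily_differences(readings):
    print(row)
```

With these invented numbers the Sunday row has a raw difference of -1 and a corrected value of 0, which is the behaviour the correction is meant to produce.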
A boxplot and histogram of the current df:
The cleaned_df_mean contains the mean amount saved per day (food_kg_diff_corrected) for each weekday. This df therefore contains only 7 rows.
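As a rough illustration of this aggregation, the following plain-Python sketch groups daily values by weekday name and averages them. The function name weekday_means and the sample values are assumptions. In pandas the same idea is usually expressed with a groupby on the weekday column.

```python
from collections import defaultdict
from statistics import mean

WEEKDAY_ORDER = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def weekday_means(rows):
    """Average (weekday_name, food_kg_diff_corrected) pairs per weekday, Monday first."""
    grouped = defaultdict(list)
    for weekday_name, kg in rows:
        grouped[weekday_name].append(kg)
    return {name: mean(grouped[name]) for name in WEEKDAY_ORDER if name in grouped}


# Hypothetical daily values; real data would cover all seven weekdays.
sample = [("Monday", 100), ("Sunday", 0), ("Monday", 200), ("Sunday", 0)]
means = weekday_means(sample)
print(means)
```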
Furthermore, this freshly extracted information is written to a new InfluxDB bucket called foodsharing_python. We push the _value for each weekday as a field into the measurement forecast_week_mean.
Visualisation of the data through the InfluxDB GUI:
Our last step was visualizing the generated data.
To visualise the data from cleaned_df_mean we chose a different approach from creating a webpage. An ESP8266 is connected to an ST7735 1.8" LCD screen with a resolution of 160x128 pixels. The microcontroller is programmed in C++ using Visual Studio Code with PlatformIO.
The prototype breadboard with the LCD screen (red arrow) and the microcontroller (blue arrow):
Similar to before, the data is imported using the InfluxDB API with dedicated libraries for the Arduino framework. To simulate a live ticker, we use an interpolating map function which calculates the amount of saved food every 10 seconds and displays it accordingly. Foodsharing times are from 08:00 to 22:00. After that, or on Sundays, a message is displayed that the shops are closed and the weekly sum of saved food is shown instead. A progress bar indicates the passing time during the day.
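The idea behind the live ticker can be sketched as a linear interpolation between opening and closing time. The version below is in Python rather than the C++ running on the ESP8266, and the function names, the 140 KG daily mean and the behaviour outside opening hours (0 before opening, the full daily mean after closing) are assumptions for illustration.

```python
from datetime import time


def map_range(x, in_min, in_max, out_min, out_max):
    """Linearly map x from [in_min, in_max] to [out_min, out_max]."""
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min


def seconds_of_day(t: time) -> int:
    return t.hour * 3600 + t.minute * 60 + t.second


def saved_so_far(now: time, daily_mean_kg: float,
                 open_at: time = time(8, 0), close_at: time = time(22, 0)) -> float:
    """Estimate the KG saved up to `now`, assuming food is saved at a constant rate."""
    current = seconds_of_day(now)
    start, end = seconds_of_day(open_at), seconds_of_day(close_at)
    if current <= start:
        return 0.0
    if current >= end:
        return float(daily_mean_kg)
    return map_range(current, start, end, 0.0, daily_mean_kg)


# Halfway through the foodsharing day, with a hypothetical weekday mean of 140 KG.
print(saved_so_far(time(15, 0), 140))
```

Calling such a function every 10 seconds with the current time gives a counter that rises steadily through the day and reaches the weekday mean at closing time.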
Visualisation in action — progress bar, time, weekday and amount saved in KG at this minute:
Possible future features
The list of possible future features is only limited by time, creativity and meaningfulness of the feature.
Data
- Improve the accuracy of the mean forecast by solving the block problem (via pandas).
- Solve the -1 bug and/or implement a direct API for the data from foodsharing.de (source).
Visualization
- Simple web page with an interactive counter and information (public relations) regarding foodsharing.de.
- Fit the microcontroller and display from the prototype breadboard into a 3D-printed enclosure.
Our GitLab Repository.
The team
Hendrik Brügging Data Science
Max Lülff Data Science
Natalie Jácobo Goebbels Data Science
Samuel Schlesinger Data Science
Mentor
Nils Schlüter