Moving Münster: Analysis of mobility sensor data (pedestrian, bicycle and car traffic) in relation to Covid and weather data
This project was carried out as part of the TechLabs “Digital Shaper Program” in Münster (summer term 2021).
How did the Covid pandemic affect mobility behavior in Münster? To answer this question, we analyzed sensor data on pedestrian, bicycle, and car traffic as dependent variables explained by pandemic data. We also included weather as a control variable, since its influence on mobility is intuitive. After cleaning the mobility datasets with pandas and NumPy, we used correlation functions to analyze the relationship between the mobility sensor data, the Covid data, and the weather data. Finally, using seaborn heatmaps, we visualized how strongly, for example, the Covid incidence was associated with the different modes of transport in Münster.
The city of Münster is a vibrant place known for its very own hierarchy of means of transportation. Instead of the car, the form of transportation that dominates the cityscape is the bicycle, known locally as the “Leeze”. However, during the current Covid-19 pandemic, the intensity and composition of traffic in Münster might have changed. Individual concern about contracting the virus and, in particular, the various lockdowns and restrictions on public life might have altered the way people move within Münster. The goal of our project “Moving Münster” was to find out whether and to what extent the pandemic, or more specifically the Covid incidence and the lockdown intensity, influenced everyday traffic in Münster.
Over the last couple of years, the local government of Münster has installed a multitude of sensors throughout the city. They count the pedestrians in the pedestrian zone, the bicycles on the cycle paths, and the cars on the streets of Münster. These sensor data are publicly accessible and can be considered a public service. Our mission was to extend this public service by running the analyses described above and thus offering more information than the isolated data alone.
Apart from the changing severity of the pandemic, there are further variables which influence mobility within a city, first and foremost the weather. Therefore, we also integrated a number of weather parameters into our analyses to make sure the effects we observe are indeed related to the pandemic and not to good or bad weather. In other words, we used weather with its different attributes as a control variable.
Our project ends with some interesting conclusions based on the existing data we examined. As a potential next step, we could make forecasts and predict how mobility in Münster will change if a certain lockdown measure is introduced or if a certain incidence is reached. However, considering the vaccination progress and the changed way politicians and virologists react to certain incidences, these forecasts would probably be rather vague, which is why we refrained from forecasting. Nevertheless, our results offer real added value to the society of Münster and to people interested in the effects of the pandemic.
The overall project followed the standard phases of a data science process model. In the first phase, “Data Acquisition”, we gathered the necessary sensor data from different sources and checked it for completeness and quality. One dataset (corona) was built from scratch by the project team. In the second phase, “Data Processing”, we read in the data and cleaned it to obtain high-quality datasets. In the third phase, “Data Integration”, we joined the different datasets so that we could carry out joint analyses and visualizations. In the fourth phase, “Analytical Modeling”, we analyzed the correlations between the different variables of our datasets. In the fifth and last phase, “Presentation”, we visualized our analyses in correlation heatmaps and presented some of the results in a TechLabs online workshop.
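The “Data Processing” phase can be illustrated with a minimal sketch. The raw file layout, the column names (timestamp, count) and the cleaning rules below are assumptions made for illustration; the actual sensor exports and the rules the team applied may differ.

```python
import io

import pandas as pd

# Hypothetical raw export of one counting station. Column names and values
# are invented for illustration and do not come from the real sensor data.
RAW_CSV = """timestamp,count
2021-03-01 08:00,120
2021-03-01 09:00,135
2021-03-01 09:00,135
2021-03-01 10:00,
2021-03-02 08:00,98
2021-03-02 09:00,-5
"""


def clean_counts(raw: pd.DataFrame) -> pd.Series:
    """Return daily totals from hourly sensor counts."""
    df = raw.copy()
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df = df.drop_duplicates(subset="timestamp")  # drop repeated readings
    df = df.dropna(subset=["count"])             # drop missing readings
    df = df[df["count"] >= 0]                    # drop impossible negative counts
    hourly = df.set_index("timestamp")["count"]
    return hourly.resample("D").sum()            # aggregate to one value per day


daily = clean_counts(pd.read_csv(io.StringIO(RAW_CSV)))
print(daily)
```

Aggregating to daily values is one possible choice here; it makes the traffic counts comparable with daily incidence and weather figures.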
All team members participated in the Python Data Science track, so the main programming language was Python. We used several Python packages to process our datasets and carry out our analyses: NumPy for mathematical operations, and pandas for reading and saving our datasets as well as for general data handling in the notebooks. The first visualizations were created with Matplotlib. The more advanced analyses and visualizations were developed with seaborn, in particular its heatmap function.
As our development environment we used Jupyter Notebook, which provides an intuitive and easy-to-use interface for data analyses. To manage the code and the different notebooks we used the online platform GitHub. Every team member cloned the repository locally and worked independently on their own computer. Once a small task was finished, the team member committed and pushed the changes to the online repository. Other team members were then able to pull these changes into their local environment and proceed with their tasks. The initial project management was done in Notion with templates from the TechLabs management team. For all other communication, coordination and the regular online team meetings we used our Slack team channel in the TechLabs community Slack workspace.
Each of the project phases described above produced an interim result. The result of the “Data Acquisition” phase was a set of identified and available data sources for pedestrian, bicycle, and car traffic as well as corona and weather data for Münster over the past few years. The data was initially stored in many different CSV files. The interim result of the “Data Processing” phase was five cleaned, high-quality datasets, each stored in its own CSV file: pedestrians, bicycles, cars, weather and corona. In the “Data Integration” phase, we merged all single datasets into one multi-index dataset and, in addition, merged the weather and corona data with each traffic dataset to obtain the following datasets: pedestrians_weather_corona, bicycles_weather_corona, cars_weather_corona. In the remaining project phases, we analyzed these merged, high-quality datasets and obtained the following insights:
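As a rough sketch of the “Data Integration” step, the merges could look as follows in pandas. The join key (date), all column names and all values are hypothetical. The two-level column index is only one possible reading of the multi-index dataset mentioned above.

```python
import pandas as pd

dates = pd.date_range("2021-03-01", periods=4, freq="D")

# Hypothetical cleaned daily datasets; all names and numbers are made up.
bicycles = pd.DataFrame({"date": dates, "bicycle_count": [5200, 4800, 5100, 3900]})
weather = pd.DataFrame({
    "date": dates,
    "temperature_c": [8.5, 6.0, 9.1, 4.2],
    "rain_mm": [0.0, 2.3, 0.0, 7.8],
})
# The corona frame deliberately misses the last day to show the join behavior.
corona = pd.DataFrame({"date": dates[:3], "incidence": [55.0, 58.2, 61.7]})

# Left joins keep every traffic day, even if a covariate is missing (NaN).
bicycles_weather_corona = (
    bicycles.merge(weather, on="date", how="left")
            .merge(corona, on="date", how="left")
)

# One combined frame with a two-level column index: (dataset, variable).
combined = pd.concat(
    {
        "bicycles": bicycles.set_index("date"),
        "weather": weather.set_index("date"),
        "corona": corona.set_index("date"),
    },
    axis=1,
)

print(bicycles_weather_corona)
print(combined.columns.tolist())
```

The same pattern would apply to the pedestrian and car datasets. A left join is an assumption here; an inner join would instead drop days without complete covariates.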
We analyzed the influence of corona on the volume of passenger cars with a correlation heatmap and obtained the following findings in particular: the stronger the lockdown, the more free parking spaces there are in the Bremer Platz parking garage (train station). Likewise, the stronger the lockdown, the more free parking spaces there are in the Cineplex parking garage. However, it is still unclear whether there is causality behind these correlations.
In addition, we analyzed the influence of weather on the volume of passenger cars with a correlation heatmap. We found that the higher the temperature, the more free parking spaces there are at Schlossplatz. Likewise, the higher the temperature, the more free parking spaces there are at the Cineplex. Again, it is unclear whether there is causality behind these correlations.
Finally, we examined the correlations in our overall merged dataset. In our correlation heatmap, the Covid incidence is positively correlated with the number of free parking spots. Overall, the correlations suggest that with a higher incidence of corona people cycled and drove less, but walked more.
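A correlation heatmap of this kind can be sketched as follows. The data below is synthetic and only stands in for the merged dataset; the column names, the choice of Pearson correlation and the plot settings are assumptions, not the project's actual configuration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
n_days = 200

# Synthetic stand-in data: free parking spaces are constructed to rise with
# the incidence, so the example produces a visible positive correlation.
incidence = rng.uniform(0, 200, n_days)
temperature = rng.uniform(-5, 30, n_days)
merged = pd.DataFrame({
    "incidence": incidence,
    "temperature_c": temperature,
    "free_parking_spaces": 100 + 0.5 * incidence + rng.normal(0, 10, n_days),
})

# Pairwise Pearson correlation between all numeric columns.
corr = merged.corr(method="pearson")
print(corr.round(2))


def plot_heatmap(corr_matrix: pd.DataFrame):
    """Draw an annotated correlation heatmap and return the Matplotlib axes."""
    # Imported here so the correlation step also runs without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(5, 4))
    sns.heatmap(corr_matrix, annot=True, fmt=".2f", cmap="coolwarm",
                vmin=-1, vmax=1, center=0, ax=ax)
    ax.set_title("Correlation matrix (synthetic data)")
    fig.tight_layout()
    return ax
```

Calling plot_heatmap(corr) in a notebook renders the figure. Fixing the color scale to the range from -1 to 1 keeps heatmaps of different datasets visually comparable.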
Our data analysis gives a first impression of how weather and corona influence the movement of Münster’s citizens. However, the correlations we found do not necessarily reflect a causal connection. Therefore, our datasets should be examined more deeply with further statistical tests in the future.
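One possible follow-up test, which was not part of the project, is a partial correlation that removes the linear effect of a control variable such as temperature before correlating incidence and traffic. The sketch below uses synthetic data in which both variables depend only on temperature; the helper partial_corr and all variable names are hypothetical.

```python
import numpy as np


def partial_corr(x, y, controls):
    """Correlate x and y after linearly regressing both on `controls`."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(controls, dtype=float).reshape(len(x), -1)
    design = np.column_stack([np.ones(len(x)), z])  # intercept + controls
    # Residuals: the part of x and y that the controls cannot explain.
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


rng = np.random.default_rng(seed=1)
n_days = 500

# Synthetic data: incidence and bicycle counts are both driven by temperature
# and have no direct link to each other.
temperature = rng.normal(12, 8, n_days)
incidence = -2.0 * temperature + rng.normal(0, 5, n_days)
bicycle_count = 100.0 * temperature + rng.normal(0, 200, n_days)

raw = float(np.corrcoef(incidence, bicycle_count)[0, 1])
adjusted = partial_corr(incidence, bicycle_count, temperature)
print(f"raw correlation: {raw:.2f}, controlling for temperature: {adjusted:.2f}")
```

In this constructed case the raw correlation is strongly negative, while the partial correlation is close to zero. This shows how a shared weather effect can produce a correlation without any direct link. Daily time series are usually autocorrelated, so any significance test built on top of this would need to take that into account.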