Cannabis Distribution — Finding the best city districts for distributing Cannabis
This project was carried out as part of the TechLabs “Digital Shaper Program” in Münster (winter term 2021/22).
With the establishment of the new German government in 2021, the governing parties announced their intention to introduce the controlled distribution of cannabis for recreational purposes in licensed stores. As this is an unprecedented and much-debated business sector in Germany, many questions arise. Our team first identified potential consumers based on research studies from Canada, the USA and the Netherlands, and then developed an algorithm that ranks city districts according to their share of potential customers. The results are visualized as heatmaps using GeoPandas, so that users can intuitively identify the most attractive parts of a city for future cannabis stores.
The legalization of cannabis is a controversial issue that has been at the center of many political debates over the last decades. Following the change of government after the 2021 federal elections, legalization made it into the coalition agreement. Whether it will actually happen remains an open question; some experts consider early 2024 the earliest realistic date.
A brief look at history: cannabis has been prohibited in Germany since 1929. Current generations will therefore find themselves in unfamiliar circumstances, which open up plenty of new research questions. To name a few: Who are the potential consumers? Where should cannabis be distributed? What are the societal consequences?
According to the governing parties, cannabis will be distributed in a controlled manner for recreational purposes in licensed stores. This led us to explore the first two of the above questions with the help of data and a vivid visualization. With our project we hope to kick off research on cannabis legalization in Germany and aim to provide initial insights into possible distribution hotspots.
At the beginning of our project, we quickly agreed to start our research in countries that have already legalized cannabis. Among others, we reviewed studies from the US, Canada and the Netherlands. Soon after, we were able to define a persona that embodies the typical cannabis customer. After some internal discussion, we decided to rely on data from the Netherlands because we consider the Dutch population the closest match to the German one.
Accordingly, we derived a logic that forecasts the number of potential customers for each district within a city. Specifically, we define potential customers as people expected to consume at least monthly (we apply data from survey respondents who reported having consumed cannabis at least once within the past month).
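To make the idea of such a forecast more concrete, here is a minimal sketch of how a district ranking could be computed. It is not our actual forecasting code: the age groups, the prevalence rates and the district populations are hypothetical placeholders, and the function names `forecast_customers` and `rank_districts` are invented for this illustration. The sketch assumes that the past-month prevalence per age group is multiplied by the number of residents in that group and that districts are then ranked by their share of potential customers.

```python
# Hypothetical past-month prevalence rates per age group.
# These are placeholders for illustration, not figures from any survey.
PAST_MONTH_PREVALENCE = {"18-24": 0.10, "25-34": 0.08, "35-64": 0.03}

# Hypothetical district populations per age group (placeholders).
DEMO_DISTRICTS = {
    "District A": {"18-24": 2000, "25-34": 3000, "35-64": 5000},
    "District B": {"18-24": 500, "25-34": 1000, "35-64": 6000},
}


def forecast_customers(population_by_age):
    """Estimate monthly customers as the sum of residents times prevalence."""
    return sum(
        population_by_age.get(group, 0) * rate
        for group, rate in PAST_MONTH_PREVALENCE.items()
    )


def rank_districts(districts):
    """Return (district, share of potential customers) pairs, highest first."""
    shares = []
    for name, population_by_age in districts.items():
        total = sum(population_by_age.values())
        share = forecast_customers(population_by_age) / total if total else 0.0
        shares.append((name, share))
    return sorted(shares, key=lambda item: item[1], reverse=True)


for name, share in rank_districts(DEMO_DISTRICTS):
    print(f"{name}: {share:.1%} potential monthly customers")
```

Ranking by share rather than by absolute numbers keeps large but demographically unremarkable districts from dominating the result; whether that is the better choice depends on the question a store operator wants to answer.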
After having derived our forecasting code, our next two major goals were to visualize the results in an intuitive way and to embed the code into a program following an input-output logic. We were able to reach these goals with the skills taught in the TechLabs Data Science track, which all our team members attended. The primary programming language we used was Python. Throughout the project we employed a variety of Python libraries to handle our datasets and conduct our analysis. NumPy was used for mathematical computations, and pandas to read and save our datasets as well as to handle the data in our notebooks. First visualizations were created using Matplotlib. Finally, we used Seaborn and GeoPandas to produce more complex heatmap visualizations.
Furthermore, we picked Jupyter Notebook as our development environment because it offers a simple and intuitive user interface for data analysis. We used GitHub to manage the code and the many datasets. The initial project management was done in Notion using TechLabs’ management team templates. For all additional communication, coordination and frequent online team meetings, we used our team channel in the TechLabs community Slack workspace as well as WhatsApp and Zoom.
For the visualization, our goal was to combine city maps with color gradients that indicate the potential of each district. We therefore needed shape data for every city we wanted to forecast. After successfully visualizing the forecast for Münster, we decided to include Cologne as an additional example. After some setbacks, we found a suitable shape file, although we grant that it is not as intuitive as the one for Münster, since there are multiple ways of dividing a city into districts.
For clarity, we used “Stadtteile” for Münster and “Statistische Quartiere” for Cologne. The respective shape files can be found in our GitHub repository and must be downloaded and referenced before running our program to see the heatmaps.
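The following sketch illustrates how a forecast can be joined to district shapes and drawn as a heatmap with GeoPandas. It is only an illustration under stated assumptions: the three rectangular "districts", their names, the column names `district` and `potential_customers`, and the forecast values are all hypothetical. With real data, the toy geometry would be replaced by a shape file loaded with `gpd.read_file(...)`.

```python
import geopandas as gpd
import matplotlib

# Non-interactive backend so the sketch also runs outside a notebook;
# this line can be dropped when working in Jupyter.
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import pandas as pd
from shapely.geometry import box

# Toy geometry standing in for a city's district shape file (hypothetical).
districts = gpd.GeoDataFrame(
    {"district": ["North", "Center", "South"]},
    geometry=[box(0, 2, 2, 3), box(0, 1, 2, 2), box(0, 0, 2, 1)],
)

# Hypothetical forecast values per district (placeholders, not our results).
forecast = pd.DataFrame(
    {
        "district": ["North", "Center", "South"],
        "potential_customers": [310, 590, 420],
    }
)

# Join on the district name; a left join keeps every shape on the map.
merged = districts.merge(forecast, on="district", how="left")

fig, ax = plt.subplots(figsize=(5, 5))
merged.plot(
    column="potential_customers",  # values that drive the color gradient
    cmap="Greens",
    legend=True,
    edgecolor="black",
    missing_kwds={"color": "lightgrey"},  # districts without a forecast
    ax=ax,
)
ax.set_axis_off()
ax.set_title("Potential monthly customers per district (toy data)")
fig.savefig("heatmap_sketch.png", dpi=100)
```

The join only works if the district names in the forecast match the names in the shape data exactly, so it can be worth checking the merged column for missing values before plotting.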
With the input-output logic, our main goal was to deliver a product that interested outsiders can use. The result is a program that runs in Jupyter Notebook and gives the user options to interact. In terms of functionality, the program calculates the forecasts correctly and includes logic to display the respective heatmap on request. One problem we could not fix in time is that the heatmaps shown as output remain unfilled (see the notes on GitHub for our assumptions about the cause). Nevertheless, we were able to create two separate programs that correctly display the heatmap of Münster or Cologne after prompting the user for input.
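A prompt-driven flow of this kind can be sketched as follows. This is a simplified illustration rather than our notebook code: the function name `run_forecast_program`, the district names and the forecast numbers are hypothetical. The `ask` and `show` parameters default to `input` and `print` and exist only so that the flow can also be run with scripted answers.

```python
# Hypothetical forecasts per city and district (placeholders, not our results).
DEMO_FORECASTS = {
    "Münster": {"District A": 590.0, "District B": 310.0},
    "Cologne": {"District C": 820.0, "District D": 450.0},
}


def run_forecast_program(forecasts, ask=input, show=print):
    """Ask for a city, list its districts by forecast, ask about the heatmap."""
    cities = ", ".join(sorted(forecasts))
    while True:
        city = ask(f"Which city do you want to forecast? ({cities}): ").strip()
        if city in forecasts:
            break
        show(f"Unknown city '{city}', please try again.")

    # Districts with the most potential customers come first.
    ranked = sorted(forecasts[city].items(), key=lambda item: item[1], reverse=True)
    for district, customers in ranked:
        show(f"{district}: {customers:.0f} potential monthly customers")

    wants_map = ask("Show the heatmap? (y/n): ").strip().lower() == "y"
    return city, wants_map


# Scripted answers stand in for a user typing into the notebook.
scripted_answers = iter(["Münster", "n"])
city, wants_map = run_forecast_program(
    DEMO_FORECASTS, ask=lambda prompt: next(scripted_answers)
)
```

In a notebook, calling `run_forecast_program(DEMO_FORECASTS)` without the `ask` argument would prompt the user interactively, and the returned flag could then decide whether the plotting step is executed.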
The next steps for our project would be to add further variables to our forecasts (some of which we have already gathered) and to work with cleaned data, for example to forecast consumption quantities in grams per frequency-of-use group or to include data on how heavily certain locations within a city are frequented. That way, the project could increase its impact and might develop into a helpful tool for setting up an efficient cannabis distribution network in Germany that maximizes tax income.
Julian Lagache, Data Science: Python
Felix Altmann, Data Science: Python
Lukas Dreiling, Data Science: Python
Jonas Meyer, Data Science: Python
Luis Eberhard, Data Science: Python
Roles inside the team
Julian and Luis were mainly responsible for data acquisition and data cleaning. Felix and Lukas were mainly responsible for implementing the input-output logic and the visualization within the Jupyter Notebook. Jonas was mainly responsible for writing the forecasting code. However, because of our rather similar backgrounds and prior knowledge, we helped each other out with all kinds of tasks.