“Miet’se Katzen” — Making the housing market more transparent.

This project was carried out as part of the TechLabs “Digital Shaper Program” in Münster (winter term 2020/21).

Abstract

With our project, we aim to make the housing market more transparent. Our tool gives prospective tenants a first impression of what a reasonable cold rent (the base rent excluding utilities) for a particular flat looks like. For this purpose, we trained a linear regression model on housing data from the German real estate platform ImmobilienScout24 to predict the cold rent of existing apartments from many more features than just the living space. In this way, our forecast supports tenants in deciding whether or not to rent a flat by reducing the information asymmetry between homeowner and tenant.

Introduction

With this project, we (the Miet’se Katzen) are targeting people who are looking for an apartment and want to compare a potential apartment to the market. It is easy to just look at the average price per square meter, but a new or renovated apartment with, e.g., two balconies, a new kitchen and huge rooms is most likely more expensive per square meter than an apartment without these advantages. So that tenants do not have to rely on the simple average rent per square meter, we wanted to create a prediction that includes soft factors such as the condition of an apartment. All in all, the goal is to make the real estate market more transparent.

Preprocessing our data

Before we could use the large dataset we got from Kaggle, we needed to clean it. We quickly realized that this task would take more time than we had originally thought. The dataset was so large that it even took us some time to figure out how to push it to GitHub so that everyone in our group could use it.
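To give an idea of what such a cleaning step can look like, here is a minimal sketch with pandas. It is not our actual pipeline: the column names (base_rent, living_space, condition), the plausibility bounds and the tiny example table are all assumptions made for illustration.

```python
import pandas as pd


def clean_listings(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a raw listings table (hypothetical schema)."""
    # Remove listings that appear more than once.
    cleaned = df.drop_duplicates().copy()
    # Rows without a rent or a living space cannot be used for training.
    cleaned = cleaned.dropna(subset=["base_rent", "living_space"])
    # Drop obviously implausible entries; the bounds are illustrative guesses.
    plausible = (
        cleaned["base_rent"].between(100, 10_000)
        & cleaned["living_space"].between(10, 500)
    )
    cleaned = cleaned[plausible].copy()
    # Keep listings with a missing condition, but mark them explicitly.
    cleaned["condition"] = cleaned["condition"].fillna("unknown")
    return cleaned.reset_index(drop=True)


# Tiny made-up table standing in for the real data.
raw = pd.DataFrame(
    {
        "base_rent": [650.0, 650.0, None, 99999.0, 480.0],
        "living_space": [62.0, 62.0, 45.0, 80.0, 38.5],
        "condition": ["refurbished", "refurbished", "good", "new", None],
    }
)
clean = clean_listings(raw)
print(clean)
```

In this made-up table, the duplicate listing, the listing without a rent and the listing with an implausibly high rent are dropped, so two rows remain.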

Building the model

At the beginning of our model building phase, we tried different models to find out which one would let us draw the best conclusions about the rent level of the apartments in our data. During our track on edyoucated and DataCamp we came across the following models: linear regression, random forest and AdaBoost. We tried and tested all three, one after the other, to decide which one best fits our requirements. All three models gave similar results. However, linear regression took less time to compute, a difference that matters with the amount of data we analysed. So we decided to use linear regression for our purposes, but kept the random forest model for a feature importance analysis.
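The comparison described above can be sketched with scikit-learn roughly as follows. This is an illustration under assumptions, not our original code: the data is synthetic, the feature names (living_space, rooms, has_balcony) are hypothetical, and the hyperparameters are simply set for a quick run.

```python
import time

import numpy as np
import pandas as pd
from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the feature names are hypothetical.
rng = np.random.default_rng(0)
n = 2000
X = pd.DataFrame(
    {
        "living_space": rng.uniform(20, 120, n),
        "rooms": rng.integers(1, 6, n),
        "has_balcony": rng.integers(0, 2, n),
    }
)
# Made-up rent formula plus noise, only so that the models have something to learn.
y = 10 * X["living_space"] + 50 * X["has_balcony"] + rng.normal(0, 50, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "ada_boost": AdaBoostRegressor(random_state=0),
}

# Fit each model, recording the training time and the R^2 score on held-out data.
results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start
    results[name] = {"r2": model.score(X_test, y_test), "fit_seconds": fit_seconds}

for name, metrics in results.items():
    print(f"{name}: R^2 = {metrics['r2']:.3f}, fit time = {metrics['fit_seconds']:.3f} s")

# Feature importance analysis using the fitted random forest.
importances = pd.Series(
    models["random_forest"].feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```

Timing results depend on the machine and on the size of the data, so the numbers printed here say nothing about the results we obtained on the real dataset.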

Results of the project

As described in the introduction, our goal was to develop a tool that forecasts the cold rent of an apartment as accurately as possible. Such a forecast gives a better sense of what rent is appropriate and fair for an apartment, for example when you are looking for a new flat.

The team

Felix Albert (Data Science: Python)
Max Heimsath (Data Science: Python)
Max Risau (Data Science: Python)
Kevin Woszczyna (Data Science: Python)
Kathrin Sandhaus (Data Science: Python)

Mentor

Maximilian Maiberger
