Non-Nonvaluable-Tool: How can the value of NFTs be measured and predicted using EDA of market data and social media interactions?
This project was carried out as part of the TechLabs “Digital Shaper Program” in Münster (winter term 2021/22).
Abstract
The goal of the project is to analyze the relevance, interaction and market value of NFT (non-fungible token) collections using various metrics. We asked whether social interactions correlate with the market development of selected Ethereum-based NFTs, and to what extent interactions around specific NFTs on social networks (in our case Twitter) are related to price development and demand. To this end, we measure and attempt to predict the value of different NFTs using exploratory data analysis (EDA) of market data and social media interactions. The project results in an analysis tool intended to identify potential buy and sell signals for NFTs, and a dashboard that shows the relevant metrics clearly and concisely, making it easier to compare different NFTs.
Introduction
The idea arose from the current hype surrounding NFTs (non-fungible tokens) and the talk of quick money. In our personal experience, predicting or identifying potentially successful projects is extremely hard. There is an abundance of data, some of which is abstract and difficult to measure and compare. Furthermore, there are various ways to uncover prospective new blue-chip NFTs, to detect a rug pull (scam), or simply to spot the next hype and price pump. The following factors are crucial in this market, which is also known as the “wild west” of the crypto world:
- Roadmap and value proposition of the project
- Familiarity and experience of the team
- Marketing
- Community
- Hype around the project, generated from the previous aspects
- Market and commercial activities
Our analysis aims to shed more light on the relationship between the last two points.
Methodology
In the initial phase, we looked for data sources relevant to our project. We examined various sources of information, including NFT groups, market databases, and social media. Because no existing, up-to-date dataset was available, we first had to locate suitable sources and then collect the data ourselves.
The interaction around NFTs is crucial in determining their relevance. There are numerous ways to measure and analyze this interaction. Due to limited resources, we focused exclusively on Twitter because, besides Discord, it is the most important network in this field. We used the basic Twitter API, which gave us access to tweets from the previous seven days. From the scraped data, we analyzed the following KPIs:
- Tweet timeline (intensity)
- Number of likes & retweets
- Number of followers
- Number of (unique) users
- Interactions around the keyword (word cloud from tweet text)
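As a rough illustration of how these KPIs could be derived from scraped tweets, here is a minimal sketch using only the Python standard library. The record layout (`created_at`, `user`, `followers`, `likes`, `retweets`, `text`) and the sample values are assumptions made for this example, not the schema or data used in the project.

```python
from collections import Counter
from datetime import datetime

# Hypothetical tweet records; field names and values are illustrative only.
tweets = [
    {"created_at": "2022-02-01T09:15:00", "user": "alice", "followers": 120,
     "likes": 4, "retweets": 1, "text": "cryptopunks floor rising"},
    {"created_at": "2022-02-01T18:40:00", "user": "bob", "followers": 30,
     "likes": 0, "retweets": 0, "text": "selling cryptopunks today"},
    {"created_at": "2022-02-02T07:05:00", "user": "alice", "followers": 120,
     "likes": 10, "retweets": 3, "text": "cryptopunks hype continues"},
]


def twitter_kpis(tweets):
    """Aggregate the Twitter KPIs listed above from a list of tweet dicts."""
    # Tweet timeline (intensity): number of tweets per calendar day.
    per_day = Counter(
        datetime.fromisoformat(t["created_at"]).date().isoformat() for t in tweets
    )
    # Count each user's followers once, however often the user tweeted.
    followers_by_user = {t["user"]: t["followers"] for t in tweets}
    # Word frequencies as raw input for a word cloud (very short words skipped).
    words = Counter(
        w for t in tweets for w in t["text"].lower().split() if len(w) > 3
    )
    return {
        "tweets_per_day": dict(sorted(per_day.items())),
        "likes": sum(t["likes"] for t in tweets),
        "retweets": sum(t["retweets"] for t in tweets),
        "unique_users": len(followers_by_user),
        "follower_reach": sum(followers_by_user.values()),
        "top_words": words.most_common(5),
    }


kpis = twitter_kpis(tweets)
print(kpis)
```

Summing followers per unique user rather than per tweet avoids counting the audience of a very active account several times.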
To compare the interaction around the NFTs with their market values, we retrieved NFT data from the Ethereum blockchain using the Moralis API. We settled on three randomly selected NFT collections in the process. From this data, we analyzed the following KPIs:
- Number of transfers
- Number of sales
- Total value of transactions
- Number of unique buyers
- Number of unique sellers
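The market KPIs above can be sketched in the same way. The example below assumes each transfer is a dict with `from`, `to` and `value_eth` fields and treats any transfer with a positive value as a sale. Both the field names and that rule are assumptions for illustration; they are not the Moralis response format or the project's actual definition of a sale.

```python
# Hypothetical transfer records for one collection; values are illustrative only.
transfers = [
    {"token_id": "1", "from": "0xA", "to": "0xB", "value_eth": 1.5},
    {"token_id": "2", "from": "0xA", "to": "0xC", "value_eth": 0.0},  # plain transfer
    {"token_id": "1", "from": "0xB", "to": "0xC", "value_eth": 2.0},
]


def market_kpis(transfers):
    """Aggregate the market KPIs listed above from a list of transfer dicts."""
    # Assumption: a transfer that moved ETH counts as a sale.
    sales = [t for t in transfers if t["value_eth"] > 0]
    return {
        "transfers": len(transfers),
        "sales": len(sales),
        "total_value_eth": sum(t["value_eth"] for t in sales),
        "unique_buyers": len({t["to"] for t in sales}),
        "unique_sellers": len({t["from"] for t in sales}),
    }


market = market_kpis(transfers)
print(market)
```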
After identifying the two data sources, we collected the relevant data from them. Because of Twitter’s constraints, we were restricted to a seven-day window for scraping tweets. Nonetheless, the seven-day data set provides a starting point for future research. Furthermore, due to the large number of NFT collections available, we confined our test version to five collections; the analyses can be repeated for more collections in the future. The data for these collections was scraped, saved to CSV files, and then roughly cleaned and preprocessed for the subsequent analysis.
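A rough cleaning pass over such a CSV file might look like the sketch below. The column names (`id`, `text`, `likes`) and the cleaning rules (drop rows without an id or text, drop duplicate ids, normalize whitespace, treat a missing like count as zero) are assumptions for this example rather than the project's actual preprocessing steps.

```python
import csv
import io

# Hypothetical raw export with a duplicate row, an empty text and a missing count.
raw_csv = """id,text,likes
1,  mfers   to the moon ,3
1,  mfers   to the moon ,3
2,,5
3,veefriends drop,
"""


def clean_rows(rows):
    """Drop incomplete and duplicate rows and normalize the remaining fields."""
    seen = set()
    cleaned = []
    for row in rows:
        tweet_id = (row.get("id") or "").strip()
        text = (row.get("text") or "").strip()
        if not tweet_id or not text or tweet_id in seen:
            continue
        seen.add(tweet_id)
        cleaned.append({
            "id": tweet_id,
            "text": " ".join(text.split()),    # collapse repeated whitespace
            "likes": int(row.get("likes") or 0),  # missing count becomes 0
        })
    return cleaned


rows = list(csv.DictReader(io.StringIO(raw_csv)))
cleaned = clean_rows(rows)

# Write the cleaned rows back out as CSV (to a string here; a file works the same way).
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["id", "text", "likes"])
writer.writeheader()
writer.writerows(cleaned)
print(out.getvalue())
```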
After this, we built an analysis framework for evaluating NFTs. Having identified and defined the relevant KPIs, we moved on to data analysis and visualization, mainly using the matplotlib package to create the graphs.
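A plot of tweet intensity per collection could be produced along the following lines. This is only a sketch: the daily counts are placeholder numbers invented for the example, and the function name `plot_tweet_intensity` is hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt

# Placeholder daily tweet counts for a seven-day window (not real project data).
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
series = {
    "cryptopunks": [120, 340, 560, 410, 300, 220, 180],
    "mfers": [80, 90, 150, 210, 190, 160, 140],
    "veefriends": [40, 55, 60, 52, 70, 65, 58],
}


def plot_tweet_intensity(days, series):
    """Draw one line per collection showing tweets per day."""
    fig, ax = plt.subplots(figsize=(8, 4))
    for name, counts in series.items():
        ax.plot(days, counts, marker="o", label=name)
    ax.set_xlabel("Day")
    ax.set_ylabel("Tweets per day")
    ax.set_title("Tweet intensity per collection")
    ax.legend()
    fig.tight_layout()
    return fig


fig = plot_tweet_intensity(days, series)
```

Calling `fig.savefig("tweet_intensity.png")` afterwards would write the chart to a file that can be placed in a dashboard.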
Results
The next step after completing the analyses would be to integrate the visualizations into a dashboard. Due to time constraints we could not automate this, so we assembled the dashboards manually for now. For each NFT collection, there is a Twitter analytics dashboard that shows KPIs such as the intensity of tweets around the keyword and a word cloud summarizing the content of the Twitter interactions. It also provides statistics on the number of retweets and likes. The second part of the analysis covers the NFT trading data from the Ethereum blockchain, likewise summarized in a dashboard with the absolute KPIs and visualizations of them over time. Comparing the Twitter KPIs with the blockchain data, we observed a correlation between Twitter activity and actual transactions of NFT collections. It is possible that market behavior is influenced both by targeted sales/transactions and by targeted Twitter marketing. Either component can influence the other, so the two can be directly correlated without one clearly driving the other.
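One way to put a number on such a relationship is a Pearson correlation between the daily tweet counts and the daily transfer counts, optionally shifted by a few days to check whether tweets lead transfers. The sketch below is not the method used in the project, and the two series are invented for the example. With only seven data points, any coefficient should be read as a hint rather than as evidence.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_pearson(tweets, transfers, lag=0):
    """Correlate tweets on day t with transfers on day t + lag (lag >= 0)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    return pearson(tweets[: len(tweets) - lag], transfers[lag:])


# Invented daily counts for one collection over seven days.
tweets_per_day = [120, 340, 560, 410, 300, 220, 180]
transfers_per_day = [20, 12, 34, 56, 41, 30, 22]

same_day = lagged_pearson(tweets_per_day, transfers_per_day, lag=0)
next_day = lagged_pearson(tweets_per_day, transfers_per_day, lag=1)
print(f"same day: {same_day:.2f}, transfers one day later: {next_day:.2f}")
```

In this made-up series the transfers follow the tweets by one day, so the lagged coefficient is higher than the same-day one. A higher lagged value would be consistent with tweets preceding transactions, but it does not by itself establish cause and effect.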
The following images show the word cloud from the “cryptopunks” tweets, which gives a brief insight into general opinion and possible developments, including price predictions; the number of transfers over time for “mfers”, “veefriends” and “cryptopunks”; and, for comparison, the tweet intensity over time for the same collections. One possible interpretation is that, for “cryptopunks”, a rise in the number of tweets first generated increased demand, and that targeted transfers and sales were later used to try to reverse the downward trend in attention.
These visualizations can now be used to compare the different collections. As a further improvement, we would like to connect the program to the platforms’ APIs so that live data can be continuously scraped and stored in a database, keeping the analysis up to date. The analysis concept can also be improved by adding further KPIs. With our analysis tool, one can compare and better assess NFTs and derive tentative buy and sell signals. In the long run, a goal could be to integrate more data sources and other technologies, collected over a longer period and in real time via the platforms’ interfaces. It would also be conceivable to develop an AI that automatically evaluates this data, assesses the current situation based on past developments, and issues recommendations.
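The idea of continuously storing scraped data could be prototyped with SQLite from the Python standard library, as in the sketch below. The table layout and the function names `open_store` and `store_tweets` are hypothetical; the point is only that a primary key on the tweet id lets repeated polls overlap without creating duplicate rows.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small SQLite store for scraped tweets."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tweets (
               id TEXT PRIMARY KEY,
               collection TEXT NOT NULL,
               created_at TEXT NOT NULL,
               likes INTEGER NOT NULL
           )"""
    )
    return conn


def store_tweets(conn, rows):
    """Insert a batch of tweets; rows whose id is already stored are skipped."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO tweets VALUES (:id, :collection, :created_at, :likes)",
            rows,
        )


conn = open_store()
batch = [
    {"id": "1", "collection": "mfers", "created_at": "2022-02-01T09:15:00", "likes": 3},
    {"id": "2", "collection": "mfers", "created_at": "2022-02-01T10:00:00", "likes": 0},
]
store_tweets(conn, batch)
store_tweets(conn, batch)  # a second poll returning the same tweets adds nothing

count = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
print(count)
```

Passing a file path instead of `":memory:"` would keep the data between runs.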
The team
Colin Borremans Data Science Track (LinkedIn)
Marie Cermann Data Science Track (LinkedIn)
Fabienne Gadorosi Data Science Track (LinkedIn)
Sven Kreciszewski Data Science Track (LinkedIn)
Roles inside the team
Colin Borremans
- Conceptual design, Data Preparation, Definition & Identification of KPIs, NFT Market Analysis and Data Visualization (NFT Market)
Marie Cermann
- Twitter Scraping, Data Preparation, Tweets Analysis, Definition & Identification of KPIs and Data Visualization (Tweets)
Fabienne Gadorosi
- Data Preparation, Tweets Analysis, Definition & Identification of KPIs and Data Visualization (Tweets)
Sven Kreciszewski
- Conception of a possible dashboard, Data cleaning, Presentation construction, Definition & elaboration of KPIs and NFT Market Analysis
Mentor
Marcus Cramer