We developed a machine learning model that predicts startup success, designed for assessing startups that are seeking investment. The model helps our client make informed funding decisions. As input, we received a list of deals concluded between companies (or their CEOs) and investors (or investment funds) from 2006 onward. We also had access to a list of companies that went public through IPOs over the last 20 years.
Our method builds a dynamic graph in which companies and investors are nodes and the contracts (deals) between them are edges. On this graph, we trained vector representations of companies (known as embeddings) from their historical data: the emergence of new edges (new deals) and each company's ‘neighbors’ in the graph.
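To make the graph structure concrete, here is a minimal sketch of a deal graph that can be queried as of a given date. It is an illustration under assumptions, not the implementation used in the project: the names `Deal`, `DealGraph`, and `neighbors_as_of`, as well as the sample companies and funds, are hypothetical, and the real model may store and update the graph differently.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Deal:
    """One timestamped edge between a company and an investor (hypothetical schema)."""
    company: str
    investor: str
    closed_on: date


class DealGraph:
    """Dynamic graph stored as a list of timestamped edges."""

    def __init__(self):
        self._deals = []

    def add_deal(self, deal):
        self._deals.append(deal)

    def neighbors_as_of(self, cutoff):
        """Return the adjacency sets built only from deals closed on or before `cutoff`."""
        adjacency = defaultdict(set)
        for deal in self._deals:
            if deal.closed_on <= cutoff:
                adjacency[deal.company].add(deal.investor)
                adjacency[deal.investor].add(deal.company)
        return dict(adjacency)


# Made-up sample data, for illustration only.
graph = DealGraph()
graph.add_deal(Deal("acme", "fund_a", date(2008, 5, 1)))
graph.add_deal(Deal("acme", "fund_b", date(2012, 3, 1)))
graph.add_deal(Deal("globex", "fund_a", date(2009, 7, 15)))

snapshot_2010 = graph.neighbors_as_of(date(2010, 1, 1))
print(snapshot_2010)
```

Querying the graph by cutoff date keeps later deals out of the picture, so a model trained on a snapshot only sees the history that was available at that time.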
The embedding of a graph vertex (i.e., a company) is a vector of a specified dimension that is updated (trained) by solving auxiliary tasks, such as identifying the vertex type (company or non-company) and predicting new edges (deals). The underlying principle is that companies that behave similarly and interact with similar subsets of neighbors end up with similar vector representations. The model relies on these representations to predict startup success.
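The edge-prediction task can be illustrated with a small sketch. This is a simplified, static stand-in rather than the model described here: it ignores time and the vertex-type task, scores a company-investor pair by the dot product of their vectors, and trains with logistic loss against one randomly sampled non-partner investor per deal. The function `train_embeddings`, the hyperparameters, and the toy deals are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))


def train_embeddings(deals, dim=4, epochs=300, lr=0.1, seed=0):
    """Learn node vectors by predicting which (company, investor) edges exist."""
    rng = random.Random(seed)
    companies = sorted({c for c, _ in deals})
    investors = sorted({i for _, i in deals})
    emb = {n: [rng.uniform(-0.1, 0.1) for _ in range(dim)]
           for n in companies + investors}
    partners = {c: {i for cc, i in deals if cc == c} for c in companies}
    losses = []
    for _ in range(epochs):
        total = 0.0
        for company, investor in deals:
            # One observed edge (label 1) plus one sampled non-edge (label 0).
            pairs = [(investor, 1.0)]
            candidates = [i for i in investors if i not in partners[company]]
            if candidates:
                pairs.append((rng.choice(candidates), 0.0))
            for other, label in pairs:
                u, v = emb[company], emb[other]
                p = sigmoid(dot(u, v))
                total -= math.log(max(p if label else 1.0 - p, 1e-12))
                g = lr * (label - p)  # negative gradient of the logistic loss
                emb[company] = [a + g * b for a, b in zip(u, v)]
                emb[other] = [b + g * a for a, b in zip(u, v)]
        losses.append(total)
    return emb, losses


# Toy data: companies A and B share investors X and Y; C and D share investor Z.
DEALS = [("A", "X"), ("A", "Y"), ("B", "X"), ("B", "Y"), ("C", "Z"), ("D", "Z")]

emb, losses = train_embeddings(DEALS)
print("cos(A, B) =", round(cosine(emb["A"], emb["B"]), 3))
print("cos(A, C) =", round(cosine(emb["A"], emb["C"]), 3))
```

In this toy setup, A and B are pushed toward the same investors and away from the same non-partner, so their vectors are expected to end up closer to each other than to C, which mirrors the principle stated above.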
Experiments were conducted on real datasets. The proposed model outperforms the most recent baselines, and its results are 1.94 times better than the performance of real investors. The best predictions were achieved for startups in IT and healthcare.