Show HN: Credit reports about German companies

Hello,

In addition to my studies in computer science, I have been working on a side project. I obtain data from the Unternehmensregister, a register where every German limited company is required to publish its financial statements. These statements are published as HTML files and are completely unstructured. While financial statements often look similar, companies are not required to follow a specific structure, so the formatting is often inconsistent. Use of the Unternehmensregister is completely free, so you can check out some examples.

I wrote code that converts the unstructured financial statements into structured data using the ChatGPT API. Some problems have not yet been solved, but data extraction works well for the majority of companies. I then coded a Random Forest model to estimate a company's probability of default based on its financial statement from the Unternehmensregister. I built a website to present the structured data along with the scores. Essentially, I create credit reports for companies.

Currently, there are four companies in Germany that also create credit reports (Schufa, Creditreform, Crif, and Creditsafe). Other companies resell the data from these four providers. I provide the same services as these companies, but without including personal information such as directors or investors. The market for this service is quite large; for example, Creditreform sold over 26 million credit reports about companies in 2020.

My probability of default prediction performs quite well, achieving an AUC score of 0.87 on my test data. An AUC of 0.87 means that there is an 87% chance that the model ranks a randomly selected company that defaults higher than a randomly selected company that does not default.

Additionally, there are many more companies to crawl for my database.
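To make the AUC interpretation above concrete, here is a minimal sketch that computes AUC directly as that pairwise ranking probability. The function name `pairwise_auc`, the scores, and the labels are all made up for illustration; this is not the evaluation code or data behind the 0.87 figure.

```python
from itertools import product


def pairwise_auc(scores, labels):
    """AUC as the share of (defaulted, non-defaulted) pairs in which the
    defaulted company received the higher score. Ties count as half a win."""
    defaulted = [s for s, y in zip(scores, labels) if y == 1]
    healthy = [s for s, y in zip(scores, labels) if y == 0]
    if not defaulted or not healthy:
        raise ValueError("need at least one company of each class")

    wins = 0.0
    for d, h in product(defaulted, healthy):
        if d > h:
            wins += 1.0
        elif d == h:
            wins += 0.5
    return wins / (len(defaulted) * len(healthy))


# Hypothetical predicted default probabilities for six companies
# (label 1 = defaulted, 0 = did not default). Illustrative values only.
scores = [0.90, 0.70, 0.40, 0.35, 0.20, 0.05]
labels = [1, 0, 1, 0, 0, 0]

print(pairwise_auc(scores, labels))  # 7 of 8 pairs ranked correctly -> 0.875
```

In this toy example there are 2 x 4 = 8 pairs, and only one of them (the defaulted company scored 0.40 against the healthy company scored 0.70) is ranked the wrong way round. The loop is quadratic in the number of companies, so it is meant for illustration rather than for large test sets.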
Currently, I am focusing on companies that are required to publish their profit and loss statements. For testing purposes, 2,000 companies are available on my website so far. At the moment, the website is only available in German, but Google Translate works reasonably well on it.

Thank you very much for your feedback!

https://bonscore.org/

December 12, 2024 at 09:59PM