Show HN: The Population Project

Two years ago, I turned 50. After a successful career as an entrepreneur, a business angel and a novelist, I set out to start a philanthropic venture under the following constraints:
- it had to be global.
- it had to be beautiful (in my eyes, at least).
- it had to be technology and stats driven.

I decided I would try to list the full name and date of birth of all humans alive. While some may find the concept pointless, I immediately knew I had struck gold:
- it was global and incredibly hard.
- it had an almost artistic quality to it, like an ever-changing installation.
- as a libertarian, I resent that states conduct censuses and then sit on the data.
- One billion people in the world aren't officially registered. At least someone would acknowledge their existence.

I created a non-profit called The Population Project. I would never make a dime off it, but at least my costs would be tax-deductible.

I then started researching lists of names online. I quickly adopted two principles. First, I would collect a minimal set of information: full name, birth date, and birth place. Second, I would only scrape public information, i.e. nothing behind a password.

After a few months, I realized I needed help from more experienced developers. I chose to work on 4D, a platform I had used in the past to develop my company's information system. It was a tough choice: 4D is not a leading player in the back-end world, but I figured the growth of API tooling would make language choice less critical.

The first iteration of our database was frustrating: way too slow to publish a website. I learned the power of incremental change, with each marginal improvement saving you a few percent of speed or space. I also got to implement concepts I had heard about but never used myself, such as mirroring, partitioning, or hash-indexing.

Then I hired a team of six data processors in Madagascar who clean up and process the lists found online. Lots of Python and Excel macros in their day-to-day. I have instilled in them an obsession with quality: a bad record will sit in our database forever. After trying dozens of software tools, we've settled on Adobe Acrobat and Octoparse.

The final piece was the website. I lucked out in finding a strong team in Romania. They build with Next.js and deploy on Vercel. I gave them Wikipedia as the model to aim for. We/they haven't been able to match Wikipedia's simplicity. Our pages are too heavy. But I find the site user-friendly, pleasing to the eye and reasonably fast. We can and we will do better.

A word about privacy. Some people complain that because it publishes names and DOBs, the Population Project infringes on their privacy. We obviously don't see it that way.
- All our info is public. That DOB you find on the site is probably in the voter list of your state, a list that anyone can request or simply download.
- The info we publish is minimal. Basically, we say that you exist. No one will find anything about your race, religion, sexual preferences, job or income.
- We have adopted Wikipedia's privacy policy. We do not record your IP, unless you create or edit a record.
- We're using Matomo for our analytics. Great stuff. It's not free, but they do not use your data like GA.

Why am I telling you all this? From the beginning, I've envisioned a three-step process:
1) Build the database and populate it with millions of Western profiles.
2) Launch the site, where anybody can create or edit records and share them with their family.
3) When we've reached critical mass (1B records?), start making deals with NGOs and governments, and venture into other alphabets.

We have just completed step 1. Step 2 is daunting as hell. I have grown a business but I have never grown a website. While I am ready to spend a bit of money on PR or SEO, I am not delusional: to reach the level of success we have in mind, we need this thing to go (somewhat) viral. How do you do that?
https://ift.tt/F7E8k4x August 7, 2023 at 09:16PM