I'm happy to say that the reports of my death here are greatly exaggerated :)<p>I'm the owner of both #4 and #140 on the Top-scoring Show HN Stories that Didn’t Survive... but both are very much alive!<p>#4 StackSort was a Github.com page, but in 2021 they made it so only Github.io works. If dang sees this, I'd really appreciate it if you could change the URL for <a href="https://news.ycombinator.com/item?id=5395463">https://news.ycombinator.com/item?id=5395463</a> to use github.io!<p>#140 ReadMe has the same io/com issue, in the opposite direction! We redirect readme.io to readme.com now, which seems to be why it's flagged.
> Extra: ChatGPT Gave a Wrong Regex
I consulted ChatGPT for a regex to extract domains from URLs, and it gave a flawed one:<p>^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n?]+).<p>It even gave reasonably detailed explanations which convinced me. Later tests revealed that this regex doesn’t work for URLs with @ in the path, such as <a href="https://foo.com/@./bar" rel="nofollow noreferrer">https://foo.com/@./bar</a>. The correct one should be<p>^(?:https?:\/\/)?(?:[^@\/\n]+@)?(?:www\.)?([^:\/?\n]+).<p>---------------------<p>The trick is to ask ChatGPT what the right tool for the job is in your language of choice. For Python, ChatGPT will happily give you:<p><pre><code> from urllib.parse import urlparse
extract_domain = lambda url: urlparse(url).netloc.replace('www.', '', 1)
# Example usage
url = 'https://foo.com/@./bar'
domain = extract_domain(url)
print(domain) # Output: foo.com
</code></pre>
-------------<p>I don't think RegEx is typically the "most" correct tool for the job for things which likely have built-in parser libraries (XML, HTML, URLs, JSON, etc)
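To make the comparison above concrete, here is a minimal sketch that runs both quoted regexes and a parser-based alternative on the same input. It assumes the trailing period after each regex in the quote is sentence punctuation rather than part of the pattern. The names `FLAWED`, `FIXED`, and `extract_domain` are made up for illustration. Using `hostname` instead of `netloc` is a swapped-in choice so that userinfo and ports are dropped as well.

```python
import re
from urllib.parse import urlsplit

# The two regexes quoted above, minus the trailing period
# (assumed to be sentence punctuation, not part of the pattern).
FLAWED = re.compile(r"^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n?]+)")
FIXED = re.compile(r"^(?:https?:\/\/)?(?:[^@\/\n]+@)?(?:www\.)?([^:\/?\n]+)")

# Matches an optional scheme followed by "//", e.g. "https://" or "//".
_HAS_AUTHORITY = re.compile(r"^(?:[a-z][a-z0-9+.-]*:)?//", re.IGNORECASE)


def extract_domain(url: str) -> str:
    """Hypothetical helper: return the host of a URL without a leading 'www.'."""
    # urlsplit only recognises a network location after "//",
    # so scheme-less input such as "foo.com/bar" gets a "//" prefix first.
    if not _HAS_AUTHORITY.match(url):
        url = "//" + url
    host = urlsplit(url).hostname or ""
    # Strip "www." only when it is a true prefix of the host.
    return host[4:] if host.startswith("www.") else host


if __name__ == "__main__":
    url = "https://foo.com/@./bar"
    print("flawed regex:", FLAWED.match(url).group(1))
    print("fixed regex: ", FIXED.match(url).group(1))
    print("urlsplit:    ", extract_domain(url))
```

The flawed pattern treats everything up to the `@` in the path as userinfo, which is why excluding `/` from that character class repairs it. The parser-based version avoids the question entirely.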
Nice work!<p>I'd actually be interested in factors that make a Show HN a success vs failure.<p>Objectively, there's an obvious one in your dataset: time of submission. Tuesday afternoon (which timezone? I assume US west coast?) seems to be key. No way this correlates with the quality of submissions.<p>Subjectively: it seems to have become much harder recently. A couple of years ago I managed to reach the front page for a short time with an Android app; now I'm barely able to get above 20 points, even though the product is (again, subjectively) cooler and has a possibly wider audience (<a href="https://news.ycombinator.com/item?id=35671245">https://news.ycombinator.com/item?id=35671245</a>).<p>Not complaining, but perhaps nowadays Show HN is no longer an easy way for indie hackers to "get the word out" and get some early user feedback? Any other sites that might be of interest?
No affiliation, but the second to top deceased site is still alive and kicking [0]<p>Spot checking the top results might give a better estimate for how many are actually alive vs. just using bot protection.<p>[0]<a href="https://news.ycombinator.com/item?id=35543668">https://news.ycombinator.com/item?id=35543668</a>
Thanks for making and sharing this - although I'm surprised it's not a "Show HN" itself!<p>I was curious about the top post that didn't survive - an HTML5 game called "airma.sh" - and I wanted to check it out. I <i>think</i> I found a working mirror: <a href="https://www.crazygames.com/game/airmash" rel="nofollow noreferrer">https://www.crazygames.com/game/airmash</a><p>It's possible that this is a different game, but it seems to fit the description.<p>Interestingly, the person who submitted that post stopped being active on HN after that discussion.
I know you mention there are lots of reasons for false positives and negatives, but does your methodology account for length of time at all? Meaning, if a project was posted to HN in 2009, it could have been successful for 14 years and then closed down, or just changed URLs somewhere along the way, and in that case it would be counted as a failure even though it wasn't. Likewise, if it was posted in May, 2023 and is still around, that doesn't mean much because it's still flying the Grand Opening banner, practically.
The top 250 has 8 dead projects from 2023. Of those 8, 5 are not dead at all, 1 is alive but has an expired certificate and only 2 (the lowest ranked) are dead. This does not seem like useful data.
Airmash still lives at <a href="https://airmash.online/" rel="nofollow noreferrer">https://airmash.online/</a> and there’s also a space mod - Starmash - at <a href="https://airmash.cc/" rel="nofollow noreferrer">https://airmash.cc/</a><p>I apologise in advance for the hours you’ll lose to these (again?)
> Looking for a Sponsor to Host the Database Publicly
> In the meantime, it’d be great if anyone can query the database. I tried to host a public database and real-time query interface online, but couldn’t afford the bill for a smooth Postgres instance to hold around 20G (40M rows plus indices) data. While a $20 instance could suffice, it’s pretty slow from usable, comparing to the local one on my M2 MacBook Air.<p>Here is the database with a publicly available SQL endpoint: <a href="https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPTSBoYWNrZXJuZXdzIExJTUlUIDEwMDA=" rel="nofollow noreferrer">https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPT...</a>
Regarding database hosting, if you would consider giving the data away, I would suggest converting it to an SQLite database and sharing it over Torrent.
Neat idea, thanks for sharing.<p>Curious choice to highlight Show HNs that didn't survive, but not the ones that did.<p>Is there a reason for this?
> <i>Send me your interesting queries</i><p>I'd be interested to see what the top Show HN posts were, after adjusting for the growing size of the HN community. That is, posts from 10 years ago would not have garnered as many upvotes simply because the community was smaller, and presumably posts were upvoted less back then, in general.<p>I don't know the best way to measure this; it could be normed based on the median number of upvotes for the top story each week, bucketed by month. Probably someone has a better idea for this.
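One way to read the normalization idea above, as a rough sketch rather than a settled method: take the top score in each week, use the median of those weekly tops within a month as that month's baseline, and divide every post's score by its month's baseline. The function name `normalized_scores` and the `(title, date, score)` input shape are assumptions for illustration, and the sample rows are invented, not taken from the dataset.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def normalized_scores(posts):
    """posts: iterable of (title, posted_on, score) tuples.

    Returns (title, score / monthly_baseline) pairs, highest ratio first.
    The monthly baseline is the median of the weekly top scores in that month.
    """
    posts = list(posts)

    # Highest score seen in each (year, month, ISO week) bucket.
    # A week that straddles two months counts once in each month.
    weekly_top = {}
    for _, day, score in posts:
        key = (day.year, day.month, day.isocalendar()[1])
        weekly_top[key] = max(weekly_top.get(key, 0), score)

    # Median of the weekly tops, per month.
    tops_by_month = defaultdict(list)
    for (year, month, _), top in weekly_top.items():
        tops_by_month[(year, month)].append(top)
    baseline = {m: median(tops) for m, tops in tops_by_month.items()}

    ranked = [
        (title, score / baseline[(day.year, day.month)])
        for title, day, score in posts
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Invented sample rows, for illustration only.
    sample = [
        ("old", date(2013, 5, 7), 150),
        ("old2", date(2013, 5, 14), 50),
        ("new", date(2023, 5, 9), 400),
        ("new2", date(2023, 5, 16), 800),
    ]
    for title, ratio in normalized_scores(sample):
        print(title, round(ratio, 2))
```

With a baseline like this, an older post that stood far above its own month can outrank a recent post with a larger raw score. Other baselines (monthly active users, total votes cast) would be just as defensible.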
I am also, along with gkoberger, happy to say that we didn't die after our Show HN (Show HN: A Covid-19 testing location site that a group of us are building)<p><a href="https://news.ycombinator.com/item?id=22650725">https://news.ycombinator.com/item?id=22650725</a><p>In fact, we were so successful that we were able to shut it down less than a year after we started (it's on the list as a very reasonable Type II error ;))<p>Thanks to the HN community for helping us get an amazing temporary product out and shut down successfully.
Recently I was browsing through old threads where users showed off their personal websites and blogs. I wanted to find some inspiration for my own website.<p>What I found instead were about 3/4 dead links, even though the threads were all from the last 4-5 years. I found that quite sad, because people often talked with great passion about their websites and they sounded really cool. Also, I LOVE those small, personal islands in the big, commercialized and in many ways centralized web.
> So I’m looking for a sponsor to host the database publicly. I need one mediocre VM for a Rails stack app and a semi-powerful hosted Postgres instance. Contact me if you’re interested<p>The Oracle Cloud Free tier is a great deal. They give you 4 Ampere A1 Cores + 24 GB RAM + 200GB storage for free. More than enough for a 20G (40M rows plus indices) Postgres instance.
Is there a way to see how long a link stays on the HN front page on average, and whether that average is rising or falling over time? I read that the average time a hashtag spends on the Twitter trending page has been falling year over year, indicating that people are paying less attention to any one thing.
I'd love to get some correlation with rank, or even filtering of lower scoring posts.<p>From what I know, HN posts are often used as a signal for viability of a project. In that case, you can't make a conclusion on the effectiveness of Show HN posts, because some of them will die off by design.
Just a silly aside with regards to the regex to extract domains from URLs, my little tool called unfurl [0] exists to solve that exact sort of problem :)<p>[0]<a href="https://github.com/tomnomnom/unfurl">https://github.com/tomnomnom/unfurl</a>
Phind (#2 on your list) is still up and running also (<a href="https://www.phind.com/search?q=false%20negative&source=searchbox">https://www.phind.com/search?q=false%20negative&source=searc...</a>).
My Show HN from 2013 is still alive but it's listed as dead (#590). Probably because the link from the post uses https but my 301 redirect only works using http.
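A false positive like the one described above could be reduced by retrying a link under the other scheme before calling it dead. The following is only a sketch of that idea and not the article's actual checker. `scheme_variants` and `looks_alive` are hypothetical names, and `fetch_status` is an injected callable (assumed to return an HTTP status code or raise `OSError`) so the logic can be run without network access.

```python
from urllib.parse import urlsplit, urlunsplit


def scheme_variants(url):
    """Return the URL as given, followed by the same URL with http/https swapped."""
    parts = urlsplit(url)
    swap = {"http": "https", "https": "http"}
    variants = [url]
    if parts.scheme in swap:
        variants.append(urlunsplit(parts._replace(scheme=swap[parts.scheme])))
    return variants


def looks_alive(url, fetch_status):
    """Treat a site as alive if any scheme variant answers with a status below 400.

    fetch_status(url) is a caller-supplied function (an assumption of this
    sketch) that returns an int status code or raises OSError on failure.
    """
    for candidate in scheme_variants(url):
        try:
            status = fetch_status(candidate)
        except OSError:
            continue  # connection or TLS failure: try the next variant
        if status < 400:
            return True
    return False


if __name__ == "__main__":
    # Fake fetcher: HTTPS fails, plain HTTP answers, mimicking the case above.
    def fake_fetch(candidate):
        if candidate.startswith("https://"):
            raise OSError("TLS handshake failed")
        return 200

    print(looks_alive("https://example.com/app", fake_fetch))
```

The same loop could be extended with other variants mentioned in this thread, such as a moved domain, but each extra rule is a guess about why a link stopped resolving.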
The pandemic really got the activity going during 2020 (first bar chart), which is maybe not so surprising with everyone pivoting to remote work, and obviously all the discussions about vaccines and how different governments were handling things.