Show HN: Search Engine for Blogs

389 points by dbrereton about 3 years ago

Hey HN,

Blog discovery is a problem [0] due to the decentralized nature of online writing. Everyone writes on their own site or platform, and there’s no central place that brings everything together. Google results prioritize large media publications over blogs, so we need something else.

Blog Surf is an attempt to organize all of the great online writing done by individuals. I launched this project last year as a directory of personal blogs [1], but have now rebuilt it from scratch into a full-text search engine for blog posts.

You can search for blog posts, and filter by publish date and reading time. Blogs are manually reviewed before being added.

Posts are sorted by MarketRank [2], which is a measure of popularity across various online communities. Most projects that have attempted to organize blogs lack any way to measure the quality of a post, reducing their utility. With MarketRank, you can expect the top results for any query to be something you’d want to read.

The mental model for searching Blog Surf is “I want to see the best essays on X.”

There’s also a directory so you can browse blogs by category, if you want a throwback to the Yahoo days.

If you’re a blogger yourself, you can check out the rankings page to see how your blog compares to others.

If you want to play around with things, we have a search API, and the full post dataset is also available for download.

[0] https://news.ycombinator.com/item?id=28591880
[1] https://news.ycombinator.com/item?id=26506126
[2] https://dkb.io/post/market-rank

49 comments

NAR8789 about 3 years ago

I dig this! Since you're full-text indexing blogs with an eye towards content discovery, can I tempt you towards building out a reverse link index function to enable me to browse by thread?

My use case: oftentimes blogs will respond to other blogs, linking to the original post in the process. The nature of linking means it's very easy to follow threads backwards in time, but given the original post it's often hard to discover the responses and ongoing conversation to follow things downstream. I'd like that downstream browsing to be easier.

My hope would be that such a tool could unlock higher-quality discourse. As a reader, this would let me hijack my natural tendency to follow comment threads, and redirect that attention towards slower-paced, more nuanced, more focused writing.

Edit: hmmm... though looking further, maybe this goes against your MarketRank philosophy.
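For anyone wondering what the reverse link index requested here might look like, the following is a minimal Python sketch, not a description of how Blog Surf works. It assumes the crawler already has, for each indexed post, the list of URLs that post links to; the `outbound_links` data, the example URLs, and both function names are hypothetical.

```python
from collections import defaultdict


def build_reverse_index(outbound_links):
    """Invert a {post_url: [linked_urls]} mapping into {linked_url: [post_urls]}."""
    reverse = defaultdict(set)
    for source, targets in outbound_links.items():
        for target in targets:
            if target != source:  # skip posts that link to themselves
                reverse[target].add(source)
    # Sort for stable, reproducible output
    return {target: sorted(sources) for target, sources in reverse.items()}


def responses_to(reverse_index, url):
    """Return the indexed posts that link to `url` (the downstream thread)."""
    return reverse_index.get(url, [])


# Hypothetical crawl data: two posts respond to an original essay.
outbound_links = {
    "https://a.example/original": [],
    "https://b.example/reply": ["https://a.example/original"],
    "https://c.example/followup": [
        "https://a.example/original",
        "https://b.example/reply",
    ],
}

reverse_index = build_reverse_index(outbound_links)
print(responses_to(reverse_index, "https://a.example/original"))
```

Given an original post, `responses_to` lists the posts that cite it, which is the downstream direction that ordinary hyperlinks do not expose. A real index would also need URL normalization (redirects, trailing slashes, tracking parameters), which this sketch leaves out.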

derekzhouzhen about 3 years ago

I agree with everything you said, except the popularity ranking. The value of content should be in the content itself; popularity is only a flawed measurement. Worse, popularity has very strong positive feedback that contributes to the great polarization of opinions.

jwood27 about 3 years ago

This is great! Tangentially related - and maybe an interesting way to view the network you have built up - I put together a quick d3.js force visualization of the "blogrolls" for ~300 linked blogs visible here: https://jacobwood27.github.io/035_blog_graph/

mgarfias about 3 years ago

We, sphere.com, did this starting in 2006. After a year or so, we realized the only people using the service were looking to stroke their egos.

Ice rocket, and something else (I can’t remember the name) tried it at the same time and failed.

We pivoted, which ended up leading to some unspeakable horrors.

At any rate, good luck, hope it works better for you.

stringlytyped about 3 years ago

You might want to consider using OpenSearch [1] to make it easier to add Blog Surf to browsers as a search engine that can be accessed from the location bar. I added it manually in Firefox but it would have been handy to just be able to right-click the search field and choose "Add a Keyword for this Search".

[1] https://developer.mozilla.org/en-US/docs/Web/OpenSearch

pomokhtari about 3 years ago

Google has been giving me a very hard time for a while. It's time for SEO to die. We need stuff like this.

rambambram about 3 years ago

Really like it, good job! Nice color scheme, font choice, and elegant layout.

One little thing though: changing a search phrase or word and doing a new search, I notice the results do change, but there's no way to know if it really happened. Changing a Google search, the whole page flashes empty, that way I see/sense there's something new. In your case, a change is subtle, very subtle, too subtle. In one instance I had to look carefully to see the change in results.

Maybe adding a "you searched for X" is good enough, but I guess you can come up with a better way.

heywoodlh about 3 years ago

This is a super cool product! If there was some way on blogsurf to have RSS feeds per category I'm sure that would make my RSS feed curation much easier, and more random and interesting, i.e. subscribe to all the blogs labeled in cybersecurity, linux, etc. Or maybe this functionality is already present and I didn't see it (I saw the RSS feeds per blog).

Unrelated: it was interesting to see my blog listed on the site. Kind of surreal that someone finds my content useful and/or interesting. Very motivating and humbling.

polote about 3 years ago

This is cool and the quality of content is great too, especially for finding the best-known blogs on a topic (and I feel like the quality of content is better than in all the blog search engines I have seen).

But I don't feel like manual curation by one person is easily compatible with a search engine. To me the content of your website is more suited to a weekly newsletter or something like that. Because after trying a few searches ("getting a job in vc", "best computer chair", "learning erlang") I'm not confident this returns better results than Google.

You've got a content size problem as you are manually curating, and this will lead to people not using your search as a default, and probably not using it as a search engine, but instead as a discovery system.

You can also try to get more blogs on your search engine, and create a community around it. If you want more, you can follow this newsletter [1] and you will probably get 5 new blogs per day.

Congrats on the job, this is very cool.

[1] https://hnblogs.substack.com

drBonkers about 3 years ago

Many search-engine posts recently. When will someone make the Search Engine for Search Engines?

markdown about 3 years ago

I searched using a keyword (kava), and got a list of random blog posts with absolutely nothing to do with kava. https://blogsurf.io/?query=kava

ICodeSometimes about 3 years ago

This is dope! Wondering how you determine the number of blog points a certain post gets? Is there a blog quality score that's programmatically determined? How so?

It seems you are planning to introduce automation instead of manually reviewing every submission, would be interesting if you could crawl via links from blog to blog.

I also feel you could likely just type in a url and postfix /blog to it to get a niche blog on some topic. Not sure if that's too simple but seems like it might work as a v0 for your automation.

jedwhite about 3 years ago

This is really awesome! I love that you're curating the sites included, and it shows in the quality of the results. The world needs more specialized searches like this, and you've done a brilliant job with implementation. I also really love that you have a directory.

I'm going to play with the API, and that's awesome you've made that available.

[Disclaimer: also working on a new search engine, and would love to include results from this!]

taubek about 3 years ago

This is great. What is the policy on accepting blogs to be indexed?

applgo443 about 3 years ago

Interesting.

How do you figure out which are blogs and which aren't?

superasn about 3 years ago

This is definitely very cool as I've been looking for something like this since Technorati (which was originally a blog search engine).

Would love to hear details about how you created the database, the infrastructure, etc. if it's not a trade secret. Kudos on the launch!

Kye about 3 years ago

This is neat, but it's quietly doing some kind of fuzzy matching with my query.

https://blogsurf.io/?query=furries

I actually have no idea what it's really searching for. Only one of the first ten results contains my query. Compare to HN's search which highlights the matched words so it's at least clear when it's going off-script.

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=furries&sort=byPopularity&type=story

Quotes provide an exact match here, but not on your search engine.

asicsp about 3 years ago

Great to have full-text search option!

I used to browse latest posts using the https://blogsurf.io/posts link, is that functionality not available anymore?

codazoda about 3 years ago

I love the idea, but I couldn’t find any results that didn’t look mostly random. I thought I’d look for posts about shooting or editing video. Couldn’t find anything close, no matter how I ordered my query.

Minor49er about 3 years ago

The tags are pretty limited and have some duplicate entries. Submissions also require a tag for them to be posted, so if they don't fit what's already there, a wrong one will have to be supplied.

Reith about 3 years ago

I love the idea and I cannot stress enough how many times I've regretted not finding undistracting results from those knowledgeable people who put time and energy into producing high-quality materials, instead of search-engine-optimized dull content! That said, a mere full-text search lacks an understanding of context. I tried getting some results about gardening and plants, but the returned results were anything from bitcoin to power plants, or even DNA, and not vegetation.

valdect about 3 years ago

That's an awesome implementation and it works really neatly. I have been thinking of adding this capability to https://refined.blog/. Also, if you need tagged blog sites you can use our bloglist. I previously posted it on HN, so there are some good blogs in here too (https://news.ycombinator.com/item?id=27973836).

dustinmoris about 3 years ago

The search algorithm seems to be extremely poor. I searched for the exact blog post title of some very good blog posts which in other search engines end up in the top 5 if you search for their topic and on blogsurf it didn’t look like it appeared in the results at all. That’s very very weak, especially the first 10 results had nothing to do with the topic I searched for. It was just super high profile websites which had remotely mentioned a few of the keywords.

pete_nic about 3 years ago

This is great. As a frequent Google user, I breathed a sigh of relief seeing ad free search results from individuals.

Curious - how do you know whether a site is a blog versus something else?

rmason about 3 years ago

Potentially quite useful. But I ran into one snag. I searched for my friend and frequent blogger Ben Nadel. But at the top all the posts were about Angular.

What I wanted were all his posts that weren't about Angular. So I tried adding -angular which works in Google. It pulled up one non-angular post and all the rest were the original ones that are there when you load the page. Add that one feature and I will probably use it a lot.
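As an illustration of the `-term` exclusion requested here (and the quoted exact-match behaviour others in the thread ask for), below is a minimal Python sketch of query parsing and filtering. It is an assumption about how such operators could be handled, not Blog Surf's actual search code; `parse_query`, `matches`, and the sample post texts are hypothetical.

```python
import re

# An optional leading "-", followed by either a "quoted phrase" or a bare word.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')


def parse_query(query):
    """Split a query into required words, exact phrases, and excluded terms."""
    parsed = {"include": [], "phrases": [], "exclude": []}
    for negated, phrase, word in TOKEN.findall(query):
        text = (phrase or word).lower()
        if not text:
            continue  # ignore empty quotes
        if negated:
            parsed["exclude"].append(text)
        elif phrase:
            parsed["phrases"].append(text)
        else:
            parsed["include"].append(text)
    return parsed


def matches(post_text, parsed):
    """Check one post: all words and phrases present, no excluded term present."""
    lowered = post_text.lower()
    words = set(re.findall(r"\w+", lowered))
    return (
        all(term in words for term in parsed["include"])
        and all(phrase in lowered for phrase in parsed["phrases"])
        # Substring check, so "-angular" also drops posts mentioning "AngularJS"
        and not any(term in lowered for term in parsed["exclude"])
    )


parsed = parse_query("Ben Nadel -angular")
print(matches("Ben Nadel on ColdFusion closures", parsed))
print(matches("Ben Nadel: Angular directives", parsed))
```

In a real engine the exclusion would normally be pushed into the index query rather than applied as a post-filter, but the parsing step is the same idea.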

u2077 about 3 years ago

Cool idea, I love search engines with content made by real people. I’m not sure how many of these you have, but you might be able to pull some more blogs from https://bloggingfordevs.com/trends/ or https://blogdb.org/blogs

azhenley about 3 years ago

This is awesome! I see some blog posts of mine already on here but using an outdated URL.

Are there any plans to check for redirects and update the URL or to recrawl?

davidgerard about 3 years ago

Google used to have a good blog search. Biggest problem was SEO spam blogs - but even these used to fall down the rankings.

Sadly it died of Google deprecating an API. https://en.wikipedia.org/wiki/Google_Blog_Search

marban about 3 years ago

For business news articles: https://yup.is

shortformblog about 3 years ago

Dmitri showed this to me a couple of weeks ago, and I was super-impressed, enough so that even though he sent me a note about it at the end of the night, I stayed up to respond to him. This makes me feel like the spirit of Technorati has a chance of making a comeback someday.

SecurityLagoon about 3 years ago

I love this. I am always on the lookout for material written by individuals, but it's surprisingly hard on the modern web.

Tbh I'll probably use the random bit more than search but definitely going to keep checking back to pad my RSS feeds with interesting content.

ChrisArchitect about 3 years ago

What year is it?

Don't hate blogs and happy for resurgence, but repeating an uphill battle with indexing like it's 2007.

Also, random interesting posts on front page are like from 2009, 2011, 2015... What? That's the freshest, most relevant content?

codeconscious about 3 years ago

Lovely! I agree with others that this is promising. Thanks for sharing.

One point of feedback: Searching for "C#" seems to bring up C articles and no C# ones, so I suspect perhaps the "#" isn't being included.
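The suspicion above is consistent with how a word-character tokenizer behaves, though whether Blog Surf tokenizes this way is only a guess. The Python sketch below shows the general effect: a `\w+` pattern drops "#" and "+", so "C#", "C++", and "C" all collapse to the same token, while a pattern that allows trailing symbols keeps them distinct. Both patterns and the `tokenize` helper are hypothetical.

```python
import re

# Word characters only: "#" and "+" are treated as separators and discarded.
NAIVE = re.compile(r"\w+")
# Same, but keeps any run of "#" or "+" directly after a word (C#, C++, F#).
SYMBOL_AWARE = re.compile(r"\w+[#+]*")


def tokenize(text, pattern):
    """Lowercase every token the given pattern finds in the text."""
    return [token.lower() for token in pattern.findall(text)]


sample = "C# and C++ vs C"
print(tokenize(sample, NAIVE))         # "c#" and "c++" both become "c"
print(tokenize(sample, SYMBOL_AWARE))  # "c#", "c++" and "c" stay distinct
```

The same tokenizer has to be applied to both the indexed text and the query; fixing only one side would still leave "C#" queries matching nothing or matching plain C posts.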

camel-cdr about 3 years ago

I encourage all of you to submit the blogs you are following. Only one of the ten blogs I'm following was supported. Even quite well-known ones like oldnewthings or lemir's blog were missing.

ilrwbwrkhv about 3 years ago

Bookmarked. Going to use it.

Now if someone would just make a better Reddit search.

And then another one for high value properties like Mayo Clinic, Wikipedia, GitHub etc, I will not need to use Google anymore.

stanislavb about 3 years ago

I tried searching for "saashub", and none of the results had a single mention of that term. Do you know what the reason is? Some stemming?

ropeladder about 3 years ago

This is a great idea! It's very tech-centric, though (at least judging from your directory).

recuter about 3 years ago

Would you mind sharing some stats on the index? Are you populating it with manual curation?

COil about 3 years ago

Excellent idea, I was thinking of creating something similar. My new homepage, for sure.

davepeck about 3 years ago

I love seeing new search engines and this appears to be very well done. The web needs better visibility into its blogs. Bravo!

I wonder about the use of MarketRank. For instance, search for "COVID" and your top hit will be from Alex Berenson, a well-known purveyor of outright COVID misinformation. Is this post "interesting"? Yup. Is it trustworthy? Absolutely not.

throwawaylala1 about 3 years ago

Love the idea, but I can't search for exact matches using quotes :(

bspear about 3 years ago

This is cool! How are you thinking about attracting users (besides HN)?

escapedmoose about 3 years ago

Yes! Death to SEO! Really great tool; thank you for sharing.

akselmo about 3 years ago

I love the flaming Comic Sans Directory header, lol

mr90210 about 3 years ago

Great idea mate. Keep up the good work.

hidden-spyder about 3 years ago

Where can I see an index of all posts?

ahmadrosid about 3 years ago

I love this, it would be cool if we could submit a blog to be indexed. Of course you can review it first to avoid spam.

pcthrowaway about 3 years ago

Very excited to see this! I noticed some of my favourite bloggers don't appear to be indexed for whatever reason: Josh Comeau, Julia Evans, Amelia Wattenberger. Any idea why these aren't indexed and if you plan to add them? I wonder if you could get a list of some of the most popular blogs on HN (perhaps the maintainer of upvotetracker can help) to add to the index.

diogenesjunior about 3 years ago

this is very cool, thank you