TechEcho

13 comments

onion2k about 9 years ago

One thing I hope this project does that Google fails to do is give developers a good API to access search. Google closed down their first web search API and now only give developers access to a limited Custom Search API that's rate limited to 100 queries a day for free with a hard limit of 10k searches - that makes it either very hard to develop anything against or relatively expensive. There are other options (Bing, Faroo, raw access to CommonCrawl) but they're either low quality or hard to work with. A good quality, straightforward, open web search API would be awesome.

libeclipse about 9 years ago

I've tried using different search engines to Google numerous times, but each time I've returned to Google simply because the searches are better. They're more accurate, more relevant, and I very rarely find myself searching more than once to find something.

If commonsearch can beat Google in that regard, then count me in. But I doubt it will.

whazor about 9 years ago

I think people might underestimate the power of an open source search engine. In my eyes it is like Wikipedia versus the old paper encyclopedia books. Improvements to search results in Google are made by a relatively small number of people from Google. Google decides where you buy, what you think and how you live. Behind their algorithms they have probably made dozens of subjective choices. Public debate, more attention to detail, and open politics are, as I see it, great tools to improve search engine quality.

jasode about 9 years ago

I like the project's goal but as techies, we inevitably want to understand the technical details and how it helps (or handicaps) the search results in comparison with Google.

For example, the project's data sources page[1] says that the bulk of data comes from The Common Crawl. It looks like the CC is ~150 TB of data[2]. I'm not familiar with google.com internals but various sources estimate that their proprietary crawl dataset is more than a petabyte. (A googler could chime in here with more accurate data.)

So it's not as simple as the algorithm for Common Search being "more fair" than the algorithm for Google Inc. The underlying dataset, in terms of quantity, recency, rules for the robot, etc., affects the algorithm.

This is not a criticism of the project. It is my attempt to understand what is not obvious on the surface level.

[1] https://about.commonsearch.org/data-sources

[2] http://commoncrawl.org/2015/12/november-2015-crawl-archive-now-available/ (I can't tell if each archive of MM/YYYY is cumulative or an addendum.)

mynewtb about 9 years ago

Seeing how the founder is the same person who founded Jamendo, which was later turned into a sad, user-unfriendly attempt to make money with freely licensed music (destroying its community in the process), how can I trust commonsearch not to be a waste of time and attention?

jdimov10 about 9 years ago

If it keeps being THAT fast after they've indexed the whole web, I'm switching search providers! :)

rmc about 9 years ago

I'm trying to find out from their website, but it's unclear. Are the servers hosted in the USA? And will the organisation be incorporated in the USA?

If you're talking about privacy and transparency, it's better to operate in a place bound by the European Charter of Fundamental Rights, rather than the US Constitution, because the former gives people many more rights over their data, how it's used, etc.

faizshah about 9 years ago

I like it!

The explainer tool gives a really cool insight into the results: https://explain.commonsearch.org/

struct about 9 years ago

Neat, I was working on a project to give a full programmatic keyword index to the contents of the common crawl, but I guess there's no need! It's very exciting to consider what kind of applications you can build with this.

mrfusion about 9 years ago

I'd love to see a Wikipedia-styled search where people can improve or flag results as they see fit. I wonder if that has been tried.

Sure, it might not handle the long, long tail, but the top ten million searches would still be pretty useful.

ocdtrekkie about 9 years ago

This sounds awesome. Speaking of building AIs/bots and such in your FAQ, the lack of a good open API for search is probably what gates that market to Google and Microsoft and such... That nobody else can just tap a search engine. I'd love to be able to connect to this for queries at some point.

PaulHoule about 9 years ago

"nonprofit" for me is a bad smell. I.e. the problem of sustainability, which for nonprofits is all about the money and not about carbon or solar energy, rainbows, plutonium or any of that.

tonylxc about 9 years ago

I'm particularly interested in the discuss forum. Is it an open source one, or did you build it yourselves? Thanks!

Common Search – nonprofit search engine for the Web
