Collaborative Map-Reduce in the browser

59 points by igrigorik about 16 years ago

9 comments

timf about 16 years ago
"how hard would it be to assemble a million people to contribute a fraction of their compute time?"

The BOINC project's done it; they've seen 1 million+ computers. And they even have an installation barrier, which is different from what you are suggesting (their software is robust and easy to install, but you still have to do it).

One thing BOINC and BOINC projects do well is establish non-monetary incentives, whether competitions, fancy graphs, etc. That's something to solve; I'm not sure enlisting just your social network (manually, with a URL) is going to cut it if you want thousands of participants (unless you are particularly "influential", I guess).

Or maybe this is something a legion of mechanical turkers would be interested in?
ryanwaggoner about 16 years ago
I knew I'd seen something like this before...

http://www.pluraprocessing.com

Launched on HN (where else) a few months ago:

http://news.ycombinator.com/item?id=347359
lecha about 16 years ago
Here are some more "business ideas" for your enjoyment:

- Buy tons of those fancy interactive visual advertisements, embed the worker into them, and perform map-reduce jobs in the browsers of unsuspecting users.

- Run some of the analytic/batch processing for a popular social network on your customers' CPUs.

- Have a popular site? Sell its audience's CPUs just like one sells impressions via AdSense.
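For the curious, here is a minimal sketch of what such an embedded worker might look like, written as a browser Web Worker in TypeScript. The coordinator URL and its /task and /result endpoints are hypothetical, as is the word-count map step; this illustrates the idea, it is not anyone's actual implementation.

    // Hypothetical map worker: poll a coordinator for a chunk, run the
    // map step locally, and post the partial result back.
    type MapTask = { id: string; words: string[] };

    const COORDINATOR = "https://coordinator.example.com"; // made-up endpoint

    async function workLoop(): Promise<void> {
      while (true) {
        const res = await fetch(`${COORDINATOR}/task`);
        if (res.status === 204) break; // coordinator has no work left
        const task: MapTask = await res.json();

        // Map step (word count): emit (word, 1) pairs, pre-reduced locally.
        const counts: Record<string, number> = {};
        for (const w of task.words) counts[w] = (counts[w] ?? 0) + 1;

        await fetch(`${COORDINATOR}/result`, {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ id: task.id, counts }),
        });
      }
    }

    void workLoop();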
raghus about 16 years ago
"Google's server farm is rumored to be over six digits (and growing fast), which is an astounding number of machines, but how hard would it be to assemble a million people to contribute a fraction of their compute time?"

Maybe Google can put an optional thingy into Chrome so that users' computers can be part of their server farm?
henryl about 16 years ago
I realize the author wasn't proposing that something like this could be a business, but humor me:

I had this idea a few years back, with a business model that paid publishers for CPU cycles gathered via a JavaScript or Flash widget. We hoped to then sell the service to data-intensive industries. We decided it wasn't feasible.

You need to weigh the CPU cycles gained from this regime against the bandwidth and CPU cycles lost to the hundreds of web, queue, and data servers needed to run it. IMO it is unlikely that this model pays off once you consider things like network latency, and trade-offs like job size (larger is better for overhead) vs. job completion probability (smaller is better).

Even if the potential for viability were there, it isn't clear that there is a market for something like this. Large-scale computing challenges obviously exist, and a lot of people are making money with solutions like cloud computing, but these problems typically involve proprietary data sets and proprietary or industry-standard software (good ol' apps like MySQL). Chopping up your sensitive data and sending it en masse to the public to be processed in JavaScript instead of C++ doesn't exactly fit client needs.
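The job-size trade-off above can be made concrete with a toy model (my assumption, not the commenter's): suppose a visitor abandons the page at a roughly constant rate, so the probability of finishing a job decays exponentially with its length, and partial work is discarded. A sketch in TypeScript, with an invented 60-second typical dwell time:

    // Toy model: P(completion) = exp(-jobSeconds / typicalDwellSeconds).
    // Expected useful work = jobSeconds * P(completion), since partial
    // results are thrown away. Both parameters are illustrative.
    function expectedUsefulSeconds(jobSeconds: number, typicalDwellSeconds: number): number {
      return jobSeconds * Math.exp(-jobSeconds / typicalDwellSeconds);
    }

    // Sweep job sizes against a 60-second typical dwell time; the payoff
    // peaks when job size equals the dwell time, then falls off.
    for (const size of [5, 15, 30, 60, 120, 300]) {
      console.log(size, expectedUsefulSeconds(size, 60).toFixed(1));
    }

Under this model, expected useful work peaks when the job length equals the typical dwell time, which is one way to see why chunk size is the knob such a system would live or die by.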
sam_in_nyc about 16 years ago
Have fun making sure clients don't send you invalid data. You'd need some sort of voting system where several clients compute the same piece, and then you check that the results all match up. Even then, you can't be 100% sure of the results.
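A minimal sketch of that voting scheme, assuming each chunk is replicated to several clients and results are compared by JSON serialization (a simplification; real results would need a canonical encoding):

    // Accept a chunk's result only when at least `quorum` of the replies
    // agree; otherwise return null so the scheduler can reissue the chunk.
    function majorityResult<T>(replies: T[], quorum: number): T | null {
      const tally = new Map<string, { value: T; votes: number }>();
      for (const r of replies) {
        const key = JSON.stringify(r);
        const entry = tally.get(key) ?? { value: r, votes: 0 };
        entry.votes += 1;
        tally.set(key, entry);
      }
      for (const { value, votes } of tally.values()) {
        if (votes >= quorum) return value;
      }
      return null; // no agreement: recompute
    }

    // e.g. three replicas, require two matching answers
    majorityResult([42, 42, 7], 2); // => 42

And as the comment says, a quorum only raises confidence: colluding or Sybil clients can still agree on a wrong answer.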
piramida about 16 years ago
How is this related to map-reduce besides the method names? It has a single point of failure (the server), the nodes have no logic to split the job further, and on top of that there's the painfully slow JavaScript engine...
jacktang about 16 years ago
My current work might be related to this field. We created Firefox add-ons and let the browsers work for us.
moonpolysoft about 16 years ago
Sorry, but this is basically grid computing with a slightly different client. As pointed out many times before, most interesting problems right now are IO-bound. It turns out that data locality is the most important thing in processing extremely large datasets. That is the key insight of the map-reduce paper and the linchpin of the success or failure of all the distributed map-reduce frameworks that have sprung from it.

Most startups and small-scale companies that would see the value in leveraging a system like this simply don't have the right processing profile to make it worth their while. I'm sure if you graphed CPU time per byte of data you'd find a sweet spot where a service like this would speed up jobs rather than slowing them down.

As it happens, most companies with a high CPU-time-per-byte ratio are either financial firms or pharma, most of whom not only have their own infrastructure but would rather close up shop than see their proprietary code out in the wild for competitors to analyze.

And there are already plenty of clients out there for running Fourier transforms on possible SETI signals.
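The CPU-time-per-byte sweet spot lends itself to a back-of-the-envelope check: shipping data to remote browsers only pays off when the compute it buys outweighs the transfer cost. A sketch, with all the numbers (bandwidth, worker count, JS-vs-native slowdown) invented for illustration:

    // Offloading wins when transfer time plus parallelized remote compute
    // beats computing locally. `slowdown` is the penalty for running in a
    // browser JS engine instead of native code.
    function worthDistributing(
      bytes: number,
      cpuSecondsPerByte: number,  // how compute-heavy the job is
      linkBytesPerSecond: number, // effective bandwidth to workers
      slowdown: number,
      workers: number
    ): boolean {
      const local = bytes * cpuSecondsPerByte;
      const shipped =
        bytes / linkBytesPerSecond + (local * slowdown) / workers;
      return shipped < local;
    }

    // IO-bound: 1 GB of data, 1 microsecond of CPU per byte -- keep it local
    console.log(worthDistributing(1e9, 1e-6, 1e6, 10, 1000)); // false
    // CPU-bound: 1 MB of data, 10 ms of CPU per byte -- worth shipping
    console.log(worthDistributing(1e6, 1e-2, 1e6, 10, 1000)); // true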