TechEcho
Academic Torrents

448 points by yinghang over 11 years ago

25 comments

cing over 11 years ago
The team should learn from the ghost-town that is BioTorrents [1] and offer more than just a tracker.
[1] http://www.biotorrents.net/browse.php?incldead=1
TheBiv over 11 years ago
This is really cool.
I simply wish the messaging were clearer and told a story that I could tell to my friends who ultimately are "too busy" to think about the value of this product.
Unfortunately, "We've designed a distributed system for sharing enormous datasets - for researchers, by researchers. The result is a scalable, secure, and fault-tolerant repository for data, with blazing fast download speeds." just isn't a story that I can tell to my buddies and get them excited.
hardwaresofton over 11 years ago
Wow, this is pretty cool -- one of the most direct approaches to open data that I've seen so far (and the research world is of course in dire need of this kind of open data/connect-the-dots enabling effort)!
I think it would be pretty cool to have trending datasets on the front page (I'm sure you could do a small cron job that would find the most-downloaded per week/per day/etc.).
Also, while not a dire necessity, I think a cooler name would help this project fly farther -- you should be able to make a play on "data torrents", maybe something like datastorm/samplerain/datawave/dataswell/Acadata?
Anyway, trivial stuff aside, nice implementation -- bookmarked for when I get the urge to do a data-analysis project!
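The "small cron" idea for trending datasets could be sketched roughly as below. This is only an illustration: the thread does not describe how the site records downloads, so the `download_log` list of `(dataset_name, timestamp)` pairs, the `trending` function, and the sample data are all hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def trending(download_log, now, window_days=7, top_n=3):
    """Return the top_n most-downloaded dataset names within the window.

    download_log is a hypothetical list of (dataset_name, timestamp) pairs;
    the real site's storage format is not described in the thread.
    """
    cutoff = now - timedelta(days=window_days)
    # Count only downloads that happened at or after the cutoff.
    counts = Counter(name for name, ts in download_log if ts >= cutoff)
    return counts.most_common(top_n)


# Made-up sample data, for illustration only.
now = datetime(2014, 1, 31)
log = [
    ("dataset-a", datetime(2014, 1, 30)),
    ("dataset-b", datetime(2014, 1, 29)),
    ("dataset-a", datetime(2014, 1, 28)),
    ("dataset-c", datetime(2014, 1, 1)),  # outside the 7-day window
]
print(trending(log, now))
```

A scheduled job could run something like this once a day and cache the result for the front page; changing `window_days` gives the per-day or per-week variants mentioned above.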
teddyh over 11 years ago
So what do I do if I want to seed them all? Also, are all the data sets (and other things) freely licensed, i.e. no “non-commercial use only” clauses or things of that nature? Can I count on this going forward?
jakeogh over 11 years ago
A few TB of FOIA information related to the September 11th attacks is available via BT.
Direct link: http://911datasets.org/images/911datasets.org_all_torrents_Jan_30_2014.zip
dav- over 11 years ago
Any reason passwords for user accounts are limited to 40 chars?
ses over 11 years ago
Projects like this confirm my suspicion that traditional academic publishing is going to take a nosedive in the next few years. Working in this industry as I do, I don't see commercial publishers moving quickly enough to change. Really love the idea of this and can't help but support the general ethos of it, even if it / its descendants will put a lot of us out of a job.
kartikkumar over 11 years ago
Brilliant idea if I understand it correctly. Just want to check that my use case would fit. I just submitted my first and main paper for my PhD to Icarus. I'm planning on soon uploading it to ArXiv as well. My paper is theoretical in nature and through a suite of Monte Carlo simulations I generated a few hundred MBs of data. Can I make use of this system as a way to deposit that data so that it's available to anyone that wants to verify the conclusions I reach in my paper and possibly extend the research?
thedudemabry over 11 years ago
Wow! That's a snappy site. Major props to the frontend dev(s).
csense over 11 years ago
I'm surprised they don't have the Google Books n-gram dataset [1]. Then again, maybe they're more focused on data that doesn't have a good home already than on mirroring.
[1] http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
lancemjoseph over 11 years ago
Many of the datasets that I've seen in academia are stored in static SQL databases that tend to be about 10-20 terabytes. Where does this leave individuals with limited resources who would like to query large databases without having to juggle the data management side of research? Is there software that makes database querying accessible over P2P?
macarthy12 over 11 years ago
The problem is the word "torrent". Too many negative connotations for many in the traditional academic world.
shitlord over 11 years ago
I have an idle server with 500 Mb/s upload. Now I can finally put it to good use! :)
incogmind over 11 years ago
I remember the old days of DC++ whenever I hear blazing fast speeds.
linux_devil over 11 years ago
Great! Looking forward to Coursera, edX and OCW videos too.
alagappanr over 11 years ago
We would need a significant number of seeders for this to become a widely used product. Perhaps universities could seed the data?
nvdk over 11 years ago
This seems to be very focused on US academics; at least that is the impression I get from the labeling of ".edu" addresses. It gives a feeling that these torrents/datasets are of better quality. I'm also missing a catalog on this tracker; some basic taxonomy would be most welcome...
mathattack over 11 years ago
I am no expert on torrents, but I like this conceptually. Publicly funded academic research should be free.
talles over 11 years ago
What a wonderful idea. This fits so well with the torrent protocol (maybe even philosophically speaking).
huevosabio over 11 years ago
This is awesome! Thanks for sharing!
erikb over 11 years ago
Awesome invention! Could this be connected with Google Scholar to add keyword searching?
guspe over 11 years ago
Aaron Swartz's dream come true?
sillysaurus2 over 11 years ago
One problem with offering a dataset as a torrent is that it's impossible to edit it after it's released. However, it seems like that doesn't matter at all in this case, because any scenario I can think of which could be solved by editing the dataset (like redacting private info that was accidentally included) wouldn't avoid the original problem: that they accidentally released private info in the first place. Perhaps it'd be useful to edit the original dataset in order to add to it / enhance it with more info, but in that case they could just release a second dataset as an addendum.
So the core idea seems solid. Thank you for this!
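The point above about torrents being uneditable follows from how BitTorrent identifies content: the torrent metadata carries a SHA-1 hash of every piece, and the torrent itself is identified by a hash over that metadata. Below is a minimal sketch of the idea, assuming a toy piece length and a simplified fingerprint. The real info hash is the SHA-1 of the bencoded info dictionary, which this does not reproduce, and `piece_hashes` and `simplified_fingerprint` are illustrative names only.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> bytes:
    """Concatenate the 20-byte SHA-1 digest of each fixed-size piece."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )


def simplified_fingerprint(data: bytes, piece_length: int = 4) -> str:
    """Stand-in for the info hash: SHA-1 over the piece hashes only.

    Assumption: a real client hashes the whole bencoded info dictionary
    (name, piece length, pieces, ...), not just the piece hashes.
    """
    return hashlib.sha1(piece_hashes(data, piece_length)).hexdigest()


original_fp = simplified_fingerprint(b"subject,age\nalice,30\n")
edited_fp = simplified_fingerprint(b"subject,age\nREDACTED\n")

# Any edit changes at least one piece hash, so the fingerprint changes too.
print(original_fp != edited_fp)
```

Because the identifier changes with the content, an "edited" dataset is effectively a new torrent with its own swarm, which is why releasing a second dataset as an addendum is the natural route.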
jackmaney over 11 years ago
Excellent! It's far too early to tell, but I'd like to be hopeful that this distribution network could be another nail in the coffin of the old, expensive, dead-tree journals.
CompleteMoron over 11 years ago
Thanks for sharing! I shall store it in the vault of hard drives I keep here in the desert.