科技回声

8 comments

GrantS, almost 11 years ago

One of the most incredible parts is that they've already run feature detection on all 100M images/videos and extracted 50TB of:

"SIFT, GIST, Auto Color Correlogram, Gabor Features, CEDD, Color Layout, Edge Histogram, FCTH, Fuzzy Opponent Histogram, Joint Histogram, Kaldi Features, MFCC, SACC_Pitch, and Tonality"

The good part about this for researchers is not only that this saves dozens of CPU-years of computation (back of the envelope, it would take 15 years for my laptop to extract those SIFT features alone), but that any differences in learning/recognition performance on the dataset can be attributed to the algorithms in question, uncomplicated by which researcher engineered the best features for the dataset.

On the other hand, it's a challenging dataset to work with because you can't just download it and process it locally as has been traditionally done. I'll be interested to see how many take advantage of it.
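
As a rough check on the back-of-the-envelope figures above, the sketch below works out what they imply per item. It takes the 15 laptop-years, 50TB, and 100M numbers straight from the comment; the decimal terabyte (10^12 bytes), the 365.25-day year, and the helper names are assumptions made for illustration, not anything stated by Yahoo or the commenter.

```python
# Per-item cost implied by the figures quoted in the comment.
# Assumptions (not from the source): 1 TB = 1e12 bytes, 1 year = 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def per_item_seconds(total_years: float, n_items: int) -> float:
    """Average compute time per item, given a total in years."""
    return total_years * SECONDS_PER_YEAR / n_items


def per_item_bytes(total_bytes: float, n_items: int) -> float:
    """Average storage per item, given a total in bytes."""
    return total_bytes / n_items


N_ITEMS = 100_000_000  # 100M images/videos

sift_seconds = per_item_seconds(15, N_ITEMS)   # 15 laptop-years for SIFT
feature_bytes = per_item_bytes(50e12, N_ITEMS)  # 50TB of extracted features

print(f"{sift_seconds:.2f} s of SIFT extraction per item")  # 4.73 s
print(f"{feature_bytes / 1e3:.0f} kB of features per item")  # 500 kB
```

So the estimate amounts to a little under five seconds of SIFT extraction per item on one laptop, and about half a megabyte of precomputed features per item on average.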

clickok, almost 11 years ago

It seems like Yahoo is a little bit worried about possible exploitation. From the Terms of Use:

2.3. You may derive and publish summaries, analyses and interpretations of the Data, but only in a manner where it is impossible to reconstruct the Data from the publication. Small excerpts of the Data may be displayed to others or published in a scientific or technical context, solely for the purpose of describing your research and related issues and not for any commercial or anti-competitive purpose. Unless Yahoo! expressly requests no attribution, all publications resulting from research carried out using the Data must display an attribution to Yahoo!. This attribution must reference &quot;Yahoo! Webscope,” the web address http://webscope.sandbox.yahoo.com, and the name of the specific dataset used, including version number, if applicable. This attribution should preferably appear among the bibliographic citations in the publication. If Yahoo! expressly requests no attribution, you agree not to mention Yahoo! in connection with the Data. Yahoo! invites you to provide a copy your publication to Yahoo!.

This[0] seems fairly restrictive, considering that I can just crawl Flickr and get all that data and more, were I so inclined. Also kinda interesting, in this passage and the rest of the TOU: they repeatedly use `&quot;` interchangeably with actual quotation marks ("), suggesting that nobody at Yahoo has proofread their own live TOU. Still, the dataset seems really cool.

[0] ...and other parts of the agreement, but I don't want to spoil it for you, nor post its entirety as a comment.

spingsprong, almost 11 years ago

"Yahoo is hosting a contest to build the system best capable of identifying where a photo or video was taken without using geographic coordinates."

Does this strike anyone else as being a bad idea?

cclogg, almost 11 years ago

"From the old world of unprocessed rolls of C-41 sitting in a fridge 20 years ago"

Hey I still do that! :(

I wonder if my (or anyone's) film photos on Flickr are completely useless metadata-wise, because they are all scanned, so they just say "NORITSU KOKI EZ Controller". There seems to be a large portion of people (on Flickr) still shooting film, but I wonder if it's only a small percentage overall.

jitendraag, almost 11 years ago

Just when I was happy using Flickr's API for Creative Commons image search - http://www.outreachpanel.com/free-images/

They gave me this huge dataset to play with :)

In the past, I have had issues with CC images that were also tagged with 'getty'. I hope they have taken care of that issue.

chatman, almost 11 years ago

No access to non university based researchers. Useless for me.

liminal, almost 11 years ago

The data is only available to university researchers.

raphar, almost 11 years ago

Isn't this on torrent yet?

100M Creative Commons Flickr Images for Research
