100M Creative Commons Flickr Images for Research

104 points by kneth almost 11 years ago

8 comments

GrantS almost 11 years ago
One of the most incredible parts is that they've already run feature detection on all 100M images/videos and extracted 50TB of:

"SIFT, GIST, Auto Color Correlogram, Gabor Features, CEDD, Color Layout, Edge Histogram, FCTH, Fuzzy Opponent Histogram, Joint Histogram, Kaldi Features, MFCC, SACC_Pitch, and Tonality"

The good part about this for researchers is not only that this saves dozens of CPU-years of computation (back of the envelope, it would take 15 years for my laptop to extract those SIFT features alone), but that any differences in learning/recognition performance on the dataset can be attributed to the algorithms in question, uncomplicated by which researcher engineered the best features for the dataset. On the other hand, it's a challenging dataset to work with because you can't just download it and process it locally as has been traditionally done. I'll be interested to see how many take advantage of it.
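As a rough check on that back-of-the-envelope figure, the sketch below works out what per-item cost a 15-year estimate implies. It assumes all 100M items are processed one after another on a single machine and uses an average year length; the function name `seconds_per_item` is made up for this illustration and nothing here comes from the dataset's own tooling.

```python
# Back-of-the-envelope check: what per-item cost does "15 years" imply?
# Assumption: items are processed sequentially, one at a time.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # average year length in seconds


def seconds_per_item(total_years: float, n_items: int) -> float:
    """Average seconds per item if n_items take total_years in sequence."""
    return total_years * SECONDS_PER_YEAR / n_items


per_image = seconds_per_item(15, 100_000_000)
print(f"{per_image:.2f} s per image")  # prints "4.73 s per image"
```

So the estimate corresponds to a little under five seconds of SIFT extraction per item, which is the kind of cost that only becomes a problem at this scale.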
clickok almost 11 years ago
It seems like Yahoo is a little bit worried about possible exploitation. From the Terms of Use:

2.3. You may derive and publish summaries, analyses and interpretations of the Data, but only in a manner where it is impossible to reconstruct the Data from the publication. Small excerpts of the Data may be displayed to others or published in a scientific or technical context, solely for the purpose of describing your research and related issues and not for any commercial or anti-competitive purpose. Unless Yahoo! expressly requests no attribution, all publications resulting from research carried out using the Data must display an attribution to Yahoo!. This attribution must reference &quot;Yahoo! Webscope,” the web address http://webscope.sandbox.yahoo.com, and the name of the specific dataset used, including version number, if applicable. This attribution should preferably appear among the bibliographic citations in the publication. If Yahoo! expressly requests no attribution, you agree not to mention Yahoo! in connection with the Data. Yahoo! invites you to provide a copy your publication to Yahoo!.

This[0] seems fairly restrictive, considering that I can just crawl flickr and get all that data and more, were I so inclined. Also kinda interesting, in this passage and the rest of the TOU: they repeatedly use `&quot;` interchangeably with actual quotation marks ("), suggesting that nobody at Yahoo has proofread their own live TOU. Still, the dataset seems really cool.

[0] ...and other parts of the agreement, but I don't want to spoil it for you, nor post its entirety as a comment.
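One plausible way for literal entity text to end up on a page where a quotation mark belongs is double escaping: the text is HTML-escaped twice, and the browser only undoes one layer. The sketch below reproduces that effect with Python's standard `html` module. It is only an illustration of the mechanism, not a claim about how Yahoo's page was actually produced, and the sample string is made up.

```python
import html

# Hypothetical source text containing real quotation marks.
original = '"Yahoo! Webscope"'

# Escaping once is correct: a browser renders &quot; as a quotation mark.
escaped_once = html.escape(original)

# Escaping the already-escaped text turns each "&" into "&amp;".
escaped_twice = html.escape(escaped_once)

# A browser undoes only one layer, so the reader sees the entity itself.
rendered = html.unescape(escaped_twice)

print(escaped_once)   # &quot;Yahoo! Webscope&quot;
print(escaped_twice)  # &amp;quot;Yahoo! Webscope&amp;quot;
print(rendered)       # &quot;Yahoo! Webscope&quot;
```

A mix of real quotation marks and stray entities would then suggest that only some passages went through the extra escaping step.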
spingsprong almost 11 years ago
"Yahoo is hosting a contest to build the system best capable of identifying where a photo or video was taken without using geographic coordinates."

Does this strike anyone else as being a bad idea?
cclogg almost 11 years ago
"From the old world of unprocessed rolls of C-41 sitting in a fridge 20 years ago"

Hey I still do that! :(

I wonder if my (or anyone's) film photos on Flickr are completely useless metadata-wise. Because they are all scanned so they just say "NORITSU KOKI EZ Controller". There seems to be a large portion of people (on Flickr) shooting film still but I wonder if it's only a small percentage overall.
jitendraag almost 11 years ago
Just when I was happy using Flickr's API for creative commons image search - http://www.outreachpanel.com/free-images/

They gave me this huge data to play with :)

In the past, I have had issues with CC images that were also tagged with 'getty'. I hope they have taken care of that issue.
chatman almost 11 years ago
No access for researchers who are not university-based. Useless for me.
liminal almost 11 years ago
The data is only available to university researchers.
raphar almost 11 years ago
Isn't this on torrent yet?