Ask HN: How to legally obtain sports data for commercial use?

40 points by sga, over 14 years ago
This is a long-standing question of mine reignited by bignoggins' post at http://news.ycombinator.com/item?id=1772224 where he described the success of http://www.fantasymonsterapp.com.<p>I've often thought of building sports related apps (esp. pertaining to fantasy sports) but I've always struggled with how to legally obtain the necessary data (scheduling, statistics, player images, team logos, etc.) such that I can pursue it as a commercial venture. An obvious solution is to simply scrape the info but I'd assume you'd get shut down or blocked rather quickly. Yahoo offers a Fantasy Sports API but it's to be used for non-commercial purposes only.<p>Can anyone shed light on where/how to obtain current and past sports data that is available for commercial use? (I'm most interested in NFL, NBA, MLB, NHL data)<p>Thanks!

16 comments

mattmaroon, over 14 years ago
As for the stats, you purchase them from a stats provider. The big dog is Stats.com, but they're very expensive. They are the primary source for all of the major fantasy sites, though some use secondary sources (I think for accuracy verification).<p>At Draftmix we used a competitor of theirs called PA Sports Ticker, which Stats bought shortly after we shut Draftmix down. We had previously used a cheaper one called XML Team, but we realized quickly that we had gotten what we had paid for as the feeds were often updated very late or contained errors. They're fine enough for getting started (and probably the easiest to implement, since you pull the data on demand rather than having them post it to you) especially if you don't require live stats. Live stats cost more and are harder to implement. You could get post-game stats and schedule data for a few grand a year back then from XML Team, live stats for a few times that, but I don't know what Stats buying their primary competitor has done to prices. I can't imagine it's made them get cheaper.<p>There's a new one called Sports Direct. I don't have any experience with them, but our former salesman from PA works there. I'd be happy to put you in contact if you'd like, just email me. He's a good salesman at least.<p>For player images and team logos you need to set up licenses. Logos come from the league (NFL, MLB), player images from the players' unions (NFLPA, MLBPA). This is very costly. The actual images themselves can be provided by Stats and other sources, but you can't use them without paying the license (though Stats may have worked out a deal that lets them include that in the package).<p>The Fantasy Sports Trade Association (fsta.org) is the best place to find service providers for the industry. Anyone worth anything is a member.
iheartmemcache, over 14 years ago
This[1] StackOverflow thread answers this question pretty thoroughly. Unfortunately it looks like if you want it to be legal, you should be prepared to pay a hefty (> 4 figures) monthly subscription fee.<p>[1] <a href="http://stackoverflow.com/questions/57106/anyone-know-of-an-nfl-or-nba-api" rel="nofollow">http://stackoverflow.com/questions/57106/anyone-know-of-an-n...</a>
mccutchen, over 14 years ago
I too have wondered about this for a couple of minor side projects. I've always resorted to scraping, but it feels wrong and I know it's not feasible for commercial projects.<p>I've always assumed that there was some commercial data source out there that would provide all of this information in a nice, structured format for some kind of fee, but I have yet to find it.<p>One nice thing about major sites moving to "live" scoreboards is that you can often find nicely structured data sources behind them. For instance, here's the NFL's live score feed, in JSON:<p><a href="http://www.nfl.com/liveupdate/scores/scores.json" rel="nofollow">http://www.nfl.com/liveupdate/scores/scores.json</a><p>(Unfortunately, it's empty as I write this because there are no games going on right now. Here's an example taken late on a Sunday or on Monday morning: <a href="http://gist.github.com/626612" rel="nofollow">http://gist.github.com/626612</a>)<p>Another, related question is how to get good gambling information (point spreads, totals, etc.) for the same use case. I think this might be easier, as I've come across various sports book sites in the past that offer subscription services.<p>On Yahoo's NFL odds page, it says their data source is OddsShark (<a href="http://www.oddsshark.com/" rel="nofollow">http://www.oddsshark.com/</a>) whose home page advertises<p><pre><code> Offer comparative live odds and other sports stats on your site FREE of charge. You pick the sportsbooks, you pick the bet types and OddsShark.com sports betting odds engine does the rest, delivering feeds to your site. </code></pre> I got in touch with them, but never received a response...
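A minimal sketch of consuming a scoreboard feed like the JSON one mentioned above. The payload shape used here (game ids as keys, `home`/`away` objects carrying `abbr` and a `score` total under `T`, plus a `qtr` status) is an assumption for illustration only, as are the sample values; check the linked gist for the feed's real structure before relying on any field name.

```python
import json

# Hypothetical sample payload; the real feed's field names may differ.
SAMPLE = """
{
  "2010101700": {
    "home": {"abbr": "PIT", "score": {"T": 28}},
    "away": {"abbr": "CLE", "score": {"T": 10}},
    "qtr": "Final"
  }
}
"""


def summarize_scores(raw: str) -> list:
    """Turn a scoreboard JSON document into one summary line per game."""
    # An empty response (no games in progress) is treated as "no games".
    games = json.loads(raw) if raw.strip() else {}
    lines = []
    for game_id in sorted(games):
        game = games[game_id]
        home, away = game["home"], game["away"]
        lines.append(
            f"{away['abbr']} {away['score']['T']} @ "
            f"{home['abbr']} {home['score']['T']} ({game.get('qtr', '?')})"
        )
    return lines


print(summarize_scores(SAMPLE))
```

Keeping the parsing in one small function means that if the feed changes shape, or you later switch to a licensed provider, only this function has to change.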
dustym, over 14 years ago
I've worked with STATS, Inc and XML Team. I've also implemented features against the Yahoo! Fantasy API and it was very nice to work with.<p>First off, you are going to have to deal with a rep.<p>STATS is the big name in the business and they feed, at least partially, many stat resellers from whom you might be able to get cheaper rates. From there I'd say you should find a cheap service or a mechanism (scraping, etc.) that gives you just enough data to work with and start building against it. Look at XML Team for competitive pricing. If you get to the point where your app is past prototype, you should then investigate buying into the full service.<p>Depending on the day and the feed, wrangling sports data is awesome or horrible or both.<p>On the subject of scraping, I'm not sure what the legalities are. Obviously you are probably violating the TOS of any site you are visiting if you grab the data, but at the same time, strikes, balls and fouls are facts of the game.<p>Images and logos are sometimes provided by sports data brokers.<p>Take a look at <a href="http://www.stats.com/" rel="nofollow">http://www.stats.com/</a> and <a href="http://www.xmlteam.com/" rel="nofollow">http://www.xmlteam.com/</a>
dougb, over 14 years ago
What about crowdsourcing the data collection? Make an app to let sports fans enter the data as they watch the game and publish it under a CC license?<p>I've been to many baseball games where I've seen people keeping score on paper while watching the games.
jat850, over 14 years ago
In my previous development experience for a fantasy gaming website, I dealt with Stats Inc and the API/datasets they provided.<p>Without permission I don't think it would be fair for me to provide you a direct contact, but they did offer all of the data our site required, in useable formats.<p>Our initial site only dealt with the NBA as they provided the best avenue for use of their logos and player names.<p>Feel free to contact me more directly if you want a bit more info.<p>Best of luck!
cloudkj, over 14 years ago
Hmmm, seems to me there's a business opportunity in providing lower cost, more developer friendly sports data. I remember looking into getting access to sports data since I wanted to do some analytics after I read Moneyball. Old, archived data is easy to come by, but any fresh, real-time data sources seem to have non-trivial costs.<p>I guess there might be some restrictions on who gets access to the official raw data for various games, depending on the sports league. If the costs for getting that data are high, then the only way to circumvent that would be to collect them yourself. Even then, I don't know if the leagues would come at you hard for gathering data and using team names or player names...
mikerhoads, over 14 years ago
I work for the NFL and we license our data directly here: <a href="https://www.nfl.info/NFLConsProd/Welcome/index.html" rel="nofollow">https://www.nfl.info/NFLConsProd/Welcome/index.html</a><p>I don't really know the exact pricing structure but I don't imagine it is cheap.
retree, over 14 years ago
The situation in the UK is that anything (and this includes fixtures, logos, statistics, even live Twitter score updates) has to be licensed through a company called Football DataCo. The costs come to ~$6000/season just for fixtures. You can't even use names that sound similar, for example calling Liverpool 'Merseyside Red'. [1]<p>They enforce this strongly, outsourcing it to a company that only does this sort of thing.<p>[1] <a href="http://www.epltalk.com/2010-11-premier-league-opening-day-fixtures-fiasco-21003" rel="nofollow">http://www.epltalk.com/2010-11-premier-league-opening-day-fi...</a>
gcaprio, over 14 years ago
I'm actually glad someone brought this up. I'm starting a company around this very idea: making data available and consumable. Our first site is up:<p><a href="http://www.cfbreference.com" rel="nofollow">http://www.cfbreference.com</a><p>There's about 5 years of data that we've culled from the NCAA about CFB. We're adding more every week and will soon go back in time for historical data.<p>But, our twist is that the site will be upgraded to be a completely consumable site. Full REST API support, dynamic URL data generation and more. We're adding new stuff every day. So you can get the data your way in JSON, RDF, XML & HTML depending on your Accept header, query string parameter and even URL parameters.<p>We are going to try and build apps on top of this data, but data sites are and will remain FREE. We want to encourage community participation and contributions. That means free for anyone, anywhere even if you yourself don't contribute data.<p>We're also going to add scoring / charting apps for mobile phones so that you can chart your own games and, if you'd like, contribute the data back to us.<p>We're not 100% there yet, but I'll post here when we are. We'd love feedback from the entire HN community, not only on the sports data aspect but on the technical implementation. After all, if it's not easy to use & powerful, we're not doing a good enough job.
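For readers unfamiliar with serving one resource in several formats, here is a rough sketch of how a server might pick between JSON, RDF, XML and HTML from the URL, the query string and the Accept header. Everything in it is an assumption for illustration: the `/teams/123.json` route, the `format` parameter name, and the precedence order (URL suffix, then query string, then Accept) are not taken from the site described above, and Accept q-values are ignored for brevity.

```python
from urllib.parse import parse_qs

# Media types for the four formats the comment lists.
MEDIA_TYPES = {
    "application/json": "json",
    "application/rdf+xml": "rdf",
    "application/xml": "xml",
    "text/html": "html",
}
FORMATS = set(MEDIA_TYPES.values())


def pick_format(path: str, query: str = "", accept: str = "") -> str:
    """Choose a response format; the precedence order is an assumption."""
    # 1. URL suffix, e.g. /teams/123.json (hypothetical route).
    suffix = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    if suffix in FORMATS:
        return suffix
    # 2. Query string, e.g. ?format=xml (hypothetical parameter name).
    requested = parse_qs(query).get("format", [""])[0].lower()
    if requested in FORMATS:
        return requested
    # 3. Accept header: first recognised media type wins (q-values ignored).
    for part in accept.split(","):
        media_type = part.split(";", 1)[0].strip().lower()
        if media_type in MEDIA_TYPES:
            return MEDIA_TYPES[media_type]
    # Fall back to HTML, which is what a plain browser request expects.
    return "html"


print(pick_format("/teams/123", "", "application/rdf+xml;q=0.9, text/html"))
```

Letting the explicit URL forms override the header makes links shareable in a specific format, while clients that set Accept properly need no special URLs.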
ironblunt, over 14 years ago
We do mostly baseball and in our first year, we used a bunch of Retrosheet data for historical data and we went with BIS for current season data. They're pretty laid back and their data was pretty detailed for us. Recently, we went with MLB.com's XML data over at <a href="http://gd2.mlb.com/components/game/mlb/" rel="nofollow">http://gd2.mlb.com/components/game/mlb/</a> where we had to do a bunch more calculations to get all the data we wanted, but it's free (for now).<p>We also looked at XML Team and I found their prices to be completely reasonable and they have a per-document pricing structure which allows you to control your costs to a much greater extent.<p>We also spoke with Stats Inc and found them to be pretty unreasonable in terms of dealing with startups and for home projects.<p>Hit me up if you want any more data or if Benchcoach can help with the data on the baseball side. We're looking at expanding it to football and basketball this year so we've been speaking with XML Team about that.
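As an illustration of the kind of extra calculation a raw XML feed can require, the sketch below derives batting averages (hits divided by at-bats) from per-batter counts. The document layout, the `batter` element and the `name`/`ab`/`h` attributes are hypothetical and chosen only for the example; they are not the schema of the MLB feed mentioned above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Hypothetical document; element and attribute names are illustrative only.
SAMPLE = """
<boxscore>
  <batter name="Player A" ab="4" h="2"/>
  <batter name="Player B" ab="3" h="0"/>
  <batter name="Player C" ab="0" h="0"/>
</boxscore>
"""


def batting_averages(xml_text: str) -> Dict[str, Optional[float]]:
    """Derive batting average (hits / at-bats) for each batter element."""
    root = ET.fromstring(xml_text)
    averages = {}
    for batter in root.iter("batter"):
        at_bats = int(batter.get("ab", "0"))
        hits = int(batter.get("h", "0"))
        # The average is undefined with no at-bats, so report None
        # instead of dividing by zero.
        averages[batter.get("name")] = round(hits / at_bats, 3) if at_bats else None
    return averages


print(batting_averages(SAMPLE))
```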
luffy, over 14 years ago
Regarding scraping:<p>I have a hard time figuring out what the difference is between having a human read a web page with sports scores on it, and then entering those scores into your application vs. having a scraper grab those scores automatically. In most cases, these source web pages will be publicly available without requiring any agreement to a terms of service contract.<p>Scraping a site and using the actual HTML in your application would be a copyright violation, definitely. Sometimes a particular format can even be patented. So I'd definitely stay away from actually scraping out an entire table and inserting that into your app.<p>But as far as the scores/facts - those are not subject to copyright. So what is the particular legal issue if you are scraping and only getting <i>non-copyrightable facts</i> from a publicly available web page? I'm genuinely curious to know.
hakan, over 14 years ago
I had the same experience a lot of people are describing here with STATS - just too expensive out of the box. If you need real-time data, they're your only option, but if you need weekly updates or historical data, I may be able to help.<p>Playerfilter (<a href="http://www.playerfilter.com" rel="nofollow">http://www.playerfilter.com</a>) is built on top of an API that we are looking to expose to the public (you can see it being used in the URL hash). API support isn't live yet but we are working with beta testers. Basically, we return data for players, seasons and games over any time period since 1970. Please check it out and drop us a line if you'd be interested in more details.
terra_t, over 14 years ago
With scraping I'd be concerned about legal issues more than I would about getting "shut down or blocked" technically. On the other hand, I spent my postdoc time working on "low observable" webcrawlers rather than physics...<p>I'd seriously considered a sports-related project based on open data and I was still concerned that I could get into legal trouble, so I sorta merged the project into something much bigger, in which the sports content would be barely noticeable.
stevederico, over 14 years ago
I know Bloomberg Sports provides data via an API, but it's not cheap. <a href="http://www.bloombergsports.com/" rel="nofollow">http://www.bloombergsports.com/</a>
shafqat, over 14 years ago
We (<a href="http://platform.newscred.com" rel="nofollow">http://platform.newscred.com</a>) might be able to help. Drop me a line (email in platform).