Show HN: CVPR 2020 unofficial statistics, with better search functionality

40 points by ck_one almost 5 years ago

7 comments

ck_one almost 5 years ago
CVPR 2020 statistics (unofficial) + better search functionality

TL;DR: I have created a dataset of CVPR 2020 papers consisting of the title, author(s), affiliated institution(s) and abstract of each paper, and put it behind Elasticsearch to make it more accessible. Happy searching!

Initially, I wanted to find out which research institution is involved in which papers. To my surprise, this information wasn't readily available. I had to go through each of the 1500 papers to extract it. I used a script to get the title, author(s) and abstract of each paper and worked with a freelancer ($100 for ~30 hours) to get the institution of every author. Then I used locality-sensitive hashing to clean up the institution names and put the whole dataset behind Elasticsearch. A simple idea turned out to be a good learning project, since it was my first time working with a freelancer and also my first time using Elasticsearch.

Quick summary of the CVPR 2020 statistics: Google is still number one in terms of number of publications. China is gaining momentum quicker than I expected.
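For readers curious about the name-cleaning step, here is a rough sketch of how locality-sensitive hashing can surface institution names that are probably the same. The post does not say which LSH scheme, library, or parameters were used, so everything below is an assumption for illustration: MinHash over character trigrams, md5-based hash seeds, 16 bands of 2 rows, and the function names themselves.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(name, k=3):
    """Character k-grams of a lowercased, whitespace-normalized name."""
    s = " ".join(name.lower().split())
    return {s[i:i + k] for i in range(max(1, len(s) - k + 1))}


def minhash(shingle_set, num_perm=32):
    """MinHash signature: for each seed, the smallest hash over all shingles."""
    signature = []
    for seed in range(num_perm):
        signature.append(min(
            int.from_bytes(hashlib.md5(f"{seed}:{g}".encode()).digest()[:8], "big")
            for g in shingle_set
        ))
    return tuple(signature)


def candidate_pairs(names, num_perm=32, bands=16):
    """Names whose signatures agree on at least one band become candidates."""
    rows = num_perm // bands
    buckets = defaultdict(set)
    for name in names:
        sig = minhash(shingles(name), num_perm)
        for b in range(bands):
            buckets[(b, sig[b * rows:(b + 1) * rows])].add(name)
    pairs = set()
    for group in buckets.values():
        pairs.update(combinations(sorted(group), 2))
    return pairs


if __name__ == "__main__":
    # Hypothetical sample names, not taken from the actual dataset.
    sample = ["Stanford University", "stanford   university", "MIT", "Google"]
    for a, b in sorted(candidate_pairs(sample)):
        print(f"possible duplicate: {a!r} / {b!r}")
```

The pairs returned are only candidates. In practice they would still need a similarity check or a manual review before merging, and more bands with fewer rows per band makes the matching more permissive.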
tziki almost 5 years ago
Wow, I'd noticed the increase in Chinese-sounding author names during the past few years but had no idea China was such a major player.
sbielmeier almost 5 years ago
Glad to see TUM being represented in this with a couple of papers as well...
goodmattg almost 5 years ago
Great project! Obviously a stretch, but I would be curious to see a breakdown by workshop, general conference, orals, etc. The number of publications is not the same as their quality.

Also, do you disambiguate corporate affiliations in academia? E.g. many papers will be +1's for both a university and a corporate research group (Stanford + Google).
eunos almost 5 years ago
Thanks for the great work!

Btw, if I may ask, when you compiled the institutions, did you apply some disambiguation? For example, consider that UCLA === University of California, Los Angeles.
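To make the question concrete: string-similarity methods will not link "UCLA" to its spelled-out form, so acronyms need separate handling. Below is a minimal sketch of one way to do that, assuming a simple initial-letter rule. The stopword list and the function names are hypothetical, and this is not necessarily how the dataset was actually cleaned.

```python
import re

# Assumed list of small words that are usually skipped when forming acronyms.
STOPWORDS = {"of", "the", "and", "for", "at", "in"}


def acronym(name):
    """Initial letters of the significant words, e.g. 'UCLA'."""
    words = re.findall(r"[A-Za-z]+", name)
    return "".join(w[0].upper() for w in words if w.lower() not in STOPWORDS)


def resolve_acronyms(names):
    """Map each single-word name to a full name whose acronym matches it."""
    full_names = {acronym(n): n for n in names if " " in n.strip()}
    resolved = {}
    for n in names:
        key = n.strip()
        if " " in key:
            resolved[n] = n  # already a spelled-out name
        else:
            resolved[n] = full_names.get(key.upper(), n)
    return resolved


if __name__ == "__main__":
    names = ["UCLA", "University of California, Los Angeles", "MIT"]
    for raw, canonical in resolve_acronyms(names).items():
        print(f"{raw} -> {canonical}")
```

A rule like this only resolves an acronym when the full name also appears in the data, and it can produce false matches when two institutions share initials, so a hand-curated alias table is the safer fallback.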
ck_one almost 5 years ago
This is the traffic I got following the HN post:

- total number of search requests: 3500
- ~20 search requests/min (at the moment)

I never had that much traffic with a private project. Thanks for your interest!
andrew_eit almost 5 years ago
Great initiative! How long did it take you to go through all the papers and build this tool?