Changes to Code Search Indexing

82 points by xPaw, over 4 years ago

23 comments

est31, over 4 years ago
This is sad. Anyone working with larger codebases knows that 99% of the time, the code they encounter is years old. It's not uncommon for me to stumble upon files whose last non-trivial change was 3 or 7 years ago... Of course other files get more regular updates, but this translates to entire repositories as well, especially in languages where smaller libraries are encouraged. I'm sure that even among GitHub's gem dependency tree, there are open source dependencies that haven't seen changes in over a year. To say these are irrelevant is just wrong.

GitHub's search already sucks quite a lot, but for some things it's extremely useful. For example, when I'm interested in where my Rust library is used, I can use the toml filetype restriction and search for the name of my library. It will show way more results than the projects published to crates.io, as those projects are only a tiny subset. These projects might not see extremely regular updates, but I still consider them relevant information. I want it to be my choice whether to discard them or not.
jan_Inkepa, over 4 years ago
I noticed that something was amiss a year or so ago (IIRC) when they disabled global search for people not logged in.

This is sad, TBH. I've found global search very useful for searching for example uses of rarely-used libraries. Having all that at my fingertips was useful.

I understand legacy examples aren't useful for everyone, but for me they often were, and now a lot of code will be completely unfindable. :/

Think of niche programming languages and the like that've passed their heyday - I guess tagging might help, but a lot of people don't bother tagging projects.

Also, I guess this includes one's own projects? I have about a hundred repos of various ages, and it's easy to lose track of them. Not being able to search through my own code sounds like a bit of a bummer (though I don't have an intuition for how often I search for stuff in my own repos, TBH).

Pity there wasn't a better solution available to solve their problems.
glup, over 4 years ago
This is idiotic. An ML codebase for a repo from 2017 -- like Word2Vec (https://github.com/tmikolov/word2vec) -- won't show up in search anymore?

Guess we can all write cron jobs that commit useless crap yearly.
karlicoss, over 4 years ago
I guess https://grep.app (discussed here [0]) becomes even more useful now. Although I'm not sure what exactly they are indexing.

And for anyone using GitHub to search in their own code -- Ripgrep works really well even if you run it against your whole code directory, and gives you instantaneous results (if you use an SSD!). I'm describing my code search setup with ripgrep + emacs here [1]

[0] https://news.ycombinator.com/item?id=22396824

[1] https://beepb00p.xyz/pkm-search.html#code
ballenf, over 4 years ago
This worries me. At least monthly, I will find a repository that is 5ish years old that is attacking some problem I'm dealing with. I may not interact with it, but I review the code and learn a lot.

I guess Google will still index dormant repos.
neovintage, over 4 years ago
Hey everyone. I had a lot of folks reach out to me about this change. We've heard y'all loud and clear that you want a better code search. The goal of this change was to balance performance and search relevance while we work on a new code search backend. The vast majority of folks shouldn't notice a change in their search results. The folks that will are those that are using code search to count things across all the repositories in GitHub. That's a use case that we don't expressly support as part of code search, but we know folks are doing it anyway. My goal in publishing this was to be open about the change, so that those folks who are using code search for analytics don't have to guess at what happened.
mrcarruthers, over 4 years ago
AKA the indexes are too large and we don't want to spend the money
jchw, over 4 years ago
This kind of sucks. Sometimes I use Code Search to try to find things that are particularly obscure, such as usages of new or obscure APIs, or what have you. No matter what the rule is, not indexing all of GitHub breaks this.

I suppose I take it for granted that ridiculously huge search indices are a solved problem, but it turns out they aren’t.
chinhodado, over 4 years ago
This sucks. I use code search from time to time to find example usages of obscure APIs. With this change it's more or less useless now.
upbeat_general, over 4 years ago
I saw the title and immediately I thought “yesss GitHub is finally fixing its terrible code search”... Instead they’re making it worse.

To be clear, GitHub search has been very useful, just extremely sensitive and finicky.
luhn, over 4 years ago
The announcement is a bit ambiguous: Does this include the per-repo search, or just the global code search? Does code search include issues and pull requests, or just code?

Honestly I've always hated GitHub's full text search on code. Give me an amped-up per-repo grep and scrap global code search entirely. Maybe I'm lacking in imagination, but I can't see how full text search is useful to anybody.
superasn, over 4 years ago
This is very bad. A lot of the time when I'm stuck with some issue, like how to use an API, and can't find answers in docs or SO, I turn to usages of said functions in GitHub code search. Most often it is some 3-year-old project with 1 star that shows me how they used it, what values to use, etc., and that really serves a good purpose.
adamnemecek, over 4 years ago
If anyone from GitHub is reading this, can you add dedup? I spend a lot of time searching for things on GitHub but sometimes out of 100 pages of results, 80 might be the same file included in different projects.
fahrradflucht, over 4 years ago
Does this also affect repos in paid teams? This is an essential feature I often use to find usage of something throughout the org. Especially the old forgotten code is what I am looking for there...
bredren, over 4 years ago
This is a good idea for default search; there is lots of bloat, and some languages and frameworks are changing incredibly quickly.

However, deeper searches should still be made available.

This could be resolved as simply as adding advanced time-based flags, like "has issue updated".
enriquto, over 4 years ago
Ah, these Microsoft guys are so funny...

They purportedly developed a search engine that can grep the entire internet; but in reality they have trouble indexing a single one of their own websites.
forrestthewoods, over 4 years ago
This is very unfortunate. I regularly run into obscure issues but am able to find workarounds by searching GitHub for ancient projects that encountered similar edge cases.
andrewstuart, over 4 years ago
I would have found it much more useful if it continued to index all repositories, but instead removed the vast number of duplicate results, which make it very challenging to get value from GitHub search.

Old code that was not recently active is still valuable to search through.
throwaway889900, over 4 years ago
So surely with all that extra power and storage freed up, they can start indexing code in forks that are actually actively maintained?
hansvm, over 4 years ago
Suppose somebody wanted to write their own code search backend; is anyone maintaining a common crawl of all GitHub repos?
import, over 4 years ago
Gonna write a "show-my-repos-in-search-results.sh" script which I can run every year.
The_rationalist, over 4 years ago
I wonder how that's going to affect codota (the best code search engine to my knowledge) https://www.codota.com/code
q3k, over 4 years ago
Does anyone actually use the GitHub global code search? I've always found it to be pretty much useless.

(compare the results from https://cs.opensource.google/ vs. GH search)