TechEcho
Indexing Code at Scale with Glean

132 points by GavCo 5 months ago

9 comments

jtokoph 5 months ago
I was really confused and surprised that Meta was using a commercial product for indexing instead of building in-house...until I realized that they weren't talking about the AI search indexing tool at glean.com
conqrr 5 months ago
Glean: https://glean.software/ System for collecting, deriving and querying facts about source code
tomas789 5 months ago
This is certainly a step in the right direction. Especially with the proliferation of AI-based assistants, there will be a greater need for readily available information about the codebase. This could easily take those copilots yet another level up.

For example, my workflow now with Cursor is to keep relevant code in separate tabs even though I don't work on those files. I found it makes the autocomplete better, as it seems to me that all the active tabs are fed to the model. That means less space for me and more distraction. Glean might help here.
PessimalDecimal 5 months ago
Google's equivalent to this is Kythe (https://kythe.io/). Earlier today I noticed that Kythe ripped out its support for indexing Rust code and wondered what alternatives might exist. So interesting to see this right now! And it looks like it supports Rust (albeit via rust-indexer).
rockwotj 5 months ago
Are there any UIs for this available openly? Or for Glass? I am a former Googler and I know how awesome this kind of tooling is, and it's so hard to achieve with OSS. I would love open source code search. This seems very close, but there is no UI layer (it seems like Meta uses this for code review and for IDEs). A basic UI would be a good start.
YetAnotherNick 5 months ago
There are already 3 popular products named Glean, with .com, .ai and .co domains. This is the Glean with .software.
jepler 5 months ago
My mind just balks at the idea of having so much source that a 2020s computer could take hours to index it. ctags is nothing special (both in terms of optimization and in the level of detail it gets to: just global function identifiers) and looks like it runs at about 400MB/s on a single core of an i5-1235U. But still, it looks like ctags could process about 100TB in 4 hours across 16 threads on a workstation-class CPU...
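
A quick sanity check of that estimate, as a minimal sketch rather than a benchmark. It assumes throughput scales linearly with thread count (which real hardware and disk I/O may not allow) and uses decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes). The function name `bytes_processed` is made up for illustration.

```python
# Back-of-the-envelope check of the ctags throughput estimate.
# Assumptions (not measured): perfectly linear scaling across threads,
# and decimal units (1 MB = 10**6 bytes, 1 TB = 10**12 bytes).

def bytes_processed(mb_per_sec_per_thread: float, threads: int, hours: float) -> float:
    """Total bytes processed at a fixed per-thread rate over the given time."""
    seconds = hours * 3600
    return mb_per_sec_per_thread * 1e6 * threads * seconds

# Figures from the comment: 400 MB/s per core, 16 threads, 4 hours.
total = bytes_processed(400, 16, 4)
total_tb = total / 1e12
print(f"{total_tb:.2f} TB")  # prints "92.16 TB"
```

Under those assumptions the result is roughly 92 TB, which is in line with the "about 100TB" figure.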
tonymet 5 months ago
My favorite feature of code indexing at FB was how well integrated it was. Web search, CLI search and IDE search all used the search index, but would reference your local context. This was useful for reference, call stack, and dead code search.

E.g. search results from IDE search would link back to your local file. CLI results would reference your local clone.

A great example of a small feature resulting in great usability.
archy_ 5 months ago
When I read about these things, I can't help but wonder if anybody took a step back and thought "maybe we just have too much code"?

At some point, perhaps you're just doing too much.