Python Tooling at Scale: LlamaIndex’s Monorepo Overhaul

36 points by cheesyFish 4 days ago

4 comments

codethief about 10 hours ago
I find it quite astonishing there is no go-to build system / task runner yet for handling small to medium-sized monorepos across ecosystems.

I want a tool that

- allows me to define tasks with inputs (+ secrets) and outputs. Inputs can be files & folders from the repo, Docker images, build parameters / env vars, outputs from other tasks, … Typical tasks I have in mind are setup/build/test/deploy, which of course will typically depend on one another, thereby forming a pipeline or dependency graph.

- sandboxes/containerizes tasks by default (in particular: no access to repo file system / working copy, env vars, … beyond what's specified as inputs) but does provide easy escape hatches (for deployment pipelines, sharing venv/node_modules between task and working copy / IDE, …),

- by default automatically caches a task's output & logs for a given input, unless I explicitly tell it not to (again, deployment tasks!). Then, when running a task upon the user's request, it automatically figures out the dependency graph and runs only those tasks that have not been cached before. This includes the case of the task definition itself having changed. (Many tools allowing you to define tasks in a full-blown programming language struggle with detecting this reliably.)

- comes with monorepo support, so supports collecting definitions of e.g. the "test" task across subfolders/projects and running them all in parallel (as far as the dependency graph allows),

- is language-/ecosystem-agnostic, so that I can invoke whatever tool or shell script inside a given task.

- provides a sane configuration language (*not* YAML), ideally a lightweight functional language that makes side effects very explicit,

- can be run both in CI and locally, without much setup effort. In fact, since the tool should be used as task runner for everything else in the repo, it should be easily bootstrappable after cloning the repo.

- can be integrated somewhat nicely with GitHub/GitLab/Azure DevOps/… (actually not that easy).

Dagger comes pretty close in terms of general idea but I'm not sure I like it so far.
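
The caching requirement in the list above (skip a task unless its inputs, its definition, or anything upstream has changed) can be sketched roughly as follows. This is a minimal illustration, not how Dagger or any existing tool works: the `TASKS` table, its field names, and the `cache_key` and `plan` helpers are all hypothetical, and the cache stores only keys rather than real outputs and logs.

```python
import hashlib
import json
from graphlib import TopologicalSorter

# Hypothetical task table: each task has a definition ("cmd"),
# declared inputs, and a list of upstream tasks.
TASKS = {
    "setup": {"cmd": "install deps", "inputs": {"lockfile": "abc"}, "deps": []},
    "build": {"cmd": "compile", "inputs": {"src": "v1"}, "deps": ["setup"]},
    "test": {"cmd": "run tests", "inputs": {"tests": "v1"}, "deps": ["build"]},
}


def cache_key(name, tasks, keys):
    """Hash the task definition, its inputs, and the keys of its upstream tasks."""
    task = tasks[name]
    payload = {
        "cmd": task["cmd"],
        "inputs": task["inputs"],
        "deps": [keys[d] for d in sorted(task["deps"])],
    }
    blob = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()


def plan(tasks, cache):
    """Return the tasks whose key is not cached, in dependency order.

    For brevity the cache is updated as if every planned task then
    ran successfully; a real runner would update it after execution.
    """
    graph = {name: set(t["deps"]) for name, t in tasks.items()}
    keys, to_run = {}, []
    for name in TopologicalSorter(graph).static_order():
        keys[name] = cache_key(name, tasks, keys)
        if cache.get(name) != keys[name]:
            to_run.append(name)
            cache[name] = keys[name]
    return to_run
```

Because each key includes the upstream keys, changing the definition of `build` invalidates `build` and `test` but leaves `setup` cached. Hashing the `cmd` string is what makes a changed task definition detectable here; with tasks defined in a general-purpose language there is no such simple string to hash, which is the difficulty the comment points out.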
lyjackal 4 days ago
I recently did something similar. Using uv workspaces, I used the uv CLI's dependency graph to analyze the dependency tree, then conditionally triggered CI workflows for affected projects. I wish there were a better way to access the uv dependency worktree than parsing the `tree`-like output.
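
The "affected projects" step described above amounts to a reverse reachability search over the workspace graph. A rough sketch, assuming the graph has already been extracted somehow: the `DEPENDS_ON` mapping and package names are hypothetical placeholders, and nothing here parses actual uv output.

```python
from collections import deque

# Hypothetical workspace graph: package -> workspace packages it depends on.
DEPENDS_ON = {
    "core": [],
    "integration-a": ["core"],
    "integration-b": ["core"],
    "app": ["integration-a"],
}


def affected(changed, depends_on):
    """Return the changed packages plus everything that transitively depends on them."""
    # Invert the graph so we can walk from a package to its dependents.
    dependents = {name: set() for name in depends_on}
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(pkg)

    # Breadth-first search over the inverted edges.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```

With this graph, a change to `integration-a` selects only `integration-a` and `app`, while a change to `core` selects every package, so CI workflows can be triggered for just the returned set.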
tuanacelik 4 days ago
Just to get this straight: does this new setup aim to make it easier to contribute to LlamaIndex submodules specifically?
SlimIon729 4 days ago
Interesting to see LlamaIndex's journey from Poetry+Pants to uv+LlamaDev for managing their extensive monorepo. The speed improvements and better developer experience with `uv` are compelling. It's a good reminder of how tooling choices evolve with scale.