Keeping master green at scale

301 points by roshanj about 6 years ago

13 comments

underrun about 6 years ago
Adrian Colyer dug into this a little further on The Morning Paper:

https://blog.acolyer.org/2019/04/18/keeping-master-green-at-scale/

His analysis indicates that what Uber does as part of its build pipeline is to break up the monorepo into "targets" and, for each target, create something like a Merkle tree (which is basically what git uses to represent commits), then use that information to detect potential conflicts (multiple commits that would change the same target).

What it sounds like to me is that they end up simulating multirepo so that tests can run on a batch of most likely independent commits in their build system. For multirepo users this is explicit, in that it comes for free :-)

This is super interesting to me, as it seems to indicate that an optimizing CI/CD system requires dealing with all the same issues whether it's mono- or multi-repo, and problems solved by your layout result in a different set of problems that need to be resolved in your build system.
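To make the target-hash idea concrete, here is a minimal sketch. It assumes a toy model in which each target has a few source files and a list of dependencies; the function names (`target_hash`, `affected_targets`, `may_conflict`, `with_edit`) and the sample targets are hypothetical, and this is not Uber's actual implementation. A target's hash covers its own sources plus the hashes of its dependencies, so two changes are flagged as potentially conflicting when they alter the hash of at least one common target.

```python
import hashlib


def target_hash(target, sources, deps, cache):
    """Merkle-style hash: covers the target's sources and its dependencies' hashes.

    Assumes the dependency graph is acyclic.
    """
    if target in cache:
        return cache[target]
    h = hashlib.sha256()
    for path in sorted(sources.get(target, {})):
        h.update(path.encode() + b"\0" + sources[target][path].encode() + b"\0")
    for dep in sorted(deps.get(target, [])):
        h.update(target_hash(dep, sources, deps, cache).encode())
    cache[target] = h.hexdigest()
    return cache[target]


def all_hashes(sources, deps):
    cache = {}
    return {t: target_hash(t, sources, deps, cache) for t in sources}


def affected_targets(base, changed, deps):
    """Targets whose hash differs between the base tree and the changed tree."""
    before = all_hashes(base, deps)
    after = all_hashes(changed, deps)
    return {t for t in after if before.get(t) != after[t]}


def may_conflict(base, change_a, change_b, deps):
    """Two changes may conflict if they affect at least one common target."""
    return bool(affected_targets(base, change_a, deps)
                & affected_targets(base, change_b, deps))


def with_edit(base, target, path, content):
    """Return a copy of the source tree with one file replaced (hypothetical helper)."""
    changed = {t: dict(files) for t, files in base.items()}
    changed[target][path] = content
    return changed


# Hypothetical repository: "app" depends on "lib"; "docs" is independent.
deps = {"app": ["lib"], "lib": [], "docs": []}
base = {
    "app": {"app.py": "v1"},
    "lib": {"lib.py": "v1"},
    "docs": {"readme.txt": "v1"},
}
change_lib = with_edit(base, "lib", "lib.py", "v2")
change_docs = with_edit(base, "docs", "readme.txt", "v2")
change_app = with_edit(base, "app", "app.py", "v2")

print(may_conflict(base, change_lib, change_docs, deps))  # False: no shared target
print(may_conflict(base, change_lib, change_app, deps))   # True: both affect "app"
```

Changes that do not share an affected target can be built and tested as if they lived in separate repositories, which is the "simulated multirepo" effect described above.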
huac about 6 years ago
"Based on all possible outcomes of pending changes, SubmitQueue constructs, and continuously updates a speculation graph that uses a probabilistic model, powered by logistic regression. The speculation graph allows SubmitQueue to select builds that are most likely to succeed, and speculatively execute them in parallel"

This is either brilliant or just something built for a promotion packet.
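A rough way to picture the quoted idea, as a sketch under stated assumptions rather than the paper's algorithm: suppose each pending change already has an estimated probability of passing (from logistic regression or any other model), and that changes succeed or fail independently. Each speculative build is then "change i, built on top of some subset of the earlier changes", and the chance that build is the one actually needed is the product of the corresponding pass/fail probabilities. The names `speculative_builds` and `pick_builds` and the `budget` parameter are hypothetical.

```python
from itertools import product


def speculative_builds(probs):
    """Enumerate (change index, earlier changes assumed landed, probability needed).

    probs[j] is the estimated probability that change j passes. Independence
    between changes is an assumption of this sketch. The enumeration is
    exponential in the queue length, so it is only suitable for illustration.
    """
    builds = []
    for i in range(len(probs)):
        for outcome in product([True, False], repeat=i):
            p = 1.0
            for j, ok in enumerate(outcome):
                p *= probs[j] if ok else 1.0 - probs[j]
            landed = tuple(j for j, ok in enumerate(outcome) if ok)
            builds.append((i, landed, p))
    return builds


def pick_builds(probs, budget):
    """Pick the `budget` speculative builds most likely to be needed."""
    builds = speculative_builds(probs)
    builds.sort(key=lambda b: b[2], reverse=True)
    return builds[:budget]


# Two pending changes with hypothetical pass probabilities of 0.9 and 0.8.
for change, landed, p in pick_builds([0.9, 0.8], budget=2):
    print(f"build change {change} on top of {landed} (needed with p={p:.2f})")
```

With two workers, this picks the build of the first change alone and the build of the second change stacked on the first, and skips the unlikely case where the first change fails.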
jl-gitlab about 6 years ago
We're building some similar tech at GitLab, though without the dependency analysis yet.

Merge Requests now combine the source and target branches before building, as an optimization: https://docs.gitlab.com/ee/ci/merge_request_pipelines/#combined-ref-pipelines-premium

Next step is to add queueing (https://gitlab.com/gitlab-org/gitlab-ee/issues/9186), then we're going to optimistically (and in parallel) run the subsequent pipelines in the queue: https://gitlab.com/gitlab-org/gitlab-ee/issues/11222. At this point it may make sense to look at dependency analysis and more intelligent ordering, though we're seeing nice improvements based on tests so far, and there's something to be said for simplicity if it works.
Scaevolus about 6 years ago
There's a nice middle ground between this and a one-at-a-time submit queue: have a speculative batch running on the side. This gives nice speedups (approaching N times more commits, where N is the batch size) with minimal complexity.

One useful metric is the ratio between test time and the number of commits per day. If your tests run in a minute, you can test submissions one at a time and still have a thousand successful commits each day. If your tests take an hour, you can have at most 24 changes per day under a one-at-a-time scheme.

I worked on Kubernetes, where test runs can take more than an hour: spinning up VMs to test things is expensive! The submit queue tests both the top of the queue and a batch of a few (up to 5) changes that can be merged without a git merge conflict. If either one passes, the changes are merged. Batch tests aren't cancelled if the top of the queue passes, so sometimes you'll merge both the top of the queue AND the batch, since they're compatible.

Here are some recent batches: https://prow.k8s.io/?repo=kubernetes%2Fkubernetes&type=batch

And the code to pick batches: https://github.com/kubernetes/test-infra/blob/0d66b18ea7e8d3f216287ad06b11042c12bc6e48/prow/tide/tide.go#L759

Merges to the main repo peak at about 45 per day, largely depending on the volume of changes. The important thing is that the queue size remains small: http://velodrome.k8s.io/dashboard/db/monitoring?orgId=1&panelId=10&fullscreen&from=now-7d&to=now
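The arithmetic and the head-plus-batch round can be sketched as follows. This is an illustration under assumptions, not the Tide code linked above: `max_merges_per_day` is an upper bound that assumes test runs happen back to back and always pass, and `merge_round` assumes the batch is the next few changes after the head of the queue, tested independently of the head by a caller-supplied `passes` function. Both function names are hypothetical.

```python
def max_merges_per_day(test_minutes, batch_size=1):
    """Upper bound on merges per day for a serial queue that tests
    `batch_size` changes per run, assuming every run passes."""
    runs_per_day = (24 * 60) // test_minutes
    return runs_per_day * batch_size


def merge_round(queue, passes, batch_size=5):
    """One round: test the head and a side batch, merge whatever passes.

    `passes` takes a list of changes and returns True if their combined
    test run is green. Returns (merged, remaining).
    """
    head = queue[:1]
    batch = queue[1:1 + batch_size]  # assumption: the batch excludes the head
    merged = []
    if head and passes(head):
        merged += head
    if batch and passes(batch):
        merged += batch
    remaining = [c for c in queue if c not in merged]
    return merged, remaining


print(max_merges_per_day(1))      # 1440 runs fit in a day with one-minute tests
print(max_merges_per_day(60))     # 24, the hour-long case from the comment
print(max_merges_per_day(60, 5))  # 120 if every batch of 5 passes

# Hypothetical queue where change "b" breaks the build.
merged, remaining = merge_round(["a", "b", "c"], lambda cs: "b" not in cs)
print(merged, remaining)          # ['a'] ['b', 'c']
```

A failing batch costs nothing beyond the machine time, because the head of the queue still merges on its own schedule.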
antimora about 6 years ago
I am still trying to wrap my head around a giant monolithic repo model instead of breaking code into multiple repos.

At Amazon, for example, they have a multi-repo setup. A single repo represents one package, which has a major version. Amazon's build system builds packages and pulls dependencies from the artifact repository when needed. The build system is responsible for "what" to build vs. "how" to build, which is left to the package setup (e.g. maven/ant).

I am currently trying to find a similar setup. I have looked at Nix, Bazel, Buck and Pants. Nix seems to offer something close. I am still trying to figure out how to vendor npm packages and which artifact store is appropriate, and also whether it is possible to have the Nix builder pull artifacts from a remote store.

Any pointer from the HN community is appreciated.

Here is what I would like to achieve:

1. Vendor all dependencies (npm packages, pip packages, etc.) with ease.
2. Be able to pull artifacts from a remote store (e.g. Artifactory).
3. Be able to override a package locally for my build purposes. For example, if I am working on a package A which depends on B, I should be able to build A from source and, if needed, build B, which A can later use for its own build.
4. Support multiple languages (TypeScript, JavaScript, Java, C, Rust, and Go).
5. Have each package in its own repository.
chairleader about 6 years ago
Quite a premise: "Giant monolithic source-code repositories are one of the fundamental pillars of the back end infrastructure in large and fast-paced software companies."
richardwhiuk about 6 years ago
Anyone fancy comparing this to bors?
shimont about 6 years ago
I think that what works for companies like Uber/Google/Facebook is not applicable to the rest of the Fortune 500 or to most other companies.

Disclaimer: I am one of the Datree.io founders. We provide a visibility and governance solution to R&D organizations on top of GitHub.

Here are some rules and enforcement around Security and Compliance which most of our companies use for multi-repo GitHub orgs:

1. Prevent users from adding outside collaborators to GitHub repos.
2. Enforce branch protection on all current repos and future created ones: prevent master branch deletion and force push.
3. Enforce pull request flow on the default branch for all repos (including future created ones): prevent direct commits to master without a pull request and checks.
4. Enforce Jira ticket integration: mention the ticket number in the pull request name / commit message.
5. Enforce proper Git user configuration.
6. Detect and prevent merging of secrets.
jonthepirate about 6 years ago
Having been at both Lyft and DoorDash, where I've been an engineer responsible for unit test health, I decided to do a side project called Flaptastic (https://www.flaptastic.com/), a flaky unit test resolution system.

Flaptastic will make your CI/CD pipelines reliable by identifying which tests fail due to flaps (aka flakes) and then giving you a "Disable" button to instantly skip any test, which is immediately effective across all feature branches, pull requests, and deploy pipelines.

An on-premise version is in the works to allow you to run it onsite for the enterprise.
cjfd about 6 years ago
A possible complication would occur if there are tests that occasionally fail.
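A back-of-the-envelope sketch of why this matters, assuming each build fails spuriously with the same probability and independently of the others (both are simplifying assumptions, and `prob_all_green` is a hypothetical helper): the chance that a run of builds stays green shrinks geometrically with the number of builds.

```python
def prob_all_green(flake_rate, builds):
    """Probability that `builds` independent builds of correct code all pass,
    when each one fails spuriously with probability `flake_rate`."""
    return (1.0 - flake_rate) ** builds


# With a hypothetical 1% flake rate, 50 builds in a row all pass only
# about 60% of the time, even though every change is correct.
print(round(prob_all_green(0.01, 50), 3))
```

In a speculative queue, a spurious failure also invalidates every build stacked on top of the failed one, so flaky tests cost more there than in a one-at-a-time queue.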
revskill about 6 years ago
What exactly is a monolith? Is it only related to the codebase (monolith vs. monorepo)? Or is it about the runtime, like microservices vs. monolith?
techmortal about 6 years ago
How common is this in the industry? Do multirepos run on a batch?
7e about 6 years ago
Is this novel? Other companies have had this for ages.