Monorepos: Please don’t

332 points by louis-paul over 6 years ago

73 comments

curtis over 6 years ago
My advice is that if components need to release together, then they ought to be in the same repo. I'd probably go further and say that if you just think components might need to release together then they should go in the same repo, because you can in fact pretty easily manage projects with different release schedules from the same repo if you really need to.

On the other hand if you've got a whole bunch of components in different repos which need to release together it suddenly becomes a real pain.

If you've got components that will never need to release together, then of course you can stick them in different repositories. But if you do this and you want to share common code between the repositories then you will need to manage that code with some sort of robust versioning system, and robust versioning systems are hard. Only do something like that when the value is high enough to justify the overhead. If you're in a startup, chances are very good that the value is *not* high enough.

As a final observation, you can split big repositories into smaller ones quite easily (in Git anyway) but sticking small repositories together into a bigger one is a lot harder. So start out with a monorepo and only split smaller repositories out when it's clear that it really makes sense.
mrgriffin over 6 years ago
My problem with polyrepos is that often organizations end up splitting things too finely, and now I'm unable to make a single commit to introduce a feature because my changes have to live across several repositories. Which makes code review more annoying because you have to tab back and forth to see all the context. It's doubly frustrating when I'm (or my team is) the only people working on those repositories, because now it doesn't feel like it gained any advantages. I know the author addresses this, but I can't imagine projects are typically at the scale they're describing. Certainly it's not my experience.

Also I definitely miss the ability to make changes to fundamental (internal) libraries used by every project. It's too much hassle to track down all the uses of a particular function, so I end up putting that change elsewhere, which means someone else will do it a little different in their corner of the world, which utterly confuses the first person who's unlucky enough to work in both code bases (at the same time, or after moving teams).
yowlingcat over 6 years ago
I think this article is complete horseshit. A monorepo will serve you 99% of the time until you hit a certain level of scale when you get to worry about whether a monorepo or a polyrepo is actually material. Most cases are never going to get there. Before that point, a polyrepo is purely a distraction and makes synchronous deployment really painful. We had to migrate a polyrepo to a monorepo and it was not fun because it was a migration that should have never had to be done in the first place. Articles like this are fundamentally irresponsible.
sfrench over 6 years ago
My last 2 jobs have been working on developer productivity for 100+ developer organizations. One is a monorepo, one is not. Neither really seems to result in less work, or a better experience. But I've found that your choice just dictates what type of problems you have to solve.

Monorepos are going to be mostly challenges around scaling the org in a single repo.

Polyrepos are going to be mostly challenges with coordination.

But the absolute worst thing to do is not commit to a course of action and have to solve *both* sets of challenges (e.g. having one pretty big repo with 80% of your code, and then the other 20% in a series of smaller repos).
rossjudson over 6 years ago
Hilariously misguided.

Pretty funny to read that the things I do every day are impossible.

Monorepo and tight coupling are orthogonal issues. Limits on coupling come from the build system, not from the source repository.

Yes, you should assume there is a sophisticated "VFS". What is this "checkout" you speak of? I have no time for that. I am too busy grepping the entire code base, which is apparently not possible.

If "the realities of build/deploy management at scale are largely identical whether using a monorepo or polyrepo", then why on earth would Google invest enormous effort constructing an entire ecosystem around a monorepo? Choices: 1) Google is dumb. 2) Mono and poly are not identical.
0xFACEFEED over 6 years ago
At least the author gave us the courtesy of italicizing his broken assumption from the outset of the post.

> Because, at scale, a monorepo must solve every problem that a polyrepo must solve, with the downside of encouraging tight coupling, and the additional herculean effort of tackling VCS scalability.

Right.

But you have to get to "scale" first (as it relates to VCSs). Most companies don't. Even if they're successful. Introducing polyrepos front loads the scaling problems for no reason whatsoever. A giant waste of time.

Checkmate! I didn't even need a snarky poll. The irony of that poll is that it clearly demonstrates his zealotry, not other people's.
jayd16 over 6 years ago
There's a lot wrong with this article. Most of the arguments are either not backed up or are misleading. I haven't heard anyone argue they can drop dependency management because of a monorepo.

The author lists downsides of monorepos without listing the upsides and downsides of polyrepos so it's really half complete.

I don't think anyone who likes a monorepo is suggesting you just commit breaking changes to master and ignore downstream teams. What it does do is give the ability to see who those downstream teams (if any) might be.

The crux of the author's argument is that added information is harmful because you might use it wrong. It's just as easy (far easier in fact) to ignore your partners without the information a monorepo gives. It's not really an argument at all. There's really nothing here but "there be dragons".

Monorepos provide some cross-functional information for a maintenance price. It's up to you whether the benefit is worth the overhead.
jonex over 6 years ago
Seems like the main point is that you'll still need to add additional tooling (search, local cloning, build, etc) to handle scaling, something you can do just as well with polyrepos. Conversely, for polyrepos, you can add tooling to fix issues with dependency management and multi-project changes/reviews. However, the author figures that monorepos encourage bad code culture and points out that Git is hard to build a monorepo on.

To me this message seems a bit shallow, of course we can build tooling to hide the fact that we have a polyrepo. Given well enough built tooling and consistent enough polyrepo structure (all using same VCS, all being linked from common tooling, following common coding standards and using the same build tooling, etc.) the distinction from having a monorepo is more of an implementation detail.

Given the choice between a consistent monorepo where everyone is running everything at HEAD and a polyrepo where each project has its own rules and there's no tooling to make a multi-project atomic change, I'd go for the former.

Given the choice between identical working environments but different underlying implementations I would go for whatever the tools team thinks is easier to maintain.
olingern over 6 years ago
I’ve found monorepos to be extremely valuable in an immature, high-churn codebase.

Need to change a function signature or interface? Cool, global find & replace.

At some point monorepos outgrow their usefulness. The sheer number of files in something that’s 10K+ LOC (not that large, I know) warrants breaking apart the codebase into packages.

Still, I almost err on the side of monorepos because of the convenience that editors like vscode offer: autocomplete, auto-updating imports, etc.
im_down_w_otp over 6 years ago
The biggest gripe I have with modern day monorepos is that people are trying to use Git to interact with them, which doesn't make a tremendous amount of sense, and results in either an immense amount of pain and/or the creation of a bunch of tools to try to coerce Git into behaving basically like SVN.

Which of course begs the question, rather than trying to perform a bunch of unnatural acts, why not just use SVN to start with? It works extremely well with monorepo & subtree workflows.

Sure it has some warts in a few dimensions around branching, versioning, etc. compared to Git when using Git in ways aligned with how Git wants to work, but those warts are minimal in comparison to what's required to pretzel Git monorepos into scaling effectively.
thedufer over 6 years ago
Maybe it's just that the author's cutoff is at the wrong team size, but the monorepo I work on (with ~150 devs) has almost none of the problems presented.

Unreasonable for a single dev to have the entire repo? I'm looking at a repo with ~10 million LoC and ~1.4 million commits. I have 74 different branches checked out right now. Hard drives are *cheap*.

Code refactors are impossible? I reviewed two of those this morning. They're essentially a non-event. I'm not sure what to make of the merge issue - does code review have to start over after a merge? That seems like a deep issue in your code review process. The service-oriented point seems like a non-sequitur, unless you're telling me I'm supposed to have a service for, say, my queue implementation or time library.

The VCS scalability issue is the only real downside I see here. And it *is* real, but it also seems worth it. It helps that the big players are paving the way here - Facebook's contributions to the scalability of Mercurial have definitely made a difference for us.
malkia over 6 years ago
I do really like mono-repos, but Google's other significant new project, Fuchsia, is set up as a multi-git repo (and I believe Chromium too, maybe Android (haven't checked)). For Fuchsia, they use a tool called "jiri" [1] to update the repos; previously (and maybe still in use) it was the "gclient" sync tool [2] from depot_tools [3].

[1] - https://fuchsia.googlesource.com/jiri/
[2] - https://chromium.googlesource.com/chromium/tools/depot_tools.git/+/master/gclient
[3] - https://chromium.googlesource.com/chromium/tools/depot_tools.git

It even reflects a bit on the build system of choice: GN (used in the above), previously gyp, feels similar on the surface (script) to Bazel, but has some significant differences (gn has some more imperative parts, and it's a ninja-build generator, while bazel, like pants/buck/please.build, is a build system on its own).

Simply fascinated :), and can't wait to see what the resolution of all this would be... Bazel is getting there to support monorepos (through WORKSPACEs), but there are some hard problems there...
towaway1138 over 6 years ago
My polyrepo cautionary tale: Two repos, one for fooclient, one for fooserver, talking to each other over protocol. Fooserver can do scary dangerous permanent things to company server instances, of which there are thousands.

Fooserver sprouts a query syntax ("just do this for test servers A and B"), pushed to production. Fooclient sprouts code that relies on this, pushed to production. A bit later, Fooserver is rolled back, blowing away query syntax, pushed to production. "Just do this for test servers A and B" now becomes "Do this for every server in the company". Hilarity ensues.
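
The failure mode in this tale can be made concrete with a small sketch. Everything below is hypothetical: the message shape, the function names and the server list are assumptions for illustration, not the poster's actual fooclient/fooserver protocol. It shows how a rolled-back server that silently ignores a field it no longer understands widens the blast radius, and how rejecting unknown fields would have failed closed instead.

```python
# Hypothetical sketch; names and message shape are assumptions.

def targets_new_server(request, all_servers):
    """Server with query support: honours an optional 'query' filter."""
    wanted = request.get("query")
    if wanted is None:
        return list(all_servers)
    return [s for s in all_servers if s in wanted]


def targets_old_server(request, all_servers):
    """Rolled-back server: never reads 'query', so the filter is silently dropped."""
    return list(all_servers)


KNOWN_FIELDS_OLD = {"action"}  # fields the rolled-back version understands


def targets_strict_old_server(request, all_servers):
    """Defensive variant: refuse requests carrying fields this version does not know."""
    unknown = set(request) - KNOWN_FIELDS_OLD
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    return list(all_servers)


servers = ["test-a", "test-b", "prod-1", "prod-2"]
request = {"action": "wipe", "query": ["test-a", "test-b"]}

print(targets_new_server(request, servers))  # only the two test servers
print(targets_old_server(request, servers))  # every server in the list
```

The strict variant is one possible mitigation among several (explicit protocol versions would be another); which one fits depends on how the two sides are deployed.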
CJefferson over 6 years ago
Are there any examples of someone who actually maintained a monorepo for a massive company, who now says they shouldn't? It always seems to be "back seat drivers" against monorepo, not people with practical experience (that I can see at least)
ajuc over 6 years ago
I call bullshit on "our repository is too big for one machine".

Seriously, you have over 1 TB of code and 100 people wrote it?
thehazard over 6 years ago
Better title would be "Monorepos don't fit with my particular use case."
rkangel over 6 years ago
To me, the key point is this: Splitting your code into multiple repos draws a permanent architectural boundary, and it's done at the *start* of a project (when you know the least about the right solution).

The upsides and downsides of this are an interesting debate, but there is a cost to polyrepos if you want to change the system architecture. There is a cost to monorepos too as argued by this post, and it's up to the tech leads as to which cost is greater.
peterwwillis over 6 years ago
*"The frank reality is that, at scale, how well an organization does with code sharing, collaboration, tight coupling, etc. is a direct result of engineering culture and leadership, and has nothing to do with whether a monorepo or a polyrepo is used. The two solutions end up looking identical to the developer. In the face of this, why use a monorepo in the first place?"*

.....because, as the author directly stated, the type of repo has nothing to do with the product being successful. So stop bikeshedding, pick a model, and get on with the real business of delivering a successful product.
sterlind over 6 years ago
Could you get the best of both worlds by having a monorepo of submodules? Code would live in separate repos, but references would be declared in the monorepo. Checkins and rollbacks to the monorepo would trigger CI.
sierdolij over 6 years ago
Polyrepos are the way to go:

- Semantic versions.
- Group components into reusable packages.
- Don't use git modules or other source cloning in builds, use native/platform package management.
- Access control is made much easier.
- Sign commits and tags.
- Code review either before- or after-the-fact, just do it(tm).
- Reproducible builds - strip out timestamps/random tokens/unsorted metadata.
- Create CHANGELOGs semi/automatically.
- Eliminate manual steps altogether.
- Distributed builds/build caching (distcc, ccache).
- TDD smoke tests should run automatically in dev on save within 10 seconds. Bonus points for running personal TDD sandbox on faster remote servers via rsync and trigger on file-save.
- Standardize on 1-3 languages.
- Services composed of simpler 12factor microservices, not monorepo megaservices. Deploy fuse switching, proxying, HA/redundancy, rate limiting, monitoring and performance stats collection just like macroservices.
rakoo over 6 years ago
So the conclusion is "monorepo or polyrepo, you'll need a lot of tooling anyway. So why use monorepos?"

Very easy: because having everything in a single place is just easier to work with.
harunurhan over 6 years ago
I have worked with polyrepo madness... I do remember doing commits to up to 5 different repos just for a feature. And to roll this feature to prod, a few of these repos had to go through the release process. On top of everything we couldn't really write tests to ensure the feature works. The best we could do was write tests on the "user facing" repo and keep fixing and releasing the others until those passed.

Well, I am sure many companies are doing better than we did with polyrepos, though.
chvid over 6 years ago
Does a lot of the pain from a monorepo come from trying to use a tool - Git - that is explicitly designed to support distributed repositories? Wouldn't things be easier if you used e.g. Subversion instead? That is a tool that was designed around a client/server paradigm and had a single repository as its main use case.
laurencei over 6 years ago
Can anyone here explain to me how a monorepo like Google or Facebook handles security?

If I pull the repo - I have the *entire* contents of Google or Facebook? Is that right?

Surely that lacks the normal security measures around what must be highly sensitive information, so there must be more to it than I know of?
erik_seaberg over 6 years ago
I wish this had touched on polyrepos' ability to pin known-good versions of dependencies; that tends to be the Achilles' heel of monorepos.
EngineerBetter over 6 years ago
Visited a customer recently who had inherited a monorepo.

All their CI and release problems traced back to it.

At the risk of sounding like an old git, package coupling and package cohesion principles were defined for a reason.

I do feel like a lot of patterns in contemporary development are kneejerk reactions to how last generation's programmers did things.

Exceptions? Nah, multiple returns! Dependency management? Who needs it... Oh, wait.

Many small, single-responsibility repos? Wang it all in one, and then invent your own tooling to cope with it!
icedchai over 6 years ago
Monorepos are *way* simpler for small teams to work with. At my startup we have roughly 10 services out of the same repo. It's much easier to "cut a release" across the entire system. It's much easier to share code internally, upgrade dependencies, etc.

For a larger company, it might not be a good idea. However, most startups start small and stay that way. Why take on the overhead you don't need?
wtracy over 6 years ago
I'm not familiar with how monorepos work in practice, but it seems obvious to me that it's going to complicate everyday tasks.

Ready to commit? Whoops, another team made a bunch of commits to their project, and you need to rebase your project before you can commit. (I'm having flashbacks to Clearcase already.)

Need to roll back the last two commits you made? Sure, that takes two seconds--oh, wait, another team made multiple commits that got interleaved with yours. Have fun cherry picking the files you want to revert.

Of course, I'm apparently a curmudgeon, because as soon as someone starts talking about running a find/replace globally across multiple projects, I want to grab something sharp.
sytse over 6 years ago
The article is a great summary of the pros and cons.

What is still missing from the default tooling is a way to make a change across repos.

At GitLab we're working on group merge requests to solve this: https://gitlab.com/gitlab-org/gitlab-ee/issues/3427
rbetts over 6 years ago
Unless you are pure OSS or pure closed source - you end up with a poly-repo strategy regardless as you split open and closed code, suffering the annoyances of both systems.
madhadron over 6 years ago
The truth is that you're not going to get to make this decision. If you're starting greenfield, you're going to start a single repo for your project. If that greenfield is the whole company and everything is part of that project, you get a giant monorepo. If greenfield is a new division that's not part of another project, you're going to create a new repo, and now you're in a polyrepo environment.

Which way it goes is determined by the environment, wherein the engineers do the sensible thing at the time. Then you do the engineering to solve the problems with whatever way you went.
superasn over 6 years ago
The biggest advantage monorepos have offered is the development of tools like lerna (1) or yarn workspaces.

Before that there used to be a node_modules folder with GBs of [useless] data in all my projects. Now there is just one folder on top and that's it. Also if you're developing lots of modules or plugins it makes it super easy to work without committing changes since they are symlinked.

(1) https://lernajs.io
sftwds over 6 years ago
> Scaling a single VCS to hundreds of developers, hundreds of millions lines of code...

Maybe I am way out of my element here, but is this a common problem? Do companies with only “hundreds of engineers” really have “hundreds of millions of lines of code”?
paulddraper over 6 years ago
> is there any real difference between checking out a portion of the tree via a VFS or checking out multiple repositories? There is no difference.

How big is your monorepo? Assume each line of code is a full 80 characters, stored via ASCII/UTF-8. That's *67 million lines of code* in 5GB. I can fit five of those on a Blu-ray.

> The end result is that the realities of build/deploy management at scale are largely identical whether using a monorepo or polyrepo.

True.

> It might be deployed over a period of hours, days, or months. Thus, modern developers must think about backwards compatibility in the wild.

Depends entirely on the application. Lots of changes are deployed within short periods of time with low compatibility requirements.

> Downside 1: Tight coupling

Monorepos do often have tightly coupled software. Polyrepos also often have tightly coupled software. Polyrepos *look* more decoupled, but pragmatically I can't say I've noticed much of a difference.

> Downside 2: VCS scalability

I've also heard Twitter engineers complain about the VCS. But what is the scope of the author's discussion? 1,000 engineer orgs? Or 20 engineer orgs? Those are *vastly* different levels of engineering collaboration. I assume the article was not written to cover both of those. Or was it?

---

Ultimately, I think the author implicitly assumed a universe of discourse of gigantic repos with hundreds and hundreds of daily contributors.

When people talk about the spectrum of monorepo vs polyrepo architectures, that is very extreme. For example, last I knew, Uber had more repos than it did engineers. And I don't assume that "polyrepos" always means multiple repos per engineer.
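
The back-of-the-envelope figure in the comment above can be reproduced in a couple of lines. This is only a sketch of the arithmetic, under the comment's own assumption of 80 one-byte characters per line, and reading "5GB" as 5 GiB (an assumption; with decimal gigabytes the same division gives 62.5 million lines).

```python
BYTES_PER_LINE = 80        # assumption from the comment: full 80-character lines, 1 byte each
REPO_BYTES = 5 * 1024**3   # assumption: "5GB" read as 5 GiB

lines = REPO_BYTES // BYTES_PER_LINE
print(f"{lines:,} lines")  # 67,108,864 lines
```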
titzer over 6 years ago
No silver bullet here, I think.

It's definitely the case that a mega monorepo doesn't, in practice, have the atomic commit property. E.g. once you add owner files and separate code reviews, you're in for a world of hurt. Case in point, Google developed an internal tool to split cross-cutting CLs into manageable pieces, wrangle all the owners and approvals, presubmits, etc, and then submit the CL piecemeal--i.e. *not* atomic.

Chromium uses a different model. It just DEPS's in other repos at pinned versions. That has a whole other set of problems.
dajonker over 6 years ago
I've been in a project where some (authoritative) people had a tendency to split things into separate repositories for very small things, e.g. repositories with a single class. This was pure developer hell. Any change meant changing at least 3 repositories, including a review for each change. Never understood this decision as all parts needed to be on the latest version anyway. Caused lots of dependency and versioning issues too.
notacoward over 6 years ago
You know what's worse than a monorepo? A duorepo. Yes, that's right, two huge repositories embodying all the problems of a monorepo, but coupled in such a way that it's easy to break something if the commits and deployments from one are out of sync with the other. It's like drinking both bottles of poison, yet it (or minor variations such as three or four entangled ginormorepos) is a thing that really exists.
pbiggar over 6 years ago
Alternate title: monorepos - ideal for teams under 100 devs
mr_tristan over 6 years ago
Open source software workflows are very common and provide a _lot_ of tooling, e.g., Maven, bundler, npm, etc. Add semantic versioning and you have a lot of tooling that you basically get for free for polyrepo setups. With monorepos, you have to really spend a lot of time tooling, because you basically don't use the OSS tools.

There are a lot of odd arguments in this blog that are very spurious:

"If an organization wishes to create or easily consume OSS, using a polyrepo is required."

What? _Consuming_ OSS is usually not that bad. I've even imported the complete history from external repos, pretty easily. (It does suck with git but I wouldn't use git for a monorepo...) _Contributing_ to OSS is tricky, but the fact that you use polyrepos doesn't really help you much there either.

"Polyrepo code layout offers clear team/project/abstraction/ownership boundaries and encourages developers to think carefully about contracts."

Clear ownership boundaries have _zero_ to do with polyrepos. In fact, I'd say monorepos can be easier, since you say "everything under this directory is owned by X,Y,Z". There's no search function that's required to figure out where some other team hid their code. So many times, with polyrepos, projects are _hidden_ because they're off in some other grouping unit that you're not a member of. So you don't even know who owns what or where it came from.

In the end, I'd still strongly recommend using polyrepos because you get _a lot_ of tooling for free, and most integration issues are solved with semantic version locking and CD automation. But the arguments here are not really great.
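
The "everything under this directory is owned by X,Y,Z" idea from the comment above amounts to a nearest-enclosing-directory lookup. A minimal sketch follows; the ownership table, team names and paths are all hypothetical, and real tools that implement this kind of rule have their own file formats and matching semantics.

```python
from pathlib import PurePosixPath

# Hypothetical ownership table: directory prefix -> owning team.
OWNERS = {
    "services/billing": "team-payments",
    "services": "team-platform",
    "libs/time": "team-core",
}


def owner_of(path):
    """Return the owner of the nearest enclosing directory, or None if unowned."""
    current = PurePosixPath(path).parent
    while True:
        key = str(current)
        if key in OWNERS:
            return OWNERS[key]
        if current == current.parent:  # reached the top of the relative path
            return None
        current = current.parent


print(owner_of("services/billing/api/handler.py"))  # team-payments
print(owner_of("services/auth/main.py"))            # team-platform
print(owner_of("docs/readme.md"))                   # None
```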
hvindin over 6 years ago
I suspect the problem most people end up trying to solve isn't "how do I technically scale my tools", because the author points out that tools and techniques for this already exist and it's an already solved problem.

Instead my experience has largely been that the problem to solve is "how do I make some few hundred developers behave in a predictable way". That's the scenario where you have many OK developers but can't really be sure that none of them will break stuff. You are trying to solve the organisational problem of limiting merge rights to the people who won't break things, while at the same time not bottlenecking development onto too small a number of people. In that case, sure, split your repos up so that people can only break stuff that they 'own'.

But at least be honest about the fact that most of the technical issues of having a monorepo have been solved already, so the issues you are probably trying to solve are actually people problems.
zbentley over 6 years ago
The VCS/codebase-tooling-size argument rings a bit hollow.

We have really good code-search tools that are heavily optimized and indexed (from ripgrep/silversearcher to more centralized things like hound, when local-disk performance just won't cut it).

It's not hard to optimize Git workflows to be faster with relatively simple tricks, and if that absolutely doesn't scale for some reason and VFS isn't an option, there are always centralized VCS systems like Perforce that solve this. P4 gets a lot of shit, but it's *really* good at solving the gigantic-repo domain; tune your client properly and you can initial-sync 10+ GB repos in the time it takes to get a cup of coffee (and, if your company is large/old enough to have a repo that big, it can probably afford the Perforce licenses).
etxm over 6 years ago
I feel like a lot of arguments against monorepos assume micro services are the _only_ option on the other side.

I tend to break up most of my projects at the edge of business logic or domain logic and lean on a package manager to “deploy together” like any other dependency that’s not in your repo.

This allows teams to work independently without a large sprawling repo. If you’re following anything semver-ish hopefully your other teams in the company aren’t breaking releases and you can auto-upgrade on patch level changes. If not, well, thank goodness CI is there.

I’ve always had difficulty working in projects with too many purposes. This helps me focus where to put things and gives an easy point of escape if a dependency needs to become a service in the future.
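
For the "auto-upgrade on patch level changes" rule mentioned above, a minimal sketch of the check might look like this. It assumes plain `MAJOR.MINOR.PATCH` version strings with no pre-release or build metadata, and the function names are hypothetical; real package managers express the same idea through their own version-range syntax.

```python
def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of three ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_patch_upgrade(current, candidate):
    """True when major and minor match and only the patch number increases."""
    cur = parse_version(current)
    cand = parse_version(candidate)
    return cur[:2] == cand[:2] and cand[2] > cur[2]


print(is_patch_upgrade("1.4.2", "1.4.3"))  # True: safe to auto-upgrade under this rule
print(is_patch_upgrade("1.4.2", "1.5.0"))  # False: minor bump, leave for a human or CI
```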
cmrdporcupine over 6 years ago
It's worth pointing out that while Google has a monorepo in Google3, it also doesn't at the same time. There are other projects such as Android, ones based on Chrome, etc. that are composed of multiple git projects and use repo to manage and sync.
userulluipeste over 6 years ago
This is just modularity in the broader information-related development handling. People tend to get political about things that make their involvement comfortable and/or when it's not them who are to deal with the pesky consequences. I strongly suspect that that must have been the cause for how giants like Google¹ or Facebook² ended up with monorepos. It is developers' workout or letting them be with their cake; kick the problem down the road and hope to acquire in time the resources necessary to throw at it later.

¹² "Don't be evil" (with developers, among others) and "move fast and break things" most definitely ask for cutting a few corners here and there.
NicoJuicy over 6 years ago
Since I learned that Google did it, I have started to think about it, a lot.

And mono-repos really do make sense (a lot) when you need them tied together. Finding errors in your console immediately without version numbers gets the job done faster.

There are other ways though, like if you use .NET. A mono repo that creates NuGet packages, and projects that pull the latest build of them into their solution. This way, external parties can re-use the same components.

On a beta version that releases new NuGet components, if there is a file change (and so a version update), notify the external parties.

Have one website which mentions the schedule of an update on the live version to reduce email traffic. Oldskool, but it seems to work.
obeattieover 6 years ago
The real issue is that this is more nuanced than is appropriate for one-size-fits-all advice. &quot;Everyone should use a monorepo&quot; isn&#x27;t helpful, but neither is &quot;everyone should not use a monorepo.&quot;<p>Sadly, this article falls into the same trap.
manigandhamover 6 years ago
As always, the real world is a whole lot of gray between the black and white articles that are fun but useless.<p>Multirepos, like microservices, are all about scaling people, not the project. Start with the monolith and monorepo until you need to split, and then focus on separating by groups of logical functionality or team responsibilities (although if those are different then you&#x27;ll end up with other problems).<p>Also stop taking things literally. Monorepo does not mean you must have everything in a single repo. Even a startup can put the majority of the codebase in one place and have things like a corporate website or small admin backend in another.
vikingcaffieneover 6 years ago
I am sure this author means well but I respectfully disagree with this advice.<p>The author is arguing against the monorepo approach and then proceeds to list out some of the most successful software companies on earth as reasons NOT to do it. The reason they were able to get to their lofty heights was in some part because they used a monorepo. The biggest advantage of a monorepo is you can move quickly and understand the implications of changes since everything is housed under one roof. That&#x27;s critical for startups IMO. By the time you reach the &quot;scale&quot; the author is talking about, you have the resources to deal with it. Is it hard? Yes. Is it worth throwing out the baby with the bathwater? No IMO.<p>I currently work in a polyrepo world that the author is encouraging. I can tell you it f*cking sucks. Just take the very simple example of firing up your dev environment. In a polyrepo world, you have to individually fire up each codebase or write up some sort of script to do that for you. The former example sucks for obvious reasons and the latter example makes the case for a monorepo since one dev could author a script that could then be used by all (since he&#x2F;she will know the paths to all things that need to start). Don&#x27;t even get me started on setting up an environment from scratch. Containers make this easier but again, it would be nice to just rock `.&#x2F;start.sh` and be off to the races. A monorepo can give you that.<p>Pulling&#x2F;pushing changes to your VCS becomes a tiresome, error-prone nightmare since now you need to remember to run git pull on all the codebases that touch the area you are working on. You might forget to pull on one of those codebases and everything starts breaking and now you need to stop and track it down. Dumb error? Yep. Not a thing in a monorepo? Yep.
PRs become really sucky because now you need to harass your team for n PRs instead of just the one if the feature you are working on cuts across codebases. I&#x27;ve worked in some fairly large monorepo codebases with lifespans of &gt;10 years and I can tell you that I have yet to encounter any of the issues with VCS scaling the author speaks of. In the future if I find myself in a situation like that you know what I&#x27;ll do? Migrate to a more performant solution like Mercurial or something. Will it suck? Sure. But not as much as dealing with a polyrepo.<p>Then there&#x27;s dependency management. Holy sweet mother of god dependency management is the worst. Let&#x27;s say you need to make a breaking change to one of your codebases. In a monorepo (with decent test coverage or a type system worth a damn) you have a decent chance of tracking down everything that needs to get patched. In a polyrepo? Phttt! Enjoy those bug reports from your customers and 1am hotfixes bruh.<p>I really really wish people on here would stop trying to solve problems of &quot;scale&quot; when that&#x27;s literally the last thing you need to worry about. Being able to respond quickly to business requirements is the only thing you should be worrying about until it&#x27;s obvious that you&#x27;ve made it. Then feel free to worry about scale.
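For readers stuck in the polyrepo situation described above, the "some sort of script" might look roughly like the sketch below. It assumes every checkout sits directly under one workspace directory, and the helper names are hypothetical; `git -C <dir> pull --ff-only` is the only real command it relies on.

```python
import subprocess
from pathlib import Path


def find_repos(workspace: Path) -> list[Path]:
    """Return immediate subdirectories of workspace that contain a .git entry."""
    return sorted(
        p for p in workspace.iterdir() if p.is_dir() and (p / ".git").exists()
    )


def pull_all(workspace: Path) -> dict[str, bool]:
    """Run `git pull --ff-only` in every checkout; map repo name to success."""
    results = {}
    for repo in find_repos(workspace):
        proc = subprocess.run(
            ["git", "-C", str(repo), "pull", "--ff-only"],
            capture_output=True,
            text=True,
        )
        # A non-zero exit code means the pull failed or could not fast-forward.
        results[repo.name] = proc.returncode == 0
    return results
```

Calling `pull_all(Path.home() / "work")` (the path is only an example) reports which checkouts are stale or diverged, which is the exact failure mode the comment complains about. In a monorepo the whole script collapses to a single pull.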
daeminover 6 years ago
If you spend your time maintaining and improving a low level library then having a monorepo is much better than having a polyrepo. Primarily because as you make changes to the library you can update everyone else&#x27;s code that uses it rather than having to wait on them to do so. This reduces the need to maintain older versions of these libraries and applications.<p>Additionally you can submit the single change in one go which updates everyone and it is much cleaner than having to find out and know all of the repos that could use your library and manually submit to each one of them, probably breaking some for a time in the process.
lmilcinover 6 years ago
&quot;Yeah, well, that&#x27;s just, like, your opinion, man&quot;<p>Monorepos are one way of solving some of the problems each organization has. Monorepos require discipline in solving those problems and if the organization is not willing to get there all the way or if it takes too much time then it&#x27;s just pain and suffering for everybody.<p>I suspect the author works for one of those organizations that wanted to be hip but did not actually understand what it entails. Maybe faking agile and devops sort of works for you (works as in &quot;it&#x27;s difficult to pinpoint where the problem is&quot;) but faking monorepos certainly does not.
username90over 6 years ago
With the right tooling for both types, each directory in a monorepo is equivalent to a repository in a multirepo setup. The only difference is that in the monorepo it is easier to create new repos and dependencies between repos (just add a directory in a commit or add a dependency on another directory).<p>The author of this piece apparently thinks that the ease of working in a monorepo is a bad thing; I disagree. I think that being able to treat repositories as easily as directories is awesome, since it is a lot simpler and so requires a lot less training for your devs to understand.
sandovover 6 years ago
I agree, but also...<p>Medium: Please don&#x27;t
kerngover 6 years ago
Static analysis is easier on monorepo. At least one can run it on all code. Polyrepo has the problem that some code is off the radar. That might be the only advantage of monorepo in my opinion.
ascorbicover 6 years ago
One thing I really dislike about monorepos for node modules is that you can&#x27;t npm install from them directly. Unless it&#x27;s a project with very fast PR merging and releases you can be stuck with a broken module that has an open or even merged PR to fix it that you can&#x27;t install because it hasn&#x27;t been pushed to npm. npm link might work locally, but that doesn&#x27;t help if it needs to build on a CI server. If it&#x27;s one repo per module then I can just npm install the git url and it works fine.
groestlover 6 years ago
Me, I dream of a monorepo covering the whole world. Give me a single hash, and let me know the state of things as they are, reaching from the toolchain used to compile the bootloader to the state of the database, which has just dropped a row and therefore generated a new commit, forever secure, an immutable history.<p>I accept the infeasibility of my dream. But I&#x27;d like my repo to cover as much as my tooling realistically allows.
mcguireover 6 years ago
&quot;<i>If an individual clone got too far behind, it took hours to catch up (for a time there was even a practice of shipping hard drives to remote employees with a recent clone to start out with). I bring this up not specifically to make fun of Twitter engineering, but to illustrate how hard this problem is.</i>&quot;<p>But mostly to make fun of Twitter engineering.<p>Seriously, what advantages would a big bag of billions of lines of code have?
isacikgozover 6 years ago
Just read the article and it was a really great read. We&#x27;re using polyrepos and dealing with so many repos was not great. That&#x27;s why I created &quot;gitbatch&quot;. Gitbatch allows you to manage multiple repositories in an easy way.<p><a href="https:&#x2F;&#x2F;github.com&#x2F;isacikgoz&#x2F;gitbatch" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;isacikgoz&#x2F;gitbatch</a>
jamietannaover 6 years ago
It&#x27;s funny you say this, because my most viewed article from organic searches is about converting your polyrepo setup to a monorepo <a href="https:&#x2F;&#x2F;www.jvt.me&#x2F;posts&#x2F;2018&#x2F;06&#x2F;01&#x2F;git-subtree-monorepo&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.jvt.me&#x2F;posts&#x2F;2018&#x2F;06&#x2F;01&#x2F;git-subtree-monorepo&#x2F;</a>
1024coreover 6 years ago
Google uses a monorepo: <a href="https:&#x2F;&#x2F;cacm.acm.org&#x2F;magazines&#x2F;2016&#x2F;7&#x2F;204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository&#x2F;fulltext" rel="nofollow">https:&#x2F;&#x2F;cacm.acm.org&#x2F;magazines&#x2F;2016&#x2F;7&#x2F;204032-why-google-stor...</a><p>I don&#x27;t think &quot;scale&quot; gets much bigger than Google.
karmakazeover 6 years ago
TL;DR<p>The post just reads like some opinionated piece for traffic. The author has never even used a monorepo as far as I can tell, so can only argue from one side, the best one ever used: polyrepo. Then goes on to list &#x27;theoretical&#x27; benefits and the downsides (which should also be theoretical if having never been used) of monorepos. It concludes with &quot;The two solutions end up looking identical to the developer. In the face of this, why use a monorepo in the first place? Please don’t!&quot; implying that &#x27;Google, Facebook, Twitter, and others&#x27; do it for no benefit.
z3t4over 6 years ago
You can make shallow clones, and auto push to a single repo; it&#x27;s common, for example, to auto push to GitHub from an internal repo. It sure has its issues, but problems are solved with solutions. Simply burying your head in the sand, e.g. using single repos where a monorepo is the best solution, is not a solution.
qwerty456127over 6 years ago
I don&#x27;t actually know whether somebody already does this, but an idea came to mind yesterday: what if there was just one big code repository for a particular programming language and everything anybody writes would immediately become a part of the standard library? It feels kind of like a collective brain...
hennsenover 6 years ago
To sum up the discussion of why this is either absolutely right &#x2F; absolutely wrong, how about: &quot;monorepo or polyrepo, neither is THE single solution for every use case in every organization and project&quot;? Besides that, sure, let’s keep analyzing the pros and cons of each in different scenarios...
shereadsthenewsover 6 years ago
This argument boils down to people who have used Perforce, who believe in the benefits of a monorepo, and people who have only ever used git, who do not. While it&#x27;s true that git is a terrible program, that does not lead to conclusions about the merits of a monorepo.
KuhlMenschover 6 years ago
Hm, I find monorepos are a natural fit in JavaScript land. There is a lot less wiring, afforded by a little meta-orchestration. This has been especially helpful in the repos I&#x27;ve worked on.<p>But from reading the article, it seems like there are legitimate areas where they might not fit.
sbr464over 6 years ago
One glaring omission of the monorepo design, not sure why really, is if you want open and closed source software in the same monorepo, it doesn’t seem possible. Curious as to why this design choice was made.
astrostlover 6 years ago
The word &quot;workflow&quot; is suspiciously absent from the OP.<p>Annoying workflow is my #1 complaint against polyrepos.
costroucover 6 years ago
In my opinion, several build systems &#x2F; package managers have already solved this issue. The answer is that monorepo vs. polyrepo doesn&#x27;t matter. Look at nixpkgs&#x2F;nixos&#x2F;nixpkgs if you are interested.
nathan_f77over 6 years ago
I&#x27;m using a monorepo as a solo developer, and it&#x27;s been pretty good. I like having everything in one place, so I can work on everything in a branch, including the feature, updates to API clients, documentation, blog post, etc.<p>One problem is that my test suite is very inefficient. I have to run through every integration test, even if I haven&#x27;t changed any code that might cause these tests to fail. It&#x27;s especially weird that CI runs all my tests whenever I write a new blog post. So I&#x27;m very tempted to split up some things into internal libraries and keep them in a separate repo, and add all these repos as submodules. I know this can be pretty dangerous, and it&#x27;s easy to break things when you update dependencies, OS versions, language versions, etc.<p>If I go down this road, I have to be extremely careful to enumerate all the things that might break the library, and prevent any of these things from being updated automatically. I&#x27;ll set a very strict version constraint in the package.json &#x2F; gemspec, and throw an error if I detect a different version of Node, Python, Ruby, system libraries, etc. Then I&#x27;m forced to run all the library tests and explicitly bump the versions if I want to update anything.<p>I should also only do this when the library is a pure function with no side effects.<p>The really tricky part is figuring out how to write robust integration tests. API boundaries can be a big source of bugs. I think I&#x27;ll do something similar to VCR [1], where the first integration test executes all of the code without any mocks, and then records the response. The response would then include those exact arguments, and it would also be tied to a specific commit hash for the library. If I change anything in the library, then I just need to re-run the slow tests, and then everything will be cached. 
I guess a real advantage of putting things in a separate library is that you know exactly what files are required for a specific feature, and the commit hash gives you a &quot;fingerprint&quot; of those files that you can use for caching in your tests.<p>Just have to be super careful about any dependencies that might break the library. Also I really need to start running all my tests in a Docker container which matches CI and production. I even have some screenshot tests where I have alternative versions for Mac and Linux. Would be nice to delete those. The experience was really bad when I tried to do this in the past, so I need to figure out a better way.<p>Anyway, sorry for the train of thought! Would be interested to hear your thoughts, and if there&#x27;s anything else I should watch out for.<p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;vcr&#x2F;vcr" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;vcr&#x2F;vcr</a>
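A rough sketch of the record-and-replay idea described above, in the spirit of VCR but not using its API: the first run executes the real code and stores the response under a key built from the library&#x27;s commit hash plus the exact arguments, and later runs replay the stored response. All names here (`cache_key`, `record_or_replay`, the cassette directory) are hypothetical, and it assumes arguments and responses are JSON-serializable.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cache_key(library_commit: str, args: dict) -> str:
    """Fingerprint one call: the library's commit hash plus the exact arguments."""
    payload = json.dumps({"commit": library_commit, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def record_or_replay(
    cache_dir: Path,
    library_commit: str,
    args: dict,
    slow_call: Callable[[dict], dict],
) -> dict:
    """Replay a recorded response if present, else run slow_call and record it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    cassette = cache_dir / f"{cache_key(library_commit, args)}.json"
    if cassette.exists():
        return json.loads(cassette.read_text())
    # Cache miss: either the arguments or the library commit changed.
    response = slow_call(args)
    cassette.write_text(json.dumps(response))
    return response
```

Because the commit hash is part of the key, any change to the library invalidates every recording at once, so the slow tests rerun exactly one time per library revision.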
jrochkind1over 6 years ago
the grass is always greener.
nunezover 6 years ago
These arguments are weak, IMO.<p>Yes, monorepos can be slow to browse through if the VCS isn’t configured to handle the size (sparse pulls aren’t the default with Git; that alone can make a massive difference when your repo is massive). Polyrepos can be just as slow, however; what’s worse is that there are <i>more</i> of them.<p>I remember working with a repo that was &gt;20GB large, mostly from videos (we didn’t know that initially). Pulling that repo took _forever_. Nobody on that team cared because they almost never did a fresh pull and accounted for the time it took for their CI&#x2F;CD to do so in their reports. If it were a monorepo, MANY teams would’ve felt that pain more immediately.<p>Yes, monorepos require some tooling to prevent a gazillion artifacts from being deployed at once (and to specify what’s related to what if code lives across different folders). So do polyrepos! I’ve configured a few Jenkins jobs for my clients to dynamically pull different co-dependent Git repositories at build time. It’s a pain! Especially when multiple credentials are involved! Then there’s the whole “We have a gazillion repos and 20% of them are junk” problem, which requires automated reaping; also a more difficult problem than it seems.<p>Same with refactors. Refactors across polyrepos are just as much of a pain because you’re now subject to <i>n</i> build and review processes&#x2F;pull requests, and seeing the entire diff is hard&#x2F;impossible. This introduces mistakes. If anything, refactors in polyrepos are more of an event than they are for monorepos.<p>While monorepos have their problems, I will continue to advocate for them because the ability to see what’s going on in one place and for any developer to propose changes to any part of the code (theoretically) is massively beneficial, ESPECIALLY for complex business domains like healthcare or financial services.
Plus, you will have a RelEng&#x2F;BuildEng team when your codebase and engineering org get large enough; why add more complexity by creating a gazillion repos that are possibly related to each other?<p>(The large engineering organization without a team focussed on tools and builds doesn’t exist. If it does, that means that some&#x2F;many developers are spending way more time spinning their wheels on build systems than they should be.)<p>The real reason why monorepos don’t happen in the aforementioned domains is that there’s no easy way to allow them and pass regulatory audits.<p>Many regulating bodies require hard boundaries enforced by role-based access control, especially for code that deals with personally-identifiable information or code between two or more domains that have a Chinese Wall between them. “All of my developers can check out the entire codebase” is an easy way to get fined hard, and polyrepos are much easier to restrict access to than folders in a monorepo are (one advantage not mentioned in the article). While you _can_ restrict access to directories within a single repo, doing so is not straightforward, and most organizations would rather not waste the engineering effort.<p>I would like to think that Google and Facebook have gotten away with it because they implemented a monorepo from the very beginning and the engineering involved in splitting it up is much more involved than engineering around it.<p>That said, I continue to advocate for them because discoverability is good and it builds a better engineering culture in the end. I would rather hit those walls and make just-in-time exceptions for them than assume that the walls are there and create a worse development experience without exploring better alternatives.
m0zgover 6 years ago
&quot;Scalability&quot; issues aren&#x27;t encountered until your repo has many millions of LOCs and a lot of churn. For 99.99% of organizations this is not an issue and will never be an issue.
mrbanksover 6 years ago
Overly opinionated garbage, imagine having to work with this guy.