Sometimes I feel like it's a minority position, but I think it strange all the efforts people go to in order to essentially make the git DAG look like a (lie of a) straight-line CVS or SVN commit list. Seeing how the sausage was actually made (no rebases, no squashes, sometimes not even fast-forwards) isn't pretty, but it is <i>meaningful</i> and will tell you a great deal about a project and its developers... I trust that. It's real and visceral and how software is actually made and you can learn from that or find things to explore in that jungle. Projects with multiple developers that yet have straight line commit histories and super tidy commits are aberrations and full of little lies...<p>Kudos to GitHub for providing this feature that a lot of people have asked for. I obviously don't plan to use it, but I appreciate that it's an option for those people that like their small, harmless lies. ;)
This is a bad idea masquerading as a good idea. Before making a pull request (or doing any sort of merge), you should rebase against upstream master (or whatever you're going to push to). However, keeping distinct atomic commits that change one and only one small thing, when possible, is much preferable if bisect or blame is used. If you have broken or poorly written commits, use fixup, reword, squash, etc. in rebase -i.<p>Using fast-forward (and possibly only allowing fast-forward) is a good idea. Squashing entire pull requests that may change multiple things into a single commit is a very bad idea.
I mentioned this in a tweet[0] but we have a quasi-tradition of shipping on April 1st:<p>* <a href="https://github.com/blog/1815-l-is-for-labels" rel="nofollow">https://github.com/blog/1815-l-is-for-labels</a><p>* <a href="https://github.com/blog/1451-branch-and-tag-labels-for-commit-pages" rel="nofollow">https://github.com/blog/1451-branch-and-tag-labels-for-commi...</a><p>* <a href="https://github.com/blog/626-announcing-svn-support" rel="nofollow">https://github.com/blog/626-announcing-svn-support</a><p>[0]: <a href="https://twitter.com/gjtorikian/status/715972348860633088" rel="nofollow">https://twitter.com/gjtorikian/status/715972348860633088</a>
This is a presentation issue masquerading as a data issue. If somebody suggested deleting data because a report was ugly, they'd be laughed out of the room.<p>Give us tools to mark commits as unimportant or group them together as a meta-commit object for history purposes.
I hate squashing branches. There's a lot of value in commit messages; they're educational and are a form of documentation. It's on the developer to squash the "oops" commits during a rebase, rewriting the commit message so it has some value when going back in time to look at changes.<p>I'd love to see a commit linter that points at commits with text like "oops" and "fix my derp" to suggest possible commits to squash.<p>Git history shouldn't resemble a hot mess, but the evolution of code should be pretty granular. I'd take the hot mess over full squashing, though.
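A minimal sketch of the commit linter idea above, in Python. The pattern list and the function name `squash_candidates` are assumptions for illustration, not an existing tool, and it only inspects the subject lines you hand it.

```python
import re

# Hypothetical patterns for throwaway commit subjects; tune these per team.
THROWAWAY = re.compile(
    r"^(fixup!|squash!)|\b(oops|derp|wip|typo)\b",
    re.IGNORECASE,
)


def squash_candidates(subjects):
    """Return the commit subjects that look like throwaway fixes."""
    return [s for s in subjects if THROWAWAY.search(s)]


# Made-up commit subjects, oldest first.
log = [
    "Add retry logic to the HTTP client",
    "oops",
    "fix my derp",
    "Document the retry settings",
]

print(squash_candidates(log))  # ['oops', 'fix my derp']
```

In practice you would feed it the subjects of the commits on your branch and treat the result as suggestions, not as something to squash automatically.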
Perhaps I'm greedy, but I'd also like an additional option to rebase without squashing...<p>Also, it's not clear whether it is possible to disable the merge button completely. I prefer to use the command line to rebase and fix the details in the commits, but the big green "merge" button is always too tempting and it's easy to press it by mistake.
This is a feature I've wanted for such a long time. While it's perfectly feasible to do through the command line, I've always found myself having to force push to update the history on GitHub's side of things.<p>I typically did it through git rebase -i HEAD~N, so maybe someone here on HN knows of a better way to squash commits when you're updating remote history. That said, replacing remote history with a squashed commit doesn't seem to be entirely desirable behavior, which is why I had to force push.
Nice to see GitHub catching up to GitLab[1]<p>[1]: <a href="http://feedback.gitlab.com/forums/176466-deprecated-feedback-forum/suggestions/4289653-rebase-merge-requests-in-the-web-ui" rel="nofollow">http://feedback.gitlab.com/forums/176466-deprecated-feedback...</a>
Oh god no. I made the mistake of moving a team to squashed commits once. The lack of individual commits poses large problems down the line. Two nonstarters come to mind:<p>1: It completely ruins your ability to git bisect any bug introduced in your branch. Instead of getting a 10-line commit, bisect will point you to hundreds or thousands of lines.<p>2: All code will blame to a single person. Coding with 6 people on a large branch? Want to git blame the code to see who wrote the weird-looking function so you can ask questions? Too bad.<p>Do not squash branches on teams. It was one of the biggest mistakes of my professional career.
This is awful because it destroys information.<p>A feature branch needs cleaning up before publishing. Bug and typo fixes need to be squashed in the branch, but the history should preserve a sequence of commits corresponding to atomic changes because it tells a story.<p>A commit should be small enough to make it easy to check whether the change is correct. It should be self-contained so that it can be moved around, cherry-picked, etc.<p>This is why the local commit history should be considered a draft of the story, and the published one should be the official version, meant to be easy to read, verify and manipulate.
Any plans to allow for squashing but with a merge instead of a fast-forward or fast-forwards without squashing?<p>CI will still run against the hypothetical merge commit, no? I wonder if there are edge cases where merge vs squash+fast-forward would result in different conflict resolutions and different trees, so master could end up with a tree that didn't have tests run against it.
I'm confused. When I don't want to see the details of the topic branches, I run 'git log --first-parent'.<p>Why not make it easier to see what I want to see, but then let me drill down and see more, instead of removing the details altogether?
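For anyone unfamiliar with what `--first-parent` does, here is a toy model in Python. The `parents` map, the commit names and the function name are hypothetical; the only fact borrowed from git is that each commit lists its parents in order, and the first parent of a merge is the branch you were on when you merged.

```python
def first_parent_log(parents, tip):
    """Walk back from `tip`, following only the first parent of each commit."""
    chain = []
    commit = tip
    while commit is not None:
        chain.append(commit)
        ps = parents.get(commit, [])
        commit = ps[0] if ps else None
    return chain


# Hypothetical history: mainline A -> B, topic branch A -> T1 -> T2,
# and M is the merge of the topic branch into the mainline.
parents = {
    "A": [],
    "B": ["A"],
    "T1": ["A"],
    "T2": ["T1"],
    "M": ["B", "T2"],
}

print(first_parent_log(parents, "M"))  # ['M', 'B', 'A']
```

The topic commits T1 and T2 are still in the history and reachable through M's second parent; the first-parent view just doesn't show them, which is the "drill down when I want to" behaviour described above.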
No, don't squash your commits. This is stupid advice that comes back over and over.<p>Squashing commits is a useless thing that has absolutely no benefit. It's dumb. It really makes no sense. It has very clear negatives.<p><a href="https://news.ycombinator.com/item?id=5631184" rel="nofollow">https://news.ycombinator.com/item?id=5631184</a>
Call me naive, but wouldn't this problem be best solved by requiring that commits represent meaningful and working increments of work?<p>I use 'git add -p' judiciously and only commit when I have reached a point where something can usefully be said to be in some way "done". Sure, it's not perfect, and occasionally I end up having to clean up miscellaneous printf statements, debug values or typos in subsequent commits, but this is something that should really be avoided if possible.
I'm waiting for the code review tool that takes a massive squashed PR and cuts it up into a series of clean, atomic commits that can be reviewed individually.
Yeah, this is dumb. Being able to effectively bisect large projects depends on having smaller commits. If your commits are all huge, your bisect will hit some 500 line patch that does 5 different things -- it'll easily quadruple the debugging time. Not to mention that having small self-contained commits is a <i>good thing</i>.
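A rough back-of-the-envelope model of that trade-off, in Python. It assumes bisect behaves like a plain binary search over a linear history (histories with merges can differ), and the commit counts and sizes are made-up figures, not measurements.

```python
import math


def bisect_steps(n_commits):
    """Approximate test runs needed to isolate one bad commit out of n."""
    if n_commits <= 1:
        return 0
    return math.ceil(math.log2(n_commits))


# Hypothetical project: the same 10,000 changed lines, committed two ways.
small = {"commits": 1000, "lines_per_commit": 10}
squashed = {"commits": 20, "lines_per_commit": 500}

for name, history in (("small", small), ("squashed", squashed)):
    steps = bisect_steps(history["commits"])
    print(name, steps, "steps, then read", history["lines_per_commit"], "lines")
```

Under these assumptions squashing saves about five bisect steps, which a script can run unattended, and in exchange leaves you with a 500-line patch to read by hand instead of a 10-line one.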
This, like rebase, is a great idea for people who would rather use SVN but have to use git for some reason or another.<p>I mean, if you want your history simplified into a linear squashed series of commits, why the hell do you use a VCS which models history as a directed acyclic graph in the first place?
Maybe I'm missing something, but I've never understood the point of cleaning up commit history. The only time I ever look at commit history past a few days is if I want to know when something broke (in which case I WANT every part of the history, even the messy bits), or if I'm trying to refresh my memory of what I've done for the year (for performance reviews or whatever) in which case I guess it's marginally useful, but it's pretty easy to skim over "fix build" and such.
As this thread approaches 300 posts I'm wondering when we're going to get out of this Git Tarpit we've somehow got ourselves into?<p>To be charitable: Git seems to be a good tool designed for Problem A, being widely used for Problems B, C, D, for which it is a fairly poor choice.<p>I _think_ we got here through some mix of "but it's really fast!"; "I can do _anything_!!"; "But Linus says it's great!"; "I don't need to pay for a beefy server any more!"; "New _must_ be good, right?".<p>In any event, how do we get out of the ditch and back to work making software vs. trying to reason about the unreasonable? I personally, for the kind of projects I'm involved with (small teams, all paid by the same piper, with aligned clear goals and competent coders), have had perfectly satisfactory revision control systems since around 2000 (except when required by an employer to use ClearCase...). It would be nice to get back to that future.
Annoying: you can't turn off both. If your project has a workflow where the webui merge should not be used (e.g. using signed merges) there is still no way to achieve that.
Wow, loved this.<p>I contributed a little to JUnit and they ask you to squash your commits before making a pull request. It took some time to do; it's so confusing/weird using regular Git.
Awesome! Phabricator got me addicted to the clean history of squashing commits and the logical changes staying together in source control (as opposed to grouping in a PR).
It seems to me that the scope of this article is pretty narrow (in a good, let's-avoid-a-flame-war way). And it describes one of the few scenarios where I think squashes are beneficial.<p>But then the conclusion doesn't quite add up. If I want to remove all the merge mess from a PR, don't I actually want to rebase the PR, not merely squash the history? Or did I miss the point he was making?
This is such a nice feature. Thank you for working on it. Keeping history clean is important, especially in enterprise solutions where you have to keep supporting multiple releases. You want to select certain commits/features for one version. With merge/squash, you get a cleaner history, and it is easy to pick the commits you want.
I believe there should be more open communication and a record the public can fall back on while these developers make their improvements. Something in their face. Too many times I have tracked back and seen unprofessional comments being made and things being done that seemed downright suspicious.<p>If it were up to me, it would be a lot harder to get a developer's license, and you would have to meet regulatory standards and have degrees to uphold your professionalism while acting as your own developer. It seems to me that a lot of developers have taken things into their own hands and are trying to make a quick buck any way they can get it. Don't be surprised if you tell your mother or father or sibling to look at what licenses they have agreed to on their phones and they find a lot of outdated unassigned licenses to back up their privacy. It seems to be an epidemic, and how are we going to stop it?
This obsession with "clean" history seems to me nothing short of insanity. Source control has one job: to record the history of code in order to aid in figuring out what happened when things go wrong. If everything always went well, we'd just have great merge tools and throw away the history.<p>I can understand a desire to filter out stages of a project by different levels of review (I just ran the build and tests passed so I committed vs a bunch of people reviewed it so I merged), but that people solve that by deleting or rewriting history to be something different from what actually happened is just nuts.<p>Is it just that git makes it easier to change history than to add metadata for filtering? Why have we not seen presentation tools to solve this problem rather than what seems like a ubiquitous readiness to alter and throw away the messy but accurate facts about what happened?
The key principle is that the software should always correctly work at any point in the chain of commits, so you must squash commits that are "oops, fix X in previous commit".<p>Once that is satisfied, commits should be as small as possible, so that information about the grouping of changes is preserved.
After completing a feature (or part of one) I often "squash" manually by performing a mixed reset from the tip of the in-development branch back to a good parent commit.<p>This leaves all of the feature's changes in your working copy, which you can then stage by line/hunk and individually commit in clean, atomic pieces with the benefit of already having written the final code.<p>This removes the in-progress commits like a Squash, but pieces of code can still be brought in as individual easy-to-review commits. And it's potentially easier to understand / perform manually than rewriting history via a git command that would do the same thing.
This is great! But what about fixups? I often prefer a fixup to a squash when the commit in question was a WIP or something, and I don't have any desire to preserve the message in the final commit.
This is wrong on so many levels. Of course we don't care about "progress" or "woops I fucked up" commits, but people also make commits for things like coding style, and these commits should be kept separate from the others, and DO NOT warrant a separate PR. If it is about changing all tabs to spaces in a file, for instance, different PRs will be hard to work with because of conflicts. Also, people who write WIP commits should learn how to squash them themselves and give them a meaningful commit message.
Usually when squashing branches locally, all the changes are attributed to me as the author, effectively losing the history of who did what in the feature branch. How does GitHub handle this?<p>Additionally, the lack of commit history for that branch often caused merge conflicts when merging the same branch again (if it had additional new fixes, for example). That's why I switched to using git merge --no-ff.<p>If it were very easy to do, I would definitely do a git squash, as it keeps history very clean. I just don't want to have the problems listed above.
I think squashing commits makes sense if you're working in a big team and/or on a complex/large project - The main advantage of it is that it speeds up the QA process because it cleans up all the back-and-forth (exploratory) changes that tend to happen during development.<p>If you squash properly, each commit will represent a small standalone feature.<p>It does reduce your commit count though :(
Squashing is not an alternative to a merge workflow. It's what you do to clean up <i>before</i> you integrate your work (whether by rebase or merge). You've just made seven changes, which should really be three: git rebase -i HEAD~7, squash away. Okay, now you have three. Rebase them on top of new work in the upstream branch, or merge? Separate question.
This is great news for us as we prefer this flow. However, we actually mostly merge via our custom tool which uses GitHub's API. I can't find anything in the documentation about whether these options apply to API merges or if there's any query params that can achieve that behavior from the API. Anyone know anything about this?
I personally like this and it fits well with my team's workflow (though I am a bit concerned it will prevent engineers from learning to do these things with git). There are pros and cons for sure but I think if you are losing a lot of resolution by squashing then the scope of your pull requests might be too big in the first place.
The option to avoid merge commits when merging pull requests misses the point, but apparently many users demanded it for some reason, so GitHub implemented it.<p>Here's how merge commits, rebasing and isolated small commits work:<p>You branch your topic off master and make many commits while working on it. This is all local.<p>Once you're ready to publish for review/integration, you squash fixup and backup commits into a coherent patch set, where each and every revision builds and works. This is where you can rebase and squash for good reason. Now you can push to GitHub.<p>If you, like Phabricator, create a single big commit with all changes lumped together, then it's impossible to bisect and follow the thought process behind changes. Try to git-bisect linux.git vs a repo that's been managed with Phabricator and its mega commits.<p>With small commits that each make one coherent change, you can easily include relevant explanations in the commit message, which is much harder to achieve with a single mega commit. Further, it's very simple to follow along the development process of changes with separate commits. If you have one big diff, it's hard to understand the changes of a branch, whereas small commits, each with an explanation in the message and a much smaller diff, are much, much easier for reviewers to understand.<p>With separate small commits you review each step and finally arrive at the complete feature implementation at the end. For someone who has to review code they didn't write, this saves a huge amount of valuable reviewer time for actual reviewing rather than trying to reverse engineer the steps taken in a big diff.<p>Moreover, with multiple commits, you can easily approve some of the commits while requesting improvements to others.<p>Gerrit implements this well and the process is what linux and git and other projects use when reviewing big patch sets. Set is the important word.<p>Finally, why do you want merge commits?
Unless you always make a single mega commit à la Phabricator or the new GitHub feature, having merge commits provides a very practical way to see that a set of commits landed via foo-branch-X. If you've ever viewed a git log graph, those are the interesting integration points, which you will lose if you omit merge commits. In a merge commit you can also include extra information as part of the merge commit itself, so it's not just useless metadata.
Do any VCSs have a notion of a commit of commits? If you could group a series of sequential commits into one commit on trunk it seems like you could have the best of both worlds: an overarching commit for your change and a series of how the sausage got made.
Equally, <a href="https://github.com/git-land/git-land" rel="nofollow">https://github.com/git-land/git-land</a><p>"This is a git extension that merges a pull request or topic branch via rebasing so as to avoid a merge commit."
Rarely do conversations around these parts get as heated as they do when git process comes up. As I read through these comments, only one thing surfaces: everybody organizes their shit differently. If you're here trying to sell your process, why?
A bit OT, but a feature I would love to see in git is the ability to see which branch a commit came from, especially for short-lived branches which we prune periodically on the server (git branch -r --contains won't work as a result).
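One partial workaround, sketched in Python: if the branch was integrated with a real merge commit and nobody edited the default message, the branch name survives in the merge commit's subject. The function name `source_branch` and the sample subjects are hypothetical, and this recovers nothing for fast-forward or squash merges.

```python
import re

# Default subject git writes for a local merge, e.g. "Merge branch 'topic'".
MERGE_BRANCH = re.compile(r"^Merge branch '([^']+)'")
# Default subject GitHub writes when merging a pull request.
MERGE_PR = re.compile(r"^Merge pull request #(\d+) from (\S+)")


def source_branch(subject):
    """Return the branch named in a default merge subject, or None."""
    match = MERGE_PR.match(subject)
    if match:
        return match.group(2)
    match = MERGE_BRANCH.match(subject)
    if match:
        return match.group(1)
    return None


# Made-up merge subjects for illustration.
print(source_branch("Merge pull request #42 from alice/fix-login"))
print(source_branch("Merge branch 'feature/retry' into master"))
print(source_branch("Fix typo in README"))
```

To answer "which branch did this commit come from", you would first find the merge commit that brought it into the mainline and then run its subject through something like this.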
This always reminds me of stackenblochen <a href="https://youtu.be/Qo_2ReMNzhU" rel="nofollow">https://youtu.be/Qo_2ReMNzhU</a>
Fantastic.<p>Regarding non-merge workflows: For me the question is:<p>should git history reflect a literal record of keystrokes or should it reflect intent?<p>I strongly believe in the latter.
There is no reason why developers should not be liable for their work. If you ask me, there should be a process for handing out open licenses, and from what I have seen Apache gives anyone the right to do what they want. I would love to see stricter laws when it comes to third parties and open source licensing, including the chatter back and forth.
If everyone on your team actually knows how to use git, much better to let them rebase their commits and mark out a series of clean, atomic commits which introduce the feature you're reviewing. If you have people who are incompetent at using git on your team, this feature will help protect your history from them.