GitHub Actions: Ephemeral self-hosted runners and new webhooks for auto-scaling

192 points by gazab over 3 years ago

17 comments

sascha_sl over 3 years ago

This feature was delayed every month after May. And yet it is still half-baked. We prepared for this with internally shared docs and a branch built in private for a while, but still had to roll back yesterday because the scheduler reverted to putting jobs wherever it pleased (including on ephemeral runners that already have a job) and randomly cancels large sets of jobs too.

I have been of the opinion that investing in GH Actions at this stage is purely sunk cost (at my org), and I'm not moving until the team behind this thing ships something that doesn't break half the time. These have been seriously frustrating months, because no amount of working around this messy code[1] made of 5 layers of MS-style .NET (seriously, deleting a directory goes 5 layers deep in the call stack) will ever produce a stable product. They don't even know their own code base that well: when they first attempted ephemeral runners with `--once`, it turned out the thing they produced could never work (because the server-side scheduler loves pipelining jobs to machines and failing miserably when these disappear, of the "job times out after 20 minutes of waiting" type).

[1]: https://github.com/actions/runner

e_proxus over 3 years ago

I really wish the runner agent was written in something more portable than .NET. That choice feels like something purely political because they're owned by Microsoft. I doubt an independent organization would have chosen it over other excellent choices such as Go, Rust, etc.

Currently, hosting the runner on e.g. FreeBSD or custom embedded systems is not supported (or even possible).

rubicks over 3 years ago

Too little and too late. Meanwhile, I'm over here with GitLab self-hosted runners that "dispatch" ephemeral runners. I can tweak scaling limits and the whole contraption runs seamlessly on the AWS EC2 instances of my choosing.

My company just completed the migration from GitHub to GitLab and, while it's not perfect, there's a lot to like about GitLab.

19h over 3 years ago

The absolutely most annoying issue with GitHub Runners is the fact that they run 1 job .. at a time ... per server.

You can only imagine our follow-up meetings about the fact that we had a fleet of 15 c5a.2xlarge instances and still half of the developers were waiting up to 20 minutes for an instance to go online.

The worst part? The jobs don't clean up -- probably to allow for caching. We ran into disk space issues regularly enough for it to force us to make the spot instances commit harakiri after 2 days.

GitHub are a cool concept and we'll probably stick with them. But their quality is just bad. There's that .NET runner and it feels like it's so massively different from anything GitHub-like you could imagine .. almost as if it's a white-label program they licensed or like it's the result of a 4-week contract job. Simply bad.

wcdolphin over 3 years ago

If anyone has experience using self-hosted GH Actions at scale, I'd love to buy you a virtual coffee and hear about the pros/cons for a parallelized CI flow currently running in Circle. The main motivation for switching would be simplifying tooling and increasing performance through better cache reuse and running within AWS for faster network access to ECR.

koalalorenzo over 3 years ago

I wish there was an official Helm chart for k8s, like GitLab CI/CD Runner has, and not the kind that sits there and does not scale, but the kind that spins up workers on demand without taking too many resources while idle.

I wish GitHub copied that feature from GitLab too!

thinkafterbef over 3 years ago

The feature pull request has been there for over a year[1], it's nice that it's released!

Incoming shameless plug: if you don't want to handle hosting runners, but still want to reap the benefits of having proper hardware (close to the metal), check out BuildJet for GitHub Actions[2] - 2x the speed for half the price. Easy to install and easy to revert.

[1] https://github.com/actions/runner/pull/660

[2] https://buildjet.com/for-github-actions

noptd over 3 years ago

Ephemeral runner support has been highly anticipated in our organization - I'm excited to see it go live!

However, GitHub Enterprise admins may want to exercise caution - some users have reported that the changes are not currently compatible: https://github.com/actions/runner/pull/660

vyrotek over 3 years ago

We're pretty happy with Azure DevOps on our team.

But these competing offerings between Azure and GitHub have been really confusing to follow. Especially since folks are pointing out that GitHub Actions is partly Azure DevOps under the hood. It just seems like a complicated branding play because some people will refuse to use an Azure service but will gladly use a GitHub service still owned by Microsoft?

xvilka over 3 years ago

The biggest problem with GitHub Actions is that you can't restart just one job[1]; it always restarts all jobs in the workflow. And this bug has not been fixed for quite a while. Travis CI and AppVeyor both allow that, of course.

[1] https://github.com/actions/runner/issues/432

hardwaresofton over 3 years ago

It's a bit of an old drum to beat on, but I just want to note that GitLab has supported this (and provides docs for running on EC2, Fargate, k8s and other platforms like LXD[0][1][2][3]) for a very long time, and the CI system there is quite robust.

I've seen my fair share of CI systems (AppVeyor, CircleCI, GitLab, GitHub, TC, Jenkins, etc.) and I'd argue that GitLab CI is the best of all the ones I've seen:

- Great syntax (it's YAML like most others, but somewhat easy to organize well with great documentation)
- Fantastic documentation
- Unparalleled flexibility
- Unsurprising operation (things generally work as you'd expect)
- The ability to clear your build runner cache (just ran into the inability to do this with CircleCI again today)

That said, competition is a good thing, so in general I'm glad to see this finally supported by GHA and will dig into it over the weekend. GHA is making a lot of really good sustainable moves in the space and keeping the field open (their marketplace is the best) so I'm all for it.

I run SurplusCI[4], which does what you'd think (runs these runners in VMs), so getting these on-demand runners working is a bit top-of-mind. Right now I only offer dedicated runners, which are cheaper but of course aren't as cheap as on-demand (depending on usage).

Speaking of competition, I just learned of a competitor here on HN in BuildJet[5], so if you don't want to manage your own runners check them out as well. Unlike SurplusCI they actually offer to-the-minute on-demand runners, and the onboarding process looks way easier.

[EDIT] - Just to say, the list above is absolutely NOT the full list of platforms GitLab Runner supports -- it's pretty insane how many directions the community and GL have gone in. The Docker Machine integration (they maintain a fork) actually means you could run your single-use machines on Scaleway or Hetzner easily as well, no need to muss or fuss with ASGs or k8s.

[0]: https://docs.gitlab.com/runner/configuration/runner_autoscale_aws/

[1]: https://docs.gitlab.com/runner/configuration/runner_autoscale_aws_fargate/

[2]: https://docs.gitlab.com/runner/executors/kubernetes.html

[3]: https://docs.gitlab.com/runner/executors/custom_examples/lxd.html

[4]: https://surplusci.com

[5]: https://buildjet.com/for-github-actions

nicois over 3 years ago

So this is a big step forward in terms of avoiding the race condition where CI runners would accept new jobs during scale-in operations. But how do you ensure you only spawn new ephemeral runners as jobs become available? The webhook provides part of the answer, but do we need to use something like redis to ensure exactly one runner per queued job is started?
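One way to picture the "exactly one runner per queued job" question is a small deduplicating handler for the `workflow_job` webhook. The sketch below is only an illustration under stated assumptions: it relies on the payload carrying an `action` field and a `workflow_job.id`, and everything else (the `RunnerDispatcher` class, the `launch` callback, the in-memory set) is hypothetical. In a real deployment the set would need to be a shared store with an atomic "set if absent" operation, which is where something like Redis could come in.

```python
from typing import Callable, Set


class RunnerDispatcher:
    """Starts at most one ephemeral runner per queued job ID.

    Hypothetical helper: the in-memory set stands in for a shared,
    atomic store, and `launch` stands in for whatever actually boots
    a runner (cloud API call, container start, and so on).
    """

    def __init__(self, launch: Callable[[int], None]) -> None:
        self._launch = launch
        self._seen: Set[int] = set()

    def handle(self, payload: dict) -> bool:
        """Process one webhook payload; return True if a runner was started."""
        # Only "queued" events should trigger a new runner.
        if payload.get("action") != "queued":
            return False
        job_id = payload["workflow_job"]["id"]
        # Webhook deliveries can be retried, so ignore job IDs already handled.
        if job_id in self._seen:
            return False
        self._seen.add(job_id)
        self._launch(job_id)
        return True


started = []
dispatcher = RunnerDispatcher(started.append)
event = {"action": "queued", "workflow_job": {"id": 101}}
dispatcher.handle(event)
dispatcher.handle(event)  # duplicate delivery, ignored
dispatcher.handle({"action": "completed", "workflow_job": {"id": 101}})
```

This only covers duplicate deliveries. It does not guarantee that the runner started for a job is the one that picks it up, since job placement is decided server-side.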

anonymousDan over 3 years ago

Can someone tell me if GHA also supports non-ephemeral self-hosted runners, and if so, whether they work reliably? Any good resources for getting up and running with it quickly?

NiekvdMaas over 3 years ago

This is great news. The only parts missing are official Docker support for the runner (I'm using an unofficial solution right now) and/or Alpine support.

mcintyre1994 over 3 years ago

The autoscaling piece is cool! One of the things that impressed me most about GitLab CI was how easily we could get runners autoscaling in our own AWS environment. We'd run tiny instances as the actual runner, and they'd spin up bulky instances for different jobs, with none of those running when nobody was working. It sounds like this might give a building block to build that in GitHub Actions.
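The scale-to-zero behaviour described here boils down to a small sizing rule. As a rough sketch, assuming the coordinator tracks how many jobs are queued and running (for instance by counting webhook events), the target number of worker instances could be computed as below. The function name and the `max_instances` cap are hypothetical and not part of any GitHub or GitLab API.

```python
def target_instances(queued: int, running: int, max_instances: int) -> int:
    """Return how many worker instances should exist right now.

    One instance per outstanding job (queued or running), capped at
    max_instances, and zero when there is no work at all.
    """
    if queued < 0 or running < 0 or max_instances < 0:
        raise ValueError("counts must be non-negative")
    return min(queued + running, max_instances)
```

With a cap of 10, an idle system gets `target_instances(0, 0, 10)`, which is 0, so nothing runs when nobody is working.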

elamje over 3 years ago

I wonder if/when GitHub is going to start offering a Heroku-like service or full IaaS. It seems like an incredible opportunity to slap GitHub's branding on top of a subset of Azure's infrastructure and try to beat Heroku or AWS.

smcleod over 3 years ago

The (previous) lack of ephemeral runners was one of my few gripes with GitHub Actions, great to see it's been released!
