Interestingly, the RCE fix was "smuggled" in public almost a month ago.<p><pre><code> When PerSourcePenalties are enabled, sshd(8) will monitor the exit
status of its child pre-auth session processes. Through the exit
status, it can observe situations where the session did not
authenticate as expected. These conditions include when the client
repeatedly attempted authentication unsuccessfully (possibly indicating
an attack against one or more accounts, e.g. password guessing), or
when client behaviour caused sshd to crash (possibly indicating
attempts to exploit sshd).
When such a condition is observed, sshd will record a penalty of some
duration (e.g. 30 seconds) against the client's address.
</code></pre>
<a href="https://github.com/openssh/openssh-portable/commit/81c1099d22b81ebfd20a334ce986c4f753b0db29">https://github.com/openssh/openssh-portable/commit/81c1099d2...</a><p>It's not really a reversible patch that gives anything away to attackers: it changes the binary architecture in a way that has the side-effect of removing the specific vulnerability <i>and also</i> mitigates the whole exploit class, if I understand it correctly. Very clever.
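To make the mechanism in that commit message a bit more concrete, here is a minimal C sketch of the bookkeeping it describes: map a child's exit status to a penalty and refuse the source address until the penalty expires. Every name here (`record_penalty`, `is_penalised`, the `CHILD_*` statuses) and every duration other than the quoted 30 seconds is hypothetical, and accumulating repeated penalties is an assumption of the sketch. None of it is taken from the OpenSSH source.

```c
#include <string.h>
#include <time.h>

#define MAX_SOURCES 64

/* Hypothetical exit statuses for the pre-auth child; not OpenSSH's real values. */
enum { CHILD_OK = 0, CHILD_AUTH_FAILED = 1, CHILD_CRASHED = 2 };

struct penalty {
    char addr[46];   /* textual IPv4/IPv6 address, empty if the slot is unused */
    time_t until;    /* the penalty expires at this time */
};

static struct penalty table[MAX_SOURCES];

/* Map an exit status to a penalty in seconds (durations are illustrative). */
static int penalty_for_exit(int status)
{
    switch (status) {
    case CHILD_AUTH_FAILED: return 30;
    case CHILD_CRASHED:     return 90;
    default:                return 0;
    }
}

/* Record a penalty against addr; returns the number of seconds applied. */
int record_penalty(const char *addr, int status, time_t now)
{
    int secs = penalty_for_exit(status);
    struct penalty *slot = NULL;

    if (secs == 0)
        return 0;
    for (int i = 0; i < MAX_SOURCES; i++) {
        if (strcmp(table[i].addr, addr) == 0) {
            slot = &table[i];          /* existing entry for this address */
            break;
        }
        if (slot == NULL && (table[i].addr[0] == '\0' || table[i].until <= now))
            slot = &table[i];          /* remember the first free or expired slot */
    }
    if (slot == NULL)
        return 0;                      /* table full: this sketch simply gives up */
    if (strcmp(slot->addr, addr) != 0) {
        strncpy(slot->addr, addr, sizeof slot->addr - 1);
        slot->addr[sizeof slot->addr - 1] = '\0';
        slot->until = now;
    }
    if (slot->until < now)
        slot->until = now;
    slot->until += secs;               /* assumption: repeated offences accumulate */
    return secs;
}

/* Returns 1 while addr is still serving a penalty, 0 otherwise. */
int is_penalised(const char *addr, time_t now)
{
    for (int i = 0; i < MAX_SOURCES; i++)
        if (strcmp(table[i].addr, addr) == 0 && table[i].until > now)
            return 1;
    return 0;
}
```

A listener would call `is_penalised()` before forking a new pre-auth child and `record_penalty()` after reaping one.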
One interesting comment in the OpenSSH release notes<p>> Successful exploitation has been demonstrated on 32-bit Linux/glibc
systems with ASLR. Under lab conditions, the attack requires on
average 6-8 hours of continuous connections up to the maximum the
server will accept. Exploitation on 64-bit systems is believed to be
possible but has not been demonstrated at this time. It's likely that
these attacks will be improved upon.<p><a href="https://www.openssh.com/releasenotes.html" rel="nofollow">https://www.openssh.com/releasenotes.html</a>
From the diff introducing the bug [1], the issue according to the analysis is that the function was refactored from this:<p><pre><code> void
sigdie(const char *fmt,...)
{
#ifdef DO_LOG_SAFE_IN_SIGHAND
    va_list args;

    va_start(args, fmt);
    do_log(SYSLOG_LEVEL_FATAL, fmt, args);
    va_end(args);
#endif
    _exit(1);
}
</code></pre>
to this:<p><pre><code> void
sshsigdie(const char *file, const char *func, int line, const char *fmt, ...)
{
    va_list args;

    va_start(args, fmt);
    sshlogv(file, func, line, 0, SYSLOG_LEVEL_FATAL, fmt, args);
    va_end(args);
    _exit(1);
}
</code></pre>
which lacks the #ifdef.<p>What could have prevented this? More eyes on the pull request? It's wild that software nearly the entire world relies on for secure access is maintained by seemingly just two people [2].<p>[1] <a href="https://github.com/openssh/openssh-portable/commit/752250caabda3dd24635503c4cd689b32a650794">https://github.com/openssh/openssh-portable/commit/752250caa...</a><p>[2] <a href="https://github.com/openssh/openssh-portable/graphs/contributors">https://github.com/openssh/openssh-portable/graphs/contribut...</a>
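For anyone wondering what the `#ifdef` was protecting: a signal handler should only call async-signal-safe functions, and glibc's `syslog()` is not one of them. Below is a minimal C sketch of the usual safe pattern, where the handler only sets a flag and any logging happens later in normal code. The names `grace_expired`, `on_alarm`, `install_grace_handler` and `grace_time_exceeded` are made up for illustration; this is not how sshd is structured.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Set by the handler, read by normal code. */
static volatile sig_atomic_t grace_expired = 0;

/* Only async-signal-safe work here: no syslog(), no malloc(), no stdio. */
static void on_alarm(int signo)
{
    (void)signo;
    grace_expired = 1;
}

/* Install the SIGALRM handler; returns 0 on success, -1 on error. */
int install_grace_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Called from the main loop, where logging and cleanup are safe again. */
int grace_time_exceeded(void)
{
    return grace_expired != 0;
}
```

The trade-off is that the main loop has to poll the flag, which is more plumbing than logging straight from the handler, but it keeps non-reentrant code out of signal context.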
It’s also worth reading the release notes <a href="https://www.openssh.com/releasenotes.html" rel="nofollow">https://www.openssh.com/releasenotes.html</a><p>This is actually an interesting variant of a signal race bug. The vulnerability report says, “OpenBSD is notably not vulnerable, because its SIGALRM handler calls syslog_r(), an async-signal-safer version of syslog() that was invented by OpenBSD in 2001.” So a signal-safety mitigation encouraged OpenBSD developers to put non-trivial code inside signal handlers, which becomes unsafe when ported to other systems. They would have avoided this bug if they had done one of their refactoring sweeps to minimize the amount of code in signal handlers, according to the usual wisdom and common unix code guidelines.
Once I'd finished upgrading my openssh instances (which are linked against musl not glibc) I thought it'd be interesting to have a poke at musl's syslog(3) and see if it allocates too and so is easily exploitable in the same way. But as far as I can see, it doesn't:<p><a href="https://github.com/bminor/musl/blob/master/src/misc/syslog.c">https://github.com/bminor/musl/blob/master/src/misc/syslog.c</a><p>Everything there is either on stack or in static variables protected from reentrancy by the lock. The {d,sn,vsn}printf() calls there don't allocate in musl, although they might in glibc. Have I missed anything here?
Patch out for FreeBSD. Not clear if affected (it is only known to be exploitable with glibc, which we don't use) but best to be safe.<p><a href="https://www.freebsd.org/security/advisories/FreeBSD-SA-24:04.openssh.asc" rel="nofollow">https://www.freebsd.org/security/advisories/FreeBSD-SA-24:04...</a>
From the report:<p>> Finally, if sshd cannot be updated or recompiled, this signal handler
race condition can be fixed by simply setting LoginGraceTime to 0 in the
configuration file. This makes sshd vulnerable to a denial of service
(the exhaustion of all MaxStartups connections), but it makes it safe
from the remote code execution presented in this advisory.<p>Setting 'LoginGraceTime 0' in the sshd_config file seems to mitigate the issue.
Patch out for Debian 12; Debian 11 not affected.<p><a href="https://security-tracker.debian.org/tracker/CVE-2024-6387" rel="nofollow">https://security-tracker.debian.org/tracker/CVE-2024-6387</a>
This is a really good find.<p>One thing that strikes me (as an independent person, who isn't doing any of the work!) is that it often feels like in order to 'win', people are expected to find a full chain which gives them remote access, rather than just finding one issue, and getting it fixed / getting paid for it.<p>It feels to me like finding a single hole should be sufficient -- one memory corruption, one sandbox escape. Maybe at the moment there are just too many little issues, so that you need a full end-to-end hack to really convince people to take you seriously, or pay out bounties?
Yearly reminder to run your ssh server behind spiped.[1] [2] [3]<p>[1] <a href="https://www.tarsnap.com/spiped.html" rel="nofollow">https://www.tarsnap.com/spiped.html</a><p>[2] <a href="https://news.ycombinator.com/item?id=29483092">https://news.ycombinator.com/item?id=29483092</a><p>[3] <a href="https://news.ycombinator.com/item?id=28538750">https://news.ycombinator.com/item?id=28538750</a>
Correct me if I'm wrong but it seems like sshd on RHEL-based systems is safe because they never call syslog.<p>They run sshd with the -D option already, logging everything to stdout and stderr, as their systemd already catches this output and sends it to journal for logging.<p>So I don't see anywhere they would be calling syslog, unless sshd does it on its own.<p>At most maybe add OPTIONS=-e into /etc/sysconfig/sshd.
TLDR: this vulnerability does appear to allow an attacker to potentially gain remote root access on vulnerable Linux systems running OpenSSH, with some important caveats:<p>1. It affects OpenSSH versions 8.5p1 to 9.7p1 on glibc-based Linux systems.<p>2. The exploit is not 100% reliable - it requires winning a race condition.<p>3. On a modern system (Debian 12.5.0 from 2024), the researchers estimate it takes:
- ~3-4 hours on average to win the race condition
- ~6-8 hours on average to obtain a remote root shell (due to ASLR)<p>4. It requires certain conditions:
- The system must be using glibc (not other libc implementations)
- 100 simultaneous SSH connections must be allowed (MaxStartups setting)
- LoginGraceTime must be set to a non-zero value (default is 120 seconds)<p>5. The researchers demonstrated working exploits on i386 systems. They believe it's likely exploitable on amd64 systems as well, but hadn't completed that work yet.<p>6. It's been patched in OpenSSH 9.8p1, released on 1 July 2024.
<p><pre><code> In our experiments, it takes ~10,000 tries on average to win this race
condition, so ~3-4 hours with 100 connections (MaxStartups) accepted
per 120 seconds (LoginGraceTime). Ultimately, it takes ~6-8 hours on
average to obtain a remote root shell, because we can only guess the
glibc's address correctly half of the time (because of ASLR).
</code></pre>
MaxStartups default is 10
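The arithmetic behind those estimates is easy to reproduce. Here is a quick C sketch using only the figures quoted above (~10,000 tries, 100 connections per 120-second window, and a one-in-two chance of guessing the glibc address). The function names are made up for illustration.

```c
/* Hours needed to make `tries` attempts when `conns` attempts fit into
 * each LoginGraceTime window of `grace_seconds` seconds. */
double hours_to_win_race(double tries, double conns, double grace_seconds)
{
    return (tries / conns) * grace_seconds / 3600.0;
}

/* Expected hours to a shell when each race win succeeds with probability p. */
double hours_to_shell(double race_hours, double p_success)
{
    return race_hours / p_success;
}
```

With the quoted numbers this gives roughly 3.3 hours to win the race and roughly 6.7 hours to a shell, matching the report's ranges. If only 10 connections fit into each window, the same formula gives about 33 hours for the race alone.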
> In our experiments, it takes ~10,000 tries on average to win this race
condition, so ~3-4 hours with 100 connections (MaxStartups) accepted
per 120 seconds (LoginGraceTime). Ultimately, it takes ~6-8 hours on
average to obtain a remote root shell, because we can only guess the
glibc's address correctly half of the time (because of ASLR).<p>Mitigate by using fail2ban?<p>Nice to see that Ubuntu isn't affected at all
> Exploitation on non-glibc systems is conceivable but has not been examined.<p>( <a href="https://www.openssh.com/txt/release-9.8" rel="nofollow">https://www.openssh.com/txt/release-9.8</a> )<p>Darn - here I was hoping Alpine was properly immune, but it sounds more like "nobody's checked if it works on musl" at this point.
If you are on GCP and don't have time to patch, GCP recommends turning off your port 22 for now. <a href="https://cloud.google.com/compute/docs/security-bulletins" rel="nofollow">https://cloud.google.com/compute/docs/security-bulletins</a><p>1. Find things that are 0.0.0.0 port 22, example, <a href="https://gist.github.com/james-ransom/97e1c8596e28b9f759bac79a34dd92ac" rel="nofollow">https://gist.github.com/james-ransom/97e1c8596e28b9f759bac79...</a><p>2. Force them to the local network,
gcloud compute firewall-rules update default-allow-ssh --source-ranges=10.0.0.0/8 --project=$i;
After the xz backdoor a few months ago, I decided to turn off SSH everywhere I don't need it, either by disabling it or uninstalling it entirely. While SSH is quite secure, it's too lucrative a target, so it will always pose a risk.
Red Hat put an 8.1 score on it: <a href="https://access.redhat.com/security/cve/cve-2024-6387" rel="nofollow">https://access.redhat.com/security/cve/cve-2024-6387</a>
Patch out for Arch Linux<p><a href="https://archlinux.org/packages/core/x86_64/openssh/" rel="nofollow">https://archlinux.org/packages/core/x86_64/openssh/</a><p><i>edit</i> be sure to manually restart sshd after upgrading; my systems fail during key exchange after package upgrade until restarting the sshd service:<p>% ssh -v 192.168.1.254<p>OpenSSH_9.8p1, OpenSSL 3.3.1 4 Jun 2024<p>... output elided ...<p>debug1: Local version string SSH-2.0-OpenSSH_9.8<p>kex_exchange_identification: read: Connection reset by peer<p>Connection reset by 192.168.1.254 port 22
For my own setup, I'm looking into Path Aware Networking (PAN) architectures like SCION to avoid exposing paths to my sshd, without having to set up a VPN or port knocking.<p><a href="https://scion-architecture.net" rel="nofollow">https://scion-architecture.net</a>
TLDR: the safe versions are 4.4p1 <= OpenSSH < 8.5p1 and OpenSSH >= 9.8p1<p>---<p>- OpenSSH < 4.4p1 is vulnerable to this signal handler race condition,
if not backport-patched against CVE-2006-5051, or not patched against
CVE-2008-4109, which was an incorrect fix for CVE-2006-5051;<p>- 4.4p1 <= OpenSSH < 8.5p1 is not vulnerable to this signal handler race
condition (because the "#ifdef DO_LOG_SAFE_IN_SIGHAND" that was added
to sigdie() by the patch for CVE-2006-5051 transformed this unsafe
function into a safe _exit(1) call);<p>- 8.5p1 <= OpenSSH < 9.8p1 is vulnerable again to this signal handler
race condition (because the "#ifdef DO_LOG_SAFE_IN_SIGHAND" was
accidentally removed from sigdie()).
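Those ranges translate directly into a small check. Here is a C sketch, assuming the version has already been parsed into major and minor numbers. The function and enum names are hypothetical, and distro backports can make a "vulnerable" version number safe, so treat the result as a hint only.

```c
enum ssh_status {
    SSH_SAFE,
    SSH_VULNERABLE,
    SSH_VULNERABLE_UNLESS_PATCHED
};

/* Compare (major, minor) with (ref_major, ref_minor): <0, 0 or >0. */
static int ver_cmp(int major, int minor, int ref_major, int ref_minor)
{
    if (major != ref_major)
        return major - ref_major;
    return minor - ref_minor;
}

/* Classify an OpenSSH version according to the ranges in the advisory. */
enum ssh_status classify_openssh(int major, int minor)
{
    if (ver_cmp(major, minor, 4, 4) < 0)
        return SSH_VULNERABLE_UNLESS_PATCHED; /* needs the CVE-2006-5051 and CVE-2008-4109 fixes */
    if (ver_cmp(major, minor, 8, 5) < 0)
        return SSH_SAFE;                      /* 4.4p1 <= version < 8.5p1 */
    if (ver_cmp(major, minor, 9, 8) < 0)
        return SSH_VULNERABLE;                /* 8.5p1 <= version < 9.8p1 */
    return SSH_SAFE;                          /* version >= 9.8p1 */
}
```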
Anyone else here just totally crap bricks when they see news like this? Like, I wake up and instantly think all my servers are going to be owned and freak out. Though it's usually never that bad, sometimes it is. Looks like in this case my Debian servers were fine though.<p>edit: maybe I should add an iptables rule to only allow ssh from my IP.
In some setups I decided to put a jump host behind HAProxy with SSL, as described here: <a href="https://www.haproxy.com/blog/route-ssh-connections-with-haproxy" rel="nofollow">https://www.haproxy.com/blog/route-ssh-connections-with-hapr...</a> so no ssh is directly exposed at all.
I have an Ubuntu 22.10 system with ssh using socket activation. Does this bug still have an impact? I've read that Ubuntu 24.04 is safe because of socket activation. Can any expert here comment?
There's a purported PoC exploit that delivers shellcode available on GitHub, but I saw someone comment the link here, and then their comment disappeared on the next refresh.
And who was notoriously <i>not</i> exploitable? The ones hiding sshd behind port knocks. And fail2ban: would work too. And a restrictive firewall: would help too.<p>I don't use port-knocking but I really just don't get all those saying: <i>"It's security theater"</i>.<p>We had not one but two major OpenSSH "near fiascos" (this RCE and the xz lib thing) that were both rendered unusable for attackers by using port knocking.<p>To me port-knocking is not "security theater": it <i>adds</i> one layer of defense. It's defense-in-depth. Not theater.<p>And the port-knocking sequence doesn't always have to be the same: it can, say, change every 30 seconds, using TOTP-style secret sequence generation.<p>How many exploits stopped dead in their tracks by port-knocking shall we need before people stop saying port-knocking is security theater?<p>Other measures do also help... Like restrictive firewalling rules, which many criticize as "it only helps keep the logs smaller": no, they don't just help keep the logs smaller. I'm whitelisting the three ISPs' IP blocks anyone can reasonably need to SSH from: now the attacker needs not only the zero-day, but also needs to know that they have to be on one of those three ISPs' IPs.<p>The argument that consists in saying: <i>"sshd is unexploitable, so nothing else must be done to protect the server"</i> is...<p>Dead.
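As a rough illustration of the "changes every 30 seconds" idea, here is a C sketch that derives a knock sequence from a shared secret and the current 30-second window. It is a toy under loud assumptions: FNV-1a stands in for the keyed hash and is NOT cryptographically secure (a real setup would use an HMAC, as TOTP in RFC 6238 does), and `knock_sequence`, `KNOCK_COUNT` and the port range are invented for this example rather than taken from any existing port-knocking tool.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define KNOCK_COUNT 3
#define KNOCK_STEP_SECONDS 30

/* FNV-1a, used only as a stand-in mixer. NOT cryptographically secure. */
static uint32_t fnv1a(const unsigned char *data, size_t len, uint32_t h)
{
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Fill ports[] with the knock sequence for the 30-second window containing `now`. */
void knock_sequence(const char *secret, size_t secret_len, time_t now,
                    uint16_t ports[KNOCK_COUNT])
{
    uint64_t step = (uint64_t)now / KNOCK_STEP_SECONDS;

    for (int i = 0; i < KNOCK_COUNT; i++) {
        unsigned char ctr[9];
        uint32_t h;

        /* Counter block: the 8-byte window number followed by the knock index. */
        for (int b = 0; b < 8; b++)
            ctr[b] = (unsigned char)(step >> (8 * b));
        ctr[8] = (unsigned char)i;

        h = fnv1a((const unsigned char *)secret, secret_len, 2166136261u);
        h = fnv1a(ctr, sizeof ctr, h);
        ports[i] = (uint16_t)(1024 + h % (65536 - 1024));  /* unprivileged ports only */
    }
}
```

Client and server would both call `knock_sequence()` with the same secret and clock. In practice the server would probably also accept the previous window to tolerate clock skew.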
<p><pre><code> OpenBSD is notably not vulnerable, because its
SIGALRM handler calls syslog_r(), an async-signal-safer version of
syslog() that was invented by OpenBSD in 2001.
</code></pre>
Saving the day once again.
People still use SSH these days?<p>I kid, but really you probably shouldn't on Production. You should be exporting your logs and everything else. Hosts or VMs should be bootstrapped from golden images with everything they need.<p>It is okay to start that way and figure out your internals but that isn't for Production. Production is a locked down closed environment.<p>Re-commenting from another Hacker News post.