The SMBGhost that makes you WannaCry again
Another critical vulnerability in SMB is taking the world by storm. But what caused it, and why is it so devastating?
SMBGhost is a ‘wormable’ vulnerability in recent versions of Windows 10 and Windows Server (versions 1903 and 1909). Although the vulnerability dates from March 2020, a new development has drawn a lot of public attention in the last few days. Wormable means that it allows remote code execution without authentication, so malware exploiting it can spread across networks on its own. Things like this never look good: the CVSS score is 10.0 (see CVE-2020-0796), which already says a lot about how serious the issue is. SMBGhost has raised serious concerns in the security community, especially after a hacker published a proof-of-concept RCE exploit on GitHub. This makes it possible for anyone to build their own malware on top of it and attack every unpatched Windows machine out there.
In theory, all systems should be patched against SMBGhost by now (Microsoft released the patch on March 12), but we have seen how a similar story played out just three years ago. We are, of course, talking about the WannaCry ransomware (which itself was riding the coattails of the EternalBlue exploit). There are so many similarities with the current situation that some security experts even nicknamed this vulnerability ‘CoronaBlue’.
Let’s take a closer look at the vulnerability behind SMBGhost, and what we can learn from it.
Some More Bugs, version 3
The Server Message Block (SMB) protocol has a long and storied history. Originally designed some 30 years ago as a networked DOS file system, its first version, SMBv1 (responsible for the WannaCry outbreak in 2017), evolved into a cross-OS file sharing protocol implemented by many third parties – such as Samba for *nix systems – and also became popular among some NAS device vendors due to its relative simplicity.
Throughout its history, the SMB server has been implemented as a kernel driver within Windows. Since the server is enabled and listening by default on a predefined port, any vulnerabilities in it are extremely dangerous and can potentially fully compromise an Internet-connected Windows system. And there were quite a few of these vulnerabilities over the past few years – more than 30 of them just in SMBv1.
Of course, SMB itself evolved with time. Modern Windows versions support the more secure SMBv3 (available since 2012), whose server implementation relies less on legacy code from the 90s. Microsoft deprecated SMBv1 long ago, but could not completely stop supporting it because it is still in active use by many third-party components (such as NAS devices). Instead, as of the 2017 Fall Creators Update of Windows (version 1709), the SMBv1 client and server are bundled as a separate component that is not installed by default in Windows 10 and Windows Server. None of this helped here, however: the SMBGhost vulnerability is actually in SMBv3, the ostensibly (more) secure protocol implementation. Not only that, but it is in SMBv3 compression – a relatively new feature added in 2019!
Numbers are hard
So let’s take a look at the vulnerable code, using the decompiled source from Ricerca Security with minor changes for readability. The vulnerability is in Srv2DecompressData, a function responsible for decompressing the content of an SMBv3 message.
[Code listing: decompiled Srv2DecompressData, from Ricerca Security]
The root cause of the problem behind SMBGhost is the line marked with (A) in the code above. It is an arithmetic integer overflow when handling a certain SMB message (for comparison, EternalBlue/WannaCry exploited an integer truncation problem). Since this code processes messages received from outside, attackers have full control over the header of the message. They can craft a message in which OriginalCompressedSegmentSize and Offset are set so that adding the two numbers together produces an unsigned integer overflow (technically a wraparound): for example, 0xFFFFFFFF + 0x10 wraps around to 0xF in 32-bit arithmetic. As a result, the function allocates a buffer that is far too small and then copies more bytes into it than it can hold, overflowing the buffer on the heap (line marked with B). A buffer overflow can trivially cause a crash (see this proof-of-concept code), but if the attacker chooses the two values just right, they can achieve controlled memory corruption, leading to remote code execution. That is exactly what makes this vulnerability a ‘wormable’ one.
This is a very common pattern in vulnerable C code: the developer allocates memory based on the combined (user-provided) size of several data fields, but the addition will result in allocating a very small amount of memory instead due to the integer overflow – and then actually copying the user data will trigger a buffer overflow.
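The arithmetic behind this pattern can be reproduced in a few lines of user-space C. The following is a minimal sketch, not the decompiled driver code: the struct and function names are made up for illustration, and only the two header fields named above are modeled.

```c
#include <stdint.h>

/* Hypothetical model of the two attacker-controlled header fields. */
struct header_model {
    uint32_t OriginalCompressedSegmentSize; /* claimed size of the decompressed data */
    uint32_t Offset;                        /* claimed length of the uncompressed prefix */
};

/* Models the flawed calculation: a 32-bit sum that wraps modulo 2^32. */
uint32_t wrapped_alloc_size(const struct header_model *h)
{
    return (uint32_t)(h->OriginalCompressedSegmentSize + h->Offset);
}

/* The size that is really needed, computed in 64 bits so it cannot wrap. */
uint64_t required_size(const struct header_model *h)
{
    return (uint64_t)h->OriginalCompressedSegmentSize + (uint64_t)h->Offset;
}

/* Number of bytes by which the allocation falls short of the claimed data. */
uint64_t shortfall(const struct header_model *h)
{
    return required_size(h) - wrapped_alloc_size(h);
}
```

With OriginalCompressedSegmentSize set to 0xFFFFFFFF and Offset set to 0x10, wrapped_alloc_size() yields 0xF while required_size() yields 0x10000000F. For benign values the two agree and shortfall() is zero.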
Of course, all of the above is just theory so far – so let’s take a look at what the hackers actually achieved.
The more things change…
The initial situation actually looks much better than the EternalBlue (and WannaCry) situation back in 2017. First off, the bad guys cannot start with a working exploit kit capable of RCE out of the gate this time. Furthermore, this attack targets SMB 3.1.1, which is used only in the most modern versions of Windows (Windows Server 2016 and Windows 10). These versions are more secure than ever, with multiple kernel-level protections such as strong Address Space Layout Randomization (ASLR) and Control Flow Guard (CFG) enabled by default. So that means we should be safe, right?
Well, even just going by publicly disclosed exploitation, it only took two weeks for researchers to publish the details of a Write-What-Where (WWW) and local privilege escalation (LPE) exploit based on SMBGhost – and less than a month after that for another group to publish the details of a fully working RCE (though they decided not to release a proof-of-concept exploit).
The first two exploits by ZecOps (see technical writeup here) abuse the fact that the allocation function for this module (SrvNetAllocateBuffer) behaves in a predictable way when allocating small blocks of memory. Specifically, for efficiency reasons it uses so-called lookaside lists: pools of fixed-size buffers that are reused as needed instead of being allocated from scratch each time. Furthermore, the heap layout was rather unusual in that the buffer was stored before the lookaside entry’s allocation header, so the buffer overflow could trivially overwrite the metadata structure (SRVNET_BUFFER_HDR) within it. These are basically the same tricks used by researchers porting EternalBlue to Windows 10 in 2018 (see pages 9 and 10 of the RiskSense whitepaper)!
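To see why buffer reuse adds predictability, consider a toy lookaside list. This is only a sketch of the general idea (a last-in, first-out list of fixed-size buffers), not the SrvNet implementation: all names and the buffer size are invented for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

#define TOY_BUFFER_SIZE 4096 /* arbitrary illustrative size class */

/* One cached entry: a link field followed by the fixed-size buffer. */
struct toy_entry {
    struct toy_entry *next;
    unsigned char data[TOY_BUFFER_SIZE];
};

/* The list head; initialize free_list to NULL. */
struct toy_lookaside {
    struct toy_entry *free_list;
};

/* Hands out a cached buffer if one exists, otherwise allocates a new one. */
void *toy_alloc(struct toy_lookaside *l)
{
    struct toy_entry *e = l->free_list;
    if (e != NULL) {
        l->free_list = e->next; /* reuse the most recently freed entry */
    } else {
        e = malloc(sizeof *e);
        if (e == NULL)
            return NULL;
    }
    return e->data;
}

/* Returns a buffer to the list instead of releasing it to the system. */
void toy_free(struct toy_lookaside *l, void *p)
{
    struct toy_entry *e = (struct toy_entry *)
        ((unsigned char *)p - offsetof(struct toy_entry, data));
    e->next = l->free_list;
    l->free_list = e;
}
```

In this model, a buffer that has just been freed is exactly the one returned by the next allocation, so whoever controls the sequence of requests can anticipate which memory will be used next.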
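In the simplified heap layout, showing only the relevant fields, the buffer comes first and the SRVNET_BUFFER_HDR struct follows it directly; the fields of interest within that struct are the pNetRawBuffer pointer and the pMDL pointer.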
Once the overflow has overwritten the pNetRawBuffer pointer, the memmove() marked with (C) writes attacker-chosen data to an attacker-chosen location in memory. This was a devastating attack by itself, allowing the researchers to gain SYSTEM access, assuming they already had some level of access to the machine.
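The mechanics of such a write-what-where primitive can be modeled safely in user space. In the sketch below, the ‘heap’ is just a byte array, and the header stores an offset into that array instead of a real pointer; all names and sizes are hypothetical, and the two functions only loosely stand in for steps (B) and (C).

```c
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE 256
#define BUF_OFFSET 0                       /* the buffer starts here */
#define BUF_SIZE   16
#define HDR_OFFSET (BUF_OFFSET + BUF_SIZE) /* the header directly follows the buffer */

/* Toy heap: buffer, then a 4-byte "destination" field standing in for pNetRawBuffer. */
struct toy_heap {
    uint8_t bytes[ARENA_SIZE];
};

void toy_init(struct toy_heap *h)
{
    uint32_t dst = BUF_OFFSET; /* the header initially points at its own buffer */
    memset(h->bytes, 0, sizeof h->bytes);
    memcpy(h->bytes + HDR_OFFSET, &dst, sizeof dst);
}

/* Models step (B): data is written into the buffer without checking BUF_SIZE. */
int toy_decompress(struct toy_heap *h, const uint8_t *data, size_t len)
{
    if (len > ARENA_SIZE - BUF_OFFSET)
        return -1; /* keeps the model itself memory-safe */
    memcpy(h->bytes + BUF_OFFSET, data, len); /* modeled bug: may run into the header */
    return 0;
}

/* Models step (C): data is copied to wherever the header says the buffer is. */
int toy_copy_prefix(struct toy_heap *h, const uint8_t *prefix, size_t len)
{
    uint32_t dst;
    memcpy(&dst, h->bytes + HDR_OFFSET, sizeof dst);
    if (dst > ARENA_SIZE || len > ARENA_SIZE - dst)
        return -1; /* keeps the model itself memory-safe */
    memmove(h->bytes + dst, prefix, len);
    return 0;
}
```

If the data passed to toy_decompress() consists of 16 filler bytes followed by a 4-byte offset, the header field is overwritten, and the next toy_copy_prefix() call places its bytes at that offset rather than at the start of the buffer: the attacker picks both the ‘what’ and the ‘where’.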
But – as always – the situation can get even worse.
Two weeks following the disclosure of the LPE exploit, Ricerca Security posted a detailed writeup on achieving full RCE via SMBGhost. In order to do that, they had to break address space layout randomization and Control Flow Guard (which would normally catch illegal jumps, for instance to the shellcode).
In broad strokes:
- Entries in lookaside lists behave differently from heap blocks in that their SRVNET_BUFFER_HDR headers are reused for subsequent allocations. By using the Write-What-Where vulnerability, the researchers could directly overwrite a critical part of this header pointing to a memory descriptor list (pMDL). This gave them the power to do arbitrary read operations when the header was parsed.
- By using some other tricks specific to the Windows 10 kernel and its handling of page tables in memory, they could read physical memory, resolve virtual addresses, and learn the real address of a code pointer to overwrite with the address of the shellcode (which was in user space). This was used to bypass ASLR.
- Since they had arbitrary write capability (through the aforementioned Write-What-Where) in a kernel context, they could also simply remove the call to the Control Flow Guard function that would’ve normally detected an illegal jump during execution of the shellcode in user space.
- So, jumping right to the shellcode was possible despite ASLR and CFG: profit!
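The last two steps rest on a general weakness: a check that lives in writable memory is only as strong as the attacker's inability to write there. The toy model below illustrates that idea and nothing more. It does not reflect how Windows implements Control Flow Guard, and every name in it is hypothetical.

```c
#include <stddef.h>

typedef int (*handler_fn)(int);
typedef int (*check_fn)(handler_fn);

int double_it(int x) { return 2 * x; }   /* a legitimate call target */
int rogue(int x) { (void)x; return -1; } /* stands in for the shellcode */

static const handler_fn valid_targets[] = { double_it };

/* Accepts only targets that appear in the table of valid targets. */
int strict_check(handler_fn f)
{
    for (size_t i = 0; i < sizeof valid_targets / sizeof valid_targets[0]; i++)
        if (valid_targets[i] == f)
            return 1;
    return 0;
}

/* What the check amounts to once it has been neutralized. */
int no_check(handler_fn f) { (void)f; return 1; }

/* The guard itself is ordinary writable data. */
struct toy_guard {
    check_fn check;
};

/* Calls f(arg) only if the guard allows it; returns -1 when blocked. */
int guarded_call(const struct toy_guard *g, handler_fn f, int arg, int *out)
{
    if (!g->check(f))
        return -1;
    *out = f(arg);
    return 0;
}
```

As long as the guard points at strict_check, a call to rogue is rejected. A single write that redirects the guard to no_check lets the same call through, which is the role the Write-What-Where primitive plays in the chain above.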
All in all, the researchers demonstrated that full remote code execution was possible. While they didn’t release their exploit code into the wild, it only took a month and a half until someone else used their idea as a basis to develop and publish a working exploit.
It’s not 2017 anymore… or is it?
While SMBGhost is a different vulnerability from the one exploited in EternalBlue leading to WannaCry, it has some remarkable similarities. A simple programming bug (in this case, an integer overflow) allowed triggering a buffer overflow, which was then exploitable via a complex exploitation chain, ultimately leading to remote code execution in the kernel. And once someone has done it, following in their footsteps is relatively easy. An exploit like the one published in June can potentially open up millions of Windows machines to remote compromise, even by low-skill attackers. Though the vulnerability has already been fixed, not all users install security patches in time. Admittedly, the aggressive auto-updating behavior of Windows 10 is a benefit here.
This time, the attackers had to defeat multiple protections on the way to exploitation. And yet, even against highly advanced protections such as Control Flow Guard, the researchers showed that exploitation is often just a cat-and-mouse game that favors the attackers. And in this case, it wasn’t even legacy code being attacked, but 2019 code written for a more secure version of the protocol!
Built-in protections and anti-exploit techniques like ASLR and CFG naturally help a lot in improving the baseline security of your application – but they don’t make it immune to these kinds of threats. Eventually, a determined attacker may be able to get around them, as the story of SMBGhost has demonstrated once again.
The best approach is to prevent these scenarios from happening in the first place, by doing proper input validation and employing defensive programming techniques. These things are all part of the approaches and best practices that you can learn on our courses, along with more details about those magic acronyms that appeared in this article – like RCE, LPE, ASLR or CFG.
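As a small example of what such input validation can look like for the bug class discussed here, the sketch below rejects a size calculation that would wrap around before any memory is allocated. It is one possible approach written for illustration, not the fix Microsoft shipped, and the function names are our own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Adds two 32-bit sizes; returns false instead of wrapping around. */
bool checked_add_u32(uint32_t a, uint32_t b, uint32_t *sum)
{
    if (a > UINT32_MAX - b)
        return false; /* a + b would not fit in 32 bits */
    *sum = a + b;
    return true;
}

/* Validates two untrusted size fields against an upper limit before use. */
bool validated_alloc_size(uint32_t segment_size, uint32_t offset,
                          uint32_t max_size, uint32_t *out)
{
    uint32_t total;
    if (!checked_add_u32(segment_size, offset, &total))
        return false; /* reject: the sum overflows */
    if (total > max_size)
        return false; /* reject: larger than any legitimate message */
    *out = total;
    return true;
}
```

With this check in place, a header claiming sizes of 0xFFFFFFFF and 0x10 is refused outright instead of producing a 15-byte allocation.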