Racing bugs in Windows kernel


  • ETW for security researcher

  • CVE-2023-21536 overview

  • Why other Filters are not vulnerable

  • Exploitation

  • CVE-2023-21537 overview

  • Exploitation

  • Conclusion

In this writeup, I share my research methodology and exploitation notes for two kernel bugs that I reported to MSRC and that were fixed in January 2023. The first bug is a race condition that causes a use-after-free on an ETW filter, and the second is a double-fetch race that causes a "free" of controlled pointers in the Windows Message Queuing module.

ETW for security researcher overview
The first bug I reported has existed in Windows since 2018 and is specific to the ETW module. ETW has quite wide functionality in the kernel, so I will give a high-level overview from a security researcher's perspective.

ETW stands for Event Tracing for Windows and, as its name suggests, it collects a lot of logs and events during kernel execution. The design consists of Providers that generate event logs and Consumers that listen to the events emitted by Providers. These logs can be inspected by a system admin to investigate technical OS problems or security-related issues. You can read more about ETW design here (1) and here (2).

ETW has 3 main kernel entry points that can be reached from userspace:

NtTraceControl - takes 5 arguments: the information class ("function num"), Input buffer + length, Output buffer + length. About 40 different functions can be triggered, and most of them deal with provider/consumer configuration, such as:
- Start/stop tracing
- Register/enable providers
- Configure a provider's notifications
- Get trace/log info

That is a fairly large attack surface. The important point here is that some operations are restricted to specific users. This is enforced by several functions (EtwpCheckLoggerControlAccess, EtwpAccessCheck, EtwpCheckGuidAccess): when you want to interact with an ETW provider/log, these "check" functions apply the hardcoded access requirements of the functionality in question. For example, if you want to start a trace session, EtwpAccessCheck is called with a required access of 0xA0, which means that the calling user must be in a group that holds the TRACELOG_CREATE_REALTIME + TRACELOG_GUID_ENABLE access rights. A good overview of ETW and permission groups can be found here (3).

A lot of applications already hold open handles to the providers they have registered. This means that once you have code execution in such an application, you can trigger a lot of NtTraceControl functionality. For example, Notepad.exe holds handles of type EtwReg with access permission 0x804, which allows you to interact with many ETW kernel functions.

NtTraceEvent - takes 4 arguments: handle to Provider, Flags, Input Buff length + Input Buff. About 10 functions can be triggered from userspace, and most of them perform the actual writing of events to logs. Access checks apply here as well and determine who can write events, and to where.

NtSetSystemInformation - takes 3 arguments: the class, which should be SystemPerformanceTraceInformation, and Input Buff + Length. This API has about 30 functions that can be called from userspace. Some of them relate to trace performance, restart info, profile info and more.
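
The access values mentioned in this overview (0xA0 for starting a trace session, 0x804 for the EtwReg handles held by Notepad.exe) are plain bit masks. The following is a minimal sketch that decodes such a mask into right names. It assumes the bit values published in the Windows SDK headers (wmistr.h / evntcons.h), copied here so the code builds on any platform; decode_etw_access is a hypothetical helper written for this illustration, not a Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Access-right bits as published in the Windows SDK headers
   (wmistr.h / evntcons.h), listed here so the sketch builds anywhere. */
struct etw_right {
    unsigned    mask;
    const char *name;
};

static const struct etw_right k_etw_rights[] = {
    { 0x0001, "WMIGUID_QUERY" },
    { 0x0002, "WMIGUID_SET" },
    { 0x0004, "WMIGUID_NOTIFICATION" },
    { 0x0008, "WMIGUID_READ_DESCRIPTION" },
    { 0x0010, "WMIGUID_EXECUTE" },
    { 0x0020, "TRACELOG_CREATE_REALTIME" },
    { 0x0040, "TRACELOG_CREATE_ONDISK" },
    { 0x0080, "TRACELOG_GUID_ENABLE" },
    { 0x0100, "TRACELOG_ACCESS_KERNEL_LOGGER" },
    { 0x0200, "TRACELOG_LOG_EVENT" },
    { 0x0400, "TRACELOG_ACCESS_REALTIME" },
    { 0x0800, "TRACELOG_REGISTER_GUIDS" },
    { 0x1000, "TRACELOG_JOIN_GROUP" },
};

/* Hypothetical helper (not a Windows API): writes "A | B | ..." for every
   known bit set in `access` and returns how many bits were recognised. */
int decode_etw_access(unsigned access, char *out, size_t out_len)
{
    size_t i;
    int    found = 0;

    assert(out != NULL && out_len > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof k_etw_rights / sizeof k_etw_rights[0]; i++) {
        if ((access & k_etw_rights[i].mask) == 0)
            continue;
        if (found > 0)
            strncat(out, " | ", out_len - strlen(out) - 1);
        strncat(out, k_etw_rights[i].name, out_len - strlen(out) - 1);
        found++;
    }
    return found;
}
```

With these values, 0xA0 decodes to TRACELOG_CREATE_REALTIME | TRACELOG_GUID_ENABLE, which matches the requirement described for starting a trace session, and 0x804 decodes to WMIGUID_NOTIFICATION | TRACELOG_REGISTER_GUIDS.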

CVE-2023-21536 overview
This bug was found by manual review, and I doubt a fuzzer could find it because the race window is quite small. The bug is in the main ntos kernel binary.
My approach to this research was to try to understand which shared entities are encapsulated in the ETW module. By "shared" I mean interesting structures in ETW that are likely to be accessed from several different functions. This is quite easy, because ETW has several critical structures that certainly must be accessed from different places. For example:

You can see descriptions of these structs, with most fields reverse engineered, at
Now let's say that you have identified some interesting structure/object and you see that one of its members is a pointer to some allocated data/object. The question is: how is this pointer managed? Who allocates it? Who frees it? Is there any remove/replace logic for the pointed-to data? The basic question is whether two independent functions can take this pointed-to data and operate on it simultaneously.
In most cases, the programmer needs to take a "lock" when touching such a pointer. Some locks are taken for read access and some for write access. Usually a "read lock" is shared: several readers can hold it at the same time without waiting for one another to release it.
A crucial point is where exactly this lock is taken, and whether it is actually taken and not forgotten. By "where exactly" I mean: is the lock taken early enough, before any critical shared data is accessed? No less critical is whether the lock is held for long enough and not released earlier than needed.
For my purpose, I started by looking at ETW_GUID_ENTRY. At offset 0x180 there is a member called FilterData, which is a pointer to a struct of type ETW_FILTER_HEADER.

    LONG FilterFlags;                                                       
    struct _ETW_FILTER_PID* PidFilter;                                      
    struct _ETW_FILTER_STRING_TOKEN* ExeFilter;                             
    struct _ETW_FILTER_STRING_TOKEN* PkgIdFilter;                           
    struct _ETW_FILTER_STRING_TOKEN* PkgAppIdFilter;                        
    struct _ETW_FILTER_STRING_TOKEN* ContainerFilter;                       
    struct _ETW_PERFECT_HASH_FUNCTION* StackWalkIdFilter;                   
    struct _ETW_FILTER_EVENT_NAME_DATA* StackWalkNameFilter;                
    struct _EVENT_FILTER_LEVEL_KW* StackWalkLevelKwFilter;                  
    struct _ETW_PERFECT_HASH_FUNCTION* EventIdFilter;                       
    struct _ETW_PAYLOAD_FILTER* PayloadFilter;                              
    struct _EVENT_FILTER_HEADER* ProviderSideFilter;                        
    struct _ETW_FILTER_EVENT_NAME_DATA* EventNameFilter;                    

These filters are used to filter the events that a provider is going to write to a trace session. That is a lot of pointers, and they have to be managed somehow. The fastest way to get an overview of where FilterData is used is to search IDA for the offset 0x180: any function that interacts with FilterData must at some point dereference the ETW_GUID_ENTRY at offset 0x180 to get the FilterData pointer.
The basic assembly instruction will look like this: mov rcx, [rdi+180h]
In IDA, go to Search -> Immediate Value -> Value to search=0x180
You will get many results (about 600), most of them false positives, but if you restrict the results to functions containing "etw" you narrow them to 75. From there you can see that some results only perform a comparison or some other irrelevant operation, which leaves a final set of 60 results.
* Note that narrowing the search to "etw" may cause you to miss some results.
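
To see why searching for the immediate 0x180 finds the FilterData users, it helps to look at what the member access compiles to. The following is a minimal sketch, assuming a stand-in structure in which everything before FilterData is replaced by padding; guid_entry_model and both functions are hypothetical names for this illustration, and the register choice in the comment is only one possibility.

```c
#include <assert.h>
#include <stddef.h>

struct etw_filter_header;   /* opaque here; only the pointer matters */

/* Hypothetical stand-in for ETW_GUID_ENTRY: only the offset of FilterData
   (0x180, taken from the text) is modelled, the real fields are padding. */
struct guid_entry_model {
    unsigned char             other_fields[0x180];
    struct etw_filter_header *FilterData;
};

/* A read of entry->FilterData is a load from [entry + 0x180]; with entry
   held in rdi a compiler may emit something like: mov rcx, [rdi+180h] */
struct etw_filter_header *get_filter_data(const struct guid_entry_model *entry)
{
    assert(entry != NULL);
    return entry->FilterData;
}

/* The displacement that shows up as an immediate in the disassembly. */
size_t filter_data_offset(void)
{
    return offsetof(struct guid_entry_model, FilterData);
}
```

Because the displacement is encoded in the instruction itself, every direct access to the member carries the same immediate. Other structures that happen to have a field at 0x180 carry it too, which is where the false positives come from.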

Here are the results:

Now, after reversing some of the logic around these filters, you will see that you can set/update a provider's filters via the NtTraceControl API. The inner function responsible for this is EtwpUpdateFilterData(ETW_GUID_ENTRY *a1_Guid, unsigned int a2_LogerIndex, ETW_ENABLE_NOTIFICATION_PACKET *a3_pEnablePacket, __int64 a4_isRemove, ETW_FILTER_INFO *a5_outFilterInfo)

The 3rd arg is controlled and sent from userspace and its structure is:

    unsigned int LegacyProviderEnabled;
    unsigned int FilterCount;
    EVENT_FILTER_DESCRIPTOR Filters[];   /* FilterCount entries */

When reversing EtwpUpdateFilterData() you can easily see that if you pass FilterCount=0, the current filters are freed. Another option is to update a filter, but that is not interesting here. There is a total of 12 filters you can free, so the critical question is whether this flow is protected by a lock. Why? Because if this flow can free a filter while no lock is taken, you can cause a use-after-free by racing it against another flow that uses the filter: one flow frees the filter while the other flow is still using it.

The most appropriate lock to use is the ETW_GUID_ENTRY lock, because all these filters belong to the GUID entry. The calling function is EtwpUpdateGuidEnableInfo, and a quick look shows no lock there. One level up, in the caller EtwpEnableGuid, there is a lock: ExAcquirePushLockExclusiveEx(&guidEntry->Lock, 0). The lock is exclusive, so no other flow can hold the same lock at the same time. The "freeing" side looks safe.

All of this is fine unless you forget to lock the "filter use" flow: locks must be taken on both the free side and the use side. Here comes some challenging reversing, because I want to find all the places where these filters are used and verify that locks are taken appropriately. Once again you can use the search results for offset 0x180 to see which functions use filters. It seems there are dedicated functions for this: EtwpApplyEventNameFilter(), EtwpApplyEventIdPayloadFilter(), EtwpApplyPackageIdFilter(), EtwpApplyContainerFilter(), EtwpApplyExeFilter(), EtwpApplyLevelKwFilter(), EtwpApplyScopeFilters() and several more.

Now you need to reverse these functions and see whether proper locking is done. Four filters caught my eye: EtwpApplyEventIdPayloadFilter, EtwpApplyEventNameFilter, EtwpApplyStackWalkIdFilter and EtwpApplyLevelKwFilter. The last one is the vulnerable filter, which causes the use-after-free.
EtwpApplyLevelKwFilter is called from two functions: EtwpWriteUserEvent and EtwpEventWriteFull. EtwpWriteUserEvent can easily be triggered from userspace by calling the NtTraceEvent API. In this flow no locks are taken at all, meaning that when you apply a filter you are actually in a race with the flow that can free it. The filter-apply side is unsafe while the filter-free side is safe.
You may also notice that NtTraceEvent can be called with a Flags value of 0x300, in which case the handle involved has access permission 0x800. This permission is widely used and many processes already hold open handles with it, so you can trigger EtwpApplyLevelKwFilter from many processes once you can execute code in one of them.
The Race condition:
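
The losing interleaving can be replayed deterministically in a small user-mode model. The following is a sketch under stated assumptions, not kernel code: kw_filter_model stands in for the pool chunk, "freeing" only flips a flag so that the stale access can be detected safely, and all names are hypothetical. The exclusive lock on the free side is left out on purpose, because the use side never takes it and it therefore cannot order the two flows. That the free flow also clears the shared pointer is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model of the pool chunk holding the Level/Keyword filter.
   `live` flips to false when the chunk is "freed". */
struct kw_filter_model {
    bool     live;
    unsigned level;
};

/* Model of the GUID entry: just the shared filter pointer. */
struct guid_model {
    struct kw_filter_model *kw_filter;
};

/* Free flow (FilterCount=0 path): frees the chunk and clears the pointer.
   The real flow holds the exclusive GUID-entry lock while doing this. */
void free_flow(struct guid_model *g)
{
    assert(g != NULL);
    if (g->kw_filter != NULL) {
        g->kw_filter->live = false;
        g->kw_filter = NULL;
    }
}

/* Use flow, step 1: fetch the pointer with no lock held. */
struct kw_filter_model *use_flow_fetch(const struct guid_model *g)
{
    return g->kw_filter;
}

/* Use flow, step 2: apply the filter. Returns true when the chunk it
   touches has already been freed, i.e. a use-after-free happened. */
bool use_flow_apply(const struct kw_filter_model *f)
{
    return f != NULL && !f->live;
}

/* Thread A fetches, thread B frees, thread A applies: the race is lost. */
bool replay_race(void)
{
    struct kw_filter_model chunk = { true, 4 };
    struct guid_model      g = { &chunk };
    struct kw_filter_model *stale = use_flow_fetch(&g);  /* thread A */

    free_flow(&g);                                       /* thread B */
    return use_flow_apply(stale);                        /* thread A */
}

/* Thread B frees first: thread A sees no filter and nothing goes wrong. */
bool replay_no_race(void)
{
    struct kw_filter_model chunk = { true, 4 };
    struct guid_model      g = { &chunk };

    free_flow(&g);
    return use_flow_apply(use_flow_fetch(&g));
}
```

In this model replay_race() reports a stale access and replay_no_race() does not. The window is the gap between step 1 and step 2 of the use flow.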

Why other filters are not vulnerable
As mentioned before, another three filters take exactly the same "no lock" flow via the NtTraceEvent() API: EtwpApplyEventIdPayloadFilter(), EtwpApplyEventNameFilter() and EtwpApplyStackWalkIdFilter().
So how is it that they cannot be raced and do not cause a use-after-free?
It seems that synchronization is done in a non-standard way, via IRQL levels. When a CPU's IRQL is raised to DPC (2) level, the code running on that CPU cannot be preempted by another thread, and DPCs queued to that CPU do not run until the IRQL is lowered again. This trick can only be used for a very short period of time and is usually reserved for interrupt handling or other very important work. If you look at the implementation of these three filters you can see that they all call writeCR8(2) just before applying logic to the filter and writeCR8(0) once finished. This raises the IRQL to DPC (2) level while the filter is being "used".
On the freeing side, in EtwpUpdateFilterData, you may notice that before these three filters are freed there is a call to KeGenericCallDpc(KeAbCrossThreadDeleteNopDpcRoutine, 0). This runs a DPC on every CPU at DPC (2) level, so it cannot complete while a filter is still being used at raised IRQL on some CPU. So finally we are left with only one filter, KwFilter, which is freed without any IRQL tricks and is also used without a lock and without IRQL tricks.
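
The IRQL-based scheme can be pictured as a barrier. The following is a toy model, not kernel code, built on one assumption: a DPC queued to a CPU only runs once that CPU's IRQL is below DPC (2) level. All names are hypothetical, and the model ignores everything else the kernel does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { PASSIVE_LEVEL_MODEL = 0, DISPATCH_LEVEL_MODEL = 2 };

struct cpu_model {
    int  irql;                 /* current IRQL of this CPU */
    bool barrier_dpc_pending;  /* queued barrier DPC has not run yet */
};

/* Use side: models writeCR8(2) just before the filter is touched. */
void filter_use_begin(struct cpu_model *cpu)
{
    cpu->irql = DISPATCH_LEVEL_MODEL;
}

/* Use side: models writeCR8(0); a pending DPC gets to run now. */
void filter_use_end(struct cpu_model *cpu)
{
    cpu->irql = PASSIVE_LEVEL_MODEL;
    cpu->barrier_dpc_pending = false;
}

/* Free side: models queuing a no-op DPC on every CPU. On a CPU that is
   below DPC level it runs at once; elsewhere it stays pending. */
void queue_barrier_dpc(struct cpu_model *cpus, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        cpus[i].barrier_dpc_pending = (cpus[i].irql >= DISPATCH_LEVEL_MODEL);
}

/* Free side: the filter may only be freed once every DPC has run. */
bool barrier_complete(const struct cpu_model *cpus, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (cpus[i].barrier_dpc_pending)
            return false;
    return true;
}
```

If one CPU is between filter_use_begin() and filter_use_end() when the barrier is queued, barrier_complete() stays false until that CPU lowers its IRQL, so the free is delayed until the user is done. The Level/Keyword filter has neither half of this scheme.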

Exploitation ideas and challenges
To give Windows users time to update their systems, only the general steps are discussed.
In order to win the race you need two threads that constantly trigger the two flows: filter use and filter free. The good news is that if you lose the race you can just keep trying until you succeed, and nothing will crash.

The KWFilter is allocated from PagedPool with a size of 0x18. You will need a very fast spray to be able to allocate on this exact spot. Alternatively, it might be possible to widen the race window so that the filter is used for a longer time, which makes the use-after-free easier to hit.
The KWFilter allocation shares its bucket with many other allocated objects in the 0x10-0x20 size range, which makes it a pretty noisy bucket. You will probably need to make sure that random allocations do not interfere with your spray.
Spraying controlled data with WNF objects might be worth a try. You can read more on WNF for exploitation here (4).

CVE-2023-21537 overview
This bug was found by manual code review, and it has existed in the mqac driver for more than 10 years. The bug is a double-fetch race, in which kernel logic reads the same userspace address at two different points in time. The Message Queuing (mqac) driver enables applications running at different times to communicate across networks and systems that may be temporarily offline.
The easiest way to trigger functionality in this driver is via the MQAC API, which has user-mode functions such as MQOpenQueue, MQReceiveMessage, MQSetQueueProperties and more. An additional way is to send IOCTL commands directly from userspace; these commands are parsed in the mqac driver's ACDeviceControl() function.
While reversing the different IOCTLs I saw that IOCTL 0x19658107 is accessible to any user and that it calls ACSendMessage(DeviceObject, Irp, InBuffLen, FileObj->FsContext, InBuff). The last argument is a large userspace buffer with a size of 0x2C0.

The ACSendMessage() flow starts well: the userspace buffer is copied to a local kernel buffer, and then ACDeepProbeSendParams(DeviceObject, InBuffCpy, oCtxPtrs) is called to validate that the copied user buffer contains valid pointers that do not point into or intersect kernel space. The user buffer is a fairly complex combination of structures, and some of these structures must be validated to have valid pointers.
One such structure, at offset 0x250, is validated via ACDeepCopyQueueFormat(ArrayOfElements, ElementsCount, outDeepCopyArray). This structure is an array of elements, and each element is 0x20 bytes in size. All these elements are copied to kernel space, and because each element contains a pointer to a user-supplied string, this string must be validated and copied to newly allocated kernel memory as well, making a "deep copy". The element count is also validated so that the deep copy does not run past the end of the buffer. The resulting "deep copy" buffer is returned via the 3rd argument, outDeepCopyArray, and saved locally on the stack.

After this, CQueue::PutNewPacket() is called to send the user-requested data, and once it finishes a call to ACFreeDeepCopyQueueFormat(outDeepCopyArray, ElementsCount) is made.

Here is the bug: the 2nd argument is taken directly from the userspace input buffer, at offset 0x248. ElementsCount is now used as a counter to iterate over the deep copy buffer and perform the clean-up logic. Because ElementsCount is fetched from userspace you fully control it, and a pretty wide race window opens between the allocation flow (good) and the free flow (bad). Winning the race and changing ElementsCount to a big value causes an out-of-bounds free of random pointers. ACFreeDeepCopyQueueFormat() checks the type of each element, and if it is 3, 6 or 8 it frees the pointer at offset +0x8, the next QWORD. This free logic "expects" to find a pointer to a previously allocated string that needs to be freed.
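
The double-fetch pattern itself is easy to show in isolation. The following is a minimal user-mode sketch, not the driver's code: user_params_model stands in for the userspace input buffer (only the count at offset 0x248 is modelled), the race_hook callback plays the role of a second thread that rewrites the buffer while the message is being processed, and all function names are hypothetical. The single-fetch variant is the usual capture-once pattern and is not claimed to be how the bug was actually patched.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the userspace input buffer: only ElementsCount is modelled.
   volatile, because another thread may change it at any time. */
struct user_params_model {
    volatile unsigned elements_count;
};

/* Plays the role of a racing thread that edits the user buffer. */
typedef void (*race_hook)(struct user_params_model *user);

/* Vulnerable shape: the count is read from user memory twice.
   Returns the count that is handed to the clean-up routine. */
unsigned send_message_double_fetch(struct user_params_model *user,
                                   race_hook other_thread,
                                   unsigned *allocated)
{
    *allocated = user->elements_count;   /* fetch #1: sizes the deep copy */
    if (other_thread != NULL)
        other_thread(user);              /* window: the send happens here */
    return user->elements_count;         /* fetch #2: drives the clean-up */
}

/* Capture-once shape: the count is read a single time and reused. */
unsigned send_message_single_fetch(struct user_params_model *user,
                                   race_hook other_thread,
                                   unsigned *allocated)
{
    unsigned captured = user->elements_count;   /* the only fetch */

    *allocated = captured;
    if (other_thread != NULL)
        other_thread(user);
    return captured;
}

/* Racing thread for the demo: bumps the count after the allocation. */
void grow_count(struct user_params_model *user)
{
    user->elements_count = 0x1000;
}
```

Starting from a count of 8, the double-fetch variant allocates for 8 elements but cleans up 0x1000 of them, while the single-fetch variant cleans up exactly the 8 it allocated.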

Exploitation ideas and challenges
To give Windows users time to update their systems, only the general steps are discussed.
The bug allows you to free pointers that are located after the "deep copy" array of elements. The out-of-bounds elements are treated as 0x20 bytes in size. The only precondition is that the value at offset 0x0 of the (fake) element is 3, 6 or 8.
The main idea is to cause an implicit use-after-free by freeing some known object.
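
The following sketch shows how memory after the deep copy array would be interpreted by the clean-up loop. It is a model under stated assumptions: only the element size (0x20), the type values (3, 6, 8) and the pointer offset (+0x8) come from the text. The width of the type field and the contents of the remaining bytes are unknown here and are placeholders, and all names are hypothetical. The out-of-bounds elements are modelled as extra entries of the same array so that the code stays well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of one 0x20-byte element as the clean-up loop sees it. */
struct qf_element_model {
    uint32_t type;              /* +0x00: freed if 3, 6 or 8 (width assumed) */
    uint32_t unknown_04;        /* +0x04: not described in the text */
    uint64_t string_ptr;        /* +0x08: the pointer that gets freed */
    uint8_t  unknown_10[0x10];  /* +0x10: not described in the text */
};

/* True when the clean-up loop would free this element's pointer. */
bool element_would_free(const struct qf_element_model *e)
{
    return e->type == 3 || e->type == 6 || e->type == 8;
}

/* Walks `count` elements and records every pointer that would be freed.
   Returns the number of pointers written to `freed`. */
size_t collect_freed_pointers(const struct qf_element_model *elements,
                              size_t count,
                              uint64_t *freed, size_t freed_cap)
{
    size_t i, n = 0;

    for (i = 0; i < count; i++)
        if (element_would_free(&elements[i]) && n < freed_cap)
            freed[n++] = elements[i].string_ptr;
    return n;
}
```

With the honest count only pointers belonging to the deep copy are collected. With an inflated count, any neighbouring 0x20-byte slot whose first field is 3, 6 or 8 contributes the value at its +0x8 offset to the list of freed pointers.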

Three main tasks:
1. How to get controlled data exactly after the "deep copy" buffer.
2. What pointer/object to free.
3. What should I do with the freed object.

Task 1:
The "deep copy" buffer is allocated from the paged pool, which means the controlled fake elements also need to be allocated from the paged pool. The good news is that the size of the "deep copy" buffer is a multiple of 0x20, so you largely control the bucket in which the original allocation lands. For example, if you specify at offset 0x248 that the array has 8 elements, a buffer of 0x20*8=0x100 bytes in total is allocated from the kernel paged pool. This is a big advantage, because you can pick the count so that the buffer lands in the same bucket as your fake controlled objects, for example objects of exactly 0x100 bytes.

For spraying controlled data you can use WNF objects, which give full control of the data except for the first 0x10 bytes. The challenge is how to place the fake data exactly after the "deep copy" buffer: for these allocation sizes the Windows kernel heap allocator fully randomizes the placement of 0x100 chunks, so techniques such as making holes for allocations will fail. You can read Saar Amar's overview of LFH here (5).

Task 2:
Here you want to choose an interesting object to free in order to cause a use-after-free. Such an object could be an I/O ring or an ALPC reserved blob. Both of these objects can be used to construct an arbitrary write primitive if you can free them and allocate your controlled data in their place. At this stage it doesn't matter whether the object to be freed is in the paged pool or not, because the mqac bug by design lets you free any pointer.
The challenge is how to learn the address of the object you want to force-free. If you run in a Medium integrity process this is an easy task, but as a Low integrity user you need to construct a read primitive to leak the address of the object to be freed. For this purpose, you might consider first freeing a pipe attribute object, which helps you construct a read primitive and leak the address of the I/O ring or ALPC blob object that you want to free.

Task 3:
Once you free the I/O ring object you can use Yarden Shafir's arbitrary kernel write technique from (6).
An I/O ring allows user-mode code to queue many I/O operations and submit them in one shot. An I/O ring can also use preregistered buffers as input for its operations. At offset 0xB8 there is an IOP_MC_BUFFER_ENTRY* RegBuffers member, which points to an array of preregistered buffers. Once you free the I/O ring and allocate your controlled data in its place, you can make RegBuffers point to your userspace array of controlled data. Why is this useful? Because the I/O ring logic dereferences this array to determine where the next I/O operation reads from or writes to. The logic takes the WhereToWrite value and uses it as the destination for the write.

Conclusion
The first vulnerability is a use-after-free caused by a tight race condition in which the filter is freed by one flow while it is used by another. The freeing flow takes an exclusive lock, but the flows that use this filter do a poor job of locking it. It seems that when this specific filter was added to the code, the programmer misunderstood how it should be done correctly, which is quite understandable because the other filters use an unusual synchronization scheme.
The second vulnerability is a double-fetch race that gives you the ability to free arbitrary objects. The code relies on the userspace input buffer remaining unchanged during the execution of the function. This bug is very powerful and might be exploited in different ways.
For the first bug, I shared my approach to the research and how I managed to identify the different flows that needed to be examined. I hope it will help other researchers to research and publish their own write-ups about Windows kernel bugs.
For questions and comments about the writeup you can reach me at
