Benefit is claimed under 35 U.S.C. 119(a)-(d) to Foreign Application Serial No. 201941002756 filed in India entitled “DYNAMIC DISCOVERY OF INTERNAL KERNEL FUNCTIONS AND GLOBAL DATA”, on Jan. 23, 2019, by VMWARE, INC., which is herein incorporated in its entirety by reference for all purposes.
One challenge in implementing hypervisor-based security protection for a guest operating system (OS) in a virtual machine (VM) is the lack of guest OS context. At the hypervisor layer, raw guest memory can be accessed, but it is non-trivial to decode the memory to determine where the guest OS keeps internal data structures such as the process list, thread list, system call table, internal locks, and global internal variables and data. It is also difficult to determine the offsets of various fields in a data structure because the offsets can change from one version of the guest OS to another. Further, it is desirable for the hypervisor to gain control when the guest kernel is executing certain internal functions. However, it is non-trivial to decode the raw guest memory to find the addresses of these internal functions. Modern OS security techniques such as address space layout randomization (ASLR) make it even more difficult to decode the raw guest memory.
In the following detailed description, reference is made to the accompanying drawings, which form a part hereof. In the drawings, similar symbols typically identify similar components, unless context dictates otherwise. The illustrative embodiments described in the detailed description, drawings, and claims are not meant to be limiting. Other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented here. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the Figures, can be arranged, substituted, combined, and designed in a wide variety of different configurations, all of which are explicitly contemplated herein.
VM 106 represents a software implementation of a physical machine. The hypervisor allocates virtual resources to VM 106 to support a guest operating system (OS) 110 with a guest kernel 112 running on the VM and at least one guest application 114 running on the guest OS. The virtual resources may include virtual (guest) memory, virtual processors, virtual local storage, and virtual network interface cards. Guest OS 110 may be implemented using any suitable operating system, such as Microsoft Windows, Linux, etc.
Guest kernel 112 typically exports functionality in the form of (1) system calls for user and kernel modes, and (2) exported functions and global data for the kernel mode. For example, the NTOSKRNL.EXE kernel on the Windows platform exports 400+ system calls and 2000+ exported functions and global data.
Guest kernel 112 also has thousands of internal functions and global data. Unlike system calls, exported functions, and exported global data, the internal functions and global data cannot be discovered by walking the system call table and the export table in the PE image of guest kernel 112.
In examples of the present disclosure, a method and apparatus are provided for a hypervisor to dynamically discover certain internal functions and global data of interest in a guest OS kernel in a virtual machine (VM). In some examples of the present disclosure, a tool disassembles the machine code of at least one kernel system call or exported function in the PE disk image of the OS kernel. The tool uses a program database (PDB) file to produce assembly code annotated with the names of system calls, functions, and variables. A computer engineer examines the assembly code to find at least one internal function or global data (including internal data structures) of interest. For each internal function or global data of interest, the computer engineer determines a pattern that identifies the internal function or global data from memory references in the assembly code. In some examples of the present disclosure, the computer engineer then creates an instrumentation in VMI module 108 (
In block 302, the instrumentation locates a kernel exported system call or function in image 200 (
In block 304, the instrumentation disassembles machine code of the kernel exported system call or function in the image into assembly code. Block 304 may be followed by block 306.
In block 306, the instrumentation matches a pattern against memory references in the assembly code. When the instrumentation detects the pattern from the memory references, block 306 may be followed by block 308.
In block 308, the instrumentation determines the internal address information of guest kernel 112 from the assembly code.
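As a minimal, non-limiting sketch, blocks 302 to 308 of method 300 can be pictured as a single bounded loop. Everything named below is an assumption made for illustration and does not appear in the disclosure: the `Insn` tuple stands in for whatever a real disassembler returns, `exports` stands in for the addresses recovered from the system call table and export table, and `disassemble` and `match` are hypothetical callables supplied by a particular instrumentation.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    """One decoded instruction (hypothetical, simplified representation)."""
    address: int            # guest virtual address of the instruction
    mnemonic: str           # e.g. "mov", "lea", "call"
    operands: Tuple = ()    # decoded operands, e.g. ("r8", 0x5000)


def discover(
    exports: Dict[str, int],
    disassemble: Callable[[int], Iterable[Insn]],
    symbol: str,
    match: Callable[[Insn], Optional[int]],
    max_insns: int = 512,
) -> Optional[int]:
    """Walk blocks 302-308 once and return the discovered address or offset."""
    start = exports.get(symbol)              # block 302: locate the export
    if start is None:
        return None
    # Block 304: disassemble from the export. The walk is bounded so that a
    # pattern that never appears cannot cause an unbounded scan.
    for insn in islice(disassemble(start), max_insns):
        found = match(insn)                  # block 306: test the pattern
        if found is not None:
            return found                     # block 308: extracted value
    return None
```

The bound `max_insns` is a design choice of this sketch rather than a requirement of the method; it simply keeps a failed match from running past the end of the function being examined.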
First and Second Instrumentations
In some examples of the present disclosure, an instrumentation in VMI module 108 (
To demonstrate this instrumentation, consider the disassembled code, listed in Table 1, of the exported “NtQueryKey” system call for querying a registry key. This system call internally references a registry key object and later checks whether a registry lock is acquired in shared mode.
NtQueryKey is a system call, so it can be discovered using the PE memory image of guest kernel 112. “ObReferenceObjectByHandleWithTag” and “ExIsResourceAcquiredSharedLite” are exported functions, so they also can be discovered in the export table of the PE memory image. “CmKeyObjectType” and “CmpRegistryLock” are internal global data that cannot be easily discovered in the PE memory image.
A first instrumentation may be created to find the internal CmKeyObjectType global data, and a second instrumentation may be created to find the internal CmpRegistryLock global data.
In block 402, the first or the second instrumentation locates a kernel exported system call or function in image 200 (
In block 404, starting from the address of the NtQueryKey system call determined in block 402, the first or the second instrumentation disassembles the machine code in the PE memory image 200 into assembly code. Block 404 corresponds to block 304 of method 300. Block 404 may be followed by block 406.
In block 406, the first or the second instrumentation matches a pattern against memory references in the assembly code. As the first instrumentation disassembles the code of the NtQueryKey system call, it tracks what is loaded in the “R8” register and looks for a call to the exported ObReferenceObjectByHandleWithTag function, which is identified by its address determined in block 402. As the second instrumentation disassembles the code of the NtQueryKey system call, it tracks what is loaded in the “RCX” register and looks for a call to the exported ExIsResourceAcquiredSharedLite function, which is identified by its address determined in block 402. When the first instrumentation detects the call to the exported ObReferenceObjectByHandleWithTag function, or the second instrumentation detects the call to the exported ExIsResourceAcquiredSharedLite function, block 406 may be followed by block 408. Block 406 corresponds to block 306 of method 300.
In block 408, the first or the second instrumentation determines the internal address information of guest kernel 112 from the assembly code. The first instrumentation parses what is stored in the R8 register as the address of the internal CmKeyObjectType global data. The second instrumentation parses what is stored in the RCX register as the address of the internal CmpRegistryLock global data. Block 408 corresponds to block 308 of method 300.
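The register tracking of blocks 406 and 408 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the `Insn` tuple is a hypothetical, simplified stand-in for real disassembler output, and `addr` is assumed to already hold the resolved memory address that a load references or that a call targets. Clearing the tracked value at an unrelated call is an added precaution of this sketch, based on RCX and R8 being volatile registers in the Microsoft x64 calling convention.

```python
from typing import Iterable, NamedTuple, Optional


class Insn(NamedTuple):
    """Simplified decoded instruction (hypothetical representation)."""
    mnemonic: str               # "mov", "lea", "call", ...
    reg: Optional[str] = None   # destination register of a load
    addr: Optional[int] = None  # referenced memory address or call target


def arg_loaded_before_call(
    insns: Iterable[Insn], reg: str, callee: int
) -> Optional[int]:
    """Return the address last loaded into `reg` before a call to `callee`."""
    loaded = None
    for insn in insns:
        if insn.mnemonic in ("mov", "lea") and insn.reg == reg:
            loaded = insn.addr          # remember the latest load into reg
        elif insn.mnemonic == "call":
            if insn.addr == callee:
                return loaded           # value parsed in block 408
            loaded = None               # an unrelated call may clobber reg
    return None
```

With `reg="r8"` and `callee` set to the address of ObReferenceObjectByHandleWithTag found in block 402, the returned value would correspond to the address of CmKeyObjectType; the second instrumentation would use `reg="rcx"` and the address of the exported lock-query function instead.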
Third Instrumentation
In some examples of the present disclosure, an instrumentation in VMI module 108 (
To demonstrate this instrumentation, consider the disassembled code, listed in Table 2, of the kernel system call “NtSuspendProcess” for suspending a process.
NtSuspendProcess is a system call, so it is easily discovered in the PE memory image of guest kernel 112. “PsProcessType” is exported global data, so it also can be discovered from the export table of the PE memory image. “ObpReferenceObjectByHandleWithTag” is an internal function that cannot be easily discovered in the PE memory image.
A third instrumentation may be created to find the internal ObpReferenceObjectByHandleWithTag function.
In block 502, the third instrumentation locates a kernel exported system call or function in PE memory image 200 (
In block 504, starting from the address of the NtSuspendProcess system call determined in block 502, the third instrumentation disassembles the machine code in the PE memory image 200 into assembly code. Block 504 corresponds to block 304 of method 300. Block 504 may be followed by block 506.
In block 506, the third instrumentation matches a pattern against memory references in the assembly code. As the third instrumentation disassembles the code of the NtSuspendProcess system call, it tracks what is loaded in the R8 register and looks for the address of the exported PsProcessType global data (determined in block 502) being loaded in the R8 register. When it detects the address of the exported PsProcessType global data is loaded in the R8 register, the third instrumentation looks for the next function call, which is a call to the ObpReferenceObjectByHandleWithTag internal function. When the third instrumentation detects the next function call, block 506 may be followed by block 508. Block 506 corresponds to block 306 of method 300.
In block 508, the third instrumentation determines the internal address information of guest kernel 112 from the assembly code. The third instrumentation parses the call destination of the next function call as the address of the ObpReferenceObjectByHandleWithTag internal function. Block 508 corresponds to block 308 of method 300.
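The pattern of blocks 506 and 508 inverts the previous one: a known exported address in a register identifies the call that follows it. A minimal sketch, assuming the same kind of hypothetical simplified `Insn` tuple in place of real disassembler output, with `addr` holding an already resolved load address or call target:

```python
from typing import Iterable, NamedTuple, Optional


class Insn(NamedTuple):
    """Simplified decoded instruction (hypothetical representation)."""
    mnemonic: str               # "mov", "lea", "call", ...
    reg: Optional[str] = None   # destination register of a load
    addr: Optional[int] = None  # referenced memory address or call target


def call_after_load(
    insns: Iterable[Insn], reg: str, known_addr: int
) -> Optional[int]:
    """Return the target of the first call made while `reg` holds `known_addr`."""
    armed = False
    for insn in insns:
        if insn.mnemonic in ("mov", "lea") and insn.reg == reg:
            # Arm on the known export; disarm if reg is reloaded with
            # anything else before a call is seen.
            armed = insn.addr == known_addr
        elif insn.mnemonic == "call" and armed:
            return insn.addr            # call destination parsed in block 508
    return None
```

Here `known_addr` would be the address of PsProcessType from block 502 and `reg` would be `"r8"`, so the returned call target would correspond to ObpReferenceObjectByHandleWithTag.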
Fourth Instrumentation
In some examples of the present disclosure, an instrumentation in VMI module 108 (
To demonstrate this instrumentation, consider the disassembled code, listed in Table 3, of the exported function “PsSetCreateThreadNotifyRoutine,” which tracks the number of thread creation callbacks registered by kernel components.
PsSetCreateThreadNotifyRoutine is an exported function so it is easily discovered in the PE memory image of guest kernel 112. “PspCreateThreadNotifyRoutineCount” is an internal global data that cannot be easily discovered in the PE memory image.
A fourth instrumentation may be created to find the internal global data PspCreateThreadNotifyRoutineCount.
In block 602, the fourth instrumentation locates a kernel exported system call or function in PE memory image 200 (
In block 604, starting from the address of the exported function PsSetCreateThreadNotifyRoutine determined in block 602, the fourth instrumentation disassembles the machine code in the PE memory image 200 into assembly code. Block 604 corresponds to block 304 of method 300. Block 604 may be followed by block 606.
In block 606, the fourth instrumentation matches a pattern against memory references in the assembly code. As the fourth instrumentation disassembles the code of the exported function PsSetCreateThreadNotifyRoutine, it looks for a “lock add” instruction with a second operand of a constant “1.” When the fourth instrumentation detects the instruction with the second operand of 1, block 606 may be followed by block 608. Block 606 corresponds to block 306 of method 300.
In block 608, the fourth instrumentation determines the internal address information of guest kernel 112 from the assembly code. The fourth instrumentation parses the first operand of the instruction, which provides the address of internal global data PspCreateThreadNotifyRoutineCount. Block 608 corresponds to block 308 of method 300.
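To make blocks 606 and 608 concrete at the byte level, the sketch below recognizes one specific x86-64 encoding of a “lock add” with an immediate of 1 and resolves its memory operand. The encoding choice is an assumption for illustration: a 32-bit RIP-relative destination with an 8-bit immediate (`F0 83 05 disp32 01`). A real kernel build may encode the instruction differently, and a full disassembler would normally perform this decoding. The function name and parameters are hypothetical, and `offset` is assumed to be an instruction boundary already established by the disassembly of block 604.

```python
import struct
from typing import Optional

# LOCK prefix, ADD r/m32 with imm8 (opcode 83 /0), ModRM 05 = [rip + disp32]
LOCK_ADD_RIP_IMM8 = b"\xf0\x83\x05"
INSN_LEN = 8  # 3 prefix/opcode/ModRM bytes + 4 displacement bytes + 1 immediate


def lock_add_one_target(code: bytes, offset: int, code_base: int) -> Optional[int]:
    """Return the address incremented by `lock add [rip+disp32], 1` at `offset`.

    `code_base` is the guest virtual address of code[0]. Returns None when
    the bytes at `offset` are not this exact encoding.
    """
    insn = code[offset:offset + INSN_LEN]
    if len(insn) < INSN_LEN or not insn.startswith(LOCK_ADD_RIP_IMM8):
        return None
    if insn[7] != 0x01:                 # second operand must be the constant 1
        return None
    (disp,) = struct.unpack_from("<i", insn, 3)   # signed 32-bit displacement
    next_rip = code_base + offset + INSN_LEN      # RIP points past the instruction
    return next_rip + disp                        # first operand's address
```

The displacement is relative to the address of the following instruction, which is why the instruction length is added before the displacement is applied.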
Fifth Instrumentation
Guest kernel 112 (
In some examples of the present disclosure, an instrumentation in VMI module 108 determines the offset of a field in an internal reference (e.g., an internal data structure) by locating an exported reference (e.g., an exported function) that returns a value at the offset in the internal data structure.
To demonstrate this instrumentation, consider the disassembled code, listed in Table 4, of the exported function “PsGetCurrentProcessId” for returning the process identifier (pid) of a process.
PsGetCurrentProcessId is an exported function, so it is easily discovered in the PE memory image of guest kernel 112. “PEPROCESS” is an internal data structure that is dynamically created for each process in the system, and its address is easily discovered using the PE memory image, such as by finding an exported function that can be called to get the current PEPROCESS (e.g., PsGetCurrentProcess).
A fifth instrumentation may be created to find the offset of the pid field in the dynamically created internal data structure PEPROCESS.
In block 702, the fifth instrumentation locates a kernel exported system call or exported function in PE memory image 200 (
In block 704, starting from the address of the exported function PsGetCurrentProcessId determined in block 702, the fifth instrumentation disassembles the machine code in the PE memory image 200 into assembly code. Block 704 corresponds to block 304 of method 300. Block 704 may be followed by block 706.
In block 706, the fifth instrumentation matches a pattern against memory references in the assembly code. As the fifth instrumentation disassembles the code of the exported function PsGetCurrentProcessId, it tracks what is read relative to the RCX register; the displacement of that read is the offset of the field in the internal data structure. Block 706 may be followed by block 708. Block 706 corresponds to block 306 of method 300.
In block 708, the fifth instrumentation determines the internal address information of guest kernel 112 from the assembly code. The fifth instrumentation parses the offset relative to the RCX register, which provides the offset of the pid field in the internal data structure PEPROCESS. Block 708 corresponds to block 308 of method 300.
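The offset extraction of blocks 706 and 708 can be illustrated with a narrow byte-level decoder. As an assumption for illustration, the sketch handles only a 64-bit register load of the form `mov r64, [rcx + disp]` encoded as `48 8B /r`, that is, with the REX.W prefix and one of the eight legacy destination registers. Other encodings, and the decision of which load in the function is the one of interest, would be handled by the surrounding instrumentation. The function name is hypothetical, and the offsets used when exercising it are arbitrary test values, not actual PEPROCESS field offsets.

```python
import struct
from typing import Optional


def rcx_relative_offset(insn: bytes) -> Optional[int]:
    """Return disp for `mov r64, [rcx + disp]`, or None for other bytes."""
    # 0x48 is REX.W with no register-extension bits; 0x8B is MOV r64, r/m64.
    if len(insn) < 3 or insn[0] != 0x48 or insn[1] != 0x8B:
        return None
    modrm = insn[2]
    mod, rm = modrm >> 6, modrm & 0b111
    if rm != 0b001:                     # rm=001 selects RCX as the base
        return None
    if mod == 0b00:                     # [rcx] with no displacement
        return 0
    if mod == 0b01 and len(insn) >= 4:  # signed 8-bit displacement
        return struct.unpack_from("<b", insn, 3)[0]
    if mod == 0b10 and len(insn) >= 7:  # signed 32-bit displacement
        return struct.unpack_from("<i", insn, 3)[0]
    return None                         # mod=11 is register-to-register
```

Because the offset is read from the guest kernel's own code rather than from a table keyed by OS version, the same decoder would continue to work when the field moves between kernel builds, provided the access pattern itself is unchanged.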
From the foregoing, it will be appreciated that various embodiments of the present disclosure have been described herein for purposes of illustration, and that various modifications may be made without departing from the scope and spirit of the present disclosure. Accordingly, the various embodiments disclosed herein are not intended to be limiting, with the true scope and spirit being indicated by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
201941002756 | Jan 2019 | IN | national |
Number | Date | Country | |
---|---|---|---|
20200233686 A1 | Jul 2020 | US |