This application claims priority to Chinese Application No. 202310781080.8 filed on Jun. 28, 2023, which is incorporated herein by reference in its entirety.
This application relates to the field of computer technology, particularly to a method and a system for accessing registers of a device.
Registers are core components of a central processing unit (CPU). They are high-speed storage components used to temporarily store instructions, data, and addresses. When executing instructions, the CPU can store data corresponding to the instructions in the registers. Registers greatly improve the efficiency of computing and memory access.
In computing systems, slow buses such as I2C/SMBus may be used to manage devices. Users can operate these devices by programming or configuring the registers of the devices on these buses. From the perspective of specific device functions, the configuration of the registers often follows relatively fixed patterns: for example, when enabling a certain function of a device, a fixed group of registers needs to be configured in a fixed order; when disabling a certain function of a device, a fixed group of registers also needs to be configured in a fixed order. The registers can be roughly divided into two types. One is a register that includes only one storage unit, which exclusively occupies the address of the register, and the other is a register whose register space is divided into multiple storage units, which share the address of the register. In scenarios where batch operations are performed on certain functions of the device, the registers that include multiple storage units may be frequently configured.
Because slow buses are used, the overall performance of the system is greatly affected when accessing devices on these buses (hereinafter referred to as low-speed devices). When performing batch operations of certain functions, some registers that are divided into multiple storage units are frequently accessed, which further affects the performance of the system.
This section aims to provide background or context for the implementation of the application stated in the claims. The description here should not be considered prior art merely because it is included in this section.
An object of this application is to provide a method and a system for accessing registers of a device, in order to reduce the number of accesses to these low-speed devices and improve system performance.
In one aspect, the present application provides a method for accessing registers of a device,
In some embodiments, the method further comprises: detecting whether the register address corresponding to these multiple instructions of the same type is marked as an optimizable address, and merging and optimizing these multiple execution units when determining that the register address is the optimizable address.
In some embodiments, merging and optimizing these multiple execution units comprises: merging these multiple instructions of the same type corresponding to the same register address of the multiple execution units and generating an optimized instruction.
In some embodiments, merging and optimizing these multiple execution units comprises: sorting instructions that do not participate in merging and optimizing and the optimized instruction, so that relative order between any two instructions in the optimized execution units is the same as their relative order in the execution units before optimization.
In some embodiments, each of the instructions in the instruction set comprises an operation field, an address field, and a data field.
In some embodiments, operation fields and address fields of these multiple instructions of the same type are the same, and data fields of these multiple instructions of the same type correspond to different storage units of the register.
In some embodiments, merging multiple instructions of the same type corresponding to the same register address comprises: performing calculation on data in the data fields of these multiple instructions of the same type corresponding to the same register address based on the operation fields of these multiple instructions of the same type to obtain data in the data field of the optimized instruction, the operation field and the address field of the optimized instruction are respectively the same as those in the operation fields and the address fields of these multiple instructions of the same type.
In some embodiments, these multiple instructions of the same type are bit operation instructions of the same type.
In some embodiments, the instruction set is created in a domain specific language manner.
In another aspect, the present application provides a system for accessing registers of a device, wherein the device is connected to a CPU via a bus, the device comprises multiple registers, each of which corresponds to a different address, and each register comprises one or more storage units which share a register address, the access system comprises:
A large number of technical features are described in the specification of the present application, and are distributed in various technical solutions. If every possible combination (i.e., technical solution) of the technical features of the present application were listed, the description would become too long. In order to avoid this problem, the various technical features disclosed in the above summary of the present application, the technical features disclosed in the various embodiments and examples below, and the various technical features disclosed in the drawings can be freely combined with each other to constitute various new technical solutions (all of which are considered to have been described in this specification), unless a combination of such technical features is not technically feasible. For example, features A+B+C are disclosed in one example, and features A+B+D+E are disclosed in another example, while features C and D are equivalent technical means that perform the same function, so that technically only one of them can be chosen and they cannot be adopted at the same time, and feature E can technically be combined with feature C. Then, the A+B+C+D scheme should not be regarded as having been described because it is technically infeasible, whereas the A+B+C+E scheme should be regarded as having been described.
In the following description, numerous technical details are set forth in order to provide the readers with a better understanding of the present application. However, those skilled in the art can understand that the technical solutions claimed in the present application can be implemented without these technical details and various changes and modifications based on the following embodiments.
In order to make the objects, technical solutions and advantages of the present application clearer, embodiments of the present application will be further described in detail below with reference to the accompanying drawings.
The present application relates to an access method for registers in a device, wherein the device is connected to a CPU via a bus, and the device includes multiple registers, each of which corresponds to a different address. It should be noted that the device described herein refers to a low-speed device based on slow protocols (such as I2C, SPI, etc.). From the perspective of register organization, the registers included in the low-speed device can be divided into two types. One includes only one storage unit, which exclusively occupies a single address, and this type of register is also called a register occupying an independent addressing unit; the other includes multiple storage units, which share a single address, with each storage unit occupying only one or several bits of the register, and this type of register is also called a register sharing an addressing unit. As shown in
In the implementation of this application, the flowchart of the method for accessing registers of a device is shown in
Step 101, creating an instruction set for accessing the device.
In one embodiment, the instruction set is created in a domain specific language (DSL) manner, which is a defined instruction set used to access the registers in the device. The instruction set may include but is not limited to the following instructions: write instructions, read instructions, bit operation instructions (such as, OR instruction, AND instruction), null operation instructions, etc. Table 1 provides examples of some instructions and briefly describes their meanings.
It can be seen that most instructions include an operation field, an address field, and a data field. Taking the instruction “REG_WRITE reg_x, data” as an example, “REG_WRITE” is the operation field used to indicate an operation type (that is, write operation), “reg_x” is the address field used to indicate an operation address, and “data” is the data field used to indicate an object being operated on.
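The three-field layout described above can be illustrated with a minimal Python sketch. The class name Instruction and the helper parse are hypothetical names introduced only for illustration; the textual form "OP addr, data" is taken from the example instruction above, and nothing here is asserted to be the actual representation used by the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    op: str    # operation field, e.g. "REG_WRITE"
    addr: str  # address field, e.g. "reg_x"
    data: str  # data field, kept as text in this sketch


def parse(text: str) -> Instruction:
    """Split an instruction of the form 'OP addr, data' into its three fields."""
    op, rest = text.split(maxsplit=1)
    addr, data = (part.strip() for part in rest.split(",", 1))
    return Instruction(op, addr, data)


inst = parse("REG_WRITE reg_x, data")
print(inst.op, inst.addr, inst.data)
```

Keeping the fields separate is what later allows instructions to be compared by operation field and address field alone.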
Step 102, in response to operation requests from the CPU to the device, selecting instructions that implement the operation requests from the instruction set, so as to generate execution units with a minimum execution granularity corresponding to the operation requests.
Referring to
When receiving an operation request from the CPU to a low-speed device (such as enabling a certain function of the low-speed device), the corresponding instructions can be selected from a created instruction set to form the execution unit with the minimum execution granularity corresponding to the operation request. Continuing to refer to
From this, it can be seen that the embodiment of present application uses the corresponding instructions in the above instruction set to “describe” the configuration operations on the registers. Each group of instructions that describes “the minimum execution granularity” constitutes one execution unit.
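As a rough sketch of how an operation request might be "described" as an execution unit, the Python below builds one ordered instruction list per function index. The AND and OR masks reproduce the values used later in this description (0x1E, 0x1D, 0x1B and 0x1, 0x2, 0x4). Everything else is an assumption made for illustration: the function name build_enable_unit, the five-bit register width inferred from those masks, and the middle (II) instruction, whose actual content is given only in the drawings.

```python
REG_WIDTH_MASK = 0x1F  # assumption: register E holds five single-bit storage units


def build_enable_unit(k):
    """Return the execution unit (ordered instruction list) for function index k."""
    bit = 1 << k
    return [
        ("REG-AND", "E-addr", REG_WIDTH_MASK & ~bit),  # I: AND mask with bit k cleared
        ("REG_WRITE", f"F{k}-addr", 0x01),             # II: hypothetical placeholder
        ("REG-OR", "E-addr", bit),                     # III: OR mask with bit k set
    ]


units = [build_enable_unit(k) for k in range(3)]
for unit in units:
    print([(op, addr, hex(data)) for op, addr, data in unit])
```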
Step 103, detecting the execution units, and when detecting that there are multiple execution units corresponding to the same type of operation request and multiple execution units include multiple instructions of the same type corresponding to the same register address, merging and optimizing the multiple execution units to generate optimized execution units, and sending the optimized execution units to the bus.
In the scenario where there is only one execution unit, because the execution unit corresponds to the “minimum execution granularity”, each instruction in the execution unit points to a different register address, and the instructions in the execution unit cannot be merged for execution. Therefore, for the scenario of a single execution unit, the instructions in the execution unit can be directly sent to the bus in the execution order, and then transmitted by the bus to the corresponding low-speed device. In scenarios where there are multiple execution units (such as batch operations of certain functions), there may be multiple instructions of the same type pointing to the same register address. If these multiple instructions of the same type can be merged for execution, the number of accesses to the register can be reduced, thereby improving system performance. For example, when three groups of functions of the low-speed device of
It can be seen that the I instructions in the three execution units mentioned above, namely “REG-AND E-addr, 0x1E”, “REG-AND E-addr, 0x1D”, and “REG-AND E-addr, 0x1B”, all have operation fields of “REG-AND” and address fields of “0xE”, and the data fields of the I instructions correspond to different storage units of the register E. These three instructions belong to multiple instructions of the same type pointing to the same register address. Similarly, the III instructions in the three execution units mentioned above, namely “REG-OR E-addr, 0x1”, “REG-OR E-addr, 0x2”, and “REG-OR E-addr, 0x4”, also belong to the multiple instructions of the same type pointing to the same register address.
When there are multiple execution units and multiple instructions of the same type in these execution units point to the same register address, these instructions are merged and optimized, after it is determined that they comply with the optimization rules, to form a merged and optimized instruction (also referred to herein as a “merged instruction” or an “optimized instruction”). Determining whether the instructions comply with the optimization rules includes determining whether merging these instructions will affect the final execution effect. Taking the above three execution units as an example, assuming that the multiple instructions of the same type pointing to the same register address in the above three execution units can be merged and optimized, the merged execution units are shown in
Wherein, the I instructions in the three execution units, namely “REG-AND E-addr, 0x1E”, “REG-AND E-addr, 0x1D”, and “REG-AND E-addr, 0x1B”, are merged into the instruction “REG-AND E-addr, 0x18”. For the AND instruction REG-AND, an AND operation is performed on the data values 0x1E, 0x1D, and 0x1B in the data fields of the three instructions before merging to obtain the data value 0x18 in the data field of the merged instruction. It can be understood that the operation field and the address field of the merged instruction are the same as those of the three instructions before merging. The data field of the merged instruction is calculated based on the data fields of the three instructions before merging (that is, 0x1E, 0x1D, and 0x1B), ensuring that the result of applying the merged instruction is equivalent to the result of applying the three instructions before merging. Similarly, the III instructions of the three execution units, namely “REG-OR E-addr, 0x1”, “REG-OR E-addr, 0x2”, and “REG-OR E-addr, 0x4”, are merged into the instruction “REG-OR E-addr, 0x7”. For the OR instruction REG-OR, an OR operation is performed on the data values 0x1, 0x2, and 0x4 in the data fields of the three instructions before merging to obtain the data value 0x7 in the data field of the merged instruction.
After merging and optimizing, the execution positions of some instructions change. For example, the I instruction (REG-AND E-addr, 0x1D) in the second execution unit before merging is the 4th in the execution order, but after merging, it is the 1st in the execution order. Similarly, the III instruction in the first execution unit before merging is the 3rd in the execution order, but after merging, it is the last in the execution order.
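The arithmetic above can be checked with a short Python sketch. The names COMBINE, merge_data and apply are hypothetical and introduced only for illustration; the sketch also assumes an 8-bit register value when checking that one merged instruction has the same effect as the three original ones applied in sequence.

```python
from functools import reduce
from operator import and_, or_

# Assumption: a merged data field is obtained by folding the original data
# fields with the same bitwise operation that the instruction itself performs.
COMBINE = {"REG-AND": and_, "REG-OR": or_}


def merge_data(op, values):
    """Fold the data fields of same-type instructions into one data field."""
    return reduce(COMBINE[op], values)


def apply(op, reg_value, values):
    """Apply each instruction in turn to a register value."""
    for v in values:
        reg_value = COMBINE[op](reg_value, v)
    return reg_value


merged_and = merge_data("REG-AND", [0x1E, 0x1D, 0x1B])
merged_or = merge_data("REG-OR", [0x1, 0x2, 0x4])
print(hex(merged_and), hex(merged_or))  # 0x18 0x7
```

The equivalence holds because AND and OR are each associative and commutative, so the fold over the masks can be computed before touching the register.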
It is not difficult to find that a relative order of the instructions in each execution unit remains unchanged before and after merging. For example, as shown in
Whether this ordering characteristic will affect the final execution effect is determined by the optimization rules. Specifically, the embodiment of the present application predefines a group of optimization rules based on the characteristics of each chip/device, in which each register address is marked. If a certain register address is marked as an optimizable address, it indicates that merging and optimizing the multiple instructions of the same type pointing to the register address will not affect the final execution effect. For example, as long as the relative order in which the instructions of each execution unit are executed remains unchanged after merging, the desired execution effect can still be obtained. Therefore, after detecting the existence of multiple instructions of the same type pointing to the same register address, the present embodiment can check the optimization rules to determine whether the register address corresponding to the multiple instructions of the same type is marked as optimizable. When it is determined that the register address is an optimizable address, merging and optimization can be performed.
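The rule check described above might look like the following Python sketch. The rule table OPTIMIZABLE_ADDRS, the set BIT_OPS, the function mergeable_groups and the address "G-addr" are all hypothetical; in practice the marking would come from the per-chip optimization rules, whose format this description does not specify.

```python
from collections import defaultdict

OPTIMIZABLE_ADDRS = {"E-addr"}   # hypothetical per-device rule table
BIT_OPS = {"REG-AND", "REG-OR"}  # instruction types considered for merging


def mergeable_groups(units):
    """Return {(op, addr): [(unit index, instruction index), ...]} for merge candidates."""
    groups = defaultdict(list)
    for u, unit in enumerate(units):
        for i, (op, addr, _data) in enumerate(unit):
            if op in BIT_OPS:
                groups[(op, addr)].append((u, i))
    # Keep only groups with several instructions whose address is marked optimizable.
    return {key: where for key, where in groups.items()
            if len(where) > 1 and key[1] in OPTIMIZABLE_ADDRS}


units = [
    [("REG-AND", "E-addr", 0x1E), ("REG-OR", "G-addr", 0x1), ("REG-OR", "E-addr", 0x1)],
    [("REG-AND", "E-addr", 0x1D), ("REG-OR", "G-addr", 0x2), ("REG-OR", "E-addr", 0x2)],
]
print(mergeable_groups(units))
```

In this example the two instructions on "G-addr" are of the same type and share an address, but they are not reported because that address is not marked as optimizable.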
In some embodiments, the same type of instructions mentioned above may include bit operation instructions, such as OR instructions, AND instructions, and shift instructions, etc. That is to say, the present embodiment may choose to merge and optimize multiple bit operation instructions of the same type pointing to the same register address. The following will provide a specific explanation of the merging and optimizing process of the OR instruction and the AND instruction in conjunction with
As shown in
Before merging, for each instruction to be merged in each execution unit, its preceding instructions (instructions executed before the instruction to be merged) and succeeding instructions (instructions executed after the instruction to be merged) are recorded. For example, in the first execution unit, the I instruction has no preceding instruction, and its succeeding instructions are the II instruction and the III instruction; the preceding instructions of the III instruction are the I instruction and the II instruction, and the III instruction has no succeeding instruction. Similarly, in the second execution unit, the I instruction has no preceding instruction, and its succeeding instructions are the II instruction and the III instruction; the preceding instructions of the III instruction are the I instruction and the II instruction, and the III instruction has no succeeding instruction. In the third execution unit, the I instruction has no preceding instruction, and its succeeding instructions are the II instruction and the III instruction; the preceding instructions of the III instruction are the I instruction and the II instruction, and the III instruction has no succeeding instruction.
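The bookkeeping described above can be sketched in a few lines of Python. The function name record_neighbors is hypothetical, and instructions are represented only by their labels I, II and III for brevity.

```python
def record_neighbors(unit):
    """Map each instruction label to (preceding labels, succeeding labels)."""
    return {label: (unit[:i], unit[i + 1:]) for i, label in enumerate(unit)}


unit = ["I", "II", "III"]
neighbors = record_neighbors(unit)
print(neighbors["I"])    # ([], ['II', 'III'])
print(neighbors["III"])  # (['I', 'II'], [])
```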
Afterwards, the instructions to be merged are merged to generate a merged OR instruction. Wherein, an operation field and an address field of the merged OR instruction are the same as those of the OR instructions before merging, while the data field of the merged OR instruction is generated based on all the data fields of the OR instructions before merging. Taking the III instructions in the three execution units mentioned above as an example, the III instructions in the three execution units are the three OR instructions to be merged. Firstly, masks of these three OR instructions are obtained, and OR operation is performed on all the masks obtained to generate a merged mask. The merged mask is the data field of the merged OR instruction. From this, it can be seen that the merged instruction of these three instructions is “REG-OR E-addr, 0x7”. The process of merging AND instructions is similar and will not be repeated here. The merged instruction of the I instructions (i.e. the three AND instructions to be merged) of the three execution units mentioned above is “REG-AND E-addr, 0x18”.
Then, based on the order of each execution unit and the relative order between instructions in each execution unit, all preceding instructions may be moved ahead of the merged instruction, and all succeeding instructions may be moved behind the merged instruction, forming the merged execution units. For example, each I instruction in the three execution units mentioned above has no preceding instruction, so the instruction formed by merging the I instructions in the three execution units is the first in the execution order. Each III instruction in the three execution units mentioned above has no succeeding instruction, so the instruction formed by merging the III instructions in the three execution units is the last in the execution order. The remaining three instructions (i.e. the II instructions in the three execution units) are succeeding instructions of the I instructions and preceding instructions of the III instructions in the three execution units mentioned above. Therefore, the remaining three instructions are between the first and the last in the execution order, and considering the order of each execution unit, the merged execution unit in
Specifically, after merging and optimizing multiple instructions of the same type pointing to the same register address, it is also necessary to sort the instructions that do not participate in merging and optimizing and merged instructions formed after the merging and optimizing based on the order of the respective execution units and the relative order of the instructions among the instructions in each execution unit, so as to ensure that the relative order among instructions remains unchanged.
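One way to realize the merge-then-sort step is sketched below in Python. It is not asserted to be the embodiment's procedure: the sketch substitutes a standard topological sort (Kahn's algorithm) over the intra-unit order, breaking ties by original position, and raises an error when merging would contradict that order. The names COMBINE, OPTIMIZABLE_ADDRS and merge_units, as well as the II instructions (REG_WRITE to A-addr, B-addr and C-addr), are hypothetical placeholders.

```python
import heapq
from functools import reduce
from operator import and_, or_

COMBINE = {"REG-AND": and_, "REG-OR": or_}
OPTIMIZABLE_ADDRS = {"E-addr"}  # hypothetical rule table


def merge_units(units):
    """Merge same-type bit operations on optimizable addresses, keeping intra-unit order."""
    flat = [(u, inst) for u, unit in enumerate(units) for inst in unit]
    node_of, key_node, members = [], {}, {}
    for idx, (_u, (op, addr, _d)) in enumerate(flat):
        if op in COMBINE and addr in OPTIMIZABLE_ADDRS:
            node = key_node.setdefault((op, addr), idx)  # shared node for the group
        else:
            node = idx
        node_of.append(node)
        members.setdefault(node, []).append(idx)
    # Ordering edges: consecutive instructions of the same unit keep their order.
    succs = {n: set() for n in members}
    indeg = {n: 0 for n in members}
    for a in range(len(flat) - 1):
        if flat[a][0] == flat[a + 1][0]:
            na, nb = node_of[a], node_of[a + 1]
            if na != nb and nb not in succs[na]:
                succs[na].add(nb)
                indeg[nb] += 1
    # Kahn's algorithm; the heap prefers the earliest original position.
    ready = [n for n in members if indeg[n] == 0]
    heapq.heapify(ready)
    out = []
    while ready:
        n = heapq.heappop(ready)
        op, addr, _ = flat[n][1]
        values = [flat[i][1][2] for i in members[n]]
        out.append((op, addr, reduce(COMBINE[op], values) if len(values) > 1 else values[0]))
        for m in succs[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                heapq.heappush(ready, m)
    if len(out) != len(members):
        raise ValueError("merging would violate the order inside an execution unit")
    return out


units = [
    [("REG-AND", "E-addr", 0x1E), ("REG_WRITE", "A-addr", 0x01), ("REG-OR", "E-addr", 0x1)],
    [("REG-AND", "E-addr", 0x1D), ("REG_WRITE", "B-addr", 0x01), ("REG-OR", "E-addr", 0x2)],
    [("REG-AND", "E-addr", 0x1B), ("REG_WRITE", "C-addr", 0x01), ("REG-OR", "E-addr", 0x4)],
]
merged = merge_units(units)
for op, addr, data in merged:
    print(op, addr, hex(data))
```

With these inputs, nine instructions reduce to five: the merged AND instruction comes first, the three II instructions follow in unit order, and the merged OR instruction comes last.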
In the implementation method of this application, an instruction set is created for accessing the device. When operating on the device, corresponding instructions are selected from the instruction set and execution units with the minimum execution granularity corresponding to the operation requests are generated. When multiple execution units corresponding to the same type of operation request and including multiple instructions of the same type corresponding to the same register address are detected, the multiple execution units are merged and optimized to generate optimized execution units and sent to the bus. This can reduce the number of accesses to the low-speed devices and improve system performance.
The implementation method of this application also relates to an access system for registers in a device, wherein the device is connected to a CPU via a bus. The device includes multiple registers each corresponding to a different address. Each register includes one or more storage units, and the one or more storage units share a register address. The block diagram of the access system is shown in
The present embodiment is a system embodiment corresponding to the method embodiment described above, and the technical details in the method embodiment described above can be applied to the present embodiment, and the technical details in the present embodiment can be applied to the method embodiment described above.
In summary, in traditional bus access, access requests (configuration operations) to registers are sent directly to the bus; in batch-operation scenarios, this may require frequent accesses to certain registers that include multiple storage units, which reduces the performance of the system. In contrast, the present application first converts access requests to registers into instructions using a predefined instruction set, then merges and optimizes some of the instructions according to the characteristics of the instructions, and then sends the optimized instructions to the bus, which reduces the number of accesses to the corresponding registers and improves the system performance.
It should be noted that those skilled in the art should understand that the functions implemented by the various modules shown in the embodiments of the above access system for the registers in the device can be understood with reference to the relevant description of the foregoing access method for the registers in the device. The functions of each module shown in the implementation of the access system for the registers in the device mentioned above can be achieved through programs (executable instructions) running on a processor, or through specific logic circuits. If the access system for the registers in the device mentioned in the present embodiment is implemented in the form of a software functional module and sold or used as an independent product, it can also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium, and includes several instructions to enable a computer device (which may be a personal computer, server, or network device, etc.) to perform all or part of the methods described in the embodiments of the present invention. The foregoing storage media include various media that can store program codes, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disk. In this way, the embodiments of the present invention are not limited to any specific combination of hardware and software.
Correspondingly, the embodiments of the present invention also provide a computer-readable storage medium in which computer-executable instructions are stored. When the computer-executable instructions are executed by a processor, the method embodiments of the present invention are implemented. Computer-readable storage media include permanent and non-permanent, removable and non-removable media, which can implement information storage by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of storage media for computers include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information that can be accessed by computing devices. As defined herein, a computer-readable storage medium does not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should be noted that in this specification of the application, relational terms such as first and second are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the term “comprises” or “comprising” or “includes” or any other variations thereof is intended to encompass a non-exclusive inclusion, such that a process, method, article, or device that comprises multiple elements includes not only those elements but also other elements, or elements that are inherent to such a process, method, article, or device. Without more restrictions, the element defined by the phrase “comprise(s) a/an” does not exclude that there are other identical elements in the process, method, article, or device that includes the element. In this specification of the application, if it is mentioned that an action is performed according to an element, it means that the action is performed at least according to the element, and includes two cases: the action is performed only on the basis of the element, and the action is performed based on the element and other elements. Expressions such as multiple, repeatedly, and various include 2, twice, and 2 types, as well as 2 or more, twice or more, and 2 or more types.
All documents mentioned in this specification are considered to be included in the disclosure of this application as a whole, so that they can be used as a basis for modification when necessary. In addition, it should be understood that the above descriptions are only preferred embodiments of this specification, and are not intended to limit the protection scope of this specification. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of one or more embodiments of this specification should be included in the protection scope of one or more embodiments of this specification.
In some cases, the actions or steps described in the claims can be performed in a different order than in the embodiments and still achieve desired results. In addition, the processes depicted in the drawings do not necessarily require the specific order or sequential order shown in order to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Number | Date | Country | Kind |
---|---|---|---|
202310781080.8 | Jun 2023 | CN | national |