Self-tuning fixed-point least-squares solver

Information

  • Patent Grant
  • 12284058
  • Patent Number
    12,284,058
  • Date Filed
    Monday, May 1, 2023
  • Date Issued
    Tuesday, April 22, 2025
  • CPC
  • Field of Search
    • CPC
    • H04L25/0202
    • H04L25/024
    • H04L25/0242
    • H04L25/0244
    • H04L25/0246
    • H04L25/025
    • H04L25/03006
    • H04L25/03012
    • H04L25/03019
    • H04L25/03082
    • H04L25/03101
  • International Classifications
    • H04L25/02
  • Term Extension
    59
Abstract
A method and device for self-tuning scales of variables for processing in fixed-point hardware. The device includes a sequence of fixed-point arithmetic circuits configured to receive at least one input signal and output at least one output signal. The circuits are preconfigured with control scales associated with each of the input and output signals. A first circuit in the sequence is configured to receive a first input signal having a dynamic true scale that is different from the control scale associated with the first input signal. Each of the circuits is further configured to determine, for each of the output signals, an adaptive scale from the control scale associated with the output signal based on the true scale of the first input signal and the control scale associated with the first input signal, and generate, from the input signal, the output signal having the associated adaptive scale.
Description
TECHNICAL FIELD

This disclosure relates generally to wireless communications signal processing. More specifically, this disclosure relates to self-tuning fixed-point least-squares solvers that operate on data having variable bit width and scale.


BACKGROUND

To meet the demand for wireless data traffic, which has increased since the deployment of 4G communication systems, and to enable various vertical applications, 5G/NR communication systems have been developed and are currently being deployed. The 5G/NR communication system is considered to be implemented in higher frequency (mmWave) bands, e.g., 28 GHz or 60 GHz bands, so as to accomplish higher data rates, or in lower frequency bands, such as 6 GHz, to enable robust coverage and mobility support. To decrease propagation loss of the radio waves and increase the transmission distance, techniques such as beamforming, massive multiple-input multiple-output (MIMO), full-dimensional MIMO (FD-MIMO), array antennas, analog beamforming, and large-scale antennas are discussed for 5G/NR communication systems.


In addition, in 5G/NR communication systems, development for system network improvement is under way based on advanced small cells, cloud radio access networks (RANs), ultra-dense networks, device-to-device (D2D) communication, wireless backhaul, moving network, cooperative communication, coordinated multi-points (CoMP), reception-end interference cancelation and the like.


The discussion of 5G systems and frequency bands associated therewith is for reference as certain embodiments of the present disclosure may be implemented in 5G systems. However, the present disclosure is not limited to 5G systems, or the frequency bands associated therewith, and embodiments of the present disclosure may be utilized in connection with any frequency band. For example, aspects of the present disclosure may also be applied to deployment of 5G communication systems, 6G or even later releases which may use terahertz (THz) bands.


Complicated signal processing that involves many fixed-point operations requires careful bit width and scale management in order to achieve a good signal to quantization noise ratio (SQNR) as compared to a floating-point operation. This is because operations such as addition and multiplication increase a variable's bit width and/or scale; however, bit width and scale cannot be allowed to grow without bound and need to be adjusted at appropriate points during processing. Without such management, bit underflow or overflow is highly likely to occur, which may break down the signal processing algorithms.


SUMMARY

Embodiments of the present disclosure provide methods and devices for self-tuning scales of variables for processing in fixed-point hardware.


In one embodiment, an electronic device comprises a sequence of fixed-point arithmetic circuits. Each of the circuits is configured to receive at least one input signal and output at least one output signal. The circuits are preconfigured with control scales associated with each of the at least one input and output signals. A first fixed-point arithmetic circuit in the sequence is further configured to receive a first input signal having a dynamic true scale that is different from the control scale associated with the first input signal. Each of the fixed-point arithmetic circuits is further configured to determine, for each of the at least one output signals, an adaptive scale from the control scale associated with the output signal based on the true scale of the first input signal and the control scale associated with the first input signal, and generate, from the at least one input signal, the at least one output signal having the adaptive scale of the at least one output signal.


In another embodiment, a method of operation of an electronic device comprising a sequence of fixed-point arithmetic circuits configured to receive at least one input signal and output at least one output signal is provided. The method comprises the steps of receiving, at a first fixed-point arithmetic circuit in the sequence, a first input signal having a dynamic true scale that is different from a control scale associated with the first input signal, wherein the fixed-point arithmetic circuits are preconfigured with control scales associated with each of the at least one input and output signals, determining, by each of the fixed-point arithmetic circuits for each of the at least one output signals, an adaptive scale from the control scale associated with the output signal based on the true scale of the first input signal and the control scale associated with the first input signal, and generating, by each of the fixed-point arithmetic circuits from the at least one input signal, the at least one output signal having the adaptive scale of the at least one output signal.


Other technical features may be readily apparent to one skilled in the art from the following figures, descriptions, and claims.


Before undertaking the DETAILED DESCRIPTION below, it may be advantageous to set forth definitions of certain words and phrases used throughout this patent document. The term “couple” and its derivatives refer to any direct or indirect communication between two or more elements, whether or not those elements are in physical contact with one another. The terms “transmit,” “receive,” and “communicate,” as well as derivatives thereof, encompass both direct and indirect communication. The terms “include” and “comprise,” as well as derivatives thereof, mean inclusion without limitation. The term “or” is inclusive, meaning and/or. The phrase “associated with,” as well as derivatives thereof, means to include, be included within, interconnect with, contain, be contained within, connect to or with, couple to or with, be communicable with, cooperate with, interleave, juxtapose, be proximate to, be bound to or with, have, have a property of, have a relationship to or with, or the like. The term “controller” means any device, system or part thereof that controls at least one operation. Such a controller may be implemented in hardware or a combination of hardware and software and/or firmware. The functionality associated with any particular controller may be centralized or distributed, whether locally or remotely. The phrase “at least one of,” when used with a list of items, means that different combinations of one or more of the listed items may be used, and only one item in the list may be needed. For example, “at least one of: A, B, and C” includes any of the following combinations: A, B, C, A and B, A and C, B and C, and A and B and C.


Moreover, various functions described below can be implemented or supported by one or more computer programs, each of which is formed from computer readable program code and embodied in a computer readable medium. The terms “application” and “program” refer to one or more computer programs, software components, sets of instructions, procedures, functions, objects, classes, instances, related data, or a portion thereof adapted for implementation in a suitable computer readable program code. The phrase “computer readable program code” includes any type of computer code, including source code, object code, and executable code. The phrase “computer readable medium” includes any type of medium capable of being accessed by a computer, such as read only memory (ROM), random access memory (RAM), a hard disk drive, a compact disc (CD), a digital video disc (DVD), or any other type of memory. A “non-transitory” computer readable medium excludes wired, wireless, optical, or other communication links that transport transitory electrical or other signals. A non-transitory computer readable medium includes media where data can be permanently stored and media where data can be stored and later overwritten, such as a rewritable optical disc or an erasable memory device.


Definitions for other certain words and phrases are provided throughout this patent document. Those of ordinary skill in the art should understand that in many if not most instances, such definitions apply to prior as well as future uses of such defined words and phrases.





BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present disclosure and its advantages, reference is now made to the following description taken in conjunction with the accompanying drawings, in which like reference numerals represent like parts:



FIG. 1 illustrates an example wireless network according to embodiments of the present disclosure;



FIG. 2 illustrates an example gNB according to embodiments of the present disclosure;



FIG. 3 illustrates an example UE according to embodiments of the present disclosure;



FIG. 4 illustrates example diagrams of digital signal processing algorithms according to embodiments of the present disclosure;



FIG. 5 illustrates an example process flow of a self-tuning fixed-point LS solver according to embodiments of the present disclosure;



FIG. 6 illustrates an example of a conventional design of a Cholesky-based LS solver according to embodiments of the present disclosure;



FIG. 7 illustrates an example design of a Cholesky-based LS solver with adaptive scales according to embodiments of the present disclosure;



FIG. 8 illustrates an example of scale adaptation using a Cholesky-based LS solver with adaptive scales according to embodiments of the present disclosure; and



FIG. 9 illustrates an example process for self-tuning scales of variables for processing in fixed-point hardware according to embodiments of the present disclosure.





DETAILED DESCRIPTION


FIGS. 1 through 9, discussed below, and the various embodiments used to describe the principles of the present disclosure in this patent document are by way of illustration only and should not be construed in any way to limit the scope of the disclosure. Those skilled in the art will understand that the principles of the present disclosure may be implemented in any suitably arranged system or device.


Embodiments of the present disclosure recognize that digital signal processing algorithms are typically designed using high precision floating-point operations and then implemented in fixed-point (or FxP) hardware, which is often less precise due to design constraints. SQNR is a measurement of the difference in precision between the fixed-point signal processing operation and its floating-point counterpart. One source of lowered SQNR in binary fixed-point implementation is improperly managed bit width and scale of processed data, where “bit width” refers to the number of bits in a binary number (e.g., the number of bits necessary to represent a decimal value in binary) and “scale” refers to the number of bits in a binary number that represent the fractional portion of the number. That is, the scale value determines the binary point (or radix point) of a binary fixed-point number, which defines which bits represent an integer portion of the number (integer bits) and which bits represent the fractional portion of the number (fractional bits).
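For illustration only (the numbers and function names below are hypothetical, not part of the disclosure), the following Python fragment shows how bit width and scale determine the value represented by a binary fixed-point word:

```python
# Illustrative sketch only: interpreting a binary fixed-point word.
# A signed word with bit width w and scale s represents value = raw / 2**s,
# i.e., the lowest s bits are fractional bits and the radix point sits above them.

def fxp_to_float(raw: int, scale: int) -> float:
    """Interpret a signed integer 'raw' as a fixed-point number with 'scale' fractional bits."""
    return raw / (1 << scale)

def float_to_fxp(value: float, bit_width: int, scale: int) -> int:
    """Quantize 'value' to a signed fixed-point word, saturating at the bit-width limits."""
    raw = round(value * (1 << scale))
    lo, hi = -(1 << (bit_width - 1)), (1 << (bit_width - 1)) - 1
    return max(lo, min(hi, raw))  # saturate instead of overflowing

# Example: a 16-bit word with scale 12 has 3 integer bits and 12 fractional bits.
raw = float_to_fxp(1.7321, bit_width=16, scale=12)
print(raw, fxp_to_float(raw, scale=12))  # ~1.7321, quantization error below 2**-12
```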


Embodiments of the present disclosure further recognize that in fixed-point signal processing the least-squares (LS) solver is one of the most difficult and complex processing operations, as it involves matrix inversion which needs fine-tuning depending on the input and output bit widths and scales to avoid bit underflow or overflow. Input and output scales and bit widths refer to the scales and bit widths of the binary input and output, respectively. When the input has a large range of possible bit widths, the output scale needs to vary dynamically due to the nature of matrix inversion to avoid underflow or overflow. In traditional matrix inversion processing implementations, the output scale is tied to the input scale, and underflow or overflow can easily occur at the extremes of a large range of bit widths.


Accordingly, embodiments of the present disclosure provide methods and apparatuses for implementing binary LS solver operations in fixed-point hardware that accommodates variable bit width inputs and has the self-tuning property. The self-tuning property refers to the capability to adjust the input and output scales of processed data at various arithmetic circuits in the hardware as needed to reduce bit overflow and underflow, thereby improving SQNR.



FIGS. 1-3 below describe various embodiments implemented in wireless communications systems and with the use of orthogonal frequency division multiplexing (OFDM) or orthogonal frequency division multiple access (OFDMA) communication techniques. The descriptions of FIGS. 1-3 are not meant to imply physical or architectural limitations to the manner in which different embodiments may be implemented. Different embodiments of the present disclosure may be implemented in any suitably arranged communications system.



FIG. 1 illustrates an example wireless network according to embodiments of the present disclosure. The embodiment of the wireless network shown in FIG. 1 is for illustration only. Other embodiments of the wireless network 100 could be used without departing from the scope of this disclosure.


As shown in FIG. 1, the wireless network includes a gNB 101 (e.g., base station, BS), a gNB 102, and a gNB 103. The gNB 101 communicates with the gNB 102 and the gNB 103. The gNB 101 also communicates with at least one network 130, such as the Internet, a proprietary Internet Protocol (IP) network, or other data network.


The gNB 102 provides wireless broadband access to the network 130 for a first plurality of user equipments (UEs) within a coverage area 120 of the gNB 102. The first plurality of UEs includes a UE 111, which may be located in a small business; a UE 112, which may be located in an enterprise; a UE 113, which may be a WiFi hotspot; a UE 114, which may be located in a first residence; a UE 115, which may be located in a second residence; and a UE 116, which may be a mobile device, such as a cell phone, a wireless laptop, a wireless PDA, or the like. The gNB 103 provides wireless broadband access to the network 130 for a second plurality of UEs within a coverage area 125 of the gNB 103. The second plurality of UEs includes the UE 115 and the UE 116. In some embodiments, one or more of the gNBs 101-103 may communicate with each other and with the UEs 111-116 using 5G/NR, long term evolution (LTE), long term evolution-advanced (LTE-A), WiMAX, WiFi, or other wireless communication techniques.


Depending on the network type, the term “base station” or “BS” can refer to any component (or collection of components) configured to provide wireless access to a network, such as transmit point (TP), transmit-receive point (TRP), an enhanced base station (eNodeB or eNB), a 5G/NR base station (gNB), a macrocell, a femtocell, a WiFi access point (AP), or other wirelessly enabled devices. Base stations may provide wireless access in accordance with one or more wireless communication protocols, e.g., 5G/NR 3rd generation partnership project (3GPP) NR, long term evolution (LTE), LTE advanced (LTE-A), high speed packet access (HSPA), Wi-Fi 802.11a/b/g/n/ac, etc. For the sake of convenience, the terms “BS” and “TRP” are used interchangeably in this patent document to refer to network infrastructure components that provide wireless access to remote terminals. Also, depending on the network type, the term “user equipment” or “UE” can refer to any component such as “mobile station,” “subscriber station,” “remote terminal,” “wireless terminal,” “receive point,” or “user device.” For the sake of convenience, the terms “user equipment” and “UE” are used in this patent document to refer to remote wireless equipment that wirelessly accesses a BS, whether the UE is a mobile device (such as a mobile telephone or smartphone) or is normally considered a stationary device (such as a desktop computer or vending machine).


Dotted lines show the approximate extents of the coverage areas 120 and 125, which are shown as approximately circular for the purposes of illustration and explanation only. It should be clearly understood that the coverage areas associated with gNBs, such as the coverage areas 120 and 125, may have other shapes, including irregular shapes, depending upon the configuration of the gNBs and variations in the radio environment associated with natural and man-made obstructions.


Although FIG. 1 illustrates one example of a wireless network, various changes may be made to FIG. 1. For example, the wireless network could include any number of gNBs and any number of UEs in any suitable arrangement. Also, the gNB 101 could communicate directly with any number of UEs and provide those UEs with wireless broadband access to the network 130. Similarly, each gNB 102-103 could communicate directly with the network 130 and provide UEs with direct wireless broadband access to the network 130. Further, the gNBs 101, 102, and/or 103 could provide access to other or additional external networks, such as external telephone networks or other types of data networks.



FIG. 2 illustrates an example gNB 102 according to embodiments of the present disclosure. The embodiment of the gNB 102 illustrated in FIG. 2 is for illustration only, and the gNBs 101 and 103 of FIG. 1 could have the same or similar configuration. However, gNBs come in a wide variety of configurations, and FIG. 2 does not limit the scope of this disclosure to any particular implementation of a gNB.


As shown in FIG. 2, the gNB 102 includes multiple antennas 205a-205n, multiple transceivers 210a-210n, a controller/processor 225, a memory 230, and a backhaul or network interface 235.


The transceivers 210a-210n receive, from the antennas 205a-205n, incoming RF signals, such as signals transmitted by UEs in the network 100. The transceivers 210a-210n down-convert the incoming RF signals to generate IF or baseband signals. The IF or baseband signals are processed by receive (RX) processing circuitry in the transceivers 210a-210n and/or controller/processor 225, which generates processed baseband signals by filtering, decoding, and/or digitizing the baseband or IF signals. The controller/processor 225 may further process the baseband signals.


Transmit (TX) processing circuitry in the transceivers 210a-210n and/or controller/processor 225 receives analog or digital data (such as voice data, web data, e-mail, or interactive video game data) from the controller/processor 225. The TX processing circuitry encodes, multiplexes, and/or digitizes the outgoing baseband data to generate processed baseband or IF signals. The transceivers 210a-210n up-convert the baseband or IF signals to RF signals that are transmitted via the antennas 205a-205n.


The controller/processor 225 can include one or more processors or other processing devices that control the overall operation of the gNB 102. For example, the controller/processor 225 could control the reception of UL channel signals and the transmission of DL channel signals by the transceivers 210a-210n in accordance with well-known principles. The controller/processor 225 could support additional functions as well, such as more advanced wireless communication functions. For instance, the controller/processor 225 could support beam forming or directional routing operations in which outgoing/incoming signals from/to multiple antennas 205a-205n are weighted differently to effectively steer the outgoing signals in a desired direction. Any of a wide variety of other functions could be supported in the gNB 102 by the controller/processor 225.


The controller/processor 225 or the transceivers 210a-210n may include fixed-point arithmetic circuitry that may perform digital signal processing on digital UL or DL channel signals provided to the fixed-point arithmetic circuitry. For example, the fixed-point arithmetic circuitry may perform a least-squares estimate (using, e.g., a Cholesky decomposition and forward-backward substitution approach, as described below) as part of MIMO zero-forcing (ZF), minimum mean squared error (MMSE) precoding, equalization, channel prediction, or other such digital signal processing algorithms. The fixed-point arithmetic circuitry may include application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or similar hardware implementations of one or more digital signal processing algorithms.


The controller/processor 225 is also capable of executing programs and other processes resident in the memory 230, such as an OS. The controller/processor 225 can move data into or out of the memory 230 as required by an executing process.


The controller/processor 225 is also coupled to the backhaul or network interface 235. The backhaul or network interface 235 allows the gNB 102 to communicate with other devices or systems over a backhaul connection or over a network. The interface 235 could support communications over any suitable wired or wireless connection(s). For example, when the gNB 102 is implemented as part of a cellular communication system (such as one supporting 5G/NR, LTE, or LTE-A), the interface 235 could allow the gNB 102 to communicate with other gNBs over a wired or wireless backhaul connection. When the gNB 102 is implemented as an access point, the interface 235 could allow the gNB 102 to communicate over a wired or wireless local area network or over a wired or wireless connection to a larger network (such as the Internet). The interface 235 includes any suitable structure supporting communications over a wired or wireless connection, such as an Ethernet or RF transceiver.


The memory 230 is coupled to the controller/processor 225. Part of the memory 230 could include a RAM, and another part of the memory 230 could include a Flash memory or other ROM.


Although FIG. 2 illustrates one example of gNB 102, various changes may be made to FIG. 2. For example, the gNB 102 could include any number of each component shown in FIG. 2. Also, various components in FIG. 2 could be combined, further subdivided, or omitted and additional components could be added according to particular needs.



FIG. 3 illustrates an example UE 116 according to embodiments of the present disclosure. The embodiment of the UE 116 illustrated in FIG. 3 is for illustration only, and the UEs 111-115 of FIG. 1 could have the same or similar configuration. However, UEs come in a wide variety of configurations, and FIG. 3 does not limit the scope of this disclosure to any particular implementation of a UE.


As shown in FIG. 3, the UE 116 includes antenna(s) 305, a transceiver(s) 310, and a microphone 320. The UE 116 also includes a speaker 330, a processor 340, an input/output (I/O) interface (IF) 345, an input 350, a display 355, and a memory 360. The memory 360 includes an operating system (OS) 361 and one or more applications 362.


The transceiver(s) 310 receives, from the antenna 305, an incoming RF signal transmitted by a gNB of the network 100. The transceiver(s) 310 down-converts the incoming RF signal to generate an intermediate frequency (IF) or baseband signal. The IF or baseband signal is processed by RX processing circuitry in the transceiver(s) 310 and/or processor 340, which generates a processed baseband signal by filtering, decoding, and/or digitizing the baseband or IF signal. The RX processing circuitry sends the processed baseband signal to the speaker 330 (such as for voice data) or to the processor 340 for further processing (such as for web browsing data).


TX processing circuitry in the transceiver(s) 310 and/or processor 340 receives analog or digital voice data from the microphone 320 or other outgoing baseband data (such as web data, e-mail, or interactive video game data) from the processor 340. The TX processing circuitry encodes, multiplexes, and/or digitizes the outgoing baseband data to generate a processed baseband or IF signal. The transceiver(s) 310 up-converts the baseband or IF signal to an RF signal that is transmitted via the antenna(s) 305.


The processor 340 can include one or more processors or other processing devices and execute the OS 361 stored in the memory 360 in order to control the overall operation of the UE 116. For example, the processor 340 could control the reception of DL channel signals and the transmission of UL channel signals by the transceiver(s) 310 in accordance with well-known principles. In some embodiments, the processor 340 includes at least one microprocessor or microcontroller.


The processor 340 or the transceivers 310 may include fixed-point arithmetic circuitry that may perform digital signal processing on digital UL or DL channel signals provided to the fixed-point arithmetic circuitry. For example, the fixed-point arithmetic circuitry may perform a least-squares estimate (using, e.g., a Cholesky decomposition and forward-backward substitution approach, as described below) as part of MIMO zero-forcing (ZF), minimum mean squared error (MMSE) precoding, equalization, channel prediction, or other such digital signal processing algorithms. The fixed-point arithmetic circuitry may include application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or similar hardware implementations of one or more digital signal processing algorithms.


The processor 340 is also capable of executing other processes and programs resident in the memory 360. The processor 340 can move data into or out of the memory 360 as required by an executing process. In some embodiments, the processor 340 is configured to execute the applications 362 based on the OS 361 or in response to signals received from gNBs or an operator. The processor 340 is also coupled to the I/O interface 345, which provides the UE 116 with the ability to connect to other devices, such as laptop computers and handheld computers. The I/O interface 345 is the communication path between these accessories and the processor 340.


The processor 340 is also coupled to the input 350, which includes for example, a touchscreen, keypad, etc., and the display 355. The operator of the UE 116 can use the input 350 to enter data into the UE 116. The display 355 may be a liquid crystal display, light emitting diode display, or other display capable of rendering text and/or at least limited graphics, such as from web sites.


The memory 360 is coupled to the processor 340. Part of the memory 360 could include a random-access memory (RAM), and another part of the memory 360 could include a Flash memory or other read-only memory (ROM).


Although FIG. 3 illustrates one example of UE 116, various changes may be made to FIG. 3. For example, various components in FIG. 3 could be combined, further subdivided, or omitted and additional components could be added according to particular needs. As a particular example, the processor 340 could be divided into multiple processors, such as one or more central processing units (CPUs) and one or more graphics processing units (GPUs). In another example, the transceiver(s) 310 may include any number of transceivers and signal processing chains and may be connected to any number of antennas. Also, while FIG. 3 illustrates the UE 116 configured as a mobile telephone or smartphone, UEs could be configured to operate as other types of mobile or stationary devices.



FIG. 4 illustrates example diagrams of digital signal processing algorithms according to embodiments of the present disclosure. As noted above, varying input bit width is one source of decreased SQNR in a fixed-point implementation of a digital signal processing algorithm. As such, existing fixed-point implementations of some algorithms, such as a two-dimensional extended Kalman filter (2D EKF), limit the maximum input bit width to reduce the potential variations in bit width. For example, as illustrated in diagram 402, a sounding reference signal (SRS) is 16 bits, but implementations of the 2D EKF only support 13 bits, requiring truncation of 3 bits of the SRS before input to the 2D EKF. Embodiments of the present disclosure may be used to create the system in diagram 404, which supports processing the full 16 bits of the SRS.
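For illustration only (the sample value and scales below are hypothetical), the following fragment sketches the truncation of diagram 402, in which the 3 least-significant bits of a 16-bit sample are dropped so that it fits a 13-bit input:

```python
# Illustrative sketch of diagram 402: a 16-bit fixed-point sample is truncated to
# 13 bits by dropping its 3 least-significant (fractional) bits. Values are hypothetical.

def truncate_lsbs(raw: int, drop_bits: int) -> int:
    """Drop the lowest 'drop_bits' bits of a signed fixed-point word (arithmetic shift)."""
    return raw >> drop_bits

srs_sample = 23_459                        # 16-bit word, assumed scale 14
truncated = truncate_lsbs(srs_sample, 3)   # 13-bit word, effective scale 11

print(srs_sample / 2**14)   # ~1.431824
print(truncated / 2**11)    # ~1.431641; the dropped bits are lost precision
```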



FIG. 5 illustrates an example process flow 500 of a self-tuning fixed-point LS solver according to embodiments of the present disclosure. Such a self-tuning fixed-point LS solver could be used, for example, in the fixed-point hardware implementation of the system of diagram 404. Furthermore, such a self-tuning fixed-point LS solver could be implemented in a UE such as UE 116 or a base station such as gNB 102 using fixed-point arithmetic circuits such as fixed-point arithmetic circuitry 365 of UE 116 or fixed-point arithmetic circuitry 240 of gNB 102.


In this example, the LS solver 506 is implemented using a Cholesky decomposition and forward-backward (FW-BW) substitution approach (as shown in blocks 5061, 5062, and 5063, which may represent separate fixed-point arithmetic circuits, or portions of an integrated fixed-point arithmetic circuit). However, it is understood that the disclosure is not limited to this approach, and any other LS solver approach could be implemented using the embodiments of the disclosure disclosed below.


The LS solver 506 solves the following equation for x:

y=Ax  (1)


where A is an M×N complex matrix, y is an M×1 complex vector, and x is an N×1 complex vector. The inputs at 502 are y and A.


For preprocessing operations at 504, both sides of equation (1) are multiplied by AH to obtain:

p=Cx  (2)


where C=AHA is an N×N complex Hermitian matrix and p=AHy is an N×1 complex vector.


The Cholesky-based LS solver 506 first performs Cholesky decomposition at block 5061 to decompose C in the form of LLH and find L, where L is an N×N complex lower triangular matrix. The Cholesky decomposition block 5061 also generates IL (an N×1 real vector) as a side product that can reduce the number of operations needed to perform the FW-BW substitution of blocks 5062 and 5063. IL is a vector whose elements are the reciprocals of the diagonal elements of L.


Once L and IL are obtained, the FW-BW substitution can be applied to p=LLHx (at blocks 5062 and 5063) to determine x. More specifically, forward substitution block 5062 performs forward substitution on p=Lz to find z, where z=LHx, and backward substitution block 5063 performs backward substitution on z=LHx to find x. It is understood that y and x, and thus p and z as well, can readily be expanded from vectors to matrices.
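For reference, the FIG. 5 flow can be sketched in floating point as follows (illustrative only; this is not the fixed-point hardware of the disclosure, and the function names are hypothetical). C is assumed Hermitian positive definite:

```python
# Floating-point reference sketch of the FIG. 5 flow (not the fixed-point hardware):
# Cholesky decomposition C = L L^H with the reciprocal-diagonal vector IL as a side
# product, then forward substitution (p = L z) and backward substitution (z = L^H x).
import numpy as np

def cholesky_with_reciprocals(C):
    N = C.shape[0]
    L = np.zeros((N, N), dtype=complex)
    IL = np.zeros(N)
    for j in range(N):
        d = C[j, j] - np.sum(L[j, :j] * np.conj(L[j, :j]))
        L[j, j] = np.sqrt(d.real)                     # diagonal element, equation (3)
        IL[j] = 1.0 / L[j, j].real                    # reciprocal of the diagonal
        for i in range(j + 1, N):
            s = C[i, j] - np.sum(L[i, :j] * np.conj(L[j, :j]))
            L[i, j] = IL[j] * s                       # off-diagonal element, equation (4)
    return L, IL

def forward_substitution(L, IL, p):
    N = len(p)
    z = np.zeros(N, dtype=complex)
    for i in range(N):
        z[i] = IL[i] * (p[i] - np.dot(L[i, :i], z[:i]))                   # equation (5)
    return z

def backward_substitution(L, IL, z):
    N = len(z)
    x = np.zeros(N, dtype=complex)
    for i in range(N - 1, -1, -1):
        x[i] = IL[i] * (z[i] - np.dot(np.conj(L[i + 1:, i]), x[i + 1:]))  # equation (6)
    return x
```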


The scales of variables in a fixed-point implementation are typically determined during the fixed-point hardware design stage and are provided to each fixed-point module (or arithmetic circuit, e.g., blocks 5061, 5062, and 5063). These pre-determined scales are referred to herein as control scales, denoted as N with a subscript that indicates the variable associated with the scale. The provided control scales are used to track and match the scales in internal operations and output generation. That is, each fixed-point module performs its operations assuming that the variables have scale values that correspond to their provided control scale. In the fixed-point Cholesky-based LS solver 506, the following control scales are provided to the Cholesky decomposition and FW-BW substitution blocks: NC, NL, NIL, and Np.
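As a minimal sketch of what tracking and matching scales against preconfigured control scales can look like (shift-based rescaling and the names below are our own illustrative assumptions):

```python
# Minimal sketch of shift-based scale matching inside a fixed-point module
# (illustrative only). The module assumes its inputs carry their control scales
# n_a and n_b; a raw product then has scale n_a + n_b and is shifted so that the
# output lands on the output control scale n_out.

def rescale(raw: int, current_scale: int, target_scale: int) -> int:
    """Move a fixed-point word from current_scale to target_scale by shifting."""
    shift = target_scale - current_scale
    return raw << shift if shift >= 0 else raw >> (-shift)

def fxp_multiply(a: int, n_a: int, b: int, n_b: int, n_out: int) -> int:
    """Multiply two fixed-point words and emit the result at the output control scale."""
    product = a * b                          # raw product has scale n_a + n_b
    return rescale(product, n_a + n_b, n_out)

# Example: 1.5 at scale 8 times 2.25 at scale 10, output requested at control scale 12.
a, b = int(1.5 * 2**8), int(2.25 * 2**10)
print(fxp_multiply(a, 8, b, 10, n_out=12) / 2**12)   # 3.375
```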


The true scale of a variable, denoted herein as S with a subscript that indicates the variable associated with the scale, refers to the actual scale value of the variable—that is, the scale of the variable as used in previous operations performed on that variable. In conventional designs, the control scale values are assumed to correspond to the true scales of the variables (i.e., the control scale is set equivalent to the true scale by design). In the conventional design of fixed-point Cholesky-based LS solver 506, then, SC=NC, SL=NL, SIL=NIL, and Sp=Np.



FIG. 6 illustrates an example of a conventional design of a Cholesky-based LS solver 600 according to embodiments of the present disclosure. As noted above, the control scales are designed such that SC=NC, SL=NL, SIL=NIL, and Sp=Np. The outputs of the FW and BW substitution blocks 604 and 606 (which may correspond to blocks 5062 and 5063, respectively, of FIG. 5) may have the same scale as the input p, i.e., Sp=Sz=Sx, and therefore Np=Nz=Nx.


The scales associated with the input variables C and p are dynamic in the sense that the true scales of the inputs depend on the source of the inputs. However, because the system is designed to operate under the assumption that SC=NC and Sp=Np, the design of the values of the control scales NC and Np is constrained by the expected true scales of the inputs. The values of NL and NIL are freely tunable during the design phase, however, and therefore may be tuned to optimize operations performed by the blocks of the LS solver 600.


In embodiments of the present disclosure, the input control scale NC is not constrained to be equivalent to the true scale SC of its associated input variable C. Accordingly, the values of NC, NL, and NIL are all freely tunable during the design phase and can be arbitrary values of choice. An input control scale NC that differs from the input true scale SC has a cascading impact on the scales of all the variables in the subsequent operations. This impact therefore needs to be analytically tracked and controlled to avoid underflow or overflow.



FIG. 7 illustrates an example design of a Cholesky-based LS solver 700 with adaptive scales according to embodiments of the present disclosure. In this example, NC≠SC, i.e., the control scale of the input C may be different from the true scale of C. Accordingly, the true scales of the outputs of the Cholesky decomposition and the FW-BW substitution arithmetic circuits (blocks 702, 704, and 706, respectively, which may correspond to blocks 5061, 5062, and 5063, respectively, of FIG. 5) may be different than the provided control scales (i.e., NL≠SL, NIL≠SIL, Np≠Sp≠Sz≠Sx) and are analytically tracked to control potential overflow and underflow.


In various embodiments, the true scales of the variables in the Cholesky-based LS solver 700 are tracked by determining a dynamic scale difference based on NC and SC, and applying the dynamic scale difference to determine adaptive scale values for SL, SIL, Sz, and Sx. The dynamic scale difference is denoted herein as δC. An adaptive scale value herein refers to a dynamic true scale value that is determined by adjusting a provided static control scale value using, e.g., the dynamic scale difference value. It is understood that other terminology could be used to refer to the adaptive scale without affecting this disclosure.


In the embodiment of the example of FIG. 7, the following equations are used to compute L such that C=LLH in the Cholesky decomposition arithmetic circuit 702:

Lj,j = √(Cj,j − Σk=1..j−1 Lj,kLj,k*),  (3)

Li,j = IL,j(Ci,j − Σk=1..j−1 Li,kLj,k*) for i>j, where  (4)

IL,j = 1/Lj,j

and where the subscript i, j denotes the element of the matrix at the row i and column j. In other embodiments, different formulas may be used for similar purposes.


In computation of the equations (3) and (4) for the diagonal elements Lj,j and the off-diagonal elements Li,j of L, if NC≠SC, NL≠SL, and NIL≠SIL, then there will need to be two scale changes in order to satisfy conditions requiring matching scales of variables for performing operations or matching the specified output scale. As a result, the true scale of Lj,j and Li,j will be the adaptive scale SL=NL+δC, where

δC = (SC − NC)/2,

and the true scale of IL,j will be the adaptive scale SIL=NIL−δC. These results are derived below.


In deriving the adaptive scale SL of L, although the diagonal elements Lj,j and the off-diagonal elements Li,j are computed using different equations, they need to have matching scales, as all values of L need to have the same scale. For computation of the diagonal element Lj,j for j=1 using equation (3), √(C1,1) has the scale of Ssqrt+SC/2. The adaptive output true scale SL can then be obtained from the following equation for output scale matching using the output control scale NL:

SL = Ssqrt + SC/2 − (Ssqrt + NC/2 − NL)
   = (SC − NC)/2 + NL
   = NL + δC

where δC = (SC − NC)/2.


For computation of the diagonal elements Lj,j for j≠1 using equation (3), first Cj,j and Lj,kLj,k* must have matching scales to perform the operation Cj,j−Σk=1..j−1Lj,kLj,k*. Accordingly, the true scale of Lj,kLj,k*, which is 2SL, is changed to 2SL−(2NL−NC) after scale matching to the true scale of Cj,j, which is SC, based on the output control scale NL and the input control scale NC. Using the previously obtained value of SL=NL+δC, it can be confirmed that:

SC = 2SL − (2NL − NC)
   = 2(NL + δC) − (2NL − NC)
   = 2δC + NC
   = SC − NC + NC
   = SC


Next, the scale of √(Cj,j−Σk=1..j−1Lj,kLj,k*), which is Ssqrt+SC/2, is scale matched to the specified output scale based on the control scales to become:

SL = Ssqrt + SC/2 − (Ssqrt + NC/2 − NL)
   = (SC − NC)/2 + NL
   = NL + δC

where δC = (SC − NC)/2.


For computation of the off-diagonal elements Li,j using equation (4), the scale of Ci,j−Σk=1..j−1Li,kLj,k* is SC=2SL−(2NL−NC), similarly to the diagonal elements. The result of multiplying Ci,j−Σk=1..j−1Li,kLj,k* by IL,j according to equation (4) will have the scale SC+SIL. Output scale matching based on the control scales will result in the following scale change:

SL = SC + SIL − (NC + NIL − NL)
   = (SC − NC)/2 + NL
   = NL + δC

where SIL=NIL−δC, as derived below. Therefore, the adaptive scale for all elements in L is SL=NL+δC.


In deriving the adaptive scale SIL of IL, for computation of IL,j = 1/Lj,j, the scale of 1/√(Cj,j−Σk=1..j−1Lj,kLj,k*) is Ssqrt−SC/2. To match the specified output scale and obtain the adaptive output true scale SIL, the following scale changes are performed:

SIL = Ssqrt − SC/2 − (Ssqrt − NC/2 − NIL)
    = −(SC − NC)/2 + NIL
    = NIL − δC

where δC = (SC − NC)/2.
The FW substitution circuit 704 follows the Cholesky decomposition circuit 702 and solves p=Lz for z, where z=LHx, using the outputs of the Cholesky decomposition, L and IL, according to the following equation:

zi = IL,i(pi − Σk=1..i−1 Li,kzk)  (5)

where z has the same provided control scale as p, i.e., Nz=Np. Satisfying the conditions requiring matching scales of variables for performing operations or matching the specified output scale in the FW substitution operation results in the true scale of z being the adaptive scale Sz=Np−δC. This result is derived below.


In deriving the adaptive scale Sz of z, equation (5) can be expressed as zi=IL,ipi for i=1. The scale of IL,ipi is Sp+SIL. After output scale matching, this becomes the adaptive output true scale Sz:

Sz = Sp + SIL − (Np + NIL − Nz)
   = Sp + SIL − NIL
   = Np + (NIL − δC) − NIL
   = Np − δC


For computation of zi for i≠1 using equation (5), first pi and Li,k*zk must have matching scales to perform the operation pi−Σk=1..i−1Li,k*zk. The true scale of Li,k*zk, which is SL+Sz, is therefore changed to SL+Sz−(NL+Nz−Np) after scale matching to the true scale of pi, which is Sp, based on the control scales NL, Nz, and Np. Using the previously obtained values of SL=NL+δC and Sz=Np−δC, and remembering that Sp=Np=Nz, it can be confirmed that:

Sp = SL + Sz − (NL + Nz − Np)
   = SL + Sz − NL
   = (NL + δC) + (Np − δC) − NL
   = Np
   = Sp


Then, for computation of IL,i(pi−Σk=1..i−1Li,k*zk) according to equation (5), the scale of pi−Σk=1..i−1Li,k*zk is Sp, and thus the result of multiplying pi−Σk=1..i−1Li,k*zk by IL,i will have the scale Sp+SIL. Output scale matching based on the control scales will result in the following scale change:

Sz = Sp + SIL − (Np + NIL − Nz)
   = Sp + SIL − NIL
   = Np + (NIL − δC) − NIL
   = Np − δC

Therefore, the adaptive scale of all elements in z is Sz=Np−δC.


The BW substitution circuit 706 in turn solves z=LHx for x using the outputs of the Cholesky decomposition circuit 702 and the FW substitution circuit 704 blocks (L, IL, and z) according to the following equation:

xi = IL,i(zi − Σk=i+1..N Lk,i*xk)  (6)

where x has the same provided control scale as z, i.e., Nx=Nz=Np. Satisfying the conditions requiring matching scales of variables for performing operations or matching the specified output scale in the BW substitution operation results in the true scale of x being the adaptive scale Sx=Np−2δC, which can also be expressed as SL+SIL=NL+NIL. This result is derived below.


In deriving the adaptive scale Sx of x, equation (6) can be expressed as xi=IL,izi for i=1. The scale of IL,izi is Sz+SIL. After output scale matching, this becomes the adaptive output true scale Sx:

Sx = Sz + SIL − (NIL + Nz − Nx)
   = (Np − δC) + (NIL − δC) − NIL
   = Np − δC − δC
   = Np − 2δC


For computation of xi for i≠1 according to equation (6), first zi and Lk,ixk must have matching scales to perform the operation zi−Σk=i+1..NLk,ixk. The true scale of Lk,ixk, which is SL+Sx, is therefore changed to SL+Sx−(NL+Nx−Nz) after scale matching to the true scale of zi, which is Sz, based on the control scales NL, Nx, and Nz. Using the previously obtained values of SL=NL+δC, Sz=Np−δC, and Sx=Np−2δC, and remembering that Sp=Np=Nx=Nz, it can be confirmed that:

Sz = SL + Sx − (NL + Nx − Nz)
   = SL + Sx − NL
   = (NL + δC) + (Np − 2δC) − NL
   = δC + (Np − δC) − δC
   = δC + Sz − δC
   = Sz


Then, for computation of IL,i(zi−Σk=i+1..NLk,ixk) according to equation (6), the scale of zi−Σk=i+1..NLk,ixk is Sz, and thus the result of multiplying zi−Σk=i+1..NLk,ixk by IL,i will have the scale Sz+SIL. Output scale matching based on the control scales will result in the following scale change:

Sx = Sz + SIL − (NIL + Nz − Nx)
   = (Np − δC) + (NIL − δC) − NIL
   = Np − δC − δC
   = Np − 2δC

Therefore, the adaptive scale of all elements in x is Sx=Np−2δC.


As derived above, the true scales of the outputs of the Cholesky decomposition, FW and BW substitution blocks become different than the control scales and are functions of δC. When δC=0, this embodiment devolves to the conventional method wherein the true scales and control scales have the same value, i.e., SL=NL, SIL=NIL, and Sx=Np. In this case, SL and SIL are fixed values that do not vary with the input scale SC and the final output scale Sx is tied to the input scale Sp.


In the present embodiment with δC≠0, the primary outputs such as L, IL, and x have adaptive scales SL=NL+δC, SIL=NIL−δC, and Sx=Np−2δC, which can be exploited to make desirable adjustments to the output scales. The control scales in this case function as anchor points, and δC allows adjustment of the true scales SL, SIL, and Sx of the outputs and is determined by both the control input scale NC and the true input scale SC (i.e., δC varies with the input scale SC). Adjustments may be made to the output scales in order to reduce chances of bit overflow and underflow that would occur in the conventional method. This is referred to as the self-tuning property.


Examples of the benefits provided by a self-tuning fixed-point LS solver follow, in the context of the Cholesky-based LS solver 700 that solves equation (2), p=Cx, for x. For a given input p, the magnitude of x is inversely proportional to the magnitude of C. Likewise, for a given input C, the magnitude of x is inversely proportional to the magnitude of p. For a variable having a given bit width, larger magnitude data needs a smaller scale (as higher integer representation is necessary while less fractional precision is necessary) and smaller magnitude data needs a larger scale (as more fractional precision is necessary while lower integer representation is necessary)—i.e., magnitude is inversely proportional to the required scale.
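For a signed word of bit width w and scale s, the representable magnitude is bounded by roughly 2^(w−1−s) while the resolution is 2^(−s); a short illustrative fragment (values hypothetical) makes the trade-off concrete:

```python
# Illustrative only: headroom versus precision of a signed fixed-point word of
# bit width w and scale s. Max magnitude ~ 2**(w-1-s); resolution = 2**(-s).
w = 16
for s in (4, 8, 12):
    max_mag = (2**(w - 1) - 1) / 2**s
    step = 2.0**-s
    print(f"scale {s:2d}: max |value| ~ {max_mag:8.3f}, step {step:.6f}")
# Larger-magnitude data requires the smaller scale (coarser steps); smaller-magnitude
# data can afford the larger scale (finer steps).
```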



FIG. 8 illustrates an example of scale adaptation using a Cholesky-based LS solver 700 with adaptive scales according to embodiments of the present disclosure. In the example of FIG. 8, arrows overlaid on a variable or scale represent a change in magnitude of that variable or scale. In this example, the true scale Sp of p is fixed and the true scale SC of C varies.


When SC increases, this means that the magnitude of C has decreased. In the Cholesky decomposition circuit 702, when computing C=LLH to find L and IL, a decrease in the magnitude of C means the magnitude of L will decrease and the magnitude of IL will increase; therefore, the required scale for L will increase and the required scale for IL will decrease (where “required” scale means the scale needed to avoid underflow and overflow). The embodiments of the present disclosure may accommodate these changes in the required scales for L and IL due to the capability of using adaptive scales SL and SIL.


Following on from the Cholesky decomposition circuit 702, in the FW substitution circuit 704, when computing p=Lz to find z, a decrease in the magnitude of L (and increase in the magnitude of IL) means the magnitude of z will increase (as the magnitude of z is inversely proportional to the magnitude of L and proportional to the magnitude of IL), and thus the required scale for z will decrease. Similarly, in the BW substitution circuit 706, when computing z=LHx to find x, a decrease in the magnitude of L (and increase in the magnitude of IL) means the magnitude of x will increase (as the magnitude of x is inversely proportional to the magnitude of L and proportional to the magnitude of IL), and thus the required scale for x will decrease. The embodiments of the present disclosure may accommodate these changes in the required scales for z and x due to the capability of using adaptive scales Sz and Sx.


By comparison, in the case when δC=0 (i.e., using the conventional method with fixed scales), there will be a higher chance of underflow in the computation of L and a higher chance of overflow in the computation of IL because SL and SIL are fixed (to NL and NIL, respectively). Additionally, there will be a higher chance of overflow in the computation of z and x, as Sz and Sx are fixed (to Np).



FIG. 9 illustrates an example process 900 for self-tuning scales of variables for processing in fixed-point hardware according to embodiments of the present disclosure. The process of FIG. 9 may be performed by any appropriate device, such as a UE (e.g., UE 116 of FIGS. 1 and 3) or a gNB (e.g., gNB 102 of FIGS. 1 and 2), that includes a sequence of fixed-point arithmetic circuits configured to implement a digital signal processing algorithm. For simplicity, the process of FIG. 9 is discussed in the context of an LS solving algorithm using adaptive scales, but it is understood that the process could be used with any fixed-point hardware implementation of any suitable digital signal processing algorithm, e.g., an algorithm necessitating matrix inversion.


In the example of FIG. 9, each of the fixed-point arithmetic circuits is configured to receive at least one input signal and output at least one output signal. Furthermore, the fixed-point arithmetic circuits are preconfigured with control scales associated with each of the at least one input and output signals. In some embodiments, the fixed-point arithmetic circuits comprise an LS solver that includes a Cholesky decomposition circuit, a forward substitution circuit, and a backward substitution circuit, and the first circuit in the sequence is the Cholesky decomposition circuit.


The process begins by receiving, at the first fixed-point arithmetic circuit in the sequence, a first input signal having a dynamic true scale that is different from a control scale associated with the first input signal (step 905).


At step 910 of the process, each of the fixed-point arithmetic circuits determines, for each of the at least one output signals, an adaptive scale from the control scale associated with the output signal based on the true scale of the first input signal and the control scale associated with the first input signal. The adaptive scales are determined at step 910 such that likelihoods of bit underflow and bit overflow are reduced in the generation of the at least one output signal having the adaptive scale of the at least one output signal as compared to a generation of the at least one output signal having the control scale associated with the at least one output signal.


In some embodiments, each of the fixed-point arithmetic circuits at step 910 determines, for each of the at least one output signals, the adaptive scale from the control scale associated with the output signal by addition or subtraction of a scale tuning factor (e.g., δ). For example, each of the fixed-point arithmetic circuits subtracts, for each of the at least one output signals that represents a result of an operation that includes matrix inversion, the scale tuning factor from the control scale associated with the output signal to determine the adaptive scale. Each of the fixed-point arithmetic circuits adds, for each of the at least one output signals that represents a result of an operation that does not include matrix inversion, the scale tuning factor to the control scale associated with the output signal to determine the adaptive scale.


In such embodiments, a processor operatively coupled to the fixed-point arithmetic circuits may, at step 910, generate the scale tuning factor using the true scale of the first input signal and the control scale associated with the first input signal. In particular, the scale tuning factor may be one half of the difference between the true scale of the first input signal and the control scale associated with the first input signal.
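A compact, illustrative rendering of this rule (the function names are hypothetical and the values arbitrary) is:

```python
# Sketch of the adaptive-scale rule of step 910 (illustrative; names are hypothetical).

def scale_tuning_factor(true_scale_in: float, control_scale_in: float) -> float:
    """One half of the difference between the true and control scales of the first input."""
    return (true_scale_in - control_scale_in) / 2

def adaptive_scale(control_scale_out: float, tuning: float, includes_inversion: bool) -> float:
    """Subtract the tuning factor for inversion-type outputs; add it otherwise."""
    return control_scale_out - tuning if includes_inversion else control_scale_out + tuning

delta = scale_tuning_factor(true_scale_in=10, control_scale_in=6)                    # 2.0
S_L  = adaptive_scale(control_scale_out=9,  tuning=delta, includes_inversion=False)  # 11.0
S_IL = adaptive_scale(control_scale_out=13, tuning=delta, includes_inversion=True)   # 11.0
```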


The process concludes at step 915, where each of the fixed-point arithmetic circuits generates, from the at least one input signal, the at least one output signal having the adaptive scale of the at least one output signal.


In the process 900 a system of linear equations may be defined by the first input signal (e.g., C) and a second input signal (e.g., p) that is received by one of the fixed-point arithmetic circuits (e.g., the forward substitution circuit), wherein the second input signal has a dynamic true scale. In this case a final fixed-point arithmetic circuit in the sequence (e.g., the backward substitution circuit) generates, as the at least one output signal, a solution to the system of linear equations, and determines the adaptive scale of the solution such that it is different from the true scale of the second input signal.


In some embodiments of process 900 the first fixed-point arithmetic circuit in the sequence (e.g., the Cholesky decomposition circuit) performs matrix decomposition on the first input signal to generate at least two decomposition matrices as the output signals (e.g., L and IL). The other fixed-point arithmetic circuits in the sequence then determine the solution to a system of linear equations using the at least two decomposition matrices and the adaptive scales of the at least two decomposition matrices.


In some cases, the fixed-point arithmetic circuitry also includes a preprocessing circuit that preprocesses inputs to the LS solver circuitry. For example, when the fixed-point arithmetic circuits include a Cholesky decomposition circuit, a forward substitution circuit, and a backward substitution circuit, the preprocessing circuit may receive a matrix y and a matrix A as inputs, where y and A define a system of linear equations y=Ax, and may then generate the first input signal C such that C=AHA and generate the second input signal p such that p=AHy.
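For illustration, this preprocessing can be sketched in floating point as follows (hypothetical example; a generic linear solve stands in here for the Cholesky decomposition and FW-BW substitution circuits):

```python
# Illustrative floating-point preprocessing (not the fixed-point circuit): form
# C = A^H A and p = A^H y from the LS problem y = Ax, then solve p = Cx for x.
import numpy as np

rng = np.random.default_rng(0)
M, N = 6, 4
A = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
y = rng.standard_normal(M) + 1j * rng.standard_normal(M)

C = A.conj().T @ A                 # N x N complex Hermitian matrix
p = A.conj().T @ y                 # N x 1 complex vector

x = np.linalg.solve(C, p)          # generic solve stands in for Cholesky + FW-BW substitution
print(np.allclose(x, np.linalg.lstsq(A, y, rcond=None)[0]))   # True
```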


The above flowchart illustrates an example method or process that can be implemented in accordance with the principles of the present disclosure and various changes could be made to the methods or processes illustrated in the flowcharts. For example, while shown as a series of steps, various steps could overlap, occur in parallel, occur in a different order, or occur multiple times. In another example, steps may be omitted or replaced by other steps.


Although the present disclosure has been described with an exemplary embodiment, various changes and modifications may be suggested to one skilled in the art. It is intended that the present disclosure encompass such changes and modifications as fall within the scope of the appended claims. None of the description in this application should be read as implying that any particular element, step, or function is an essential element that must be included in the claims scope. The scope of patented subject matter is defined by the claims.

Claims
  • 1. An electronic device, comprising: a sequence of fixed-point arithmetic circuits, each of the circuits configured to: receive at least one input signal, and output at least one output signal, wherein the circuits are preconfigured with respective control scales associated with respective ones of each of the at least one input and output signals, wherein a first fixed-point arithmetic circuit in the sequence is further configured to receive a first input signal having a dynamic true scale that is different from the respective control scale associated with the first input signal, and wherein each of the fixed-point arithmetic circuits is further configured to: determine, for each of the respective at least one output signals, a respective adaptive scale from the respective control scale associated with the respective output signal based on the true scale of the first input signal and the respective control scale associated with the first input signal, and generate, from the respective at least one input signal, the respective at least one output signal having the respective adaptive scale of the respective at least one output signal.
  • 2. The electronic device of claim 1, wherein each of the fixed-point arithmetic circuits is further configured to: determine the respective adaptive scales such that likelihoods of bit underflow and bit overflow are reduced in the generation of the respective at least one output signal having the respective adaptive scale of the respective at least one output signal as compared to a generation of the respective at least one output signal having the respective control scale associated with the respective at least one output signal.
  • 3. The electronic device of claim 1, wherein each of the fixed-point arithmetic circuits is further configured to: determine, for each of the respective at least one output signals, the respective adaptive scale from the respective control scale associated with the respective output signal by addition or subtraction of a respective scale tuning factor.
  • 4. The electronic device of claim 3, further comprising: a processor operatively coupled to the fixed-point arithmetic circuits, the processor configured to generate the respective scale tuning factor using the true scale of the first input signal and the respective control scale associated with the first input signal.
  • 5. The electronic device of claim 4, wherein the processor is further configured to: generate the respective scale tuning factor to be one half of the difference between the true scale of the first input signal and the respective control scale associated with the first input signal.
  • 6. The electronic device of claim 3, wherein each of the fixed-point arithmetic circuits is further configured to: for each of the respective at least one output signals that represents a result of an operation that includes matrix inversion, subtract the respective scale tuning factor from the respective control scale associated with the respective output signal to determine the respective adaptive scale, and for each of the respective at least one output signals that represents a result of an operation that does not include matrix inversion, add the respective scale tuning factor to the respective control scale associated with the respective output signal to determine the respective adaptive scale.
  • 7. The electronic device of claim 1, wherein: the first fixed-point arithmetic circuit in the sequence is further configured to perform matrix decomposition on the first input signal to generate at least two decomposition matrices as the output signals, and the other fixed-point arithmetic circuits in the sequence are configured to determine a solution to a system of linear equations using the at least two decomposition matrices and the adaptive scales of the at least two decomposition matrices.
  • 8. The electronic device of claim 1, wherein: a system of linear equations is defined by the first input signal and a second input signal that is received by one of the fixed-point arithmetic circuits, the second input signal has a dynamic true scale, and a final fixed-point arithmetic circuit in the sequence is further configured to: generate, as the at least one output signal, a solution to the system of linear equations; and determine the adaptive scale of the solution such that it is different from the true scale of the second input signal.
  • 9. The electronic device of claim 1, wherein: the first input signal is a Hermitian matrix C having the dynamic true scale S_C and the associated control scale N_C, the first fixed-point arithmetic circuit in the sequence is further configured to: perform Cholesky matrix decomposition on C to generate, as the at least one output signal: a lower triangular matrix L having the associated control scale N_L and the adaptive scale S_L, and a vector IL having the associated control scale N_IL and the adaptive scale S_IL, wherein IL is a reciprocal of the diagonal elements of L; determine S_L from N_L based on S_C and N_C; and determine S_IL from N_IL based on S_C and N_C, a second fixed-point arithmetic circuit in the sequence is further configured to: receive a second input signal that is a matrix p having a dynamic true scale S_p and the associated control scale N_p, wherein S_p = N_p; perform forward substitution based on p, L, and IL, to generate, as the at least one output signal, a matrix z that is the solution of p = Lz for z, where z = L^H x, z having the adaptive scale S_z and the associated control scale N_z such that N_z = N_p; and determine S_z from N_p based on S_C and N_C, and a third fixed-point arithmetic circuit in the sequence is further configured to: perform backward substitution based on z, L, and IL, to generate, as the at least one output signal, a matrix x that is a solution of z = L^H x for x, x having the adaptive scale S_x and the associated control scale N_x such that N_x = N_p; and determine S_x from N_p based on S_C and N_C.
  • 10. The electronic device of claim 9, further comprising: a preprocessing circuit configured to: receive a matrix y and a matrix A as inputs, where y and A define a system of linear equations y = Ax, generate the first input signal C such that C = A^H A, and generate the second input signal p such that p = A^H y.
  • 11. A method of operation of an electronic device comprising a sequence of fixed-point arithmetic circuits configured to receive at least one input signal and output at least one output signal, the method comprising: receiving, at a first fixed-point arithmetic circuit in the sequence, a first input signal having a dynamic true scale that is different from a respective control scale associated with the first input signal, wherein the fixed-point arithmetic circuits are preconfigured with respective control scales associated with respective ones of each of the at least one input and output signals; determining, by each of the fixed-point arithmetic circuits for each of the respective at least one output signals, a respective adaptive scale from the respective control scale associated with the respective output signal based on the true scale of the first input signal and the respective control scale associated with the first input signal; and generating, by each of the fixed-point arithmetic circuits from the at least one input signal, the at least one respective output signal having the respective adaptive scale of the at least one output signal.
  • 12. The method of claim 11, further comprising: determining, by each of the fixed-point arithmetic circuits, the respective adaptive scales such that likelihoods of bit underflow and bit overflow are reduced in the generation of the respective at least one output signal having the respective adaptive scale of the respective at least one output signal as compared to a generation of the respective at least one output signal having the respective control scale associated with the respective at least one output signal.
  • 13. The method of claim 11, further comprising: determining, by each of the fixed-point arithmetic circuits for each of the respective at least one output signals, the respective adaptive scale from the respective control scale associated with the respective output signal by addition or subtraction of a respective scale tuning factor.
  • 14. The method of claim 13, further comprising: generating, by a processor operatively coupled to the fixed-point arithmetic circuits, the respective scale tuning factor using the true scale of the first input signal and the respective control scale associated with the first input signal.
  • 15. The method of claim 14, further comprising: generating, by the processor, the respective scale tuning factor to be one half of the difference between the true scale of the first input signal and the respective control scale associated with the first input signal.
  • 16. The method of claim 13, further comprising: subtracting, by each of the fixed-point arithmetic circuits, for each of the respective at least one output signals that represents a result of an operation that includes matrix inversion, the respective scale tuning factor from the respective control scale associated with the respective output signal to determine the respective adaptive scale; and adding, by each of the fixed-point arithmetic circuits, for each of the respective at least one output signals that represents a result of an operation that does not include matrix inversion, the respective scale tuning factor to the respective control scale associated with the output signal to determine the respective adaptive scale.
  • 17. The method of claim 11, further comprising: performing, by the first fixed-point arithmetic circuit in the sequence, matrix decomposition on the first input signal to generate at least two decomposition matrices as the output signals; and determining, by the other fixed-point arithmetic circuits in the sequence, a solution to a system of linear equations using the at least two decomposition matrices and the adaptive scales of the at least two decomposition matrices.
  • 18. The method of claim 11, wherein: a system of linear equations is defined by the first input signal and a second input signal that is received by one of the fixed-point arithmetic circuits, the second input signal has a dynamic true scale, and the method further comprises: generating, by a final fixed-point arithmetic circuit in the sequence as the at least one output signal, a solution to the system of linear equations; and determining, by the final fixed-point arithmetic circuit in the sequence, the adaptive scale of the solution such that it is different from the true scale of the second input signal.
  • 19. The method of claim 11, wherein: the first input signal is a Hermitian matrix C having the dynamic true scale S_C and the associated control scale N_C, and the method further comprises: performing, by the first fixed-point arithmetic circuit in the sequence, Cholesky matrix decomposition on C to generate, as the at least one output signal: a lower triangular matrix L having the associated control scale N_L and the adaptive scale S_L, and a vector IL having the associated control scale N_IL and the adaptive scale S_IL, wherein IL is a reciprocal of the diagonal elements of L; determining, by the first fixed-point arithmetic circuit in the sequence, S_L from N_L based on S_C and N_C; determining, by the first fixed-point arithmetic circuit in the sequence, S_IL from N_IL based on S_C and N_C; receiving, at a second fixed-point arithmetic circuit in the sequence, a second input signal that is a matrix p having a dynamic true scale S_p and the associated control scale N_p, wherein S_p = N_p; performing, by the second fixed-point arithmetic circuit in the sequence, forward substitution based on p, L, and IL, to generate, as the at least one output signal, a matrix z that is the solution of p = Lz for z, where z = L^H x, z having the adaptive scale S_z and the associated control scale N_z such that N_z = N_p; determining, by the second fixed-point arithmetic circuit in the sequence, S_z from N_p based on S_C and N_C; performing, by a third fixed-point arithmetic circuit in the sequence, backward substitution based on z, L, and IL, to generate, as the at least one output signal, a matrix x that is a solution of z = L^H x for x, x having the adaptive scale S_x and the associated control scale N_x such that N_x = N_p; and determining, by the third fixed-point arithmetic circuit in the sequence, S_x from N_p based on S_C and N_C.
  • 20. The method of claim 19, further comprising: receiving, at a preprocessing circuit, a matrix y and a matrix A as inputs, where y and A define a system of linear equations y = Ax; generating, by the preprocessing circuit, the first input signal C such that C = A^H A; and generating, by the preprocessing circuit, the second input signal p such that p = A^H y.
CROSS-REFERENCE TO RELATED APPLICATION(S) AND CLAIM OF PRIORITY

This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application No. 63/352,794 filed on Jun. 16, 2022, which is hereby incorporated by reference in its entirety.

Related Publications (1)
Number Date Country
20230412428 A1 Dec 2023 US
Provisional Applications (1)
Number Date Country
63352794 Jun 2022 US