This application claims priority to German patent application No. 10 2022 214 276.4, filed on Dec. 22, 2022, which is hereby incorporated by reference.
The technical field relates to a coherent lidar system for capturing the surroundings of a motor vehicle.
Motor vehicles are increasingly being equipped with driver assistance systems which capture the surroundings with the aid of sensor systems and deduce automatic reactions of the vehicle and/or instruct the driver as a result of the traffic situation recognized therefrom. A distinction is made between comfort and safety functions.
By now, however, developments have gone in an even more far-reaching direction. The driver is no longer only assisted, but rather the driver's task is increasingly being handled autonomously by the vehicle, i.e., the driver is being increasingly replaced; this is referred to as autonomous driving.
In particular, autonomous driving requires sensors with information about the surroundings which is highly accurate and easy to evaluate by machines. Radar systems are limited in their angular accuracy and separation capability and cannot satisfactorily meet these high capturing requirements on their own or even in combination with camera systems, at least not yet. For this reason, lidar systems, which have a similarly high angular resolution (horizontally and vertically) to a camera, but which additionally supply distance information and separation capability in each pixel, are also deployed in parallel. Today, so-called time-of-flight lidar systems which deal with electromagnetic radiation in the sense of particles and which can, thus, only measure the distance, but not the relative speed directly, are mostly deployed. However, the focus is now also increasingly on coherent lidar systems which deal with electromagnetic radiation in the sense of waves (like radar systems) and, therefore, can also directly measure the relative speed of objects via the Doppler effect. A further advantage of coherent lidar systems is that they have a higher sensitivity at higher distances and, therefore, allow higher ranges. In addition, coherent lidar systems are credited with a higher potential for high semiconductor integration, which promises lower manufacturing costs.
In the case of coherent lidar systems, the emitted electromagnetic wave is modulated, i.e., at least one of its parameters of amplitude, frequency or phase varies over time; otherwise, no distance measurement would be possible. The most commonly used modulation in coherent lidar systems is linear frequency modulation (FMCW = frequency-modulated continuous wave), which mostly consists of two frequency ramps, the slopes of which have opposite algebraic signs. However, this modulation has ambiguity problems, in particular in the case of multiple reflections in the same beam direction, and, in addition, producing a highly linear frequency alteration is complex. These disadvantages do not occur, or occur to a lesser extent, in the case of a phase modulation (e.g., with pseudo-random change over discrete phase values), but the digital evaluation of the received signals is more complex, and the approaches proposed in the prior art are associated with disadvantages, in particular in terms of sensitivity and, therefore, range.
As such, there is an opportunity to provide a coherent lidar system with phase modulation in which the digital evaluation of the received signals makes possible the highest possible detection sensitivity, accuracy and separation capability.
A coherently working lidar system for capturing the surroundings of a vehicle emits a phase-modulated signal, in particular with pseudo-random change over discrete phase values, wherein the signals reflected back from objects, which are delayed with respect to the emitted signal by the distance-dependent transit time and are shifted in frequency by the relative speed-dependent Doppler effect, are received, converted into a low-frequency signal by mixing, and digitized. The lidar system additionally has digital signal processing means for correlation filtering of the low-frequency received signal, in particular in order to guarantee as accurate a determination as possible (i.e., high sensitivity and range), wherein the correlation filtering is two-dimensional due to the two dimensions of time shift and frequency shift of the signals reflected by objects, which are not known at the start of the measurement (i.e., a priori). According to the invention, at least a part of the two-dimensional correlation filter is realized by a hardwired digital circuit (i.e., implemented in hardware and not alterable), which is embodied as a pipeline (i.e., a calculation in multiple stages separated by buffer memories) for at least partial performance of the correlation filtering, wherein multiple or all of the output values are determined per clock cycle of the digital circuit in one of the two dimensions, and over a sequence of clock cycles in the other dimension.
In the lidar system, the multiplications by twiddle factors required for a spectral transformation can be expediently realized in the hardwired digital circuit by a few additions and/or subtractions of shifted signal values, wherein a maximum of one addition or subtraction may be utilized in order to realize a real-valued multiplication.
The bit length used may change over the pipeline stages of the hardwired digital circuit and may be, in each case, only large enough for the quantization noise produced in the digital circuit to not significantly increase the system noise which arises in the analog part of the receiver.
According to one configuration of the lidar system, the hardwired digital circuit has a front stage, in which the signal sequence or the complex-conjugated values thereof and the modulation sequence or the complex-conjugated values thereof are multiplied at positions which are shifted with respect to one another, if necessary followed by a decimation of the sequence arising from the multiplication and/or, if necessary, followed by an extension with zeros (zero padding), and followed by a fast Fourier transform (FFT) realized in multiple stages, wherein each individual computing operation is realized in a dedicated circuit, the shift between the signal and modulation sequence is altered from clock cycle to clock cycle in the first stage, and the result of a Fourier transform arises at the output of the rear stage, wherein the result refers in each case to the output data of the front stage produced multiple cycles previously.
A binary phase modulation, i.e., one with two phase positions which differ by approximately 180°, may be utilized, so that the multiplications by the values of the modulation sequence can be realized by switchable inverters.
The fast Fourier transform can be expediently executed in the form of a structure with decimation in frequency, in order to avoid a resorting of the input data in the form of long lines, and in order to have or arrange the longest lines of the structure and the nontrivial multiplications in the front stages with their lower bit length.
According to an advantageous configuration, the hardwired digital circuit can have a front stage, in which a Fourier transform of a signal sequence or the complex-conjugated values thereof and a Fourier transform of the modulation sequence or the complex-conjugated values thereof are multiplied at positions which are shifted with respect to one another, if necessary followed by a decimation of the sequence arising from the multiplication and/or, if necessary, followed by an extension with zeros (zero padding), and followed by an inverse fast Fourier transform realized in multiple stages, wherein each individual computing operation is realized in a dedicated circuit, the shift between the two Fourier transforms is altered from clock cycle to clock cycle in the first stage, and the result of an inverse Fourier transform arises at the output of the rear stage, wherein the result refers in each case to the output data of the front stage produced multiple cycles previously.
In the hardwired digital circuit, pure truncation may be expediently utilized for quantizing and/or pure bit inversion may be expediently utilized for inversion, wherein the effects of the resulting mean errors are compensated for by addition of correction values in a stage of the digital circuit.
Components of couplings and reflections within the lidar system or its immediate surroundings, in particular a cover, which are contained in the digitized received signal, are eliminated to a large extent by addition or subtraction of a compensation signal.
The hardwired digital circuit may additionally be utilized for multiple or all of the pixels by virtue of its high throughput rate, wherein the pixels can be generated in particular by scanning light rays and/or parallel receive paths.
The hardwired digital circuit may be expediently extended by one or more further stages in order to evaluate the result of the correlation filtering, in particular for absolute-value or power formation and downstream totaling and/or searching for the maximum.
In
The expenditure-optimized realization of a complex-valued multiplier for an exemplary twiddle factor is depicted in
wherein c = 3·10^8 m/s is the speed of light, and with a frequency shift which is dependent on the radial relative speed v and, therefore, variable, which is produced by the Doppler effect
the signal is then acquired by the transceiver unit 1.6 and is routed via the circulator 1.5 into the further receive path. In a complex-valued mixer 1.8, the modulated received signal is superimposed with the unmodulated laser signal and is converted with the aid of the photodiode unit 1.9 into a complex-valued, low-frequency signal; the frequency of the signal corresponds to the Doppler shift, the modulation of the signal is delayed by the signal transit time with respect to that of the transmit signal. In
wherein it is assumed here that the transit time t0 is an integral multiple m0 of the modulation time Tm=6.67 ns:
which corresponds to the Doppler shift is likewise integral; a is the complex-valued amplitude of the received signal, “exp” denotes the exponential function and j is the imaginary unit.
The complex-valued received signal e(n) according to (2) relates to an individual object without a longitudinal extent, and to an ideal receiver. In actual fact, there can be multiple and/or extended objects, and an additional noise r(n) is generated in the receiver, in particular due to thermal noise; this then produces the received signal
wherein “sum_{i=1, . . . , l}” denotes the sum over the index i=1, . . . , l of the l non-extended individual objects.
The discrete transit times m0,i and the discrete Doppler shifts k0,i of the l objects are to be established from the received signal e(n) of the period of time n=0,1, . . . ,N-1. For a determination which is as accurate as possible, that is to say a separation of the signal and noise which is as good as possible and, therefore, for maximum sensitivity and range of the lidar system, so-called optimal filtering is to be applied, that is to say filtering by correlation between the received signal e(n) and the two-dimensional space êm,k(n) of the possible ideal amplitude-standardized received signals of an individual object:
M corresponds to the largest object distance which is to be assumed or is of interest, and it is assumed for the Doppler shift k that it can assume any values. This therefore produces the two-dimensional correlation Em,k
wherein “conj” denotes the complex-conjugate formation and the modulation sequence b(n) does not alter because of its real-valuedness. The correlation Em,k has peaks (often referred to as power peaks) at the positions (m,k)=(m0,i,k0,i) of objects;
The calculation of the two-dimensional correlation and its downstream evaluation take place in the digital signal processing unit 1.11. It constitutes a high expenditure with the order of N·M·N. However, the above relationship (5) can also be considered as a discrete Fourier transform over the product e(n)·b(n-m), n=0, . . . , N-1 which is to be determined for each m=0, . . . , M-1; the discrete Fourier transform (DFT) is calculated by way of the fast Fourier transform (FFT):
wherein k=0, . . . , N-1 is the output dimension of the FFT, that is to say the discrete frequency, so that the computing expenditure is reduced to the order of M·N·log2(N).
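As a numerical illustration of the evaluation described above, the following minimal sketch forms, for each shift m, the product e(n)·b(n-m) and transforms it over n, then searches the two-dimensional result for its peak. The small sizes N=64 and M=16, the random ±1 modulation sequence, the cyclic continuation of b(n), the sign convention exp(−j2π·n·k/N), the simulated single-object echo and all function names are assumptions made for this example; a plain DFT stands in for the FFT for brevity.

```python
import cmath
import random

def dft(x):
    """Plain O(N^2) discrete Fourier transform (stands in for the FFT)."""
    n_len = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_len)
                for n in range(n_len))
            for k in range(n_len)]

def correlate_2d(e, b, m_max):
    """corr[m][k] = DFT over n of e(n) * b(n - m), for m = 0 .. m_max-1."""
    n_len = len(e)
    return [dft([e[n] * b[(n - m) % n_len] for n in range(n_len)])
            for m in range(m_max)]

def simulate_echo(b, m0, k0, a=1.0):
    """Ideal echo of one object: delay m0 samples, Doppler index k0 (assumed model)."""
    n_len = len(b)
    return [a * b[(n - m0) % n_len] * cmath.exp(2j * cmath.pi * n * k0 / n_len)
            for n in range(n_len)]

def find_peak(corr):
    """Return (m, k) of the largest amount in the two-dimensional correlation."""
    return max(((m, k) for m in range(len(corr)) for k in range(len(corr[0]))),
               key=lambda mk: abs(corr[mk[0]][mk[1]]))

N, M = 64, 16                                  # illustrative sizes only
rng = random.Random(0)
b = [rng.choice((-1, 1)) for _ in range(N)]    # assumed pseudo-random binary sequence
e = simulate_echo(b, m0=5, k0=9)
corr = correlate_2d(e, b, M)
peak = find_peak(corr)
```

In this noise-free setting the peak is found at the simulated delay and Doppler index, and its amount equals N times the echo amplitude.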
The previously considered received signal e(n) of the period of time n=0,1, . . . , N-1 and the associated correlation Em,k refer to an individual capturing direction, that is to say based on the horizontal and vertical direction, to one pixel. In actual fact, roughly 160,000 capturing directions, that is to say pixels, are covered in each capturing cycle of 50 ms; this is typically realized by a combination of parallel transmitter and receiver, that is to say parallel capturing of pixels, and scanning, that is to say sequential capturing of pixels. A parallel transmitter and receiver means that all of the elements 1.4-1.10 in
Since the number of the FFTs to be calculated per second, 800 million, roughly corresponds to the realizable clock frequency of 1 GHz of such computing logic, programmable multipliers can be dispensed with, by realizing each butterfly of the FFT and, consequently, each adder and multiplier contained therein (for the corresponding twiddle factor), in a dedicated manner, directly in the computing logic. This is depicted in block 5.4 in
The main expenditure for realizing such computing logic relates to the multipliers. The product between complex-valued signals and the twiddle factors
that is to say unit phasors (amount = 1), is formed in them; in general, four real-valued multipliers are needed for this. Each of these real-valued multipliers is typically realized by numerous additions of shifted values. However, a high degree of accuracy of the factors is not required here; for example, an error of up to 1/32 can be tolerated, i.e., the quantized values
can be used; in this case, “round” designates the rounding function. The noise generated by the rounding at the output of the correlation Em,k lies below the required dynamic range and also typically below the effect of the receiver noise; and the signal loss due to the noise can also be neglected. Therefore, multipliers by the factors ± 1/16, ± 2/16, . . . , ± 15/16 still have to be realized. As an example, the multiplier by the factor 7/16 is considered; due to
it can be realized by subtracting the input value shifted four places to the right from the input value shifted one place to the right; this assumes a binary number representation. The above representation of 7/16 is called CSD code (canonical-signed-digit code). With the exception of the factors ± 11/16 and ± 13/16, all of the above factors can be realized in accordance with relationship (8) with a maximum of one addition or subtraction; to avoid having to deploy two additions or subtractions for the factors ± 11/16 and ± 13/16, they are approximated by ± 10/16 and ± 14/16, which still leads to acceptable quantization noise.
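The statement about the factors that need more than one addition or subtraction can be checked by enumeration. The following sketch, whose function names and integer test value are hypothetical, lists for each numerator 1 to 15 the ways of writing numerator/16 as the sum or difference of two shifted copies of the input (including the unshifted input), and shows the 7/16 multiplier as a shift-and-subtract on integers.

```python
def one_adder_forms(numerator, frac_bits=4):
    """Forms (shift_a, sign_b, shift_b) with
    numerator / 2**frac_bits == 2**-shift_a + sign_b * 2**-shift_b, shift_a < shift_b."""
    forms = []
    for shift_a in range(frac_bits + 1):
        for shift_b in range(shift_a + 1, frac_bits + 1):
            for sign_b in (1, -1):
                value = (1 << (frac_bits - shift_a)) + sign_b * (1 << (frac_bits - shift_b))
                if value == numerator:
                    forms.append((shift_a, sign_b, shift_b))
    return forms

def is_single_shift(numerator, frac_bits=4):
    """True if the factor is a pure power of two (no adder needed)."""
    return numerator in [1 << i for i in range(frac_bits + 1)]

def times_7_over_16(x):
    """x * 7/16 realized as (x >> 1) - (x >> 4), i.e. 1/2 - 1/16."""
    return (x >> 1) - (x >> 4)

# Numerators that cannot be built from at most one addition or subtraction.
needs_two = [k for k in range(1, 16)
             if not is_single_shift(k) and not one_adder_forms(k)]
```

For inputs that are not multiples of 16, the two right shifts truncate separately, so the result can deviate slightly from the exact product.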
The complex-valued multiplier for the twiddle factor
is depicted in
A complex-valued addition and subtraction of two complex-valued values takes place, in each case, in the butterflies; that is to say, the amount of the result can be twice as large as the amount of the input values, so that the value range has to be extended upwards by one bit. As a result, the bit length would increase by 12 over the 12 stages of the FFT. Admittedly, the noise component of the values originating from the receiver noise also increases over the additions and subtractions and, indeed, on average by √2 in terms of amplitude. That is to say that, following two stages, in each case, the noise amplitude doubles. For this reason, the least significant bit, that is to say the LSB, can be omitted in each second stage (that is to say, it is then scaled by the factor 0.5); the quantization noise generated by this lies below the effect of the receiver noise, since the value range at the input of the FFT is selected such that the receiver noise already has the amplitude of multiple LSBs there. The effect of the simple omission of the LSBs (that is to say, without rounding), that is to say the mean errors arising as a result, can also be compensated for by adding correction values to the input values of the FFT. In the circuit according to
According to
According to relationship (6), the FFT can be applied to the product between the received signal e(n) and the shifted modulation sequence b(n-m) in order to determine the correlation Em,k. However, due to the cyclical nature of the modulation sequence b(n) (it has period N), the product can also be formed between the unshifted modulation sequence b(n) and the cyclically shifted received signal e(mod(n+m,N)), where “mod” represents the modulo function with modulus N, and the FFT is applied thereto:
the values of this correlation differ from those of relationship (6) in phase, but are identical in amount, and only the latter is relevant for the further evaluation, so that the same symbol is utilized here for the sake of simplicity (this relationship results from the time-shift theorem of the Fourier transform). The product between the cyclically shifted received signal e(mod(n+m,N)) and the modulation sequence b(n) is formed in block 5.2 in
Previously, in block 5.1, a correction signal c1(n) is added to the received signals, which serves to compensate for the effects of couplings and reflections within the lidar system or its immediate surroundings, in particular a cover.
As already explained above, in order to simplify calculations, pure truncation is utilized for quantization and pure bit inversion is utilized for inversion; the effects of the resulting errors, which have a non-zero mean, are compensated for by addition of correction values c2(n) in block 5.3 prior to the FFT. The correction stage could also be realized following the FFT instead of prior to the FFT.
Following the FFT, that is to say following the formation of the correlation Em,k, the result is processed further. First, the amount of each of the N=4096 complex values is formed in block 5.5. Since a high degree of accuracy is not required here, the following approximation can be utilized for the amount |i| of the complex value i=iRe+j·iIm:
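Why a fixed correction value can compensate for these simplifications is illustrated by the following sketch, which assumes a two's complement integer representation and hypothetical function names: dropping the LSB without rounding and inverting all bits instead of negating both produce errors with a constant mean.

```python
def truncate_lsb(x):
    """Drop the least significant bit without rounding (arithmetic right shift)."""
    return x >> 1

def bit_invert(x):
    """Approximate negation by inverting all bits (two's complement assumed)."""
    return ~x

values = range(-64, 64)

# Error of truncation relative to an exact halving: 0 for even, -0.5 for odd inputs.
trunc_errors = [truncate_lsb(x) - x / 2 for x in values]
mean_trunc_error = sum(trunc_errors) / len(trunc_errors)

# Error of bit inversion relative to exact negation: always -1 LSB.
invert_errors = [bit_invert(x) - (-x) for x in values]
```

Because the mean errors do not depend on the signal, they can be removed by a correction value added once, rather than by rounding in every stage.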
wherein “max” and “min” denote the maximum and minimum function; the calculation can be implemented with little logic expenditure.
The amounts of the N=4096 values calculated in this way go both into block 5.6 for totaling and into block 5.7 for formation of the maximum. Both blocks are configured in a cascaded form; in each of the 12 stages, the sums or the maximums of value pairs are formed in each case. Required registers between the stages are not depicted.
The totaling is required to estimate the noise level, so that peaks of the correlation which are generated by objects can be distinguished from noise peaks. Since there are only very few peaks generated by objects in the correlation, that is to say, most of the values only represent noise, the sum following division by 4096, that is to say, shifting right by 12 bits, supplies a good estimate of the noise level.
The determination of the maximum establishes the maximum amount and the associated index k of N=4096 FFT output values for the respective shift m (which corresponds to the distance), that is to say, in the Doppler dimension, i.e., relative speed dimension. If the maximum is above the estimated noise by a factor of at least 3, it is deemed to be generated by an object; the distance and relative speed of the object can be determined from the associated shift m=m0 and Doppler index k=k0, its reflectivity can be determined from the level. If, as depicted in block 5.7, only the absolute maximum is determined, only the most reflective object in the respective pixel can be determined at a distance. If the aim is to cover the very unlikely case that there are two objects having different relative speeds in one pixel at one distance (that is to say, for instance, the range of one meter), the respective maximum of multiple value blocks could also be output—due to the cascaded construction of the search for the maximum, e.g., of 8 equally long blocks. If the input data of the search for the maximum are arranged appropriately, multiple blocks can also be utilized so that an interpolation of the peak in the FFT can be performed for a more precise relative speed; because the peak is typically seen in two adjacent FFT values (since it does not lie—as previously considered—at an integral Doppler index k0) and, if the input data are arranged appropriately, these are in different blocks of the search for the maximum, both values are obtained thereafter.
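The interplay of totaling, maximum search and threshold described above can be sketched as follows. Integer amounts, the synthetic input data and the function names are assumptions for this example; the cascade mirrors the pairwise stages of blocks 5.6 and 5.7 only in principle.

```python
def cascade(values, combine):
    """Reduce a power-of-two-length list stage by stage over value pairs."""
    stage = list(values)
    while len(stage) > 1:
        stage = [combine(stage[i], stage[i + 1]) for i in range(0, len(stage), 2)]
    return stage[0]

def detect(amounts, threshold_factor=3):
    """Return (maximum, noise estimate, object detected?) for one shift m."""
    log2_len = len(amounts).bit_length() - 1           # 12 for 4096 values
    noise_level = cascade(amounts, lambda a, b: a + b) >> log2_len
    peak = cascade(amounts, max)
    return peak, noise_level, peak >= threshold_factor * noise_level

# Synthetic example: flat noise floor of 10 with one object peak of 200.
with_object = [10] * 4096
with_object[1234] = 200
noise_only = [10] * 4096

result_object = detect(with_object)
result_noise = detect(noise_only)
```

The single peak hardly changes the noise estimate, which is why the plain sum over all values is usable as a noise reference.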
It should also be commented that no window function, that is to say no multiplication of the input values of the FFT by a kind of bell curve, is utilized for the FFT; this would only be necessary or useful if two objects having a similar relative speed and notably different reflectivity can occur at the same distance in one pixel and are to be separated. In particular, when no window function is utilized at the input of the FFT, the sensitivity at the output of the FFT is then reduced (that is to say, the detection capacity of objects having weak reflectivity and high distance) when the Doppler index k0 corresponding to the relative speed is not integral, that is to say, the peak is divided between two adjacent FFT values. The effect can be reduced by selecting the length of the FFT to be higher than that of its input signal, i.e., zeros are appended to the input signal, which is referred to as zero padding.
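The loss for a non-integral Doppler index and its reduction by zero padding can be illustrated numerically. The sketch below assumes a short tone of N=32 samples, hypothetical function names and a plain DFT instead of an FFT; it compares the worst case without padding (peak halfway between two bins) with the worst case for a doubled transform length (peak a quarter of the original bin spacing away from the nearest output value).

```python
import cmath

def peak_amount(signal, fft_len):
    """Largest DFT amount when the signal is zero padded to fft_len samples."""
    best = 0.0
    for k in range(fft_len):
        # The appended zeros contribute nothing, so only the signal is summed.
        acc = sum(signal[n] * cmath.exp(-2j * cmath.pi * n * k / fft_len)
                  for n in range(len(signal)))
        best = max(best, abs(acc))
    return best

def tone(n_len, k0):
    """Complex tone with (possibly non-integral) Doppler index k0."""
    return [cmath.exp(2j * cmath.pi * n * k0 / n_len) for n in range(n_len)]

N = 32
# Peak amount relative to the ideal value N (1.0 means no loss).
worst_plain = peak_amount(tone(N, 5.5), N) / N
worst_padded = peak_amount(tone(N, 5.25), 2 * N) / N
```

In this sketch the relative peak amount rises from roughly 0.64 to roughly 0.90 in the respective worst case.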
With regard to the index determination in the search for the maximum, it should be commented that this can be built up very easily bit by bit, beginning with the LSB due to the cascaded realization; at the output of each comparison of two values, in addition to the current maximum value, there is also an index value, the bit length of which corresponds to the number of the stage. The index thus arising refers to the linear numbering at the input of the search for the maximum; since the numbering at the output of the FFT is scrambled with respect to the Doppler index k, another conversion/mapping has to be performed later.
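The bit-by-bit index construction can be sketched as follows; each stage compares value pairs and contributes exactly one further index bit, starting with the LSB. The function names are hypothetical, and the final mapping assumes that the scrambling of the FFT output is a plain bit reversal, which is common for decimation-in-frequency structures but is not stated in the text.

```python
def cascaded_argmax(values):
    """Pairwise maximum search; each stage adds one index bit, LSB first."""
    stage = [(v, 0) for v in values]       # (current maximum, partial index)
    bit = 0
    while len(stage) > 1:
        nxt = []
        for i in range(0, len(stage), 2):
            (va, ia), (vb, ib) = stage[i], stage[i + 1]
            if vb > va:
                nxt.append((vb, ib | (1 << bit)))   # winner from the second half
            else:
                nxt.append((va, ia))                # winner from the first half
        stage = nxt
        bit += 1
    return stage[0]

def bit_reverse(index, bits):
    """Map a linear position to a Doppler index, ASSUMING bit-reversed ordering."""
    return int(format(index, "0{}b".format(bits))[::-1], 2)

maximum, position = cascaded_argmax([3, 9, 4, 1, 7, 12, 5, 2])
```

After stage s the partial index has s bits, which matches the statement that its bit length corresponds to the number of the stage.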
In the case of the design considered here (modulation time Tm=6.67 ns), the unambiguous relative speed range at the output of the FFT is about ±210 km/h and therefore corresponds approximately to the range of interest. In the case of a considerably smaller modulation time, only a part of the relative speed range at the output of the FFT would be of interest. A decimation could then be performed prior to the FFT, that is to say following multiplication between the shifted received signal and modulation sequence, in order to reduce the length of the FFT; in the simplest case, such a decimation is effected by formation of subtotals of the product sequence.
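The order of magnitude of the quoted speed range can be reproduced with a short calculation. The laser wavelength of 1550 nm used below is an assumption (it is not stated in this section), as are the function names; the sketch further assumes complex-valued sampling with a sampling repetition time equal to the modulation time and the usual Doppler relation of twice the radial speed divided by the wavelength.

```python
C = 3.0e8               # speed of light in m/s
T_M = 6.67e-9           # modulation time in s (from the text)
WAVELENGTH = 1550e-9    # laser wavelength in m; ASSUMED, not given in this section

def unambiguous_speed_kmh(t_mod, wavelength):
    """Largest |v| whose Doppler shift 2*v/wavelength stays within +-1/(2*t_mod)."""
    max_doppler_hz = 1.0 / (2.0 * t_mod)
    return max_doppler_hz * wavelength / 2.0 * 3.6

def range_cell_m(t_mod):
    """Distance corresponding to one discrete transit-time step (two-way path)."""
    return C * t_mod / 2.0

speed_limit = unambiguous_speed_kmh(T_M, WAVELENGTH)
cell = range_cell_m(T_M)
```

Under these assumptions the result is close to 210 km/h in each direction, and one discrete distance step corresponds to about one meter.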
That is to say, at the output of the hardwired digital circuit, which is depicted in
The logic of the digital circuit according to
An alternative structure to
wherein “CCm” means the cyclical correlation between the two sequences of length N and where m=0, . . . , N-1 is the dimension at the output of the correlation; that is to say that since N>M in the design under consideration, more discrete distances m than required are processed by the cyclical correlation.
A cyclical correlation in the time domain corresponds to a multiplication of the discrete Fourier transforms in the frequency domain:
wherein IFFTm means the inverse fast Fourier transform and m=0, . . . , N-1 is its output dimension (here, it is already assumed that the DFT is realized by way of an FFT). According to the frequency-shift theorem of the Fourier transform, the factor exp(−j2π·n/N·k) applied to the modulation sequence b(n) in the time domain means a shift in the frequency domain, that is to say of the Fourier transform:
due to the frequency-shift theorem of the Fourier transform and the cyclical nature of the discrete Fourier transform, the following further transformations can be conducted for the amount of the correlation:
This relationship can be implemented in a structure similar to that depicted in
If the same modulation sequence b(n) is always utilized, the multipliers can be implemented in a hardwired form for the realization of the product E(mod(I-k,N))·B(I); thanks to approximation and CSD representation, only a small implementation expenditure is then necessary. If the modulation sequence changes, programmable multipliers are necessary, which means considerably more implementation expenditure.
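The principle behind this alternative structure, namely that the correlation over all shifts m for one Doppler index k can be obtained from an inverse transform of a product of shifted spectra, can be checked numerically. The sketch below uses one possible index and sign convention, E(mod(l+k,N)) multiplied by the complex conjugate of B(l), derived for a real-valued modulation sequence; this convention, the small length N=16, the random test data and the function names are assumptions and need not coincide with the form used in the relationship referenced in the text.

```python
import cmath
import random

def dft(x, sign=-1):
    """Plain DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    n_len = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * n * k / n_len)
                for n in range(n_len))
            for k in range(n_len)]

def idft(x):
    """Inverse DFT including the 1/N normalization."""
    return [v / len(x) for v in dft(x, sign=+1)]

def corr_time(e, b, k):
    """Direct form: sum over n of e(n) * b(n - m) * exp(-j*2*pi*n*k/N), for all m."""
    n_len = len(e)
    return [sum(e[n] * b[(n - m) % n_len] * cmath.exp(-2j * cmath.pi * n * k / n_len)
                for n in range(n_len))
            for m in range(n_len)]

def corr_freq(e, b, k):
    """Same result from the spectra: inverse DFT over l of E(l + k) * conj(B(l))."""
    n_len = len(e)
    spec_e, spec_b = dft(e), dft(b)
    product = [spec_e[(l + k) % n_len] * spec_b[l].conjugate() for l in range(n_len)]
    return idft(product)

N, K = 16, 3
rng = random.Random(1)
b = [rng.choice((-1, 1)) for _ in range(N)]                  # real-valued sequence
e = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N)]
max_diff = max(abs(t - f) for t, f in zip(corr_time(e, b, K), corr_freq(e, b, K)))
```

Both forms agree up to floating-point rounding; the shift between the two spectra plays the role that the shift between signal and modulation sequence plays in the time-domain structure.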
As explained above, in the case of N>M (N≈16M) assumed here, many more discrete distances m=0, . . . , N-1 than required are processed. This can be circumvented by performing a decimation prior to the IFFT—in the case of a decimation by the factor 16, only 256 values then arise from the N=4096 values, which are fed into the IFFT; in the simplest case, the decimation is realized by adding, in each case, 16 adjacent values. Therefore, instead of the original dimension 4096, the IFFT only has the dimension 256 and therefore requires much less implementation expenditure; with its length 256, the full range of distances of length M=250 is also still covered.
Admittedly, it should be taken into consideration that such a structure has to be cycled through N=4096 times per pixel (so many shifts have to be carefully calculated for the FFT of the input signal); that is a good factor 16 more than in the structure according to
In the lidar system considered so far according to
So far, the case has been considered in which the modulation time Tm is equal to the sampling repetition time Ts. In the case of an ideal rectangular modulation signal which retains its ideal shape even in the received signal, it could happen that the signal is sampled exactly at the edge, where no meaningful information can be obtained. In order to avoid this, either the sampling repetition time Ts of the received signal can be chosen smaller than the modulation time Tm, e.g., half its size, or the form of the modulation sequence can be distorted, e.g., to an approximately triangular shape, either when it is generated or in the receiver. In the correlation Em,k, one peak is then typically obtained at two consecutive discrete distances m, and an interpolation can be performed over its values in order to determine the distance more accurately.
| Number | Date | Country | Kind |
|---|---|---|---|
| 10 2022 214 276.4 | Dec 2022 | DE | national |