The teachings presented herein relate to vision based control of a hovering air vehicle.
A contemporary topic of interest is that of so-called “Unmanned Air Vehicles” (UAVs), or “drone aircraft”. A significant challenge facing such UAVs is providing them with the ability to operate in cluttered, enclosed, and indoor environments with various levels of autonomy. It is desirable to provide such air vehicles with “Hover in Place” (HIP), or the ability to hold a position for a period of time. This capability would allow a human operator to, for example, pilot the air vehicle to a desired location, and release any “control sticks”, at which time the air vehicle would take over control and maintain its position. If the air vehicle were exposed to moving air currents or other disturbances, the air vehicle should then be able to hold or return to its original position. Such a HIP capability would also be beneficial to vehicles traveling in other mediums, for example underwater vehicles and space-borne vehicles.
It is also desirable to provide a HIP capability to smaller vehicles, such as so-called “micro air vehicles” (MAVs) having a maximum dimension of 30 cm or less. The small size of such MAVs implies that their payload capacity is limited, and for smaller vehicles may be on the order of just several grams or less. Implementing any avionics package for a platform of this class is therefore a challenge.
Most contemporary approaches to controlling an air vehicle incorporate the use of an inertial measurement unit (IMU), which typically includes both a three-axis gyro capable of measuring roll, pitch, and yaw rates, and an accelerometer capable of measuring accelerations in three directions. For short periods of time, the pose angles (roll, pitch, and yaw angles) of an air vehicle may be obtained by integrating the respective roll, pitch, and yaw rates over time. Likewise, the velocity of the air vehicle may be obtained by integrating the measured accelerations. The position of the air vehicle may then be obtained by integrating the velocity measurements over time. In practice, these methods can provide useful state estimations for short periods of time ranging from several seconds to a minute, depending on the quality of the IMU gyros and accelerometers. For longer periods of time, factors such as noise and offset will accumulate over time and cause the measured pose and position to diverge from the actual value. This is undesirable in an enclosed environment, where the accumulated error could cause the vehicle to crash into other objects. The effect of offset in accelerometer measurements on position estimate can be particularly drastic, since a constant offset integrated twice results in an error that grows quadratically in time.
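For purposes of illustration, the following MATLAB fragment is a minimal sketch of this effect; the sample rate and the accelerometer offset are assumed values chosen only to show how a constant offset, integrated twice, produces a position error that grows quadratically in time:
dt=0.01; % assumed 100 Hz sample rate
t=0:dt:60; % one minute of samples
offset=0.05; % assumed constant accelerometer offset, in m/s^2
vel=cumsum(offset*ones(size(t)))*dt; % first integration: velocity error grows linearly
pos=cumsum(vel)*dt; % second integration: position error grows quadratically
% pos(end) is approximately 0.5*offset*60^2 = 90 meters after one minute,
% even though the vehicle has not actually moved.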
In order to provide HIP, inspiration may be drawn from biology. One biologically inspired method of sensing relative motion with respect to the environment is the use of optical flow. Optical flow is the apparent visual motion seen from a camera or eye that results from relative motion between the camera and other objects or hazards in the environment. For an introduction to optical flow, including how it may be used to control air vehicles, refer to the paper, which shall be incorporated herein by reference, entitled “Biologically inspired visual sensing and flight control” by Barrows, Chahl, and Srinivasan, in the Aeronautical Journal, Vol. 107, pp. 159-168, published in 2003. Of particular interest is the general observation that flying insects appear to hover in place by keeping the optical flow zero in all directions. This rule is intuitively sound, since if the optical flow is zero, then the position and pose of the insect relative to other objects in the environment are unchanging, and therefore the insect is hovering in place. An initial implementation of this basic control rule for providing a helicopter with HIP is disclosed in the paper, the contents of which shall be incorporated herein by reference, entitled “Visual control of an autonomous helicopter” by Garratt and Chahl and published in the proceedings of the AIAA 41st Aerospace Sciences Meeting and Exhibit, 6-9 Jan. 2003, Reno, Nev. In this implementation a stereo vision system aimed downwards is used to measure the helicopter's altitude while optical flow is used to measure and control lateral drift.
One characteristic of flying insects is that they have compound eyes that are capable of viewing the world over a wide field of view, which for many insects is nearly omnidirectional. They therefore sense optical flow over nearly the entire field of view. Furthermore, in some insects neural cells have been identified that are capable of extracting patterns from the global optical flow field. This work has inspired both theoretical and experimental work on how to sense the environment using optical flow and then use this information to control a vehicle. The following collection of papers, which shall be incorporated herein by reference, describes this work in detail: “Extracting behaviorally relevant retinal image motion cues via wide-field integration” by Humbert and Frye, in the proceedings of the American Control Conference, Minneapolis Minn. 2006; the chapter “Wide-field integration methods for visuomotor control” by Humbert, Conroy, Neely, and Barrows in Flying Insects and Robotics, D. Floreano et al (eds.) Springer-Verlag Berlin Heidelberg 2009; “Experimental validation of wide-field integration methods for autonomous navigation” by Humbert, Hyslop, and Chinn, in the proceedings of the 2007 IEEE Intelligent Robots and Systems (IROS) conference; “Autonomous navigation in three-dimensional urban environments using wide-field integration of optic flow” by Hyslop and Humbert in the AIAA Journal of Guidance, Control, and Dynamics, Vol. 33, No. 1, January-February 2010; “Wide-field integration methods for autonomous navigation of 3-D environments” by Hyslop and Humbert in the proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, 18-21 Aug. 2008, Honolulu, Hi.; and “Bio-inspired visuomotor convergence” by Humbert and Hyslop in IEEE Transactions on Robotics, Vol. 26, No. 1, February 2010. A common theme to these papers is the spatial integration of optical flow over a wide field of view. This set of techniques may be referred to by the term “wide field integration”.
There are various methods of obtaining a wide field of view image of the environment suitable for optical flow processing. One method is described in the published patent application 20080225420 entitled “Multiple Aperture Optical Systems” by Barrows and Neely, which shall be incorporated herein by reference. This patent application discloses an array of vision sensors mounted on a flexible substrate that may be bent into a circle to image over a circular 360 degree field of view. Other methods include the use of a camera and a curved mirror, where the camera is pointed at the mirror in a way that the observed reflection covers a wide field of view. An example system is described in U.S. Pat. No. 5,790,181 entitled “Panoramic surveillance system” by Chahl et al, which shall be incorporated herein by reference.
Many methods exist for computing optical flow in a compact package. Applicable references, all of which shall be incorporated herein by reference, include U.S. Pat. No. 6,194,695 entitled “Photoreceptor array for linear optical flow measurement” by Barrows; U.S. Pat. No. 6,384,905 entitled “Optic flow sensor with fused elementary motion detector outputs” by Barrows; U.S. Pat. No. 7,659,967 entitled “Translational optical flow sensor” by Barrows; the paper “An image interpolation technique for the computation of optical flow and egomotion” by Srinivasan in Biological Cybernetics Vol. 71, No. 5, pages 401-415, September 1994; and the paper “An iterative image registration technique with an application to stereo vision” by Lucas and Kanade, in the proceedings of the Image Understanding Workshop, pages 121-130, 1981.
Two additional books that serve as reference material in the fields of optics and image processing include the book “Digital Image Processing, Third Edition”, by R. Gonzalez and R. Woods, published by Pearson Prentice Hall in 2008 and the book “Optical Imaging and Spectroscopy” by D. Brady, published by Wiley in 2009. Both books are incorporated herein by reference.
Since vision systems are a major part of the teachings included herein, we will now discuss the prior art of image sensors. An image sensor is a device that may be used to acquire an array of pixel values based on an image focused onto it. Image sensors are often used as part of a camera system comprising the image sensor, optics, and a processor. The optics projects a light image onto the image sensor based on the environment. The image sensor contains an array of pixel circuits that divides the light image into a pixel array. The pixel array may also be referred to as a “focal plane” since it generally may lie at the focal plane of the optics. The image sensor then generates the array of pixel values based on the light image and the geometry of the pixel circuits. The processor is connected to the image sensor and acquires the array of pixel values. These pixel values may then be used to construct a digital photograph, or may be processed by image processing algorithms to obtain intelligence on the environment.
In the late 20th Century, techniques were developed to fabricate image sensors on a chip using standard CMOS (complementary metal-oxide-semiconductor) integrated circuit fabrication techniques. The book “CMOS Imagers: From Phototransduction to Image Processing”, edited by Orly Yadid-Pecht and Ralph Etienne-Cummings, published by Kluwer Academic Publishers in 2004 is a reference on this art. The contents of this book are incorporated herein by reference.
In the teachings below, we will use the “C programming language convention” of indexing rows and columns starting with zero. For example, the top-most row will be referred to as “row 0” while the second row as “row 1” and so on. Similarly, the left-most column will be referred to as “column 0” and the second column as “column 1” and so on. We will use the notation (r,c) to refer to an element at “row r” and “column c”. This convention will generally be used for all two-dimensional arrays of elements. When referring to a digital signal, we will generally use the terms “high” and “low” to respectively indicate a digital “one” or digital “zero”. A signal that “pulses high” is a signal that starts out at a digital zero, rises to a digital one for a short time, and then returns to digital zero. A signal that “pulses low” similarly is a signal that starts out at a digital one, falls to a digital zero for a short time, and then returns to a digital one.
We will now describe the design of a prior art image sensor that may be manufactured in an integrated circuit. Refer to
The pixel circuit 101 of
Refer to
The image sensor 201 may be read out using the following algorithm, expressed as pseudocode, to acquire an image and store it in the two dimensional (2D) matrix IM. Variables NumRows and NumColumns respectively denote the number of rows and columns in the focal plane 203. It will be understood that since the pixel circuits of
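For purposes of illustration, the following is a minimal sketch of such a readout loop, written in MATLAB form for consistency with the listings that follow; the helper functions setRow(), setColumn(), and readAnalogOut() are hypothetical placeholders for whatever mechanism drives the row select circuit, the column select circuit, and the analog to digital conversion:
IM=zeros(NumRows,NumColumns);
for r=0:NumRows-1
    setRow(r); % select row r of the pixel array (hypothetical helper)
    for c=0:NumColumns-1
        setColumn(c); % select column c (hypothetical helper)
        IM(r+1,c+1)=readAnalogOut(); % digitize and store; MATLAB indices start at one
    end
end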
Refer to
The inventions claimed and/or described herein are further described in terms of exemplary embodiments. These exemplary embodiments are described in detail with reference to the drawings. These embodiments are non-limiting exemplary embodiments, in which like reference numerals represent similar structures throughout the several views of the drawings, and wherein:
We will now describe a number of exemplary embodiments for an image sensor. All of these embodiments may be implemented in a single integrated circuit to form an image sensor chip which may be used in a camera such as camera 281 of
First Exemplary Image Sensor
Refer to
The focal plane circuit 303 may be constructed in the same manner as the pixel array 231 of
The capacitor row select circuit 311 receives as input the aforementioned digital word RS 315 and two additional binary signals “loadrow” 323 and “readrow” 325, and generates an array 327 of capacitor load signals and capacitor read signals. The switched capacitor array 307 receives as input the amplified column signals 321 and the array 327 of capacitor load signals and capacitor read signals. The switched capacitor array 307 also receives as input an array of horizontal switching signals (not shown) and an array of vertical switching signals (not shown). The switched capacitor array 307 also generates an array of capacitor column signals 331, which are sent to the column select circuit 313. The operation of the capacitor row select circuit 311 and the switched capacitor array 307 will be discussed below.
The column select circuit 313 operates in a similar manner as the column select circuit 219 of
Refer to
When signal “bypamp” 417 is set to digital low and “selamp” 419 is set to digital high, the remaining components of the amplifier circuit 401 may be used to amplify the input signal “in” 405. Transistors M3 421 and M4 403 form an inverter 404. Transistors M1 423 and M2 425 are switches that connect the left side of capacitor C1 427 to either “Vref” 409 or “in” 405. Capacitor C2 429 is a feedback capacitor. The amplifier circuit 401 may then be operated as follows: First set “phi” 415 to a digital high, so that the inverter input 431 and the inverter output 433 are shorted together. Second set “sw1” 411 to a digital high and “sw2” 413 to a digital low, so that the left end of capacitor C1 427 is connected to “Vref” 409. Third set “phi” 415 to a digital low, so that the input 431 and output 433 of the inverter 404 are no longer shorted together. Fourth, set “sw1” 411 to a digital low and “sw2” 413 to a digital high, so that the left end of C1 427 is connected to the input voltage “in” 405. Fifth, wait for a short settling duration. After this duration, the output voltage at “out” 407 will be approximately equal to the value:
out ≈ K − (C1/C2)×(in − Vref),
where K is a constant voltage depending on the geometry of the transistors and capacitors in circuit 401 and on the fabrication process used to manufacture the image sensor 301. By choosing the ratio of C1 427 to C2 429, it is possible to select the gain of the amplifier 401. By selecting the voltage Vref 409, it is possible to shift the output voltage up or down and compensate for values of K or for changing global light levels.
Refer to
When signal “load0” 511 is pulsed high, capacitor C1 533 stores a voltage equal to the “in0” signal 521. (We define “pulsed high” as setting a signal to a digital high for a short time and then returning the signal to a digital low.) Similarly, the capacitors of all the other switched capacitor cells in the first row 571 (e.g. “row 0”) store the other respective amplified column signals 321. Essentially the topmost row 571 of switched capacitor cells stores a “snapshot” corresponding to the light focused on the topmost row of pixel circuits in the focal plane 303.
Let us next set RS=1, so that the focal plane 303 is outputting the second row (e.g. “row 1”) of pixel signals, and then operate the row amplifier array 305 as described above. When signal “load1” 515 is pulsed high, the capacitors in the second row 573 (e.g. “row 1”) of the switched capacitor cells will store potentials corresponding to row 1 of pixel signals. This process can be repeated for the remaining rows in the focal plane 303 and switched capacitor array 307. After all rows have been accordingly processed, the capacitors in the switched capacitor array 307 store a sampling or a “snapshot” of the amplified column signals 321 generated according to the pixel signals of the focal plane 303. These capacitors effectively store an image. This process of cycling through all rows of the focal plane 303 and the switched capacitor array 307 to deposit a sampled image onto the capacitors of the switched capacitor array 307 may be referred to hereinafter as “the capacitor array 307 grabbing an image from the focal plane 303”.
Once the capacitor array 307 has grabbed an image from the focal plane 303, the image may be read out as follows: First, set signal “read0” 513 to a digital high. This closes the switch formed by transistor M3 537 and forms a source follower with M2 535 and M6 561 to read out the potential stored across capacitor C1 533. The entire “row 0” 571 of switched capacitor cells is similarly selected by “read0” 513. The column select circuit 313 (of
We may now define how the capacitor row select circuit 311 functions: When RS 315 is set to the value i (i is an integer), the respective signal “loadi” (of FIG. 5) will be digital high whenever “loadrow” 323 is digital high, and the signal “readi” will be digital high whenever “readrow” 325 is digital high. Thus row 0 (571) of the switched capacitor array 307 may be loaded by setting RS=0 and pulsing high the “loadrow” 323 signal. Similarly row 0 (571) may be read out by setting RS=0 and pulsing high the “readrow” 325 signal. It is possible to both load a row and read out a row at the same time. It will be clear to those with basic knowledge of digital electronics how to construct such a circuit using a single decoder circuit and an array of AND gates, one for each “loadi” and “readi” signal.
So far we have only discussed transistors M1 531, M2 535, and M3 537 of each switched capacitor cell. We had assumed that since signals H1 601, H2 602, H3 603, . . . , and signals V1 611, V2 612, . . . are digital low, transistors M4 539 and M5 541 (and their replicates in other switched capacitor cells) behave as open switches and are thus equivalently ignored. Note that transistor M4 (e.g. 539) of each switched capacitor cell connects the capacitors of two horizontally adjacent switched capacitor cells. Also note that transistor M5 (e.g. 541) of each switched capacitor cell connects the capacitors of two vertically adjacent switched capacitor cells. We can refer to the M4 and the M5 transistors of the switched capacitor cells respectively as “horizontally shorting switches” and “vertically shorting switches”. Similarly signals H1 601, H2 602, and so on may be referred to as “horizontal switch signals” and signals V1 611, V2 612, and so on may be referred to as “vertical switch signals”. Note also that signal H1 601 closes the M4 transistors between columns 0 and 1, signal H2 602 closes the M4 transistors between columns 1 and 2, and so on. Note also that signal V1 611 closes the M5 transistors between rows 0 and 1, signal V2 612 closes the M5 transistors between rows 1 and 2, and so on.
Refer to
For this next discussion, we will now use the aforementioned notation (r,c) to denote “row r” and “column c” of an array. Referring back to the circuit shown in
The image stored on the switched capacitor array 307 may be binned down by other amounts. For example the signals H1, H2, H3, H5, H6, H7, V1, V2, V3, V5, V6, and V7 may be set to digital high, and the other switching signals set to digital low, to short out 4×4 blocks of switched capacitor cells, implement 4×4 size super pixels, and thereby bin and downsample the image by a factor of 4 in each direction. To read out the resulting image from the switched capacitor array, it will be necessary to read out only every fourth row and column of switched capacitor cells, for example switched capacitor cells (0,0), (0,4), (0,8), . . . , (4,0), (4,4), and so on. It is possible to bin and downsample the image by other amounts, including by a different amount in each direction: Setting signals H1 . . . H7 high, H8 low, and keeping V1 through V8 low will implement horizontal rectangular super pixels of 1×8 size. Super pixels shaped like elongated rectangles may be useful for certain image processing functions, for example the computation of one dimensional optical flow as discussed in the aforementioned U.S. Pat. No. 6,194,695.
It is also possible to use the switched capacitor array 307 to implement an approximate Gaussian smoothing function. Suppose the switched capacitor array 307 grabs an image from the focal plane 303 using the methods described above. Then consider this sequence of switching signals:
Step 1: Set H1, H3, H5, H7, V1, V3, V5, and V7 high, and others low
Step 2: Set all switching signals low
Step 3: Set H2, H4, H6, H8, V2, V4, V6, and V8 high, and others low
Step 4: Set all switching signals low
If we repeat this sequence of four steps a number of times, then the effect will be that of smoothing the image stored on the switched capacitor array. The resulting image will resemble the original image, as detected by the focal plane 303 and stored on the switched capacitor array 307, convolved with a Gaussian smoothing function. More repetitions of the above four steps will result in greater smoothing.
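For purposes of illustration, the following MATLAB fragment is a minimal sketch of this smoothing sequence; the helper function setSwitchingSignals(H,V), in which H and V are eight element vectors holding the desired levels of H1 through H8 and V1 through V8, and the pass count numPasses are assumptions introduced only for this sketch:
oddSignals=[1 0 1 0 1 0 1 0]; % H1, H3, H5, H7 (and V1, V3, V5, V7) high
evenSignals=[0 1 0 1 0 1 0 1]; % H2, H4, H6, H8 (and V2, V4, V6, V8) high
for k=1:numPasses % more passes give greater smoothing
    setSwitchingSignals(oddSignals,oddSignals); % Step 1
    setSwitchingSignals(zeros(1,8),zeros(1,8)); % Step 2
    setSwitchingSignals(evenSignals,evenSignals); % Step 3
    setSwitchingSignals(zeros(1,8),zeros(1,8)); % Step 4
end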
As a more detailed example, let us discuss an exemplary algorithm that may be used to operate and read out an image from the first exemplary image sensor 301. For purposes of illustration, we will assume the image sensor has a raw resolution of 64×64 pixels, and that we are binning and downsampling it by 8 pixels in each direction. The resulting 8×8 image will be stored in the 2D matrix IM. We will also assume that we are using amplification in the row amplifier array 305. It will be understood that since the pixel circuits of
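For purposes of illustration, the following is a minimal sketch of one possible sequence, not the exact listing; the helper functions setRS(), operateRowAmplifier(), pulseLoadrow(), setReadrow(), setCS(), setSwitchingSignals(), and readAnalogOut() are hypothetical placeholders for the circuitry that drives the corresponding signals of image sensor 301:
% Grab the full 64x64 image from the focal plane 303 into the switched capacitor array 307
setSwitchingSignals(zeros(1,8),zeros(1,8)); % all horizontal and vertical switching signals low
for r=0:63
    setRS(r); % select focal plane row r
    operateRowAmplifier(); % cycle phi, sw1, and sw2 as described earlier
    pulseLoadrow(); % store the amplified row onto row r of the switched capacitor array
end
% Short each 8x8 block of capacitors into a super pixel; H8 and V8 remain low
setSwitchingSignals([1 1 1 1 1 1 1 0],[1 1 1 1 1 1 1 0]);
% Read out one capacitor from each super pixel into the 8x8 matrix IM
IM=zeros(8,8);
for r=0:8:56
    setRS(r);
    setReadrow(1); % assert the readrow signal for this row
    for c=0:8:56
        setCS(c);
        IM(r/8+1,c/8+1)=readAnalogOut();
    end
    setReadrow(0);
end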
One substantial advantage of the techniques described above in association with
If the first exemplary embodiment is implemented in an integrated circuit, it is advantageous to cover up the switched capacitor array so that no light strikes it. This may reduce the amount of leakage current between the top node of capacitor C1 (e.g. 533) of each switched capacitor cell and the substrate, and allow an image to be stored for more time.
It will be understood that other variations of the exemplary image sensor 301 are possible. For example, the focal plane 303, the row amplifier array 305, and the switched capacitor array 307 may each be varied. The row amplifier array may in fact be optionally eliminated if unamplified pixel signals are tolerable. Likewise the capacitors in the switched capacitor array 307 may be connected in other manners than as described, for example by utilizing additional switches to connect diagonally adjacent or other switched capacitor cells.
Second Exemplary Image Sensor
Refer to
Refer to
Using the horizontal and vertical shorting transistors, it is possible to short together blocks of pixel circuits into super pixels. For example, if H1, H3, H5, H7, V1, V3, V5, and V7 are high, while the other switching signals are low, the focal plane array will be configured to form 2×2 super pixels from the individual pixel circuits. In a manner similar to that of the first exemplary image sensor 301, only every other row and column of pixel circuits needs to be read out and acquired. In the same manner as the first exemplary image sensor 301, this would require acquiring fewer pixel signals and require less memory. Other sizes and dimensions of super pixels may similarly be defined. Once an image has been read out with one set of switching signals, the switching signals may be changed and after a short delay (to let the pixel circuits settle to the new connections) a new image may be read out.
As a more detailed example, let us discuss an exemplary algorithm that may be used to operate and read out an image from the second exemplary image sensor 701. For purposes of illustration, we will assume the image sensor has a raw resolution of 64×64 pixels, and that we are downsampling it by 8 pixels in each direction. The resulting 8×8 image will be stored in the two dimensional matrix IM. We will also assume that we are using amplification in the row amplifier array 713. It will be understood that since the pixel circuits formed by transistors M1 (e.g. 801) and D1 (e.g. 811) output a lower voltage for brighter light, the values stored in matrix IM may similarly be lower for brighter light.
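For purposes of illustration, the following is a minimal sketch of one possible sequence for the second exemplary image sensor 701, using the same hypothetical helper functions as the earlier sketch plus an assumed settling delay:
% Form 8x8 super pixels directly in the focal plane; H8 and V8 remain low
setSwitchingSignals([1 1 1 1 1 1 1 0],[1 1 1 1 1 1 1 0]);
pause(0.001); % assumed settling time for the shorted pixel circuits
IM=zeros(8,8);
for r=0:8:56
    setRS(r); % select every eighth row
    operateRowAmplifier(); % operate the row amplifier array 713 if amplification is used
    for c=0:8:56
        setCS(c); % select every eighth column
        IM(r/8+1,c/8+1)=readAnalogOut();
    end
end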
Third Exemplary Image Sensor
Another variation of the first exemplary image sensor 301 will now be discussed. The third exemplary embodiment may be constructed exactly the same as the first exemplary image sensor 301. The one difference is in the construction of the switched capacitor cells (e.g. 543) of the switched capacitor array 307. Refer to
The input signal “in” 903, the load signal “load” 905, transistor M1 907, and capacitor C1 909 behave substantially the same as the corresponding input signal (e.g. 521), load signal (e.g. 511), transistor M1 (e.g. 531), and capacitor C1 (e.g. 533) of a switched capacitor cell (e.g. 543) of
Transistors M2 911 and M3 913 form a source follower circuit that buffers the voltage on capacitor C1 909. Transistor M4 915 is connected to a global signal “copy” 917 that, when pulsed high, deposits a potential on capacitor C2 919 that is a buffered version of the potential on capacitor C1 909. Note that it is beneficial for the bias voltage 921 at the gate of transistor M3 913 to be set to place transistor M3 913 in the “subthreshold region” to limit the current consumption of this circuit. To further reduce current consumption, it is possible to set the bias voltage 921 to zero except for when the potential across capacitor C1 909 is being copied to capacitor C2 919. It will be understood that the “copy” signal 917 and the bias voltage 921 may be global signals shared by all instances of the switched capacitor cell 901 used in the third exemplary embodiment.
A switched capacitor array constructed from switched capacitor cells (e.g. 901) as shown in
Transistors M5 931, M6 933, M7 935, and M8 937 behave substantially and respectively the same as transistors M2 535, M3 537, M4 539, and M5 541 of
The algorithm for operating and reading out the third exemplary image sensor is essentially the same as that for reading out the first exemplary image sensor 301, except that after the switched capacitor array 307 is loaded and before the switching signals are operated, the “copy” signal 917 needs to be pulsed high.
If the third exemplary embodiment is implemented in an integrated circuit, it is advantageous to cover up the switched capacitor array 307 so that no light strikes it. This may reduce the amount of leakage current between the top node of capacitor C1 909 and the substrate, and between the top node of capacitor C2 919 and the substrate, and allow an image to be stored for more time.
Comparison of the Above Exemplary Image Sensors
The above three exemplary image sensors are similar in that they allow a raw image as captured by their respective pixel arrays to be read out at raw resolution or read out at a lower resolution. The binning function implemented by the switching or shorting transistors is capable of merging together blocks of pixels or sampled pixel values into super pixels. The readout circuits then allow the reading out of only one pixel value from each super pixel, thus reducing the number of pixels acquired and memory required for storing the image data. However there are differences between the three exemplary image sensors that may make one image sensor more appropriate for a given application than another.
The second exemplary image sensor 701 is the simplest circuit, since the pixel shorting transistors are located within the focal plane 709. For a given resolution, fewer transistors and capacitors need to be utilized to implement the second exemplary image sensor 701. In addition, the second exemplary image sensor 701 is potentially faster than the other two exemplary image sensors. This is because once the switching signals H1 through H8 and V1 through V8 are set, and the pixel circuits settle to the new resulting values, the desired pixel signals may then be read out. There is no need to first load a switched capacitor array with pixel signals prior to binning/smoothing and readout. However the second exemplary image sensor 701 as depicted above is unable to implement Gaussian type smoothing functions by switching multiple times (e.g. turn on first odd-valued switching signals and then even-valued, and repeating this process several times). The second exemplary image sensor circuit generally only constructs rectangular super pixels, when implemented in the manner shown above in
The first exemplary image sensor 301 is more flexible than the second exemplary image sensor 701 in that smoothing may be implemented by cycling through different patterns of the switching signals. Gaussian smoothing functions may be approximated. However the first exemplary image sensor 301 requires more components per pixel to implement than the second 701, and may be slower since the switched capacitor array 307 needs to be loaded with an image from the focal plane 303 prior to any binning, smoothing, or downsampling. (There is an exception—it is possible to sample an image, perform some binning and/or smoothing, read out the image from the switched capacitor array 307, perform more binning and smoothing, and then read out the resulting image from the switched capacitor array 307.)
The third exemplary image sensor is similar to the first exemplary image sensor 301 but has one advantage: once an image is sampled onto the C1 capacitors (e.g. 909) of the switched capacitor array, this image may then be quickly loaded with the “copy” signal 917 onto the C2 capacitors (e.g. 919) for smoothing and/or binning. Once the raw image is processed with the C2 capacitors and switching transistors (e.g. 935 and 937) and then read out, the raw image may be quickly restored with the same “copy” signal 917. This allows essentially the same raw image to be processed in different ways without having to reload the switched capacitor array from the focal plane every time. Multiple binned/smoothed images may thus be generated from the same original snapshot of pixel signals. In contrast, with the first exemplary image sensor 301, once the raw image has been binned or smoothed, it may be necessary to reload the switched capacitor array, after which time the visual scene may have changed. However the third exemplary image sensor has the disadvantage of requiring more transistors and capacitors per pixel to implement. Furthermore transistors M2 (e.g. 911) and M5 (e.g. 931) each contribute a voltage drop that may limit the available voltage swing to encode image intensities.
Other Variations
Let us now discuss a variation that may be made to the first and third exemplary image sensors described above. Note that in both of these exemplary image sensors, a single switching transistor connects two capacitors from two adjacent switched capacitor cells. Specifically these are M4 539 and M5 541 of each switched capacitor cell (e.g. 543) of
When both switching signals “swA” 1011 and “swB” 1013 are digital high, both transistors 1003 and 1005 are on and the two connected nodes are shorted together. Alternatively the following sequence may be used and repeated a number of times:
Step 1: swA=high, swB=low
Step 2: swA=low, swB=low
Step 3: swA=low, swB=high
Step 4: swA=low, swB=low
If this sequence is repeated several times, then the charges of the two respective capacitors connected by these transistors are not equalized (in potential) but are redistributed slightly, by an amount determined by Cp 1007 and the appropriate capacitor in the switched capacitor cell. For example, suppose a left capacitor and a right capacitor were connected by the circuit 1001 shown in
Another variation that may be made to the first and third exemplary image sensors, including the aforementioned variations incorporating the technique depicted in
One characteristic of many image sensors incorporating logarithmic pixel circuits is that they may have a significant amount of fixed pattern noise (FPN). Such FPN originates from mismatch between transistors or other components from one instance of a circuit to another. Sample transistors that may contribute to FPN include any transistors in the pixel circuits 101 or 121, and row readout transistors such as M2 239 and M4 245 in
It will be understood that each image sensor, even if fabricated from the exact same layout or design, has its own fixed pattern noise. Likewise if any of the above three exemplary image sensors is operated with a different switching signal configuration for forming super pixels or for implementing smoothing, each such configuration will also have its own associated fixed pattern noise mask. Even changing parameters such as whether or not to use the amplification provided by the row amplifier array 305 may affect the fixed pattern noise. Thus it may be necessary to record and store a separate fixed pattern noise calibration mask for every permutation of specific image sensor, switching signal configuration, and amplifier setting to be used.
Exemplary Method of Computing Optical Flow
We now discuss an algorithm that may be used to compute optical flow from a camera system incorporating any of the image sensors above. As described above, logarithmic response pixels have a fixed pattern noise that can vary with temperature. Therefore it is desirable to have a way to compute optical flow in a manner that does not require a precomputed fixed pattern noise mask. We next discuss one such algorithm. For this discussion, we will use the prior art image sensor 201 described in
Refer to
Step 1 (1201): “Initialize FP and clear X1, X2, and XLP”. This may be performed using the following MATLAB instructions. Note that variable “fpstrength” indicates the strength of a fixed pattern noise mask and is a parameter that may be adjusted for a particular application. It is advantageous for fpstrength to be substantially less than the typical range of values observable in the visual field, but greater than or equal to the typical frame to frame noise in the sensor system. For example, suppose the typical noise level within a pixel is on the order of two ADC quantum levels (as determined by the precision of the analog to digital converter used to acquire the pixel values), and the typical variation of the texture from darker regions to brighter regions is on the order of 50 quantum levels. A suitable value for fpstrength may be on the order of two to ten.
fpstrength=1.0;
FP=fpstrength*rand(m,n);
X1=zeros(m,n);
X2=zeros(m,n);
XLP=zeros(m,n);
The matrix FP may alternatively be formed using tilings of the array [0 0 0 0; 0 1 1 0; 0 1 1 0; 0 0 0 0] multiplied by fpstrength. For example, for an 8×8 array the matrix FP may be generated as follows:
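One possible realization, assuming the MATLAB repmat function is used to perform the tiling, is:
FP=fpstrength*repmat([0 0 0 0; 0 1 1 0; 0 1 1 0; 0 0 0 0],2,2);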
Step 2 (1202): “Grab image X from sensor”. In this step, the processor 297 grabs an m×n image from the image sensor 283 and deposits it into m×n matrix X. Image X may be a raw image or may be an image of super pixels. This step may be performed as described above for the above exemplary image sensors or prior art image sensor. It will be understood that the acquisition of X from the image sensor accounts for the aforementioned image inversion performed by any optical assembly when an image is focused onto a focal plane.
Step 3 (1203): “Compute XLP=XLP+alpha(X-XLP)”. This may be performed using the following MATLAB instructions. XLP will be a time-domain low-passed version of X. Note that variable “alpha” is between zero and one and controls the low pass filter cutoff frequency. This parameter may be adjusted for a particular application.
alpha=0.1; % set to a value between 0 and 1
XLP=XLP+alpha*(X-XLP);
Step 4 (1204): “Set X1=X2”. This may be performed with the following MATLAB instruction:
X1=X2;
Step 5 (1205): “Set X2=X−XLP”. X2 will be a time domain high-passed version of X. In other words, each element of X2 will be a time domain high-passed version of the corresponding element of X. This may be performed with the following MATLAB instruction:
X2=X-XLP;
Step 6 (1206): “Set X1F=X1+FP and X2F=X2+FP”. This may be performed with the following MATLAB instructions:
X1F=X1+FP;
X2F=X2+FP;
Step 7 (1207): “Compute optical flow from X1F and X2F”. Using X1F and X2F respectively as a first and second image, compute the optical flow between the two images. The result becomes the output optical flow of the algorithm. After completing this step the algorithm returns to Step 2 (1202). Steps 2 (1202) through Steps 7 (1207) therefore represent one cycle of the algorithm 1200. The computation of optical flow may be performed with a wide variety of algorithms. One possible method of computing optical flow may be performed using the following MATLAB instructions:
delta=1;
[ofx,ofy]=ii2(X1F,X2F,delta);
using the following MATLAB function “ii2”. This function is an implementation of Srinivasan's “Image Interpolation Algorithm (IIA)” which is disclosed in the aforementioned publication “An image-interpolation technique for the computation of optical flow and egomotion” by Srinivasan.
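For purposes of illustration, the following function is a minimal sketch of one possible two-dimensional computation in the spirit of the image interpolation approach; it is consistent with the one-dimensional calculation presented later in this document, but should be taken as an illustration rather than as the exact “ii2” listing:
function [ofx,ofy]=ii2(X1,X2,delta)
% Estimate a global (ofx,ofy) shift between images X1 and X2 by
% interpolating between copies of X1 shifted by +/- delta pixels.
ndxr=(1+delta):(size(X1,1)-delta); % interior rows
ndxc=(1+delta):(size(X1,2)-delta); % interior columns
f0=X1(ndxr,ndxc); % reference image, interior
fz=X2(ndxr,ndxc); % new image, interior
f1=X1(ndxr,ndxc-delta); % reference shifted in -x
f2=X1(ndxr,ndxc+delta); % reference shifted in +x
f3=X1(ndxr-delta,ndxc); % reference shifted in -y
f4=X1(ndxr+delta,ndxc); % reference shifted in +y
A=[sum(sum((f2-f1).^2)) sum(sum((f2-f1).*(f4-f3))); sum(sum((f2-f1).*(f4-f3))) sum(sum((f4-f3).^2))];
b=[sum(sum((fz-f0).*(f2-f1))); sum(sum((fz-f0).*(f4-f3)))];
s=A\b; % least squares solution
ofx=-2*delta*s(1); % horizontal optical flow, pixels per frame
ofy=-2*delta*s(2); % vertical optical flow, pixels per frame
end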
It will be understood that other optical flow algorithms may be used, including but not limited to variations of the aforementioned algorithm by Lucas and Kanade. The MATLAB function LK2 below shows an implementation of the Lucas and Kanade optical flow algorithm in two dimensions:
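A minimal sketch of such a function follows; it solves the brightness constancy equations in a least squares sense over the whole image and should be taken as an illustration rather than as the exact LK2 listing:
function [ofx,ofy]=LK2(X1,X2)
% Whole-image Lucas and Kanade style shift estimate between X1 and X2.
Ix=0.5*(X1(2:end-1,3:end)-X1(2:end-1,1:end-2)); % horizontal spatial gradient
Iy=0.5*(X1(3:end,2:end-1)-X1(1:end-2,2:end-1)); % vertical spatial gradient
It=X2(2:end-1,2:end-1)-X1(2:end-1,2:end-1); % temporal difference
A=[sum(Ix(:).^2) sum(Ix(:).*Iy(:)); sum(Ix(:).*Iy(:)) sum(Iy(:).^2)];
b=-[sum(Ix(:).*It(:)); sum(Iy(:).*It(:))];
v=A\b; % least squares solution of Ix*ofx+Iy*ofy+It=0
ofx=v(1); % horizontal optical flow, pixels per frame
ofy=v(2); % vertical optical flow, pixels per frame
end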
The above algorithm 1200 functions as follows: At the end of Step 5 (1205), the matrices X1 and X2 will contain two time domain high-passed images based on two sequential frames of X acquired from the image sensor. Note that it will take several cycles of the algorithm to occur before a good optical flow measurement is obtained. This is because it will take at least several cycles for X1 and X2 to represent valid sequential frames, and also because it will take time for the matrix XLP to adapt towards the input environment and the fixed pattern noise in the image sensor. Since the matrices X1 and X2 are time domain high-passed versions of sequential values of X, and since fixed pattern noise is essentially constant (e.g. a “DC term”), the fixed pattern noise is filtered out and thus substantially removed in X1 and X2.
However note that in algorithm 1200 we do not compute optical flow directly from X1 and X2. This is because if the actual visual motion stops, then X will be substantially constant. Then XLP will adapt to become equal to the input image X. This will cause X1 and X2 to be matrices with values near zero. Electrical noise in the image sensor or the entire system may dominate the values X1 and X2, and may cause erroneous large optical flow measurements. We mitigate this problem by adding back a small but controlled fixed pattern FP to X1 and X2, to generate X1F and X2F, and compute optical flow on these latter values. Thus when the scene is unchanging, X1F and X2F will be dominated by the same pattern FP, and thus the computed optical flow will be near zero. The use of FP in this manner may be considered a practical modification to limit the computed optical flow when the actual visual motion stops.
Note that the value “delta=1” is appropriate for when the optical flow is generally less than one pixel per frame. For larger optical flow, a larger value of “delta” may be appropriate. In this case, it may be beneficial to smooth images X1F and X2F (either in hardware using the techniques above or in software) before calling the function “ii2”. Also note that the parameter “alpha” adjusts the cutoff frequency of the high pass filter being applied to pixel values, with a higher value of “alpha” corresponding to a higher cutoff frequency. Thus if “alpha” is set too high for a particular environment, the algorithm may have trouble sensing motion. Finally note that other optical flow algorithms than that depicted in the above MATLAB function “ii2” may be used.
If we know for a fact that the above algorithm will be used in an application in which the sensor will always be in motion, it may be possible to remove Step 1206 and compute optical flow directly from X1 and X2.
Note that a system incorporating any image sensor, in particular the prior art image sensor of
Refer to
W(m,n)=L(m,n)×E(m,n),
where matrices L and E are element-wise multiplied. However when the light pattern W is presented to an image sensor with logarithmic response pixels, the acquired image X will be based on the logarithm of W, in the following equation (using element-wise matrix arithmetic):
X=log(W)=log(L)+log(E)
It will be understood that the value log(E) is a DC term. Hence it is, from the perspective of the above algorithm, mathematically similar to any fixed pattern noise already inherent in the image sensor. Thus the above algorithm 1200 of
Let us discuss another application of an optical flow sensor implemented using algorithm 1200. Refer to
An optical flow sensor used in this configuration may be used to measure slip between the wheel 1325 and the road 1327. Knowledge of wheel slip or tire slip is useful since it can indicate that the car 1323 is undergoing rigorous motion or potentially spinning out of control. If the car 1323 is a high performance sports car or race car, then knowledge of wheel slip or tire slip may be used to help detect the car's physical state and assist with any vehicle stability mechanisms to help the car's driver better control the car 1323. Wheel slip may be measured as follows: First compute the two dimensional optical flow as seen by the optical flow sensor 1321 in pixels per frame. Then compute the optical flow in radians per second by multiplying the pixels per frame optical flow measurement by the frame rate of the optical flow sensor in frames per second, and multiplying the result by the pixel pitch in radians per pixel. It will be understood that the pixel pitch in radians per pixel may be obtained by dividing the pitch between pixels on the image sensor by the focal length of the vision sensor's optics. Then multiply the two dimensional optical flow measurement in radians per second by the height 1333 to obtain a two dimensional measurement of the actual ground velocity 1329. Then, measure the angular rate of the wheel 1325, and multiply the angular rate by the radius of the wheel 1325. This will produce a wheel speed measurement. Produce a wheel velocity measurement by forming a vector according to the orientation of the wheel, which may generally be perpendicular to the wheel's axle, and whose magnitude is the wheel speed measurement. Wheel slip or tire slip is then the difference between the actual ground velocity measurement and the wheel velocity measurement.
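For purposes of illustration, the following MATLAB fragment is a minimal sketch of this calculation; all of the variable names and numerical values are assumptions introduced only for this sketch:
ofPixelsPerFrame=[0.8; 0.1]; % assumed two dimensional optical flow, pixels per frame
frameRate=200; % assumed sensor frame rate, frames per second
pixelPitchRad=0.01; % assumed pixel pitch divided by focal length, radians per pixel
height=0.3; % assumed height 1333 of the sensor above the road, meters
ofRadPerSec=ofPixelsPerFrame*frameRate*pixelPitchRad; % optical flow in radians per second
groundVelocity=ofRadPerSec*height; % measured ground velocity, meters per second
wheelRate=100; % assumed wheel angular rate, radians per second
wheelRadius=0.3; % assumed wheel radius, meters
wheelDirection=[1; 0]; % assumed unit vector along the wheel's rolling direction
wheelVelocity=wheelRate*wheelRadius*wheelDirection; % wheel velocity, meters per second
slip=groundVelocity-wheelVelocity; % wheel or tire slip, meters per second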
In an outdoor environment, the presence of the sun 1335 may affect the accuracy of the optical flow measurement seen by the optical flow sensor 1321. This is because at certain angles, the sun 1335 may cast a shadow on the road 1327. If the border 1337 of the shadow rests partially within the field of view 1331 of the optical flow sensor 1321, the shadow may corrupt the optical flow measurement, in particular if the contrast of the shadow is stronger than any texture in the road 1327. As the car 1323 drives through a curve, the shadow's boundary 1337 itself may move, further adding erroneous components to the optical flow measurement. It is therefore desirable to remove the effects of the shadow on the optical flow measurement. One characteristic that may be exploited, however, is that for many cars, at driving speeds sufficient to cause slip between the wheel 1325 and the road 1327, the optical flow viewed by the sensor 1321 due to the road 1327 will be much faster than any optical flow due to the moving shadow edge 1337.
Suppose the optical flow sensor 1321 is implemented using the vision sensor 281 described above, using a logarithmic response image sensor and the exemplary algorithm 1200 shown in
Preliminary Information for Vision Based Flight Control
We will now discuss the control of an air vehicle using optical flow, which may be performed using the exemplary image sensors described above. First we provide background information that will be useful for the teachings that follow. It will be understood that an “air vehicle” may refer to any vehicle capable of flying, including but not limited to a helicopter, a fixed-wing air vehicle, a samara-type air vehicle, a helicopter with coaxial and contra-rotating rotors, and a quad-rotor helicopter, or any other type of air vehicle. The teachings below will be described in the context of a small rotary-wing air vehicle flying in air. It will also be understood that the following teachings may be applied to vehicles capable of moving through other mediums, including but not limited to underwater vehicles and space-borne vehicles.
An air vehicle may undergo six basic motions, three associated with linear translation and three associated with rotation.
It will be understood that the directions of the optic flow vectors shown in
It will also be understood that if the air vehicle 1403 were undergoing a motion that is a weighted sum of the six basic motions depicted in
First Exemplary Method for Vision Based Hover in Place
In the first exemplary method for vision based hover in place, the optical flow or visual motion in the yaw plane may be measured by a ring of sensors positioned around the vehicle to see in all directions. The optical flow values are used to compute visual displacement values, which may be the integrals of the optical flow values over time. In one variation, visual displacements are computed directly.
Two servos (not shown) control the pose of the swash plate 1651 via two control arms 1653 and 1655. In the exemplary air vehicle 1631, the servos may be mounted on the rear side of a controller board 1657 mounted towards the front side of the air vehicle 1631, and are thus not visible in
A number of passive stability mechanisms may exist on air vehicle 1631 that may simplify its control. A stabilizer bar 1659 on the upper rotor 1649 may implement a passive feedback mechanism that dampens roll and pitch rates. Also when flying, both rotors will tend to cone in a manner that exhibits a passive pose stability mechanism that tends to keep the air vehicle 1631 horizontal. Finally a tail fin 1661 may dampen yaw rates through friction with air. These passive stability mechanisms may be augmented by a single yaw rate gyro (not shown), which may be mounted on the controller board 1657. The yaw rate measurement acquired by the yaw rate gyro may be used to help stabilize the air vehicle's yaw angle using a PID (proportional-integral-derivative) control rule to apply a differential signal to the motors 1641 and 1643 as described above.
As a result of these passive stability mechanisms, such helicopters tend to be stable in flight and will generally remain upright when the swashplate servos are provided with a neutral signal. These helicopters may be controlled in calm environments without having to actively monitor and control roll and pitch rates. Therefore, the teachings that follow will emphasize control of just the yaw rate, heave rate, and the swash plate servo signals. The term “heave signal” will refer to a common mode applied to the rotor motors 1641 and 1643 causing the air vehicle 1631 to ascend or descend as described above. The term “yaw signal” will refer to a differential mode applied to the rotor motors 1641 and 1643 causing the air vehicle 1631 to undergo yaw rotation. The term “roll servo signal” will refer to a signal applied to the servo that manipulates the swashplate 1651 in a manner causing the helicopter to undergo roll rotation, e.g. rotate about the X axis 1405, and therefore move in the Y direction 1407. The term “pitch servo signal” will refer to a signal applied to the servo that manipulates the swashplate 1651 in a manner causing the air vehicle 1631 to undergo pitch rotation, e.g. rotate about the Y axis 1407, and therefore move in the X direction 1405.
Also shown in
In the first exemplary method, the sensors on the sensor ring are capable of measuring visual motion displacements, or equivalently “visual displacements”. A visual displacement is similar to optical flow in that both are a measure of visual motion. The difference is that optical flow represents an instantaneous visual velocity, whereas a visual displacement may represent a total visual distance traveled. Visual displacement may thus be considered to be an integral of optical flow over time. For example, optical flow may be measured in degrees per second, radians per second, or pixels per second, whereas visual displacement may be respectively measured in total degrees, radians, or pixels traveled. Methods of measuring visual displacement using optical flow and other algorithms will be discussed below.
The vision processor 1721 operates the image sensors 1611 through 1618 to output analog signals corresponding to pixel intensities, and uses an analog to digital converter (not shown) to digitize the pixel signals. It will be understood that the vision processor 1721 has access to any required fixed pattern noise calibration masks, in general at least one for each image sensor, and that the processor 1721 applies the fixed pattern noise calibration mask as needed when reading image information from the image sensors. It will also be understood that when acquiring an image, the vision processor accounts for the flipping of the image on an image sensor due to the optics, e.g. the upper left pixel of an image sensor may map to the lower right area of the image sensor's field of view. The vision processor 1721 then computes, for each image sensor, the visual displacement as seen by the image sensor.
The vision processor 1721 then outputs one or more motion values to a control processor 1725. The control processor 1725 may be located on the control board 1657 of
The second step 1813 is to grab initial image information from the visual scene. For example, this may include storing the initial images acquired by the image sensors 1611 through 1618. This may also include storing initial visual position information based on these images.
The third step 1815 is to initialize the position estimate. Nominally, this initial position estimate may be “zero” to reflect that it is desired for the air vehicle to remain at this location.
The fourth through seventh steps 1817, 1819, 1821, and 1823 are the recurring steps in the algorithm. One iteration of the fourth, fifth, and sixth steps may be referred to as a “frame”, and one iteration of the seventh step may be referred to as a “control cycle”. The fourth step 1817 is to grab current image information from the visual scene. This may be performed in a similar manner as the second step 1813 above.
The fifth step 1819 is to compute image displacements based on the image information acquired in the fourth step 1817. In this step, for each sensor of the eight sensors 1611 through 1618 the visual displacement between the initial visual position and the current visual position is computed.
The sixth step 1821 is to compute the aforementioned motion values based on the image displacements computed in the fifth step 1819.
The seventh step 1823 is to use the computed motion values to control the air vehicle 1631. For some implementations, the resulting control signals may be applied to the air vehicle 1631 once every frame, such that the frame rate and the control update rate are equal. For other implementations, a separate processor or even a separate processor thread may be controlling the air vehicle 1631 at a different update rate. In this case the seventh step 1823 may comprise just sending the computed motion values to the appropriate processor or processor control thread.
The above seven steps will now be described in greater detail. In the first exemplary method 1801, the first step is to perform a general initialization. The helicopter and all electronics are turned on if they are not on already.
In the exemplary embodiment, the second step 1813 is to grab initial image information from the eight image sensors 1611 through 1618. For each sensor i, a horizontal image Hio and a vertical image Vio are grabbed in the following manner: Let Ji(j,k) denote the pixel (j,k) of the 64×64 raw image located on image sensor i. The indices j and k indicate respectively the row and the column of the pixel. This 64×64 image Ji may then be converted into a 32 element linear image of superpixels using binning or averaging. Such 32 element linear images of superpixels may be acquired using the aforementioned techniques described with the three exemplary image sensors. The first superpixel of Hio may be set equal to the average of all pixels in the first two columns of Ji, the second superpixel of Hio may be set equal to the average of all pixels in the third and fourth columns of Ji, and so forth, until all 32 elements of Hio are obtained from the columns of Ji. Entire columns of the raw 64×64 image Ji may be binned or averaged together by setting V1 611 through V8 618 all to digital high. Vertical image Vio may be constructed in a similar manner: The first superpixel of Vio may be set equal to the average of the first two rows of Ji, the second superpixel to the average of the third and fourth rows, and so forth. Entire rows of the raw 64×64 image may be binned or averaged together by setting H1 601 through H8 608 all to digital high. The images Hio and Vio therefore will respectively have a resolution of 32×1 and 1×32. These images may be referred to as “reference images”.
It is possible to generate the horizontal and vertical images Hio and Vio in other ways. For example, it is possible to acquire all 4096 pixels from the raw 64×64 image Ji, and then arithmetically calculate the horizontal and vertical images by averaging rows and columns as appropriate. Another possibility is to use any image sensor having a binning capability. Aside from the above three exemplary image sensors, other example image sensors that have binning capability are described in the following list of U.S. Patents, the contents of which are incorporated by reference: U.S. Pat. No. 5,949,483 by Fossum, Kemeny, and Pain; U.S. Pat. No. 7,408,572 by Baxster, Etienne-Cummings, Massie, and Curzan; U.S. Pat. No. 5,471,515 by Fossum, Mendis, and Kemeny; and U.S. Pat. No. 5,262,871 by Wilder and Kosonocky.
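For purposes of illustration, the following MATLAB fragment is a minimal sketch of the arithmetic alternative just described, assuming the raw 64×64 image from sensor i has already been acquired into a matrix Ji; the variable names are assumptions for this sketch:
Hi0=zeros(32,1); % horizontal image of superpixels
Vi0=zeros(1,32); % vertical image of superpixels
for k=1:32
    Hi0(k)=mean(mean(Ji(:,2*k-1:2*k))); % average each pair of columns
    Vi0(k)=mean(mean(Ji(2*k-1:2*k,:))); % average each pair of rows
end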
The third step 1815 is to initialize the position estimate. In the exemplary method, this may be performed for each sensor i by setting accumulated displacement signals uio=0 and vio=0.
The fourth step 1817 is to grab current image information from the visual scene. For each image sensor i, grab current horizontal image Hi and current vertical image Vi using the same techniques as in the second step 1813.
The fifth step 1819 is to compute image displacements based on the images Hi, Vi, Hio, and Vio. Refer to
fm=length(Hio);
ndxs=2:fm-1;
f0=Hio(ndxs);
fz=Hi(ndxs);
f1=Hio(ndxs-1);
f2=Hio(ndxs+1);
top=sum((fz-f0).*(f2-f1));
bottom=sum((f2-f1).^2);
ui=-2*top/bottom;
Alternatively, ui may be computed using a one-dimensional version of the aforementioned optical flow algorithm by Lucas and Kanade, as described below. The variable vi may be computed from Vi and Vio in a similar manner. It will be understood that although the above calculations are described in the MATLAB programming language, they can be rewritten in any other appropriate programming language. It will also be understood that both sets of calculations are capable of obtaining a displacement to within a fraction of a pixel of accuracy, including displacements substantially less than one pixel. It is beneficial for the method 1801 to be performed at an adequately fast rate so that the typical displacements measured by the above MATLAB script (for computing ui from Hio and Hi) are less than one pixel. The selection of the frame rate may therefore depend on the dynamics of the specific air vehicle.
For computing ui with the Lucas and Kanade approach, the following MATLAB function LK1 may be used, with input array “X1” being set to Hio, input array “X2” being set to Hi, and ui being set to the resulting value “Shift”:
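The following is a minimal sketch of such a one-dimensional Lucas and Kanade style function; it is offered as an illustration rather than as the exact LK1 listing:
function Shift=LK1(X1,X2)
% One dimensional displacement estimate between linear images X1 (reference) and X2 (current).
Ix=0.5*(X1(3:end)-X1(1:end-2)); % spatial gradient of the reference image
It=X2(2:end-1)-X1(2:end-1); % temporal difference between the two images
Shift=-sum(Ix.*It)/sum(Ix.^2); % least squares displacement, in pixels
end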
The second part 1863 of the three part process 1851 is to update Hio, uio, ui, Vio, vio, and vi if necessary. In the exemplary embodiment, this is performed if the magnitude of ui or vi is greater than a predetermined threshold θ. It is beneficial for the value of θ to be less than one pixel, for example about a quarter to three quarters of a pixel. More specifically:
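For purposes of illustration, the following MATLAB fragment is a minimal sketch of one possible realization of this update for the horizontal quantities; the specific bookkeeping shown, in which the measured displacement is folded into the accumulated value and a fresh reference image is grabbed, is an assumption consistent with the total displacement calculation in the third part below:
theta=0.5; % assumed threshold, a fraction of a pixel
if abs(ui)>theta
    uio=uio+ui; % fold the measured displacement into the accumulated total
    Hio=Hi; % the current image becomes the new reference image
    ui=0; % displacement is now measured relative to the new reference
end
% Vio, vio, and vi may be updated in the same manner when abs(vi) exceeds theta.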
The third part 1865 of the three part process 1851 is to compute the resulting total displacements. Let uid and vid be the respective total displacements in the u 1417 and v 1419 directions. These values may be set as:
uid=uio+ui
vid=vio+vi
The sixth step 1821 is to compute the motion values based on the image displacements computed in the previous step 1819. In the exemplary embodiment, these may be computed based on the displacement values uid and vid. A total of six motion values may be computed based on the optical flow patterns shown above in
It will be understood that each of these motion values is effectively an inner product between the visual displacements uid and vid and the respective optical flow pattern from one of
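For illustration, if the eight sensors lie in the yaw plane at azimuth angles ψi and each sensor's u axis is taken in the same tangential sense around the ring, one simple first-harmonic choice of patterns leads to the MATLAB-style sketch below. The names psi, uid, and vid, and the assignment of the cosine and sine weightings to particular motion values, are assumptions; the actual patterns are those shown in the referenced figures.
% psi(i): azimuth viewing angle of sensor i (radians); uid(i), vid(i): its
% total horizontal and vertical displacements. Illustrative weightings only.
a0 = sum(uid);               % uniform horizontal flow: yaw rotation
a1 = sum(uid .* cos(psi));   % first harmonic: sideways (Y) drift
b1 = sum(uid .* sin(psi));   % first harmonic: forward-backward (X) drift
c0 = sum(vid);               % uniform vertical flow: heave
c1 = sum(vid .* cos(psi));   % first harmonic: pitch rotation
d1 = sum(vid .* sin(psi));   % first harmonic: roll rotation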
These six motion values can serve as a measure of how far the air vehicle 1403 has drifted from the original location. The a0 motion value is a measure of the yaw rotation, e.g. rotation about the Z axis 1409. The a1 motion value is a measure of horizontal drift in the sideways direction, e.g. drift parallel to the Y axis 1407. The b1 motion value is a measure of horizontal drift in the forward-backward direction, e.g. drift parallel to the X axis 1405. The c0 motion value is a measure of drift in the heave direction, e.g. drift parallel to the Z axis 1409. The c1 motion value is a measure of pitch rotation, e.g. rotation about the Y axis 1407. Finally the d1 motion value is a measure of roll rotation, e.g. rotation about the X axis 1405. It will be understood that the three motion values associated with translation, e.g. a1, b1, and c0 express a distance traveled that is relative to the size of the environment, and not necessarily an absolute distance traveled. Suppose the air vehicle 1631 is in the center of a four meter diameter room, and drifts upwards by 1 meter, and as a result c0 increases by a value “1”. If the same air vehicle 1631 were placed in the center of a two meter diameter room, and drifted upwards by 1 meter, c0 may increase by a value of “2”. It will be understood that the relative nature of the motion values a1, b1, and c0 will generally not be an issue for providing hover in place because when an air vehicle is hovering in place, it is generally staying within the same environment. Similarly, if the air vehicle's position is perturbed, the result will be a change of the motion values from zero (or their original values). Controlling the air vehicle to bring the motion values back to zero (or their original values) may then bring the air vehicle back to its original location to recover from the position perturbation.
Obtaining motion values in the manner shown above has several advantages. First, monitoring a wider visual field increases the chance of finding visual texture that can be used to provide a motion estimate. If the imagery seen by one or several of the sensors is poor or lacking in contrast, the above method may still be able to provide a useful measurement. Second, if the individual visual displacement measurements from individual sensors are corrupted by noise that is independent for each sensor, then the effect of the noise on the resulting motion values is reduced, in accordance with the central limit theorem.
It is beneficial for the sensors 1611 through 1618 to be arranged to cover a wide field of view as shown in
It will be understood that although it is beneficial for the sensor ring 1663 shown in
If the air vehicle being used is passively stable in the pitch and roll directions, then it will not be necessary to measure pitch and roll displacements. In this case, the step of computing motion values c1 and d1 may be omitted. Similarly, if the air vehicle has a yaw rate gyro, then the yaw angle may be controlled additionally or instead by the measured yaw rate.
The seventh step 1823 is to use the motion values to control the air vehicle. In the exemplary embodiment, this may be performed using a proportional-integral-derivative (PID) control rule. PID control is a well-known algorithm in the art of control theory. The yaw angle of the air vehicle may be controlled by using a PID control rule to try to enforce a0=0, by applying the control signal to the rotor motors in a differential manner as described above. The heave of the air vehicle may be controlled by using a PID control rule to try to enforce c0=0, by applying the control signal to the rotor motors in a common mode manner as described above. The drift parallel to the X direction 1405 may be controlled by using a PID control rule to try to enforce b1=0, by applying the control signal to the swashplate servo that adjusts pitch angle. Finally the drift in the Y direction may be controlled by using a PID control rule to try to enforce a1=0, by applying the control signal to the swashplate servo that adjusts roll angle. Keeping these motion values generally constant has the effect of keeping the air vehicle in one location since if all the motion values are constant, then generally the measured sensor displacements are being kept constant, and therefore the air vehicle is generally not moving. Similarly keeping these motion values all near zero has the effect of keeping the air vehicle in its original location at Step 1813.
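As an illustration of the control rule, the following MATLAB-style function sketches one iteration of a discrete PID update; the gains, the time step dt, and the manner in which the output is mixed to the rotor motors or swashplate servos are assumptions that would be tuned for a specific vehicle.
function [u, state] = pidStep(e, state, Kp, Ki, Kd, dt)
% One iteration of a discrete PID control rule (illustrative sketch).
% e is the motion-value error (e.g. a0 - 0); state.integral and state.prev
% carry the integral term and the previous error between calls.
state.integral = state.integral + e * dt;
derivative     = (e - state.prev) / dt;
state.prev     = e;
u = Kp * e + Ki * state.integral + Kd * derivative;
end
Under this sketch, the output computed for the a0 error would be applied differentially to the rotor motors, the c0 output in a common mode manner, and the b1 and a1 outputs to the respective swashplate servos, as described above.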
For purposes of definition, saying that a motion value is kept "substantially constant" means that the associated state value is allowed to vary within a limited range when no external perturbations (e.g. wind gusts) are applied. For example, if it is said that the yaw angle is kept substantially constant, then the actual yaw angle of the air vehicle may vary within a range of ±θ, where θ is a reasonable threshold for the application, which may be one degree, ten degrees, or another appropriate value. Likewise, if it is said that the position of an air vehicle is kept substantially constant, then the air vehicle may move around within a sphere centered at its original location, with a radius that is reasonable for the given application and environment. The allowable size of the sphere may be increased for larger environments.
It will be understood that the use of the aforementioned image interpolation algorithm is for illustrative purposes and that a variety of other optical flow or visual motion algorithms, as well as a variety of other array sizes, may be used. For example, rather than binning the raw 64×64 array down into 32×1 or 1×32 images, it is possible to form 64×8 and 8×64 images using respectively 1×8 and 8×1 superpixels, and then obtain eight one-dimensional optical flow measurements from each of these images, to produce a total of 64 optical flow measurements in each direction over the eight vision sensors. Similarly, the image from each sensor may be divided into a two-dimensional array, for example by downsampling the 64×64 raw pixel array into 8×8 arrays. Then a direct two-dimensional algorithm, for example the aforementioned algorithm "ii2", may be used to directly compute both horizontal and vertical displacements ui and vi. It will be understood that yet other variations are possible.
Second Exemplary Method for Vision Based Hover in Place
A number of variations may be made to the first exemplary method 1801 by using different methods of computing visual displacements. An example is the second exemplary method for vision based hover in place, which shall be described next. The second exemplary method may require a faster processor and faster analog to digital converter (ADC) than the first exemplary method, but does not require the use of an image sensor with binning capabilities. The second exemplary method uses the same steps shown in
The first step 1811 is unchanged. In the second step 1813, for each sensor of the eight sensors 1611 through 1618, all pixels of the 64×64 image are digitized and acquired. Let Ri denote the 64×64 matrix that corresponds to the raw 64×64 image of sensor i. A patch of pixels Wi is then selected near the middle of the image Ri. The patch may be an 11×11, 13×13, or other similar block size subset of the raw pixel array Ri. It will be understood that non-square patch sizes, e.g. 11×13 or other, may be used. Let the variable ws denote the size of the block in one dimension, so that the size of patch Wi is ws×ws. The patch of pixels Wi may be chosen using a saliency algorithm or a corner detection algorithm, so that it is easy to detect if the block moves horizontally or vertically in subsequent frames. The implementation of saliency or corner detection algorithms is a well-known art in image processing. This block is stored in the matrix Wi. Let the values mi0 and ni0 respectively store the vertical and horizontal location of the block, for example by setting them equal to the row and column of the upper left pixel of block Wi in the raw 64×64 image Ri. Additionally, set mi=mi0 and ni=ni0.
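One simple way to choose such a patch is sketched below in MATLAB; it scores candidate windows near the image center by their gradient energy as a stand-in for a saliency or corner test, and the center-search radius r is an assumed parameter (ws is assumed to be defined as above).
[gx, gy] = gradient(double(Ri));          % image gradients
score = gx.^2 + gy.^2;                    % per-pixel texture measure
r = 12; c = 32 - floor(ws/2);             % assumed search region near the center
best = -Inf;
for m = c-r : c+r
    for n = c-r : c+r
        s = sum(sum(score(m:m+ws-1, n:n+ws-1)));
        if s > best
            best = s; mi0 = m; ni0 = n;   % upper left corner of the best patch
        end
    end
end
Wi = Ri(mi0:mi0+ws-1, ni0:ni0+ws-1);      % store the selected block
mi = mi0; ni = ni0;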
The third step 1815 is unchanged. In the fourth step 1817, the 64×64 matrices Ri corresponding to each sensor i are again acquired.
The fifth step 1819 is to compute image displacements based on the current matrices Ri. This step may be performed in three parts that are similar to the three parts 1851 discussed in the first exemplary method. In the first part 1861, a block tracking algorithm is used to determine to where block Wi has moved in the current image Ri. This may be performed by searching around the previous location defined by mi and ni for the ws×ws window that best matches Wi, using a sum of squares of differences (SSD) match metric, a minimum absolute difference (MAD) metric, a variation of differences (VOD) metric, correlation metrics, or other metrics; a sketch of such a search is given below. The implementation of block matching algorithms for sensing visual motion is a well-known and established art in image processing. Set mi and ni to the new best match locations. The values ui and vi may be computed as follows:
ui=ni−ni0 and vi=−(mi−mi0).
The negative sign in the equation for vi is due to the convention that the top-most row of pixels of a matrix is given the lowest row index number.
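A minimal MATLAB sketch of such a block search is shown below; the ±s pixel search radius, the use of an SSD metric, and the temporary names cand, ssd, bm, and bn are illustrative assumptions.
s = 2; best = Inf;                         % assumed search radius (pixels)
for dm = -s:s
    for dn = -s:s
        m = mi + dm; n = ni + dn;
        if m >= 1 && n >= 1 && m+ws-1 <= 64 && n+ws-1 <= 64
            cand = double(Ri(m:m+ws-1, n:n+ws-1));
            ssd  = sum(sum((cand - double(Wi)).^2));  % SSD match metric
            if ssd < best
                best = ssd; bm = m; bn = n;
            end
        end
    end
end
mi = bm; ni = bn;                          % new best-match location
ui = ni - ni0;
vi = -(mi - mi0);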
The second part 1863 of the three part process 1851 is to update Wi, mi0, ni0, ui0, and vi0 as needed. More specifically:
The purpose of the above set of steps is to handle the situation that occurs if the window Wi is about to move off the image Ri. In this case, the accumulated displacements ui0 and vi0 are updated and a new window Wi is grabbed using the same techniques as above in the second step 1813. The threshold θ may be a value such as one, two, three, or another number of pixels depending on parameters such as the air vehicle's speed, the frame rate at which the system operates, or the scale of the environment in which the air vehicle is operating. It is beneficial for θ to be greater than the search radius used to search for the motion of the block Wi from frame to frame.
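For illustration, one plausible form of this update, treating θ (theta) as the minimum allowed distance between the block and the image border, which is an assumed interpretation, is:
if mi <= theta || ni <= theta || mi+ws-1 > 64-theta || ni+ws-1 > 64-theta
    ui0 = ui0 + (ni - ni0);               % fold the accumulated shift into ui0
    vi0 = vi0 - (mi - mi0);               % note the sign convention for vi0
    % ...then select a new salient block Wi near the image center and reset
    % mi0, ni0, mi, and ni as in the second step 1813.
end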
The third part 1865 of the three part process 1851 may be performed in the same manner as above. The sixth step 1821 and seventh step 1823 may be performed in the same manner as in the above exemplary algorithm; however, the control constants for the PID control rules may need to be modified. For the viewing pose angle γi associated with each block, one may use just the pose angle of the respective sensor i, or one may use a pose angle constructed from both the pose angle of sensor i and the (mi,ni) location of the block in image Ri.
For a graphical depiction of block matching, refer to
The above variation will be particularly useful if the air vehicle is operating in a controlled environment in which visual features are deposited on the walls to facilitate the selection of a salient block to track. Sample visual features may include dark or bright patterns on the walls, or may include bright lights. If bright lights are used, it may be possible to eliminate the steps of extracting blocks Wi from the raw images Ri and instead look for bright pixels which correspond to the lights. This variation is discussed below as the fourth exemplary method.
A variation of the second exemplary method may be implemented by tracking more than one patch of pixels in each image sensor. This variation may be appropriate if the environment surrounding the air vehicle is textured enough that such multiple patches per image sensor may be acquired. In this case the motion values may be computed using all of the pixel patches being tracked.
Third Exemplary Method for Vision Based Hover in Place
The third exemplary method for vision based hover in place is essentially identical to the first exemplary method 1801 with one change: The fifth step 1819 may be modified so that the optical flows ui and vi obtained every frame are directly integrated in order to obtain ui0 and vi0. More specifically, the fifth step 1819 may then be implemented as follows:
ui0=ui0+ui;
ui=0;
Hi0=Hi;
vi0=vi0+vi;
vi=0;
Vi0=Vi;
It will be understood that this variation is mathematically equivalent to the first exemplary method when θ=0, e.g. the variables Hi0, ui0, Vi0, and vi0 are always updated every frame. This variation will achieve the intended result of providing hover in place; however, in some applications it may have the disadvantage of allowing noise in the individual optical flow measurements to accumulate and manifest as a slow drift or random walk in the air vehicle's position.
Variations to the First Three Exemplary Methods
A number of other variations of the three exemplary methods of vision based hover in place may be made. For example, if the air vehicle is not passively stable in the roll or pitch directions, or if the air vehicle experiences turbulence that disturbs the roll and pitch angles, the c1 and d1 motion values may be used to provide additional control input, mixed-in, to the appropriate swashplate servos.
Another variation is to mount the sensor ring along a different plane. The mounting position in the X-Y plane, e.g. the yaw plane, has already been discussed above. Another possible mounting position is in the X-Z plane, e.g. the pitch plane. In this mounting location, the a0 motion value indicates change in pitch, the a1 motion value indicates drift in the heave or Z direction, the b1 motion value indicates drift in the X direction, the c0 motion value indicates drift in the Y direction, the c1 motion value indicates change in yaw, and the d1 motion value indicates change in roll. Yet another possible mounting location is in the Y-Z plane, e.g. the roll plane. In order to increase the robustness of the system, it is possible to mount multiple sensor rings in two or all of these directions, and then combine or average the roll, pitch, yaw, X, Y, and Z drifts detected by the individual sensor rings.
It is possible to mount a sensor ring in positions other than one of the three planes discussed above. However, in this case it may be necessary to transform the measured values a0, a1, b1, c0, c1, and d1 to obtain the desired roll, pitch, yaw, X, Y, and Z drifts.
It will be understood that a number of other variations may be made to the above exemplary embodiments that would provide HIP capability. For example, clearly the array of sensors need not be literally placed in a physical ring such as shown in
Exemplary Method of Control Incorporating an External Control Source Including Human Based Control
The above three exemplary methods of providing vision based hover in place focus on methods to keep the air vehicle hovering substantially in one location. If the air vehicle is perturbed, due to randomness or external factors such as a small gust of air, the above methods may be used to recover from the perturbation. It is desirable for other applications to integrate external control sources including control sources from a human operator. The external control source may then provide general high-level control information to the air vehicle, and the air vehicle would then execute these high-level controls while still generally maintaining stability. For example, the external control source may guide the air vehicle to fly in a general direction, or rotate in place, or ascend or descend. Alternatively a human operator, through control sticks (e.g. 1731) may issue similar commands to the air vehicle. This next discussion will address how such higher level control signals from an external control source (whether human or electronic) may be integrated with the above exemplary methods of providing vision-based hover in place. For the purposes of illustration, we will assume that four external control signals are possible that correspond to the four motion types associated with the four motion values a0, a1, b1, and c0. In other words, if the external control input is from control sticks 1731, then the control sticks may respectively indicate yaw rotation, X motion, Y motion, and heave motion. Below we discuss two methods of incorporating an external control signal into hover in place, which may be applied to any of the aforementioned three methods of vision based hover in place.
One method of incorporating an external control signal is to add an offset to the computed motion values a0, a1, b1, and c0. For example, adding a positive offset to the value c0, and sending the sum of c0 and this offset to the PID control rule modifying heave, may give the heave PID control rule the impression the air vehicle is too high in the Z direction. The PID algorithm would respond by descending e.g. traveling in the negative Z direction. This is equivalent to changing a “set point” associated with the c0 motion value and thus the air vehicle's heave state. If a human were providing external control input via control sticks (e.g. 1731), then the offset value added to the c0 parameter may be increased or decreased every control cycle depending on the human input to the control stick associated with heave. The air vehicle may similarly be commanded to rotate in the yaw direction (about the Z axis) or drift in the X and/or Y directions by similarly adjusting respectively the a0, b1, and a1 motion values.
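For illustration, folding a heave stick input into the c0 set point might look like the following MATLAB-style fragment, where heaveOffset, stickHeave, the gain k, and c0cmd are assumed names:
heaveOffset = heaveOffset + k * stickHeave;   % integrate the stick deflection
c0cmd = c0 + heaveOffset;                     % value passed to the heave PID rule
% The a0, b1, and a1 motion values may be offset in the same way to command
% yaw rotation, X drift, and Y drift respectively.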
A second method of incorporating an external control signal is to modify Step 1823 to overwrite one or more of the motion values computed in Step 1821 with new values based on the external control signal. Suppose the external control signal were provided by a human operator via control sticks (e.g. 1731). When the control sticks are neutral, e.g. the human is not providing input, then the Step 1823 may operate as described above. When one of the control sticks is not neutral, then Step 1823 may be modified as follows: For all external control inputs that are still neutral, the corresponding motion values computed in Step 1821 may be untouched. However for all external control inputs that are not neutral, the corresponding motion value may be set directly to a value proportional to (or otherwise based on) the respective external control input. For example, if the human operator adjusts the sticks 1731 to indicate positive heave, e.g. ascending in the Z direction, then the c0 motion value may be overwritten with a value based on the external heave control signal, and the other three motion values a0, b1, and a1 may be left at their values computed in Step 1821. The algorithm may then perform Step 1823 using the resulting motion values. Finally, if at any time during the execution of algorithm 1801, it is detected that one of the external control signals is released back to neutral, for example if the human operator lets go of one of the control sticks, then the algorithm may reset by going back to Step 1813. This will cause the algorithm to initiate a hover in place in the air vehicle's new location.
Because of the dominating nature of yaw rate on many rotary wing air vehicles, it may be beneficial to further modify the second method of incorporating an external control signal so that if an external yaw signal is provided, for example a human is moving the control sticks 1731 to command a yaw turn, then all other motion values are zeroed out. This may prevent rapid yaw turns from causing erroneous motion values associated with the other types of motion (e.g. heave, X drift, and Y drift).
Fourth Exemplary Method for Vision Based Hover in Place
The fourth exemplary method for providing vision based hover in place to an air vehicle will now be discussed. The fourth exemplary method may be used in environments comprising an array of lights arranged around the environment. Such lights may be substantially point-sized lights formed by bright light emitting diodes (LEDs), incandescent lights, or other similar light sources. If the lights are the dominant sources of light in the environment, and when viewed by an image sensor appear substantially brighter than other texture in the environment, then it may be possible to compute image displacements by just tracking the location of the lights in the respective images of the image sensors. We will now discuss this variation in greater detail.
For purposes of discussion, let us assume the air vehicle is the same rotary wing platform as that discussed above (e.g. 1631), and that the air vehicle contains the same aforementioned exemplary vision based flight control system 1701. Refer to
The fourth exemplary method for vision based hover in place has the same steps as the second exemplary method, with the individual steps modified as follows: The first step 1811 is unchanged.
The second step 1813 is modified as follows: the vision processor 1721 acquires the same image Ri of 64×64 raw pixels from each sensor i. The image Ri is then negated, so that more positive values correspond to brighter pixels. This may be performed by subtracting each pixel value from the highest possible pixel value that may be output by the ADC. For example, if a 12-bit ADC is used, which has 4096 possible values, each pixel value of Ri may be replaced with 4095 (the highest possible value) minus the pixel value. For each image Ri, the vision processor 1721 identifies the pixels associated with bright lights in the environment. Refer to
A pixel point P 2211 may be identified as a pixel associated with a light if the following conditions are met:
2P>A+B+θ1;
2P>C+D+θ1;
P>max(A,B,C,D); and
P>θ2,
where A, B, C, D, and P denote the intensities of the respective pixel points 2213, 2215, 2217, 2219, and 2211. The first two conditions are a curvature test, and are a measure of how much brighter P 2211 is than its four neighbors. The third condition tests whether P 2211 is brighter than all of its four neighbors. The fourth condition tests whether P 2211 is brighter than a predetermined threshold. All pixel points in the pixel array 2201 are subjected to the same test to identify pixels that may be associated with lights in the environment. Thresholds θ1 and θ2 may be empirically chosen and may depend on the size of the lights (e.g. 2001, 2002, and so on) in the environment as well as how much brighter these lights are than the background.
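For illustration, the test may be applied to the interior of the negated image Ri as in the following MATLAB sketch; the association of A and B with the neighbors above and below P, and of C and D with the neighbors to its left and right, follows the later sub-pixel discussion but is otherwise an assumption, as are the names theta1 and theta2.
lights = [];                                  % [row column] of detected light pixels
for m = 2:63
    for n = 2:63
        P = Ri(m, n);
        A = Ri(m-1, n); B = Ri(m+1, n);       % neighbors above and below (assumed)
        C = Ri(m, n-1); D = Ri(m, n+1);       % neighbors left and right (assumed)
        if 2*P > A + B + theta1 && 2*P > C + D + theta1 && ...
                P > max([A B C D]) && P > theta2
            lights(end+1, :) = [m n];         % record this light pixel
        end
    end
end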
After all eight images have been processed, we will have a collection of L points of light (e.g. light pixels) obtained from these eight images. It will be understood that L does not need to equal eight, since the number of lights seen by each image sensor need not equal one. For each light pixel j (of the L light pixels), we compute the angles γj and φj based on the respective pixel location and the calibration parameters associated with the image sensor that acquired the pixel associated with light pixel j. Sample calibration parameters include the pose of each image sensor (e.g. roll, pitch, and yaw parameters with respect to the coordinate system 1401), the position of the optics over the image sensor, and any geometric distortions associated with the optics.
The third step 1815 is to initialize the position estimate. We set the values
γj0=γj and φj0=φj;
for j=1, . . . , L, where γj0 and φj0 represent the "initial image information" or equivalently the initial locations of the lights. It will be understood that if the image sensors (e.g. 1611 through 1618) have overlapping fields of view, it is possible for the same physical light to be detected in more than one image sensor and thus be represented twice or more in the list of L lights.
The fourth step 1817 is to grab current image information from the visual scene. Essentially this may be performed by repeating the computations of the second step 1813 to extract a new set of light pixels corresponding to bright lights in the environment and thus extract a new set of values γk and φk. Note that the number of points may have changed if the air vehicle has moved enough that one of the sensors detects more or fewer lights in its field of view.
The fifth step 1819 is to compute image displacements. Step 1819 may also be divided into three parts, described next: In the first part 1861, we re-order the current points γk and φk so that they match up with the respective reference points γj0 and φj0. Essentially, for each current point γk and φk we may find the closest reference point γj0 and φj0 (using a distance metric) and re-index the current point as γj and φj. Note that the current points and the reference points may be in a different order due to motion of the air vehicle. Also note that some lights may disappear and other lights may appear, also due to motion of the air vehicle. There may also be ambiguities due to two points crossing paths, also due to motion of the air vehicle. We define any point γj and φj and its matched reference point γj0 and φj0 as "unambiguous" if there is only one clear match between the two points, which additionally means that the point γj and φj is not a new point.
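A minimal sketch of such a matching step follows; the arrays gammaCur, phiCur, gammaRef, and phiRef, the Euclidean angular distance metric, and the matching radius dmax are all assumptions used only for illustration.
dmax  = 0.1;                                   % assumed matching radius (radians)
match = zeros(length(gammaCur), 1);            % index of matched reference point (0 = none)
for k = 1:length(gammaCur)
    d = hypot(gammaRef - gammaCur(k), phiRef - phiCur(k));
    [dmin, j] = min(d);
    if dmin < dmax
        match(k) = j;                          % tentative nearest-neighbor match
    end
end
for j = unique(match(match > 0))'              % discard ambiguous (shared) matches
    if sum(match == j) > 1
        match(match == j) = 0;
    end
end
Under this sketch, current points with match equal to zero correspond to new or ambiguous points, and reference points claimed by no current point correspond to lights that disappeared.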
The second part 1863 of step 1819 is to compute the actual image displacements. This may be performed by computing the following displacements for each unambiguous point γj and φj:
uj=γj−γj0 and vj=φj−φj0.
It will be understood that the number of unambiguous points γj and φj may be a number other than eight.
The third part 1865 of step 1819 is to update the list of reference points γj0 and φj0. Any such reference points that are matched up to an unambiguous point γj and φj may be left in the list of reference points. New points γj and φj that appeared in the current iteration of step 1817 may be added to the reference list. These correspond to points of light that appeared. Any points γj0 and φj0 that were not matched up may be removed from the reference list. These correspond to points of light that disappeared.
The sixth step 1821 and seventh step 1823 may be performed in the same manner as described above for the first exemplary method. The sixth step 1821 computes motion values from uj and vj, while the seventh step 1823 applies control to the air vehicle.
Note that the locations of the lights in images Ri, as detected in the second step 1813 and the fourth step 1817, have a precision that corresponds to one pixel. Modifications may be made to these steps to further refine the position estimates to a sub-pixel precision. Recall again the point P 2211 and its four neighbors in the pixel grid 2201. One refinement may be performed as follows: Let (m,n) denote the location of light point P 2211 in the pixel grid 2201, with m being the row estimate and n being the column estimate. If A>B, then use m−0.25 as the row estimate. If A<B, then use m+0.25 as the row estimate. If C>D, then use n−0.25 as the column estimate. If C<D, then use n+0.25 as the column estimate. These simple adjustments double the precision of the position estimate to one half a pixel.
A further increase in precision may be obtained by using interpolation techniques which shall be described next. Refer to
The sub-pixel precision estimate of the row location may thus be given by the value of h. The sub-pixel precision estimate of the column location may be similarly computed using the same equation, but substituting m with n, A with C, and B with D, where n is the column location of point P.
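For illustration, the standard three-point parabolic (Lagrange) peak interpolation, fitting a parabola through the values A, P, and B at rows m−1, m, and m+1, gives the estimate below; this is a sketch of the technique and may differ in detail from the construction shown in the referenced figure.
h = m + (A - B) / (2*(A + B - 2*P));   % sub-pixel row estimate (parabola vertex)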
It is also possible to obtain a sub-pixel precision estimate for the point P 2211 using a curve other than a Lagrange polynomial as used in
The sub-pixel precision estimate of the column location may be similarly computed using the same equations, but substituting m with n, A with C, and B with D, where n is the column location of point P.
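A corresponding sketch of the isosceles triangle interpolation for the row location, again offered as an illustration rather than the exact construction of the referenced figure, is:
if A > B
    h = m - (A - B) / (2*(P - B));     % apex lies toward the brighter neighbor A
else
    h = m + (B - A) / (2*(P - A));     % apex lies toward the brighter neighbor B
end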
The use of either Lagrange interpolation or isosceles triangle interpolation may produce a more precise measurement of the light pixel location than using the simple A>B test. Which of these two methods is more accurate will depend on specifics such as the quality of the optics and the size of the lights. It is suggested that Lagrange interpolation be used when the quality of the optics is poor or if the lights are large. It is suggested that isosceles triangle interpolation be used when the images produced by the optics are sharp and when the lights are small in size.
Hover in Place for Samara Type Air Vehicles
Another type of rotary-wing air vehicle is known as a samara air vehicle. Samara air vehicles have the characteristic that the whole body may rotate, rather than just the rotors. Effectively the rotor may be rigidly attached to the body as one rotating assembly. Examples of samara type air vehicles, and how they may be controlled and flown, may be found in the following papers, the contents of which shall be incorporated herein by reference: "From Falling to Flying: The Path to Powered Flight of a Robotic Samara Nano Air Vehicle" by Ulrich, Humbert, and Pines, in the journal Bioinspiration and Biomimetics, Vol. 5, No. 4, published in 2010; "Control Model for Robotic Samara: Dynamics about a Coordinated Helical Turn" by Ulrich, Faruque, Grauer, Pines, Humbert, and Hubbard, in the AIAA Journal of Aircraft, 2010; and "Pitch and Heave Control of Robotic Samara Air Vehicles", in the AIAA Journal of Aircraft, Vol. 47, No. 4, 2010.
Refer to
Since samara type air vehicles rotate, it is possible to obtain an omnidirectional image from a single vision sensor 2415 mounted on the air vehicle 2401. Refer to
In order to define a single image from the omnidirectional field of view 2501, it is possible to use a yaw angle trigger that indicates the air vehicle 2401 is at a certain yaw angle. This may be performed using a compass mounted on the air vehicle 2401 to detect its yaw angle and a circuit or processor that detects when the air vehicle 2401 is oriented at a predetermined angle, such as North. The two dimensional image swept out by the vision sensor 2415 between two such yaw angle trigger events may be treated as an omnidirectional image. Sequential omnidirectional images may then be divided up into subimages based on the estimated angle with respect to the triggering yaw angle, as sketched below. Visual displacements and then motion values may then be computed from the subimages.
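As an illustration, one omnidirectional image acquired between two trigger events may be divided into eight equal-angle subimages as in the following MATLAB sketch, where O is an assumed name for the pixels-by-samples omnidirectional image and a constant yaw rate is assumed within the revolution.
Ji = cell(1, 8);                       % the eight subimages
W = size(O, 2);                        % number of columns swept in one revolution
for i = 1:8
    c1 = round((i-1)*W/8) + 1;         % first column of subimage i
    c2 = round(i*W/8);                 % last column of subimage i
    Ji{i} = O(:, c1:c2);               % subimage spanning 45 degrees of yaw
end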
It is suggested that a modification of the aforementioned third exemplary method for providing hover in place be used to compute displacements and motion values and ultimately allow the air vehicle 2401 to hover in place. The modified third exemplary method may be described as follows, using
In Step 1813, initial image information is obtained from the visual scene. Refer to
The third step 1815 may be performed essentially the same as in the third exemplary method. The primary difference is that yaw angle is not a meaningful quantity to control since the samara air vehicle is constantly rotating and the yaw angle may already be determined by a compass. The fourth step 1817 may be performed in the same manner as the second step 1813 by grabbing a new and similar omnidirectional image and a new set of eight subimages Ji.
Steps 1819, 1821, and 1823 may then be performed in the same manner as in the third exemplary method. In step 1823 the air vehicle 2401 may be controlled using any of the techniques described in the aforementioned papers by Humbert or any other appropriate methods.
To handle situations in which the air vehicle 2401 is undergoing accelerations or decelerations in the yaw rate, a slight variation of the above techniques may be used. Refer to
Other variations of the algorithm may be considered. For example, it may be possible to apply sub-pixel warpings to the omnidirectional images (2601 or 2701 and 2703) to account for any angular acceleration or deceleration that occurs within one cycle. If an accurate and fast enough compass is used, the instantaneous yaw angle associated with each column (e.g. like 2605) may be used to compute similar sub-pixel shiftings that may be used to accordingly warp the subimages or adjust the measured image displacements. Finally other optical flow algorithms may be used, for example a block matching algorithm as described in the aforementioned second exemplary method of providing vision based hover in place.
While the inventions have been described with reference to certain illustrated embodiments, the words that have been used herein are words of description, rather than words of limitation. Changes may be made, within the purview of the appended claims, without departing from the scope and spirit of the invention in its aspects. Although the inventions have been described herein with reference to particular structures, acts, and materials, the invention is not to be limited to the particulars disclosed, but rather can be embodied in a wide variety of forms, some of which may be quite different from those of the disclosed embodiments, and extends to all equivalent structures, acts, and materials, such as are within the scope of the appended claims.
This application claims the benefit of U.S. Provisional Patent Applications No. 61/320,718, entitled "Vision Based Hover in Place" and filed Apr. 3, 2010; No. 61/361,610, entitled "Vision Based Hover in Place in Lighted Environment" and filed Jul. 6, 2010; and No. 61/441,204, entitled "Image Sensor" and filed Feb. 9, 2011.
This invention was made with Government support under Contract No. W31P4Q-06-C-0290 awarded by the United States Army and Contract No. FA8651-09-C-0178 awarded by the United States Air Force. The Government has certain rights in this invention.