Various embodiments relate to automatic object targeting for camera view control.
Camera view control systems apply camera orientation adjustment and object recognition technologies to cover different areas and to find a target object in the camera view. Many prior art schemes work only when an object to be tracked is already in the camera view and has been recognized. Before that, users have to manually control the camera view device to scan a local area in order to find and specify an object of interest. There is no available method to automatically initialize the camera view to cover an object before it has been specified as the target object of interest. On the other hand, an object that has not appeared in the camera view cannot be specified either.
While working in an activity area covered by a wireless local area network, the received radio wave signals from a wireless communication device associated with an object can be used to determine the position of the object in the activity area. Furthermore, when a field coordinate system is defined for the activity area, the object position can be uniquely represented in coordinates. Such an identified object position can be used by a camera view control system to initially locate the object of interest in order to automatically target the object in the camera view.
The following summary provides an overview of various aspects of exemplary implementations of the invention. This summary is not intended to provide an exhaustive description of all of the important aspects of the invention, or to define the scope of the invention. Rather, this summary is intended to serve as an introduction to the following description of illustrative embodiments.
In a first illustrative embodiment, a camera includes a view control device configured for automatic object targeting; and a controller configured to receive radio wave signals from wireless communication devices in a wireless local area network and to determine the position of an object in a field coordinate system based on radio wave signals from a wireless communication device associated with the object, such that camera orientation adjustment and object recognition can be used by the view control device to target the object in the camera view according to the determined object position.
In a second illustrative embodiment, a method includes determining the position of an object in a field coordinate system based on received radio wave signals from a wireless communication device associated with the object, and performing camera orientation adjustment and object recognition for the camera view control device to target the object in the camera view automatically according to the determined object position.
In a third illustrative embodiment, a view control system includes at least one controller configured to determine the position of an object in a field coordinate system based on received radio wave signals from a wireless communication device associated with the object, and to perform camera orientation adjustment and object recognition for the camera view control device to target the object in the camera view automatically according to the determined object position.
Additional features and advantages of the invention will be made apparent from the following detailed description of illustrative embodiments.
As required, detailed embodiments of the present invention are disclosed herein; however, it is to be understood that the disclosed embodiments are merely exemplary of the invention that may be embodied in various and alternative forms. The figures are not necessarily to scale; some features may be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art to variously employ the present invention.
The present invention discloses methods and apparatus for a new camera view control system that is able to capture and follow an object of interest in camera view even before the object has been specified as the target object. As a result, fast and automatic object initialization can be achieved for camera based object targeting and object following functions.
In this system, video frames are captured from a camera whose orientation is determined by the camera view control device's position and motion in a camera system coordinate system. The camera's orientation and zooming ratio determine the region of the field covered by the camera view frame. The camera orientation also determines the position in the field at which the center of the captured frame is aimed.
An exemplary embodiment of the camera's view control device for orientation adjustment includes the camera platform's pan and tilt angles as well as their angular speeds and angular accelerations. An alternative embodiment of the camera's orientation adjustment is realized by a software feature that delivers the camera view to the user by digitally panning and tilting a sub-frame within the full view of the camera frame without physically moving the camera. The sub-frame of the camera video is then delivered to the service customer as the video output. The camera has a control system with a computer software function to recognize candidate objects in camera frames.
When the object targeting service is requested by users, even before a target object is specified for the camera, the camera view control system determines the position of an object of interest in a field coordinate system defined for an activity field based on received radio wave signals from a wireless communication device associated with the object. The association of a wireless communication device with an object generally indicates that the device is attached to the object, held by the object, or following the object closely, such that the position of the device is regarded as the position of the object in the activity field. The camera view controller then performs at least one of orientation adjustment and object recognition for the camera view control device to target the object in the camera view according to the determined position of the object in the activity field.
With reference to
A first novelty of the present invention is the incorporation of the field coordinate system (FCS) 30 and the positioning system 64 into the camera view service system 10. The field coordinate system 30 enables seamless integration of the positioning system 64 and the camera view control 70 to achieve unified and high precision object positioning and camera targeting functions. An exemplary embodiment of the field coordinate system is a two dimensional or three dimensional Cartesian coordinate system. In the two dimensional case, two perpendicular lines are chosen and the coordinates of a point are taken to be the signed distances to the lines. In the three dimensional (3D) case, three perpendicular planes are defined for the local activity region and the three coordinates of any location are the signed distances to each of the planes. In the present embodiment of the invention, the field coordinate system 30 is a 3D system with three planes, X-Y, X-Z and Y-Z, perpendicular to each other. A position in the field coordinate system 30 has unique coordinates (x, y, z) that identify where it is and its geographic and geometric relationships with respect to other positions and objects in the activity field.
In the field coordinate system 30, an object surface 42 at the height zo defines the base activity plane for an object 34. The object 34 is illustrated as a human being in the activity field 38. The object surface 42 can be at any orientation angle with respect to the 3D planes of the field coordinate system. In the present embodiment, it is illustrated as a plane that is parallel to the X-Y plane. The camera line-of-sight 18 is the centerline of the camera lens and it determines the center point of the camera view. The intersection point of the line-of-sight 18 of the camera 14 with the object surface 42 defines the aim-point 22. The position of the object 34 in the field coordinate system is defined by (xo, yo, zo). The coordinates of the aim-point 22 are (xsc, ysc, zsc). On the object surface, the camera view region 26 is illustrated by a dark parallelogram. The view region 26 is the area on the activity field that is covered by the camera frames at the present camera orientation and zooming ratio. The aim-point 22 is at the center of the view region. Both the camera view frame and the view region 26 in the exemplary embodiments of the invention are rectangular in shape.
The position of the object 34 in the field coordinate system 30 is determined by the positioning system 64. In the primary embodiment of the invention, the positioning system is established based on a wireless local area network (WLAN). The WLAN and positioning system 64 comprises WLAN access points 54 and a WLAN positioning engine 62. The WLAN access points 54 receive radio wave signals 50 from a wireless communication device 46 in the activity field. Besides serving normal communication functions, information from such signals as well as their reception properties is transmitted via communication channel 58 to the WLAN positioning engine 62 to determine the position of the wireless communication device 46 in the FCS 30. When the wireless communication device 46 is associated with the object 34, the determined position is also regarded as the position of the object 34 in the FCS 30. Subsequently, the determined position is communicated to the camera view control 70 via communication channel 66. Based on the determined object position in the FCS 30, the camera view control 70 operates the camera view control device 14 to target the object in the camera view using at least one of the following methods: 1) adjust the camera orientation and zooming ratio such that the view region covers the object position with sufficient exhibition of the object in the camera view; 2) recognize and outline the object in the camera view presented to service users.
An embodiment of the WLAN is a WiFi communication network. Typical wireless communication devices include WiFi tag devices, smartphones and tablet computers, etc. From the WiFi devices, information data, operation commands and measurements are transmitted to the access points (APs) of the WiFi network. In an exemplary embodiment, this information and data are then sent to a WLAN network station. Besides passing the normal network data, the WiFi network station redirects the received signal strength (RSS) data to the positioning engine 62, where the position of the WiFi device is determined based on fingerprinting data calibrated over the field coordinate system 30. An alternative embodiment of the wireless communication network can be a cellular network, and an alternative embodiment of the positioning system can be a GPS.
It is important to point out that the following descriptions of the technologies use target objects on a planar activity ground as an example to demonstrate the invention. This shall not be treated as limiting the scope of the invention. The presented technology can be easily modified and extended to support applications in which the activities of the target object involve large vertical motions. With reference to
WiFi positioning has the distinct advantages of low cost and wireless connectivity. Through the local WiFi network, service users connect to the camera view system from wireless communication devices such as a smartphone 148, a tablet/laptop computer 152, and WiFi attachment devices 140. Although WiFi has not been designed for positioning, its radio signal can be used for location estimation by exploiting the Received Signal Strength (RSS) values measured with respect to WiFi access points (APs). Alternatively, Angle of Arrival, Time of Arrival and Time Difference of Arrival can be used to determine the location based on geometry. The positioning methods used include at least one of a pattern recognition method, a triangulation method and a trilateration method. The RSS based fingerprinting method is a type of pattern recognition method for positioning, and it is used as the exemplary embodiment to illustrate the new object targeting technology.
A typical local WiFi network 100 comprises WLAN stations and multiple access points 132. The distribution of the access points constructs a network topology that can be used for the RSS fingerprinting based positioning service. Beacons and information messages 136 are communicated between the local WiFi network 132 and the wireless service terminal devices. The local WiFi network 132 communicates received information and measurement data 128 with a WLAN management unit called the WLAN manager 104. The WLAN manager 104 then directs the positioning measurement data 108 to the positioning engine 112 while it directs service related information and control data 124 to the camera control system 120. The positioning engine 112 processes the received positioning measurement data 108 from the WLAN manager 104 to determine the present position and motion of the wireless communication devices in the FCS 30. The determined position and motion data 116 is then sent to the camera control system 120 for object targeting functions.
For the camera viewing system, both a network based WiFi positioning system topology and a terminal assisted WiFi positioning system topology can be used. In the network based topology, the RSS measurement is performed centrally by the WiFi network stations 132 rather than by the wireless service terminal devices. Beacons 136 for positioning purposes are sent from the wireless communication devices 140, 148 and 152, and they are received by the stations in the local WiFi network 132. The RSS measurement is carried out at the stations based on their received beacon signal strength. On the other hand, in the terminal assisted WiFi positioning system topology, signal beacons are generated at the network stations in the local WiFi network 132. The RSS measurement is carried out at the individual wireless communication devices 140, 148 and 152. These devices then package the RSS measurements into positioning data messages and transmit the messages through the local WiFi network 132 to the WLAN manager 104. In both system topologies, the RSS measurement data is then redirected to the positioning engine 112. This engine has a location fingerprinting database that stores the RSS values obtained at different calibration points in the area of interest. In the positioning application, a location estimation algorithm is used to estimate the present location based on the measured RSS values from a WiFi device at an unknown location and the previously created database of the RSS map.
Location fingerprinting based WiFi positioning systems usually work in two phases: a calibration phase and a positioning phase. The following descriptions use the network based WiFi positioning system topology as an exemplary embodiment to introduce the fingerprinting based positioning method. In the calibration phase, a mobile device is used to send out wireless signal beacons at a number of chosen calibration points. The RSS values are measured from several APs. Each measurement becomes a part of the radio map and is a tuple (qi, ri), for i=1, 2, . . . , n known calibration locations. qi=(xi, yi) are the coordinates of the i-th location in the field coordinate system. ri=(ri1, ri2, . . . , rim) are the m RSS values measured at the APs with respect to signal beacons sent out at the calibration location. In the positioning phase, a mobile device sends out a signal beacon at an unknown location. The RSS values are measured from the APs, and the positioning engine estimates the location using the previously created radio map and a weighted k-Nearest Neighbors algorithm for location fingerprinting. After that, the (x, y) coordinates of the unknown location are determined. The fingerprinting techniques usually do not require knowing the exact locations of the APs.
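To make the weighted k-Nearest Neighbors step concrete, the following Python sketch shows one plausible realization of the positioning phase; the function name, the choice of Euclidean distance in RSS space, and the inverse-distance weights are illustrative assumptions rather than part of the claimed system.

```python
import numpy as np

def knn_fingerprint_position(rss_measured, radio_map, k=3, eps=1e-6):
    """Estimate (x, y) from a measured RSS vector using weighted k-NN.

    rss_measured: length-m vector of RSS values observed for the unknown location.
    radio_map:    list of calibration tuples (q_i, r_i), where q_i = (x_i, y_i)
                  are field coordinates and r_i is the length-m RSS vector
                  recorded at that calibration point.
    """
    coords = np.array([q for q, _ in radio_map], dtype=float)   # n x 2
    rss_db = np.array([r for _, r in radio_map], dtype=float)   # n x m

    # Distance in signal (RSS) space to every calibration point.
    d = np.linalg.norm(rss_db - np.asarray(rss_measured, dtype=float), axis=1)

    # Keep the k calibration points with the most similar RSS vectors.
    nearest = np.argsort(d)[:k]

    # Weight each neighbor by the inverse of its RSS-space distance.
    w = 1.0 / (d[nearest] + eps)
    w /= w.sum()

    # The location estimate is the weighted average of the neighbor coordinates.
    return tuple(w @ coords[nearest])
```

Calling the function with a live RSS vector and the radio map collected in the calibration phase returns the estimated (x, y) coordinates of the device in the field coordinate system.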
With reference to
The camera with view control device 14 may contain different types of camera systems. Analog cameras and IP cameras are typically used. Depending on the camera system's orientation and zooming capability, the camera systems used are further classified into three categories: static camera, static zooming (SZ) camera and pan-tilt-zooming (PTZ) camera. A static camera has fixed orientation and focus after installation. In other words, the camera view and view region with respect to the activity field 38 are fixed. An SZ camera has fixed orientation but an automatically adjustable zooming ratio such that the area of the view region can be changed according to the zooming ratio selected. A PTZ camera can change both its orientation, by adjusting its pan and tilt angles, and its view region, by adjusting its zooming ratio. As a result, it provides the flexibility to steer the view region toward an area of interest and to place the point of interest in the activity field as close to the center of the camera frames as possible.
With reference to
The camera 304 is an optical instrument that records images and videos of camera views. The camera 304 has a line-of-sight 320 that determines the center of its view. The camera 304 has a camera platform 306 that can provide pan and tilt motion to adjust the orientation of the camera line-of-sight 320. The camera platform's pan angle 312 and tilt angle 316 determine its coordinates (α,β) 318 in a camera orientation coordinate system 324. The camera view control device also comprises a position sensing unit 328 to measure and report the present pan and tilt angles of the camera platform 306. The position sensing unit 328 may further provide the pan and tilt motion measurements of the camera platform 306.
The camera view control device optionally comprises a camera track subsystem 332 that supports translational motion of the camera platform on the track through movable connections 336. Multiple movable connections 336 are illustrated in
The camera controller 308 is not only responsible for controlling the camera's basic functions and operations, but is also in charge of controlling the camera orientation adjustment by operating the pan and tilt functions of the camera platform 306, and optionally adjusting the camera position on the track 332. Furthermore, it is configured to perform at least one of orientation adjustment and object recognition for the view control device to target an object in the camera view according to the determined position of the object in the FCS 30.
With reference to
The output camera view frame 366 delivered to service customers is only a subarea 362 of the original full scale camera video frame 354. The area of frame 366 vs. the area of frame 354 is determined by the digital zooming ratio of the digital PTZ function. The relative pixel position difference between the full scale frame center 370 and the output frame center 374 in the camera frame coordinate system determines the relative pan and tilt positions of the output frame 366 with respect to the full scale frame 354. In this case, the pan position 378 is defined by the horizontal distance α between center 370 and center 374. The tilt position 382 is defined by the vertical distance β between the centers. The digital pan motion is along the X-axis and the digital tilt motion is along the Y-axis in the camera frame coordinate system. In continuous video frame outputs, the relative motion of the output frame center 374 with respect to the full scale frame center 370 defines the orientation motion of the camera viewing system. Particularly, the relative orientation velocity vector [uα, uβ] of the output frame with respect to the full scale video frame is depicted by the arrow 386. In a digital PTZ embodiment of the camera orientation control system, the camera frame coordinate system is also the camera system coordinate system.
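As a concrete illustration of the digital pan and tilt quantities described above, the short sketch below computes the pan position α, the tilt position β, and the relative orientation velocity [uα, uβ] from the pixel centers of the full scale frame 370 and the output frame 374 on two successive frames; the function names and the sign convention are illustrative assumptions.

```python
def digital_pan_tilt(full_center, out_center):
    """Pan and tilt positions of the output sub-frame, in pixels.

    full_center: (x, y) pixel center of the full scale camera frame.
    out_center:  (x, y) pixel center of the delivered output frame.
    """
    alpha = out_center[0] - full_center[0]   # horizontal offset = pan position
    beta = out_center[1] - full_center[1]    # vertical offset = tilt position
    return alpha, beta

def digital_orientation_velocity(prev_out_center, out_center, dt):
    """Relative orientation velocity [u_alpha, u_beta] of the output frame center."""
    u_alpha = (out_center[0] - prev_out_center[0]) / dt
    u_beta = (out_center[1] - prev_out_center[1]) / dt
    return u_alpha, u_beta
```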
With reference to
Based on the estimated height zo of the object position above the ground surface 38 from WLAN positioning, the height of the camera above the object surface 42 is hc=zc−zo, and the height of the camera above the ground surface 38 is hg=zc. The height of the object above the ground is ho=zo. The z-axis value for the ground surface is usually assumed to be zero. A surface plane at the height zo is called the object surface 42 and a surface plane at the height zc is called the camera platform surface. Both of these surfaces are parallel to the plane of the activity ground.
According to the values of the camera reported pan and tilt angles, the camera's heading angle α 408 and its overlook (look-down/look-up) angle β 412 can be derived. These two angles are usually linearly offset versions of the pan and tilt angles of the camera system. The horizontal distances between the camera and the object on the object surface can be computed as lx=hc cos α/tan β, denoted by numeral 416, and ly=hc sin α/tan β, denoted by numeral 420. The intersection point of the camera line-of-sight with the object surface 42 is the aim-point 22 at location (xsc, ysc, zsc), where (xsc, ysc, zsc)=(xc+lx, yc+ly, zo) in the field coordinate system 30. Similarly, the camera aim-point 424 evaluated on the ground surface is (xgc, ygc, zgc)=(xc+lxg, yc+lyg, 0), where lxg=hg cos α/tan β and lyg=hg sin α/tan β. Given the knowledge of (xc, yc, zc), the camera orientation heading angle α and overlook angle β can be derived from a target aim-point (xsc, ysc, zsc) as:
The camera orientation heading angular velocity ωα and overlook angular velocity ωβ can be derived from a target aim-point velocity [usc vsc] on the object surface as:
Equation (1) is used to determine the desired camera orientation (α,β) based on the known camera aim-point position (xsc, ysc) on the object surface, and Equation (2) is used to transform the aim-point velocity in the FCS 30 to the desired pan-tilt speeds of the camera system in the camera system coordinate system. When the object surface is not available, the ground surface is used instead.
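The geometric relations above can be inverted to recover the heading and overlook angles from a desired aim-point. The Python sketch below shows one such inversion, consistent with lx=hc cos α/tan β and ly=hc sin α/tan β under the planar object surface assumption; it is a simplified illustration, and the exact form of Equations (1) and (2) in the original disclosure may differ.

```python
import math

def orientation_from_aim_point(cam_pos, aim_point):
    """Heading angle alpha and overlook angle beta for a desired aim-point.

    cam_pos:   (xc, yc, zc) camera position in the field coordinate system.
    aim_point: (xsc, ysc, zo) desired aim-point on the object surface.
    """
    xc, yc, zc = cam_pos
    xsc, ysc, zo = aim_point
    hc = zc - zo                                # camera height above the object surface
    lx, ly = xsc - xc, ysc - yc                 # horizontal offsets to the aim-point
    alpha = math.atan2(ly, lx)                  # heading angle
    beta = math.atan2(hc, math.hypot(lx, ly))   # overlook (look-down) angle
    return alpha, beta
```

The corresponding angular velocities in Equation (2) can be approximated numerically by applying this function to two successive target aim-points and dividing the angle differences by the frame interval.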
After an object has been recognized in the camera frame, its position in the camera frame coordinate system can be used to determine the object's position in the FCS 30. To this end, a coordinate transformation method is needed for position conversion from the camera frame coordinate system to the FCS. This process is called the vision based positioning method. An exemplary embodiment of the vision positioning technique applies a 3D projection method to establish a coordinate mapping between the three-dimensional field coordinate system 30 and the two-dimensional camera video frame coordinate system 358. In the presentation of the proposed invention, the perspective transform is used as an exemplary embodiment of the 3D projection method. A perspective transform formula is defined to map coordinates between 2D quadrilaterals. Using this transform, a point (P, Q) on the first quadrilateral surface can be transformed to a location (M, N) on the second quadrilateral surface using the following formula:
And a velocity vector [uP, uQ] at point (P, Q) in the first quadrilateral surface can be transformed to a velocity vector [uM, uN] at point (M, N) on the second quadrilateral surface using the following formula:
Where a, b, c, d, e, f, g, h are constant parameters whose values are determined with respect to the selected quadrilateral areas to be transformed between the two surfaces in different coordinate systems. After the positions of the characteristic points of a target object are identified in the camera video frame, Equation (3) is used to locate their corresponding positions in the field coordinate system 30. In this case, the first quadrilateral is the image frame and the second quadrilateral is an area on a surface at a certain height zr in the FCS 30. The object surface or the ground surface is typically used. When a digital PTZ camera is used, Equations (4) and (5) are used to transform the reference aim-point velocity [urap, vrap] in the field coordinate system to the digital pan and tilt velocity [uα, uβ] 386 in the camera frame coordinate system 358.
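For illustration, a conventional eight-parameter perspective (homography) mapping of the kind referenced above can be written as M=(aP+bQ+c)/(gP+hQ+1) and N=(dP+eQ+f)/(gP+hQ+1). The sketch below assumes this standard formulation, which may differ in notation from Equations (3) through (5) of the original disclosure.

```python
def perspective_transform(P, Q, params):
    """Map a point (P, Q) on the first quadrilateral to (M, N) on the second.

    params = (a, b, c, d, e, f, g, h) are the eight constants determined from
    four corresponding corner pairs of the two quadrilaterals.
    """
    a, b, c, d, e, f, g, h = params
    w = g * P + h * Q + 1.0          # common projective denominator
    M = (a * P + b * Q + c) / w
    N = (d * P + e * Q + f) / w
    return M, N
```

A velocity vector can then be propagated through the same mapping by evaluating its Jacobian at (P, Q), either analytically or by finite differences.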
With reference to
With reference to
When both a WiFi based positioning result and a vision based positioning result are available, they are fused together through a position fusion algorithm. Let Cw and Cv denote the object location estimated from the WiFi positioning technique and the vision based positioning technique, respectively. Their associated noise variances are σw2 and σv2. By applying the Central Limit Theorem, the combined object location estimation Cwv is obtained as:
Cwv=σwv2(σw−2Cw+σv−2Cv)  (6)
where σwv2=(σw−2+σv−2)−1 is the variance of the combined estimate. It can be seen that the fused result is simply a linear combination of the two measurements weighted by the inverses of their respective noise variances. Alternatively, a Kalman filter can be used to fuse together the WiFi and vision position estimations by applying a first-order system. Particle filters and the Hidden Markov Model can also be used to improve the positioning accuracy. The Hidden Markov Model is a statistical model that allows the system to integrate the likelihood of a movement or positional change. The fusion of the target object positioning results generates a more accurate and reliable target object position (
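A minimal sketch of the inverse-variance fusion of Equation (6) is given below, applied per coordinate; the Kalman filter, particle filter and Hidden Markov Model alternatives mentioned above are not shown, and the function name is an illustrative assumption.

```python
def fuse_positions(c_w, c_v, var_w, var_v):
    """Fuse a WiFi estimate c_w and a vision estimate c_v per Equation (6).

    var_w, var_v: noise variances of the WiFi and vision estimates.
    Returns the fused coordinate and the combined variance.
    """
    var_wv = 1.0 / (1.0 / var_w + 1.0 / var_v)      # combined variance
    c_wv = var_wv * (c_w / var_w + c_v / var_v)     # inverse-variance weighting
    return c_wv, var_wv
```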
A vision based object positioning result is not available until candidate objects have been determined. Before that, the camera view control relies solely on the WLAN positioning result to initialize the object finding and targeting process. The vision positioning result for candidate objects can help identify the final object of interest by comparing the consistency and correlation between the WLAN object position (xo, yo, zo) and the vision object position ({circumflex over (x)}o, ŷo, {circumflex over (z)}o). For the finalized target object, the vision based object position can help improve the camera orientation adjustment precision to achieve better targeting and exhibition of the object in the camera view.
When a camera's orientation adjustment is not available or is limited, centering an object in the camera frame becomes difficult. In these situations, object recognition is used for object targeting in order to support object initialization and specification. With reference to
With reference to
After object initialization, when the target object has been found but has not yet been confirmed, the camera view control needs to operate the camera view control device in a motion that follows the target object's motion in order to keep the target object continuously in the camera view while it is moving. With reference to
where, by assuming zo(t)=zo(t−Δt), the presented embodiment of the method is used for objects in planar motion. This assumption is only used to simplify the presentation. It shall not be read as limiting the scope of the application. The object velocity [uo, vo] computed from the numerical derivative is usually passed through a low pass filter to smooth out signal noise. Alternatively, a Kalman filter or a particle filter can be used for object velocity estimation.
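One simple way to realize the velocity estimate described above is a finite difference followed by a first-order low-pass filter, as sketched below; the filter constant and function names are illustrative assumptions, and a Kalman or particle filter could be substituted as noted.

```python
def object_velocity(pos, prev_pos, dt):
    """Planar object velocity [u_o, v_o] by finite difference (z assumed constant)."""
    u_o = (pos[0] - prev_pos[0]) / dt
    v_o = (pos[1] - prev_pos[1]) / dt
    return u_o, v_o

def low_pass(new_value, prev_filtered, alpha=0.2):
    """First-order low-pass filter to smooth the noisy velocity estimate."""
    return alpha * new_value + (1.0 - alpha) * prev_filtered
```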
At the next step 916, the target aim-point velocity needs to be derived in order for the camera view to cover the object continuously in motion. To this end, the position error is first evaluated as:
ex=xo−xsc, ey=yo−ysc
The target aim-point motion aims at following the object motion while minimizing the position error between the aim-point position and the object position. An exemplary embodiment of the target aim-point velocity is determined as:
The target aim-point velocity in Equation (7) has to be transformed into the corresponding camera coordinate system to be implemented for orientation adjustment. This is done at step 920: if a physical PTZ camera is used, Equation (2) is used to derive the desired camera orientation motion in pan and tilt angular speeds; if a digital PTZ camera is used, Equations (4) and (5) are used to transform the aim-point motion in the FCS 30 to the camera frame coordinate system 358. Next, the derived camera orientation velocity is commanded to the camera view control device at step 924 to operate at the desired orientation velocity such that the camera view follows the motion of the moving object in the FCS 30. This method ends at step 928 and the camera view controller continues with other control functions.
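Although the exact form of Equation (7) is not reproduced here, one plausible realization consistent with the stated goal of following the object while driving the position error to zero is a feedforward-plus-proportional-feedback law, sketched below; the gain value and function name are assumptions for illustration only.

```python
def target_aim_point_velocity(obj_vel, obj_pos, aim_pos, k=1.0):
    """Target aim-point velocity: track the object while closing the position error.

    obj_vel: object velocity [u_o, v_o] in the field coordinate system.
    obj_pos: object position (x_o, y_o).
    aim_pos: current aim-point position (x_sc, y_sc).
    k:       proportional feedback gain on the position error (illustrative).
    """
    e_x = obj_pos[0] - aim_pos[0]
    e_y = obj_pos[1] - aim_pos[1]
    u_sc = obj_vel[0] + k * e_x      # feedforward object velocity + error correction
    v_sc = obj_vel[1] + k * e_y
    return u_sc, v_sc
```

The resulting [usc, vsc] is then converted to pan-tilt angular speeds via Equation (2) for a physical PTZ camera, or to digital pan and tilt pixel velocities via Equations (4) and (5) for a digital PTZ camera, as described above.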
Now, the overall camera view control system and method can be summarized. With reference to
As demonstrated by the embodiments described above, the methods and apparatus of the present invention provide advantages over the prior art by enabling automatic object initialization and targeting in the activity field before a target object has been specified.
While the best mode has been described in detail, those familiar with the art will recognize various alternative designs and embodiments within the scope of the following claims. Additionally, the features of various implementing embodiments may be combined to form further embodiments of the invention. While various embodiments may have been described as providing advantages or being preferred over other embodiments or prior art implementations with respect to one or more desired characteristics, those of ordinary skill in the art will recognize that one or more features or characteristics may be compromised to achieve desired system attributes, which depend on the specific application and implementation. These attributes may include, but are not limited to: cost, strength, durability, life cycle cost, marketability, appearance, packaging, size, serviceability, weight, manufacturability, ease of assembly, etc. The embodiments described herein that are described as less desirable than other embodiments or prior art implementations with respect to one or more characteristics are not outside the scope of the disclosure and may be desirable for particular applications.
This application is a continuation of U.S. Provisional Patent Application Ser. No. 61/864,533