A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
Field of the Disclosure
The present disclosure relates to apparatus and methods for tracking human subjects, and/or other moving and/or static objects using aerial video data.
Description of Related Art
Unmanned aerial vehicles may be used for collecting live video data. Pre-programming and/or two-way communication between the remote vehicle and a user may be employed in order to control video collection. Users engaged in attention-consuming activities (e.g., surfing, biking, skateboarding, and/or other activities) may not be able to control remote devices with sufficient speed and/or accuracy using conventional remote control devices and/or pre-programmed trajectories.
One aspect of the disclosure relates to a method of context based video acquisition by an autonomous mobile camera apparatus. The method may comprise one or more of acquiring video of a visual scene at a first data rate using the mobile camera apparatus; producing lower rate video from the video, the lower rate video characterized by a lower data rate compared to the first data rate; transmitting the lower data rate video via a wireless communications interface; detecting an indication of interest associated with the visual scene; in response to detection of the indication, storing the video at the camera apparatus at the first data rate; and/or other operations.
One aspect of the disclosure relates to a mobile camera apparatus. The mobile camera apparatus may include one or more of a camera sensor, a circular memory buffer, a nonvolatile storage, a communications interface, a processing component, and/or other components. The camera sensor may be configured to provide video at a full data rate. The circular memory buffer may be configured to store a portion of the video at the full data rate, the portion characterized by a first duration. The nonvolatile storage may be configured to store video at a reduced data rate for a second duration, the second duration being greater than the first duration. The communications interface may be configured to detect indications of interest associated with the video being acquired. The processing component may be configured to produce and store video snippets in response to the detected indications of interest by, based on an individual indication of interest: producing a video snippet, the video snippet characterized by the full resolution, the video snippet production comprising transferring the portion of video at the full resolution from the buffer to the nonvolatile storage; and storing a time tag in a table in the nonvolatile storage, the tag associated with the video snippet. The nonvolatile storage may be configured to store video at the full data rate for a third duration, the third duration being greater than the first duration and smaller than the second duration. Producing the video snippets may enable the mobile camera apparatus to obtain video at the full resolution over a time period of at least the second duration.
One aspect of the disclosure relates to a mobile video acquisition apparatus. The video acquisition apparatus may include one or more of a camera component, a nonvolatile storage, a communications interface, a logic, and/or other components. The camera component may be configured to provide video of a user. The nonvolatile storage may be capable of storing the video for a first duration. The communications interface may be configured to detect an indication of interest of a plurality of indications of interest associated with the video being provided. The logic may be configured to produce time stamps in response to detected indications of interest by, based on an individual indication of interest, producing a time stamp. The individual indication of interest may be produced by a user wearable device based on an action of the user, the wearable device disposed remote from the video acquisition apparatus and in data communication with the video acquisition apparatus. The individual time stamp may enable automatic access to a respective video snippet of the video stored on the nonvolatile storage corresponding to the individual indication of interest. The respective snippet may be of a snippet duration. The combined duration of the snippet durations for the video snippets corresponding to the detected indications of interest may be smaller than the first duration.
These and other features, and characteristics of the present technology, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification, wherein like reference numerals designate corresponding parts in the various figures. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the invention. As used in the specification and in the claims, the singular form of “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise.
All Figures disclosed herein are © Copyright 2014 Brain Corporation. All rights reserved.
Implementations of the present technology will now be described in detail with reference to the drawings, which are provided as illustrative examples so as to enable those skilled in the art to practice the technology. Notably, the figures and examples below are not meant to limit the scope of the present disclosure to a single implementation; other implementations are possible by way of interchange of or combination with some or all of the described or illustrated elements. Wherever convenient, the same reference numbers will be used throughout the drawings to refer to same or like parts.
Where certain elements of these implementations can be partially or fully implemented using known components, only those portions of such known components that are necessary for an understanding of the present technology will be described, and detailed descriptions of other portions of such known components will be omitted so as not to obscure the disclosure.
In the present specification, an implementation showing a singular component should not be considered limiting; rather, the disclosure is intended to encompass other implementations including a plurality of the same component, and vice-versa, unless explicitly stated otherwise herein.
Further, the present disclosure encompasses present and future known equivalents to the components referred to herein by way of illustration.
As used herein, the term “bus” is meant generally to denote all types of interconnection or communication architecture that is used to access the synaptic and neuron memory. The “bus” may be optical, wireless, infrared, and/or another type of communication medium. The exact topology of the bus could be for example standard “bus”, hierarchical bus, network-on-chip, address-event-representation (AER) connection, and/or other type of communication topology used for accessing, e.g., different memories in a pulse-based system.
As used herein, the terms “computer”, “computing device”, and “computerized device” may include one or more of personal computers (PCs) and/or minicomputers (e.g., desktop, laptop, and/or other PCs), mainframe computers, workstations, servers, personal digital assistants (PDAs), handheld computers, embedded computers, programmable logic devices, personal communicators, tablet computers, portable navigation aids, J2ME equipped devices, cellular telephones, smart phones, personal integrated communication and/or entertainment devices, and/or any other device capable of executing a set of instructions and processing an incoming data signal.
As used herein, the term “computer program” or “software” may include any sequence of human and/or machine cognizable steps which perform a function. Such program may be rendered in a programming language and/or environment including one or more of C/C++, C#, Fortran, COBOL, MATLAB™, PASCAL, Python, assembly language, markup languages (e.g., HTML, SGML, XML, VoXML), object-oriented environments (e.g., Common Object Request Broker Architecture (CORBA)), Java™ (e.g., J2ME, Java Beans), Binary Runtime Environment (e.g., BREW), and/or other programming languages and/or environments.
As used herein, the terms “connection”, “link”, “transmission channel”, “delay line”, “wireless” may include a causal link between any two or more entities (whether physical or logical/virtual), which may enable information exchange between the entities.
As used herein, the term “memory” may include an integrated circuit and/or other storage device adapted for storing digital data. By way of non-limiting example, memory may include one or more of ROM, PROM, EEPROM, DRAM, Mobile DRAM, SDRAM, DDR/2 SDRAM, EDO/FPMS, RLDRAM, SRAM, “flash” memory (e.g., NAND/NOR), memristor memory, PSRAM, and/or other types of memory.
As used herein, the terms “integrated circuit”, “chip”, and “IC” are meant to refer to an electronic circuit manufactured by the patterned diffusion of trace elements into the surface of a thin substrate of semiconductor material. By way of non-limiting example, integrated circuits may include field programmable gate arrays (e.g., FPGAs), a programmable logic device (PLD), reconfigurable computer fabrics (RCFs), application-specific integrated circuits (ASICs), and/or other types of integrated circuits.
As used herein, the terms “microprocessor” and “digital processor” are meant generally to include digital processing devices. By way of non-limiting example, digital processing devices may include one or more of digital signal processors (DSPs), reduced instruction set computers (RISC), general-purpose (CISC) processors, microprocessors, gate arrays (e.g., field programmable gate arrays (FPGAs)), PLDs, reconfigurable computer fabrics (RCFs), array processors, secure microprocessors, application-specific integrated circuits (ASICs), and/or other digital processing devices. Such digital processors may be contained on a single unitary IC die, or distributed across multiple components.
As used herein, the term “network interface” refers to any signal, data, and/or software interface with a component, network, and/or process. By way of non-limiting example, a network interface may include one or more of FireWire (e.g., FW400, FW800, etc.), USB (e.g., USB2), Ethernet (e.g., 10/100, 10/100/1000 (Gigabit Ethernet), 10-Gig-E, etc.), MoCA, Coaxsys (e.g., TVnet™), radio frequency tuner (e.g., in-band or OOB, cable modem, etc.), Wi-Fi (802.11), WiMAX (802.16), PAN (e.g., 802.15), cellular (e.g., 3G, LTE/LTE-A/TD-LTE, GSM, etc.), IrDA families, and/or other network interfaces.
As used herein, the terms “node”, “neuron”, and “neuronal node” are meant to refer, without limitation, to a network unit (e.g., a spiking neuron and a set of synapses configured to provide input signals to the neuron) having parameters that are subject to adaptation in accordance with a model.
As used herein, the terms “state” and “node state” are meant generally to denote a full (or partial) set of dynamic variables used to describe the state of a node.
As used herein, the terms “synaptic channel”, “connection”, “link”, “transmission channel”, “delay line”, and “communications channel” include a link between any two or more entities (whether physical (wired or wireless), or logical/virtual) which enables information exchange between the entities, and may be characterized by one or more variables affecting the information exchange.
As used herein, the term “Wi-Fi” includes one or more of IEEE-Std. 802.11, variants of IEEE-Std. 802.11, standards related to IEEE-Std. 802.11 (e.g., 802.11 a/b/g/n/s/v), and/or other wireless standards.
As used herein, the term “wireless” means any wireless signal, data, communication, and/or other wireless interface. By way of non-limiting example, a wireless interface may include one or more of Wi-Fi, Bluetooth, 3G (3GPP/3GPP2), HSDPA/HSUPA, TDMA, CDMA (e.g., IS-95A, WCDMA, etc.), FHSS, DSSS, GSM, PAN/802.15, WiMAX (802.16), 802.20, narrowband/FDMA, OFDM, PCS/DCS, LTE/LTE-A/TD-LTE, analog cellular, CDPD, satellite systems, millimeter wave or microwave systems, acoustic, infrared (i.e., IrDA), and/or other wireless interfaces.
It may be desirable to utilize autonomous aerial vehicles for video data collection. A video collection system comprising an aerial (e.g., gliding, flying and/or hovering) vehicle equipped with a video camera and a control interface may enable a user to start, stop, and modify a video collection task (e.g., circle around an object, such as a person and/or a vehicle), as well as to indicate to the vehicle which instances in the video may be of greater interest than others and worth watching later. The control interface apparatus may comprise a button (hardware and/or virtual) that may cause generation of an indication of interest associated with the instance of interest to the user. The indication of interest may be communicated to a video acquisition apparatus (e.g., the aerial vehicle).
In one or more implementations, the video collection system may comprise a multi-rotor Unmanned Aerial Vehicle (UAV), e.g., such as illustrated and described with respect to
The interface apparatus may communicate to the UAV via a wireless communication channel (e.g., radio frequency, infrared, light, acoustic, and/or a combination thereof and/or other modalities).
By way of an illustration, a sports enthusiast may utilize the proposed video collection system to record footage of herself surfing, skiing, running, biking, and/or performing other activities. In some implementations, a home owner may use the system to collect footage of the leaves in the roof's gutter, to check roof conditions, to survey not easily accessible portions of the property (e.g., up/down a slope from the house), and/or for other needs. A soccer coach may use the system to collect footage of all the plays preceding a goal.
Prior to flight (also referred to as “pre-flight”) the user may configure flight trajectory parameters of the UAV (e.g., altitude, distance, rotational velocity, and/or other parameters), configure recording settings (e.g., 10 seconds before, 20 seconds after the indication of interest), and configure the direction and/or parameters of rotation after a pause (e.g., clockwise, counterclockwise, alternating, speed). In one or more implementations, the user may load an operational profile (e.g., comprising the tracking parameters, target trajectory settings, video acquisition parameters, and/or environment metadata). As used herein, the term video acquisition may be used to describe operations comprising capture (e.g., transduction of light into an electrical signal (e.g., volts)) and buffering (e.g., retaining digital samples after an analog to digital conversion). Various buffer sizes and/or topologies (e.g., double, triple buffering) may be used in different systems, with common applicable characteristics: buffers fill up; for a given buffer size, a higher data rate may be achieved for a shorter clip duration. Buffering operation may comprise producing information related to acquisition parameters, duration, data rate, time of occurrence, and/or other information related to the video.
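A minimal sketch of the buffering behavior described above is shown below (in Python; the frame rate, window length, and class name are illustrative assumptions rather than parameters from this disclosure): a fixed-size ring buffer retains only the most recent full-rate frames, so for a given buffer size a higher data rate necessarily corresponds to a shorter retained clip duration.

```python
from collections import deque

class FrameRingBuffer:
    """Retain the most recent `seconds` of frames at the full frame rate.

    Once full, each newly captured frame evicts the oldest one, so the
    buffer always holds a sliding window of fixed duration.
    """

    def __init__(self, seconds, fps):
        self.fps = fps
        self.frames = deque(maxlen=int(seconds * fps))

    def capture(self, frame):
        # Appending to a full deque silently drops the oldest frame.
        self.frames.append(frame)

    def snapshot(self):
        # Copy out the buffered window, e.g., for transfer to persistent storage.
        return list(self.frames)

# Example: a 10-second pre-event window at 60 frames per second.
buf = FrameRingBuffer(seconds=10, fps=60)
for i in range(1000):            # simulated capture loop
    buf.capture({"index": i})
print(len(buf.snapshot()))       # 600 frames == 10 s at 60 fps
```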
The term video storage may be used to describe operations comprising persistent storing of acquired video (e.g., on flash, magnetic, and/or other medium). Storing operations may be characterized by storage medium capacity greatly exceeding the buffer size. In some implementations, the storage medium does not get depleted by subsequent capture events in a way that would hinder resolution of the capture process for, e.g., 0.005 second to 500 second clips. Storage may be performed using a local storage device (e.g., an SD card) and/or on a remote storage apparatus (e.g., a cloud server).
The pre-flight configuration may be performed using a dedicated interface apparatus and/or using other computerized user interface (UI) device. In some implementations, the user may employ a portable device (e.g., smartphone running an app), a computer (e.g., using a browser interface), wearable device (e.g., pressing a button on a smart watch and/or clicker remote and/or mode button on a smart hand-grip), and/or other user interface means.
The user may utilize the interface apparatus for flight initiation, selection of a subject of interest (SOI) (e.g., tracking target), calibration, and/or operation of the UAV data collection. In some implementations the SOI may be used to refer to a tracked object, a person, a vehicle, an animal, and/or other object and/or feature (e.g., a plume of smoke, extent of a fire, a wave, an atmospheric cloud, and/or other feature). The SOI may be selected using video streamed to a portable device (e.g., smartphone) from the UAV, may be detected using a wearable controller carried by the SOI and configured to broadcast the owner's intent to be tracked, and/or other selection methods. In some implementations, a user may utilize a remote attention indication methodology described in, e.g., co-owned and co-pending U.S. patent application Ser. No. 13/601,721 filed on Aug. 31, 2012 and entitled “APPARATUS AND METHODS FOR CONTROLLING ATTENTION OF A ROBOT”, incorporated supra. As described in above-referenced application No. '721, attention of the UAV may be manipulated by use of a spot-light device illuminating a subject of interest. A sensor device disposed on the UAV may be used to detect the signal (e.g., visible light, infrared light) reflected by the illuminated area requiring attention. The attention guidance may be aided by way of an additional indication (e.g., sound, radio wave, and/or other) transmitted by an agent (e.g., a user) to the UAV indicating that the SOI has been illuminated. Responsive to detection of the additional indication, the UAV may initiate a search for the signal reflected by the illuminated area requiring its attention. Responsive to detecting the illuminated area the UAV may associate one or more objects within the area as the SOI for subsequent tracking and/or video acquisition. Such an approach may be utilized, e.g., to indicate an SOI disposed in hard to reach areas (e.g., underside of bridges/overpasses, windows in buildings, and/or other areas).
In one or more implementations, the sensor component 104 may comprise one or more cameras configured to provide video information related to the person 106. The video information may comprise, for example, multiple streams of frames received from a plurality of cameras disposed separately from one another. Individual cameras may comprise an image sensor (e.g., charge-coupled device (CCD), CMOS device, active-pixel sensor (APS), photodiode arrays, and/or other sensors). In one or more implementations, the stream of frames may comprise a pixel stream downloaded from a file. An example of such a file may include a stream of two-dimensional matrices of red, green, blue (RGB) values (e.g., refreshed at a 12 Hz, 30 Hz, 60 Hz, 120 Hz, 250 Hz, 1000 Hz and/or other suitable rate). It will be appreciated by those skilled in the art when given this disclosure that the above-referenced image parameters are merely exemplary, and many other image representations (e.g., bitmap, luminance-chrominance (YUV, YCbCr), cyan-magenta-yellow and key (CMYK), grayscale, and/or other image representations) are equally applicable to and useful with the various aspects of the present disclosure. Furthermore, data frames corresponding to other (non-visual) signal modalities such as sonograms, infrared (IR), lidar, radar or tomography images may be equally compatible with the processing methodology of the disclosure, or yet other configurations.
The device 100 may be configured to move around the person 106 along, e.g., a circular trajectory denoted by arrow 102 in
In some implementations wherein the sensor component comprises a plurality of cameras, the device 100 may comprise a hardware video encoder configured to encode interleaved video from the cameras using a motion estimation encoder. Video information provided by the cameras may be used to determine the direction and/or distance 108 to the person 106. The distance 108 determination may be performed using encoded interleaved video using, e.g., methodology described in co-pending and co-owned U.S. patent application Ser. Nos. 14/285,414, entitled “APPARATUS AND METHODS FOR DISTANCE ESTIMATION USING MULTIPLE IMAGE SENSORS”, filed on May 22, 2014, and/or 14/285,466, entitled “APPARATUS AND METHODS FOR ROBOTIC OPERATION USING VIDEO IMAGERY”, filed on May 22, 2014, each of the foregoing incorporated herein by reference in its entirety.
The aerial vehicle 200 of
The user may use one or more interface elements 406 in order to indicate to the camera an instance of interest (e.g., “awesome”) for recording and/or viewing. In one or more implementations, the smart watch (e.g., the watch 460 of
In one or more implementations, the wearable device 420 of
In some implementations, the wearable device may comprise a smartphone 440 of
Prior to flight (also referred to as “pre-flight”) the user may utilize one or more of the devices 400, 420, 440, 460 in order to configure flight trajectory parameters of the UAV (e.g., altitude, distance, rotational velocity, and/or other parameters), configure recording settings (e.g., 10 seconds before, 20 seconds after the indication of interest), and configure the direction and parameters of rotation after a pause (e.g., clockwise, counterclockwise, alternating, speed). In one or more implementations, the user may load an SOI profile (e.g., comprising the tracking parameters and/or desired trajectory parameters and/or video acquisition parameters and/or environment metadata).
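By way of a hedged illustration only, such a pre-flight configuration might be represented as a simple record loaded before takeoff; the field names and default values below are hypothetical and are not taken from this disclosure.

```python
from dataclasses import dataclass

@dataclass
class PreFlightConfig:
    # Flight trajectory parameters
    altitude_m: float = 15.0
    distance_m: float = 10.0
    rotational_velocity_dps: float = 20.0   # degrees per second
    rotation_direction: str = "clockwise"   # or "counterclockwise", "alternating"
    # Recording settings around an indication of interest
    pre_event_s: float = 10.0               # seconds retained before the indication
    post_event_s: float = 20.0              # seconds recorded after the indication

# Example: load a profile with a higher altitude and alternating rotation.
profile = PreFlightConfig(altitude_m=20.0, rotation_direction="alternating")
print(profile)
```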
The pre-flight configuration may be performed using a dedicated interface apparatus and/or using other computerized user interface (UI) device. In some implementations, the user may employ a portable device (e.g., smartphone running an app), a computer (e.g., using a browser interface), wearable device (e.g., pressing a button on a smart watch and/or clicker remote), or other user interface means.
The user may further utilize the interface apparatus for flight initiation/SOI selection, calibration, and/or operation of the UAV data collection. In some implementations of SOI selection, the SOI may comprise the user, may be selected in video streamed from the UAV to a portable device (e.g., a smartphone), may comprise an object/person carrying the wearable controller configured to broadcast the owner's intent to be tracked, and/or may be selected using other methods.
In some implementations of SOI acquisition (e.g., identification) and/or calibration of the acquired SOI (e.g., user identity confirmation), the user may turn in place in order to provide views to enable the UAV controller to acquire the SOI. In one or more implementations, the last used SOI may be used for subsequent video acquisition sessions. The UAV controller may provide the user with visual/audio feedback related to state/progress/quality of calibration (e.g., progress, quality, orientation).
Event indicators may be utilized in order to index the longer segment and/or to generate shorter clips, via, e.g., a software process. In one or more implementations, the event indicators may comprise an electrical signal provided to capture hardware (e.g., to initiate capture) and/or to the buffering hardware (e.g., to modify what is being saved to long term storage out of some part of the buffer); this trigger may bear the signal of relevance produced by a potentially automated event detector (e.g., ball in the net). Various electrical signal implementations may be employed, e.g., a pulse of voltage (e.g., a TTL pulse with a magnitude threshold greater than 0 V and less than 5 V), a pulse of frequency, and/or other signal modulation. A spike from a neuron may be used to signal commencement of saving a high resolution clip from a memory buffer. In one or more implementations, the event indicator may comprise a software mechanism, e.g., a message, a flag in a memory location. In some implementations, the software implementation may be configured to produce one or more electronic time stamps configured to provide a temporal order among a plurality of events. Various timestamp implementations may be employed, such as a sequence of characters or encoded information identifying when an event occurred and comprising the date and time of day, a time stamp configured in accordance with the ISO 8601 standard representation of dates and times, and/or other mechanisms.
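A minimal sketch of the software event-indicator path described above, assuming a software message mechanism (the handler name and table layout are illustrative rather than prescribed by this disclosure): each detected indication appends an ISO 8601 time stamp to a table, which establishes a temporal order among events.

```python
import datetime

event_table = []          # ordered list of (timestamp, label) entries

def on_event_indicator(label="indication_of_interest"):
    """Record an ISO 8601 time stamp for a detected event indicator."""
    stamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
    event_table.append((stamp, label))
    return stamp

on_event_indicator()                  # e.g., a button press on a wearable device
on_event_indicator("ball_in_net")     # e.g., an automated event detector
for stamp, label in event_table:
    print(stamp, label)
```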
In one or more implementations, the time stamps may be used to modify the video storage process and/or a subsequent processing stage by, e.g., enabling a greater compression of regions in the inter-clip intervals (e.g., 518, in
Based on receipt of one or more indications of interest from the user and/or an analysis of sensory messages on the smart device 400 and/or the aerial platform 100, a controller of the UAV may generate snippets of equal duration 502, 504, 506 within the video stream 500 of
In some implementations, the video streams 500, 510 may be stored on the UAV and/or streamed to an external storage device (e.g., cloud server). The snippets 502, 504, 506, 512, 514, 516 may be produced from the stored video stream using the snippet duration information (e.g., 544, 546) and time stamps (bookmarks) associated with times when the user indication(s) of interest are detected. In some implementations, e.g., such as illustrated in
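A minimal sketch of how the stored bookmarks could be resolved into snippets (in Python; the per-second bookmark values, the 10 s/20 s durations, and the function name are illustrative assumptions): each time stamp, combined with the pre- and post-event snippet duration information, maps to a frame range within the longer stored stream.

```python
def snippet_bounds(event_time_s, pre_s=10.0, post_s=20.0, fps=30):
    """Map an event time stamp to a [start, end) frame range in the stored stream."""
    start_frame = max(0, int((event_time_s - pre_s) * fps))
    end_frame = int((event_time_s + post_s) * fps)
    return start_frame, end_frame

# Bookmarks (seconds into the stream) at which indications of interest were detected.
bookmarks = [42.0, 125.5, 301.2]
for t in bookmarks:
    print(t, "->", snippet_bounds(t))
```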
In one or more implementations, snippets associated with user indications of interest may be characterized by video acquisition parameters that may be configured differently compared to the rest of the video stream. By way of an illustration, snippet video may comprise data characterized by one or more of a higher frame rate (e.g., for recording bungee or sky-diving jumps), greater bit depth, multiple exposures, increased dynamic range, storing of raw sensor output, and/or other characteristics that may produce a larger amount of data (per unit of time) compared to the regular video stream portion (e.g., 508, 518, 568 in
Those skilled in the arts will appreciate that with a finite communication channel and/or data transfer (e.g., write) rates, there may be a limit to the resolution in space, time, bit depth, spectral channels, etc. A limit may exist with regard to the signal available to the imaging sensor based on one or more of the discretization of individual sensors, the quantity of photons, the properties of the compression of air, the quantal efficiency of the sensor and its noise floor, and/or other limiting factors. Given a set of parameters for transducing the energy upon a single optical or acoustic sensing element, a separate bottleneck may exist for writing the data. This process 570 may be parallelized to enable multiple media clips with different parameter settings. Individual clips may follow a process of storing the previous sampled data 548 and the subsequent sampled
The apparatus 600 may comprise a processing module 616 configured to receive sensory input from sensory block 620 (e.g., cameras 104 in
The apparatus 600 may comprise a storage component 612 configured to store video acquired during trajectory navigation by the autonomous vehicle. The storage component may comprise any applicable data storage technologies that may provide high volume storage (gigabytes) in a small volume (less than 500 mm3), and operate at low sustained power levels (less than 5 W). In some implementations, the storage 612 may be configured to store a video stream (e.g., 500, 510 in
The apparatus 600 may comprise memory 614 configured to store executable instructions (e.g., operating system and/or application code, raw and/or processed data such as portions of video stream 500, information related to one or more detected objects, and/or other information). In some implementations, the memory 614 may be characterized by faster access time and/or lower overall size compared to the storage 612. The memory 614 may comprise one or more buffers configured to implement buffering operations described above with respect to
In some implementations, the processing module 616 may interface with one or more of the mechanical 618, sensory 620, electrical 622, power components 624, communications interface 626, and/or other components via driver interfaces, software abstraction layers, and/or other interfacing techniques. Thus, additional processing and memory capacity may be used to support these processes. However, it will be appreciated that these components may be fully controlled by the processing module. The memory and processing capacity may aid in processing code management for the apparatus 600 (e.g., loading, replacement, initial startup and/or other operations). Consistent with the present disclosure, the various components of the device may be remotely disposed from one another, and/or aggregated. For example, the instructions operating the haptic learning process may be executed on a server apparatus that may control the mechanical components via network or radio connection. In some implementations, multiple mechanical, sensory, electrical units, and/or other components may be controlled by a single robotic controller via network/radio connectivity.
The mechanical components 618 may include virtually any type of device capable of motion and/or performance of a desired function or task. Examples of such devices may include one or more of motors, servos, pumps, hydraulics, pneumatics, stepper motors, rotational plates, micro-electro-mechanical devices (MEMS), electroactive polymers, shape memory alloy (SMA) activation, and/or other devices. The mechanical component may interface with the processing module, and/or enable physical interaction and/or manipulation of the device. In some implementations, the mechanical components 618 may comprise a platform comprising a plurality of rotors coupled to individually controlled motors and configured to place the platform at a target location and/or orientation.
The sensory devices 620 may enable the controller apparatus 600 to accept stimulus from external entities. Examples of such sensory devices may include one or more of video, audio, haptic, capacitive, radio, vibrational, ultrasonic, infrared, motion, and temperature sensors, radar, lidar, and/or sonar, and/or other sensory devices. The module 616 may implement logic configured to process user commands (e.g., gestures) and/or provide responses and/or acknowledgment to the user.
The electrical components 622 may include virtually any electrical device for interaction and manipulation of the outside world. Examples of such electrical devices may include one or more of light/radiation generating devices (e.g., LEDs, IR sources, light bulbs, and/or other devices), audio devices, monitors/displays, switches, heaters, coolers, ultrasound transducers, lasers, and/or other electrical devices. These devices may enable a wide array of applications for the apparatus 600 in industrial, hobbyist, building management, surveillance, military/intelligence, and/or other fields.
The communications interface may include one or more connections to external computerized devices to allow for, inter alia, management of the apparatus 600. The connections may include one or more of the wireless or wireline interfaces discussed above, and may include customized or proprietary connections for specific applications. The communications interface may be configured to receive sensory input from an external camera, a user interface (e.g., a headset microphone, a button, a touchpad, and/or other user interface), and/or provide sensory output (e.g., voice commands to a headset, visual feedback, and/or other sensory output).
The power system 624 may be tailored to the needs of the application of the device. For example, for a small hobbyist UAV, a wireless power solution (e.g., battery, solar cell, inductive (contactless) power source, rectification, and/or other wireless power solution) may be appropriate.
The GPS component 734 disposed in the UAV apparatus 710 and the wearable device 720 may provide position information associated with the UAV and the SOI, respectively. The sensor apparatus 730 may comprise a measurement component (MC) 736. The MC 736 may comprise one or more accelerometers, magnetic sensors, and/or rate of rotation sensors configured to provide information about motion and/or orientation of the apparatus 730. The MC 736 disposed in the UAV apparatus 710 and the wearable device 720 may provide motion information associated with the UAV and the SOI, respectively. The sensor apparatus 730 may comprise a wireless communications component 738. The communications component 738 may be configured to enable transmission of information from the UAV to the wearable apparatus and vice versa. In some implementations, the communications component 738 may enable data communications with a remote entity (e.g., a cloud server, a computer, a wireless access point and/or other computing device). In one or more implementations, the communications component 738 may be configured to provide data (e.g., act as a wireless beacon) to a localization process configured to determine location of the apparatus 710 with respect to the SOI 718 and/or geo-referenced location.
The apparatus 710 may be characterized by a “platform” coordinate frame, denoted by arrows 715. The wearable device 720 and/or the SOI 718 may be characterized by subject coordinate frame, denoted by arrows 725 in
In some implementations, a camera component of the apparatus 710 may be mounted using a gimbaled mount configured to maintain camera component view field extent (e.g., 716 in
The UAV apparatus 710 and the wearable device 720 may cooperate in order to determine and/or maintain position of the UAV relative to the SOI. In some implementations, the position determination may be configured based on a fusion of motion data (e.g., position, velocity, acceleration, distance, and/or other motion data) provided by the UAV sensor apparatus and/or the wearable device sensor apparatus (e.g., the apparatus 730).
In some implementations, images provided by a stereo camera component may be used to localize the SOI within the camera image. The subject localization may comprise determination of the SOI position, distance, and/or orientation. The SOI position datum determined from the camera 732 imagery may be combined with data provided by the position component 734, the measurement component 736, camera gimbal position sensors, and/or wireless beacon data in order to orient the camera view field 716 such as to place the SOI in a target location within the frame. In some implementations, e.g., those described with respect to
The wireless component 738 may be utilized to provide data useful for orienting the camera view field 716 such as to place the SOI in a target location within the frame. In one or more implementations, the wireless component data may comprise receiver signal strength indication (RSSI), time of arrival, and/or other parameters associated with transmission and/or receipt of wireless data by the beacon. The beacon may provide an SOI-centric position and a platform-centric direction of the SOI.
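As one hedged illustration of how such beacon data might aid localization (the log-distance path-loss model and its parameter values below are a common approximation assumed for this sketch, not a method stated in this disclosure), a received signal strength indication can be converted into a coarse range estimate that supplements the camera and position data.

```python
def rssi_to_range_m(rssi_dbm, rssi_at_1m_dbm=-40.0, path_loss_exponent=2.0):
    """Coarse range estimate (meters) from RSSI using a log-distance path-loss model."""
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10.0 * path_loss_exponent))

print(round(rssi_to_range_m(-60.0), 1))   # ~10 m under the assumed model parameters
```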
By way of an illustration, augmenting GPS data with inertial motion measurements may enable reduction of errors associated with SOI and/or UAV position determination. Combining position and/or velocity data provided by the UAV and the wearable device GPS components may enable a reduction in the systematic error associated with the GPS position determination.
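A minimal sketch of this error-reduction idea, under the simplifying assumption that the UAV and wearable GPS receivers share a common systematic bias: differencing their reported positions cancels the shared error, leaving a cleaner relative SOI-to-UAV vector for tracking. The coordinate values are illustrative only.

```python
def relative_position(uav_fix, subject_fix):
    """Relative SOI-to-UAV vector; a GPS bias common to both fixes cancels in the difference."""
    return tuple(u - s for u, s in zip(uav_fix, subject_fix))

# Both fixes carry the same hypothetical +3 m east bias; it cancels in the difference.
uav_fix = (103.0, 50.0, 20.0)       # east, north, up in meters (true east = 100.0)
subject_fix = (93.0, 48.0, 0.0)     # east, north, up in meters (true east = 90.0)
print(relative_position(uav_fix, subject_fix))   # (10.0, 2.0, 20.0)
```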
The bicycle may comprise a smart grip apparatus 820 configured to enable the cyclist to operate the UAV during tracking and/or data collection.
The smart grip apparatus 820 may comprise a button 824. The user may activate the button 824 (as shown by arrow 832 in
The sleeve 826 may be actuated using rotational motion (e.g., shown by arrow 822 in
The smart grip apparatus 820 may comprise an actuator 834. In one or more implementations, the actuator 834 may comprise a bi-action shifter lever. The user may activate the actuator 834 in two directions, e.g., as shown by arrows 836, 838 in
In one or more implementations, the control action 838 may be used to select mode of operation of the UAV tracking and/or data collection.
Panel 900 in
The separation vector 908 between the camera position 920 and the SOI position 910 at time t1 may be configured in accordance with a specific task. In some implementations, the task may comprise acquisition of a video of the SOI by the UAV with the SOI being in a particular portion (e.g., a center) of the video frame. At time t1, the vector 908 may denote orientation of the camera configured in accordance with the task specifications. At time t2, the camera orientation may be denoted by vector 916. With the camera pointing along direction 916 at time t2, acquisition of the video footage of the SOI may be unattainable and/or characterized by a reduced angular and/or depth resolution compared to the camera orientation denoted by line 914. In some implementations, a controller component may be configured to determine an angular adjustment 928 that may be applied in order to point the camera along the target direction 914 using a state determination process configured to reduce a discrepancy between the current state (current orientation denoted by broken line 916) and the target state 914, e.g., as described in detail with respect to
The control component may utilize various sensory information in order to determine the camera orientation adjustment (e.g., 918). In some implementations, the sensory information may comprise one or more images obtained by the mobile camera. Panels 930, 932, 934, 936, 938, 939 illustrate exemplary image frames comprising representation of the SOI useful for camera orientation adjustment.
Panel 930 may represent an SOI representation 944 that is disposed distally from the target location (e.g., not in the center of the frame, as shown by the representation 942 in panel 932). At time t1, the control component may determine the expected position of the SOI at time t2, shown by representation 944 in panel 934, in the absence of camera orientation adjustment. Application of the adjustment 928 may enable the camera to obtain the SOI representation 946, shown in panel 936. The representation 946 may be referred to as matching the target configuration of the task (e.g., being located in the center of frame 936).
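A hedged sketch of the centering behavior described above, using a simple proportional correction (the gain, field of view, and frame size are illustrative assumptions; this disclosure does not prescribe a specific control law): the pixel offset of the SOI representation from the frame center is converted into an angular adjustment of the camera orientation.

```python
def centering_adjustment(soi_px, frame_size=(1920, 1080),
                         fov_deg=(90.0, 60.0), gain=0.5):
    """Angular (azimuth, elevation) correction, in degrees, that moves the SOI
    representation toward the frame center; proportional control with a fixed gain."""
    center_x, center_y = frame_size[0] / 2.0, frame_size[1] / 2.0
    azimuth_err_deg = (soi_px[0] - center_x) / frame_size[0] * fov_deg[0]
    elevation_err_deg = (soi_px[1] - center_y) / frame_size[1] * fov_deg[1]
    return gain * azimuth_err_deg, gain * elevation_err_deg

# SOI detected right of and above the frame center (pixel y increases downward).
print(centering_adjustment((1500, 400)))   # positive azimuth, negative elevation correction
```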
In some implementations, the control component may evaluate the expected SOI position within a frame by applying one or more actions (e.g., the adjustment 928). The control component may represent adjustment actions stochastically, and/or may implement a control policy that may draw samples from a stochastic representation of an internal state (e.g., a stochastic representation of one or more variables such as position, velocity, acceleration, angular velocity, torque, and/or control command, as they apply to one or both of the SOI 912 and the camera 922). Panel 938 illustrates a distribution (denoted by dots 948 in
In some implementations, the stochastic representation of internal state may be configured using a parametric form. A probability distribution of a state variable (e.g., estimated future position) may be maintained with a parametric representation (e.g., a Gaussian distribution with a given mean and variance). A cost function may be utilized for trajectory navigation. In some implementations, the cost function may be configured based on proximity to the SOI, variability of vehicle position, and/or speed. The cost function may be configured using a product of a function indicating the distance to an object (e.g., a stepwise or sigmoidal cost over location, configured to characterize proximity of the SOI and/or objects to the vehicle) and a probability distribution of a state variable (e.g., estimated future position), as assessed by a function or its approximation. In one or more implementations, the cost function may be configured to characterize, e.g., the distance of an outer edge (proximal surface) of a building from the vehicle. A stepwise cost function may be configured to produce a zero value for the open space up to the wall (e.g., up to 5 meters), and a value of one for the blocked off region behind. A sigmoid may provide a smooth transition and enable handling of the uncertainty that may be associated with the location of the vehicle and/or objects and/or the relative position of the wall. Those skilled in the art may appreciate that the risk of a candidate action may reduce to the product of a fixed cost coefficient and an evaluation of an error function (e.g., the cumulative distribution function of a Gaussian), which may be stored in a lookup table.
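A minimal sketch of this risk evaluation under the assumption of a Gaussian position estimate and a stepwise cost with a wall at 5 meters: the risk of a candidate action reduces to a fixed cost coefficient multiplied by the Gaussian cumulative distribution evaluated at the wall boundary (the error-function evaluation that could equivalently be read from a lookup table). The numerical parameters are illustrative only.

```python
import math

def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian with the given mean and std."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def stepwise_proximity_risk(predicted_dist_m, dist_std_m,
                            wall_at_m=5.0, cost_coefficient=1.0):
    """Risk = cost coefficient * probability that the predicted position lies beyond
    the wall, i.e., in the region where the stepwise cost equals one."""
    probability_blocked = 1.0 - gaussian_cdf(wall_at_m, predicted_dist_m, dist_std_m)
    return cost_coefficient * probability_blocked

print(round(stepwise_proximity_risk(4.0, 1.0), 3))   # ~0.159: noticeable risk of overshoot
print(round(stepwise_proximity_risk(2.0, 0.5), 3))   # ~0.0: essentially no risk
```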
The parametric stochastic representation may be sampled in order to obtain a distribution of samples that may provide (within noise bounds) a measure of a corresponding cost function that may reflect a given user parameter. In some implementations, the probability distribution of the state space may be sampled via a statistical method (e.g., Gibbs sampling, a Monte Carlo Markov Chain, and/or some other sampling method) whose cost could be evaluated after the result of each independent sample, such that a command is accepted (e.g., within bounds, according to user criteria of desired smoothness), or rejected (e.g., unacceptably jerky), by a search process over actions. Such a search process may be evaluated on each or any of the samples, such that the number K of samples within the criteria, out of N total processed, is above or below a threshold (e.g., according to the confidence interval of a binomial distribution with a particular stringency alpha), terminating the search process over actions (e.g., for acceptance of the action) and/or terminating the evaluation of a particular action (e.g., for rejection). Some implementations may include a specification for what policy to apply in the condition that the search process does not terminate in time, or by the Xth sample (e.g., that the system return to a stable state, despite violation of some user criteria). Some implementations may include a method for choosing the next candidate action (e.g., based on the estimated gradient or curvature of K/N for each or any criteria), potentially increasing the likelihood that an action selection terminates with fewer action evaluations.
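The sketch below follows this sampling idea under illustrative assumptions (a Gaussian outcome model, fixed acceptance/rejection margins, and a hard sample budget standing in for the binomial confidence test described above; none of the names or thresholds are from this disclosure): samples of a candidate action's outcome are drawn one at a time, and the action is accepted or rejected once enough of them fall within the user criteria, with a fallback policy if the budget is exhausted.

```python
import random

def evaluate_action(sample_outcome, within_criteria,
                    required_fraction=0.9, min_samples=20, max_samples=200):
    """Accept or reject a candidate action from independent outcome samples.

    Returns "accept", "reject", or "fallback" (the search did not terminate
    within the sample budget, e.g., return to a stable state instead).
    """
    ok = 0
    for n in range(1, max_samples + 1):
        ok += 1 if within_criteria(sample_outcome()) else 0
        if n >= min_samples:
            fraction_ok = ok / n
            if fraction_ok >= required_fraction + 0.05:   # clearly within criteria
                return "accept"
            if fraction_ok <= required_fraction - 0.15:   # clearly outside criteria
                return "reject"
    return "fallback"

# Illustration: sampled outcomes are predicted jerk magnitudes; the criterion is smoothness.
sample_jerk = lambda: abs(random.gauss(0.0, 1.0))      # hypothetical outcome model
smooth_enough = lambda jerk: jerk < 2.5                # hypothetical user criterion
print(evaluate_action(sample_jerk, smooth_enough))
```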
The control component may utilize posterior samples of the candidate world state given that a proposed action would be attempted 939 or no action would be attempted 938. The representations of a control state 948 and 949 may reflect computational stages in the ongoing processing of sensory data for autonomous navigation by a controller and/or may be used to indicate an anticipated future sensory state to a user via a GUI.
In
In some implementations, the video acquisition methodology described herein may be utilized for providing additional services besides video acquisition.
As shown in
In
In one or more applications that may require computational power in excess of that which may be provided by a processing module of the controller 1210_2, the local computerized interface device 1204 may be used to perform computations associated with operation of the robotic body coupled to the learning controller 1210_2. The local computerized interface device 1204 may comprise a variety of computing devices including, for example, a desktop PC, a laptop, a notebook, a tablet, a phablet, a smartphone (e.g., an iPhone®), a printed circuit board, and/or a system on a chip (SOC) comprising one or more of a graphics processing unit (GPU), field programmable gate array (FPGA), multi-core central processing unit (CPU), an application specific integrated circuit (ASIC), and/or other computational hardware.
In one or more implementations, the data link 1214 may be utilized in order to transmit a video stream and/or accompanying time stamps, e.g., as described above with respect to
During image acquisition, the system 1320 may be configured to navigate a target trajectory. In one or more implementations, the trajectory navigation may comprise maintaining a location in space, varying platform vertical position, and/or horizontal position (e.g., oscillating between two locations at a defined frequency, potentially pausing at extrema to capture image samples) and/or performing of other actions.
The physical structure of the camera component 1324 may be configured to maintain a constant relative position of individual optical elements while supporting effectors to actuate angular displacements that change the angular elevation 1328 and/or azimuth 1326 with respect to a coordinate system defined by the body frame 1322 and/or a world frame. The azimuthal rotation 1326 of the imaging plane may be enabled by a rotation mechanism 1330. The imaging plane in the camera module 1324 may be centered over a visually-defined SOI and/or a GPS-defined coordinate, enabling a sequence of visualization in polar coordinates. For example, a contiguous change in azimuth 1326 may enable an imaging sensor to capture a series of images along a circular path 1336. A change in the elevation may enable imaging along a different circular path, such that a fixed sampling rate of video and a constant angular velocity of azimuth may produce a greater number of pixel samples per square centimeter closer to the location 1334 below the camera module than at regions more displaced along the horizontal axis (e.g., locations along path 1336).
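As a worked illustration of this sampling geometry (the angular velocity and frame rate below are assumed values, not parameters of this disclosure): at a constant azimuthal angular velocity and a fixed frame rate, the ground arc swept between consecutive frames grows linearly with horizontal radius, so regions near the point below the camera receive more samples per unit area than regions farther out along a path such as 1336.

```python
import math

def arc_per_frame_m(radius_m, azimuth_rate_dps=20.0, fps=30.0):
    """Ground arc length (meters) swept between consecutive frames at a given
    horizontal radius, assuming constant azimuthal angular velocity."""
    return radius_m * math.radians(azimuth_rate_dps) / fps

for radius in (1.0, 5.0, 20.0):
    print(radius, "m radius ->", round(arc_per_frame_m(radius), 4), "m between frames")
```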
Imaging sequences may be used to construct one or more of representations of the physical layout of a scene, the surface properties of objects in a scene (e.g., appearance, material, reflectance, albedo, illumination, and/or other surface properties), changes in the scene (e.g., changes in temperature or the movement of people, animals, plants, vehicles, fluids, objects, structures, equipment, and/or other changes in the scene), changes in surface properties of the scene (e.g., the spectral reflection of surfaces), and/or other aspects pertaining to the region near a SOI or GPS defined landmark, or otherwise. For example, the system 1320 may be used to reconstruct a building structure including a surface map of thermal emissions, localized around a SOI (e.g., a window that may or may not have good insulation).
In some implementations, methods 1400, 1410, 1440, 1460 may be implemented in one or more processing devices (e.g., a digital processor, an analog processor, a digital circuit designed to process information, an analog circuit designed to process information, a state machine, and/or other mechanisms for electronically processing information). The one or more processing devices may include one or more devices executing some or all of the operations of methods 1400, 1410, 1440, 1460 in response to instructions stored electronically on an electronic storage medium. The one or more processing devices may include one or more devices configured through hardware, firmware, and/or software to be specifically designed for execution of one or more of the operations of methods 1400, 1410, 1440, 1460.
At operation 1402 a state parameter may be determined while navigating a trajectory. In some implementations, the trajectory navigation may comprise navigation of the trajectory 300 and/or 330 by an aerial vehicle described above with respect to
At operation 1404 a determination may be made as to whether the state parameter falls within the target area of the state space. In some implementations, the target area of the state space may comprise volume bounded by curves 315, 316 in
At operation 1406 the target state space may be populated with one or more trajectory paths. In some implementations, the population of the state space may comprise one or more trajectory types, e.g., oscillating, spiral, random walk, grid, hovering, combinations thereof, and/or other trajectories. In one or more implementations, populating the state space with one or more paths may be configured based on a timer (e.g., adapt the course when a time interval elapses), platform location (e.g., when passing a landmark), and/or other criteria (e.g., upon completing a revolution around the SOI). The time interval for trajectory adaptation may be selected from the range between 1 second and 30 seconds.
At operation 1412 a target trajectory may be navigated. In one or more implementations, the target trajectory navigation may comprise one or more actions described above with respect to
At operation 1414 an indication of interest may be received. In some implementations, the indication of interest may be provided by a user via a smart wearable device (e.g., as shown and described with respect to
At operation 1416 a time stamp associated with the indication of interest may be produced. In some implementations, the time stamp may comprise an entry in a list configured to indicate a snippet (e.g., 502) in a video stream (e.g., 500 in
At operation 1442 a SOI may be tracked while navigating a target trajectory. In some implementations, the SOI tracking may comprise tracking one or more of a person (e.g., a cyclist 810 in
At operation 1444 video of the SOI may be acquired. In some implementations, the acquired video may be stored on board of the UAV and/or streamed to an external storage. In some implementations, e.g., such as described above with respect to
At operation 1446 a determination may be made as to whether an indication of relevance has been received. In one or more implementations, the indication of relevance may be provided by the SOI (e.g., the cyclist and/or a person within the group 202 in
Responsive to a determination at operation 1446 that the indication of relevance had occurred, the method may proceed to operation 1448 wherein a time stamp may be produced. In one or more implementations, the time stamp may comprise an entry in a list configured to denote one or more portions (snippets) of video (e.g., acquired at operation 1444) corresponding to period of relevance, e.g., as described above with respect to
In some implementations, the time stamp may be configured to cause recording of a historical video portion and/or subsequent video portion, e.g., the portions 544, 546, respectively, described above with respect to
At operation 1452 a subsequent video portion may be acquired and stored. In some implementations, the storing of the historical video portion and/or acquisition of the subsequent portion may be configured based on use of multiple buffering techniques comprising read and write memory buffers. Time stamp(s) may be utilized in order to index the longer segment and/or to generate shorter clips, via, e.g., a software process. In one or more implementations, the time stamps may be used to modify the video storage process and/or a subsequent processing stage by, e.g., enabling a greater compression of regions in the inter-clip intervals (e.g., 518, in
In one or more implementations, snippets associated with user indications of interest may be characterized by video acquisition parameters that may be configured differently compared to the rest of the video stream. By way of an illustration, snippet video may comprise data characterized by one or more of a higher frame rate (e.g., for recording bungee or sky-diving jumps), greater bit depth, multiple exposures, increased dynamic range, storing of raw sensor output, and/or other characteristics that may produce a larger amount of data (per unit of time) compared to the regular video stream portion (e.g., 508, 518, 568 in
At operation 1462, the wearable device may be used to configure UAV operational parameters. In one or more implementations, the UAV operational parameters may comprise one or more of trajectory parameters such as minimum/maximum range from SOI (e.g., 315, 316 in
At operation 1464 an SOI may be indicated. In some implementations, the SOI indication may comprise a selection of a subject in a video stream provided by the UAV to the wearable device (e.g., a user may touch a portion of the apparatus 440 screen of
At operation 1466 SOI video quality may be confirmed. In some implementations, the SOI quality confirmation may be effectuated based on a user command (touch, audio), and/or absence of user action within a given period (e.g., unless a button is pressed within 30 seconds, the SOI quality is considered satisfactory).
At operation 1466 video produced during trajectory navigation by the UAV may be observed. In some implementations, the video produced during the trajectory navigation by the UAV may be streamed to the wearable device (e.g., 440, 460 in
At operation 1470 an “awesome” indication may be provided. In some implementations, the user may utilize the wearable smart device (e.g., 460 in
Methodology described herein may advantageously allow for real-time control of the robot's attention by an external smart agent. The external agent may be better equipped for disregarding distractors, as well as rapidly changing strategies when the circumstances of the environment demand a new cost function (e.g., a switch in the task at hand). The system may provide means to train the robot's attention system; in other words, the robot learns that what it should (automatically) attend to in a particular context is what the external operator has guided it to in the past.
Exemplary implementations may be useful with a variety of devices including without limitation autonomous and robotic apparatus, and other electromechanical devices requiring attention guidance functionality. Examples of such robotic devices may include one or more of manufacturing robots (e.g., automotive), military, medical (e.g., processing of microscopy, x-ray, ultrasonography, tomography), and/or other robots. Examples of autonomous vehicles may include one or more of rovers, unmanned air vehicles, underwater vehicles, smart appliances (e.g., ROOMBA®), inspection and/or surveillance robots, and/or other vehicles.
Implementations of the principles of the disclosure may be used for entertainment, such as one or more of multi-player games, racing, tag, fetch, personal sports coaching, chasing off crop scavengers, cleaning, dusting, inspection of vehicles and goods, cooking, object retrieval, tidying domestic clutter, removal of defective parts, replacement of worn parts, construction, roof repair, street repair, automotive inspection, automotive maintenance, mechanical debauchery, garden maintenance, fertilizer distribution, weeding, painting, litter removal, food delivery, drink delivery, table wiping, party tricks, and/or other applications.
Implementations of the principles of the disclosure may be applicable to training coordinated operations of automated devices. For example, in applications such as unexploded ordinance/improvised explosive device location and removal, a coordinated search pattern between multiple autonomous learning devices may lead to more efficient area coverage. Learning devices may offer the flexibility to handle a wider (and dynamic) variety of explosive device encounters. Such learning devices may be trained to identify targets (e.g., enemy vehicles) and deliver similar explosives.
It will be recognized that while certain aspects of the technology are described in terms of a specific sequence of steps of a method, these descriptions are only illustrative of the broader methods of the technology, and may be modified as required by the particular application. Certain steps may be rendered unnecessary or optional under certain circumstances. Additionally, certain steps or functionality may be added to the disclosed implementations, or the order of performance of two or more steps permuted. All such variations are considered to be encompassed within the technology disclosed and claimed herein.
While the above detailed description has shown, described, and pointed out novel features of the technology as applied to various implementations, it will be understood that various omissions, substitutions, and changes in the form and details of the device or process illustrated may be made by those skilled in the art without departing from the disclosure. The foregoing description is of the best mode presently contemplated of carrying out the technology. This description is in no way meant to be limiting, but rather should be taken as illustrative of the general principles of the technology. The scope of the technology should be determined with reference to the claims.
This application claims the priority benefit of U.S. Provisional Patent Application Ser. No. 62/007,311 filed on Jun. 3, 2014 and entitled “APPARATUS AND METHODS FOR TRACKING USING AERIAL VIDEO”; and is related to co-owned and co-pending U.S. patent application Ser. No. XXX, Client Reference BC201413A, Attorney Docket No. 021672-0432604 filed on Jul. 15, 2014 herewith, and entitled “APPARATUS AND METHODS FOR TRACKING USING AERIAL VIDEO”, U.S. patent application Ser. No. XXX, Client Reference BC201415A, Attorney Docket No. 021672-0433333 filed on Jul. 15, 2014 herewith, and entitled “APPARATUS AND METHODS FOR AERIAL VIDEO ACQUISITION”, U.S. patent application Ser. No. 13/601,721 filed on Aug. 31, 2012 and entitled “APPARATUS AND METHODS FOR CONTROLLING ATTENTION OF A ROBOT” and U.S. patent application Ser. No. 13/601,827 filed Aug. 31, 2012 and entitled “APPARATUS AND METHODS FOR ROBOTIC LEARNING”, each of the foregoing being incorporated herein by reference in its entirety.