Method for object detection using shallow neural networks

Description

BACKGROUND

Object detection is required in various systems and applications.

There is a growing need to provide a method and a system that may be able to provide highly accurate object detection at a low cost.

SUMMARY

There may be provided a method for object detection, the method may include receiving an input image by an input of an object detector; wherein the object detector may include multiple branches; generating at least one downscaled version of the input image; feeding the input image to a first branch of the multiple branches; feeding each one of the at least one downscaled version of the input image to a unique branch of the multiple branches, one downscaled version of the input image per branch; calculating, by the multiple branches, candidate bounding boxes that may be indicative of candidate objects that appear in the input image and in each one of the at least one downscaled version of the input image; selecting bounding boxes out of the candidate bounding boxes, by a selection unit that follows the multiple branches; wherein the multiple branches may include multiple shallow neural networks that may be followed by multiple region units; wherein each branch may include a shallow neural network and a region unit; wherein the multiple shallow neural networks may be multiple instances of a single trained shallow neural network; and wherein the single trained shallow neural network may be trained to detect objects having a size that may be within a predefined size range and to ignore objects having a size that may be outside the predefined size range.

The method may include generating the multiple downscaled versions of the input image by applying a same downscaling ratio (a) between the input image and a first downscaled version of the input image and (b) between the first downscaled version of the input image and a second downscaled version of the input image.

There may be provided a non-transitory computer readable medium for detecting an object by an object detector, wherein the non-transitory computer readable medium may store instructions for: receiving an input image by an input of the object detector; wherein the object detector may include multiple branches; generating at least one downscaled version of the input image; feeding the input image to a first branch of the multiple branches; feeding each one of the at least one downscale version of the input image to a unique branch of the multiple branches, one downscale version of the image per branch; calculating, by the multiple branches, candidate bounding boxes that may be indicative of candidate objects that appear in the input image and each one of the at least one downscaled version of the input image; selecting bounding boxes out of the candidate bounding boxes, by a selection unit that follows the multiple branches; wherein the multiple branches may include multiple shallow neural networks that may be followed by multiple region units; wherein each branch may include a shallow neural network and a region unit; wherein the multiple shallow neural networks may be multiple instances of a single trained shallow neural network; and wherein the single trained shallow neural network may be trained to detect objects having a size that may be within a predefined size range and to ignore objects having a size that may be outside the predefined size range.

The non-transitory computer readable medium may store instructions for generating the multiple downscaled versions of the input image by applying a same downscaling ratio (a) between the input image and a first downscaled version of the input image and (b) between the first downscaled version of the input image and a second downscaled version of the input image.

There may be provided an object detection system that may include an input, a downscaling unit, multiple branches, and a selection unit; wherein the input may be configured to receive an input image; wherein the downscaling unit may be configured to generate at least one downscaled version of the input image; wherein the multiple branches may be configured to receive the input image and the at least one downscaled version of the input image, one image per branch; wherein the multiple branches may be configured to calculate candidate bounding boxes that may be indicative of candidate objects that appear in the input image and each one of the at least one downscaled version of the input image; wherein the selection unit may be configured to select bounding boxes out of the candidate bounding boxes; wherein the multiple branches may include multiple shallow neural networks that may be followed by multiple region units; wherein each branch may include a shallow neural network and a region unit; wherein the multiple shallow neural networks may be multiple instances of a single trained shallow neural network; and wherein the single trained shallow neural network may be trained to detect objects having a size that may be within a predefined size range and to ignore objects having a size that may be outside the predefined size range.

The downscaling unit may be configured to generate the multiple downscaled versions of the input image by applying a same downscaling ratio (a) between the input image and a first downscaled version of the input image and (b) between the first downscaled version of the input image and a second downscaled version of the input image.

The predefined size range may range between (a) about ten by ten pixels, till (b) about one hundred by one hundred pixels.

The predefined size range may range between (a) about sixteen by sixteen pixels, till (b) about one hundred and twenty pixels by one hundred and twenty pixels.

The predefined size range may range between (a) about eighty by eighty pixels, till (b) about one hundred by one hundred pixels.

The multiple branches may be three branches and wherein there may be two downscaled versions of the input image.

The at least one downscaled version of the image may be multiple downscaled versions of the input image.

The first downscaled version of the input image may have a width that may be one half of a width of the input image and a length that may be one half of a length of the input image.

Each shallow neural network may have up to four layers.

Each shallow neural network may have up to five layers.

BRIEF DESCRIPTION OF THE DRAWINGS

The embodiments of the disclosure will be understood and appreciated more fully from the following detailed description, taken in conjunction with the drawings in which:

FIG. 1 illustrates an example of an object detection system;

FIG. 2 illustrates an example of an image, two objects, two bounding boxes and a bounding box output;

FIG. 3 illustrates an image and various objects;

FIG. 4 illustrates an example of a training process; and

FIG. 5 illustrates an example of a method for object detection.

DESCRIPTION OF EXAMPLE EMBODIMENTS

In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, and components have not been described in detail so as not to obscure the present invention.

The subject matter regarded as the invention is particularly pointed out and distinctly claimed in the concluding portion of the specification. The invention, however, both as to organization and method of operation, together with objects, features, and advantages thereof, may best be understood by reference to the following detailed description when read with the accompanying drawings.

It will be appreciated that for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements.

Because the illustrated embodiments of the present invention may for the most part, be implemented using electronic components and circuits known to those skilled in the art, details will not be explained in any greater extent than that considered necessary as illustrated above, for the understanding and appreciation of the underlying concepts of the present invention and in order not to obfuscate or distract from the teachings of the present invention.

Any reference in the specification to a method should be applied mutatis mutandis to a device or system capable of executing the method and/or to a non-transitory computer readable medium that stores instructions for executing the method.

Any reference in the specification to a system or device should be applied mutatis mutandis to a method that may be executed by the system, and/or may be applied mutatis mutandis to non-transitory computer readable medium that stores instructions executable by the system.

Any reference in the specification to a non-transitory computer readable medium should be applied mutatis mutandis to a device or system capable of executing instructions stored in the non-transitory computer readable medium and/or may be applied mutatis mutandis to a method for executing the instructions.

Any combination of any module or unit listed in any of the figures, any part of the specification and/or any claims may be provided.

There may be provided a low power object detection system (detector), a non-transitory computer readable medium and a method. The object detection system, non-transitory computer readable medium and method also provide high-level semantic multi-scale feature maps without impairing the speed of the detector.

Each additional convolution layer increases the physical receptive field of the detector. Therefore, enlarging the maximum object size that is managed by the detector results in an increase in the required number of convolution layers.
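The growth of the receptive field with the number of convolution layers may be illustrated with the commonly used receptive field recurrence. The following is a minimal sketch only; the function name receptive_field and the kernel sizes and strides used with it are hypothetical and are not taken from the embodiments.

```python
def receptive_field(layers):
    """Return the receptive field, in input pixels, of a stack of conv layers.

    `layers` is a list of (kernel_size, stride) pairs, first layer first.
    The values passed in are illustrative assumptions.
    """
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance, in input pixels, between adjacent outputs
    for kernel_size, stride in layers:
        # Each layer widens the field by (kernel_size - 1) steps of the
        # current jump, then the stride enlarges the jump.
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks of 3x3 convolutions with stride 2.
four_layers = receptive_field([(3, 2)] * 4)
five_layers = receptive_field([(3, 2)] * 5)
```

Under these assumed layer parameters, adding a fifth layer roughly doubles the receptive field, which is the trade-off between maximum object size and network depth noted above.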

Since each layer of the convolutional network has a fixed receptive field, it is not optimal to detect objects of different scales utilizing only features generated by the last convolutional layer.

Shallow feature maps have small receptive fields that are used to detect small objects, and deep feature maps have large receptive fields that are used to detect large objects.

Nevertheless, shallow features might have less semantic information, which may impair the detection of small objects.

The above reasoning was very popular in the first object detectors, which were released until 2016. In contrast, the last few years have seen a new trend of very deep networks integrated into state of the art object detectors. Hence, state of the art object detectors detect small objects using feature maps extracted from enormous receptive fields.

That implementation forces ineffective forward propagation of small object features from earlier stages of the network to deeper stages of the network.

Thus, while managing larger objects requires a deeper network, the ineffective detection of small objects increases the number of channels along the network or complicates the memory data transfer between layers.

An interesting explanation of the motivation for using feature maps that have large receptive fields for small objects suggests that, in order to detect a small object, we take advantage of the context information surrounding it. For example, we can easily distinguish between a small car driving on a roadway and a boat sailing on the sea by employing the surrounding background information, which differs notably more than the internal context information of those two small objects.

However, real-time automotive applications cannot take advantage of deeper, wider or more complex networks, because those networks are not applicable due to power consumption limitations.

FIG. 1 illustrates an object detection system 9000 that includes an input 9010 (illustrated as receiving input image 9001), a downscaling unit 9011, multiple branches (such as three branches 9013(1), 9013(2) and 9013(3)), and a selection unit 9016 such as a non-maximal suppression unit.

Input 9010 may be configured to receive an input image.

Downscaling unit 9011 may be configured to generate at least one downscaled version of the input image.

The multiple branches 9013(1), 9013(2) and 9013(3) may be configured to receive the input image and the at least one downscaled version of the input image, one image per branch.

Input image 9001 is fed to first branch 9013(1) that is configured to calculate first candidate bounding boxes that may be indicative of candidate objects that appear in the input image.

First downscaled version of the input image (DVII) 9002 is fed to second branch 9013(2) that is configured to calculate second candidate bounding boxes that may be indicative of candidate objects that appear in first DVII 9002.

Second DVII 9003 is fed to third branch 9013(3) that is configured to calculate third candidate bounding boxes that may be indicative of candidate objects that appear in second DVII 9003.
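The generation of the first and second DVII may be sketched as follows. This is a minimal sketch under stated assumptions: the function names downscale_by_two and build_pyramid are hypothetical, the image is represented as a list of rows of numeric pixel values, and averaging each 2×2 block is only one possible downscaling filter, which the description does not fix.

```python
def downscale_by_two(image):
    """Halve width and height by averaging each 2x2 block of pixels.

    `image` is a list of rows; each row is a list of numeric pixel values.
    Averaging is an assumption; other downscaling filters may be used.
    """
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


def build_pyramid(image, num_downscaled=2):
    """Return [input image, first DVII, second DVII, ...]."""
    pyramid = [image]
    for _ in range(num_downscaled):
        # The same ratio is applied between consecutive versions.
        pyramid.append(downscale_by_two(pyramid[-1]))
    return pyramid


# A tiny 4x4 single-channel image used only for illustration.
example = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
pyramid = build_pyramid(example)
```

Each element of the returned list would be fed to its own branch, one image per branch.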

The multiple branches may include multiple shallow neural networks that may be followed by multiple region units.

In first branch 9013(1), a first shallow neural network 9012(1) is followed by first region unit 9014(1).

The first shallow neural network 9012(1) outputs a first shallow neural network output (SNNO-1) 9003(1) that may be a tensor with multiple features per segment of the input image. The first region unit 9014(1) is configured to receive SNNO-1 9003(1) and calculate and output first candidate bounding boxes 9005(1).

The second shallow neural network 9012(2) outputs a second SNNO (SNNO-2) 9003(2) that may be a tensor with multiple features per segment of the first DVII 9002. The second region unit 9014(2) is configured to receive SNNO-2 9003(2) and calculate and output second candidate bounding boxes 9005(2).

The third shallow neural network 9012(3) outputs a third SNNO (SNNO-3) 9003(3) that may be a tensor with multiple features per segment of the second DVII 9003. The third region unit 9014(3) is configured to receive SNNO-3 9003(3) and calculate and output third candidate bounding boxes 9005(3).

The multiple shallow neural networks 9012(1), 9012(2) and 9012(3) may be multiple instances of a single trained shallow neural network.

The single trained shallow neural network may be trained to detect objects having a size that may be within a predefined size range and to ignore objects having a size that may be outside the predefined size range.

The selection unit 9016 may be configured to select bounding boxes (denoted BB output 9007) out of the first, second and third candidate bounding boxes.
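Where the selection unit is a non-maximal suppression unit, the selection may be sketched as a greedy suppression over the merged candidates of all branches. This is a minimal sketch only; the function names, the corner representation (x1, y1, x2, y2), the use of a single score per candidate and the 0.5 overlap threshold are assumptions and are not specified by the description.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_maximal_suppression(candidates, iou_threshold=0.5):
    """Greedy selection over (box, score) pairs.

    Candidates are assumed to be expressed in input image coordinates.
    A candidate is kept only if it does not overlap an already selected,
    higher scoring box by more than `iou_threshold` (an assumed value).
    """
    ordered = sorted(candidates, key=lambda c: c[1], reverse=True)
    selected = []
    for box, score in ordered:
        if all(iou(box, kept) <= iou_threshold for kept, _ in selected):
            selected.append((box, score))
    return selected


# Hypothetical candidates merged from the three branches.
merged = [((1, 1, 11, 11), 0.8), ((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.7)]
bb_output = non_maximal_suppression(merged)
```

In this illustration the two heavily overlapping candidates collapse to the higher scoring one, while the distant candidate is retained.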

The selected bounding boxes may be further processed to detect the objects. Additionally or alternatively, the bounding boxes may provide the output of the object detection system.

The branch that receives the input image is configured to detect objects that have a size that is within the predefined size range.

The predefined size range may span certain fractions of the input image (for example, between less than one percent and less than ten percent of the input image, although other fractions may be selected).

The predefined size range may be tailored to the expected size of objects within a certain distance range from the sensor.

The predefined size range may span certain numbers of pixels, for example between (a) a lower bound of about 10, 20, 30, 40, 50, 60, 70, 80 or 90 pixels by about 10, 20, 30, 40, 50, 60, 70, 80 or 90 pixels, and (b) an upper bound of about 100, 110, 120, 130, 140, 150 or 160 pixels by about 100, 110, 120, 130, 140, 150 or 160 pixels.

Each branch that receives a downscaled version of the input image (assuming a certain downscaling factor) may detect objects having a size (within the downscaled version of the input image) that is within the predefined size range, and thus may detect objects that appear in the input image having a size that is within a size range that equals the predefined size range multiplied by the downscaling factor.
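The relation between the predefined size range and the size range covered in the input image may be sketched as a simple multiplication. The function name effective_size_range is hypothetical, and the 20 to 100 pixel range and the factors 1, 2 and 4 are the non-limiting example values used elsewhere in this description.

```python
def effective_size_range(predefined_range, downscaling_factor):
    """Size range, in input image pixels, covered by one branch.

    `predefined_range` is (smallest side, largest side) in pixels of the
    image fed to the branch; `downscaling_factor` is the per-axis factor
    between the input image and that image (1 for the input image itself).
    """
    low, high = predefined_range
    return (low * downscaling_factor, high * downscaling_factor)


# Example values: predefined range of 20 to 100 pixels, three branches.
ranges = [effective_size_range((20, 100), f) for f in (1, 2, 4)]
# ranges == [(20, 100), (40, 200), (80, 400)]
```

With these example values the ranges of adjacent branches overlap, so the same object may yield candidates in more than one branch; a selection unit may resolve such duplicates.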

Assume, for example, that the input image is of 576×768 pixels (each pixel is represented by three colors), that the first DVII is 288×384 pixels (each pixel is represented by three colors), that the second DVII is 144×192 pixels (each pixel is represented by three colors), that SNNO-1 has 85 features per segment out of 36×48 segments, that SNNO-2 has 85 features per segment out of 18×24 segments, and that SNNO-3 has 85 features per segment out of 9×12 segments.
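The example dimensions are mutually consistent if each segment covers 16×16 pixels of the image fed to the branch (576/36 = 768/48 = 16). The following minimal sketch checks this arithmetic; the function name snno_shape is hypothetical, and the segment size of 16 is inferred from the example values rather than stated in the description.

```python
def snno_shape(image_height, image_width, segment_size=16, features=85):
    """Shape (rows of segments, columns of segments, features) of an SNNO.

    `segment_size` of 16 pixels is inferred from the example values
    (576 / 36 == 16); it is an assumption, not a stated parameter.
    """
    return (image_height // segment_size,
            image_width // segment_size,
            features)


shapes = [snno_shape(h, w) for h, w in ((576, 768), (288, 384), (144, 192))]
# shapes == [(36, 48, 85), (18, 24, 85), (9, 12, 85)]
```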

The assumption above as well as the example below are merely non-limiting examples of various values. Other values may be provided.

Under these assumptions, each shallow neural network may detect an object having a size between 20×20 and 100×100 pixels and may have a physical receptive field of around 200×200 pixels. This assumes that automotive objects can be effectively represented using bounding box dimensions below 100×100 pixels.

In contrast to a single model trained end to end, the following architecture contains several identical shallow neural networks.

The first branch detects small objects (as appearing in the input image), the second branch detects medium objects (as appearing in the input image), and the third branch detects large objects (as appearing in the input image); all may be within a limited predefined size range.

The number of branches, scales, and the downscale factor may differ from those illustrated in FIG. 1. For example, there may be two branches or more than three branches, the downscaling factor may differ from 2×2, downscaling factors between different images may differ from each other, and the like.

FIG. 2 illustrates an example of an image 9020, two objects (pedestrian 9021 and car 9022), two bounding boxes 9023 (bounding pedestrian 9021) and 9024 (bounding car 9022) and a bounding box output 9025.

The bounding box output 9025 may include coordinates (x,y,h,w) of the bounding boxes, objectiveness and class. The coordinates indicate the location (x,y) as well as the height and width of the bounding boxes. The (x,y) coordinates may represent the center of the bounding box. Objectiveness provides a confidence level that an object exists. Class indicates the class of the object (for example cat, dog, vehicle, person, and the like).

The object detection may be compliant with any flavor of YOLO, but other object detection schemes may be applied.
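One entry of such a bounding box output may be sketched as a small record, together with a conversion from the center based (x,y,h,w) form to corner coordinates. The record name BoundingBox, its field names and the helper to_corners are hypothetical and serve only as an illustration.

```python
from collections import namedtuple

# Hypothetical record for one entry of the bounding box output.
BoundingBox = namedtuple(
    "BoundingBox", ["x", "y", "h", "w", "objectiveness", "class_name"]
)


def to_corners(box):
    """Convert a center based box to (x1, y1, x2, y2) corner coordinates.

    Assumes (x, y) is the center of the box, as the text allows.
    """
    half_w, half_h = box.w / 2.0, box.h / 2.0
    return (box.x - half_w, box.y - half_h, box.x + half_w, box.y + half_h)


# Illustrative values only.
pedestrian = BoundingBox(x=50, y=40, h=20, w=10,
                         objectiveness=0.9, class_name="person")
corners = to_corners(pedestrian)  # (45.0, 30.0, 55.0, 50.0)
```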

FIG. 3 illustrates an image 9030 and various objects 9031, 9032, 9033 and 9034.

Objects 9033 and 9034 are outside the predefined size range and should be ignored. The single trained neural network is trained to detect objects 9031 and 9032 (within the predefined size range) and to ignore objects 9033 and 9034.
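The separation between objects to be detected and objects to be ignored may be sketched as a size filter over annotated objects. This is a minimal sketch under stated assumptions: the function name split_by_size, the dictionary layout, the object sizes, the 20 to 100 pixel bounds and the rule that both width and height must fall within the range are all hypothetical.

```python
def split_by_size(objects, min_size=20, max_size=100):
    """Split annotated objects into those to detect and those to ignore.

    Each object is a dict with "w" and "h" in pixels. Requiring both
    dimensions to be within the range is an assumed criterion.
    """
    to_detect, to_ignore = [], []
    for obj in objects:
        in_range = (min_size <= obj["w"] <= max_size
                    and min_size <= obj["h"] <= max_size)
        (to_detect if in_range else to_ignore).append(obj)
    return to_detect, to_ignore


# Hypothetical sizes for the objects of FIG. 3.
annotated = [
    {"name": "9031", "w": 40, "h": 60},
    {"name": "9032", "w": 90, "h": 80},
    {"name": "9033", "w": 8, "h": 12},
    {"name": "9034", "w": 300, "h": 250},
]
to_detect, to_ignore = split_by_size(annotated)
```

Under these assumed sizes, only the first two objects would serve as desired results during training.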

FIG. 4 illustrates an example of a training process.

Test images 9040 are fed to single shallow neural network 9017 that outputs, for each test image, a single shallow neural network output that may be a tensor with multiple features per segment of the test image. The region unit 9018 is configured to receive the output from single shallow neural network 9017 and calculate and output candidate bounding boxes per test image. Actual results such as the output candidate bounding boxes per test image or an output of a selecting unit 9019 (that follows region unit 9018) may be fed to error calculation unit 9050.

Error calculation unit 9050 also receives desired results 9045, namely objects of a size within the predefined size range that should be detected by the single shallow neural network 9017.

Error calculation unit 9050 calculates an error 9055 between the actual results and the desired results, and the error is fed to the single shallow neural network 9017 during the training process.

FIG. 5 illustrates an example of a method 9100 for object detection.

Method 9100 may include the following steps:

- Step 9101 of receiving an input image by an input of an object detector. The object detector may include multiple branches. The multiple branches may include multiple shallow neural networks that may be followed by multiple region units. Each branch may include a shallow neural network and a region unit. The multiple shallow neural networks may be multiple instances of a single trained shallow neural network. The single trained shallow neural network may be trained to detect objects having a size that may be within a predefined size range and to ignore objects having a size that may be outside the predefined size range.
- Step 9102 of generating at least one downscaled version of the input image.
- Step 9103 of feeding the input image to a first branch of the multiple branches.
- Step 9104 of feeding each one of the at least one downscale version of the input image to a unique branch of the multiple branches, one downscale version of the image per branch.
- Step 9105 of calculating, by the multiple branches, candidate bounding boxes that may be indicative of candidate objects that appear in the input image and each one of the at least one downscaled version of the input image.
- Step 9106 of selecting bounding boxes out of the candidate bounding boxes, by a selection unit that follows the multiple branches.
- Step 9107 of outputting the bounding boxes and/or further processing the bounding boxes.
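The steps of method 9100 may be sketched end to end as follows. This is a minimal sketch, not a definitive implementation: the function detect and its parameters branch, downscale and select are hypothetical placeholders for the shallow neural network with its region unit, the downscaling unit and the selection unit, and the mapping of boxes back to input image coordinates before selection is an assumption.

```python
def detect(image, branch, downscale, select, num_downscaled=2, factor=2):
    """Run steps 9102 to 9106 on one received input image.

    branch(image)    -> list of (x1, y1, x2, y2, score) in that image's pixels
    downscale(image) -> image reduced by `factor` along each axis
    select(boxes)    -> subset of the candidate boxes
    All three callables are hypothetical stand-ins.
    """
    # Step 9102: generate the downscaled versions of the input image.
    images = [image]
    for _ in range(num_downscaled):
        images.append(downscale(images[-1]))

    # Steps 9103 to 9105: one image per branch; every branch uses the
    # same trained network, represented here by the same callable.
    candidates = []
    for level, scaled in enumerate(images):
        scale = factor ** level
        for x1, y1, x2, y2, score in branch(scaled):
            # Assumption: express every box in input image coordinates.
            candidates.append(
                (x1 * scale, y1 * scale, x2 * scale, y2 * scale, score)
            )

    # Step 9106: select bounding boxes out of the candidates.
    return select(candidates)
```

The returned boxes correspond to step 9107, in which they may be output or further processed.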

Method 9100 may include training the single trained shallow neural network.

While the foregoing written description of the invention enables one of ordinary skill to make and use what is considered presently to be the best mode thereof, those of ordinary skill will understand and appreciate the existence of variations, combinations, and equivalents of the specific embodiment, method, and examples herein. The invention should therefore not be limited by the above described embodiment, method, and examples, but by all embodiments and methods within the scope and spirit of the invention as claimed.

In the foregoing specification, the invention has been described with reference to specific examples of embodiments of the invention. It will, however, be evident that various modifications and changes may be made therein without departing from the broader spirit and scope of the invention as set forth in the appended claims.

Moreover, the terms “front,” “back,” “top,” “bottom,” “over,” “under” and the like in the description and in the claims, if any, are used for descriptive purposes and not necessarily for describing permanent relative positions. It is understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in other orientations than those illustrated or otherwise described herein.

Furthermore, the terms “assert” or “set” and “negate” (or “deassert” or “clear”) are used herein when referring to the rendering of a signal, status bit, or similar apparatus into its logically true or logically false state, respectively. If the logically true state is a logic level one, the logically false state is a logic level zero. And if the logically true state is a logic level zero, the logically false state is a logic level one.

Those skilled in the art will recognize that the boundaries between logic blocks are merely illustrative and that alternative embodiments may merge logic blocks or circuit elements or impose an alternate decomposition of functionality upon various logic blocks or circuit elements. Thus, it is to be understood that the architectures depicted herein are merely exemplary, and that in fact many other architectures may be implemented which achieve the same functionality.

Any arrangement of components to achieve the same functionality is effectively “associated” such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality may be seen as “associated with” each other such that the desired functionality is achieved, irrespective of architectures or intermedial components. Likewise, any two components so associated can also be viewed as being “operably connected,” or “operably coupled,” to each other to achieve the desired functionality.

Furthermore, those skilled in the art will recognize that boundaries between the above described operations are merely illustrative. The multiple operations may be combined into a single operation, a single operation may be distributed in additional operations and operations may be executed at least partially overlapping in time. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments.

Also for example, in one embodiment, the illustrated examples may be implemented as circuitry located on a single integrated circuit or within a same device. Alternatively, the examples may be implemented as any number of separate integrated circuits or separate devices interconnected with each other in a suitable manner.

However, other modifications, variations and alternatives are also possible. The specifications and drawings are, accordingly, to be regarded in an illustrative rather than in a restrictive sense.

In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of other elements or steps than those listed in a claim. Furthermore, the terms “a” or “an,” as used herein, are defined as one or more than one. Also, the use of introductory phrases such as “at least one” and “one or more” in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an.” The same holds true for the use of definite articles. Unless stated otherwise, terms such as “first” and “second” are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements. The mere fact that certain measures are recited in mutually different claims does not indicate that a combination of these measures cannot be used to advantage.

While certain features of the invention have been illustrated and described herein, many modifications, substitutions, changes, and equivalents will now occur to those of ordinary skill in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention.

It is appreciated that various features of the embodiments of the disclosure which are, for clarity, described in the contexts of separate embodiments may also be provided in combination in a single embodiment. Conversely, various features of the embodiments of the disclosure which are, for brevity, described in the context of a single embodiment may also be provided separately or in any suitable sub-combination.

It will be appreciated by persons skilled in the art that the embodiments of the disclosure are not limited by what has been particularly shown and described hereinabove. Rather the scope of the embodiments of the disclosure is defined by the appended claims and equivalents thereof.

Claims

1. A method for object detection, the method comprises: receiving an input image by an input of an object detector; wherein the object detector comprises multiple branches; generating at least one downscaled version of the input image; feeding the input image to a first branch of the multiple branches; feeding each one of the at least one downscaled version of the input image to a unique branch of the multiple branches, one downscaled version of the input image per branch; calculating, by the multiple branches, candidate bounding boxes that are indicative of candidate objects that appear in the input image and in each one of the at least one downscaled version of the input image; selecting bounding boxes out of the candidate bounding boxes, by a selection unit that follows the multiple branches; wherein the multiple branches comprise multiple shallow neural networks that are followed by multiple region units; wherein each branch comprises a shallow neural network and a region unit; wherein the multiple shallow neural networks are multiple instances of a single trained shallow neural network; and wherein the single trained shallow neural network is trained to detect objects having a size that is within a predefined size range and to ignore objects having a size that is outside the predefined size range.
2. The method according to claim 1 wherein the predefined size range ranges between (a) ten by ten pixels, till (b) one hundred by one hundred pixels.
3. The method according to claim 1 wherein the predefined size range ranges between (a) sixteen by sixteen pixels, till (b) one hundred and twenty pixels by one hundred and twenty pixels.
4. The method according to claim 1 wherein the predefined size range ranges between (a) eighty by eighty pixels, till (b) one hundred by one hundred pixels.
5. The method according to claim 1 wherein the multiple branches are three branches and wherein there are two downscaled versions of the input image.
6. The method according to claim 1 wherein the generating of the at least one downscaled version of the input image comprises generating multiple downscaled versions of the input image.
7. The method according to claim 6 comprising generating the multiple downscaled versions of the input image by applying a same downscaling ratio (a) between the input image and a first downscaled version of the input image and (b) between the first downscaled version of the input image and a second downscaled version of the input image.
8. The method according to claim 6 wherein a first downscaled version of the input image has a width that is one half of a width of the input image and a length that is one half of a length of the input image.
9. The method according to claim 1 wherein each shallow neural network has up to four layers.
10. The method according to claim 1 wherein each shallow neural network has up to five layers.
11. A non-transitory computer readable medium for detecting an object by an object detector, wherein the non-transitory computer readable medium stores instructions for: receiving an input image by an input of the object detector; wherein the object detector comprises multiple branches; generating at least one downscaled version of the input image; feeding the input image to a first branch of the multiple branches; feeding each one of the at least one downscale version of the input image to a unique branch of the multiple branches, one downscale version of the image per branch; calculating, by the multiple branches, candidate bounding boxes that are indicative of candidate objects that appear in the input image and each one of the at least one downscaled version of the input image; selecting bounding boxes out of the candidate bounding boxes, by a selection unit that follows the multiple branches; wherein the multiple branches comprise multiple shallow neural networks that are followed by multiple region units; wherein each branch comprises a shallow neural network and a region unit; wherein the multiple shallow neural networks are multiple instances of a single trained shallow neural network; and wherein the single trained shallow neural network is trained to detect objects having a size that is within a predefined size range and to ignore objects having a size that is outside the predefined size range.
12. The non-transitory computer readable medium according to claim 11 wherein the predefined size range ranges between (a) ten by ten pixels, till (b) one hundred by one hundred pixels.
13. The non-transitory computer readable medium according to claim 11 wherein the predefined size range ranges between (a) sixteen by sixteen pixels, till (b) one hundred and twenty pixels by one hundred and twenty pixels.
14. The non-transitory computer readable medium according to claim 11 wherein the predefined size range ranges between (a) eighty by eighty pixels, till (b) one hundred by one hundred pixels.
15. The non-transitory computer readable medium according to claim 11 wherein the multiple branches are three branches and wherein there are two downscaled versions of the input image.
16. The non-transitory computer readable medium according to claim 11 wherein the generating of the at least one downscaled version of the input image comprises generating multiple downscaled versions of the input image.
17. The non-transitory computer readable medium according to claim 16 that stores instructions for generating the multiple downscaled versions of the input image by applying a same downscaling ratio between (a) the input image and a first downscaled version of the input image and between (b) the first downscaled version of the input image and a second downscaled version of the input image.
18. The non-transitory computer readable medium according to claim 16 wherein a first downscale version of the input image has a width that is one half of a width of the input image and a length that is one half of a length of the input image.
19. The non-transitory computer readable medium according to claim 11 wherein each shallow neural network has up to four layers.
20. The non-transitory computer readable medium according to claim 11 wherein each shallow neural network has up to five layers.
21. An object detection system that comprises an input, a downscaling unit, multiple branches, and a selection unit; wherein the input is configured to receive an input image; wherein the downscaling unit is configured to generate at least one downscaled version of the input image; wherein the multiple branches are configured to receive the input image and the at least one downscaled version of the input image, one image per branch; wherein the multiple branches are configured to calculate candidate bounding boxes that are indicative of candidate objects that appear in the input image and each one of the at least one downscaled version of the input image; wherein the selection unit is configured to select bounding boxes out of the candidate bounding boxes; wherein the multiple branches comprise multiple shallow neural networks that are followed by multiple region units; wherein each branch comprises a shallow neural network and a region unit; wherein the multiple shallow neural networks are multiple instances of a single trained shallow neural network; and wherein the single trained shallow neural network is trained to detect objects having a size that is within a predefined size range and to ignore objects having a size that is outside the predefined size range.
22. The object detection system according to claim 21 wherein the predefined size range ranges between (a) ten by ten pixels, till (b) one hundred by one hundred pixels.
23. The object detection system according to claim 21 wherein the predefined size range ranges between (a) sixteen by sixteen pixels, till (b) one hundred and twenty pixels by one hundred and twenty pixels.
24. The object detection system according to claim 21 wherein the predefined size range ranges between (a) eighty by eighty pixels, till (b) one hundred by one hundred pixels.
25. The object detection system according to claim 21 wherein the multiple branches are three branches and wherein there are two downscaled versions of the input image.
26. The object detection system according to claim 21 wherein the generating of the at least one downscaled version of the input image comprises generating multiple downscaled versions of the input image.
27. The object detection system according to claim 26 wherein the downscaling unit is configured to generate the multiple downscaled versions of the input image by applying a same downscaling ratio between (a) the input image and a first downscaled version of the input image and between (b) the first downscaled version of the input image and a second downscaled version of the input image.
28. The object detection system according to claim 26 wherein a first downscale version of the input image has a width that is one half of a width of the input image and a length that is one half of a length of the input image.
29. The object detection system according to claim 21 wherein each shallow neural network has up to four layers.
30. The object detection system according to claim 21 wherein each shallow neural network has up to five layers.
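
The flow recited in the independent claims may be illustrated by the following non-limiting Python sketch, which assumes three branches and a downscaling ratio of one half between consecutive versions of the input image. All names (`downscale_by_half`, `build_pyramid`, `select_boxes`, `detect`, `shared_detector`) are hypothetical. The 2x2 averaging used for downscaling and the greedy overlap-based suppression used by the selection unit are assumptions made for illustration only; the claims do not specify either. The shallow neural network and the region unit of each branch are represented together by a single callable that is supplied by the caller.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]                        # grayscale image as rows of pixels
Box = Tuple[float, float, float, float, float]   # (x, y, width, height, score)


def downscale_by_half(image: Image) -> Image:
    """Halve width and length by averaging 2x2 pixel blocks (assumed method)."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image: Image, num_branches: int = 3) -> List[Image]:
    """Return the input image followed by num_branches - 1 downscaled versions."""
    pyramid = [image]
    for _ in range(num_branches - 1):
        # Same ratio between every pair of consecutive versions.
        pyramid.append(downscale_by_half(pyramid[-1]))
    return pyramid


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def select_boxes(candidates: List[Box], iou_threshold: float = 0.5) -> List[Box]:
    """Assumed selection unit: keep the best-scoring box among heavy overlaps."""
    kept: List[Box] = []
    for box in sorted(candidates, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_threshold for k in kept):
            kept.append(box)
    return kept


def detect(image: Image,
           shared_detector: Callable[[Image], List[Box]],
           num_branches: int = 3) -> List[Box]:
    """Run one shared detector on every pyramid level, then select boxes."""
    candidates: List[Box] = []
    for level, scaled in enumerate(build_pyramid(image, num_branches)):
        factor = 2 ** level  # maps branch coordinates back to the input image
        for x, y, w, h, score in shared_detector(scaled):
            candidates.append((x * factor, y * factor, w * factor, h * factor, score))
    return select_boxes(candidates)
```

Under this reading, a single network trained for a predefined size range of, for example, sixteen by sixteen pixels to one hundred and twenty by one hundred and twenty pixels would, at the second downscaled version, respond to objects that span roughly sixty-four to four hundred and eighty pixels in the input image, because each branch multiplies the effective size range by the accumulated downscaling factor.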

CROSS REFERENCE

This application claims priority from U.S. provisional patent 62/827,121, filing date Mar. 31, 2019.

Sheng Hua et al., “Robust video signature based on ordinal measure”, ICIP '04. 2004 International Conference on Image Processing, Oct. 2004, pp. 685-688.
Stolberg et al., “HiBRID-SoC: A multi-core SoC architecture for multimedia signal processing. VLSI Signal Processing”, Journal of VLSI Signal Processing vol. 41(1), Aug. 2005, pp. 9-20.
Theodoropoulos et al., “Simulating asynchronous architectures on transputer networks”, 4th Euromicro Workshop on Parallel and Distributed Processing, Braga, Portugal, 1996, pp. 274-281.
Vailaya et al., “Content-Based Hierarchical Classification of Vacation Images”, International Conference on Multimedia Computing and Systems, vol. 1, DOI-10.1109/MMCS.1999.779255, Jul. 1999, pp. 518-523.
Verstraeten et al., “Isolated word recognition with the Liquid State Machine: A case study”, Information Processing Letters, vol. 95(6), Sep. 2005, pp. 521-528.
Vallet et al.,“Personalized Content Retrieval in Context Using Ontological Knowledge”, in IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, No. 3, Mar. 2007, pp. 336-346.
Wang et al., “Classifying objectionable websites based on image content” Interactive Distributed Multimedia Systems and Telecommunication Services, vol. 1483, 1998, pp. 113-124.
Wang et al., “A Signature for Content-Based Image Retrieval Using a Geometrical Transform”, 6th ACM International Conference on Multimedia, Multimedia 1998, pp. 229-234.
Ware et al., “Locating and identifying components in a robot's workspace using a hybrid computer architecture”, 10th International Symposium on Intelligent Control, 1995, pp. 139-144.
Li et al. “Exploring Visual and Motion Saliency for Automatic Video Object Extraction”, in IEEE Transactions on Image Processing, vol. 22, No. 7, Jul. 2013, pp. 2600-2610.
Colin Whitby-Strevens, “The transputer”, 12th annual international symposium on Computer architecture (ISCA), IEEE Computer Society Press, Jun. 1985, pp. 292-300.
Wilk et al., “The potential of social-aware multimedia prefetching on mobile devices”, International Conference and Workshops on Networked Systems (NetSys 2015) Mar. 2015, p. 1.
Andrew William Hogue, “Tree pattern inference and matching for wrapper induction on the World Wide Web”, May 13, 2014, pp. 106.
Liu et al. “Instant Mobile Video Search With Layered Audio-Video Indexing and Progressive Transmission”, IEEE Transactions on Multimedia 16(Dec. 8, 2014, pp. 2242-2255.
Raichelgauz et al., “Natural Signal Classification by Neural Cliques and Phase-Locked Attractors”, International Conference of the IEEE Engineering in Medicine and Biology Society, 2006, pp. 6693-6697.
Lin et al., “Robust digital signature for multimedia authentication”, IEEE Circuits and Systems Magazine, vol. 3, No. 4, 2003, pp. 23-26.
Zang et al., “A New Multimedia Message Customizing Framework for mobile Devices”, IEEE International Conference on Multimedia and Expo, 2007, pp. 1043-1046.
Zhou et al., “Ensembling neural networks: Many could be better than all”, Artificial Intelligence, vol. 137, 2002, pp. 239-263.
Zhou et al., “Medical diagnosis with C4.5 rule preceded by artificial neural network ensemble”, IEEE Transactions on Information Technology in Biomedicine, vol. 7, No. 1, Mar. 2003, pp. 37-42.
Zhu et al., “Technology-Assisted Dietary Assessment”, SPIE. 6814. 681411, 2008, p. 1.
Zou et al., “A content-based image authentication system with lossless data hiding”, International Conference on Multimedia and Expo. ICME, 2003, pp. II(213)-II(216).

Provisional Applications (2)

	Number	Date	Country
	62827112	Mar 2019	US
	62827121	Mar 2019	US
