Protocol for communications between platforms and image devices

Information

  • Patent Grant
  • 9479677
  • Patent Number
    9,479,677
  • Date Filed
    Wednesday, September 5, 2012
  • Date Issued
    Tuesday, October 25, 2016
Abstract
In accordance with some embodiments, a protocol permits communications between platforms and image devices. This allows, for example, the platform to specify particular types of information that the platform may want, the format of information the platform may prefer, and other information that may reduce the amount of processing in the platform. For example, conventionally, in gesture recognition software, the platform receives an ongoing stream of video to be parsed, searched and processed in order to identify gestures. This may consume communications bandwidth between platforms and imaging devices, particularly in cases where wireless communications or other bandwidth limited communications may be involved.
Description
BACKGROUND

This relates generally to computer controlled devices including imaging device peripherals such as printers, monitors or displays and cameras.


Conventional computer platforms such as laptop computers, desktop computers, and tablets, to mention some examples, may interface with and receive information from imaging devices. As used herein an “image device” is anything that can produce or display an image, including a monitor, a display, a camera, an image sensor, a printer, or fax machine.


Conventionally, the platform simply receives raw data from the imaging device and then performs the necessary processing of the raw data.





BRIEF DESCRIPTION OF THE DRAWINGS

Some embodiments are described with respect to the following figures:



FIG. 1 is a block diagram of one embodiment of the present invention;



FIG. 2 is a flow chart for one embodiment of the present invention; and



FIG. 3 is a flow chart for another embodiment of the present invention.





DETAILED DESCRIPTION

In accordance with some embodiments, a protocol permits communications between platforms and image devices. This allows, for example, the platform to specify particular types of information that the platform may want, the format of information the platform may prefer, and other information that may reduce the amount of processing in the platform. Thus, a protocol may provide standardized methods for control and status information to pass between devices, such as control messages to get device capabilities and change processing features and behavior, and status messages to indicate available device features and options. For example, conventionally, in gesture recognition software, the platform receives an ongoing stream of video to be parsed, searched and processed in order to identify gestures. This may consume communications bandwidth between platforms and image devices, particularly in cases where wireless communications or other bandwidth limited communications may be involved. Thus it is advantageous, in some embodiments, to enable communication between platform and imaging device to specify that information which is desired by the platform, for example reducing the need to transmit unnecessary data that will simply be discarded anyway.
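As a rough illustration of the bandwidth argument, the following Python sketch compares streaming every raw frame with reporting only interest point coordinates. The 640×480 frame size, 16 bits per pixel, the 1000 point maximum, and the every-10th-frame reporting interval follow the sample table later in this document; the frame rate and the bytes per point are assumptions made only for this example.

```python
# Frame size (640x480) and 16 bits per pixel follow the sample table in this
# document; the frame rate and the bytes per interest point are assumptions.
WIDTH, HEIGHT = 640, 480
BYTES_PER_PIXEL = 2        # 16 bits per pixel
FRAMES_PER_SECOND = 30     # assumed frame rate


def raw_video_bytes_per_second():
    """Bytes per second needed to stream every raw frame to the platform."""
    return WIDTH * HEIGHT * BYTES_PER_PIXEL * FRAMES_PER_SECOND


def metadata_bytes_per_second(max_points=1000, bytes_per_point=8, every_nth_frame=10):
    """Bytes per second if only interest point coordinates are reported.

    bytes_per_point=8 assumes two 32-bit coordinates per point (hypothetical).
    """
    reports_per_second = FRAMES_PER_SECOND // every_nth_frame
    return max_points * bytes_per_point * reports_per_second


raw = raw_video_bytes_per_second()
meta = metadata_bytes_per_second()
print(raw, meta, raw // meta)
```

Under these assumed figures the metadata-only stream is a few hundred times smaller than the raw stream; real savings depend on the sensor, the codec, and the metadata actually requested.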


Similarly, any image sink, such as a display, a monitor, a printer, or fax machine, may specify to the image source, such as a computer system or platform, the format in which it wants to receive the data. For example, a display or printer that needs particular data types, particular data densities, or particular protocols using particular margins can specify this information to the source. Then the source can do the processing to supply the information to the image sink. Also, a protocol embodiment of this invention may include actual cameras, printers and displays as well as processing units to act on behalf of the cameras and displays, as well as virtual cameras, printers and displays that operate within another device. For example, a laptop computer device may run a virtual camera which can communicate using this protocol to a real printer, where the virtual camera acts on behalf of the dumb laptop camera and provides a smart wrapper to implement this protocol.


Likewise image sources can specify the characteristics of image data that they can provide, offering alternatives for the selection of other formats and receiving feedback from the sink device about the format that is preferred, in some embodiments.


Image devices may include displays, printers, image processors, image sensors, and fax machines, to mention some examples. In some embodiments, these peripheral devices have sufficient intelligence to perform data analysis and manipulation and to receive commands and to communicate a response to those commands. Thus, generally these peripherals will be processor-based systems that also include a memory or a storage.


Potential applications include facial recognition, object recognition, scene recognition, perceptually oriented information about light sources and direction vectors in a scene, and colorimetric properties of the source and destination devices.


As an initial example, an image sensor in a camera for example may contain intelligence to alter the type of processing performed or to alter the corresponding metadata produced to meet the needs of the consuming endpoint platform. For example, some platforms may want processed image metadata in the form of metrics such as interest point coordinates, object coordinates, object counts, and other descriptor information either with or without video or other image information. Another imaging device such as a printer may want no metadata and may just request raw video processed in a particular manner.


As another example, a smart camera may be instructed to look for faces with certain attributes. A smart printer may tell a camera to deliver raw image data that fits into the printer's color space device model for optimal print rendering while allowing smart application software to ask the camera to prepare a three-dimensional depth map of a scene at ten frames per second. Similarly, a printer or display may ask the camera to provide the locations and regions of a range of objects, such as faces, so that the printer or display may perform smart enhancements of the objects, such as face regions, in an optimized manner to achieve best viewing results. Thus, an image capture device may be able to recognize a wide range of objects and communicate information about the objects to a printer or display using a standardized protocol of the type described herein, allowing the printer, display or other rendering device to optimize the rendering.


Thus in some embodiments a standardized bidirectional protocol may be implemented to allow communication of specifications for imaging data between platform and peripheral. In some embodiments this may result in more efficient transfer of data and the reduction of the transfer of unnecessary data.


The protocol can be embodied at a high level as Extensible Markup Language (XML) or American Standard Code for Information Interchange (ASCII) text command streams sent bi-directionally between imaging devices over existing standard hardware protocol methods used by cameras, including but not limited to universal serial bus (USB), Mobile Industry Processor Interface (MIPI) (specifications available from MIPI Alliance, Inc.), and Peripheral Component Interconnect Express (PCIE), 3.05 specification (PCI Special Interest Group, Beaverton, Oreg. 97006, 2010-10-8), or the protocol may be an extension of existing hardware protocols. Alternatively, a new hardware protocol may be devised as a bi-directional channel, for example in MIPI, USB, PCIE or even with video standards such as H.265 (High Efficiency Video Coding, February 2012, available from Fraunhofer Heinrich Hertz Institute) and CODEC formats. See H.265 available from ISO/IEC Moving Picture Experts Group (MPEG).


Use cases include smart protocol-enabled printers advising a smart protocol-enabled camera how to process images to be optimally printed given the color gamut of each device. Another use case is a smart protocol-enabled camera that can be configured to only produce information when a certain face is seen, allowing the face details to be sent to the smart camera device, with corresponding face match coordinates and confidence levels sent to the platform, with or without the corresponding image. This exchange may involve sending standardized sets of interest or descriptor sets to search for, such as look for faces with these characteristics, look for the following gestures, look for the following objects, and only report when the object is found and then send the coordinates, descriptor information, confidence level and an entire image frame or frame portion containing the object.


Other use examples include smart protocol-enabled displays that can send their colorimetrically accurate gamut map and device color model to a camera to enable the camera to produce optimal images for that display. Another application involves face tracking application software using a communication protocol to send commands to a smart protocol-enabled camera sensor chip to request only the coordinates of face rectangles, along with corresponding interest points and other image descriptor details. As one additional application, a three-dimensional printer may use a communications protocol and communications channel to send configuration commands to a smart 3D camera. These configuration commands may include specific commands or instructions for various three-dimensional (3D) sensor technologies including but not limited to stereo, time-of-flight (TOF), structured light and the like. A 3D printer then requests from the image sensor camera device only a depth map and a set of triangles in a 3D triangle depth mesh, as well as textures on each triangle in the mesh with the corresponding colors of each polygon. This enables a three-dimensional model to be printed on the three-dimensional printer directly from the camera depth map and color information, or from the 3D triangle depth mesh. The same 3D triangle depth mesh may also be provided to a 3D display by a standardized protocol of the type described herein, to enable full 3D rendering on the display.


Thus referring to FIG. 1, a computer system 10 may include a platform 12 with memory 16, a processor 14, and an interface 18. Interface 18 may interface with imaging devices such as a camera 20, a printer 22, and a monitor 24. Each of the devices 20, 22 and 24 may be hardware devices with hardware processors 50 and internal storage 52. Also stored in the memory 16 may be a face recognition application 26 and a gesture recognition application 28. A protocol of the type described herein will allow for devices to program each other to perform special functions, for example a smart printer may send program source code or executable code to a smart camera to perform specific processing on behalf of the printer.


The interface 18 may implement one of a variety of different interfaces including MIPI, USB, Unified Extensible Firmware Interface (UEFI) (UEFI Specification, v. 2.3.1, Apr. 18, 2011), Hypertext Markup Language (HTML), or even Transmission Control Protocol/Internet Protocol (TCP/IP) sockets or User Datagram Protocol (UDP) datagrams. Other communication channels include both wired and wireless networks. The interface may implement a protocol that may be a request/response protocol, a polled protocol, or an event or interrupt event protocol, to mention some examples. The protocol may also use a Command and Status Register (CSR) shared memory or register interface, or a stream protocol interface such as HTTP or datagrams in a socket over TCP/IP, to mention a few examples. Any protocol method may be used in connection with some embodiments of the present invention.
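One way a request/response exchange over a socket might look is sketched below in Python. This is only an illustration: the newline framing, the OK/FAIL status strings, and the helper names send_line, read_line and handle_request are assumptions of the sketch and are not defined by the protocol described here. A connected socket pair stands in for the channel between platform and device.

```python
import socket


def send_line(conn, text):
    """Send one newline-terminated ASCII message."""
    conn.sendall((text + "\n").encode("ascii"))


def read_line(conn):
    """Read bytes until a newline arrives and return the message without it."""
    data = b""
    while not data.endswith(b"\n"):
        chunk = conn.recv(1024)
        if not chunk:
            break
        data += chunk
    return data.decode("ascii").strip()


def handle_request(conn):
    """Device side: read one command and answer with a status line.

    The check below is a placeholder for real command parsing.
    """
    command = read_line(conn)
    looks_like_markup = command.startswith("<") and command.endswith(">")
    send_line(conn, "OK" if looks_like_markup else "FAIL")
    return command


# A connected socket pair stands in for the platform/device channel.
platform, device = socket.socketpair()
send_line(platform, "<interestPoints><Canny /></interestPoints>")
received = handle_request(device)
reply = read_line(platform)
platform.close()
device.close()
print(received, reply)
```

A polled or event-driven variant would keep the same message format and change only who initiates each exchange.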


Sequences shown in FIGS. 2 and 3 including the protocol source sequence 30 in FIG. 2 and the protocol sink sequence 40 shown in FIG. 3 may be implemented in software, firmware and/or hardware. In software and firmware embodiments, the sequences may be implemented by one or more non-transitory computer readable media storing computer executed instructions. The non-transitory computer readable media may be optical, magnetic and/or semiconductor memories in some embodiments.


Referring to FIG. 2, the protocol source sequence 30, for requesting data, begins with receiving a request as indicated in block 32. The request may be received at a platform or at a peripheral device such as a camera. The request may be translated into appropriate commands useful within the video receiving/requesting device as indicated in block 34. Then the raw data that may be received may be filtered as indicated in block 36 to place it into the form set forth in the request. Thus, the format may include various data formats, various data sizes, specifications of particular objects to locate in data and images, locating particular text items or any other requests. Then the filtered data is transmitted to the sink as indicated in block 38. The sink may be the receiving device such as a monitor or the platform in some embodiments.
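The source sequence of FIG. 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the Request fields, the command tuples, and the cropping filter are hypothetical stand-ins, and a frame is modeled as a list of pixel rows.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request received in block 32."""
    pixel_format: str
    width: int
    height: int


def translate(request):
    """Block 34: turn the request into device-level commands (names assumed)."""
    return [("set_format", request.pixel_format),
            ("set_size", (request.width, request.height))]


def filter_raw(raw_frame, request):
    """Block 36: crop the raw frame (a list of rows) to the requested size."""
    return [row[:request.width] for row in raw_frame[:request.height]]


def source_sequence(request, raw_frame, transmit):
    """Run blocks 34 to 38 for one frame and return the commands issued."""
    commands = translate(request)               # block 34
    filtered = filter_raw(raw_frame, request)   # block 36
    transmit(filtered)                          # block 38
    return commands


raw_frame = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
sent = []
commands = source_sequence(Request("RGB", 2, 2), raw_frame, sent.append)
print(commands, sent)
```

Here the sink is just a list that collects what was transmitted; in practice it would be the monitor or platform mentioned above.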



FIG. 3 shows the protocol sink sequence 40, which runs on the device consuming the data. The protocol sink sequence may be implemented, for example, on the platform or the display. The sequence 40 begins by identifying a data format in block 42. Then a potential source of the data (e.g. camera) may be identified in block 44. Next, as indicated in block 46, the data is requested from the source in the particular format that has already been identified in block 42. Then the formatted data is received and acknowledged as indicated in block 48.
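The sink sequence of FIG. 3 might be sketched in Python as shown here. The FakeCamera class and its supports, request and acknowledge methods are hypothetical and exist only to make the example self-contained; they are not an interface defined by this document.

```python
class FakeCamera:
    """Stand-in source that advertises the formats it can produce."""

    def __init__(self, formats):
        self.formats = set(formats)
        self.acknowledged = False

    def supports(self, data_format):
        return data_format in self.formats

    def request(self, data_format):
        # Placeholder payload; a real source would return image data.
        return {"format": data_format, "payload": b"\x00\x01\x02\x03"}

    def acknowledge(self):
        self.acknowledged = True


def sink_sequence(data_format, candidate_sources):
    """Blocks 44 to 48; data_format is the format identified in block 42."""
    source = next((s for s in candidate_sources if s.supports(data_format)), None)  # block 44
    if source is None:
        return None
    data = source.request(data_format)   # block 46
    source.acknowledge()                 # block 48
    return data


cameras = [FakeCamera(["BAYER"]), FakeCamera(["RGB", "YUV422"])]
result = sink_sequence("YUV422", cameras)
print(result)
```

Returning None when no source supports the format is a choice made for this sketch; an implementation could instead fall back to another format.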


The metadata communicated between the platform and the imaging devices may be in various formats including XML. The protocol metadata may implement the syntax of the software metadata command and status. A selected set of protocol directives may be used to allow bidirectional communications between platforms and imaging devices.


A plurality of different commands may be developed or default commands may be provided. Each command may return a status code showing success, fail or failure code, in addition to returning any useful requested information. As a first example of a command, an image preprocessing pipeline request or response may specify items like sharpening, contrast enhancement or histogram equalization (HISTEQ), as examples. An image format request or response command may specify the color space that should be used such as RGB, the patterns that may be used such as BAYER, YUV 422, YUV 444, HSV, Luv, or the like and dimensions such as whether x/y dimensions should be used.
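An image format request and its status code could be handled along the following lines in Python. This is a sketch under stated assumptions: the colorSpace and dimension element names, their attribute names, and the Status values are hypothetical choices made so that the message is well-formed XML, and they are not taken from the sample table in this document.

```python
import xml.etree.ElementTree as ET
from enum import Enum


class Status(Enum):
    """Hypothetical status codes returned by each command."""
    SUCCESS = 0
    FAIL = 1


# Formats named in the text; membership here is an assumption of the sketch.
SUPPORTED = {"RGB", "BAYER", "YUV422", "YUV444", "HSV", "Luv"}


def build_image_format_request(color_space, width, height):
    """Serialize an image format request as well-formed XML."""
    root = ET.Element("imageFormat")
    ET.SubElement(root, "colorSpace", {"value": color_space})
    ET.SubElement(root, "dimension", {"x": str(width), "y": str(height)})
    return ET.tostring(root, encoding="unicode")


def handle_image_format_request(xml_text):
    """Device side: accept the request only if the color space is supported."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return Status.FAIL
    color = root.find("colorSpace")
    if color is None or color.get("value") not in SUPPORTED:
        return Status.FAIL
    return Status.SUCCESS


request_xml = build_image_format_request("RGB", 640, 480)
status = handle_image_format_request(request_xml)
print(request_xml, status)
```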


Still another possible command is an optimize command that includes a request and a response to specify the devices and applications for which a camera will optimize both the images and the metadata. This may be based on a published list of profiles of device models for a list of known devices that are participating in this technology. Those profiles may be built into the camera or retrieved by the camera from a network or storage device on demand. This arrangement enables the camera to optimize the image for the color gamut of a device or for a given application like face recognition. Thus, standard configurations of devices may be registered and stored online in a well-known location, allowing a standardized protocol of the type described herein to be used to obtain device configurations or set device configurations used by this protocol.
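The profile lookup described above (built into the camera, or retrieved on demand) can be illustrated with a short Python sketch. The profile contents, the ProfileStore class, and the fetch callback are assumptions made for the example; the document does not define a profile format or a retrieval interface.

```python
class ProfileStore:
    """Looks up device profiles built in first, then on demand (hypothetical)."""

    def __init__(self, built_in, fetch):
        self._built_in = dict(built_in)
        self._fetch = fetch      # callable standing in for network or storage access
        self._cache = {}

    def get(self, name):
        if name in self._built_in:
            return self._built_in[name]
        if name not in self._cache:
            profile = self._fetch(name)
            if profile is None:
                return None
            self._cache[name] = profile
        return self._cache[name]


# Illustrative data only; the gamut values are placeholders.
REMOTE = {"Sharp_3d_display_model_123": {"gamut": "wide"}}
fetch_calls = []


def fetch_profile(name):
    fetch_calls.append(name)
    return REMOTE.get(name)


store = ProfileStore({"HP_printer_model123": {"gamut": "print"}}, fetch_profile)
print(store.get("HP_printer_model123"))
print(store.get("Sharp_3d_display_model_123"))
print(store.get("unknown_device"))
```

Caching fetched profiles is a design choice of the sketch so that repeated optimize requests for the same device do not trigger repeated retrievals.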


Another potential command is an interest point request or response to specify the types of interest points desired. Still another example of a command is a descriptors request and response to specify the type of region descriptors around the interest points. Other commands include a light sources request and response to specify the light source colors and direction vectors to be returned. Other examples include a device color model request and response to return the color model of the device. The request or response may include some combination of desired information such as a mathematical model of the color gamut of the device, expressed in Low Level Virtual Machine (LLVM) code or Java byte code. Still another command is a depth map request and response to specify the format to be returned in the depth map. Possibilities exist in protocol embodiments to specify the computer memory data formats including integer and floating point precision (for example integer 8 or integer 16), x/y dimensions of image regions, and characteristic formats of depth maps to include polygon 3D mesh points and point or pixel depth maps.
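To make the depth map precision choice concrete, the Python sketch below packs a small pixel depth map at a requested integer precision. The little-endian layout and the function names are assumptions of this example; the document names the precisions (int8, int16) and the x/y dimensions but does not specify a byte layout.

```python
import struct

# struct format codes for signed 8-bit and 16-bit integers.
PRECISION_CODES = {"int8": "b", "int16": "h"}


def encode_depth_map(rows, precision):
    """Pack a depth map (list of rows) and return (width, height, payload)."""
    code = PRECISION_CODES[precision]
    height = len(rows)
    width = len(rows[0]) if rows else 0
    flat = [value for row in rows for value in row]
    payload = struct.pack("<%d%s" % (len(flat), code), *flat)
    return width, height, payload


def decode_depth_map(width, height, payload, precision):
    """Inverse of encode_depth_map for the same precision."""
    code = PRECISION_CODES[precision]
    flat = struct.unpack("<%d%s" % (width * height, code), payload)
    return [list(flat[r * width:(r + 1) * width]) for r in range(height)]


depths = [[100, 200, 300], [400, 500, 600]]
width, height, payload = encode_depth_map(depths, "int16")
print(width, height, len(payload))
```

Requesting int8 instead of int16 halves the payload but limits each depth value to the signed 8-bit range, which is the kind of trade-off the request lets the consumer choose.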


The following chart gives example commands, with descriptions, and an extensible markup language (XML) sample embodiment. Each command may return a status code showing success, fail, or failure code in addition to returning any useful requested information.














Command: Image pre-processing Pipeline (REQUEST, RESPONSE)
Description: Specify items like sharpening, contrast enhancement, HISTEQ, etc.
XML sample embodiment:
<processing>
 <pipeline>
  <sharpen_type1 />
  <histeq />
 </pipeline>
</processing>

Command: Image format (REQUEST, RESPONSE)
Description: Specify raw RGB, BAYER, YUV422, YUV444, HSV, Luv, etc. Dimension (x/y dimension).
XML sample embodiment:
<imageFormat>
 <RGB16bbp />
 <dimension = "640×480" />
</imageFormat>

Command: Optimize (REQUEST, RESPONSE)
Description: Specify the devices and apps for which a camera will optimize both 1) the images and 2) meta-data. This is based on a published list of profiles, device models, etc. for a list of known devices that are participating in the VDP standard, which profiles can be built into the camera or retrieved by the camera from a network or storage device on demand. This enables the camera to optimize the image for the color gamut of a device, or for a given application like face recognition.
XML sample embodiment:
<optimizeImage>
 <HP_printer_model123 />
 <Sharp_3d_display_model_123 />
 <Face_recognition_app_from_Metao />
</optimizeImage>

Command: Interest Points (REQUEST, RESPONSE)
Description: Specify the type of interest points desired (Harris Corner, Canny, LBP, etc.)
XML sample embodiment:
<interestPoints>
 <Canny />
</interestPoints>

Command: Descriptors (REQUEST, RESPONSE)
Description: Specify the type of region descriptors around the interest points (ORB, HOG, SIFT, GLOH, DAISY, etc.)
XML sample embodiment:
<descriptors>
 <SIFT />
 <GLOH />
</descriptors>

Command: LightSources (REQUEST, RESPONSE)
Description: Specify that light source colors and directional vectors are to be returned
XML sample embodiment:
<lightSources>
 <request3DLightVector />
 <requestLightColor />
</lightSources>

Command: Device Color Model (REQUEST, RESPONSE)
Description: Return the color model of the device. This is some combination of desired information such as the mathematical model in LLVM or Java byte code of the color gamut of the device [with white point, black point, neutral gray axis, RGB max values, Jch max values].
XML sample embodiment:
<deviceColorModel>
 <mathematicalModel />
 <colorGamut />
</deviceColorModel>

Command: Depth map (REQUEST, RESPONSE)
Description: Specify the format for the returned depth map (precision [int8, int16], x/y dimension, include_polygon_3Dmesh_points, include_pixel_depth_map)
XML sample embodiment:
<depthMap>
 <precision = "int16" />
 <include_pixel_depth_map />
 <include_polygon_3dmesh />
</depthMap>

Command: List Primitives (REQUEST, RESPONSE)
Description: List the primitive functions the device is capable of running. This may also return a compatibility level such as LEVEL 1, which means that the device supports all primitives and capabilities in LEVEL 1.
XML sample embodiment:
<listPrimitives>
 <all />
 <compatibilityLevelOnly />
</listPrimitives>

Command: Accept Primitives (REQUEST)
Description: Send JAVA byte code or LLVM code to a smart device. The code is an algorithm to run; this assumes the device has a processor and can accept and execute the code. The code can then be run by name later on command.
XML sample embodiment:
<acceptPrimitive>
 <code = "10102399903ABBCBD123230DC"></code>
 <name = "nameOfThisCode" />
 <whenToRun = "at_startup" />
</acceptPrimitive>

Command: Run Primitive (REQUEST)
Description: Run a primitive by name
XML sample embodiment:
<runPrimitive>
 <name = "someCoolPrimitive" />
</runPrimitive>

Command: Create Pipeline (REQUEST)
Description: Create a pipeline from a sequence of primitives
XML sample embodiment:
<pipeline>
 <name = "primitive1" />
 . . .
 <name = "primitive n" />
</pipeline>

Command: Search Method REQUEST (split transaction w/Search Results)
Description: Specify what the smart camera should look for, what meta-data it should return, how much meta-data, and how often to return the meta-data
XML sample embodiment:
<searchMethod>
 <name = "faces_method1" />
 <interestPoints = "HarrisCorner" />
 <pointCount = "1000_max" />
 <descriptors = "ORB" />
 <descriptorCount = "100_max" />
 <result = "return_every_10_frames" />
 <frames = "every_10th_image_frame" />
</searchMethod>

Command: Search Target REQUEST (split transaction w/Search Results)
Description: Specify what the smart camera should look for. This allows the camera to be told which types of corner points to look for, which types of faces to look for, which types of cars to look for, etc.
XML sample embodiment:
<searchMethod>
 <target = "faces_method1" />
</searchMethod>

Command: Search Results RESPONSE (split transaction w/Search Method)
Description: Receive the results from the corresponding search command. Results will come in periodically as specified in the Search Method command.
XML sample embodiment:
<searchResults>
 <descriptor id="1">
  <Interestpoint x="223" y="533" />
  . . .
  <Descriptor vector="100Ba..." />
  . . .
 </descriptor>
 <imageFrame>
  <data ". . ."></data>
 </imageFrame>
</searchResults>









Some embodiments also allow specific image processing and video analytics protocols to be created, such as a sequential combination like sharpen image, color correct image, look for faces, send face coordinates to printer, or send image to printer.
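A sequential combination of this kind might be sketched in Python as a list of steps applied in order. The step functions below are placeholders that only tag a dictionary standing in for an image; the fixed face rectangle is made-up data for illustration, not the output of a real detector.

```python
def sharpen(image):
    """Placeholder for a sharpening step."""
    return dict(image, sharpened=True)


def color_correct(image):
    """Placeholder for a color correction step."""
    return dict(image, color_corrected=True)


def find_faces(image):
    """Placeholder detector: reports one fixed rectangle (x, y, width, height)."""
    return dict(image, faces=[(10, 20, 64, 64)])


def run_pipeline(steps, image):
    """Apply each step to the output of the previous one."""
    for step in steps:
        image = step(image)
    return image


result = run_pipeline([sharpen, color_correct, find_faces], {"pixels": b""})
print(result)
```

The final step of such a sequence could hand result["faces"] to a printer or display instead of returning it to the caller.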


Other commands include a list primitives command to list the primitive functions the device is capable of running, and an accept primitives request to send Java byte code or LLVM code to a smart device. The code may be an algorithm to run and assumes the device has a processor and can accept and execute the code, which can then be run by name later on command. A run primitive command may be a request to run a primitive by name. A create pipeline command may create a pipeline from a sequence of primitives. A search method command may be a request to specify what the smart camera should look for, what metadata it should return, how much metadata, and how often to return the metadata. A search target command is a request that specifies what the smart camera should look for. This allows the camera to be told which types of corner points to look for, which types of faces to look for, and which types of cars to look for, as examples. Finally, a search results command may be a response to receive the results from the corresponding search command. The results may come in periodically as specified by the search method command.
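The primitive and search commands described above can be illustrated with the following Python sketch. It is a simplification under stated assumptions: Python callables stand in for the Java byte code or LLVM code that would actually be sent, and the SmartDevice class, the status strings, and the search_results generator are hypothetical names introduced for this example.

```python
class SmartDevice:
    """Hypothetical device that accepts, lists and runs named primitives."""

    def __init__(self):
        self._primitives = {}

    def accept_primitive(self, name, func):
        """Store code under a name so it can be run later on command."""
        self._primitives[name] = func
        return "success"

    def list_primitives(self):
        return sorted(self._primitives)

    def run_primitive(self, name, *args):
        """Return (status, result); fail if the name was never accepted."""
        if name not in self._primitives:
            return "fail", None
        return "success", self._primitives[name](*args)


def search_results(frames, detect, every_n):
    """Yield (frame_number, detections) for every n-th frame with a match."""
    for number, frame in enumerate(frames, start=1):
        if number % every_n == 0:
            found = detect(frame)
            if found:
                yield number, found


device = SmartDevice()
device.accept_primitive("double", lambda value: value * 2)
status, value = device.run_primitive("double", 4)

# Thirty dummy frames; the detector is a stand-in that always reports a face.
reports = list(search_results(range(30), lambda frame: ["face"], every_n=10))
print(status, value, reports)
```

Because the results arrive separately from the request that configured them, the generator mirrors the split transaction between the search method request and the search results response.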


References throughout this specification to “one embodiment” or “an embodiment” mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation encompassed within the present invention. Thus, appearances of the phrase “one embodiment” or “in an embodiment” are not necessarily referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be instituted in other suitable forms other than the particular embodiment illustrated and all such forms may be encompassed within the claims of the present application.


While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this present invention.

Claims
  • 1. A method comprising: sending a message from a platform and to an image capture device to specify a format of image data and what image data is to be transferred from the device to the platform related to an image to be displayed on said platform, wherein said message includes what depicted objects to look for in image data, what metadata to return, and how often to return the metadata; receiving image data in the platform; and processing the received image data on said platform.
  • 2. The method of claim 1 including transferring instructions from a platform to a camera to specify processing to be done in the camera before transferring image data to a platform.
  • 3. The method of claim 1 including sending a message from a platform to a display device to arrange for data provided to the display device to be placed in a particular format.
  • 4. The method of claim 1 including establishing a bidirectional protocol for exchanging information about image data between a platform and an image capture device.
  • 5. The method of claim 1 including using pre-established commands to specify characteristics of information exchanged for processing of image data between a platform and an image capture device.
  • 6. The method of claim 1 including offloading the processing of image data from a processor to a peripheral by providing commands to the peripheral to perform processing on the peripheral before transferring image data to the processor.
  • 7. One or more non-transitory computer readable media storing instructions that cause a computer to: send a message from a platform and to an image capture device to specify a format of image data and what image data is to be transferred from the device to the platform related to an image to be displayed on said platform, wherein said message includes what depicted objects to look for in image data, what metadata to return, and how often to return the metadata; receiving image data in the platform; and process the received image data on said platform.
  • 8. The media of claim 7 further storing instructions to transfer instructions from a platform to a camera to specify processing to be done in the camera before transferring image data to a platform.
  • 9. The media of claim 7 further storing instructions to exchange messages between a platform and a display device to arrange for data provided to the display device to be placed in a particular format.
  • 10. The media of claim 7 further storing instructions to establish a bidirectional protocol for exchanging information about image data between a platform and an image capture device.
  • 11. The media of claim 7 further storing instructions to use pre-established commands to specify characteristics of information exchanged for processing of image data between a platform and an image data.
  • 12. The media of claim 7 further storing instructions to offload the processing of image data from a processor to a peripheral by providing commands to the peripheral to perform processing on the peripheral before transferring image data to the processor.
  • 13. An apparatus comprising: a processor to send a message from a platform to an image capture device specifying an image data format and what image data is to be transferred from the device to the apparatus related to an image to be displayed on said apparatus, wherein said message includes what depicted objects to look for in image data, what metadata to return, and how often to return the metadata, to receive image data on said apparatus in the format specified in said message, and process the received image data on said platform; and a memory coupled to said processor.
  • 14. The apparatus of claim 13 wherein said apparatus is a cellular telephone.
  • 15. The apparatus of claim 13, said processor to send a message to a camera specifying an image data format for data from said camera.
  • 16. The apparatus of claim 13, said processor to send the message over a bidirectional protocol for exchanging image data.
  • 17. The apparatus of claim 16, said processor to use pre-established commands with said image device.
  • 18. The apparatus of claim 13 said processor to offload an image processing task to an image device using said message.
  • 19. The apparatus of claim 13 said processor to provide data to a display in a format specified in a message received from said display.
US Referenced Citations (12)
Number Name Date Kind
7543327 Kaplinsky Jun 2009 B1
7966415 Shouno Jun 2011 B2
8107337 Senda Jan 2012 B2
8290778 Gazdzinski Oct 2012 B2
8532105 Park Sep 2013 B2
20090054102 Jung Feb 2009 A1
20090191852 David et al. Jul 2009 A1
20090237513 Kuwata et al. Sep 2009 A1
20100231970 Higuchi et al. Sep 2010 A1
20120113442 Kyung May 2012 A1
20130124508 Paris et al. May 2013 A1
20130132527 Asher May 2013 A1
Non-Patent Literature Citations (1)
Entry
PCT International Search Report and Written Opinion issued in corresponding PCT/US2013/057891 dated Dec. 30, 2013, (12 pages).
Related Publications (1)
Number Date Country
20140063269 A1 Mar 2014 US