System for capturing and presenting text using video image capture for optical character recognition

Information

  • Patent Application
  • 20070230786
  • Publication Number
    20070230786
  • Date Filed
    March 28, 2007
  • Date Published
    October 04, 2007
Abstract
An apparatus for capturing text found on an object. The apparatus comprises an image capture subsystem which includes a video camera configured to capture a plurality of images to form a video stream. The image capture subsystem is configured to generate a master image from the video stream. The apparatus additionally comprises an Optical Character Recognition (“OCR”) subsystem configured to process the master image to form a digital text that corresponds to at least some of the text on the object.
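The abstract does not say how the master image is generated from the video stream. As a minimal sketch of one possibility, assuming the master image is simply the sharpest captured frame, the selection step might look like the following. The names `gradient_energy` and `select_master_image`, the grayscale list-of-rows frame representation, and the sharpness measure are all illustrative assumptions and are not taken from the application.

```python
from typing import List, Sequence

# Assumption: a frame is a grayscale image stored as equal-length rows of 0-255 values.
Frame = List[List[int]]


def gradient_energy(frame: Frame) -> int:
    """Sum of squared differences between adjacent pixels (higher means sharper edges)."""
    energy = 0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if x + 1 < len(row):  # right-hand neighbour
                energy += (row[x + 1] - value) ** 2
            if y + 1 < len(frame):  # neighbour below
                energy += (frame[y + 1][x] - value) ** 2
    return energy


def select_master_image(video_stream: Sequence[Frame]) -> Frame:
    """Hypothetical master-image step: return the frame with the highest gradient energy."""
    if not video_stream:
        raise ValueError("video stream contains no frames")
    return max(video_stream, key=gradient_energy)


# Two low-contrast frames and one high-contrast frame stand in for a video stream.
blurry = [[100, 110, 100], [110, 100, 110], [100, 110, 100]]
sharp = [[0, 255, 0], [255, 0, 255], [0, 255, 0]]
master = select_master_image([blurry, sharp, blurry])
```

Other strategies, such as combining several frames into one image, would fit the same wording; frame selection is shown only because it is the simplest to illustrate.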
Description

BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 provides a high-level overview of certain embodiments of the invention.



FIGS. 2A and 2B illustrate a front view and a side view of an exemplary handheld embodiment of the invention.



FIGS. 3A and 3B illustrate a rear view and a top view of the device illustrated in FIGS. 2A and 2B.



FIGS. 4A and 4B provide an isometric view of an exemplary standalone embodiment in an open configuration and a top view of the standalone embodiment in a closed configuration.



FIGS. 5A, 5B, and 5C provide a side view of the standalone embodiment illustrated in FIGS. 4A and 4B with an enlarged view of the exterior front panel and an enlarged view of the interior back panel.



FIG. 6 shows a sample page of a book containing black text against a white background that can be captured and/or processed by an exemplary embodiment of the invention.



FIG. 7 shows a sample page of a colored magazine article that can be captured and/or processed by an exemplary embodiment of the invention.



FIGS. 8A, 8B, and 8C illustrate schematics of an exemplary standalone embodiment.


Claims
  • 1. An apparatus for capturing text found on an object, the apparatus comprising: an image capture subsystem, including: a video camera configured to capture a plurality of images to form a video stream, wherein the image capture subsystem is configured to generate a master image from the video stream; and an Optical Character Recognition (“OCR”) subsystem configured to process the master image to form a digital text that corresponds to at least some of the text on the object.
  • 2. The apparatus of claim 1, further comprising a housing that contains the OCR subsystem and the image capture subsystem.
  • 3. The apparatus of claim 1, further comprising a text reader system configured to convert the digital text into a plurality of output formats.
  • 4. The apparatus of claim 3, further comprising a housing that contains the text reader system, the OCR system, and the image capture subsystem.
  • 5. The apparatus of claim 1, wherein the image capture subsystem further includes a level detector configured to determine whether the apparatus is level to a surface of the object.
  • 6. The apparatus of claim 5, wherein the level detector is configured with an indicator to signal when the apparatus is at an appropriate angle to the surface of the object.
  • 7. The apparatus of claim 1, wherein the image capture subsystem further includes an image stabilizer configured to compensate for unstable positioning of the apparatus when capturing the plurality of captured images.
  • 8. The apparatus of claim 1, wherein the image capture subsystem further includes a color differential detector configured to optimize the plurality of captured images for OCR processing.
  • 9. The apparatus of claim 1, wherein the image capture subsystem further includes a zoom configured to alter an image prior to capture.
  • 10. The apparatus of claim 1, wherein the image capture subsystem further includes a focal length adjustor.
  • 11. The apparatus of claim 1, wherein the image capture subsystem further includes an aperture adjustor.
  • 12. The apparatus of claim 11, wherein the aperture adjustor is configured to operate with a focal length adjustor to vary the depth of field in which the object appears.
  • 13. The apparatus of claim 1, wherein the image capture subsystem further includes an adjustable shutter.
  • 14. The apparatus of claim 1, wherein the video camera has one or more automatically adjustable lenses that tilt within the apparatus so the automatically adjustable lenses are level with the surface of the object.
  • 15. The apparatus of claim 1, wherein the image capture subsystem further includes a light source.
  • 16. The apparatus of claim 3, wherein the text reader system is further configured to translate the digital text.
  • 17. The apparatus of claim 16, wherein the output format is a language different than that found on the object.
  • 18. The apparatus of claim 3, wherein the output format is selected from the group speech, Braille, and displaying large print text.
  • 19. The apparatus of claim 1, wherein the object is non-planar.
  • 20. The apparatus of claim 1, further comprising a memory.
  • 21. The apparatus of claim 20, wherein the memory is configured to store an element selected from the group consisting of a dictionary, a thesaurus, a spellchecker program, and a vocabulary list.
  • 22. The apparatus of claim 20, wherein the memory is configured to store a plurality of key information from the digital text.
  • 23. The apparatus of claim 22, wherein the memory is further configured to permit searches of the plurality of key information.
  • 24. The apparatus of claim 1, further comprising a display configured to display the digital text.
  • 25. The apparatus of claim 3, wherein the text reader system is further configured to present a first output format on a display.
  • 26. The apparatus of claim 25, wherein the text reader system is further configured to present a second output format in speech.
  • 27. The apparatus of claim 26, wherein the text reader system is further configured to synchronize the first output format with the second output format.
  • 28. The apparatus of claim 26, wherein the text reader system is further configured to emphasize text of the first output format as corresponding text in the second output format is spoken.
  • 29. A system for capturing text found on an object, the system comprising: an image capture subsystem, including: a video camera configured to capture a plurality of images to form a video stream, wherein the image capture subsystem is configured to generate a master image from the video stream; a text capture module configured to create a digital text from the master image; and a material context component configured to associate a media type with the text found on the object, wherein the system is configured to organize the digital text according to the media type.
  • 30. The system of claim 29, wherein the material context component is further configured to associate a layout format with the media type.
  • 31. The system of claim 30, wherein the material context component is further configured to evaluate the media type and layout format to determine the layout of text found on the object.
  • 32. The system of claim 29, further comprising a storage component configured to store the organized digital text.
  • 33. The system of claim 29, further comprising an output component configured to convert the organized digital text to an output format.
  • 34. The system of claim 30, wherein the media type is selected from the group consisting of a book, a newspaper, a pill bottle, a prescription, a restaurant menu, and a street sign.
  • 35. The system of claim 30, wherein the layout format includes an element selected from the group consisting of columns, footnotes, pictures, headlines, text sizes, and text colors.
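Claims 29 to 35 describe a material context component that associates a media type with a layout format and organizes the digital text accordingly. The sketch below illustrates that idea under stated assumptions: the `TextBlock` fields, the `LAYOUT_FORMATS` table, and the ordering rules in `organize` are hypothetical and are not drawn from the application, which does not specify how the organization is performed.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class TextBlock:
    """Hypothetical unit of recognized text with its position on the object."""
    text: str
    column: int  # 0-based column index
    top: int     # vertical position, smaller is higher on the page
    size: int    # text size


# Assumed association of media types (claim 34) with layout elements (claim 35).
LAYOUT_FORMATS: Dict[str, Tuple[str, ...]] = {
    "book": ("footnotes",),
    "newspaper": ("columns", "headlines"),
    "restaurant menu": ("columns", "text sizes"),
}


def organize(blocks: Sequence[TextBlock], media_type: str) -> List[str]:
    """Order recognized text according to the layout assumed for the media type."""
    layout = LAYOUT_FORMATS.get(media_type, ())
    ordered = list(blocks)
    if "columns" in layout:
        # Read each column top to bottom before moving to the next column.
        ordered.sort(key=lambda b: (b.column, b.top))
    else:
        # Single-column reading order: top to bottom only.
        ordered.sort(key=lambda b: b.top)
    if "headlines" in layout and ordered:
        # Stable sort that moves the largest text (assumed headline) to the front.
        largest = max(b.size for b in ordered)
        ordered.sort(key=lambda b: b.size != largest)
    return [b.text for b in ordered]


blocks = [
    TextBlock("continued in column two", column=1, top=10, size=10),
    TextBlock("Headline", column=0, top=0, size=24),
    TextBlock("first column body", column=0, top=10, size=10),
]
as_newspaper = organize(blocks, "newspaper")
as_book = organize(blocks, "book")
```

With the same blocks, the newspaper layout reads the first column before the second, while the book layout falls back to plain top-to-bottom order, which is the kind of difference a media type association is meant to capture.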
Provisional Applications (2)
Number    Date      Country
60811316  Jun 2006  US
60788365  Mar 2006  US