Embodiments described herein generally relate to vision-assist devices, and more specifically, to vision-assist devices that provide feedback to a user regarding proper orientation of the vision-assist device so as to provide accurate automatic detection of objects in an environment.
Blind or visually impaired persons have difficulty navigating within their environment because of their inability to detect the location and type of objects within the environment. Blind or visually impaired persons often use a cane to assist them in navigating a space. Other devices may use computer-based vision systems to detect objects within an environment using one or more object recognition algorithms. Although computer-based vision systems are able to detect objects present within image data, such vision systems often misidentify the type of object detected. For example, an object recognition algorithm may require a proper field of view or orientation of the camera, because an improper angle of the object within the environment may prevent the algorithm from correctly identifying the object. As another example, in a wearable vision-assist device, differences in the body types of users may cause the device to be tilted, so that the images captured by the device are analyzed improperly and incorrect information is conveyed to the user.
Accordingly, alternative vision-assist devices for correcting the orientation of the vision-assist devices are desired.
In one embodiment, a method of calibrating a vision-assist device includes capturing a calibration image using at least one image sensor of the vision-assist device, obtaining at least one attribute of the calibration image, and comparing the at least one attribute of the calibration image with a reference attribute. The method further includes determining an adjustment of the at least one image sensor based at least in part on the comparison of the at least one attribute of the calibration image with the reference attribute, and providing an output corresponding to the determined adjustment of the vision-assist device.
In another embodiment, a vision-assist device includes an image sensor configured to capture image data, a feedback device configured to provide feedback to a user, a processor, and a non-transitory computer-readable medium. The non-transitory computer-readable medium stores computer-readable instructions that, when executed by the processor, cause the processor to capture a calibration image using the image sensor of the vision-assist device, obtain at least one attribute of the calibration image, and compare the at least one attribute of the calibration image with a reference attribute. The computer-readable instructions further cause the processor to determine an adjustment of the image sensor based at least in part on the comparison of the at least one attribute of the calibration image with the reference attribute, and provide feedback corresponding to the determined adjustment of the vision-assist device using the feedback device.
These and additional features provided by the embodiments of the present disclosure will be more fully understood in view of the following detailed description, in conjunction with the drawings.
The embodiments set forth in the drawings are illustrative and exemplary in nature and not intended to limit the disclosure. The following detailed description of the illustrative embodiments can be understood when read in conjunction with the following drawings, where like structure is indicated with like reference numerals and in which:
Embodiments disclosed herein are directed to vision-assist devices and related systems and methods that utilize one or more calibration images to provide feedback to blind or visually impaired individuals regarding how to accurately position the vision-assist device. Generally, embodiments described herein may be configured as devices that capture image data of the user's environment using one or more image sensors (e.g., one or more cameras), and perform object recognition analysis to detect objects or people within the user's environment. This information should be conveyed accurately to the blind or visually impaired individual as he or she navigates the environment according to the information.
Embodiments described herein are configured to be worn by the user. For example, as shown in
Embodiments described herein use one or more calibration processes to ensure that the vision-assist device is properly worn by the user. As described in more detail below, in one non-limiting example, an object held by a user is used in a calibration image to generate feedback to the user as to how to orient the vision-assist device. In another example, the vision-assist device captures one or more images of an edge defined by a wall, a ceiling, and/or a floor as a reference attribute to detect the orientation (i.e., "tilt") of the vision-assist device. Some embodiments use an inertial measurement unit to detect the orientation or tilt of the vision-assist device. In some embodiments, body type information regarding the body of the user is acquired by the vision-assist device using one or more images of the user and/or inputs with respect to the body type. Using the body type information and the calibration image, the vision-assist device instructs the user as to how to adjust the vision-assist device. Various embodiments of vision-assist devices, systems, and methods for adjusting vision-assist devices will be described in more detail below.
Referring now to
The memory component 170 may be configured as volatile and/or nonvolatile non-transitory computer readable medium and, as such, may include random access memory (including SRAM, DRAM, and/or other types of random access memory), flash memory, registers, compact discs (CD), digital versatile discs (DVD), magnetic disks, and/or other types of storage components. Additionally, the memory component 170 may be configured to store, among other things, operation logic, object recognition logic, and auditory message generation logic, as described in more detail below. The memory component 170 may also store data, such as image data captured by the one or more image sensors or externally acquired image data, for performing the object recognition analysis described hereinbelow.
A local interface 120 is also included in the embodiment depicted by
The one or more processors 110 may include any processing component configured to receive information and execute instructions (such as from the memory component 170). Example processing components include, but are not limited to, one or more general purpose processors, microcontrollers, and/or application-specific integrated circuits.
The one or more image sensors 130 are configured to capture image data of the environment (i.e., the "scene") in which the vision-assist device 100 operates. The image data digitally represents the scene in which the vision-assist device 100 operates, such as objects and people within the scene. The image sensor 130 may be configured as any sensor operable to capture image data, such as, without limitation, a charge-coupled device image sensor or a complementary metal-oxide-semiconductor sensor capable of detecting optical radiation having wavelengths in the visual spectrum, for example. The one or more image sensors 130 may be configured to detect optical radiation in wavelengths outside of the visual spectrum, such as wavelengths within the infrared spectrum. In some embodiments, two image sensors 130 are provided to create stereo image data capable of capturing depth information.
The one or more inertial measurement units 140 may be configured to acquire information regarding the tilt or orientation of the vision-assist device 100. The one or more inertial measurement units 140 are communicatively coupled to the processor 110 such that they provide the processor 110 with orientation information of the vision-assist device 100, and therefore of the one or more image sensors 130.
The one or more auditory devices 150 may be configured as speakers capable of receiving auditory signals from the processor 110 (either directly or indirectly from other hardware, such as amplifiers, drivers, digital-to-analog converters, and the like) to produce an auditory message capable of being heard by the user. In some embodiments, the one or more auditory devices 150 include a first speaker and a second speaker so that the auditory message is provided to the user in stereo. The one or more auditory devices 150 may be configured to convey information on both the environment around the user and the manner in which the user should adjust the position of the vision-assist device 100 to provide for an optimal field of view for the one or more image sensors 130.
The one or more user input devices 160 are provided for the user to communicate with the vision-assist device 100. The one or more user input devices 160 may be used by the user to complete tasks such as programming preferences or settings, providing commands, and providing feedback to the vision-assist device 100. The one or more user input devices 160 may take on any appropriate form. For example, the one or more user input devices 160 may be configured as a keyboard, buttons, switches, touch-sensitive pads, microphones, and the like. Any appropriate user input device may be utilized. As described in more detail below, the one or more user input devices 160 may be used by the user to provide input regarding body type information of the user.
It should be understood that the vision-assist device 100 may include additional components not illustrated in
Referring now to
In some embodiments, the housing 180 is made from a pliable material, such as, without limitation, ethylene-vinyl acetate. In other embodiments, the housing 180 is made from a rigid material.
Referring specifically to
The first and second image sensors 130A, 130B are configured to capture image data to produce three-dimensional images of the scene as the user navigates the environment, which are used by the object recognition algorithm(s) to detect objects and people, as described in detail below. As shown in
When worn around the neck of the user, the vision-assist device may be tilted depending on the user's physical characteristics.
The first and second audio devices 150A, 150B produce auditory messages that are intended to be received by the user 210.
The auditory messages may provide menu navigation options to the user so that the user may program or otherwise set parameters of the vision-assist device 100. Auditory messages also include environmental information about the scene, as described in detail below. Further, as described below, the first and second audio devices 150A, 150B may produce auditory messages instructing a user as to how to adjust the position of the vision-assist device 100 following a calibration procedure. Although two audio devices are illustrated, more or fewer audio devices may be provided. In some embodiments, a microphone is also provided as a user-input device to enable voice control of the vision-assist device 100. In this manner, the user may provide feedback to the vision-assist device 100 using voice commands. As an example and not a limitation, the first and/or second audio device 150A, 150B may be configured as a combination speaker/microphone device capable of both receiving voice commands and emitting auditory messages/sounds.
Operation of a vision-assist device 100 will now be described.
Any known or yet-to-be-developed object recognition algorithms may be utilized to detect objects within the image data representing the environment. Example object recognition algorithms include, but are not limited to, edge detection algorithms, corner detection algorithms, blob detection algorithms, and feature description algorithms (e.g., scale-invariant feature transform ("SIFT"), speeded up robust features ("SURF"), gradient location and orientation histogram ("GLOH"), and the like). It should be understood that the phrase "object recognition algorithm" as used herein also includes facial recognition algorithms used to detect people present within image data.
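By way of non-limiting illustration, the following Python sketch shows how such detection primitives might be invoked. Python and the OpenCV library are assumptions made for illustration only; the disclosed embodiments do not require any particular language or library, and the parameter values shown are arbitrary.

```python
# Minimal sketch of the detection primitives named above, assuming OpenCV.
import cv2


def detect_primitives(image_path: str):
    """Run edge, corner, and feature detection on a captured frame."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Edge detection (e.g., Canny) highlights object boundaries.
    edges = cv2.Canny(gray, 50, 150)

    # Corner detection (e.g., Harris) marks high-curvature points.
    corners = cv2.cornerHarris(gray.astype("float32"), 2, 3, 0.04)

    # Feature description; ORB is used here as a freely available stand-in
    # for SIFT/SURF/GLOH-style descriptors.
    orb = cv2.ORB_create()
    keypoints, descriptors = orb.detectAndCompute(gray, None)

    return edges, corners, keypoints, descriptors
```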
As noted hereinabove, if the vision-assist device is not worn properly by the user, the images captured by the one or more image sensors 130 may be improperly oriented, such as tilted. Improper orientation of the one or more image sensors 130 may adversely affect the ability of the vision-assist device to properly detect objects. If the vision-assist device 100 is not in a proper position when worn by the user, the vision-assist device 100 may not be able to provide accurate feedback to the user regarding objects in the environment 300. Embodiments of the present disclosure provide for a calibration process to ensure that the vision-assist device 100 is properly worn by the user such that the one or more image sensors 130 have an optimal field of view of objects within the environment.
As noted hereinabove, embodiments of the present disclosure are directed to vision-assist devices 100 that are capable of performing a calibration process to ensure that the vision-assist device 100 is properly worn.
In one embodiment, the calibration process employs a hand-held object, or an object that is positioned at a known location, such as on a wall, on a table, or the like. As an example and not a limitation, the hand-held object may be a sign that is held by the user at an arm's length. The sign may include an image such as, without limitation, a two-dimensional barcode as depicted in
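As a purely illustrative sketch of the hand-held-sign embodiment above, the following Python fragment estimates the roll of the vision-assist device from the detected corners of such a two-dimensional barcode. The use of OpenCV's QR-code detector and the assumed corner ordering (top-left first, then top-right) are illustrative assumptions, not requirements of the disclosure.

```python
# Illustrative sketch: estimate device roll from a hand-held 2D barcode sign.
import math

import cv2
import numpy as np


def barcode_tilt_degrees(calibration_image):
    """Return the angle (degrees) between the top edge of a detected 2D code
    and the image's horizontal axis, or None if no code is found."""
    detector = cv2.QRCodeDetector()
    found, points = detector.detect(calibration_image)
    if not found or points is None:
        return None
    corners = np.asarray(points).reshape(-1, 2)  # four code corners
    # Assumed ordering: top-left, top-right, bottom-right, bottom-left.
    top_left, top_right = corners[0], corners[1]
    dx, dy = top_right[0] - top_left[0], top_right[1] - top_left[1]
    return math.degrees(math.atan2(dy, dx))
```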
In another embodiment, the vision-assist device 100 may instruct the user to orient himself or herself toward a wall, an intersection of walls, or an intersection between wall(s), ceiling, and/or floor.
At block 620, the vision-assist device 100 captures one or more calibration images using the one or more image sensors 130 in response to the user's input requesting the start of the calibration process. The calibration image may be in the form of one or more static digital images, or a video made up of many sequential digital images. The calibration image(s) may be stored in the memory component 170, for example.
In the object embodiment (e.g., hand-held object) depicted in
At block 630, one or more attributes of the calibration image are determined. As an example and not a limitation, the one or more attributes of the calibration image may be an angle between a detected edge or feature of the calibration image and an axis, such as a vertical axis (e.g., the y-axis) or a horizontal axis (e.g., the x-axis). In the example depicted in
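A minimal sketch of such an attribute measurement is given below, assuming Python with OpenCV; the edge detector, line detector, and threshold values are illustrative choices only. The sketch returns the angle between the longest detected edge and the horizontal axis of the calibration image.

```python
# Illustrative sketch: measure the angle of the dominant edge in a calibration
# image relative to the horizontal (x) axis.
import math

import cv2
import numpy as np


def dominant_edge_angle(calibration_image):
    """Return the angle (degrees) between the longest detected edge and the
    horizontal axis, or None if no edge is found."""
    gray = cv2.cvtColor(calibration_image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                            minLineLength=60, maxLineGap=10)
    if lines is None:
        return None

    def length(line):
        x1, y1, x2, y2 = line[0]
        return math.hypot(x2 - x1, y2 - y1)

    # Treat the longest line segment as the attribute-bearing edge.
    x1, y1, x2, y2 = max(lines, key=length)[0]
    return math.degrees(math.atan2(y2 - y1, x2 - x1))
```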
At block 640, the one or more attributes of the calibration image (or images) are compared with one or more reference attributes. In the examples depicted in
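The comparison itself may be as simple as taking a signed difference between the measured angle and the reference angle, as in the following illustrative sketch; the reference value and tolerance shown are assumed placeholders.

```python
# Illustrative sketch: compare a measured attribute against a reference.
def attribute_deviation(measured_angle_deg, reference_angle_deg=0.0,
                        tolerance_deg=2.0):
    """Return the signed deviation in degrees, or 0.0 when the measured
    attribute is already within the assumed tolerance band."""
    deviation = measured_angle_deg - reference_angle_deg
    return 0.0 if abs(deviation) <= tolerance_deg else deviation
```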
In some embodiments, the calibration process optionally moves to block 650, where body type information of the user is acquired. The body type information acquired at block 650 may be used by the processor of the vision-assist device to determine how to instruct the user to adjust the position of the vision-assist device. Particularly, in order to give detailed instructions to the user, the processor 110 can acquire body type information (i.e., physical characteristics) of the user, such as height, weight, width of the neck, chest size, and the thickness of the body. The body type information can be acquired from an image of the user or from input provided by the user. In some embodiments, the body type information is previously entered by the user and stored in the memory component 170. Additional information regarding acquiring and using body type information for image sensor calibration is found in U.S. patent application Ser. No. 15/205,946, which is hereby incorporated by reference in its entirety. In other embodiments, body type information is not acquired and is not utilized by the processor 110 to determine an adjustment of the vision-assist device 100.
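For illustration only, the body type information described above could be represented as a simple data record such as the following Python sketch; the field names and units are assumptions and do not limit the disclosure.

```python
# Illustrative sketch: a record of the body type information described above.
from dataclasses import dataclass
from typing import Optional


@dataclass
class BodyTypeInfo:
    """Physical characteristics the device may use when determining the
    adjustment; all fields are optional because they may be entered by the
    user, derived from an image of the user, or omitted entirely."""
    height_cm: Optional[float] = None
    weight_kg: Optional[float] = None
    neck_width_cm: Optional[float] = None
    chest_size_cm: Optional[float] = None
    body_thickness_cm: Optional[float] = None
```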
At block 660, the processor 110 determines an adjustment of the vision-assist device 100 needed so that the vision-assist device is properly worn by the user. This determination is made by comparing the angle(s) of the attribute of the calibration image with that of the reference attribute. The processor 110 may use this angle (or angles) to determine the magnitude and the direction by which the vision-assist device 100 should be adjusted. In one example, three attributes represented by three angles with respect to three axes (e.g., the x-, y-, and z-axes) are used to perform a calculation as to how to adjust the vision-assist device in three dimensions. Where body type information is acquired, the body type information may be utilized in the determination as to how to adjust the vision-assist device.
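An illustrative sketch of this determination, assuming the measured and reference attributes are expressed as per-axis angles in degrees, is shown below; the axis naming and sign convention are assumptions made for illustration.

```python
# Illustrative sketch: derive a per-axis adjustment from measured and
# reference angles (both given as dicts keyed by "roll", "pitch", "yaw").
from dataclasses import dataclass


@dataclass
class Adjustment:
    """Magnitude (degrees) and direction of the suggested adjustment about
    each axis; positive values follow an assumed sign convention."""
    roll: float   # rotation about the axis pointing out of the camera
    pitch: float  # tilt up or down
    yaw: float    # rotation left or right


def compute_adjustment(measured, reference):
    """Return the adjustment as the per-axis difference between the
    reference angles and the measured calibration angles."""
    return Adjustment(
        roll=reference["roll"] - measured["roll"],
        pitch=reference["pitch"] - measured["pitch"],
        yaw=reference["yaw"] - measured["yaw"],
    )
```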
At block 670, feedback is provided to the user with respect to the determined adjustment of the vision-assist device that may be needed so that the vision-assist device has an optimum field of view. In one example, the vision-assist device 100 produces audio signals using the audio device 150 that instruct the user how to reposition or adjust the vision-assist device 100 on his or her body. As an example and not a limitation, the vision-assist device 100 may state: "Please tilt both ends of the device downward," "Please tilt both ends of the device to the left," "Please tilt the left end of the device upward," and the like, depending on the adjustment that is needed. It is noted that the way the instruction is communicated to the user is not limited to auditory instructions. As other non-limiting examples, the instructions may be communicated to the user through bone-conducting hardware worn by the user near his or her ears, or by haptic feedback from one or more vibration devices within the vision-assist device.
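For illustration, the determined adjustment could be mapped to spoken instructions along the lines of the following sketch, which reuses the Adjustment record from the sketch above; the thresholds, sign convention, and exact phrasing are assumptions.

```python
# Illustrative sketch: map a computed adjustment to a spoken instruction.
def adjustment_message(adjustment, threshold_deg=2.0):
    """Translate an Adjustment into an instruction string; the phrasing
    mirrors the examples in the text and is illustrative only."""
    if abs(adjustment.pitch) > threshold_deg:
        direction = "downward" if adjustment.pitch < 0 else "upward"
        return f"Please tilt both ends of the device {direction}."
    if abs(adjustment.roll) > threshold_deg:
        # Sign convention for roll is an assumption.
        side = "left" if adjustment.roll > 0 else "right"
        return f"Please tilt the {side} end of the device upward."
    return "The device is properly positioned."
```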
After the user adjusts the vision-assist device 100 in accordance with the instructions, the user may wish to perform another calibration process to ensure that the adjustment of the vision-assist device 100 was effective. Accordingly, the steps of flowchart 600 may be repeated until the vision-assist device 100 is properly oriented on the user.
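An illustrative sketch of this repeated capture, compare, and instruct cycle of flowchart 600 is given below; the capture_image, measure_attributes, and speak callables are hypothetical placeholders for device firmware functions, and the sketch reuses the helpers from the preceding sketches.

```python
# Illustrative sketch: repeat the calibration cycle until within tolerance.
def run_calibration(capture_image, measure_attributes, speak,
                    reference, tolerance_deg=2.0, max_rounds=5):
    """Loop over capture -> measure -> compare -> instruct until the device
    is within tolerance on all axes, or a round limit is reached.

    capture_image(), measure_attributes(image), and speak(text) are assumed
    callables supplied by the device firmware; they are placeholders here.
    """
    for _ in range(max_rounds):
        image = capture_image()
        measured = measure_attributes(image)  # dict keyed by roll/pitch/yaw
        adjustment = compute_adjustment(measured, reference)
        if all(abs(angle) <= tolerance_deg
               for angle in (adjustment.roll, adjustment.pitch, adjustment.yaw)):
            speak("The device is properly positioned.")
            return True
        speak(adjustment_message(adjustment))
    return False
```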
After the position of the vision-assist device 100 is properly calibrated, the vision-assist device 100 may assist the user by properly detecting and advising the user as to objects as the user navigates the environment.
It should now be understood that embodiments described herein are directed to vision-assist devices configured to be worn by a user, and to be adjusted in accordance with a calibration procedure to ensure that the vision-assist device is properly worn by the user. By calibrating the position of the vision-assist device on the user, the field of view of one or more image sensors of the vision-assist device is improved such that the automatic recognition of objects detected by the vision-assist device in the environment is also improved. In this manner, more accurate information can be conveyed to the user by the vision-assist device. Further, a blind or visually impaired person may recognize whether the vision-assist device is worn properly, and he or she can calibrate the position of the vision-assist device without the need for human assistance.
While particular embodiments and aspects of the present disclosure have been illustrated and described herein, various other changes and modifications can be made without departing from the spirit and scope of the disclosure. Moreover, although various aspects have been described herein, such aspects need not be utilized in combination. Accordingly, it is therefore intended that the appended claims cover all such changes and modifications that are within the scope of the embodiments shown and described herein.
It should now be understood that embodiments disclosed herein include systems, methods, and non-transitory computer-readable media for calibrating vision-assist devices using calibration images. It should also be understood that these embodiments are merely exemplary and are not intended to limit the scope of this disclosure.