Pictures using voice commands

Information

  • Patent Grant
  • 10257401
  • Patent Number
    10,257,401
  • Date Filed
    Tuesday, November 24, 2015
  • Date Issued
    Tuesday, April 9, 2019
Abstract
A system and method is disclosed for enabling user friendly interaction with a camera system. Specifically, the inventive system and method has several aspects to improve the interaction with a camera system, including voice recognition, gaze tracking, touch sensitive inputs and others. The voice recognition unit is operable for, among other things, receiving multiple different voice commands, recognizing the vocal commands, associating the different voice commands to one camera command and controlling at least some aspect of the digital camera operation in response to these voice commands. The gaze tracking unit is operable for, among other things, determining the location on the viewfinder image that the user is gazing upon. One aspect of the touch sensitive inputs provides that the touch sensitive pad is mouse-like and is operable for, among other things, receiving user touch inputs to control at least some aspect of the camera operation. Another aspect of the disclosed invention provides for gesture recognition to be used to interface with and control the camera system.
Description
BACKGROUND OF THE INVENTION

Digitally-based and film-based cameras abound and are extremely flexible and convenient. One use for a camera is in the taking of self portraits. Typically, the user frames the shot and places the camera in a mode whereby, when the shutter button is depressed, the camera waits a predetermined time so that the user may incorporate himself back into the shot before the camera actually takes the picture. This is cumbersome and leads to nontrivial problems. Sometimes the predetermined delay time is not long enough. Other times, it may be too long. For participants who are in place and ready to have their picture taken, especially children, waiting with a smile on their face for the picture to be snapped by the camera can seem endless even if it is just a few seconds long. Additionally, many who might like to be included in a shot find themselves unable to be because they have to take the picture and it is simply too much trouble to set up for a shutter-delayed photograph.


Voice recognition techniques are well known in the art and have been applied to cameras, see for example, U.S. Pat. Nos. 4,951,079, 6,021,278 and 6,101,338 which are herein incorporated by reference. It is currently possible to have fairly large vocabularies of uttered words recognized by an electronic device. Speech recognition devices can be of a type whereby they are trained to recognize a specific person's vocalizations, so-called speaker dependent recognition, or can be of a type which recognizes spoken words without regard to who speaks them, so-called speaker independent recognition. Prior art voice operated cameras have several defects remedied or improved upon by various aspects of the present invention more fully disclosed below. One such problem is that in self portrait mode, the camera may snap the picture while the user is uttering the command. Another defect is that the microphone coupled to the voice recognition unit is usually mounted on the back of the camera. This placement is non-optimal when the user is in front of the camera as when taking a self portrait. Still another problem with prior art voice activated cameras is that they associate one vocalization or utterance to one camera operation. Thus, the user must remember which command word is to be spoken for which camera operation. This is overly constraining, unnatural, and significantly reduces the utility of adding voice recognition to the camera.


One prior art implementation of voice recognition allows for menu driven prompts to help guide the user through the task of remembering which command to speak for which camera function. This method, however, requires that the user be looking at the camera's dedicated LCD display for the menu. One aspect of the present invention provides for the menus to be displayed in the electronic view finder of the camera and be manipulated with both voice and gaze. Another aspect of the present invention incorporates touchpad technology which is typically used in laptop computers, such technology being well known in the art, as the camera input device for at least some functions.


SUMMARY OF THE INVENTION

A self-contained camera system, according to various aspects of the present invention, includes voice recognition wherein multiple different vocalizations can be recognized and wherein some such recognized vocalizations can be associated with the same camera command. Another aspect of the invention provides for multiple microphones disposed on or in the camera system body and operable so that the user can be anywhere around the camera system and be heard by the camera system equally well. According to other aspects of the present invention, the camera system viewfinder includes gaze tracking ability and in exemplary preferred embodiments, gaze tracking is used alone or in combination with other aspects of the invention to, for example, manipulate menus, improve picture taking speed, or improve the auto focus capability of the camera. Other aspects of the present invention, such as the addition of touchpad technology and gesture recognition, provide for an improved and more natural user interface to the camera system.


Thus, it is an object of the invention to provide an improved self-portrait mode for a camera system. It is further an object of the invention to provide an improved user interface for a camera system. It is yet a further object of the invention to make a camera system more user friendly with a more natural and intuitive user interface. It is still a further object of the invention to broaden the capabilities of the camera system. It is further an object of the invention to more easily allow a user to compose a shot to be taken by the camera system. It is still further an object of the invention to improve image quality of pictures taken by the camera system. It is yet another object of the invention to improve the speed of picture taking by the camera system.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1A is an exemplary perspective view of the rear (back) of the camera system according to various aspects of the present invention.



FIG. 1B is an exemplary perspective view of the front of the camera system according to various aspects of the present invention.



FIG. 2 is a functional representation of automatic microphone selection circuitry that may be used in various aspects of the present invention.



FIG. 3 shows an exemplary functional block diagram of an inventive camera system implementing various aspects of the present invention.



FIG. 4 shows an exemplary embodiment of a wink detector according to various aspects of the present invention.



FIG. 5A shows an exemplary touchpad overlay with cutouts according to various aspects of the present invention.



FIG. 5B shows an exemplary touchpad overlay with cutouts according to various aspects of the present invention.



FIG. 5C shows an exemplary touchpad overlay with cutouts according to various aspects of the present invention.





DESCRIPTION OF PREFERRED EXEMPLARY EMBODIMENTS

One aspect of the present invention solves several of the problems of the prior art voice recognition cameras in that this aspect provides for more than one microphone to be the source to the recognition unit. With reference to FIG. 1, this aspect of the present invention provides for at least two microphones to be used, one microphone, 10b, placed on the back of the camera and one microphone, 10a, placed on the front, either of which can receive voice commands. In a first preferred embodiment of this aspect of the invention, a detection device determines which microphone is to be used as the input to the recognition unit based upon the strength of the voice signal or sound level received by each of the microphones. In another preferred embodiment, the outputs of the microphones are combined as the input to the voice recognition unit. In still another embodiment, the user can select which microphone is used as the input to the voice recognition unit, for example, by a switch or by selection through a camera menu.


Automatic microphone selection is preferred and with reference to FIG. 2, microphones 10a and 10b are each amplified by amplifiers 20 and 22 respectively. Diode 24, capacitor 28 and resistor 32 form a simple energy detector and filter for microphone 10a. The output of this detector/filter is applied to one side of a comparator, 36. Similarly, diode 26, capacitor 30, and resistor 34 form the other energy detector associated with microphone 10b. The output of this filter/detector combination is also applied to comparator 36. Thus, the output of this comparator selects which amplified microphone output is passed to the voice recognition unit through multiplexer 38 based on which amplified microphone output contains the greatest energy.
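
The selection logic of FIG. 2 can be approximated in software. The following is only a minimal sketch under stated assumptions: the function names, the decay constant, and the tie-breaking rule are hypothetical and do not come from the disclosure, which describes an analog circuit rather than code.

```python
def envelope_energy(samples, decay=0.99):
    """Rectify and smooth a sample sequence, loosely mimicking a diode/RC detector.

    The decay constant is an illustrative assumption, not a value from the text.
    """
    level = 0.0
    for s in samples:
        rectified = abs(s)  # the diode keeps only the signal magnitude
        # the capacitor follows peaks and discharges slowly through the resistor
        level = max(rectified, level * decay)
    return level


def select_microphone(front_samples, back_samples):
    """Return (label, samples) for the channel with the greater detected energy."""
    front_level = envelope_energy(front_samples)
    back_level = envelope_energy(back_samples)
    # comparator step; sending ties to the front microphone is an assumption
    if front_level >= back_level:
        return "front", front_samples
    return "back", back_samples
```

For example, `select_microphone([0.1, -0.8, 0.3], [0.05, 0.1, -0.1])` selects the front channel because its detected level is higher.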


In yet another novel embodiment of this aspect of the invention, the multiple microphones are preferably associated with multiple voice recognition units or, alternatively, with different voice recognition algorithms well known in the art. The outputs of these multiple voice recognition units or different voice recognition algorithms are then coupled to the camera controller (FIG. 3 element 40). The camera controller preferably selects one of these outputs as being the camera controller's voice recognition input. Alternatively, the camera controller accepts the outputs of all the voice recognition units or algorithms and preferably uses a voting scheme to determine the most likely recognized command. This would obviously improve recognition rates and this aspect of the invention is contemplated to have utility beyond camera systems including, by way of example and not limitation, consumer computer devices such as PCs and laptops; portable electronic devices such as cell phones, PDAs, IPODs, etc.; entertainment devices such as TVs, video recorders, etc.; and other areas.


To illustrate this embodiment using the example of the camera system having microphones on its frontside and backside given above, each of these microphones is coupled to a voice recognition unit. When an utterance is received, each voice recognition unit recognizes the utterance. The camera controller then selects which voice recognition unit's recognition to accept. This is preferably based on the energy received by each microphone using circuitry similar to FIG. 2. Alternatively, the selection of which voice recognition unit to use would be a static selection. Additionally, both recognizers' recognition would be considered by the camera controller with conflicting results resolved by voting or using ancillary information (such as microphone energy content).
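
One way the camera controller might combine several recognizer outputs is sketched here. This is an illustration only: the function name, the tuple layout, and the rule of breaking ties by microphone energy are assumptions chosen to match the voting and ancillary-information language above, not a scheme specified by the disclosure.

```python
from collections import Counter


def resolve_command(results):
    """Pick one command from several recognizer outputs.

    results: list of (recognized_command_or_None, microphone_energy) pairs,
    one per voice recognition unit (a hypothetical data layout).
    """
    valid = [(cmd, energy) for cmd, energy in results if cmd is not None]
    if not valid:
        return None  # no unit recognized anything
    votes = Counter(cmd for cmd, _ in valid)
    top_count = max(votes.values())
    tied = [cmd for cmd, count in votes.items() if count == top_count]
    if len(tied) == 1:
        return tied[0]  # a clear majority wins
    # conflicting results: fall back to the unit whose microphone heard the most energy
    return max((pair for pair in valid if pair[0] in tied), key=lambda p: p[1])[0]
```

With two units that disagree, the result from the microphone with more energy is accepted; with three or more units, a simple majority decides first.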


An embodiment using multiple algorithms preferably has one voice recognition algorithm associated with the frontside microphone and, a different voice recognition algorithm associated with the backside microphone. Preferably, the voice recognition algorithm associated with the frontside microphone is adapted to recognize vocalizations uttered from afar (owing to this microphone probably being used in self-portraits), while the voice recognition algorithm associated with the backside microphone is optimal for closely uttered vocalizations. Selection of which algorithm is to be used as the camera controller input is preferably as above. Alternatively, as above, the selection would be by static selection or both applied to the camera controller and a voting scheme used to resolve discrepancies. While the above example contemplates using different voice recognition algorithms, there is no reason this must be so. The same algorithms could also be used in which case this example functions the same as multiple voice recognition units.


It is further contemplated in another aspect of the invention that the voice recognition subsystem be used in conjunction with the photograph storing hardware and software. In a preferred use of this aspect of the invention, the user utters names to be assigned to the photographs during storage and, later, utters them again for recall of the stored image. Thus, according to this aspect of the present invention, a stored photograph can be recalled for display simply by uttering the associated name of the photograph. The name association is preferably by direct association, that is, the name stored with the picture. In a second preferred embodiment, the photograph storage media contains a secondary file managed by the camera system and which associates the given (i.e., uttered) name with the default file name assigned by the camera system's storage hardware and/or software to the photograph when the photograph is stored on the storage media. According to the second embodiment, when a photograph is to be vocally recalled for viewing, the camera system first recognizes the utterance (in this case, the name) which will be used to identify the picture to be recalled. The camera system then scans the association file for the name which was uttered and recognized. Next, the camera system determines the default name which was given to the photograph during storage and associated with the user-given name (which was uttered and recognized) in the association file. The camera system then recalls and displays the photograph by this associated default name.
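
The association file of the second embodiment amounts to a lookup table from uttered names to default file names. A minimal sketch follows, assuming a JSON serialization and case-insensitive matching; the class name, method names, and the file name `IMG_0001.JPG` are hypothetical and are not taken from the disclosure.

```python
import json


class NameAssociations:
    """In-memory model of an association file: uttered name -> default file name."""

    def __init__(self):
        self._by_name = {}

    def assign(self, spoken_name, default_filename):
        # normalizing case and whitespace is an assumption about recognizer output
        self._by_name[spoken_name.strip().lower()] = default_filename

    def recall(self, spoken_name):
        """Return the default file name for a recognized name, or None if unknown."""
        return self._by_name.get(spoken_name.strip().lower())

    def to_json(self):
        """Serialize the table so it could be written to the storage media."""
        return json.dumps(self._by_name, sort_keys=True)


associations = NameAssociations()
associations.assign("John", "IMG_0001.JPG")
```

Recall then reduces to `associations.recall("john")`, which yields the default file name the camera would open for display.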


In another preferred embodiment, the voice recognition subsystem of the improved camera system recognizes at least some vocalized letters of the alphabet and/or numbers so that the user may assign names to pictures simply by spelling the name by vocalizing letters and/or numbers. Another aspect of the invention provides that stored photographs be categorized on the storage media through use of voice-recognized utterances being used to reference and/or create category labels and that, additionally, the recognizer subsystem preferably recognize key words for manipulating the stored pictures. For instance, according to this aspect of the invention, the inventive camera system would recognize the word “move” to mean that a picture is to be moved to or from a specific category. More specifically, “move, Christmas” would indicate that the currently referenced photograph is to be moved to the Christmas folder. An alternative example is “John move new year's” indicating that the picture named John (either directly named or by association, depending on embodiment) be moved to the folder named “New Year's”. It is further contemplated that the folder names may be used for picture delineation as well. For instance, the picture “John” in the Christmas folder is not the same as the picture “John” in the Birthday folder and the former may be referenced by “Christmas, John” while the latter is referenced by “Birthday, John”.
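
The “move” key word examples can be illustrated with a small parser over the recognized text. This is a sketch under assumptions: it presumes the recognizer delivers a text string, and the function name and return convention are hypothetical.

```python
def parse_move(utterance):
    """Return (picture_name_or_None, folder) for a recognized 'move' phrase, else None.

    A picture name of None stands for the currently referenced photograph.
    """
    words = utterance.replace(",", " ").split()
    lowered = [w.lower() for w in words]
    if "move" not in lowered:
        return None  # not a move command
    i = lowered.index("move")
    picture = " ".join(words[:i]) or None  # words before the key word name the picture
    folder = " ".join(words[i + 1:])       # words after the key word name the folder
    if not folder:
        return None  # a destination folder is required
    return picture, folder
```

Under these assumptions, “move, Christmas” targets the current photograph, while “John move new year's” names both the picture and the destination folder.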


Another aspect of the present invention provides that the voice recognition camera system be capable of associating more than one vocal utterance or sound with a single command. The different utterances are contemplated to be different words, sounds or the same word under demonstrably different conditions. As an example, the voice recognition camera system of this aspect of the present invention allows the inventive camera system to understand, for example, any of “shoot”, “snap”, “cheese”, and a whistle to indicate to the camera system that a picture is to be taken. In another example, perhaps the phrase and word “watch the birdie” and “click” instruct the camera to take the picture. It is further envisioned that the user select command words from a predetermined list of the camera command words and that he then select which words correspond to which command. It is alternatively envisioned that the association of multiple recognizable words to camera commands may also be predetermined or preassigned. In another alternate embodiment, the inventive camera system allows the user to teach the camera system which words to recognize and also inform the camera system as to which recognized words to associate with which camera commands. There are obviously other embodiments for associating recognized vocalizations to camera commands and the foregoing embodiments are simply preferred examples.
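
The many-to-one association between utterances and commands can be pictured as an alias table. The sketch below uses the example words from the paragraph above; the command identifier `take_picture`, the `<whistle>` token, and the function names are hypothetical placeholders.

```python
# Several recognized utterances map to one camera command (identifiers are illustrative).
COMMAND_ALIASES = {
    "take_picture": {"shoot", "snap", "cheese", "<whistle>", "watch the birdie", "click"},
}


def build_lookup(aliases):
    """Invert {command: utterances} into {utterance: command} for fast lookup."""
    lookup = {}
    for command, utterances in aliases.items():
        for utterance in utterances:
            lookup[utterance.lower()] = command
    return lookup


def command_for(utterance, lookup):
    """Return the camera command for a recognized utterance, or None if unassigned."""
    return lookup.get(utterance.strip().lower())


LOOKUP = build_lookup(COMMAND_ALIASES)
```

A user-taught vocabulary would simply add entries to the alias table before the lookup is rebuilt.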


In another embodiment of this aspect of the present invention, the user has his uttered commands recognized under demonstrably different conditions and recognized as being different utterances. For instance, according to this aspect of the invention, the voice operated camera system operates so that it understands commands vocalized close to the camera (as if the user is taking the picture in traditional fashion with the camera back to his face) and significantly farther away (as if the user is taking a self portrait picture and is part of the shot and thus has to vocalize loudly to the front of the camera). For this illustration, in a preferred embodiment the user teaches the words to the camera under the different conditions anticipated. For example, the user would teach the camera system by speaking the word “snap” close to the camera and inform the camera that this is a picture taking command and would then stand far from the camera and say “snap”, thus teaching another utterance, and instruct the camera that this is also a picture taking command. These two different utterances of the same word under different conditions would be stored and recognized as different utterances. This aspect of the invention contemplates that the words vocalized and/or taught need not be the same word and, as illustrated above, different words would also be considered different utterances as well.


Since voice recognition is not always 100 percent accurate, another aspect of the present invention contemplates that the camera system or a remote device, or both, preferably provide an indication that a voice command was or was not understood. Thus, using the self portrait example above, if the user vocalizes the command to take a picture but the camera system does not properly recognize the vocalization as being something it understands, the camera system would beep, or light an LED, etc. to indicate its misrecognition. Because of the relatively small number of anticipated camera commands and allowing for multiple vocalizations to command the same action, it is expected that the recognition rates will be quite high and fairly tolerant of extraneous noise without necessarily resorting to the use of a highly directional or closely coupled (to the user's mouth) microphone, though the use of such devices is within the scope of the invention.


It is anticipated that the user of the inventive camera system may be too far away from the camera system for the camera system to recognize and understand the user's vocalizations. Thus, another aspect of the invention provides that the camera is equipped with a small laser sensor (FIG. 1 element 18) or other optically sensitive device such that when a light of a given frequency or intensity or having a given pulse sequence encoded within it is sensed by the camera system equipped with the optically sensitive device, the camera system immediately, or shortly thereafter (to give the user time to put the light emitting device down or otherwise hide it, for example) takes a picture. The light emitting device is preferably a laser pointer or similar, stored within the camera housing when not needed so as to not be lost when not in use. Additionally, the light emitting device's power source would preferably be recharged by the camera system's power source when so stored. In another embodiment, it is also contemplated that the light emitting device may be housed in a remotely coupled display which is disclosed below. The light emitting device preferably includes further electronics to regulate the emitted light intensity or to encode a predetermined pulse sequence (on-off pulses for example) or otherwise onto the emitted light, all of which techniques are well known in the art, which the camera system of this aspect of the present invention would receive and recognize by methods well known in the art.
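
Detection of a predetermined on-off pulse sequence in the sensed light can be sketched as simple pattern matching. The disclosure leaves the encoding to methods well known in the art, so everything here is an assumption for illustration: one intensity sample per pulse slot, a fixed threshold, and the hypothetical function name.

```python
def matches_trigger(samples, pattern, threshold=0.5):
    """Return True if the thresholded light samples contain the on-off pattern.

    samples: light intensities, one per pulse slot (an assumed sampling scheme).
    pattern: expected sequence of 1 (light on) and 0 (light off).
    """
    bits = [1 if s >= threshold else 0 for s in samples]
    n = len(pattern)
    # slide a window over the bit stream looking for an exact match
    return any(bits[i:i + n] == list(pattern) for i in range(len(bits) - n + 1))
```

A steady light, such as ambient illumination, would not match a pattern containing off slots, which is the point of encoding a sequence rather than reacting to intensity alone.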


Another aspect of the present invention provides for there being a predetermined delay introduced between recognizing a voice command and the camera actually implementing the command. This aspect of the invention allows time, for example, for the user to close his mouth or for others in a self-portrait shot to settle down quickly before the picture is actually taken. In a first preferred embodiment of this aspect of the invention, the delay is implemented unconditionally for at least the picture taking command. In a second preferred embodiment of this aspect of the invention, the delay introduced is dependent upon from where the command came relative to the camera system. For instance, if the camera system recognized the command as coming from the frontside microphone, delay is used, but if the command comes from the backside microphone, then no delay is implemented. The simple energy detection circuitry of FIG. 2, described above is easily adapted for this function. In an alternative embodiment, implementation of the delay is dependent upon the location of the microphone due to the orientation of the flip-up or swivel LCD display when the microphone is attached to the LCD display (FIG. 1, element 12c). For example, if the microphone in the display sub-housing is oriented forward relative to the camera body then delay is implemented, if the microphone is not oriented forward then no delay is introduced. Determining the orientation of this microphone relative to the camera body is known in the art and would typically be done with switches or other sensor devices. Another preferred embodiment of this aspect of the invention implements the delay for only certain commands, such as the command to take a picture. In yet another preferred embodiment, whether the delay is implemented at all is selectable by the user.
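
Several of the delay embodiments above can be combined into one small policy function. This is a sketch only: the two-second default, the command identifier, and the source labels are hypothetical, and a real camera might implement just one of these embodiments.

```python
def command_delay_seconds(command, source, user_enabled=True, delay=2.0):
    """Return how long to wait before executing a recognized voice command.

    command: recognized command identifier (e.g. "take_picture", illustrative).
    source: "front" or "back", the microphone that heard the command.
    user_enabled: whether the user has left the delay feature switched on.
    delay: delay length in seconds (an assumed value, not from the text).
    """
    if not user_enabled:
        return 0.0  # delay is user-selectable
    if command != "take_picture":
        return 0.0  # delay applies only to certain commands
    # a front-microphone command suggests a self portrait, so allow time to settle
    return delay if source == "front" else 0.0
```

The `source` argument could come from an energy comparison between the microphones or from a sensor reporting the orientation of a swivel display.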


Another aspect of the present invention provides that the camera LCD display (FIG. 1, element 14) employs touch sensitive technology. This technology is well known in the computer art and can be any of resistive, capacitive, RF, etc touch technology. This aspect of the present invention allows the user to interact with menus, features and functions displayed on the LCD display directly rather than through ancillary buttons or cursor control. For those embodiments of touch technology requiring use of a stylus, it is further contemplated that the camera body house the stylus (FIG. 1, element 14a) for easy access by the user.


According to another aspect of the present invention, it is envisioned that the current dedicated LCD display (FIG. 1, element 14) incorporated on a digital camera be made to be removable and be extendable from the camera by cable, wireless, optical, etc. interconnection with the camera. In one embodiment, this remote LCD would be wire-coupled to receive display information from the digital camera through a pluggable port. In another embodiment, the remote LCD would be wirelessly coupled to the digital camera through any of several technologies well understood in the art including, by way of example only, Bluetooth, WIFI (802.11a/b/g/n), wireless USB, FM, optical, etc. In another embodiment of this aspect of the invention, the remotely coupled display would serve the dual purpose of being a remote input terminal to the camera system in addition to being a dedicated display for the camera system. Preferably, as mentioned earlier, the display is touch sensitive using any of the touch sensitive technology well understood in the art such as resistive, capacitive, RF, etc., methods mentioned above. Touch commands input by the user would be coupled back to the camera system as needed. It is also contemplated that the remote display house the stylus if one is required.


In another preferred embodiment, the remotely coupled display has buttons on it to control the camera system. In another embodiment, the remotely coupled display contains the microphone for receiving the voice commands of the user, digitizing the received voice, analyzing and recognizing the vocalization locally and sending a command to the camera system. In another preferred embodiment, the remotely coupled display containing the microphone simply digitizes the vocalization received by the microphone and transmits the digitized vocalization to the camera system for recognition of the vocalization by the camera system itself. In all embodiments of the wireless remote display, it is preferred that the display contain its own power source, separate from the power source of the camera. It is also contemplated that the display's separate power source may be coupled to the camera's power source when the display is ‘docked’ to the camera so that both may share power sources or so that the camera's power source may recharge the display's power source.


According to another aspect of the present invention, the electronic view finder (EVF) typically used on modern digital cameras includes a gaze tracking capability which is well known in the art, see for example U.S. Pat. No. 6,758,563 to Levola which is herein incorporated by reference. In this aspect of the present invention, menus typically used for user interface to the camera are electronically superimposed in the image in the EVF. The gaze tracker subsystem is operable for determining the area or approximate location of the viewfinder image at which the user is gazing. Thus, by the user looking at different areas of the EVF image, the gaze tracker subsystem informs the camera system so that a mouse-like pointer or cursor is moved by the camera system to the area of the EVF image indicated by the gaze tracking device to be the area the user is viewing. Preferably, the user then speaks a command to indicate his selection of the item pointed to by the pointer image. Alternatively, the user may indicate through other methods that this is his selection, such as staring at a position in the image for a minimum predetermined time or pressing a button, etc. As an example, the EVF displays icons for flash, shutter speed, camera mode, etc (alone or superimposed on the normal viewfinder image.) By gazing at an icon, a small compositely rendered arrow, cursor, etc., in the EVF image is caused by the gaze tracker subsystem to move to point to the icon at which the user is determined to be gazing by the gaze tracking subsystem, for instance, the camera mode icon as an example here. Preferably, the user then utters a command which is recognized by the camera system as indicating his desire to select that icon, for example, “yes” or “open”.


Alternatively, the icon is selected by the user gazing at the icon for some predetermined amount of time. When the icon is selected by whatever method, the EVF image shows a drop down menu of available camera modes, for example, portrait, landscape, fireworks, etc. The user, preferably, then utters the proper command word from the list or he may optionally gaze down the list at the mode he desires whereupon the gaze tracker subsystem directs that the pointer or cursor in the EVF image moves to the word and, preferably highlighting it, indicates that this is what the camera system thinks the user wants to do. The user, preferably, then utters a command indicating his acceptance or rejection of that mode in this example, such as ‘yes’ or ‘no’. If the command uttered indicates acceptance, the camera system implements the command; if the command indicates rejection of the selected command, the camera system preferably moves the pointer to a neighboring command. To leave a menu, the user may utter ‘end’ to return to the menu above or ‘home’ to indicate the home menu. Preferably, the user can also manipulate the pointer position by uttering commands such as “up”, “down”, “left” and “right” to indicate relative cursor movement. In this way, the user interacts with the camera in the most natural of ways, through sight and sound cooperatively. While the above example used the preferred combination of gaze and voice recognition, it is contemplated that gaze tracking be combined with other input methods such as pushing buttons (like a mouse click) or touch input disclosed below, or gesture recognition disclosed below, etc. as examples.
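
The cooperative use of gaze and voice on a drop down menu can be modeled as a small pointer state machine. This sketch is illustrative: the class and method names are hypothetical, the menu is treated as a single vertical list, and wrapping around at the ends of the list is an assumption.

```python
class MenuPointer:
    """Tracks which menu item the EVF pointer highlights."""

    def __init__(self, items):
        self.items = list(items)
        self.index = 0

    def gaze_at(self, index):
        """Move the pointer to the item the gaze tracker reports (clamped to the list)."""
        self.index = max(0, min(index, len(self.items) - 1))

    def hear(self, word):
        """Apply a recognized word; return the selected item on acceptance, else None."""
        if word == "yes":
            return self.items[self.index]  # acceptance: implement this item
        if word in ("no", "down"):
            # rejection moves to a neighboring item; wrap-around is an assumption
            self.index = (self.index + 1) % len(self.items)
        elif word == "up":
            self.index = (self.index - 1) % len(self.items)
        return None


menu = MenuPointer(["portrait", "landscape", "fireworks"])
```

For instance, gazing at the second item, saying “no”, and then saying “yes” selects the third item under these assumptions.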


Another application of this aspect of the invention uses gaze tracking to assist the auto focus (AF) capability of the prior art camera. AF generally has two modes: one mode uses the entire image, center weighted, to determine focus; another mode allows different areas of the image to have greater weight in determining focus. In the second mode, the user typically pre-selects the area of the framed image that he wishes to be over-weighted by the AF capability. This is cumbersome in that the user must predict where he wants the weighting to be ahead of time. Thus, this embodiment of this aspect of the invention provides that the gaze tracker subsystem inform the AF capability of the camera system as to the location of the image upon which the user is gazing and that the AF capability use this information to weight this area of the image when determining focus. It is contemplated that the AF system may only provide for discrete areas of the image to be so weighted and in this case, preferably, the AF capability selects the discrete area of the image closest to that being gazed upon.
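
Selecting the discrete AF area closest to the gaze point is a nearest-neighbor lookup. In this sketch the normalized coordinates, the area names, and the use of Euclidean distance are assumptions for illustration.

```python
def nearest_af_area(gaze_xy, af_areas):
    """Return the name of the discrete AF area whose center is closest to the gaze point.

    gaze_xy: (x, y) gaze location in normalized image coordinates (assumed 0..1).
    af_areas: {name: (x, y)} centers of the selectable AF areas (hypothetical layout).
    """
    gx, gy = gaze_xy

    def squared_distance(name):
        ax, ay = af_areas[name]
        return (ax - gx) ** 2 + (ay - gy) ** 2

    return min(af_areas, key=squared_distance)


AF_AREAS = {"left": (0.2, 0.5), "center": (0.5, 0.5), "right": (0.8, 0.5)}
```

A gaze point at (0.75, 0.4) would then weight the right-hand area when determining focus.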


Another embodiment of this aspect of the invention uses the gaze tracker to enable the flash of the camera system. Flash is commonly used to “fill” dimly lit photographic scenes but sometimes this is not warranted. Other times, it is desired to have “fill” flash because the area of the scene desired is dark but the rest of the scene is quite bright (taking a picture in shade for example) and the camera does not automatically provide “fill” flash because the overall image is bright enough. Typically, the amount of “fill” flash the camera will give is determined by the camera measuring the brightness of the scene. The inventive camera system with gaze tracking is used to enhance the prior art method of determining the desire and amount of “fill” flash in that the inventive camera system gives more weight, in determining the scene brightness, to the area of the scene indicated by the gaze tracker as being gazed upon.
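
Gaze-weighted scene brightness can be expressed as a weighted average over metering cells. The sketch below is one possible reading, not the disclosed method: the grid of luminance cells, the weight of 4, the neighborhood radius, and the flash threshold are all assumed values.

```python
def weighted_brightness(luma, gaze_rc, gaze_weight=4.0, radius=1):
    """Average scene brightness, giving extra weight to cells near the gaze point.

    luma: 2-D list of cell brightness values in 0..1 (an assumed metering grid).
    gaze_rc: (row, col) of the cell being gazed upon.
    """
    total = 0.0
    weight_sum = 0.0
    for r, row in enumerate(luma):
        for c, value in enumerate(row):
            near = abs(r - gaze_rc[0]) <= radius and abs(c - gaze_rc[1]) <= radius
            w = gaze_weight if near else 1.0
            total += w * value
            weight_sum += w
    return total / weight_sum


def needs_fill_flash(luma, gaze_rc, threshold=0.35):
    """Decide on fill flash from the gaze-weighted brightness (threshold is assumed)."""
    return weighted_brightness(luma, gaze_rc) < threshold


# A mostly bright scene with a shaded corner at the top left.
SCENE = [
    [0.1, 0.1, 0.9],
    [0.1, 0.1, 0.9],
    [0.9, 0.9, 0.9],
]
```

Gazing at the shaded corner of `SCENE` requests fill flash, while gazing at the bright corner does not, even though the unweighted average is the same in both cases.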


Another aspect of the present invention adds touchpad technology to the prior art camera system. Use of the word ‘touchpad’ throughout this disclosure should be construed to mean either the touchpad itself or the touchpad with any or all of a controller, software, associated touchpad electronics, etc. This touchpad technology is similar to the touchpad mouse pad used on laptop computers which is also well understood in the computer art. In a first preferred embodiment, the EVF (or LCD display) displays the menus as above and the user moves the cursor or mouse pointer around this image by use of his finger on the touchpad. This operation is virtually identical to that of the mouse in laptop computers and is well understood in the art. Preferably, the touch pad is mounted on the top of the camera at the location typically used for the shutter button (FIG. 1 element 12a). It is also preferred that the touchpad software implement ‘tapping’ recognition, also well known in the art, so that the user may operate the shutter button, make a selection, etc. simply by tapping the touchpad with his index finger, much the same way modern laptop driver software recognizes tapping of the touchpad as a click of the mouse button. It is also currently preferred that tapping recognition is used to make selections on the menus shown in the EVF, LCD display, or otherwise.


Another application of this aspect of the invention uses the touchpad to inform the camera system to zoom the lens simply by the user stroking his finger from front to back (for example, to zoom) or back to front over the touchpad (for example, to wide angle). For this aspect of the present invention, a preferred embodiment has the touchpad on the barrel of the lens. This is a most natural way to control zoom since the movement of the finger is a gesture with the user ‘pulling’ the object to be photographed closer (front to back stroke means zooming) or ‘pushing’ the object to be photographed away (back to front stroke means wide angle). According to another aspect of the invention, the touchpad replaces the shutter button functionality and the preferable location for this embodiment is top mounted. Preferably, the touchpad is tapped once to focus the camera and/or lock the AF and tapped a second time to trip the shutter. Alternatively, the inventive camera system simply senses the person's touch of the touchpad, auto focuses the camera and/or locks the focus or provides continual focusing while the person's touch is sensed and wherein a tap of the touchpad then trips the shutter. Preferably, the camera system enforces a maximum amount of time that the AF may be locked so that action photographs will not be badly focused. Automatically locking the AF settings for a maximum predetermined time after AF activation or continuously focusing upon AF activation is also applicable to the prior art AF button activation method described below. While a computer-like touchpad was used to illustrate the above preferred embodiments of this aspect of the invention, the touch sensitive input device could be comprised of other structure, for instance, the aforementioned touch-sensitive LCD display. Also, throughout this disclosure, the word ‘continuous’ (and its variants, e.g., continually, etc.) should be construed to mean discretely continuous in addition to its analogue-world definition.
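The two-tap shutter operation with a maximum AF lock time, and the stroke-to-zoom gesture, can be sketched as follows. This is only an illustrative sketch: the class and function names, the 3-second lock limit, the normalized barrel coordinate (0.0 at the front of the lens, 1.0 at the back), and the minimum stroke length are assumptions not found in the disclosure.

```python
class TouchShutter:
    """First tap focuses and locks AF; a second tap within the lock window trips the shutter."""

    def __init__(self, max_lock_seconds=3.0):
        self.max_lock_seconds = max_lock_seconds  # hypothetical limit
        self.locked_at = None  # time of the tap that locked AF, if any

    def tap(self, now):
        lock_is_fresh = (
            self.locked_at is not None
            and (now - self.locked_at) <= self.max_lock_seconds
        )
        if lock_is_fresh:
            self.locked_at = None
            return "take_picture"
        # No lock yet, or the lock has expired: refocus and lock again.
        self.locked_at = now
        return "focus_lock"


def zoom_direction(start_pos, end_pos, min_travel=0.1):
    """Map a finger stroke along the lens barrel to a zoom command.

    Front-to-back strokes (increasing position) zoom in; back-to-front
    strokes go to wide angle; very short strokes are ignored.
    """
    travel = end_pos - start_pos
    if abs(travel) < min_travel:
        return None
    return "zoom_in" if travel > 0 else "wide_angle"
```

Treating an expired lock as a request to refocus is one possible design choice; it keeps a stale focus setting from being used for an action photograph.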


In a second preferred embodiment of this aspect of the invention, the touchpad is placed on the back of the camera (FIG. 1 element 12b) and is operable for manipulating the cursor and menus shown on the LCD or EVF display. This provides a much more natural and computer-like interface to the camera system. It is also contemplated that either embodiment of this aspect of the invention may be coupled with voice recognition so that the user may interact with the camera by touchpad manipulation in combination with voice commands. Additionally, combined with gaze tracking, the user can interact with the camera through touch, voice, and gaze (i.e., sight) to manipulate menus, control the camera system, compose the shot, focus, zoom, enable/disable flash, select macro or panoramic camera modes, etc.


One of the most annoying properties of the modern digital camera is the shutter delay that occurs when a picture is taken. That is, the delay between the user depressing the shutter button and the camera actually taking the picture. This delay can be as much as one second on some modern digital cameras and is typically due to the camera focusing and then taking the picture after the shutter button is depressed. One solution to this implemented by prior art cameras is for the camera to sense when the shutter button is depressed half way, then focus and lock the AF settings of the camera while the shutter button remains half way depressed, so that when the user depresses the shutter button the rest of the way, the picture is taken almost instantaneously. This solution is more often than not misused or misunderstood by novice users or those who do not use their camera regularly and can also result in blurred action photographs. Thus, one aspect of the present invention provides that the viewfinder be coupled to a unit for detecting when the user's eye is viewing through the viewfinder. When viewfinder use is detected, the inventive camera system preferably enables the auto focus system to continually focus thus ensuring that the shot is focused when the camera system is commanded to take a picture. Preferably, the gaze tracker is used for this determination though this aspect of the invention may be implemented without gaze tracking.


In a preferred embodiment of this aspect of the invention without gaze tracking, the viewfinder is equipped with a small light emitting device and a light detection device both well known in the art. With reference to FIG. 4, the light emitting device, 70, emits a frequency or frequencies of light some of which is reflected from the eyeball when a user is viewing through the viewfinder, 74. The light detection device, 72, is operable for sensing this reflected light and an amplifier (not shown) coupled to device 72, amplifies the signal from the light detection device, 72. Obviously, if there is no one viewing through the viewfinder, then there will be no reflected light from the eyeball and the amplifier output will be near ground. However, when a person peers into the viewfinder, light will be reflected from his eyeball and the output of the amplifier will be significantly larger. Thus, this system and method provides a way for detecting the use of the viewfinder by the user without providing gaze tracking ability. It is contemplated that this system and method be used with both EVF and optical (i.e., traditional) viewfinders and that viewport, 76, may be an LCD, optical lens, etc. Shroud 78 typically included on modern viewfinders helps to improve viewfinder use detection by cutting down on extraneous light reaching device 72 when the user is viewing through the viewfinder. It should be noted that the location of elements 70 and 72 in FIG. 4 is exemplary only and other placements of these elements are within the scope of this aspect of the invention. 
While the above embodiment of this aspect of the invention relied on eyeball reflectivity, in an alternate embodiment it is contemplated that the viewfinder use detection can be made with a light source and light detector juxtaposed wherein the eye interrupts the light between the two thus indicating viewfinder use, or that the shroud be fitted with a touch sensor around its outer ring that would sense the person's contact with the shroud when the viewfinder is in use. Additionally, it is contemplated that embodiments of this aspect of the invention may employ filters or other structures to help minimize false viewfinder use detection due to sunlight or other light sources shining on detector 72 when a user is not viewing through the viewfinder.
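The reflectivity-based detection of viewfinder use can be sketched as a simple threshold test on the amplifier output. The normalized signal scale, the 0.5 threshold, the requirement of three consecutive readings, and the function name are assumptions for illustration; the consecutive-reading requirement is one hypothetical software means of reducing false detection from stray light, alongside the filters mentioned above.

```python
def viewfinder_in_use(samples, threshold=0.5, min_consecutive=3):
    """Return True once enough consecutive amplifier readings exceed threshold.

    `samples` are amplifier output readings, near 0.0 ("near ground") when no
    eye is present and significantly larger when light reflects from an eye.
    """
    run = 0
    for value in samples:
        run = run + 1 if value >= threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

A result of True could then be used to enable continual auto focus, as described above.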


Another aspect of the present invention is to employ a wink-detector as part of the viewfinder of the camera. Preferably, the gaze tracker is modified for this purpose. Alternatively, the previously disclosed viewfinder use detector may also be employed. All that is required is to additionally detect the abrupt change in reflected light from the eye that would be caused by the eyelid wink. The wink-detector is contemplated to be used for shutter trip and/or AF activation or lock among other things. It is contemplated that it be used in the aforementioned application wherein the menus of the camera are displayed on the EVF. In this case, the wink detector preferably acts as a user selection detector device in that the user may select an item pointed to by the gaze tracker pointer or that is otherwise highlighted by the gaze tracker simply by winking. It is contemplated that the detected wink would preferably function in the camera system similarly to a left mouse click on a computer system when dealing with menus and icons. In this way, the camera system with wink detector of this aspect of the present invention becomes an optical gesture-recognizing camera wherein the gesture is optically received and electronically recognized (gesture recognition is also contemplated to be used in the touchpad software as described above.)


In an enhancement of this aspect of the invention, the wink detector subsystem discriminates between a wink and a blink by preferably determining the amount of time taken by the wink or blink. If the amount of time taken for the gesture (blinking or winking) is below a certain threshold, the gesture is considered a blink and disregarded.
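The duration test can be sketched as below, under the assumption that short eyelid closures are involuntary blinks to be ignored and longer closures are deliberate winks that act as a selection. The 300 ms threshold and the function names are hypothetical values chosen for illustration, not figures from the disclosure.

```python
def classify_eye_closure(duration_ms, blink_max_ms=300):
    """Label an eyelid closure by its duration (threshold is hypothetical)."""
    return "blink" if duration_ms < blink_max_ms else "wink"


def handle_eye_closure(duration_ms, on_select, blink_max_ms=300):
    """Invoke the selection callback only for a wink; disregard blinks."""
    if classify_eye_closure(duration_ms, blink_max_ms) == "wink":
        on_select()
        return True
    return False
```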


Once a user of a camera has taken pictures, typically he will wish to print or otherwise develop the pictures for viewing, framing, etc. Another aspect of the present invention provides for simpler photo offloading from the modern digital camera when a set of predetermined conditions, such as day, time, number of pictures to offload, etc., are met. The camera system preferably includes the ability for the user to indicate to the camera which pictures to offload so that the camera offloads only those pictures that are so indicated by the user. In a first preferred embodiment of this aspect of the invention, the camera system is internally equipped with a wireless interface, coupled to the camera controller, for interfacing directly to a photo printer or other photo rendering device. Currently preferred is WIFI (i.e., IEEE 802.11a/b/g/n) with alternatives being Bluetooth or wireless USB, all of which are known in the art. By connecting via WIFI, the inventive camera system can preferably access other devices on the LAN associated with the WIFI for the storing of pictures onto a computer, network drive, etc. In addition, preferably, devices on the network can access the camera system and the pictures within it directly and also access camera settings, upload new software or updates to the camera system, etc. Since one of the big complaints with wireless technology for small devices is the often-obtrusive antenna, it is greatly preferred for this aspect of the invention that the wireless hardware including antenna be completely contained within the body of the camera system.


In a second preferred embodiment of this aspect of the invention, the inventive camera system is equipped with software and hardware coupled to the camera controller allowing independent communication with a computer network for the primary purpose of communicating its pictures over the internet. Currently preferred is WIFI which is typically connected by LAN, routers, etc. to the internet and which usually allows WIFI-equipped devices to independently connect to the internet. Alternatively, the invention contemplates the use of wired LAN, cellular data networks, etc. as the interconnection technology used by the inventive camera system. The inventive camera system is further preferably equipped with a microbrowser that runs on the inventive camera system's camera controller which is preferably a microprocessor. It is contemplated that some embodiments may not require a microbrowser (see enhancement below). Design and operation of microbrowser-equipped electronic devices for use with the internet is well known in the art and need not be discussed further. The camera system LCD display serves the purpose of displaying internet webpages when the user is navigating the internet in addition to its function as the camera display. So equipped, the inventive camera system can now independently upload its pictures to any of the internet-based photo printing services, such as those provided by Walmart.com, Walgreens.com, Kodak.com, etc., without the need for first storing the photos to a computer system and then connecting the computer system to the internet to upload the pictures. Use of these internet services for printing photos is preferred by many over use of a home photo printer because of the convenience, ease, availability, quality and lower per-picture printing costs. 
Providing the novel combination of a high photo-quality camera system with direct access to the internet according to this aspect of the present invention will further improve the utility of the camera system and these services.


In an enhancement to the above-disclosed embodiments of this aspect of the invention, the inventive camera system is operable for being instructed to automatically initiate a connection to the internet, LAN, printer, etc. whenever the predetermined conditions are met and it is in range of the network connection, (e.g., WIFI, Bluetooth, wireless USB, wired LAN, etc). Once the transmittal of the pictures is complete, the inventive camera system preferably terminates the connection. Additionally, the inventive camera system is preferably operable so that the automatic connection is made only at certain times of the day or weekends, etc., so as to confine picture transmission to periods of low network usage or periods of cheaper network access, etc. Also, it is currently preferred that the user be queried to allow the automatic connection though this is obviously not required and the connection can be made completely autonomously. Thus, in the first embodiment above, the inventive camera system automatically sends its pictures to a printer or other device on the LAN for printing or for remotely storing the pictures in the inventive camera system, whenever the inventive camera system is in range of the LAN network connection and connection can be made. In the second embodiment above, the inventive camera system automatically connects to the internet preferably via WIFI, although cellular network, etc. connection is also contemplated, when it has a predetermined number of pictures and can so connect, and will send the pictures to virtually any internet destination without user intervention. For example, the inventive camera system can be instructed to automatically send the pictures to an email account, internet picture hosting site, web-based photo printing site, the user's internet-connected home computer (when he is on vacation, for instance), etc. 
In this way, valuable pictures are immediately backed-up and the need for reliance on expensive camera storage media like flash cards, SD, etc. is greatly reduced.
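The predetermined conditions for an automatic connection (a minimum number of pictures, a permitted time window, and the network being in range) can be sketched as a single check. The function name, the default of 10 pictures, and the 1:00 to 5:00 window are assumptions chosen purely for illustration.

```python
from datetime import time


def should_auto_offload(pending_pictures, now, in_range,
                        min_pictures=10,
                        window_start=time(1, 0),
                        window_end=time(5, 0)):
    """Decide whether to initiate an automatic connection to offload pictures.

    `now` is the current time of day; the window may wrap past midnight.
    """
    if not in_range or pending_pictures < min_pictures:
        return False
    if window_start <= window_end:
        return window_start <= now < window_end
    # Window wraps midnight, e.g. 22:00 to 06:00.
    return now >= window_start or now < window_end
```

A user query before connecting, where desired, could be added as a further condition after this check returns True.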


Many prior art digital cameras can now record images continuously at 30 frames per second (i.e., take movies) along with sound. Thus, a prior art camera having an internet connection capability as herein taught combined with well known and straightforward editing methods enables inventive on-camera movie composition. According to this aspect of the invention, the inventive camera records a series of images, (e.g., a movie) and then the user downloads an MP3 file (i.e., a sound file) from a network (e.g., internet) source to be associated with the movie taken so that when the movie is played, the MP3 file also plays. Alternatively, the MP3 content is embedded in the movie, either as is, or re-encoded. Additionally, the user may download other movie material or still images via the network connection for insertion in the camera-recorded movie or for the replacement of certain individual camera-taken “frames” in the movie.



FIG. 3 shows an exemplary functional block diagram of the improved camera system according to various aspects of the present invention. The figure shows one possible exemplary embodiment contemplated and the figure should not be used to limit the teaching of this disclosure to a certain implementation, embodiment, combination of aspects of the present invention, or otherwise.


Another aspect of the present invention provides that prior art features of the cell phone (FIG. 3, element 46a) are combined so that voice control of the camera in the cell phone can be accomplished. Many modern cell phones incorporating cams also provide voice recognition-driven dialing. Therefore, the functionality necessary for recognizing vocalizations within a cellular communication device exists in the art but has not been applied to the cell phone camera. This aspect of the present invention couples the voice recognition unit of a cell phone to the camera control unit of the cell phone either directly or via the cell phone controller, thus enabling voice control of the cell phone camera. Preferably, when recognizing a vocalization, the cell phone controller programming would also include the step of determining if the recognized vocalization was for camera control, or for dialing. Such determination would preferably be by reserving certain recognized keywords to be associated with camera functions (e.g., snap, shoot, etc). Alternatively, the cell phone may be explicitly placed into camera mode so that it is known ahead of time that recognized utterances are for camera control.
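The step of determining whether a recognized vocalization is for camera control or for dialing can be sketched with a table of reserved keywords. The keywords "snap" and "shoot" come from the text above; the function name, the return format, and the handling of unrecognized words in camera mode are assumptions for illustration.

```python
# Reserved keywords; several different words map to the same camera command.
CAMERA_KEYWORDS = {
    "snap": "take_picture",
    "shoot": "take_picture",
}


def route_utterance(word, camera_mode=False):
    """Return (destination, payload) for a recognized word.

    Reserved keywords always go to the camera. When the phone has been
    explicitly placed in camera mode, other words are not sent to the dialer.
    """
    normalized = word.strip().lower()
    if normalized in CAMERA_KEYWORDS:
        return ("camera", CAMERA_KEYWORDS[normalized])
    if camera_mode:
        return ("camera", None)  # not a known camera command
    return ("dialer", normalized)
```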


Cell phones being so light and without much inertia are hard to steady and the fact that the user must push a button on something so light makes it even harder to keep steady particularly given the small size of the shutter button on some cell phones. This aspect of the present invention would make picture taking on cell phones simpler and more fool proof.


Another aspect of the invention provides that the prior art voice recognition unit of the cell phone be adapted to recognize at least some email addresses when spoken. Another aspect of this inventive adaptation is to adapt the cell phone voice recognizer to identify the letters of the alphabet along with numbers and certain key words, for example, “space”, “underscore”, “question mark”, etc., so that pictures may be named when stored by spelling, for example. This aspect of the invention is contemplated to serve the dual purpose of being usable for text messaging or chat text input on the cell phone in addition to picture labeling.
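Naming a picture by spelling can be sketched as a mapping from recognized tokens to characters. The key words "space", "underscore", and "question mark" come from the text above; the digit words, the function name, and the error handling are assumptions for illustration, and the recognizer that produces the tokens is outside the scope of this sketch.

```python
KEY_WORDS = {"space": " ", "underscore": "_", "question mark": "?"}
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def spell_to_text(tokens):
    """Convert a sequence of recognized spoken tokens into a text string."""
    chars = []
    for token in tokens:
        t = token.lower()
        if t in KEY_WORDS:
            chars.append(KEY_WORDS[t])
        elif t in DIGIT_WORDS:
            chars.append(DIGIT_WORDS[t])
        elif len(t) == 1 and t.isalpha():
            chars.append(t)  # a single spoken letter
        else:
            raise ValueError(f"unrecognized token: {token!r}")
    return "".join(chars)
```

The same conversion could serve text messaging or chat input as well as picture labeling.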


Additionally, other aspects of the present invention taught for the improved camera system are applicable to the improved cell phone herein disclosed particularly the aspect of the present invention associating multiple different utterances to a single command. The aspect of the invention allowing for automatic connection to a LAN or the internet is also contemplated for use with cell phone cameras. This aspect of the invention ameliorates the prior art storage space limitation which severely hampers the utility of the cell phone camera. Cellular service providers typically charge a fee for internet access or emailing and so an automatic feature to connect to the net or send email for the purposes of transmitting pictures can improve revenue generation for these companies.


The embodiments herein disclosed for the various aspects of the present invention are exemplary and are meant to illustrate the currently preferred embodiments of the various aspects of the invention. The disclosed embodiments are not meant to be exhaustive or to limit application of the various aspects of the invention to those embodiments so disclosed. There are other embodiments of the various aspects of the present invention that are within the scope of the invention. Additionally, not all aspects of the invention need to be practiced together; it is contemplated that subsets of the disclosed aspects of the present invention may be practiced in an embodiment and still be within the scope of the present invention. For instance, one contemplated embodiment combines a touch sensitive shutter button with a viewfinder use detector so that focusing is only accomplished when both the shutter button is touched and viewfinder use is detected. Another embodiment contemplated is to use the viewfinder use detector to automatically turn the EVF on and the LCD display off when viewfinder use is detected instead of the prior art method of pressing a button which typically toggles which of the two is on and which is off. Still another contemplated embodiment applies the touch gesture recognition typically used with the computer-like touchpad technology to a touch sensitive display, such as the touch sensitive LCD of the camera and other devices herein disclosed that utilize an LCD display. Combining various aspects of the invention herein disclosed, such as voice recognition, touch input, gaze tracking, etc., for camera control provides much more natural and human interfacing to the camera system for the control of camera menus, camera features, camera options, camera settings, commanding picture taking, enabling flash, etc.


Another alternative embodiment for the disclosed aspects of the present invention is to use the disclosed touchpad with or without supporting input gesture recognition with cellular phones, other cellular devices, Apple Computer Inc.'s Ipod MP3 player, etc., with the computer-like touchpad replacing some or all of the buttons on devices. Touch input with or without touch-based gesture recognition would be an ideal replacement for Apple's Ipod click wheel interface. The touch pad would preferably be made round (alternatively, it would be rectangular with the housing of the device providing a round aperture to the touchpad device) and simply by skimming a finger over or touching the touchpad at the appropriate places on the touch pad, the Ipod would be commanded to perform the proper function such as raising or lowering the volume, fast forwarding, slowing down replay, changing the selection, etc. This type of round touchpad is also contemplated for use on cell phones to simulate the old-fashioned rotary dial action or placement of digits. The user touches the pad at the appropriate place around the circumference of the touch pad to select digits and enter them and then makes a dialing motion (stroking a thumb or finger around the circumference of the touchpad) to begin the call or touches the center of the pad to begin the call. Round pattern dialing is easily done with the thumb when the phone is being single-handedly held. With reference to FIG. 5, in another embodiment, the touchpad, 94, is further contemplated to be fitted with a solid overlay having 2 or more cutouts over its surface (the solid overlay with cutouts is preferably part of the cell phone or other device's housing and alternatively, the solid overlay, 90, with cutouts, 92, is applied to the touchpad surface separately) that only allows for certain areas of the touchpad to actually be touched to assist the user in assuring that only certain well-defined areas of the touchpad are touched. 
This greatly reduces the software detection requirements for the touchpad interface software since now the software need only detect when a certain defined area is touched and assigns a specific function to that touched area and reports that to the device controller. That is, the cutout areas would essentially be soft keys but without there being a plurality of different keys, instead, simply different soft key locations on the same touchpad but delineated physically so that certain other areas of the touchpad simply cannot be touched. It is further contemplated that, in many instances, the cutouts can be made large enough so that finger-stroke gestures can still be made and discerned. Because of the nature of modern mouse-like touchpad technology and how it works, the firmness of a person's touch that actually registers as a touch can also be provided for by software and this feature is also contemplated for use herein. Additionally, the touchpad, covered by a solid overlay with cutouts, would be recessed below the upper surface of the overlay (by as much as desired) helping to minimize false touches. This would be a much cheaper input gathering structure and would replace some or all of the many buttons and joystick-like controller of the cell phone, Ipod, camera, etc. It is contemplated that a few generic touchpad shapes and sizes could be manufactured and serve a host of input functions, replacing literally tons of buttons and switches, since now the solid overlay with cutouts on top of the touchpad defines the areas that can be touched or gestured (see exemplary drawings of FIG. 5(b) and FIG. 5(c)), and touchpad software, well understood in the art, defines what meaning is ascribed to these touched locations and gestures and what degree of firmness of touch is required to actually register the touch. Tapping and gesture (i.e., a finger stroke) recognition would further extend this new input-gathering device capability but is not required. 
This new input-gathering device can be used to replace all or some of the buttons or joystick-like controllers on cell phones, portable electronic devices, cordless phones, mp3 players, PDAs, cameras, calculators, point of sales terminals, computers, computer monitors, game controllers, radio, stereos, TV, DVD players, set-top boxes, remote controls, automobile interfaces, appliances, household light and appliance switches, etc. Additionally, use of an overlay with cutouts is not absolutely necessary to practicing the above teachings. Similar functionality can be accomplished by simply embedding, embossing, or surface applying area-delineating markings, preferably with labels, to the touchpad itself and allowing software to accept only those touches that occur in these defined areas and to give the labeled meaning to these areas when so touched. However, use of an overlay with cutouts is currently greatly preferred because of the tactile delineation of areas it provides.
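The reduced software requirement for a touchpad with delineated areas can be sketched as a lookup from touch coordinates to a named soft key, with a minimum firmness required to register a touch. The rectangular areas, the normalized pressure scale, the 0.3 firmness threshold, and all names are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TouchArea:
    """A rectangular cutout or marked area on the touchpad (hypothetical layout)."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x, y):
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


def resolve_touch(areas, x, y, pressure, min_pressure=0.3):
    """Return the function name for a touch, or None if it should be ignored.

    Touches that are too light, or that fall outside every defined area,
    are not reported to the device controller.
    """
    if pressure < min_pressure:
        return None
    for area in areas:
        if area.contains(x, y):
            return area.name
    return None


# Example layout: two soft keys on one touchpad.
AREAS = [
    TouchArea("volume_up", 0.0, 0.0, 0.4, 0.4),
    TouchArea("volume_down", 0.6, 0.0, 1.0, 0.4),
]
```

With a physical overlay, touches outside the cutouts cannot occur; with printed or embossed markings instead, the same lookup simply rejects them in software.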


Returning to the Ipod example, because of the large memory currently available with the Ipod, it is also contemplated that a digital camera, similar to a cell phone's camera, be embedded in the Ipod and coupled to the Ipod controller and this inventive Ipod be operable for taking pictures and storing the pictures in the Ipod's memory. Another alternate embodiment for the disclosed aspects of the present invention is to use the viewfinder use detector, gaze tracker, and/or the disclosed internet connectability, herein described, in a video camera. As with the camera system disclosure, the viewfinder use detector can be used to enable or disable various aspects of the video camera system, such as turning the LCD display off when viewfinder use is detected. Gaze tracking is contemplated to be used to assist the video camera focusing or used to guide and select menu items. Internet connectability is contemplated to be used to download sound or image files for editing or for uploading video recorded for editing or remote storage of the video images.


It is further contemplated that certain aspects of the presently disclosed invention have application beyond those disclosed herein. For instance, various voice recognition aspects of the present invention, such as use of a plurality of microphones or multiple different vocal utterances associated with the same command or delayed implementation of a command which corresponds to a recognized vocalization, are contemplated to have utility for many of the devices herein referenced and are anticipated to be incorporated therein. As an example, automatically connecting to the internet when a set of predetermined rules or conditions (such as time, date, status of equipment, etc) is met would be useful for the download/upload of information from/to the internet, like music, video, etc. for processing, storage, transmission to another party, etc. Those skilled in the art will undoubtedly see various combinations and alternative embodiments of the various aspects of the present invention herein taught but which will still be within the spirit and scope of the invention.

Claims
  • 1. A camera system comprising: (a) a lens; (b) an image sensor configured to take pictures; (c) a controller including a control program having instructions to control and respond to a voice recognizer, the voice recognizer (i) configured to use speaker-independent voice-recognition and (ii) coupled to one or more microphones; (d) a non-volatile memory that stores speaker-independent information representative of a camera system-provided list of two or more English words that is designed to enable the voice recognizer to recognize the words when any user of the camera system speaks them, wherein each English word: (i) has its own dictionary entry and (ii) is simultaneously assigned by the control program to trigger execution of an identical sequence of operations, wherein the sequence of operations takes a picture using approximately the same delay.
  • 2. The camera system of claim 1, wherein the delay is approximately no delay.
  • 3. The camera system of claim 1, wherein the picture is a self-portrait.
  • 4. The camera system of claim 1, wherein the delay is specific to a self-portrait camera mode.
  • 5. The camera system of claim 1, wherein one of the English words is “snap.”
  • 6. The camera system of claim 1, wherein one of the English words is “cheese.”
  • 7. The camera system of claim 1, wherein one of the English words is “shoot.”
  • 8. The camera system of claim 1, wherein one of the English words is “click.”
  • 9. The camera system of claim 1 further comprising an integrated cell phone, and wherein the voice recognizer is configured to recognize at least one spoken word to control dialing of the cell phone.
  • 10. The camera system of claim 1 further comprising an integrated cell phone, and wherein the voice recognizer is configured to recognize spoken words in association with text messaging over the cell phone.
  • 11. The camera system of claim 1 further comprising a touch-sensitive display.
  • 12. The camera system of claim 11 wherein the touch-sensitive display is configured to control zoom operations by detecting a finger sliding on the touch-sensitive display.
  • 13. The camera system of claim 11 wherein the touch-sensitive display is configured to display menu options.
  • 14. The camera system of claim 13 wherein the menu options are configured to be super-imposed over a photographic scene.
  • 15. The camera system of claim 11 further comprising a camera body and a stylus housed within the camera body.
  • 16. The camera system of claim 1, wherein the controller is further configured with a picture editor for creating and storing a picture sequence file in the non-volatile memory, the picture sequence file comprising: (1) a first picture taken with the image sensor and stored in the non-volatile memory; (2) a second picture taken with the image sensor and stored in the non-volatile memory; (3) data from a sound file downloaded via a network interface of the camera system and stored in the non-volatile memory.
  • 17. The camera system of claim 16, wherein the controller is further configured to upload the picture sequence file to a remote picture hosting service via the network interface.
  • 18. The camera system of claim 1, wherein the voice recognizer's recognition rate is less than one hundred percent.
CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of pending application Ser. No. 14/614,515, filed Feb. 5, 2015, which claims the benefit of pending application Ser. No. 14/539,687, filed Nov. 12, 2014, which claims the benefit of application Ser. No. 14/495,976 (now issued U.S. Pat. No. 8,917,982), filed Sep. 25, 2014, which claims the benefit of application Ser. No. 14/453,511 (now issued U.S. Pat. No. 8,923,692), filed Aug. 6, 2014, which claims the benefit of application Ser. No. 14/315,544 (now issued U.S. Pat. No. 8,897,634), filed Jun. 26, 2014, which claims the benefit of application Ser. No. 14/203,129 (now issued U.S. Pat. No. 8,818,182), filed Mar. 10, 2014, which claims the benefit of application Ser. No. 13/717,681 (now issued U.S. Pat. No. 8,831,418), filed Dec. 17, 2012, which claims the benefit of application Ser. No. 13/087,650 (now issued U.S. Pat. No. 8,467,672), filed Apr. 15, 2011, which claims the benefit of application Ser. No. 12/710,066 (now issued U.S. Pat. No. 7,933,508), filed Feb. 22, 2010, which claims the benefit of application Ser. No. 11/163,391 (now issued U.S. Pat. No. 7,697,827), filed Oct. 17, 2005, all of which are herein incorporated by reference. Reference is also made to related application Ser. No. 14/199,855 (now issued U.S. Pat. No. 8,824,879), filed Mar. 6, 2014.

US Referenced Citations (144)
Number Name Date Kind
4081623 Vogeley Mar 1978 A
4087630 Browning et al. May 1978 A
4389109 Taniguchi et al. Jun 1983 A
4399327 Yamamoto et al. Aug 1983 A
4595990 Garwin Jun 1986 A
4658425 Julstrom Apr 1987 A
4951079 Hoshino et al. Aug 1990 A
5027149 Hoshino et al. Jun 1991 A
5204709 Sato Apr 1993 A
5239337 Takagi et al. Aug 1993 A
5245381 Takagi et al. Sep 1993 A
5253008 Konishi et al. Oct 1993 A
5331149 Spitzer et al. Jul 1994 A
5459511 Uehara et al. Oct 1995 A
5461453 Watanabe et al. Oct 1995 A
5486892 Suzuki et al. Jan 1996 A
5517021 Kaufman May 1996 A
5541400 Hagiwara et al. Jul 1996 A
5546145 Bernardi et al. Aug 1996 A
5548335 Mitsuhashi et al. Aug 1996 A
5570151 Terunuma et al. Oct 1996 A
5579046 Mitsuhashi et al. Nov 1996 A
5579080 Irie et al. Nov 1996 A
5581323 Suzuki et al. Dec 1996 A
5600399 Yamada et al. Feb 1997 A
5634141 Akashi et al. May 1997 A
5689619 Smyth Nov 1997 A
5729289 Etoh Mar 1998 A
5745810 Matsushima Apr 1998 A
5748992 Tsukahara et al. May 1998 A
5749000 Narisawa May 1998 A
5765045 Takagi et al. Jun 1998 A
5797046 Nagano et al. Aug 1998 A
5907723 Inoue May 1999 A
5917921 Sasaki Jun 1999 A
5923908 Schrock Jul 1999 A
5926655 Irie et al. Jul 1999 A
5970258 Suda et al. Oct 1999 A
6021278 Bernardi et al. Feb 2000 A
6091334 Galiana Jul 2000 A
6101338 Bernardi et al. Aug 2000 A
6115556 Reddington Sep 2000 A
6181377 Kobayashi Jan 2001 B1
6243076 Hatfield Jun 2001 B1
6256060 Waikui Jul 2001 B1
6289140 Oliver Sep 2001 B1
6351273 Lemelson et al. Feb 2002 B1
6393216 Ootsuka et al. May 2002 B1
6434403 Ausems et al. Aug 2002 B1
6499016 Anderson Dec 2002 B1
6538697 Honda et al. Mar 2003 B1
6549586 Gustafsson et al. Apr 2003 B2
6591239 McCall Jul 2003 B1
6593956 Potts et al. Jul 2003 B1
6636259 Anderson Oct 2003 B1
6637883 Tengshe et al. Oct 2003 B1
6674964 Irie Jan 2004 B2
6721001 Berstis Apr 2004 B1
6758563 Levola Jul 2004 B2
6795558 Matsuo Sep 2004 B2
6804396 Higaki et al. Oct 2004 B2
6847336 Lemelson et al. Jan 2005 B1
6927694 Smith Aug 2005 B1
7046924 Miller et al. May 2006 B2
7091928 Rajasingham Aug 2006 B2
7117519 Anderson et al. Oct 2006 B1
7149688 Schalkwyk Dec 2006 B2
7164117 Breed Jan 2007 B2
7259785 Stavely et al. Aug 2007 B2
7286256 Herbert Oct 2007 B2
7340766 Nagao Mar 2008 B2
7443419 Anderson et al. Oct 2008 B2
7702821 Feinberg et al. Apr 2010 B2
8046504 Feinberg et al. Oct 2011 B2
8140813 Ozceri et al. Mar 2012 B2
8571851 Tickner et al. Oct 2013 B1
8645325 Anderson et al. Feb 2014 B2
8699869 Kamimura Apr 2014 B2
20020005907 Alten Jan 2002 A1
20020008765 Ejima et al. Jan 2002 A1
20020051638 Arakawa May 2002 A1
20020080257 Blank Jun 2002 A1
20020082844 Van Gestal Jun 2002 A1
20020105575 Hinde Aug 2002 A1
20020140803 Gutta et al. Oct 2002 A1
20030030731 Colby Feb 2003 A1
20030090572 Belz et al. May 2003 A1
20030101052 Chen May 2003 A1
20030175010 Nomura et al. Sep 2003 A1
20030179888 Burnett et al. Sep 2003 A1
20030215128 Thomson Nov 2003 A1
20040003151 Bateman et al. Jan 2004 A1
20040080624 Yuen Apr 2004 A1
20040103111 Miller et al. May 2004 A1
20040140971 Yamazaki et al. Jul 2004 A1
20040172419 Morris et al. Sep 2004 A1
20040189856 Tanaka Sep 2004 A1
20040192421 Kawahara Sep 2004 A1
20040196399 Stavely Oct 2004 A1
20040201709 Mcintyre et al. Oct 2004 A1
20050001024 Kusaka et al. Jan 2005 A1
20050036034 Rea et al. Feb 2005 A1
20050052548 Delaney Mar 2005 A1
20050055636 Graves Mar 2005 A1
20050094019 Grosvenor et al. May 2005 A1
20050097173 Johns et al. May 2005 A1
20050102141 Chikuri May 2005 A1
20050128311 Rees Jun 2005 A1
20050159955 Oerder Jul 2005 A1
20050168579 Imamura Aug 2005 A1
20050195309 Kim et al. Sep 2005 A1
20050212817 Cannon et al. Sep 2005 A1
20050216862 Shinohara et al. Sep 2005 A1
20050259173 Nakajima et al. Nov 2005 A1
20050273489 Pecht et al. Dec 2005 A1
20060008256 Khedouri et al. Jan 2006 A1
20060041632 Shah Feb 2006 A1
20060061663 Park Mar 2006 A1
20060097993 Hietala et al. May 2006 A1
20060099995 Kim et al. May 2006 A1
20060114337 Rothschild Jun 2006 A1
20060114338 Rothschild Jun 2006 A1
20060114514 Rothschild Jun 2006 A1
20060114516 Rothschild Jun 2006 A1
20060139459 Zhong Jun 2006 A1
20060176305 Arcas et al. Aug 2006 A1
20060182045 Anderson Aug 2006 A1
20060189349 Montulli et al. Aug 2006 A1
20060289348 Montulli et al. Aug 2006 A1
20060208169 Breed Sep 2006 A1
20060223503 Muhonen et al. Oct 2006 A1
20060239672 Yost et al. Oct 2006 A1
20060282572 Steinberg et al. Dec 2006 A1
20070081090 Singh Apr 2007 A1
20070100632 Aubauer May 2007 A1
20080096587 Rubinstein Apr 2008 A1
20090018419 Torch Jan 2009 A1
20100205667 Anderson et al. Aug 2010 A1
20130155309 Hill et al. Jun 2013 A1
20130257709 Raffle Oct 2013 A1
20140104197 Khosravy et al. Apr 2014 A1
20140282196 Zhao et al. Sep 2014 A1
20140347363 Kaburlasos Nov 2014 A1
20150029322 Ragland et al. Jan 2015 A1
Foreign Referenced Citations (39)
Number Date Country
0376618 Jul 1990 EP
1938573 Aug 2011 EP
2368347 Nov 2011 ES
2380556 Apr 2003 GB
56012632 Feb 1981 JP
58080631 May 1983 JP
58137828 Aug 1983 JP
60205433 Oct 1985 JP
1056428 Mar 1989 JP
1191838 Aug 1989 JP
1191840 Aug 1989 JP
64-56428 Sep 1990 JP
2230225 Sep 1990 JP
H09-186954 Jul 1997 JP
H11-212726 Aug 1999 JP
H11-355617 Dec 1999 JP
2000-083186 Mar 2000 JP
2000-163193 Jun 2000 JP
2000-221582 Aug 2000 JP
2000-231151 Aug 2000 JP
2000214525 Aug 2000 JP
2000227633 Aug 2000 JP
2000231142 Aug 2000 JP
2000235216 Aug 2000 JP
2000-285413 Oct 2000 JP
2000284794 Oct 2000 JP
2001305642 Feb 2001 JP
2002-040545 Feb 2002 JP
2002-218092 Aug 2002 JP
2004120526 Apr 2004 JP
2005024792 Jan 2005 JP
2005027002 Jan 2005 JP
2005-134819 May 2005 JP
2005-181365 Jul 2005 JP
2005333582 Dec 2005 JP
20050090265 Sep 2005 KR
WO2005050308 Jun 2005 WO
WO2005062591 Jul 2005 WO
WO2006062966 Jun 2006 WO
Non-Patent Literature Citations (97)
Entry
Adams, Russ, “Sourcebook of Automatic Identification and Data Collection,” Van Nostrand Reinhold, New York, Dec. 31, 1990.
Bernardi, Bryan D., “Speech Recognition Camera with a Prompting Display,” The Journal of the Acoustical Society of America, vol. 108, Issue 4, Oct. 2000, p. 1383.
Bernardi, Bryan D., “Speech Recognition Camera with a Prompting Display,” The Journal of the Acoustical Society of America, vol. 109, Issue 4, Apr. 2001, p. 1287.
Chapman, William D., “Prospectives in Voice Response from Computers.”
R.L.A. Trost, “Film Slave,” Nov. 1976, Elektor, vol. 2, No. 11, pp. 1135-1137.
Goode, Georgianna, et al., Voice Controlled Stereographic Video Camera System, Proc. SPIE vol. 1083, p. 35, Three-Dimensional Visualization and Display Technologies; Scott S. Fisher; Woodrow E. Robbins, Eds.
Harif, Shlomi, Recognizing non-verbal sound commands in an interactive computer controlled speech word recognition display system, Acoustical Society of America Journal, vol. 118, Issue 2, pp. 599-599 (2005).
Hermes operating system now also listens to “his British master's voice” (Nov. 1999).
Morgan, Scott Anthony, Speech command input recognition system for interactive computer display with term weighting means used in interpreting potential commands from relevant speech terms, The Journal of the Acoustical Society of America, vol. 110, Issue 4, Oct. 2001, p. 1723.
Panasonic VLG2010E-S Video Intercom System with Silver door station.
Philips, M.L., Adv. Resource Dev. Corp., Columbia, MD, Voice control of remote stereoscopic systems, Southeastcon '90 Proceedings, IEEE, Apr. 1-4, 1990, pp. 594-598, vol. 2.
Reichenspurner, et al., Use of the voice-controlled and computer-assisted surgical system ZEUS for endoscopic coronary artery bypass grafting. The Journal of thoracic and cardiovascular surgery, Jul. 1999.
Robotics: the Future of Minimally Invasive Heart Surgery (May 2000).
ST Microelectronics TSH512 Hi-fi Stereo/mono Infrared Transmitter and Stereo Sub-carrier Generator (Oct. 2005).
Non-Final Office Action in U.S. Appl. No. 11/163,391, (dated Sep. 25, 2008).
Response to Non-Final Office Action in U.S. Appl. No. 11/163,391 (dated Jan. 9, 2009).
Non-Final Office Action in U.S. Appl. No. 11/163,391, (dated Apr. 22, 2009).
Response to Non-Final Office Action in U.S. Appl. No. 11/163,391 (dated Sep. 22, 2009).
Final Office Action in U.S. Appl. No. 11/163,391, (dated Dec. 18, 2009).
Response to Final Office Action in U.S. Appl. No. 11/163,391 (dated Jan. 11, 2010).
Non-Final Office Action in U.S. Appl. No. 12/710,066, (dated May 3, 2010).
Response to Non-Final Office Action in U.S. Appl. No. 12/710,066 (dated Aug. 3, 2010).
Final Office Action in U.S. Appl. No. 12/710,066, (dated Oct. 18, 2010).
Response to Final Office Action in U.S. Appl. No. 12/710,066 (dated Dec. 20, 2010).
Non-Final Office Action in U.S. Appl. No. 13/087,650, (dated Apr. 19, 2012).
Response to Non-Final Office Action in U.S. Appl. No. 13/087,650 (dated Jul. 19, 2012).
Non-Final Office Action in U.S. Appl. No. 13/717,681, (dated May 21, 2013).
Response to Non-Final Office Action in U.S. Appl. No. 13/717,681 (dated Nov. 15, 2013).
File History, U.S. Appl. No. 11/163,391 (now issued U.S. Pat. No. 7,697,827) to Konicek (filed Oct. 2005).
File History, U.S. Appl. No. 12/710,066 (now issued U.S. Pat. No. 7,933,508) to Konicek (filed Feb. 2010).
File History, U.S. Appl. No. 13/087,650 (now issued U.S. Pat. No. 8,467,672) to Konicek (filed Apr. 2011).
File History, U.S. Appl. No. 13/717,681 to Konicek (filed Dec. 2012).
Notice of Allowance in U.S. Appl. No. 13/717,681, (dated Jan. 24, 2014).
Request for Continued Examination in U.S. Appl. No. 13/717,681 (dated Mar. 14, 2014).
Non-Final Office Action in U.S. Appl. No. 14/203,129, (dated Apr. 25, 2014).
Response to Non-Final Office Action in U.S. Appl. No. 14/203,129, (dated Jun. 3, 2014).
File History, U.S. Appl. No. 14/199,855 to Konicek (filed Mar. 2014).
File History, U.S. Appl. No. 14/203,129 to Konicek (filed Mar. 2014).
File History, U.S. Appl. No. 14/315,544 to Konicek (filed Jun. 2014).
Notice of Allowance in U.S. Appl. No. 13/717,681, (dated Aug. 4, 2014).
Notice of Allowance in U.S. Appl. No. 14/199,855, (dated Jul. 14, 2014).
Notice of Allowance in U.S. Appl. No. 14/203,129, (dated Jul. 14, 2014).
Notice of Allowance in U.S. Appl. No. 14/315,544, (dated Sep. 29, 2014).
Notice of Allowance in U.S. Appl. No. 14/453,511, (dated Oct. 20, 2014).
Notice of Allowance in U.S. Appl. No. 14/495,976, (dated Oct. 22, 2014).
RSC-164i Datasheet, “General Purpose Microcontroller Featuring Speech Recognition, Speaker Verification, and Speech Synthesis,” Sensory, Inc. (1996).
Non-Final Office Action in U.S. Appl. No. 14/539,687, (dated Apr. 17, 2015).
Machine Translation of JP2000214525 to Yoji (date unknown).
U.S. Appl. No. 60/718,155 to Feinberg et al. (filed Sep. 15, 2005).
Smart Commander Guide to Voice Recognition (date unknown).
Network Smart Capture Ver.1.2 (dated 1997).
Partial English Translation of Network Smart Capture Ver.1.2 (date unknown).
Smart Capture Smart Commander (date unknown).
Partial English Translation of Smart Capture Smart Commander (date unknown).
Final Office Action in U.S. Appl. No. 14/539,687, (dated Nov. 16, 2015).
Response to Final Office Action in U.S. Appl. No. 14/539,687 (dated Jan. 15, 2016).
Non-Final Office Action in U.S. Appl. No. 14/539,687, (dated Feb. 4, 2016).
Response to Non-Final Office Action in U.S. Appl. No. 14/539,687 (dated May 4, 2016).
Notice of Allowance in U.S. Appl. No. 14/539,687, (dated Jul. 15, 2016).
BMW Group—Voice Commands for BMW 5 Series & 6 Series MY2004 Equipped with CCC (date unknown).
Non-Final Office Action in U.S. Appl. No. 15/188,736 (dated Oct. 12, 2016).
Non-Final Office Action in U.S. Appl. No. 14/614,515 (dated Mar. 6, 2017).
Response to Non-Final Office Action in U.S. Appl. No. 15/188,736 (dated Apr. 12, 2017).
Declaration of Jeffrey C. Konicek Under Rule 1.132 in U.S. Appl. No. 15/188,736 (Apr. 12, 2017).
Non-Final Office Action in U.S. Appl. No. 13/717,681, (dated Apr. 3, 2014).
Non-Final Office Action in U.S. Appl. No. 14/199,855, (dated Apr. 24, 2014).
Response to Non-Final Office Action in U.S. Appl. No. 14/199,855, (dated May 21, 2014).
Response to Non-Final Office Action in U.S. Appl. No. 13/717,681 (dated Jun. 30, 2014).
Nokia 9500 Communicator User Guide (p. 38) (Copyright 2004-2005).
HP iPAQ rX3715 Quick Specs (Jul. 27, 2004).
HP iPAQ rX3715 Data Sheet (Copyright 2004).
Ricoh RDC-i700 Operation Manual (Copyright 2000).
Machine English Translation of JP 2005-181365 to Imamura et al.
Machine English Translation of JP H09-186954 to Yasuyuki, et al.
Machine English Translation of JP 2000-221582 to Yoji.
Machine English Translation of JP 2000-231151 to Yoji.
Machine English Translation of JP 2000-083186 to Hiroshi.
Machine English Translation of JP 2002-218092 to Nobuaki.
Machine English Translation of JP 2000-285413 to Kenji et al.
Machine English Translation of JP H11-212726 to Hideyuki et al.
Machine English Translation of JP H11-355617 to Manbu.
Machine English Translation of JP 2005-134819 to Mineko et al.
Response to Non-Final Office Action in U.S. Appl. No. 14/614,515 (dated Sep. 6, 2017).
Final Office Action in U.S. Appl. No. 14/614,515, (dated Nov. 15, 2017).
RCE and Response to Final Office Action in U.S. Appl. No. 14/614,515 (dated Mar. 15, 2018).
Non-Final Office Action in U.S. Appl. No. 14/614,515, (dated May 10, 2018).
Response to Non-Final Office Action in U.S. Appl. No. 14/614,515 (dated Nov. 2, 2018).
Non-Final Office Action in U.S. Appl. No. 14/950,370, (dated Jun. 20, 2017).
Response to Non-Final Office Action in U.S. Appl. No. 14/950,370 (dated Dec. 20, 2017).
Supplemental Response and Amendment in U.S. Appl. No. 14/950,370 (dated Feb. 8, 2018).
Notice of Allowance in U.S. Appl. No. 14/950,370, (dated May 29, 2018).
Corrected Notice of Allowance in U.S. Appl. No. 14/950,370, (dated Jun. 12, 2018).
Interview Summary in U.S. Appl. No. 15/188,736, (dated May 9, 2017).
Interview Summary in U.S. Appl. No. 15/188,736, (dated Jun. 15, 2017).
Final Office Action in U.S. Appl. No. 15/188,736, (dated Jun. 19, 2017).
Response to Final Office Action in U.S. Appl. No. 15/188,736 (dated Dec. 11, 2017).
Interview Summary in U.S. Appl. No. 15/188,736, (dated Dec. 12, 2017).
Notice of Allowance in U.S. Appl. No. 15/188,736, (dated Jan. 19, 2018).
Related Publications (1)
Number Date Country
20160080628 A1 Mar 2016 US
Divisions (1)
Number Date Country
Parent 11163391 Oct 2005 US
Child 12710066 US
Continuations (9)
Number Date Country
Parent 14614515 Feb 2015 US
Child 14950338 US
Parent 14539687 Nov 2014 US
Child 14614515 US
Parent 14495976 Sep 2014 US
Child 14539687 US
Parent 14453511 Aug 2014 US
Child 14495976 US
Parent 14315544 Jun 2014 US
Child 14453511 US
Parent 14203129 Mar 2014 US
Child 14315544 US
Parent 13717681 Dec 2012 US
Child 14203129 US
Parent 13087650 Apr 2011 US
Child 13717681 US
Parent 12710066 Feb 2010 US
Child 13087650 US