This invention relates to robots and, more specifically, to robotic devices capable of interfacing with mobile devices, such as smartphones, and with internet services.
A variety of known robotic devices respond to sound, light, and other environmental actions. These robotic devices, such as service robots, perform a specific function for a user. For example, a carpet cleaning robot can vacuum a floor surface automatically for a user without any direct interaction from the user. Known robotic devices have means to sense aspects of an environment, means to process the sensor information, and means to manipulate aspects of the environment to perform some useful function. Typically, the means to sense aspects of an environment, the means to process the sensor information, and the means to manipulate the environment are each part of the same robot body.
Systems and methods described herein pertain to robotic devices and robotic control systems that may be capable of sensing and interpreting a range of environmental actions, including audible and visual signals from a human. An example device may include a body having a variety of sensors for sensing environmental actions, a separate or joined body having means to process sensor information, and a separate or joined body containing actuators that produce gestures and signals proportional to the environmental actions. The variety of sensors and the means to process sensor information may be part of an external device such as a smartphone. The variety of sensors and the means to process sensor information may also be part of an external device such as a server connected to the internet.
Systems and methods described herein pertain to methods of sensing and processing environmental actions, and producing gestures and signals proportional to the environmental actions. The methods may include sensing actions, producing electrical signals proportional to the environmental actions, processing the electrical signals, creating a set of actuator commands, and producing gestures and signals proportional to the environmental actions.
These and other features of the preferred embodiments of the invention will become more apparent in the detailed description in which reference is made to the appended drawings wherein:
The present invention can be understood more readily by reference to the following detailed description, examples, drawings, and claims, and their previous and following description. However, before the present devices, systems, and/or methods are disclosed and described, it is to be understood that this invention is not limited to the specific devices, systems, and/or methods disclosed unless otherwise specified, and, as such, can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular aspects only and is not intended to be limiting. The following description of the invention is provided as an enabling teaching of the invention in its best, currently known embodiment. To this end, those skilled in the relevant art will recognize and appreciate that many changes can be made to the various aspects of the invention described herein, while still obtaining the beneficial results of the present invention. It will also be apparent that some of the desired benefits of the present invention can be obtained by selecting some of the features of the present invention without utilizing other features. Accordingly, those who work in the art will recognize that many modifications and adaptations to the present invention are possible and can even be desirable in certain circumstances and are a part of the present invention. Thus, the following description is provided as illustrative of the principles of the present invention and not in limitation thereof. Although the term “at least one” may often be used in the specification, claims and drawings, the terms “a”, “an”, “the”, “said”, etc. also signify “at least one” or “the at least one” in the specification, claims and drawings. Thus, for example, reference to “a pressure sensor” can include two or more such pressure sensors unless the context indicates otherwise.
Ranges can be expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, another aspect includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by use of the antecedent “about,” it will be understood that the particular value forms another aspect. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint.
As used herein, the terms “optional” or “optionally” mean that the subsequently described event or circumstance may or may not occur, and that the description includes instances where said event or circumstance occurs and instances where it does not.
Finally, it is the applicant's intent that only claims that include the express language “means for” or “step for” be interpreted under 35 U.S.C. 112, paragraph 6. Claims that do not expressly include the phrase “means for” or “step for” are not to be interpreted under 35 U.S.C. 112, paragraph 6.
Systems and methods described herein may provide a robotic device that may be capable of sensing and interpreting a range of environmental actions and performing a function in response. For example, utilizing a real-time analysis of a user's auditory input and making use of online services that can translate audio into text can provide a robot with the human-like ability to respond to human verbal speech commands. In other embodiments, different sensed data can be observed, analyzed, and sent to a remote service. The remote service can use this data to generate command data that may be sent back to the robotic device. The robotic device may use the command data to perform a task. Elements used to sense the environment, process the sensor information, and manipulate aspects of the environment may be separate from one another. In fact, each of these systems may be embodied on a separate device, such as a smartphone or a server connected to the internet.
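The sense, remote-process, and command cycle described above can be illustrated with a minimal sketch. The service logic, message fields, and command names below are hypothetical stand-ins; in an actual embodiment the remote service would be reached over a network connection rather than a local function call.

```python
# Illustrative sketch of the sense -> remote-process -> command loop.
# The message format and command names are hypothetical.

def remote_service(sensed: dict) -> dict:
    """Stand-in for an internet service that turns sensor data into command data."""
    if sensed.get("audio_level", 0.0) > 0.5:
        return {"command": "dance", "tempo_bpm": sensed.get("tempo_bpm", 120)}
    return {"command": "idle"}

def robot_step(sensed: dict) -> str:
    """One control cycle: send sensed data out, act on the returned command."""
    command = remote_service(sensed)  # in practice: an HTTP or socket call
    return command["command"]

print(robot_step({"audio_level": 0.8, "tempo_bpm": 96}))  # loud input -> dance
print(robot_step({"audio_level": 0.1}))                   # quiet input -> idle
```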
The robotic device and robotic control system disclosed herein can be used in a variety of interactive applications. For example, the robotic device and control system can be used as an entertainment device that dances along with the rhythm and tempo of any musical composition.
Example systems and methods described herein may sense inputs such as dance gestures, drum beats, human created music, and/or recorded music, and perform a function such as producing gestures and signals in an entertaining fashion in response.
Additionally, systems and methods described herein may provide a robotic device capable of receiving and interpreting audio information. Human-robotic interaction may be enabled within the audio domain. Using sound as a method of communication rather than keyboard strokes or mouse clicks may create a more natural human-robot interaction experience, especially in the realm of music and media consumption. For example, by utilizing a real-time analysis of a user's auditory input and taking advantage of on-line databases containing relevant information about musical audio files available via the internet, it may be possible to match a human's audio input into a robotic device to a specific audio file or musical genre. These matches can be used to retrieve and play back songs that a user selects. A handful of existing applications correlate audio input with existing songs; these may be used with the specific processes and systems described herein for mapping human input to a robotic device's response within the context of human-robot interaction.
In yet another example, utilizing a real-time analysis of user visible input, such as facial expressions or physical gestures, and making use of off-line and on-line services that interpret facial expressions and gestures can provide a robot with the human-like ability to respond to human facial expressions or gestures.
In another example, the robotic device and robotic control system can be used as a notification system to notify a user of specific events or actions, such as when the user receives a status update on a social networking website, or when a timer has elapsed.
In another example, the robotic device and robotic control system can be used as a remote monitoring system. In such a remote monitoring system, the robotic device can be configured to remotely move the attached smartphone into an orientation where the video camera of the smartphone can be used to remotely capture and send video of the environment. In such a remote monitoring system, the robotic device can also be configured to remotely listen to audible signals from the environment and can be configured to alert a user when audible signals exceed some threshold, such as when an infant cries or a dog barks.
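The audible-alert behavior above can be sketched as a simple level detector: compute the RMS level of short audio frames and flag any frame that exceeds a threshold. The frame length and threshold values here are illustrative assumptions, not parameters from the specification.

```python
# Minimal sketch of threshold-based audible alerting (e.g. a crying infant).
# Frame length and threshold are illustrative assumptions.
import math

def rms(frame):
    """Root-mean-square level of one frame of audio samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def frames_exceeding(samples, frame_len=4, threshold=0.5):
    """Return indices of frames whose RMS level exceeds the threshold."""
    alerts = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        if rms(samples[i:i + frame_len]) > threshold:
            alerts.append(i // frame_len)
    return alerts

quiet = [0.01, -0.02, 0.01, 0.0]
loud = [0.9, -0.8, 0.85, -0.9]
print(frames_exceeding(quiet + loud))  # only the loud frame triggers -> [1]
```

In a deployed system each flagged frame would trigger a notification to the user, for example via the smartphone application.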
In another example, the robotic device and robotic control system can be used as an educational system. In such a system, the robotic device can be configured to present a set of possible answers, for example through a flash card or audio sequence, to a user and listen or watch for a user's correct verbal or visible response. In such a system, the robotic device can also be configured to listen as a user plays a musical composition on a musical instrument and provide positive or negative responses based on the user's performance.
In another example, the robotic device and robotic control system can be used as a gaming system. In such a system, the robotic device can be configured to teach a user sequences of physical gestures, such as rhythmic head bobbing or rhythmic hand shaking, facial expressions, such as frowning or smiling, audible actions, such as clapping, and other actions and provide positive or negative responses based on the user's performance. In such a system, the robotic device could also be configured to present the user a sequence of gestures and audio tones which the user must mimic in the correct order. In such a system, the robotic device could also be configured to present a set of possible answers to a question to the user, and the robotic device would provide positive or negative responses to the user based on the user's response.
The following detailed example discusses an embodiment wherein the robotic device and control system are used as an entertainment device that observes a user's audible input and plays a matching song and performs in response. Those of ordinary skill in the art will appreciate that the systems and methods of this embodiment may be applicable for other applications, such as those described above.
Several methods of human audio input can be used to elicit a musical or informative response from robotic devices. For example, human actions such as hand clapping can be used. In some robot learning algorithms, the examination of the real-time audio stream of a human's hand clapping may be split into at least two parts: feature extraction and classification. An algorithm may draw on several signal processing and learning techniques to make assumptions about the tempo and style of the human's hand clapping. This algorithm may rely on the onset detection method described by Puckette, et al., "Real-time audio analysis tools for Pd and MSP", Proceedings, International Computer Music Conference. San Francisco: International Computer Music Association, pp. 109-112, 1998, for example, which measures the intervals between hand claps, autocorrelates the results, and processes the results through a comb filter bank as described by Davies, et al., "Causal Tempo Tracking of Audio", Proceedings of the 5th International Conference on Music Information Retrieval, pp. 164-169, 2004, for example. The contents of both of these articles are incorporated herein by reference in their entirety. Additionally, a quality threshold clustering to group the intervals can be used. From an analysis of these processed results a tempo may be estimated and/or a predicted output of future beats may be generated. Aside from onset intervals, information about specific clap volumes and intensities, periodicities, and ratios of clustered groups may reveal information about the clapping musical style, such as rock, hip hop, or jazz. For example, an examination of a clapped sequence representative of a jazz rhythm may reveal that peak rhythmic energies fall on beats 2 and 4, whereas in a hip hop rhythm the rhythmic energy may be more evenly distributed.
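The tempo-estimation step above can be reduced to a simplified sketch: measure the intervals between detected clap onsets and convert a representative interval to beats per minute. The cited systems autocorrelate the onset intervals and run them through a comb filter bank; the median-interval shortcut below is only an illustrative approximation of that idea.

```python
# Simplified tempo estimate from clap onset times. The median-interval
# approach is an illustrative stand-in for autocorrelation plus a comb
# filter bank as in the cited methods.

def tempo_from_onsets(onset_times_s):
    """Estimate tempo in BPM from a sorted list of clap onset times (seconds)."""
    intervals = [b - a for a, b in zip(onset_times_s, onset_times_s[1:])]
    intervals.sort()
    median = intervals[len(intervals) // 2]  # robust to an occasional missed clap
    return 60.0 / median

# Claps every 0.5 s correspond to 120 BPM
print(tempo_from_onsets([0.0, 0.5, 1.0, 1.5, 2.0]))  # -> 120.0
```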
Clustering of the sequences also may show that the ratio of the number of relative triplets to relative quarter notes is greater in a jazzier sequence as opposed to the hip hop sequence which may have a higher relative sixteenth note to quarter note ratio. From the user's real-time clapped input, it may be possible to retrieve the tempo, predicted future beats, and a measure describing the likelihood of the input fitting a particular genre. This may enable “query by clapping” in which the user is able to request specific genres and songs by merely introducing a rhythmically meaningful representation of the desired output.
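The note-value-ratio heuristic described above can be sketched as follows. Each clapped interval is classified as a quarter, triplet, or sixteenth of the beat, and the triplet-to-quarter ratio is compared against the sixteenth-to-quarter ratio. The tolerance and the two-way decision rule are illustrative assumptions, not the actual classifier contemplated by the specification.

```python
# Hedged sketch of the genre heuristic: a triplet-heavy interval profile
# suggests a jazzier feel, a sixteenth-heavy profile a hip hop feel.
# Tolerance and decision rule are illustrative assumptions.

def classify_interval(interval, beat):
    """Label one inter-clap interval relative to the beat length."""
    ratio = interval / beat
    for name, target in (("quarter", 1.0), ("triplet", 1 / 3), ("sixteenth", 0.25)):
        if abs(ratio - target) < 0.06:
            return name
    return "other"

def genre_guess(intervals, beat):
    counts = {"quarter": 0, "triplet": 0, "sixteenth": 0, "other": 0}
    for iv in intervals:
        counts[classify_interval(iv, beat)] += 1
    q = max(counts["quarter"], 1)
    # Relatively more triplets than sixteenths suggests a swung, jazzier rhythm.
    return "jazz" if counts["triplet"] / q > counts["sixteenth"] / q else "hip hop"

beat = 0.5
print(genre_guess([0.5, 1 / 6, 1 / 6, 1 / 6, 0.5], beat))        # triplet-heavy
print(genre_guess([0.5, 0.125, 0.125, 0.125, 0.5], beat))        # sixteenth-heavy
```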
The robot systems and methods described herein may comprise one or more computers. A computer may be any programmable machine capable of performing arithmetic and/or logical operations. In some embodiments, computers may comprise processors, memories, data storage devices, and/or other commonly known or novel components. These components may be connected physically or through network or wireless links. Computers may also comprise software which may direct the operations of the aforementioned components. Computers may be referred to with terms that are commonly used by those of ordinary skill in the relevant arts, such as servers, PCs, mobile devices, and other terms. Computers may facilitate communications between users, may provide databases, may perform analysis and/or transformation of data, and/or perform other functions. It will be understood by those of ordinary skill that those terms used herein are interchangeable, and any computer capable of performing the described functions may be used. For example, though the term "servers" may appear in the following specification, the disclosed embodiments are not limited to servers.
Computers may be linked to one another via a network or networks. A network may be any plurality of completely or partially interconnected computers wherein some or all of the computers are able to communicate with one another. It will be understood by those of ordinary skill that connections between computers may be wired in some cases (e.g., via Ethernet, coaxial, optical, or other wired connection) or may be wireless (e.g., via WiFi, WiMAX, or other wireless connection). Connections between computers may use any protocols, including connection-oriented protocols such as TCP or connectionless protocols such as UDP. Any connection through which at least two computers may exchange data can be the basis of a network.
As depicted in
The variety of sensors for sensing environmental actions 20, the module configured to process sensor information 30, and the module configured to produce gestures and signals proportional to environmental actions 40 may be contained within separate bodies, such as a smartphone 16 or other portable computer device, a server connected to the internet 50, and/or a robot body 11, in any combination or arrangement.
The robot body 11 may include various expressive elements which may be configured to move and/or activate automatically to interact with a user, as will be described in greater detail below. For example, the robot body 11 may include a movable head 12, a movable neck 13, one or more movable feet 14, one or more movable hands 15, one or more speaker systems 17, one or more lights 21, and/or any other features which may be automatically controlled to interact with a user.
A smartphone 16 or other computer device may be in communication with the robot body 11 via the robot body's communication link 34. The smartphone 16 may include a computer configured to execute one or more smartphone applications 35 or other programs which may enable the smartphone 16 to exchange sensor and/or control data with the robot body 11. In some embodiments, the module configured to process sensor information 30 and the module configured to produce gestures and signals proportional to environmental actions 40 may include the smartphone 16 computer and smartphone application 35, in addition to or instead of the computer of the robot body 11. The smartphone 16 may include sensors 32, which may be controlled by the computer and may detect user input and/or other environmental conditions as will be described in greater detail below. The smartphone 16 may include a communication link 34, which may be configured to place the computer of the smartphone 16 in communication with other devices such as the robot body 11 and/or an internet service 51. The communication link 34 may be any type of communication link, including a wired or wireless connection.
An internet service 51 may be in communication with the smartphone 16 and/or robot body 11 via the communication link 34 of the smartphone 16 and/or robot body 11. The internet service 51 may communicate via a network such as the internet using a communication link 34 and may comprise one or more servers. The servers may be configured to execute an internet service application 36 which may receive information from and/or provide information to the other elements of the robotic device 10, as will be described in greater detail below. The internet service 51 may include one or more databases, such as a song information database 37 and/or a user preference database 38. Examples of information contained in these databases 37, 38 are provided in greater detail below.
When a user 60 creates additional environmental actions, for example, but not limited to, tapping a rhythm onto a surface, hand clapping, or humming, the robotic device may detect the environmental actions 120 and may begin capturing the user input 125 for interpretation. At this time, the robot body 11 may produce additional gestures and signals, for example, but not limited to, dancing gestures and audio playback through the speaker system 17.
The operating algorithm used by the robotic device 10 control software 31 and/or smartphone application 35 may interpret environmental actions such as, but not limited to, tapping a rhythm onto a surface, hand clapping, or humming, and may distinguish between tempos, cadences, styles, and genres of music using techniques such as those described by Puckette and Davies et al. 130. For example, the operating algorithm may distinguish between a hand clapped rhythm relating to a jazz rhythm, and a hand clapped rhythm relating to a hip hop rhythm. In cases wherein tapping, or some other input with no tonal variation, is detected, the system 10 may capture the rhythm of the signal 135. In cases wherein humming, or some other input with tonal variation, is detected, the system 10 may capture the tones and the rhythm of the signal 140.
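The two capture paths above can be illustrated with a small routing sketch: input with no tonal variation (tapping) keeps only the rhythm, while input with tonal variation (humming) keeps both tones and rhythm. The event representation of (onset time, optional pitch) pairs is a hypothetical assumption for illustration.

```python
# Illustrative routing between the two capture paths: rhythm-only for
# tapping, tones plus rhythm for humming. The event tuple format is a
# hypothetical assumption.

def capture(events):
    """events: list of (onset_time_s, pitch_hz_or_None) tuples."""
    pitches = [p for _, p in events if p is not None]
    rhythm = [t for t, _ in events]
    if len(set(pitches)) > 1:  # tonal variation detected -> humming path
        return {"mode": "tones+rhythm", "rhythm": rhythm, "tones": pitches}
    return {"mode": "rhythm-only", "rhythm": rhythm}  # tapping path

print(capture([(0.0, None), (0.5, None), (1.0, None)])["mode"])     # tapping
print(capture([(0.0, 220.0), (0.5, 247.0), (1.0, 262.0)])["mode"])  # humming
```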
Once the robot system 10 has detected the user input, it may select a song based on the user input 145. For example, this may be performed as described above with respect to
This application is based on and derives the benefit of the filing date of U.S. Provisional Patent Application No. 61/552,610, filed Oct. 28, 2011. The entire content of U.S. Provisional Patent Application No. 61/552,610 is herein incorporated by reference in its entirety.
Number | Date | Country
---|---|---
61552610 | Oct 2011 | US