The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:
In this embodiment, the nodes 110 support operating systems and applications enabled to provide instant messaging within the network. These and other enabling software and hardware supported by the environment 100 will be collectively referenced as the environment facilities. One such facility supported by the environment 100 is an instant messaging function and/or application.
Instant messaging can be defined as a real-time or instantaneous form of written communication between two or more entities. The text message is conveyed via nodes 110 using the network 130. Some instant messaging applications may require the use of one or more client programs that interface to an instant messaging service. Communications are then achievable instantaneously and in real time between peers via these nodes. Users can establish a list of contacts consisting of their peers. Any peer seeking to initiate or respond to a communication can use the nodes or other environment facilities to check the availability of other peers at any given time and, if they are available, start communicating with them. This concept is shown in the illustration of
As illustrated in
Instant messaging facilities used by the present invention can be varied as known to those skilled in the art. Any instant messaging facility that boosts communication and allows easy collaboration can be easily implemented in the present invention. It may be preferred to use instant messaging facilities that easily allow the parties to know whether the peer is available, busy, or away from the computer.
In one embodiment, for example as the one discussed in conjunction with
Incorporating a sound file provides many advantages when used in conjunction with speech technology. The sound file combines the benefits afforded by e-mail with the advantages provided by a telephone, without being as intrusive as a telephone. For example, using an instant messaging facility allows users the opportunity to wait before responding. In such cases, the users are not forced to reply immediately to incoming messages. In addition, when recording is desired, instant messaging facilities that allow transcript logging can be used. In such cases, instant messages can be logged in a local message history and recorded, providing the advantages of traditional e-mail. In the embodiment provided, a sound file will also be used and provided by the instant messaging facilities.
Incorporating a sound file also provides additional advantages. Such advantages can range from providing assistance to the disabled to providing the convenience of checking e-mail when a display cannot be used, such as while driving. Spoken words can also alert an otherwise inattentive recipient to an incoming instant message or even e-mail.
These sound files, provided in one embodiment of the present invention, include pre-recorded snippets that readily identify peers in their native voice (i.e., their own voice in the case of a human voice), among other things, and can be referenced as peer/“buddy” pre-recorded snippets and/or messages. It should be understood that any information can be included in these snippets, and more than one sound file (and thus more than one snippet) can be correlated and associated with each peer. This pre-recorded information, for example, can contain the actual voice recording of the owner or generator of the message. This is particularly advantageous when the digital pronunciation or recognition of words may lead to inaccuracies. In a preferred embodiment, personally recorded sound file snippets are provided to voice-enabled applications by taking advantage of the bandwidth and storage capabilities of the network. The snippets can also be used to pronounce names and words properly, with the added benefit of being in the actual voice of the sender. Such pre-recorded snippets could also include other standard features such as “announcements” for voice-enabled applications. For example, these “announcements” may include (but should not be limited to) instant message utilities using such pre-recorded common phrases as:
Alternatively, the sender can also pre-record “their own selective” or “their own distinctive” phrases for each such common announcement. The sound files including these phrases can also be arranged or rearranged at any given time to form new announcements from a sequence of the phrases included in the snippets. They will then be used in peer-to-peer communications. In other words, if more than one sound file is associated with each peer, these sound files can be arranged in a certain pre-selected sequence when desired.
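By way of a non-limiting illustration, the arrangement of several sound files into a pre-selected sequence might be sketched as follows. The sketch assumes the snippets are uncompressed WAV data sharing one audio format; the names `silent_wav` and `build_announcement`, and the placeholder silent recordings, are hypothetical and are not taken from any particular embodiment.

```python
import io
import wave


def silent_wav(num_frames, rate=8000):
    """Build a placeholder mono 16-bit WAV (silence) standing in for a recorded phrase."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(b"\x00\x00" * num_frames)
    return buf.getvalue()


def build_announcement(snippets, order):
    """Join the named snippets, in the given order, into one WAV byte string."""
    if not order:
        raise ValueError("at least one snippet name is required")
    out_buf = io.BytesIO()
    fmt = None
    with wave.open(out_buf, "wb") as out:
        for name in order:
            with wave.open(io.BytesIO(snippets[name]), "rb") as src:
                this_fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
                if fmt is None:
                    # The first snippet fixes the output format.
                    fmt = this_fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif this_fmt != fmt:
                    raise ValueError(f"snippet {name!r} has a different audio format")
                out.writeframes(src.readframes(src.getnframes()))
    return out_buf.getvalue()


# Hypothetical phrase library for one peer; a phrase may be reused in a sequence.
snippets = {"message_from": silent_wav(100), "name": silent_wav(50)}
announcement = build_announcement(snippets, ["message_from", "name"])
```

Rearranging the list passed as `order` yields a different announcement from the same stored phrases, without re-recording anything.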
For ease of understanding, a particular example will now be discussed as how such files can be used by the environment 100 and nodes 110. It should be understood, however, that although the example as will be discussed below provides for one embodiment of the present invention, it is only used to enhance understanding and therefore this embodiment should not be considered to limit the workings of the present invention as other alternative embodiments are available and can be implemented.
In one embodiment, the environment 100 supports speech technology and voice-enabled applications, such as NotesBuddy, via networking services such as the internet (e.g., Web Services). NotesBuddy is often used in conjunction with International Business Machines (IBM) Corporation's Lotus Notes™ internet mail and Lotus Sametime™, which provides instant messaging. NotesBuddy has the capacity to provide voice, pager and display facilities, which is why it is used here as an example to ease understanding. Facilities other than NotesBuddy can easily be used or substituted in conjunction with the workings of the present invention. For ease of understanding, however, the discussion below incorporates the use of NotesBuddy.
One advantage of NotesBuddy is that it can monitor e-mail at the server or the replica. The filter-setting page allows important e-mail to be defined according to simple criteria such as names or keywords. NotesBuddy works well with Lotus Notes and Domino, so it shares the inbox, the folders, and address books. Audio notes allow e-mail to be composed and listened to using recorded voice attachments. NotesBuddy is “multimodal,” so it can play back either text or audio mail using the same user interface. This can be used in conjunction with the workings of the present invention to allow users a great variety of options and application selection.
In one example, when NotesBuddy and Sametime are both used (Lotus Sametime Connect is installed), NotesBuddy can access the Sametime server and provide instant messaging and buddy status. NotesBuddy shows buddy status in e-mail headers and in the inbox view, as well as in a buddy list. Chats or e-mail can be initiated any place a buddy name is displayed and then saved to folders or forwarded to others. Rich text and graphics (such as smiley faces and Web graphics) can also be exchanged when chatting with other NotesBuddy users.
In this way, NotesBuddy can be used to access and use a sound file. When NotesBuddy is not used, a sound file can still be made available. In either case, it is preferred to use the user's actual voice. A variety of standards can be used for the format of the sound file. In one embodiment, the sound file can be a WAV (or Wave) or MPEG-1 Audio Layer 3 (MP3) file as supported by the environment. (MP3 is a popular audio-specific digital compression encoding format, designed to greatly reduce the amount of data required to represent audio while maintaining sound quality. WAV is short for Waveform audio format, which is another standard for storing audio on PCs. It is a variant of the RIFF bit-stream format, a method for storing data in chunks.)
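To illustrate the chunked storage referred to above, the following sketch lists the chunks of a RIFF byte stream. It relies only on the general RIFF layout (a 12-byte header followed by chunks, each consisting of a four-character identifier, a little-endian 32-bit size, and a body padded to an even length). The function names and the hand-built sample file are hypothetical and serve only as an illustration.

```python
import struct


def build_minimal_wav(num_frames, rate=8000):
    """Hand-assemble a tiny mono 16-bit PCM WAV file (silence) for demonstration."""
    # "fmt " body: PCM tag, channels, sample rate, byte rate, block align, bits per sample
    fmt_body = struct.pack("<HHIIHH", 1, 1, rate, rate * 2, 2, 16)
    data_body = b"\x00\x00" * num_frames
    body = (
        b"WAVE"
        + b"fmt " + struct.pack("<I", len(fmt_body)) + fmt_body
        + b"data" + struct.pack("<I", len(data_body)) + data_body
    )
    return b"RIFF" + struct.pack("<I", len(body)) + body


def list_riff_chunks(data):
    """Return (form_type, [(chunk_id, size), ...]) for a RIFF byte string."""
    if len(data) < 12 or data[:4] != b"RIFF":
        raise ValueError("not a RIFF file")
    form_type = data[8:12].decode("ascii")
    chunks = []
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        chunks.append((chunk_id.decode("ascii"), size))
        pos += 8 + size + (size & 1)  # chunk bodies are padded to an even length
    return form_type, chunks


form_type, chunks = list_riff_chunks(build_minimal_wav(100))
```

For the sample above, the form type is `WAVE` and the chunks are a 16-byte `fmt ` chunk describing the audio format followed by a `data` chunk holding the samples.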
For ease of understanding, the following scenario will now be provided and discussed as per one embodiment of the present invention. Assume a situation where a first user X (hereinafter also referenced as User 1) sends an instant message to a second user Y (User 2). X and Y are using respective nodes. When the message is received by Y, NotesBuddy uses Web Services to locate and download the pre-recorded greeting WAV file associated with user X from a repository that is preferably used by the network, or alternatively by X. This WAV file is then cached locally, such as on user Y's system, desk, etc., with X's other buddy information. Any time messages from X arrive, NotesBuddy will use the WAV file to announce X's name, using X's voice.
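The scenario above can be sketched, in a hedged and non-limiting way, as a fetch-once, reuse-thereafter cache. The actual Web Services call and the audio playback are not shown; `fetch` and `play` are hypothetical callables supplied by the caller, and the class and function names are illustrative assumptions only.

```python
class GreetingCache:
    """Keeps each sender's greeting sound file locally after the first download."""

    def __init__(self, fetch):
        # fetch: callable taking a sender id and returning sound-file bytes.
        # It stands in for the repository lookup; no real service is called here.
        self._fetch = fetch
        self._store = {}

    def greeting_for(self, sender_id):
        if sender_id not in self._store:
            self._store[sender_id] = self._fetch(sender_id)  # first message only
        return self._store[sender_id]


def announce_sender(sender_id, cache, play):
    """Announce an incoming message by playing the sender's cached greeting."""
    play(cache.greeting_for(sender_id))


# Illustrative use with stand-in fetch and play functions.
downloads = []
played = []


def fake_fetch(sender_id):
    downloads.append(sender_id)
    return b"WAV-bytes-for-" + sender_id.encode("ascii")


cache = GreetingCache(fake_fetch)
announce_sender("X", cache, played.append)  # downloads, then plays
announce_sender("X", cache, played.append)  # served from the local cache
```

In this sketch the repository is contacted once per sender, and every later message from the same sender is announced from the locally cached copy.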
It should be noted that this method differs from, and is advantageous over, the real-time connections provided by TCP/IP and other network connections for establishing instantaneous real-time chats or voice connections between two clients. One advantage provided by the present invention is that no dedicated real-time connection is needed, as the user is not actually in instant voice communication with other users/clients. Therefore, no channel is used for piping data to and from the microphones/speakers of one or both clients. This is because no person is actually speaking live into a microphone in order to voice the words or messages.
Another advantage of the present invention is that it can be adapted and used to extend the text-to-speech (TTS) features that may be supported by many software applications for instant messaging clients. These features are usually pre-packaged voices for TTS engines that can be built into the messaging clients. In this way one can select attributes such as accents and languages, and make a gender (male or female) selection for the greeter's voice. The present invention does not replace these features, but is used in conjunction with them to extend their capability even further when such features are supported.
While the conventional messaging methods are abstractions of voice over IP (VOIP) technology, which largely addresses the steady-state dialogue and not the pre-connection announcement phase, the present invention can be used to provide and improve the ability to recognize the incoming communications from a particular source or user for the purpose of screening or recognizing urgency.
In a preferred embodiment of the present invention, a template of words, in some instances encompassing more than 500 words, can be selected to create other WAV, MP3 or similar sound files. In this embodiment, these common words can be selectively spoken by someone (such as the user of a particular node), so that the recordings can later be used to say words or whole messages in the sender's natural voice.
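One possible way to use such a word template, offered only as a sketch, is to plan the playback of a message word by word: a word found in the sender's recorded template is played from its recording, and any other word is handed to a TTS voice. The fallback to TTS for unrecorded words is an assumption of this sketch, as are the function name, the simple tokenization, and the sample word set.

```python
import re


def plan_playback(message, recorded_words):
    """Return a list of (source, word) pairs, where source is 'recorded' or 'tts'."""
    plan = []
    # Simplistic tokenization: lowercase runs of letters, digits, and apostrophes.
    for word in re.findall(r"[a-z0-9']+", message.lower()):
        source = "recorded" if word in recorded_words else "tts"
        plan.append((source, word))
    return plan


# Hypothetical template: the words this sender has recorded so far.
template = {"call", "me", "now"}
plan = plan_playback("Call me, please!", template)
```

A larger template leaves fewer words to the synthetic voice, so more of the message is heard in the sender's natural voice.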
Such enhanced messages are then instantly recognizable by a receiver who is familiar with and recognizes the sender's voice. Such recognition can help prioritize the urgency of the message (for example, one from one's supervisor, spouse or other family member) without ever having to interrupt the work flow of the user of the receiving node. In alternate embodiments, important users, such as one's boss or spouse, can be pre-selected to speak the incoming message while the user continues to work on other tasks on the node or sits by the node while busy with other work.
In other embodiments, the concept of the present invention as discussed above can also be extended to provide a variety of other benefits. In one example, the receiving node can comprise software that helps the handicapped. In one example, a blind user can use the invention to recognize and even remember speakers that they are unable to see using icons.
In an alternate embodiment, the invention can be applied to foreign language recognition. In this embodiment, the Web services are enabled to download sound files, such as WAV files, of the speaker's voice in the listener's language for these foreign language speakers, thereby yielding a dynamic translation.
Many other advantages and applications are also possible using the present invention. The invention can be made available as a web service and be accessible via a variety of common networking and internet protocols such as HTTP. The ability to download the sound files locally is selective and is used for faster reuse (such as when cached locally on a personal computer). However, in alternate embodiments, the sound files can be stored on a network or other resources available on the system, so that more than one receiving user can use or access them from a repository (such as a directory used by a large corporation or a small firm). In either case, the ability to recognize speakers and to selectively screen the users and/or messages without having to read the screen or listen to the entire message remains in the control of the recipient.
While the preferred embodiment of the invention has been described, it will be understood that those skilled in the art, both now and in the future, may make various improvements and enhancements which fall within the scope of the claims which follow. These claims should be construed to maintain the proper protection for the invention first described.