The invention pertains to the recording of audio streams in telephone calls, and the processing of those recordings so that they may be replayed on an accessible site, such as a web server on the internet, where a user can search for recordings according to various criteria and play the desired recording. The process converts a telephone call, if necessary, from a conventional call to a voice over IP (VoIP) format. Various instructions can be issued to the caller. Records are created in a database and contain information about the identity of the user, the recordings, the locations where the recordings are posted for listening, and other information such as billing information. The process records at least part of the call, and transfers it to a storage location. The servers used in the processes can be either physical or virtual machines. Optionally, additional processes can be included for assigning recordings to various groups created on the site, for marking recordings as private, for adding users to or deleting them from groups, for authentication, and for prevention of unwanted intrusions, such as spoofing.
The invention relates to apparatuses and methods, with various embodiments and optional features, for recording the content of a telephone call placed by a caller to a PBX (private branch exchange) server. This machine, in communication with another server, a PBX integrator, receives a call from a user, provides various instructions to the user, and records at least part of the telephone call. If the call is placed from a conventional telephone system, the call is translated to an appropriate format. The PBX server and integrator create records for users, and store those records on a database server. Recordings are uploaded to a storage location, with an assigned URL that is also stored in a record. The apparatus and method include a publicly addressable server that can be accessed by users. A user accessing the server can identify and play a recording using the telephone number of the calling user.
The PBX server and integrator can also accomplish additional functions, such as authentication of a user, and prevention of spoofing, for example, a user pretending to be another user. The publicly addressable server may be a web server. It can accomplish additional functions, such as allowing a user to mark recordings as private, and to create groups of persons to be allowed access to recordings. The various servers can be physical machines, or can be instances of virtual machines, or a combination of physical and virtual machines.
b are depictions of two typical screens as seen by a user.
The service allows recording of audio from a telephone, and posting of the recording immediately to a website without any special intervention on the part of a user of the telephone. Refer to
To initiate a recording, a user 10 dials a specific number, for example, 1-877-mic-hand. A telephone call 12 is translated, if necessary, from a regular POTS (plain old telephone service) call to a VoIP (Voice over IP) call by an intermediary. Telephone calls using services such as Skype do not require translation. Thus, the intermediary is not shown in
If the user 10 has not called before, he is given a short audio description of the service before the recording starts (including where to find his recording on the web). Internally, a determination whether the user is a new caller is done by a database lookup accomplished by a PBX integrator 16 that receives a message 18, “receive (Number),” from PBX server 14. First, the caller ID is checked by communication 20 between PBX integrator 16 and a database 22 to see if that specific telephone has originated a call before. If no record exists in database 22, a new record is generated in the database 22 for that telephone number represented by communication 24, “create (Number),” between the processes on PBX integrator 16 and database 22. The variable “Number” is the telephone number used by user 10. If user 10 is known, database 22 communicates accordingly with PBX Integrator 16 via message 26 to look up user 10. Asterisk®, an open source software implementation of a PBX by Digium, and Adhearsion, are one way to implement PBX server 14 and PBX integrator 16.
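The new-caller check described above can be sketched as follows. This is a minimal illustration only, assuming an in-memory SQLite database in place of database 22; the table name `callers`, its columns, and the function names are hypothetical and do not come from the described system.

```python
import sqlite3


def open_database() -> sqlite3.Connection:
    """Create an in-memory stand-in for database 22 (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE callers ("
        "number TEXT PRIMARY KEY, call_count INTEGER NOT NULL)"
    )
    return conn


def receive(conn: sqlite3.Connection, number: str) -> bool:
    """Handle "receive (Number)": return True if this is a first-time caller.

    A first-time caller gets a new record ("create (Number)") and would be
    played the short audio description before recording starts.
    """
    row = conn.execute(
        "SELECT call_count FROM callers WHERE number = ?", (number,)
    ).fetchone()
    if row is None:
        # No record exists for this caller ID: create one.
        conn.execute(
            "INSERT INTO callers (number, call_count) VALUES (?, 1)", (number,)
        )
        conn.commit()
        return True
    # Known caller: count the call and skip the introduction.
    conn.execute(
        "UPDATE callers SET call_count = call_count + 1 WHERE number = ?",
        (number,),
    )
    conn.commit()
    return False
```

Calling `receive` twice with the same number returns `True` the first time and `False` the second, which mirrors the decision of whether to play the service description.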
An alternative embodiment that adds utility deals with handling a user 10 who has used the service before, but not from the number represented by call 12. It is possible that user 10 has used the service before, but with another number. If so, user 10 can be prompted with a question asking if he has registered on the website. If so, he can be prompted for the primary number associated with his account, and the new number he is calling from is conditionally added to his account in database 22. This number is still subject to the verification steps for an “assigned” number described below. This optional feature is not shown in
Another option that adds additional utility deals with authentication of a user. See block 27 in
After the description has played or PIN verification has completed, PBX integrator 16 instructs PBX server 14 to create a record for the forthcoming recording via communication 36. Initially, this record may contain details such as the current time for billing purposes. It then instructs the PBX to start the audio recording via communication 37, “startRecording(recording)” with a message, for example, “Your recording will start after the beep.” A tone 38 is heard following the message, and recording 40, “Produce Audio,” starts. From this point on, any audio 40 is captured by PBX server 14 and, in this embodiment, written to a local file on local disk storage of PBX server 14. When the recording terminates because the user sends “FinishCall” 42 by hanging up, or by pressing a key such as the octothorpe key, or remaining silent for some length of time, PBX server 14 initiates an action 44 “sendFile (recording, file)”. The “file” variable is a handle denoting the location of the file. In one implementation, where PBX server 14 and PBX integrator 16 exist on the same machine, as described below, the handle is simply a file name on that machine. Once PBX integrator 16 receives message 44, it updates the database record via “updateRecording(recording)” 46 with the URL representing the location of the recorded file. PBX integrator 16 then sends “storageUpload(recording)” 48 to Async File Uploader 50. Uploader 50 subsequently moves the file to a storage location, such as Amazon's S3 storage service, not shown in
Once uploaded, this file would be available at a URL (uniform resource locator) such as:
http://s3.amazonaws.com/handmic/1234567890/1.wav. “1234567890” represents the telephone number for user 10. In this example, the file format is .wav, but many suitable formats are known. Some of the principal formats besides .wav are .mp3, .aiff, .ogg, .raw, .au, .gsm, .aac, .wma, or .ra.
The record in the database 22 for the recording contains this URL, so that the recording can always be reached. The upload process may be started in the background, since it may take some time to complete, and any incoming calls should still be processed while an upload is in progress. PBX integrator 16 communicates message 48, “storageUpload(recording),” asynchronously to Async File Uploader 50. PBX integrator 16 can then handle other tasks. In practice, the upload is typically fast, since audio files are small.
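The background upload described above can be sketched with a worker thread and a queue. This is an illustrative sketch, not the described implementation: the class and method names are hypothetical, the `store` callable stands in for a real storage client, and the URL layout is simply taken from the example URL given in the text.

```python
import queue
import threading

# Base taken from the example URL in the text; the layout is an assumption.
BASE_URL = "http://s3.amazonaws.com/handmic"


def recording_url(number: str, index: int, ext: str = "wav") -> str:
    """Build the URL under which a caller's recording would be posted."""
    return f"{BASE_URL}/{number}/{index}.{ext}"


class AsyncFileUploader:
    """Hypothetical stand-in for Async File Uploader 50."""

    def __init__(self, store):
        self._store = store          # callable(url, data), assumed storage client
        self.failures = []           # (url, exception) pairs for failed uploads
        self._jobs = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def storage_upload(self, url: str, data: bytes) -> None:
        """Enqueue an upload and return immediately, so calls keep flowing."""
        self._jobs.put((url, data))

    def wait(self) -> None:
        """Block until every queued upload has been attempted."""
        self._jobs.join()

    def _run(self) -> None:
        while True:
            url, data = self._jobs.get()
            try:
                self._store(url, data)
            except Exception as exc:  # keep the worker alive on a failed upload
                self.failures.append((url, exc))
            finally:
                self._jobs.task_done()
```

The caller of `storage_upload` is free to handle the next call right away; `wait` is only needed in tests or at shutdown.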
Now refer to
Normally, the content associated with this number is marked public for all to hear until that number is associated with a user 10 who has registered, that is, created an account on the system. Once an account has been created, there is another option that adds additional utility. The user, once authenticated, can change permissions on the content via the web page. For example, a user can create a number of authorization groups for his content, such as “friends,” “family,” and “work.” A user belongs to each group he owns, and can also belong to groups that other users own and have added him to. Recordings made by a user's telephone can be added to any or all of the groups owned by that user. Then, any users on the system who are members of the groups to which the recordings are added can access the content in those groups. That is, those users are allowed to see the database records for the recordings and the associated URLs on S3. A user can create new groups, add existing recordings to them, delete relationships between old groups and existing recordings, etc.
For example, there is a “superuser” or “admin” user on the system, which is normally protected by, for example, a password. This superuser or admin user has a special group called “Public.” When an unregistered user is added to the system, he is automatically added to this group owned by the admin, and may then access any recordings that are marked public. By default, recordings that come in for unknown users are added to this group and no others.
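The group-based access rules described above can be sketched as a small in-memory model. This is a simplified illustration: the class, method, and user names are hypothetical, and the sketch does not enforce that only a group's owner may add recordings or members to it.

```python
from dataclasses import dataclass, field


@dataclass
class AccessControl:
    """Hypothetical model of groups, memberships, and recording permissions."""

    # (owner, group name) -> set of member user names
    members: dict = field(default_factory=dict)
    # recording id -> set of (owner, group name) keys
    recording_groups: dict = field(default_factory=dict)

    def create_group(self, owner: str, name: str) -> tuple:
        """Create a group; the owner always belongs to a group he owns."""
        key = (owner, name)
        self.members.setdefault(key, set()).add(owner)
        return key

    def add_member(self, key: tuple, user: str) -> None:
        self.members[key].add(user)

    def add_recording(self, recording_id: str, key: tuple) -> None:
        self.recording_groups.setdefault(recording_id, set()).add(key)

    def remove_recording(self, recording_id: str, key: tuple) -> None:
        """Delete the relationship between a group and a recording."""
        self.recording_groups.get(recording_id, set()).discard(key)

    def can_access(self, user: str, recording_id: str) -> bool:
        """A user may access a recording if he is in any of its groups."""
        return any(
            user in self.members.get(key, set())
            for key in self.recording_groups.get(recording_id, set())
        )
```

Under this model, the admin's “Public” group is just another group: unregistered users are added to it as members, and recordings from unknown callers are added to it and to no other group.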
In another embodiment, a user may have multiple telephone numbers, and each one may have some default settings, and these settings need not be the same. For example, audio that is recorded on the number 1234567890 is added to the “home” group, and audio that is recorded on the number 2234567890 is added to the “work” group. This would be a configuration option set by the user on the web page. Very little input to the phone is necessary to initiate a recording.
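The per-number defaults in the example above amount to a lookup keyed by the calling number. A minimal sketch, assuming a plain dictionary of settings and a hypothetical function name, with a fallback to the “Public” group for numbers that have no configured default:

```python
PUBLIC_GROUP = "Public"  # fallback for numbers with no configured default


def group_for_recording(number: str, defaults: dict) -> str:
    """Return the group a new recording from this number is added to."""
    return defaults.get(number, PUBLIC_GROUP)


# Settings a user might configure on the web page (numbers from the text).
defaults = {"1234567890": "home", "2234567890": "work"}
```

Because the group is resolved from the caller ID alone, the caller does not need to enter anything on the keypad to route the recording.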
Another embodiment adds protection against spoofing of caller ID. In this embodiment, an authentication mechanism is added for verifying that a user owns a telephone number he claims to own. For example, suppose a hacker calls the service with a spoofed caller ID 1234567890. After calling, the hacker goes to the web page, creates an account, and then goes to the “Add Phone” link 58 and adds this number. Then the real owner of 1234567890 attempts a recording from the real 1234567890. Unfortunately, this telephone number has already been “claimed” by the hacker. Because the hacker has an account, the real owner of 1234567890 could not find the recorded call on the web page if the hacker set the option on this number to make all recordings on it private by default. The real owner of 1234567890 would not be able to access recordings for this number.
The authentication mechanism in this embodiment can operate in several ways. An SMS (short message service) message can be sent to the added number, with a request that the user reply to it via SMS with a PIN given out on the homepage; another embodiment would have the user receive the code in the SMS, but respond by calling the service's phone number and entering the assigned PIN upon request. SMS-based mechanisms work, but require a telephone capable of receiving SMS messages, a capability that presently is not available on most POTS lines. Another alternative in this embodiment is where the PBX calls the number, plays a recording, and requests the user to enter the assigned PIN on the keypad. Outgoing calls cannot be spoofed. That is, a hacker cannot intercept an outgoing call to 1234567890 and pretend to be the owner of that number. This provides verification that the telephone belongs to the person claiming it, and allows verification of telephone calls from landlines as well as from cellular telephones.
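The PIN check at the heart of these verification options can be sketched as follows. This is an illustration under stated assumptions: the class name, the six-digit PIN length, and the attempt limit are all hypothetical, and delivering the PIN (web page or SMS) and collecting the keypad digits are left to the caller of this code.

```python
import hmac
import secrets


class NumberVerifier:
    """Hypothetical tracker for telephone numbers awaiting PIN verification."""

    def __init__(self, max_attempts: int = 3):
        self._pending = {}            # number -> [pin, attempts remaining]
        self.verified = set()         # numbers whose ownership was confirmed
        self._max_attempts = max_attempts

    def start(self, number: str) -> str:
        """Assign a random PIN to a claimed number and return it for display."""
        pin = f"{secrets.randbelow(10**6):06d}"
        self._pending[number] = [pin, self._max_attempts]
        return pin

    def submit(self, number: str, entered: str) -> bool:
        """Check digits entered on the keypad during the verification call."""
        entry = self._pending.get(number)
        if entry is None:
            return False
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(entry[0].encode(), entered.encode()):
            del self._pending[number]
            self.verified.add(number)
            return True
        entry[1] -= 1
        if entry[1] <= 0:
            # Too many wrong guesses: the claim must be restarted.
            del self._pending[number]
        return False
```

In the outbound-call variant, `start` would run when the user adds the number on the web page, and `submit` would run with the digits collected after the PBX calls that number.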
In another embodiment, a user can create groups. Suppose a user has signed up, added a number, and wants recordings made with that number to belong to an access-restricted group by default, for example, “work”. Additionally, the user wants to make sure recordings are never made with spoofed caller IDs and added to that group by a hacker, which would potentially increase charges to the real user, require effort by the user to delete such recordings, or cause others in the private group to hear spoofed messages. So, when a user calls in and the number is recognized, a second database lookup is performed. If that lookup shows that the number belongs to a known user on the system, and that the user's options are set to make recordings on that phone private, the user can be requested to enter a PIN before starting the recording, as shown in block 27 of
Refer to the block diagram,
Webserver 62, application server 64, database server 22, PBX server 14, PBX integrator 16 and Async File Uploader 50 may all exist as physical machines in one embodiment managed at an office, or in a traditional web hosting data center. In another, alternative embodiment, Amazon's EC2 service may be used. EC2 stands for “Elastic Compute Cloud,” and is Amazon's compute service. Amazon exposes a public API that allows one to start, stop and query the status of Xen virtual machines running on Amazon's physical infrastructure. Xen is quite similar to Parallels or VMware. If Amazon's EC2 service is used, the web, application and database servers 62, 64 and 22 reside on one Amazon EC2 instance. PBX server 14 and PBX integrator 16 reside on another Amazon EC2 instance, and have access to the database 22 via Amazon's internal network. Different partitions of the servers on VM instances within EC2 are possible. The allocation to one EC2 instance or the other can be made based upon the demands of a process for CPU time and memory. The allocation might also be affected by user demand; multiple webservers could be used. Webserver 62, application server 64 and database 22 on one EC2 instance provide the front end that runs handmic.com. On that site, a user can manage his telephones, recordings and groups. A user can also set his service options, including those regarding security.
In still other alternative embodiments, the service may be expanded to include a load balancer, multiple databases (such as read-only replicas for performance), and multiple PBX servers to handle higher call volume. The service is very adaptable, simply by using more VM instances in EC2, or more physical machines.
Those skilled in the art will appreciate that various changes, additions, omissions, and modifications can be made to the illustrated embodiments without departing from the spirit of the present invention. All such modifications and changes are intended to be covered by the claims.