This application is related to the following commonly owned U.S. patent applications, the disclosures of which are incorporated herein by reference:
The present invention relates to multimedia editing applications in general, and to user interface paradigms for controlling such applications.
Multimedia editing applications are used to create various types of multimedia presentations containing multimedia objects. A multimedia object can be a video object, an audio object, an image object, or a combination thereof, which has a temporal aspect in that its content is experienced over time. A typical multimedia presentation may contain multiple different video and/or audio elements. The editing applications typically provide support to create and modify the video and audio elements, and combine such elements into sequences that together form the multimedia presentation.
The user interfaces of conventional editing applications use a direct manipulation paradigm, in which the user operates a mouse or other pointing device to select graphical representations of the multimedia objects, and then selects icons, menu items or keystrokes to perform a desired function on the selected objects. For example, a user may select a video object, and select an icon to change the entry point of the video object (the time at which the video object starts) by dragging an entry marker on a graphical timing panel. As another example, the user may start the playback of the multimedia object by clicking on an icon representing a play function, or may “rewind” the object by clicking on an icon associated with a reverse play function. In this type of icon-based control paradigm, three separate steps are required. First, the user must know exactly where on the screen the icon for the desired function appears; second, the user must move the cursor to this location; and third, the user must select, click or drag the icon to achieve the desired functionality. These steps take time and attention away from the user's focus on the particular aspect of the multimedia object being edited.
Another problem with this approach is that each icon takes up screen space, and thus the user interface of the application tends to become cluttered with multiple tool bars, menus, icon palettes and the like in order to present the icons for all of the available functions. For functions that are used frequently, requiring the user to repeatedly access an icon on the screen only serves to slow down the user's work instead of streamlining it. Complex keystroke combinations (e.g., “CONTROL-ALT-P” for “play”) are thus used for common functions, but this requires the user to memorize a typically arbitrary mapping of key combinations to functions, and further may require the user to release control of the pointing device in order to access the keyboard to generate the keystroke. After the keystroke is pressed, the user then must access the pointing device again. Of course, some keystroke combinations may be done with one hand on the keyboard while the other hand retains the pointing device. However, this approach severely limits the number of available keystroke combinations to those that can be pressed with one hand, and further requires a high degree of dexterity in the user's non-dominant hand, since most users use their dominant hand (e.g., right hand) to control the pointing device, and leave their non-dominant hand on the keyboard. The net result in these situations is that the user's attention is directed away from the object to the particulars of the device interface, which reduces the user's efficiency.
Gestures have been previously used for entry of limited commands in a tablet-based computer. In most cases, the gestures are essentially alphabetical characters, such as “G” for the function “goto,” or “f” for the function “Find,” or “s” for the function “save,” and thereby merely represent the word for the function itself, essentially providing an abbreviation of the word. While arbitrary graphical symbols are sometimes used, they do not provide a comprehensive set of useful visual mnemonics.
Accordingly, it is desirable to provide a multimedia editing application that can be controlled by various gestures that have visually mnemonic forms.
A multimedia editing application uses a plurality of gestures to control certain editing operations of the application. Each gesture is associated with a particular function (or set of functions) of the editing application. A user inputs the gestures with a pointing device, such as a mouse, a light pen, or the like. The editing application includes a gesture recognition engine that receives the input gesture and determines the function associated with the gesture. The application then executes the identified function. The multimedia editing application also supports the immediate (“on demand”) entry of a gesture mode based on user activation of a configured switch (e.g., button) on the gesture input device, such as a graphics pen. This allows the user to seamlessly enter gestures, select menu items and icons, and draw graphics, all without putting the gesture input device aside or using multiple hands. Each gesture is designed such that its geometric form provides a visual mnemonic of the associated function or the object on which the function operates. The visual mnemonic is preferably non-linguistic in that it does not represent a character, letter, or other linguistic symbol which is an abbreviation of a function name.
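By way of illustration only, the flow just summarized (receive a stroke, determine the associated function, execute it) may be sketched as follows. The sketch assumes a simple nearest-template comparison; the disclosure does not specify the recognition technique. The template shapes, the names `TEMPLATES`, `FUNCTIONS`, and `handle_stroke`, and the returned strings are hypothetical placeholders and are not the gesture forms of the figures.

```python
import math

def resample(points, n=16):
    """Return n points evenly spaced along the stroke's path."""
    if len(points) < 2:
        return [points[0]] * n
    cum = [0.0]  # cumulative path length at each input point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0:
        return [points[0]] * n
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def normalize(points):
    """Center the stroke on its mean point and scale it to unit size."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    size = max(max(abs(x - cx), abs(y - cy)) for x, y in points) or 1.0
    return [((x - cx) / size, (y - cy) / size) for x, y in points]

def recognize(stroke, templates):
    """Return the name of the stored template closest to the stroke."""
    probe = normalize(resample(stroke))
    def cost(name):
        ref = normalize(resample(templates[name]))
        return sum(math.hypot(px - rx, py - ry)
                   for (px, py), (rx, ry) in zip(probe, ref)) / len(probe)
    return min(templates, key=cost)

# Hypothetical stored gesture definitions (placeholder shapes only).
TEMPLATES = {
    "play_forward": [(0, 0), (1, 0.5), (0, 1)],
    "play_reverse": [(1, 0), (0, 0.5), (1, 1)],
}
# Hypothetical mapping from recognized gesture to application function.
FUNCTIONS = {
    "play_forward": lambda: "playing forward",
    "play_reverse": lambda: "playing in reverse",
}

def handle_stroke(stroke):
    """Recognize the stroke, then execute the associated function."""
    return FUNCTIONS[recognize(stroke, TEMPLATES)]()
```

Because the comparison is made point by point in drawing order, two strokes with the same outline but opposite drawing directions are treated as different gestures, which is consistent with the directional aspect of the gestures described herein.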
The use of gestures to control a multimedia editing application provides a more efficient and easier to use interface paradigm than conventional keyboard and iconic interfaces. First, since the gestures can replace individual icons, less screen space is required for displaying icons, and thereby more screen space is available to display the multimedia object itself. Indeed, the entire screen can be devoted to displaying the multimedia object (e.g., a full screen video), and yet the user can still control the application through the gestures. Second, because the user effects the gestures with the existing pointing device, there is no need for the user to move one hand back and forth between the pointing device and keyboard as may be required with keystroke combinations. Rather, the user can fluidly input gestures with the pointing device in coordination with directly manipulating elements of the multimedia object by clicking and dragging. Nor is the user required to move the cursor to a particular portion of the screen in order to input the gestures, as is required with iconic input. Third, the gestures provide a more intuitive connection between the form of the gesture and the associated function, as the shape of the gesture may be related to the meaning of the function. Fourth, whereas there is a relatively limited number of available keystroke combinations (since many keystroke combinations may already be assigned to the operating system, for example), there is a much larger set of available gestures that can be defined, and thus the user can control more of the application through gestures than through keystrokes.
In one embodiment, the gestures fall generally into two classes, each of which provides a visual mnemonic of the associated function. The gestures are non-linguistic in that they are not letters, characters, or other symbols which are derived directly from the names of the associated functions. Thus, while a particular gesture may have a geometric form similar to a particular letter, or may be described by reference to the shape of a letter, the gesture's mnemonic qualities do not depend on the geometric form being similar to or derived from (e.g., an abbreviation of) the name of the function with which the gesture is associated.
A first class of gestures are those whose geometric form (or “shape”) implies or suggests a temporal or spatial directionality of the associated function. These geometric characteristics also include the directional or motion aspects of the gestures as input (e.g., directionality or curvature of the gesture stroke). Gestures in this class include, for example, gestures to control the playback of a multimedia object either in reverse or forward time, or to move an object in a particular spatial direction. The second class of gestures are those whose shapes imply or suggest the form of the object on which the associated function is performed. Gestures in this class include, for example, gestures to manipulate the size of windows, to open or close particular window panels, or the shape of an icon associated with the desired function. The visual mnemonic aspect of a gesture may also connote the movement or shape of particular parts of a human body, such as the motion or shape of a hand, or motion of the head, eyes, or the like.
In one embodiment, a gesture is preferably a single continuous stroke. In another aspect of the invention, some of the gestures may be grouped into gesture pairs of a first and second gesture. The first and second gestures are logical mirror images of each other (e.g., reflected in either the horizontal or vertical axis). The first gesture is associated with a temporally forward function on a multimedia object (or element), such as forward play, forward fast advance, forward step and the like. The second gesture is associated with a temporally backward function, such as reverse play, reverse fast advance, reverse step and the like. Other gesture pairs include first and second gestures which operate on a beginning portion and an ending portion respectively of a multimedia object or element, such as jump to beginning or jump to end. Other gesture pairs include a first gesture that performs a rescaling operation to increase the scale of displayed objects, and a second gesture that performs a rescaling operation to decrease the scale of displayed objects.
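The mirror-image relationship between the gestures of a pair can be illustrated with a short sketch that derives the second gesture of a pair by reflecting the stored points of the first. The function names and the sample stroke are hypothetical; the disclosure does not state that paired gestures are generated this way, only that they are logical mirror images.

```python
def flip_left_right(points):
    """Reflect a stroke about the vertical line through its bounding-box center."""
    xs = [x for x, _ in points]
    axis = (min(xs) + max(xs)) / 2
    return [(2 * axis - x, y) for x, y in points]

def flip_top_bottom(points):
    """Reflect a stroke about the horizontal line through its bounding-box center."""
    ys = [y for _, y in points]
    axis = (min(ys) + max(ys)) / 2
    return [(x, 2 * axis - y) for x, y in points]

# Hypothetical first gesture of a pair, as an ordered list of stroke points.
first_gesture = [(0, 0), (1, 0.5), (0, 1)]
# Its partner is the same stroke mirrored left to right.
second_gesture = flip_left_right(first_gesture)
```

Reflecting a stroke twice about the same axis returns the original stroke, so either gesture of a pair can serve as the stored prototype.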
The present invention has embodiments as a multimedia editing application including stored gesture definitions of gestures, and a gesture recognition engine, as various methods of operating a multimedia editing application, as a user interface for a multimedia editing application, as computer program products, and as a library of gestures for controlling a multimedia editing application.
a-d illustrate the play forward and play reverse gestures and associated functionality.
a-d illustrate the frame forward gesture and associated functionality.
a-e illustrate the go to start of play range and go to end play range gestures and associated functionality.
a-c illustrate the go to head and go to tail gestures and associated functionality.
a-g illustrate the group and ungroup gestures and associated functionality.
a-c illustrate the group gesture functionality in the context of the file browser.
a-d illustrate the set local in and set local out gestures and associated functionality.
a-c illustrate the set local in gesture functionality in the context of the file browser.
a-f illustrate the up one level and down one level gestures and associated functionality.
a-g illustrate the set global marker and set local marker gestures and associated functionality.
a-e illustrate the set play range start and set play range end gestures and associated functionality.
a-f illustrate the zoom in and zoom out gestures and associated functionality.
a-e illustrate the zoom in and zoom out functionality in the context of the file browser.
a-c illustrate the zoom in and zoom out functionality in the context of the timing panel.
a-b illustrate the zoom in and zoom out functionality in conjunction with various modifiers.
a-c illustrate the home view gesture and associated functionality.
a-c illustrate the fit to window gesture and associated functionality.
a-e illustrate the open/close bottom and top panel gestures and associated functionality.
a-c illustrate the open/close left and right panel gestures and associated functionality.
The figures depict a preferred embodiment of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.
Referring now to
The user interface 100 includes three primary regions, the canvas 102, the timing panel 106, and the file browser 110. The canvas 102 is used to display the objects as they are being manipulated and created by the user, and may generally be characterized as a graphics window. In the example of
Below the canvas 102 to the right is the timing panel 106. The timing panel 106 shows a linear representation of the overall length of the presentation. At the top of the timing panel 106 is the timescale 124, which shows time markers (e.g., minutes, seconds, 1/10's of seconds, etc.) (not shown here for clarity). The timescale 124 can be of the entire presentation from beginning to end, or any portion thereof. The objects 104 in the canvas 102 correspond to the current time on the timescale 124, which is indicated by current time marker 114. For each multimedia object 104 there is a timebar 116 in the timing panel 106. The object 104 is indicated by an object bar 108, which may have the object's name displayed for recognition. The width of the object bar 108 corresponds to the duration of the multimedia object 104, which duration is controlled by the user. The beginning of the object's duration is called the In point; the end of the object's duration is the Out point. These points are established by the user, for example, by selecting and dragging the respective end of the object bar in the desired direction. From the layering of the object bars 108 and their positioning relative to the timescale 124, the user can see the relative start and end times of each object. Notice that the current time marker 114 shows a point in the multimedia presentation at which Square A, Star B, and Circle C are all visible at the same time, as indicated by the overlap of their respective object bars 108. Two other object bars 108 for objects E and F are shown in the timing panel, but the objects are not shown in the canvas 102 because they are after the current time.
On the left side of the user interface is the file browser 110. The browser 110 is used to access multimedia object files for use in a multimedia presentation. The browser 110 shows file icons 112 for files in a currently selected directory (whether local or networked). The user can access any of the files in a variety of different ways. For example, the user can drag and drop the icon for a desired multimedia object into the canvas 102, thereby making it part of the presentation, with its start time corresponding to the current time.
An optional area of the user interface is the project view panel 118, shown to the immediate left of the timing panel 106. The project view 118 provides additional information on each object 104, showing the object's name 112 and a preview 120 of the object by itself. From the preview 120 the user can access a video preview of the object's actions independently of the other objects 104 in the multimedia presentation.
One aspect of the present invention is the ability to directly input the gestures via a direct and immediate mode change effected from the input device itself, rather than having to enter the input mode via a tool bar, menu item, or other intermediate mechanism. In one embodiment, a button or other switch mechanism on the input device, preferably at a fingertip location, is assigned by the multimedia editing application to be the trigger to enter the gesture input mode. For example, a pen type input device typically has two buttons on the pen barrel. One of the buttons may be designated as the gesture mode trigger, and the other button may be designated as the selection control for selecting items in the user interface. To input a gesture, the user simply holds down the gesture mode trigger button while drawing the gesture. The user then releases the button upon completion of the gesture. The multimedia editing application recognizes the input stroke as a gesture (rather than as a graphical object, such as a line). This approach allows the user to enter the gesture anywhere on the user interface, instead of via a dedicated region of the user interface, as is commonly done. In addition, it provides a very fluid and natural way of entering the gestures.
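The on-demand mode change described above can be sketched as a small event router, offered as an illustration under stated assumptions rather than as the actual implementation. The class name `PenStrokeRouter`, its method names, and the event model (pen down, pen move, pen up, trigger pressed or released) are hypothetical. A stroke begun while the trigger button is held is classified as a gesture and ends when the button is released; any other stroke is treated as ordinary drawing.

```python
class PenStrokeRouter:
    """Classify each pen stroke as a gesture or as ordinary drawing."""

    def __init__(self):
        self.trigger_held = False
        self.points = []        # points of the stroke in progress
        self.kind = None        # "gesture" or "drawing" for that stroke
        self.completed = []     # finished strokes as (kind, points)

    def set_trigger(self, held):
        """Called when the gesture mode trigger button changes state."""
        was_held = self.trigger_held
        self.trigger_held = held
        # Releasing the trigger completes the gesture in progress.
        if was_held and not held and self.kind == "gesture":
            self._finish()

    def pen_down(self, x, y):
        self.points = [(x, y)]
        self.kind = "gesture" if self.trigger_held else "drawing"

    def pen_move(self, x, y):
        if self.points:         # ignore motion when no stroke is in progress
            self.points.append((x, y))

    def pen_up(self):
        self._finish()

    def _finish(self):
        if self.points:
            self.completed.append((self.kind, self.points))
        self.points = []
        self.kind = None
```

A stroke routed as a gesture would then be passed to the gesture recognition engine, while a stroke routed as drawing would be handled as a graphical object such as a line. Because the classification depends only on the button state and not on pointer position, the gesture may be entered anywhere on the user interface.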
Another aspect of the gestures that will be illustrated below is that the interpretation of the gestures by the multimedia editing application is context sensitive, and in particular is sensitive to the region of the user interface 100 in which the gesture is input, the object(s) over which the gesture is drawn, or the currently selected object(s). In other words, a given gesture can be associated with a plurality of different functions, and the multimedia editing application determines which function to execute in response to the gesture according to which region of the user interface receives the gesture from the user. In the preferred embodiment, the different functions associated with a given gesture are semantically related so that they provide a consistent mental model to the user of the underlying functionality of the gesture and the application. These features will be further explained by example below.
The present invention provides a number of different gestures for controlling the functionality of the multimedia editing application. The gestures can be grouped according to their functionality as follows:
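One way to picture this context sensitivity is as a lookup keyed on both the recognized gesture and the region of the user interface that received it. The following is a minimal sketch under that assumption; the table name `DISPATCH`, the region and gesture identifiers, and the `resolve` function are hypothetical, and the function descriptions paraphrase behaviors described elsewhere in this disclosure.

```python
# Hypothetical table: (gesture, region) -> description of the function to execute.
DISPATCH = {
    ("play_forward", "timing_panel"): "play presentation forward from current time",
    ("play_forward", "file_browser"): "play back the file under the gesture in a preview",
    ("play_forward", "canvas"): "play back the currently selected object",
    ("group", "canvas"): "group the selected objects",
    ("group", "file_browser"): "create a directory containing the selected files",
}

def resolve(gesture, region):
    """Return the function description for this gesture in this region, if any."""
    return DISPATCH.get((gesture, region))
```

The same gesture resolves to different, semantically related functions depending on the region; a combination with no entry resolves to `None`, leaving the application free to ignore the stroke.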
1. Transport and playback
2. Timing panel navigation
3. Editing
4. View management
5. General
Transport and Playback
Referring now to
b-3d illustrate the effect of the play forward gesture 300, when input into the timing panel 106. In
The play reverse gesture 302 naturally operates in the opposite manner, and thus would play the multimedia presentation in reverse from the current time, as indicated by the current time marker 114.
In a preferred embodiment, the user may modify the functionality associated with the gesture (and hence change its semantics) by using various modifier operations. One class of modifier operations are key combinations, in which one or more keys on the keyboard are depressed while the gesture is input. Another class of modifier operations are button presses on the input device.
For key combinations, the following modifier keys may be used in conjunction with many of the gestures that are described herein: the SHIFT key, the COMMAND key, and the ALT key.
In the context of the play forward or play reverse commands, these modifier keys have the following effects:
SHIFT: Play forwards/reverse from start or end of multimedia presentation. This modifier causes the multimedia editing application to begin the playback at either the start of the entire multimedia presentation or the end, depending on the gesture.
COMMAND: Play the currently selected object 114 between its In and Out points, according to the direction of play of the gesture. Thus, if COMMAND is used with the play forward gesture 300, the currently selected object is played from its In point to its Out point. If COMMAND is used with the play reverse gesture 302, then playback is from the Out point to the In point. The playback is for all objects in the multimedia presentation that have a duration during this extent of time.
ALT: Play the multimedia presentation in a continuous loop.
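The effect of the three modifier keys on the play gestures can be summarized in a short sketch. The function name `playback_plan`, its parameters, and the returned dictionary are hypothetical. Two points are assumptions not stated above: unmodified playback is taken to run from the current time to the end (or start) of the presentation, and COMMAND is given precedence over SHIFT when both are held.

```python
def playback_plan(direction, modifiers, current, length,
                  in_point=None, out_point=None):
    """Return where playback starts and ends, and whether it loops.

    direction: "forward" or "reverse"
    modifiers: set of held keys drawn from {"SHIFT", "COMMAND", "ALT"}
    current:   time at the current time marker
    length:    total length of the presentation
    in_point, out_point: In and Out points of the selected object, if any
    """
    forward = direction == "forward"
    has_selection = in_point is not None and out_point is not None
    if "COMMAND" in modifiers and has_selection:
        # Play the extent of the selected object, in the gesture's direction.
        start, end = (in_point, out_point) if forward else (out_point, in_point)
    elif "SHIFT" in modifiers:
        # Begin at the start or the end of the whole presentation.
        start, end = (0.0, length) if forward else (length, 0.0)
    else:
        # Assumption: unmodified play runs from the current time to the far end.
        start, end = (current, length) if forward else (current, 0.0)
    return {"start": start, "end": end, "loop": "ALT" in modifiers}
```

For example, `playback_plan("reverse", {"COMMAND", "ALT"}, 5.0, 60.0, 10.0, 20.0)` yields looping playback from 20.0 back to 10.0.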
The play forward gesture 300 and play reverse gesture 302 are an example of a pair of gestures whose prototypes are substantially mirror images of each other (e.g., reflected in either the horizontal or vertical axis). Other examples of these mirror image gesture pairs will be apparent throughout the rest of this disclosure.
The play forward gesture 300 and the play reverse gesture 302 are also context dependent gestures, and the multimedia editing application determines the appropriate functionality based on the window in which the gesture is input, the object(s) 114 over which it is drawn, or the currently selected object(s). Thus, in the file browser, a play forward gesture 300 input over one of the file icons 112 causes the multimedia editing application to play back the associated file, whether it is a video, audio, or other type of multimedia object. The playback of these files occurs in a separate preview window (not shown) if the object has a visual component, or just by auditory output. Similarly, if one of these gestures is input into the canvas 102, then the multimedia editing application plays back the currently selected multimedia object 114, showing its associated animation. If the gesture is input over the preview field 120 in the project view 118, then again the playback is in a preview window, or by auditory output (or both).
Referring now to
b-5d illustrate the functionality of the frame forward gesture 500. In
The frame reverse gesture 502 and frame forward gesture 500 can each be modified with a modification operation. When used in conjunction with SHIFT, the frame forward gesture 500 operates to jump the current time marker 114 ten (10) frames forward in time; likewise, when used in conjunction with SHIFT, the frame reverse gesture 502 operates to jump the current time marker 114 backwards ten (10) frames. In one embodiment, the user can define the number of forward or reverse frames to jump when the SHIFT modifier is used.
The frame forward gesture 500 and the frame reverse gesture 502 are also context dependent. When used during the playback of a preview (e.g., of a video object) in a preview window, they act as above to advance or reverse the playback a single frame at a time (or plural frames when input with SHIFT). When used during the playback of an audio object, they act to advance the audio playback one (1) second forward or backward in time (or ten (10) seconds forward or backward, when input with SHIFT).
Timing Panel Navigation
The next group of gestures are associated with functions for navigation of the timing panel 106. These gestures generally affect the location of the current time marker 114 in the multimedia presentation, and thus affect both the timing panel 106 and the appearance of objects in the canvas.
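The stepping behavior described above reduces to simple arithmetic, sketched below for illustration. The function name `step_position` and its parameters are hypothetical. The position is expressed in frames for a video object and in seconds for an audio object, and the clamp at zero is an assumption not stated in the text.

```python
def step_position(position, direction, shift=False, shift_multiple=10):
    """Return the new position after a frame forward or frame reverse gesture.

    position:       current frame number (video) or current second (audio)
    direction:      "forward" or "reverse"
    shift:          True when the SHIFT modifier is held
    shift_multiple: units to jump with SHIFT; ten by default, user definable
    """
    amount = shift_multiple if shift else 1
    sign = 1 if direction == "forward" else -1
    # Assumption: the position cannot move before the start of the media.
    return max(0, position + sign * amount)
```

For a video preview at frame 100, a frame forward gesture yields frame 101 and the same gesture with SHIFT yields frame 110; for audio at second 30, a frame reverse gesture with SHIFT yields second 20.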
a shows the go to start of play range gesture 600 and the go to end of play range gesture 602. The geometric form of the go to start of play range gesture 600 is a right facing, right angle with a downward end stroke. The geometric form of the go to end of play range gesture 602 is a left facing, right angle with a downward end stroke. The visual mnemonic of these gestures connotes pointing at the beginning (to the left) or ending (to the right) of an object. These gestures adjust the current time marker 114 in the timing panel 106 as follows. The user can specify a play range in the timing panel 106, which defines a duration or period of interest to the user, say for example a period of 10 seconds starting at 01:25 (mm:ss) and ending at 01:35. The playback functions are aligned with this play range, and so the user can repeatedly play back this limited play range in order to focus on editing during this portion. The user sets the beginning and end of the play range in the timing panel, for example, by placing play range markers 119, as shown in
The objects in the canvas are updated in response to the adjustment of the current time marker 114. Notice that in
d shows the user interface at the same point in time as
The go to start of play range gesture 600 and go to end of play range gesture 602 are also context sensitive gestures. When input while the user is in the browser, for example, previewing an audio or video file, these gestures will cause the multimedia editing application to go to the beginning (go to start of play range gesture 600) or end (go to end of play range gesture 602) of the currently playing file.
Referring to
Referring to
Editing
The next group of gestures are used to perform various editing functions in the multimedia editing application.
Referring to
b-9d illustrate the functionality of the group gesture 900 in the context of the canvas 102.
Another embodiment of the group gesture 900 and ungroup gesture 902 does not need the user to select the objects 104 prior to inputting the gesture. Instead, in this embodiment, the user draws the group gesture 900 over each of the objects of interest, in essence “touching” each object to be grouped.
The group gesture 900 and ungroup gesture 902 are context dependent. In the file browser 110, the group gesture 900 results in the creation of a directory (or folder) containing the selected files. The ungroup gesture 902, when applied to a user selected directory object, results in all of the contents of the directory being removed, and placed at the same directory level as the directory itself, leaving the directory object empty. In one embodiment, the directory object may be left in place, or deleted, according to a user selectable option.
The group gesture 900 may also be used in the project view 118 as illustrated in
The functionality of the ungroup gesture 902 in the file browser should now be apparent to those of skill in the art without further illustration.
a illustrates the next pair of editing gestures, the set local in gesture 1100 and the set local out gesture 1102. The set local in gesture 1100 establishes an In point for a currently selected multimedia object 104 at the current time marker 114 (and hence the current frame). The set local out gesture 1102 sets the Out point for the currently selected multimedia object 104 at the current time marker 114. The geometric form of the set local in gesture 1100 is a first downward vertical stroke portion, with a second, upwardly angled, left oriented end stroke portion. The left oriented end portion “points” to the beginning of the selected object, hence its In point. The geometric form of the set local out gesture 1102 is a first downward vertical stroke portion, with a second, upwardly angled, right oriented end stroke portion. The right oriented end portion “points” to the end of the selected object, hence its Out point.
b-11d illustrate this functionality. In
The set local in gesture 1100 and set local out gesture 1102 are also context sensitive. The user can input these gestures while viewing a preview of a file from the file browser 110, or when a particular file 112 is selected in the browser 110. If the user inputs the set local in gesture 1100 when viewing or selecting a file, then the object represented by the file is inserted into the multimedia presentation and its In point is set at the current time.
The next pair of editing gestures are for changing the order (layer) of a selected object or set of objects. These are the up one level gesture 1300 and the down one level gesture 1302, as illustrated in
b-13f illustrate these associated functions. In
These gestures may be used with various modifiers. The SHIFT key modifies the up one level gesture 1300 to bring the selected object to the topmost position of the object stack (‘bring to front’), and modifies the down one level gesture 1302 to push the selected object to the bottom of the object stack (‘send to rear’). In addition, these gestures work with a set of multiple selected objects 104 as well.
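The reordering performed by these two gestures, with and without the SHIFT modifier, can be sketched as an operation on a list that holds the object stack from bottom to top. The function name `reorder` and its parameters are hypothetical, and the handling of multiple selected objects (each moves one level while keeping its order relative to the other selected objects) is an assumption about behavior the text does not spell out.

```python
def reorder(stack, selected, direction, to_extreme=False):
    """Return a new stack (listed bottom to top) after a level gesture.

    stack:      object names ordered from bottom layer to top layer
    selected:   set of selected object names
    direction:  "up" or "down"
    to_extreme: True when SHIFT is held (bring to front / send to rear)
    """
    stack = list(stack)
    if to_extreme:
        chosen = [o for o in stack if o in selected]
        rest = [o for o in stack if o not in selected]
        return rest + chosen if direction == "up" else chosen + rest
    if direction == "up":
        # Walk from the top down so each selected object rises one level.
        for i in range(len(stack) - 2, -1, -1):
            if stack[i] in selected and stack[i + 1] not in selected:
                stack[i], stack[i + 1] = stack[i + 1], stack[i]
    else:
        # Walk from the bottom up so each selected object sinks one level.
        for i in range(1, len(stack)):
            if stack[i] in selected and stack[i - 1] not in selected:
                stack[i], stack[i - 1] = stack[i - 1], stack[i]
    return stack
```

An object already at the top of the stack is left in place by the up one level gesture, and likewise for an object at the bottom with the down one level gesture.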
The up one level gesture 1300 and down one level gesture 1302 are also context sensitive. In the timing panel 106, the user can select an object bar 108 for an object 104, and apply the desired gesture, which in turn promotes or demotes the selected object 104 in the object stack, and updates its positioning in the tracks 116 of the timing panel. In the file browser 110, the up one level gesture 1300 moves a selected object(s) from its current directory (folder) up to the parent directory of the current directory. The down one level gesture 1302 moves a selected object(s) from the current directory into a selected child directory of the current directory, where the child directory is selected either by the user (e.g., via concurrent selection of the object and the directory), or automatically (e.g., first child directory based on sort order, whether alphabetical, by modification date, creation date, or other attribute).
a-14d illustrate the next pair of editing gestures, the set global marker gesture 1400 and the set local marker gesture 1402. The geometric form of the set global marker 1400 is a first stroke portion comprising an angle open towards the bottom of the window with an upward pointing apex, and a second, ending stroke portion connected to the angle and comprising a short horizontal line segment. The geometric form of the set local marker 1402 is a first stroke portion comprising an angle open towards the top of the window with a downward pointing apex, and a second, ending stroke portion connected to the angle and comprising a short horizontal line segment.
A global marker is a time marker set in the timing panel 106 at the current time marker 114. The user may set any number of time markers in a multimedia presentation. The time markers allow the user to jump forward or backward throughout the presentation to these global markers, which can represent any point of interest to the user.
A local time marker is a time marker that is established with respect to the start time of a particular multimedia object 114, and which stays constant relative to the start time of the object, regardless of where the object 114 is positioned in the multimedia presentation. For example, a local marker may be set at 1 second after the beginning of a user selected object 114. This marker would remain offset 1 second from the beginning of the selected object, regardless of where the object begins within the multimedia presentation.
a-15e illustrate the set play range start gesture 1500 and the set play range end gesture 1502. The geometric form of the set play range start gesture 1500 is a generally rectangular, right facing “C” shape; the geometric form of the set play range end gesture 1502 is a generally rectangular, left facing “C” shape. These forms are visual mnemonics connoting bracketing a portion or region of space or time between respective start and end points. It is instructive to note that while the geometric forms of these gestures have been described with respect to the shape of the letter “C”, their mnemonic aspects are entirely unrelated to any word or name of the associated play range functions, such as the symbols “P” or “R”. Rather, the mnemonic characteristics, as explained here, clearly derive from the directional and spatial features of the gestures, which merely happen to correspond to the general shapes of the letter “C”.
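The distinction between the two marker types can be made concrete with a brief sketch, in which a global marker stores an absolute presentation time and a local marker stores an offset from its object's start time. The class and field names (`MediaObject`, `GlobalMarker`, `LocalMarker`, `absolute_time`) are hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass

@dataclass
class MediaObject:
    name: str
    start: float            # start time within the presentation, in seconds

@dataclass
class GlobalMarker:
    time: float             # fixed time within the presentation, in seconds

    def absolute_time(self):
        return self.time

@dataclass
class LocalMarker:
    owner: MediaObject
    offset: float           # seconds after the owner's start time

    def absolute_time(self):
        # The marker travels with its object: its position in the
        # presentation is always the object's start plus the fixed offset.
        return self.owner.start + self.offset

clip = MediaObject("clip", start=10.0)
local = LocalMarker(clip, offset=1.0)
global_marker = GlobalMarker(time=11.0)
```

With the values shown, both markers initially fall at 11.0 seconds. If the object is moved so that it starts at 25.0 seconds, the local marker follows it to 26.0 seconds while the global marker remains at 11.0 seconds.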
As discussed above, the play range is a section of the current presentation that is used to control the playback behavior, allowing the user to loop over the play range to better focus his editing efforts. The set play range start gesture 1500 sets a play range marker 119 for the start of a play range at the current time marker 114. The set play range end gesture 1502 sets a play range marker 119 for an end of the play range to the current time marker 114. If a play range has already been defined with a start and end point, then these commands update the respective play range markers 119; otherwise they instantiate the markers as needed.
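The update-or-instantiate behavior of the two play range gestures may be sketched as follows. The class name `PlayRange` and its methods are hypothetical. The sketch assumes that when only one marker has been set, the missing marker defaults to the corresponding end of the presentation; this default is an assumption made for illustration.

```python
class PlayRange:
    """Start and end markers of the play range, in presentation time."""

    def __init__(self, presentation_length):
        self.length = presentation_length
        self.start = None   # no play range established yet
        self.end = None

    def set_start(self, current_time):
        """Set play range start gesture: place the start marker at the current time."""
        self.start = current_time
        if self.end is None:
            # Assumption: a missing end marker defaults to the end of the presentation.
            self.end = self.length

    def set_end(self, current_time):
        """Set play range end gesture: place the end marker at the current time."""
        self.end = current_time
        if self.start is None:
            # Assumption: a missing start marker defaults to the start of the presentation.
            self.start = 0.0
```

Calling `set_start(85.0)` followed by `set_end(95.0)` on a new `PlayRange(300.0)` first instantiates both markers and then updates the end marker, leaving a ten second play range.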
b illustrates the timing panel 106 for which no play range has been established, and at which point the user has input the set play range start gesture 1500. As shown in
View Management
The next set of gestures comprises gestures associated with functions to control and manage the view of the various windows and components of the multimedia editing application user interface.
The first pair of these gestures is the zoom in gesture 1600 and the zoom out gesture 1602, illustrated in
b-16f illustrate the associated functions of these gestures. In
The zoom in gesture 1600 and zoom out gesture 1602 are both context sensitive, and may be used in the file browser 110 and the timing panel 106.
These gestures may also be used in a preview window (e.g., like preview window 1106) when viewing the content of a selected file 112. The semantics of these gestures are as described for the canvas 102.
d illustrates the file browser 110 again at the same resolution as in
The zoom in gesture 1600 and zoom out gesture 1602 likewise operate in the timing panel 106, to change the resolution of the timescale 124 for the timing panel.
The zoom step (percent increase or decrease) for the zoom in gesture 1600 or zoom out gesture 1602 can be either fixed or set by the user, as desired. In one embodiment, the step size is +/−20%.
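A zoom step of this kind might be applied as in the following sketch. The function name `apply_zoom` and the interpretation of the step as a multiplicative change relative to the current scale are assumptions; the step could equally be defined relative to the native resolution.

```python
def apply_zoom(scale: float, zoom_in: bool, step: float = 0.20) -> float:
    """Return the new scale after one zoom gesture.

    `step` is the fractional increase or decrease (0.20 for the 20% step
    mentioned above); it is assumed to apply relative to the current scale.
    """
    if not 0.0 < step < 1.0:
        raise ValueError("step must be between 0 and 1")
    factor = 1.0 + step if zoom_in else 1.0 - step
    return scale * factor


zoomed_in = apply_zoom(1.0, zoom_in=True)     # one zoom in gesture from 100%
zoomed_out = apply_zoom(1.0, zoom_in=False)   # one zoom out gesture from 100%
```

Exposing `step` as a parameter covers both the fixed and the user-set variants.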
Various modifiers may also be used with the zoom in gesture 1600 and zoom out gesture 1602. The SHIFT key used with the zoom in gesture 1600 fits contents (e.g., objects 104, files 112, object bars 108) that are within the perimeter of the gesture stroke to the size of the current window pane.
The ALT key also modifies these gestures, by centering the rescaled window at the center of the gesture. As part of the gesture recognition, the multimedia editing application will compute the center (or centroid) of a gesture, and use that center location to be the center of the rescaled window pane.
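One way to compute the gesture center and recenter the view is sketched below. It assumes the stroke is available as a list of (x, y) sample points in content coordinates and takes the centroid to be the mean of those points; the function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def stroke_centroid(points: List[Point]) -> Point:
    """Mean of the sampled stroke points, used as the gesture's center."""
    if not points:
        raise ValueError("stroke has no points")
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def view_origin_centered_on(center: Point, pane_size: Tuple[float, float],
                            scale: float) -> Point:
    """Top-left content coordinate that places `center` at the pane's midpoint.

    At `scale`, the pane shows pane_width / scale content units horizontally
    (and likewise vertically), so half of that extent lies on each side.
    """
    width, height = pane_size
    return (center[0] - width / (2.0 * scale), center[1] - height / (2.0 * scale))


stroke = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
center = stroke_centroid(stroke)
origin = view_origin_centered_on(center, pane_size=(4.0, 2.0), scale=2.0)
```

A mean of sample points is biased toward densely sampled parts of a stroke; a bounding-box center would be an equally plausible choice.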
A related view management gesture is the interactive zoom gesture 2000, illustrated in
The interactive zoom gesture 2000 is also context sensitive, and operates in the timing panel 106 and file browser 110 with the same semantics as described for the zoom in gesture 1600 and zoom out gesture 1602.
The next view navigation gesture is the pan gesture 2100, as illustrated in
The pan gesture 2100 is also context sensitive, and may be used in the timing panel 106, the file browser 110, and a preview window. In the timing panel 106 and file browser 110, the pan gesture acts as a horizontal and vertical scroll, automatically scrolling these windows in response to the direction of panning.
A further view navigation gesture is the home view gesture 2200, as illustrated in
In the file browser 110, the home view gesture 2200 likewise returns the file browser 110 window pane to a native resolution, here too 100% scale, and then resets to a native offset for the window, at the top of the item list. In the timing panel 106, the home view gesture 2200 resets the timescale 124 to a native resolution which is here the entire project duration, running from a start time of 00:00:00 to the project end time, whatever value that may be. The current time marker 114 and play range markers 119 are not changed, though their relative location on the screen will be adjusted, as will the placement and size of all object bars in the timing panel 106.
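The effect of the home view gesture on the timing panel can be illustrated with a simple linear time-to-pixel mapping. The `TimelineView` class, its field names, and the linear mapping are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class TimelineView:
    """Hypothetical visible window of the timescale, in seconds and pixels."""
    view_start: float
    view_end: float
    width_px: float

    def time_to_x(self, t: float) -> float:
        # Linear mapping from a time value to a horizontal pixel position.
        span = self.view_end - self.view_start
        return (t - self.view_start) / span * self.width_px

    def home(self, project_end: float) -> None:
        # Home view: show the whole project, from time zero to its end.
        self.view_start = 0.0
        self.view_end = project_end


view = TimelineView(view_start=10.0, view_end=20.0, width_px=1000.0)
marker_time = 15.0                       # the marker's time value never changes
x_before = view.time_to_x(marker_time)
view.home(project_end=60.0)
x_after = view.time_to_x(marker_time)    # same time, new on-screen position
```

Only the view window is modified, which is why the markers keep their time values while their screen positions shift.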
The next view management gesture is the fit to window gesture 2300, as illustrated in
The next view management gestures are for opening and closing various ones of the window panes of the user interface.
The semantics of these gestures are as follows. First, each gesture is associated with a state changing function, in that a single gesture is used to open a closed panel, or close an open panel, thus changing the state of a particular panel. The panel which is the object of the gesture's function is indicated by the direction from which the gesture begins. Thus, the open/close bottom panel gesture 2400 begins at a point towards the bottom of the user interface window with an upward stroke (e.g., the left leg of the “N”), and thus indicates that a panel situated along the bottom of the window is the object of the gesture. The open/close top panel gesture 2402 begins at a point towards the top of the user interface window with a downward stroke (e.g., the right leg of the “N”), and thus indicates that a panel oriented along the top of the user interface window is the object of the gesture. The open/close left panel gesture 2500 begins at a point towards the left side of the user interface window with a rightward stroke (e.g., the top leg of the “Z”), and thus indicates that a panel situated along the left of the user interface window is the object of the gesture. The open/close right panel gesture 2502 begins at a point towards the right side of the user interface with a leftward stroke (e.g., the bottom leg of the “Z”), and thus indicates that a panel oriented along the right of the user interface 100 is the object of the gesture. It is instructive to note that while the geometric forms of the open/close panel gestures have been described with respect to the shapes of the letters “Z” and “N”, the mnemonic aspects are entirely unrelated to any word or name of the associated open and close functions, such as the symbols “O” or “C”. Rather, the mnemonic characteristics, as explained here, clearly derive from the directional and spatial features of the gestures, which merely happen to correspond to the general shapes of these letters.
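The state changing behavior described above amounts to a toggle keyed by the edge that the gesture designates. A minimal sketch follows; the gesture name strings, the `PanelLayout` class, and the initial panel states are hypothetical and chosen only for illustration.

```python
from typing import Dict

# Hypothetical mapping from a recognized gesture to the window edge it targets.
GESTURE_TO_EDGE: Dict[str, str] = {
    "open_close_bottom_panel": "bottom",   # gesture 2400
    "open_close_top_panel": "top",         # gesture 2402
    "open_close_left_panel": "left",       # gesture 2500
    "open_close_right_panel": "right",     # gesture 2502
}


class PanelLayout:
    """Tracks which edge panels are open; all panels start closed here."""

    def __init__(self) -> None:
        self.is_open: Dict[str, bool] = {
            edge: False for edge in GESTURE_TO_EDGE.values()
        }

    def handle(self, gesture: str) -> bool:
        """Toggle the targeted panel and return its new open state."""
        edge = GESTURE_TO_EDGE[gesture]
        self.is_open[edge] = not self.is_open[edge]
        return self.is_open[edge]


layout = PanelLayout()
opened = layout.handle("open_close_bottom_panel")   # closed panel opens
closed = layout.handle("open_close_bottom_panel")   # same gesture closes it
```

Keeping the gesture-to-edge association in a plain dictionary means it could be swapped for a user-defined or dynamically computed mapping without changing the toggle logic.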
b illustrates the user interface with just the canvas 102 and the file browser 110 panels open. Notice that the timing panel 106, which would be the bottom panel, is not currently open. The user has input the open/close bottom panel gesture 2400 into the canvas 102. In response, the multimedia editing application opens and displays the designated bottom panel, here the timing panel 106, as illustrated in
d illustrates again the user interface 100 with just canvas 102 and file browser 110 panels open. The user has input the open/close top panel gesture 2402 into the canvas 102. In response, the multimedia editing application opens and displays a designated top panel 2408, as illustrated in
b-c illustrate the functionality of the open/close left panel gesture 2500. In
The association of which panel is operated on by one of these gestures 2400, 2402, 2500, 2502 may be predetermined by the multimedia editing application, may be designated by the user, or may be dynamically determined based on the present configuration of window panels as placed by the user.
General
The following gestures are associated with general functionality of the multimedia editing application.
Preferably, the confirmation message 2900 is displayed with no fill around the text, and the text for the gesture name in a contrasting color (or pattern) to the background color of the canvas 102 or any objects 104 behind, so as to be more easily noticed by the user. This provides a “heads up” type presentation to the user, as the user can “see through” the message 2900 to the canvas 102 and objects behind it. Accordingly, the geometric form of this gesture is a visual mnemonic for the action of looking up, since the gesture has the form of a single, upward vertical stroke, in essence suggesting to the user to look up or direct their eyes upward. For example, if the canvas behind the confirmation message 2900 is black, then the confirmation message 2900 may be displayed with white text for the gesture name; if the canvas 102 behind the confirmation message 2900 is white, then the gesture name text may be displayed in black. Other variations in the formatting of the message will be apparent to those of skill in the art. This feature can be enabled or disabled in the multimedia editing application by the user, as desired by use of this gesture.
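One simple way to choose a contrasting text color is to threshold the luminance of the background, as sketched below. The function name, the 0.5 threshold, and the use of Rec. 709 luma weights on 8-bit RGB values without gamma correction are all assumptions for the example.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def contrasting_text_color(background: RGB) -> RGB:
    """Return black or white text, whichever contrasts with the background.

    Uses Rec. 709 luma weights on 8-bit RGB values as an approximation of
    perceived brightness; gamma correction is deliberately omitted.
    """
    r, g, b = (channel / 255.0 for channel in background)
    luminance = 0.2126 * r + 0.7152 * g + 0.0722 * b
    return (0, 0, 0) if luminance > 0.5 else (255, 255, 255)


on_black = contrasting_text_color((0, 0, 0))        # black canvas
on_white = contrasting_text_color((255, 255, 255))  # white canvas
```

When objects of differing colors lie behind the message, the background value could be sampled or averaged over the region covered by the text.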
As noted above, the various gestures of the present invention have shapes and characteristics that form visual mnemonics of their associated functions, without themselves being letters, numbers or other linguistic symbols. That said, it should be noted that the gestures of the present invention can be used in a multimedia editing application (or similar application) in which linguistic based gestures are also present, and hence the present invention does not require the exclusive use of the gestures disclosed and described herein.
The underlying gesture recognition mechanism may be any known or subsequently developed gesture recognition algorithm and accompanying system elements. The art of gesture recognition is well developed, and thus it is not necessary to describe in detail here how a gesture recognition system operates. In general, the gesture recognition engine maintains a gesture definition (or prototype) that describes the characteristics of each gesture. The gesture definition may be procedural, vector-oriented, statistical, curve-based, pixel mapped, or any other implementation. These prototypes form a gesture library of defined gesture attributes or characteristics, which will be stored in computer accessible memory (e.g., disk or RAM). The gesture recognition engine further maintains a mapping between each gesture and a predetermined function provided by the multimedia editing application. These functions are of course internally exposed in the multimedia editing application via function interfaces. When the multimedia editing application receives a gesture mode trigger event, indicating that the user is about to enter a gesture, it receives the input stroke data from the user interface handler of the operating system, and passes the stroke information to the gesture recognition engine. The gesture recognition engine analyses the stroke data with respect to the known gesture prototypes, and determines which gesture was input. Based on the determined gesture, the gesture recognition engine then invokes the associated function of the multimedia editing application. The gesture recognition engine may be implemented, for example, using any of the systems and methods described in the following patents and applications, all of which are incorporated by reference herein: U.S. Pat. Nos. 5,859,925; 5,768,422; 5,805,730; 5,805,731; 5,555,363; 5,563,996; 5,581,681; 5,583,542; 5,583,946; 5,590,219; 5,594,640; 5,594,810; 5,612,719; 5,677,710; 5,398,310; 5,523,775; 5,528,743, and U.S. application Ser. No. 09/520,206 (filed Mar. 7, 2000). In addition, an embodiment may also be implemented using Microsoft Corp.'s Tablet PC Software Development Kit (SDK), Version 1.5, which includes a Gesture Recognizer that can be configured to recognize custom gestures for an application, and which is incorporated by reference herein.
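The general flow described above (a library of prototypes, a mapping from gestures to functions, and dispatch on recognition) can be sketched as follows. The direction-code matcher is a deliberately simple stand-in chosen for brevity and is not the recognition method of any of the cited references or of the SDK; the class, the function names, and the registered gesture are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Point = Tuple[float, float]


def direction_codes(stroke: List[Point]) -> str:
    """Quantize a stroke into U/D/L/R codes, collapsing consecutive repeats.

    Screen coordinates are assumed, so y grows downward.
    """
    codes: List[str] = []
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # ignore repeated sample points
        if abs(dx) >= abs(dy):
            code = "R" if dx > 0 else "L"
        else:
            code = "D" if dy > 0 else "U"
        if not codes or codes[-1] != code:
            codes.append(code)
    return "".join(codes)


class GestureEngine:
    """Holds the gesture library and the gesture-to-function mapping."""

    def __init__(self) -> None:
        self._library: Dict[str, str] = {}                 # prototype -> name
        self._functions: Dict[str, Callable[[], None]] = {}

    def register(self, name: str, prototype: str,
                 function: Callable[[], None]) -> None:
        self._library[prototype] = name
        self._functions[name] = function

    def handle_stroke(self, stroke: List[Point]) -> Optional[str]:
        """Recognize the stroke and invoke its function; None if unrecognized."""
        name = self._library.get(direction_codes(stroke))
        if name is not None:
            self._functions[name]()
        return name


invoked: List[str] = []
engine = GestureEngine()
# A single upward stroke, registered here under a hypothetical name.
engine.register("toggle_confirmation", "U", lambda: invoked.append("toggle"))
recognized = engine.handle_stroke([(0.0, 10.0), (0.0, 5.0), (0.0, 0.0)])
```

Exact matching on direction codes tolerates uneven sampling but not wobble in the stroke; a statistical or curve-based prototype, as mentioned above, would replace `direction_codes` and the dictionary lookup while leaving the dispatch step unchanged.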
The present invention has been described in particular detail with respect to one possible embodiment. Those of skill in the art will appreciate that the invention may be practiced in other embodiments. First, the particular naming of the gestures, capitalization of terms, the attributes, or any other programming or structural aspect is not mandatory or significant, and the mechanisms that implement the invention or its features may have different names, formats, or protocols. Further, the system may be implemented via a combination of hardware and software, as described, or entirely in hardware elements. Also, the particular division of functionality between the various user interface components and gestures described herein is merely exemplary, and not mandatory. Those of skill in the art recognize that the present invention is implemented using a computer program(s) executing on a computer system including a processor, memory, storage devices, input devices (e.g., keyboard, mouse, pen, tablet, etc.) and output devices (e.g., display, audio, etc.), peripherals, and network connectivity interfaces. The memory, or alternatively one or more storage devices in the memory, includes a non-transitory computer readable storage medium. The details of these aspects of the system are well known to those of skill in the art, and are thus not illustrated or further described here.
The displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may also be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the described functions.
Finally, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.
Number | Name | Date | Kind |
---|---|---|---|
5252951 | Tannenbaum et al. | Oct 1993 | A |
5339393 | Duffy et al. | Aug 1994 | A |
5398310 | Tchao et al. | Mar 1995 | A |
5463696 | Beernink et al. | Oct 1995 | A |
5502803 | Yoshida et al. | Mar 1996 | A |
5523775 | Capps | Jun 1996 | A |
5528743 | Tou et al. | Jun 1996 | A |
5555363 | Tou et al. | Sep 1996 | A |
5563996 | Tchao | Oct 1996 | A |
5581681 | Tchao et al. | Dec 1996 | A |
5583542 | Capps et al. | Dec 1996 | A |
5583946 | Gourdol | Dec 1996 | A |
5588098 | Chen et al. | Dec 1996 | A |
5590219 | Gourdol | Dec 1996 | A |
5594640 | Capps et al. | Jan 1997 | A |
5594810 | Gourdol | Jan 1997 | A |
5596346 | Leone et al. | Jan 1997 | A |
5612719 | Beernink et al. | Mar 1997 | A |
5615384 | Allard et al. | Mar 1997 | A |
5657047 | Tarolli | Aug 1997 | A |
5677710 | Thompson-Rohrlich | Oct 1997 | A |
5717848 | Watanabe et al. | Feb 1998 | A |
5731819 | Gagne et al. | Mar 1998 | A |
5821930 | Hansen | Oct 1998 | A |
5835692 | Cragun et al. | Nov 1998 | A |
5835693 | Lynch et al. | Nov 1998 | A |
5883619 | Ho et al. | Mar 1999 | A |
5883639 | Walton et al. | Mar 1999 | A |
5892507 | Moorby et al. | Apr 1999 | A |
6011562 | Gagne et al. | Jan 2000 | A |
6045446 | Ohshima | Apr 2000 | A |
6154601 | Yaegashi et al. | Nov 2000 | A |
6266053 | French et al. | Jul 2001 | B1 |
6310621 | Gagne et al. | Oct 2001 | B1 |
6353437 | Gagne | Mar 2002 | B1 |
6414686 | Protheroe et al. | Jul 2002 | B1 |
6476834 | Doval et al. | Nov 2002 | B1 |
6525736 | Erikawa et al. | Feb 2003 | B1 |
6664986 | Kopelman et al. | Dec 2003 | B1 |
6714201 | Grinstein et al. | Mar 2004 | B1 |
6756984 | Miyagawa | Jun 2004 | B1 |
7000200 | Martins | Feb 2006 | B1 |
7004394 | Kim | Feb 2006 | B2 |
7158123 | Myers et al. | Jan 2007 | B2 |
7240289 | Naughton et al. | Jul 2007 | B2 |
7614008 | Ording | Nov 2009 | B2 |
20010030647 | Sowizral et al. | Oct 2001 | A1 |
20020112180 | Land et al. | Aug 2002 | A1 |
20020140633 | Rafii et al. | Oct 2002 | A1 |
20030156145 | Hullender et al. | Aug 2003 | A1 |
20040036711 | Anderson | Feb 2004 | A1 |
20040039934 | Land et al. | Feb 2004 | A1 |
20040119699 | Jones et al. | Jun 2004 | A1 |
20040141010 | Fitzmaurice et al. | Jul 2004 | A1 |
20040196267 | Kawai et al. | Oct 2004 | A1 |
20040212617 | Fitzmaurice et al. | Oct 2004 | A1 |
20050046615 | Han | Mar 2005 | A1 |
20050057524 | Hill et al. | Mar 2005 | A1 |
Number | Date | Country |
---|---|---|
101 40 874 | Mar 2008 | DE |
WO 0167222 | Sep 2001 | WO |
Entry |
---|
“Windows XP Tablet PC Edition, Developer's Documentation”, Microsoft Corp., 2004, 47 pages, retrieved from msdn.microsoft.com on May 26, 2004. |