The present disclosure relates to an information processing apparatus, an information processing method, and a program.
The present disclosure contains subject matter related to that disclosed in Japanese Priority Patent Application JP 2012-285025 filed in the Japan Patent Office on Dec. 27, 2012, the entire content of which is hereby incorporated by reference.
In recent years, a user interface (UI) that receives a user input via an image from a video camera has been proposed.
As one example, JP 2002-196855A discloses a method of superimposing object images for UI purposes on an image in which a mirror image of a user appears and carrying out an application process associated with an object image selected by movement of the user's hand. JP 2005-216061A meanwhile discloses a method which eliminates the trouble of making initial settings, such as setting a camera angle, in a UI which uses camera images by determining the position of the user's head and hands in an input image and automatically displaying an object image in the vicinity of the determined position.
PTL 1: JP 2002-196855A
PTL 2: JP 2005-216061A
However, the screen region in the vicinity of the head or hands of the user is limited. This means that with the method disclosed in JP 2005-216061A, when a large number of selectable objects are provided in a UI, the screen becomes crowded with such objects, which can conversely cause a drop in usability.
Accordingly, it is desirable to realize an improved UI capable of avoiding a drop in usability due to crowding of the screen, even when a large number of selectable objects are provided.
According to one aspect, an information processing system is described that includes processing circuitry configured to
control a movement of a UI object on a display screen from a pre-recognition position toward a post-recognition position in response to recognition of an operation object initiated by a user, wherein the post-recognition position is spatially related to a displayed position of a predetermined displayed feature, wherein the predetermined displayed feature is an image derived from a camera-captured image.
According to another aspect, an information processing method is described that includes
controlling with processing circuitry a movement of a UI object on a display screen from a pre-recognition position toward a post-recognition position in response to recognition of an operation object initiated by a user, wherein the post-recognition position is spatially related to a displayed position of a predetermined displayed feature, and the predetermined displayed feature is an image derived from a camera-captured image.
According to another aspect, a non-transitory computer readable medium is described that includes computer readable instructions that when executed by a processing circuitry perform a method, the method including
controlling with the processing circuitry a movement of a UI object on a display screen from a pre-recognition position toward a post-recognition position in response to recognition of an operation object initiated by a user, wherein the post-recognition position is spatially related to a displayed position of a predetermined displayed feature, and the predetermined displayed feature is an image derived from a camera-captured image.
According to the above embodiments of the present disclosure, a UI capable of avoiding a drop in usability due to crowding of the screen, even when a large number of selectable objects are provided, is realized.
Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the appended drawings. Note that, in this specification and the appended drawings, structural elements that have substantially the same function and structure are denoted with the same reference numerals, and repeated explanation of these structural elements is omitted.
The following description is given in the order indicated below.
1. Overview
2. First Embodiment
2-1. Example Hardware Configuration
2-2. Example Functional Configuration
2-3. Mode Of Approach of UI Objects
2-4. Various Examples of Operation Events
2-5. Incorporation of a Plurality of Operation Objects
2-6. Example Window Compositions
2-7. Example Processing Flow
3. Second Embodiment
4. Conclusion
First, an overview of an information processing apparatus to which the technology according to an embodiment of the present disclosure can be applied will be described with reference to
According to existing methods, the UI objects operated by the user may be automatically laid out in the vicinity of the head or hand of the user in the image. However, since the screen region in the vicinity of the head or hand of the user is limited, when a plurality of UI objects are provided, there is the risk of such UI objects being congested in the vicinity of the user. If the UI objects are congested in a limited screen region, it becomes difficult to select the individual UI objects, which can conversely cause a drop in usability. For this reason, the information processing apparatuses 100 and 200 avoid such a drop in usability in accordance with the framework described in detail in the following sections.
<2-1. Example Hardware Configuration>
(1) Camera
The camera 101 includes an image pickup element such as a CCD (Charge Coupled Device) or a CMOS (Complementary Metal Oxide Semiconductor) and picks up images. The images picked up by the camera 101 (frames that construct video) are treated as input images for processing by the information processing apparatus 100.
(2) Microphone
The microphone 102 picks up a voice sample produced by a user and generates a voice signal. The voice signal generated by the microphone 102 can be treated as an input voice for voice recognition by the information processing apparatus 100. The microphone 102 may be an omnidirectional microphone or a microphone with fixed or variable directionality.
(3) Input Device
The input device 103 is a device used by the user to directly operate the information processing apparatus 100. As examples, the input device 103 may include buttons, switches, dials, and the like disposed on the housing of the information processing apparatus 100. On detecting a user input, the input device 103 generates an input signal corresponding to the detected user input.
(4) Communication Interface
The communication I/F 104 acts as an intermediary for communication between the information processing apparatus 100 and another apparatus. The communication I/F 104 supports an arbitrary wireless communication protocol or wired communication protocol and establishes a communication connection with the other apparatus.
(5) Memory
The memory 105 is constructed of a storage medium such as a semiconductor memory or a hard disk drive and stores programs and data for processing by the information processing apparatus 100, as well as content data. As one example, the data stored by the memory 105 may include characteristic data used for image recognition and voice recognition, described later. Note that some or all of the programs and data described in the present specification may not be stored by the memory 105 and instead may be acquired from an external data source (as examples, a data server, network storage, or an externally-attached memory).
(6) Tuner
The tuner 106 extracts and demodulates a content signal on a desired channel from a broadcast signal received via an antenna (not shown). The tuner 106 then outputs the demodulated content signal to the decoder 107.
(7) Decoder
The decoder 107 decodes content data from the content signal inputted from the tuner 106. The decoder 107 may decode content data from a content signal received via the communication I/F 104. Content images may be generated based on the content data decoded by the decoder 107.
(8) Display
The display 108 has a screen constructed of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), a CRT (Cathode Ray Tube), or the like and displays images generated by the information processing apparatus 100. As examples, content images and images that were described with reference to
(9) Speaker
The speaker 109 has a diaphragm and circuit elements such as an amplifier and outputs audio based on an output voice signal generated by the information processing apparatus 100. The volume of the speaker 109 is variable.
(10) Remote Control Interface
The remote control I/F 110 is an interface that receives a remote control signal (an infrared signal or other wireless signal) transmitted from a remote controller used by the user. On detecting a remote control signal, the remote control I/F 110 generates an input signal corresponding to the detected remote control signal.
(11) Bus
The bus 111 connects the camera 101, the microphone 102, the input device 103, the communication I/F 104, the memory 105, the tuner 106, the decoder 107, the display 108, the speaker 109, the remote control I/F 110, and the processor 112 to each other.
(12) Processor
As examples, the processor 112 may be a CPU (Central Processing Unit) or a DSP (Digital Signal Processor). By executing a program stored in the memory 105 or on another storage medium, the processor 112 causes the information processing apparatus 100 to function in various ways as described later.
<2-2. Example Functional Configuration>
(1) Image Acquisition Unit
The image acquisition unit 120 acquires an image picked up by the camera 101 as an input image. The input image is typically an individual frame in a series of frames that construct video in which users appear. The image acquisition unit 120 then outputs the acquired input image to the recognition unit 150 and the control unit 170.
(2) Voice Acquisition Unit
The voice acquisition unit 130 acquires the voice signal generated by the microphone 102 as an input voice. The voice acquisition unit 130 then outputs the acquired input voice to the recognition unit 150. Note that processing of an input voice may be omitted from the present embodiment.
(3) Application Unit
The application unit 140 carries out various application functions of the information processing apparatus 100. As examples, a television program reproduction function, an electronic program guide display function, a recording setting function, a content reproduction function, a content searching function, and an Internet browsing function may be carried out by the application unit 140. The application unit 140 outputs application images (which may include content images) and audio which have been generated via the application function to the control unit 170.
In the present embodiment, at least some of the processes carried out by the application unit 140 are associated with objects laid out on a UI image. Such processes may be carried out in response to operation events that involve the associated UI objects. The processes that may be carried out via UI objects may include arbitrary processes, such as setting a channel and volume for a television program reproduction function, setting a channel and time period for an electronic program guide display function, selecting content for a content reproduction function, and designating a search keyword and carrying out a search for a content search function.
(4) Image Recognition Unit
The image recognition unit 152 recognizes an operation object used by the user in an input image inputted from the image acquisition unit 120. In the present embodiment, the operation object is the user's hand. A user's hand that makes a specified shape (such as a shape where the hand is open, a gripping shape, or a shape of pointing with a finger) may be used as the operation object. In other embodiments, instead of the user's hand, the user's foot or a known actual object held by the user may be used as the operation object. As one example, the image recognition unit 152 may recognize the hand region in the input image by matching image characteristic values extracted from the input image against image characteristic values of an operation object stored in advance by the characteristics DB 160. In the same way, the image recognition unit 152 may recognize a face region in the input image.
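As a concrete illustration of this matching step, the following is a minimal Python sketch using OpenCV template matching; the template image, the matching method, and the score threshold are illustrative assumptions and are not taken from the disclosure.

```python
# Minimal sketch: locating the operation object (hand) region in an input image
# by matching against stored characteristic values, here simplified to a single
# image template. Threshold and matching method are assumptions.
import cv2

def find_hand_region(input_image, hand_template, threshold=0.7):
    """Return (x, y, w, h) of the best-matching region, or None if no match."""
    result = cv2.matchTemplate(input_image, hand_template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if max_val < threshold:
        return None  # no operation object recognized in this frame
    h, w = hand_template.shape[:2]
    return (max_loc[0], max_loc[1], w, h)
```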
As one example, the image recognition unit 152 may identify the user by matching an image part (facial image) of the face region recognized in an input image against facial image data of known users stored in advance by the characteristics DB 160. As examples, the user identification result produced by the image recognition unit 152 can be used to personalize menus displayed in a UI image, to recommend content via the application unit 140, and to make adjustments to the voice recognition.
In the present embodiment, the image recognition unit 152 also recognizes gestures of the users appearing in an input image. In the example in
Examples of gestures that can be recognized by the image recognition unit 152 will now be described with reference to
In
In
In
In
In
In
In
Note that the gestures described here are mere examples. It is not necessary for the image recognition unit 152 to recognize all of such gestures, and the image recognition unit 152 may additionally recognize other types of gestures.
(5) Voice Recognition Unit
The voice recognition unit 154 carries out voice recognition on the voice of the user based on an input voice inputted from the voice acquisition unit 130. If, for example, an application being carried out or a UI receives the inputting of a voice command, the voice recognition unit 154 recognizes a voice command from the user's voice and outputs an identifier of the recognized voice command to the application unit 140 or the control unit 170.
(6) Characteristics DB
The characteristics DB 160 stores in advance image characteristics data which is to be used in image recognition by the image recognition unit 152. As one example, the image characteristics data may include known image characteristic values for an operation object (such as the hand) used by the user and the face of the user. The image characteristics data may also include facial image data for each user. The image characteristics data may also include gesture definition data defining gestures to be recognized by the image recognition unit 152. The characteristics DB 160 may also store in advance voice characteristics data to be used for voice recognition by the voice recognition unit 154.
(7) Operation Control Unit
The operation control unit 172 generates a UI image by superimposing at least one UI object on the input image, and displays a generated UI image (an output image corresponding to the input image) on the screen of the display 108. The input image for generating the UI image may differ from the input image used by the image recognition unit 152 for recognizing the operation object (as one example, an image with reduced resolution may be used to recognize the operation object). The operation control unit 172 then controls the displaying and operation of at least one UI object based on the recognition result of the operation object inputted from the image recognition unit 152.
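The superimposition itself can be pictured with a short sketch such as the one below, which mirrors the camera frame and alpha-blends icon images at the current display positions of the UI objects; the icon data, the alpha value, and the dictionary layout are illustrative assumptions.

```python
# Minimal sketch of composing a UI image: mirror the camera frame and blend
# each UI object's icon in at its current display position. Assumes every icon
# fits entirely inside the frame.
import cv2

def compose_ui_image(frame, ui_objects, alpha=0.7):
    ui_image = cv2.flip(frame, 1)              # mirror image display
    for obj in ui_objects:                     # obj: {"icon": ndarray, "pos": (x, y)}
        icon = obj["icon"]
        x, y = obj["pos"]
        h, w = icon.shape[:2]
        roi = ui_image[y:y + h, x:x + w]
        ui_image[y:y + h, x:x + w] = cv2.addWeighted(icon, alpha, roi, 1.0 - alpha, 0)
    return ui_image
```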
As shown in
In a certain scenario, the mode of approach of UI objects toward the user is uniform. That is, the operation control unit 172 sets the approach speeds of the UI objects that are to make the approach at the same value so that such UI objects all approach toward the user at the same approach speed.
In another scenario, the mode of approach of UI objects toward the user is non-uniform. That is, the operation control unit 172 sets the approach speeds of the UI objects that are to make the approach at different values so that the UI objects approach toward the user at different approach speeds. In addition to (or instead of) the approach speed, other attributes of the UI objects may be set non-uniformly. As examples, such other attributes may include at least one of approach start timing, post-approach display positions (hereinafter referred to as the “target positions”), display size, transparency, and depth.
As one example, when causing the display positions of the UI objects to approach toward the user, the operation control unit 172 may vary the mode of approach of the respective objects in accordance with priorities set for the respective objects. Such priorities are set in advance by the priority setting unit 174 in accordance with a specific priority setting standard and stored by an operation DB 180. A first example of the priority setting standard is a standard relating to an operation history of the UI objects. As one example, the priorities may be set higher for UI objects with a higher operation frequency (the number of past operations per specific period) and the priorities may be set lower for UI objects with a lower operation frequency. It is also possible to set the priority higher for UI objects that were operated at more recent timing in the past. A second example of a priority setting standard is a standard relating to user attributes. As one example, out of a plurality of content items, the priorities of UI objects corresponding to content items with a high recommendation score calculated according to a known recommendation technology based on the user attributes may be set at higher values. The operation control unit 172 may provide the user with a UI for switching the priority setting standard at desired timing between a plurality of candidates. Such UI may be realized by any method such as user gestures or voice commands.
Typically, the operation control unit 172 sets the approach speed and other attributes of the UI objects so as to make objects that have higher priorities easier to operate for the user. More specifically, as one example, the operation control unit 172 may set the approach speed toward the user higher for objects with higher priorities. The operation control unit 172 may also set the approach start timing of objects earlier for objects with higher priorities. Also, for an object with higher priority, the operation control unit 172 may set the target position closer to the user, the display size larger, the transparency lower, or the depth shallower.
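One purely illustrative mapping from a priority value to the attributes listed above is sketched below; the concrete scale factors are assumptions and not part of the disclosure.

```python
# Sketch: higher priority -> faster approach, earlier start, larger size,
# lower transparency, shallower depth. All numeric ranges are assumptions.
def attributes_from_priority(priority, max_priority):
    ratio = priority / max_priority if max_priority else 0.0   # 0.0 .. 1.0
    return {
        "approach_speed": 50 + 150 * ratio,      # pixels per second
        "start_delay": 0.5 * (1.0 - ratio),      # seconds before approach begins
        "display_size": 48 + int(32 * ratio),    # icon side length in pixels
        "transparency": 0.5 * (1.0 - ratio),     # 0.0 = opaque
        "depth": 1.0 - ratio,                    # smaller = shallower
    }
```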
Regardless of whether the mode of approach toward the user of the UI objects is uniform or non-uniform, the operation control unit 172 controls operations of UI objects by the user in response to a number of operation events defined in advance. The operation events typically include recognition of a user gesture, and recognition of voice commands may be used to complement such recognition. At least one operation event is recognition of a new operation object. Recognition of a new operation object may trigger UI objects approaching toward the user. Another operation event may trigger execution (launching) of a process associated with a UI object. A number of specific examples of operation events that can be used in the present embodiment are described later in this specification.
The operation control unit 172 also controls the displaying of a UI image via the display 108. As one example, the operation control unit 172 may display only a UI image on which UI objects are superimposed on the screen of the display 108. Alternatively, the operation control unit 172 may display a single output image generated by combining a UI image and an application image generated by the application unit 140 on the screen. A number of examples of window compositions of output images that can be used in the present embodiment are described later.
(8) Priority Setting Unit
The priority setting unit 174 sets the priority of each UI object in accordance with the priority setting standard described earlier. As one example, in accordance with a priority setting standard relating to the operation histories of objects, the priority setting unit 174 may set the priorities higher for UI objects with a higher operation frequency. Also, in accordance with a priority setting standard relating to user attributes, the priority setting unit 174 may set the priorities higher for UI objects corresponding to content items with a higher recommendation score. The priority setting unit 174 may also set the priorities of UI objects randomly to add an element of surprise to the UI. The priority setting unit 174 may update the priority data for example when a UI object has been operated or when the user attributes have changed.
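As a sketch of these priority setting standards, the function below derives a priority from an operation history, from externally supplied recommendation scores, or at random; the weighting of frequency against recency is an assumption.

```python
# Sketch of the priority setting unit 174's standards. operation_history maps
# object id -> {"count": int, "last_operated": epoch seconds}; the recommendation
# scores are assumed to be computed elsewhere by a known recommendation technology.
import random
import time

def set_priority(object_id, operation_history, recommendation_scores, standard="history"):
    if standard == "history":
        entry = operation_history.get(object_id, {})
        frequency = entry.get("count", 0)
        hours_since = (time.time() - entry.get("last_operated", 0.0)) / 3600.0
        recency = 1.0 / (1.0 + hours_since)     # more recent -> higher
        return frequency + recency
    if standard == "recommendation":
        return recommendation_scores.get(object_id, 0.0)
    return random.random()                      # random standard adds surprise
```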
(9) Operation DB
The operation DB 180 stores data used by the operation control unit 172 to control displaying and operations of UI objects. The data stored by the operation DB 180 includes object data showing a default display position and default values of other attributes for each UI object. The data stored by the operation DB 180 may also include priority data showing priorities set by the priority setting unit 174, operation history data showing an operation history for each user, and user data showing attributes (such as age, sex, occupation, and tastes) for each user.
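The records held by the operation DB 180 might be modelled as in the following sketch; the field names and types are illustrative assumptions rather than a disclosed schema.

```python
# Sketch of operation DB 180 records: object data (default display position and
# attributes), priority data, and per-user data with operation history.
from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass
class ObjectRecord:
    object_id: str
    default_position: Tuple[int, int]
    default_size: int = 64
    priority: float = 0.0                         # set by the priority setting unit 174

@dataclass
class UserRecord:
    user_id: str
    attributes: Dict[str, str] = field(default_factory=dict)        # age, sex, tastes, ...
    operation_history: Dict[str, int] = field(default_factory=dict) # object_id -> count
```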
<2-3. Mode Of Approach of UI Objects>
This section will describe a number of examples of modes of approach toward the user of UI objects with reference to
In
A user Ud appears in the UI image ST11 and a mirror image display is realized. UI objects B11 to B16 are laid out at default display positions. It is assumed here that as a result of the user Ud raising his hand, the image recognition unit 152 recognizes the gesture G0.
In the next UI image ST12, the user Ud is raising his hand. In response to the recognition of the gesture G0, the operation control unit 172 causes the UI objects B11 to B16 to start approaching toward the user Ud (as one example, the hand that is the operation object of the user Ud). In the UI image ST12, the UI object B13 is positioned closest to the user Ud.
In the next UI image ST13, the UI object B13 returns to the default display position and in place of the UI object B13, the UI objects B12 and B14 are positioned in the vicinity of the user Ud's hand. In the next UI image ST14, the UI objects B12 and B14 return to the default display positions and in place of the UI objects B12 and B14, the UI objects B11, B15 and B16 are positioned in the vicinity of the user Ud's hand.
According to this mode of approach, it is possible for the user to touch the UI object B13 at the time of the UI image ST12, to touch any of the UI objects B12 and B14 at the time of the UI image ST13, and to touch any of the UI objects B11, B15, and B16 at the time of the UI image ST14 with a simple operation of merely moving his hand or arm. At such time, since the screen region in the vicinity of the user is not crowded by a large number of UI objects, the risk of an erroneous operation being carried out, such as touching the wrong UI object, is reduced.
In
The user Ud appears in the UI image ST21 and a mirror image display is realized. UI objects B11 to B16 are laid out at default display positions. It is assumed here that as a result of the user Ud raising his hand, the image recognition unit 152 recognizes the gesture G0.
In the next UI image ST22, the user Ud is raising his hand. In response to the recognition of the gesture G0, the operation control unit 172 causes the UI objects B11 to B16 to start approaching toward the user Ud. In the UI image ST22, the UI object B13 is positioned closest to the user Ud.
In the next UI image ST23, the UI object B13 remains at its target position and the UI objects B12 and B14 also reach the vicinity of the user Ud's hand. In the next UI image ST24, the UI objects B12, B13, and B14 remain at their target positions and the UI objects B11, B15 and B16 also reach the vicinity of the user Ud's hand.
According to this mode of approach, it is possible for the user to touch the UI object B13 at the time of the UI image ST22, to touch any of the UI objects B12, B13, and B14 at the time of the UI image ST23, and to touch any of the UI objects B11 to B16 at the time of the UI image ST24 with a simple operation of merely moving his hand or arm. In particular, at the time of the UI images ST22 and ST23, crowding of the screen region in the vicinity of the user by a large number of UI objects is avoided. Also, as described later, if an operation event of moving the display position of a designated UI object away from the user is used, it is possible to ease the crowding of the screen region in the vicinity of the user from a state where the UI objects B11 to B16 have reached the vicinity of the user.
In
The user Ud appears in the UI image ST31 and a mirror image display is realized. The UI objects B11 to B16 are laid out at default display positions. It is assumed here that as a result of the user Ud raising his hand, the image recognition unit 152 recognizes the gesture G0.
In the next UI image ST32, the user Ud is raising his hand. In response to the recognition of the gesture G0, the operation control unit 172 sets the respective approach speeds V11 to V16 of the UI objects B11 to B16 in accordance with the priorities set by the priority setting unit 174. In the example in
The operation control unit 172 refers to such priority data 182a and sets the approach speeds V11 and V15 of the UI objects B11 and B15 at the fastest speed, the approach speed V12 of the UI object B12 at the next fastest speed, and the approach speeds V13, V14, and V16 of the UI objects B13, B14, and B16 at the slowest speed. As shown in
According to this mode of approach, it is possible for the user to rapidly operate the UI objects that are operated most frequently. Also, since the screen region in the vicinity of the user does not become crowded with a large number of UI objects, the risk of an erroneous operation being carried out, such as touching the wrong UI object, is reduced.
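The frame-by-frame movement implied by these modes of approach can be sketched as a simple interpolation of each display position toward its target position at the object's own approach speed; the time-step handling below is an assumption.

```python
# Sketch of one animation step: move a UI object from its current display
# position toward its target position (near the recognized hand) at its own
# approach speed, clamping at the target.
import math

def step_toward_target(position, target, speed, dt):
    dx, dy = target[0] - position[0], target[1] - position[1]
    distance = math.hypot(dx, dy)
    if distance <= speed * dt:
        return target                              # target position reached
    scale = speed * dt / distance
    return (position[0] + dx * scale, position[1] + dy * scale)
```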
The user Ud appears in the UI image ST41 and a mirror image display is realized. The UI objects B21 to B26 are laid out at default display positions. It is assumed here that as a result of the user Ud raising his hand, the image recognition unit 152 recognizes the gesture G0.
In the next UI image ST42, the user Ud is raising his hand. In response to the recognition of the gesture G0, the operation control unit 172 sets the respective approach speeds V21 to V26 of the UI objects B21 to B26 in accordance with the priorities set by the priority setting unit 174. In the example in
The operation control unit 172 refers to such priority data 182b and sets the approach speed V21 of the UI object B21 at the fastest speed, the approach speeds V22 and V25 of the UI objects B22 and B25 at the next fastest speed, and the approach speeds V23, V24, and V26 of the UI objects B23, B24 and B26 at the slowest speed. Returning to
According to this mode of approach, it is possible for the user to rapidly operate a UI object determined by a recommendation algorithm to be more suited to such user. Also, since the screen region in the vicinity of the user is not crowded by a large number of UI objects, the risk of an erroneous operation being carried out, such as touching the wrong UI object, is reduced.
In
The user Ud appears in the UI image ST51 and a mirror image display is realized. The UI objects B11 to B16 are laid out at default display positions. It is assumed here that as a result of the user Ud raising his hand, the image recognition unit 152 recognizes the gesture G0.
In the next UI image ST52, the user Ud is raising his hand. In response to the recognition of the gesture G0, the operation control unit 172 sets the respective approach speeds and display sizes of the UI objects B11 to B16 in accordance with the priorities set by the priority setting unit 174. In the example in
The operation control unit 172 refers to such priority data 182a and sets the approach speeds V11 and V15 of the UI objects B11 and B15 at the fastest speed, the approach speed V12 of the UI object B12 at the next fastest speed, and the approach speeds V13, V14, and V16 of the UI objects B13, B14 and B16 at the slowest speed. The operation control unit 172 also sets the display size of the UI objects B11 and B15 the largest, the display size of the UI object B12 the next largest, and the display sizes of the UI objects B13, B14 and B16 the smallest.
In the next UI image ST53, most of the UI objects have already reached the vicinity of the user Ud's hand and, out of such UI objects, the UI objects B11 and B15 have larger display sizes than the other UI objects.
According to this mode of approach, it is possible for the user to operate a UI object that has a higher priority more rapidly and more accurately.
<2-4. Various Examples of Operation Events>
In this section, a number of examples of operation events relating to control by the operation control unit 172 will be described with reference to
As a result of using such an operation event, the user is capable of remotely controlling the information processing apparatus 100 even when a remote controller is not at hand. At this time, since the only movement necessary by the user is a simple gesture, the user is capable of having the information processing apparatus 100 carry out a desired process (for example, a menu process or an application process) without feeling stress.
Note that the process associated with a UI object may be a process for UI control. For example, opening a submenu of the designated menu item, calling a setting screen corresponding to the designated menu item, and the like may be carried out in response to recognition of a gesture of touching a UI object.
In the UI image ST15, the operation control unit 172 sets display attributes (for example, at least one of texture, color, transparency, display size, and depth) of the designated UI object B16 at attribute values that differ from those of the other UI objects. By doing so, it is possible for the user to grasp that the UI object B16 was appropriately designated.
Even when such an operation event is used, it is possible for the user to remotely control the information processing apparatus 100 even when a remote controller is not at hand. At this time, since the only movement necessary for the user is a simple gesture, it is possible to reduce the user's stress. Note that for the other examples of operation events described in this section, recognition of the user's gesture may be substituted with recognition of a voice command.
In the next UI image ST26, the image recognition unit 152 may recognize the gesture G3a of rotating the hand from a movement where the user Ud's hand, which is the operation object, rotates. In response to such an operation event, the operation control unit 172 rotates (in the direction D1 in the image) the display positions of the UI objects B11 to B16 around a reference point in the image. As examples, the reference point referred to here may be a center of gravity of the hand region, a center of the UI objects B11 to B16, or any other arbitrary point.
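A minimal sketch of this rotation of the display positions is given below; it rotates each position around the chosen reference point by an angle derived from the recognized hand rotation, the source of which is assumed.

```python
# Sketch: rotate UI object display positions around a reference point (for
# example the center of gravity of the hand region) by angle_rad.
import math

def rotate_positions(positions, reference, angle_rad):
    cx, cy = reference
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    return [(cx + (x - cx) * cos_a - (y - cy) * sin_a,
             cy + (x - cx) * sin_a + (y - cy) * cos_a)
            for x, y in positions]
```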
As a result of using an operation event of stopping the movement of the UI objects described earlier, it is possible for the user to stop further movement of the objects when a desired UI object has reached a display position that is suited to operation and to then accurately operate a desired UI object.
Also, as a result of using an operation event of rotating the display positions of the UI objects described earlier, it is possible for the user to move the display positions of UI objects that have approached the vicinity of the user to positions that are easier to handle. Instead of rotating the display positions of the UI objects in response to recognition of the gesture of rotating the hand, the display positions of the UI objects may move in parallel in response to movement of the user's hand. Note that instead of all of the displayed UI objects rotating or moving as shown in the example in
As a result of using such an operation event, even if a UI object that is not necessary has approached the user earlier than a desired UI object, it is still possible for the user to remove the UI object that is not necessary from the screen region in the vicinity of the user and thereby prevent crowding of such screen region.
The user Ud appears in the UI image ST51 and a mirror image display is realized. The UI objects B11 to B16 are also laid out at default display positions. The UI objects B11 to B16 are objects belonging to a first category out of a plurality of categories defined in advance. As one example, the first category is a category relating to a television program reproduction function. Note that in an initial state, UI objects are not necessarily visible to the user. For example, in an initial state, UI objects may be positioned outside the screen, or may be transparent or translucent. The UI objects may change from a non-active state (undisplayed or translucent) to an active state (displayed or non-transparent) at timing where the user raises his/her hand.
In a UI image ST52, as a result of the user Ud raising his hand, the UI objects B11 to B16 start to approach the user Ud. In the UI image ST52, the image recognition unit 152 may recognize the gesture G1 of waving the hand from a movement where the user's hand is waved to the left and right. In response to an operation event that corresponds to recognition of the gesture G1, the operation control unit 172 replaces the objects B11 to B16 laid out in the UI image with the UI objects belonging to a second category. The second category may be an arbitrary category (such as a category relating to a content reproduction function) that differs from the first category.
In the UI image ST54, the objects B11 to B16 are removed from the screen and new UI objects B31 to B37 are laid out on the screen.
As a result of using such an operation event, the information processing apparatus 100 is capable of displaying only some of the UI objects on the screen without displaying all of the UI object candidates that can be displayed on the screen. Accordingly, crowding of the screen region is significantly eased. It is also possible for the user to have a desired UI object, which is not presently displayed at such time, displayed on the screen via a simple gesture and to appropriately operate such UI object.
Note that selection of the category of UI objects to be displayed in an UI image may depend on the shape of the user's hand. For example, the UI objects that have been displayed so far may be replaced with UI objects that belong to any of the first to fifth categories in response to recognition of five types of hand shape that respectively express the numbers one to five.
Also, the gesture G1 may be defined not as a gesture for switching the category of UI objects to be displayed but as a gesture for switching the priority setting standard used to set the approach speeds. In this case, in response to the operation event corresponding to the recognition of the gesture G1, the operation control unit 172 resets the priorities of at least one of the UI objects being displayed in accordance with the new priority setting standard.
The user Ud appears in the UI image ST42 and is raising his hand. The UI objects B21 to B26 are approaching toward the user Ud at approach speeds respectively set by the operation control unit 172.
In the next UI image ST44, the image recognition unit 152 may recognize the gesture G5 of grasping an object from a movement of the user's hand that changes from a shape where the palm of the hand is open to a shape where the hand is closed. By comparing the position in the UI image of the hand region at such time with the display positions of the respective UI objects, the operation control unit 172 may determine that the UI object B25 is designated (that is, grasped). In response to an operation event corresponding to the recognition of the gesture G5, the operation control unit 172 thereafter has the display position of the designated UI object B25 track the position of the hand region (that is, has the UI object B25 move together with the operation object).
In the next UI image ST45, UI objects aside from the designated UI object B25 are removed. Also, two screen regions R11 and R12 are set in the image. As one example, in a UI image, the operation control unit 172 may set a number of screen regions equal to the number of processes associated with the designated UI object B25. As one example, if the UI object B25 is a content item for photographic content, the screen region R11 may be associated with launching an image viewer and the screen region R12 may be associated with transmitting a message to which the photographic content is appended.
In the next UI image ST46, as a result of the user's hand region moving to a position that coincides with the screen region R12, the display position of the UI object B25 also moves to a position that coincides with the screen region R12. In response to the operation event corresponding to the movement of the UI object B25 to such specific screen region, the operation control unit 172 causes the application unit 140 to carry out a process associated with the UI object B25. Here, as one example, a message transmission function is launched by the application unit 140 and photographic content may be appended to a new message.
As a result of using such an operation event, it is possible for the user to launch a desired process for a UI object with an easy and intuitive operation, even when a variety of processes are related to a single UI object.
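The drop-into-a-region behaviour described above can be sketched as a simple hit test of the grasped object's display position against the screen regions; the rectangle representation and the callback interface below are illustrative assumptions.

```python
# Sketch: when the tracked UI object's display position enters a screen region,
# carry out the process associated with that region (e.g. launch an image viewer
# or attach the content to a new message).
def check_region_drop(object_position, regions):
    """regions: list of (x, y, width, height, callback) tuples."""
    px, py = object_position
    for x, y, w, h, callback in regions:
        if x <= px < x + w and y <= py < y + h:
            callback()
            return True
    return False
```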
The user Ud appears in the UI image ST61 and a mirror image display is realized. The UI objects B11 to B16 are also laid out at default display positions. The UI objects B11 to B16 are objects belonging to a first category out of a plurality of categories defined in advance. As one example, the first category is a category relating to a television program reproduction function. In addition, four screen regions R21 to R24 are set in the image. The screen regions R21 to R24 may be associated with respectively different categories.
In the UI image ST62, as a result of the user Ud raising his hand, the UI objects B11 to B16 start to approach toward the user Ud. The position of the user Ud's hand coincides with the screen region R23. It is assumed that the screen region R23 is associated with the first category.
In the UI image ST63, as a result of the position of the hand region being lowered in a downward direction, the position of the user Ud's hand region coincides with the screen region R24. It is assumed that the screen region R24 is associated with a different category to the first category. In response to an operation event corresponding to movement of the hand that is the operation object between screen regions, the objects B11 to B16 laid out in the UI image are replaced with objects belonging to another category.
In the UI image ST63, the objects B11 to B16 are removed from the screen and new UI objects B41 to B45 are laid out in the image. In the next UI image ST64, the UI objects B41 to B45 start to approach toward the user Ud.
In the same way as the sixth example described earlier, as a result of such an operation event being used, the information processing apparatus 100 is capable of displaying only some of the UI objects on the screen instead of all of the UI object candidates that can be displayed. Accordingly, crowding of the screen region is significantly eased. It is also possible for the user to have desired UI objects that are not on display at the present time displayed on the screen via a simple operation of merely moving the hand and to appropriately operate such UI objects.
<2-5. Incorporation of a Plurality of Operation Objects>
Examples where a single operation object is recognized in the input image have mainly been described so far. However, with the technology according to the present disclosure, a plurality of operation objects may be recognized in the input image. In this section, a number of example operation scenarios involving a plurality of operation objects will be described with reference to
In
A user Ud appears in the UI image ST71 and a mirror image display is realized. UI objects B51 to B58 are laid out at default display positions. It is assumed here that the UI objects B51 to B58 are grouped into a plurality of groups in accordance with a grouping standard. As examples, the UI objects may be grouped according to a standard relating to the priorities described earlier, the types of corresponding menu items or content items, or display positions, or may be randomly grouped.
In the next UI image ST72, the user Ud raises his left hand and the hand region A21 is recognized. In response to recognition of the gesture G0 for the hand region A21, the operation control unit 172 causes the UI objects B53 to B56 included in the first group to start approaching toward the user Ud.
In the next UI image ST73, the user Ud further raises his right hand and the hand region A22 is recognized. In response to recognition of the gesture G0 for the hand region A22, the operation control unit 172 causes the UI objects B51, B52, B57, and B58 included in the second group to start approaching toward the user Ud.
In the next UI image ST74, the UI objects B53 to B56 are laid out in a ring in the vicinity of the user Ud's left hand and the UI objects B51, B52, B57, and B58 are laid out in a ring in the vicinity of the user Ud's right hand. As one example, in response to recognition of a gesture of the user Ud bringing his hands together, the operation control unit 172 may form a single ring by merging the two rings of such UI objects. If the hand regions are positioned at edge portions of the screen, the operation control unit 172 may distort the shapes of the rings.
According to this operation scenario, it is possible for the user to have a large number of UI objects approach the vicinity of the user using both hands. A range produced by combining the regions that can be reached by two hands is wider than a region that can be reached by one hand. This means that the user can rapidly designate a desired UI object out of a larger number of UI objects in the vicinity and operate the desired UI object.
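One possible grouping consistent with the standards mentioned earlier splits the UI objects into two groups by priority, one group per recognized hand; the half-and-half split rule in this sketch is an assumption.

```python
# Sketch: grouping UI objects so that one group approaches the first recognized
# hand and the remainder approaches the second. Other standards (type, display
# position, random) could replace the priority sort.
def group_by_priority(object_ids, priorities):
    ordered = sorted(object_ids, key=lambda oid: priorities.get(oid, 0.0), reverse=True)
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]
```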
In
The user Ud and a user Ue appear in the UI image ST81 and a mirror image display is realized. UI objects B61 to B68 are also displayed. The user Ud raises his left hand and a hand region A21 is recognized. In response to recognition of the gesture G0 for the hand region A21, the UI objects B61 to B68 are approaching toward the user Ud.
In the next UI image ST82, the user Ue is raising her right hand and a hand region A31 is recognized. In response to recognition of the gesture G0 for the hand region A31, the operation control unit 172 has the UI objects B61, B64, B65 and B68 start to approach toward the user Ue. As one example, the UI objects may be grouped into a plurality of groups in accordance with a grouping standard relating to an operation history for each user or user attributes. In the example in
In the UI image ST82, by setting display attributes (for example, color) of the UI objects B61, B64, B65 and B68 at different attribute values to the other UI objects, the operation control unit 172 expresses that the UI objects B61, B64, B65 and B68 are included in the second group intended for the user Ue. If the target positions of the two groups interfere with one another, the operation control unit 172 may shift the target positions to eliminate such interference.
According to this operation scenario, it is possible to share UI objects among a plurality of users. When doing so, it is possible for each user to have suitable UI objects for such user approach the user and to rapidly operate a desired UI object. The fifth example of an operation event described with reference to
In
The user Ud and the user Ue appear in the UI image ST81 and a mirror image display is realized. The UI objects B61 to B68 are also displayed. The user Ud raises his left hand and a hand region A21 is recognized. In response to recognition of the gesture G0 for the hand region A21, the UI objects B61 to B68 continue to approach toward the user Ud.
In the next UI image ST83, the user Ue uses her right hand to touch the UI object B65. In response to recognition of the gesture G4a for the hand region A32 of the user Ue's right hand, the operation control unit 172 may determine that the UI object B65 is designated. In response to such operation event, the operation control unit 172 has the application unit 140 carry out a process associated with the designated UI object B65.
According to this operation scenario, it is possible for a different user to easily operate a UI object that has approached toward a given user.
Note that examples where UI objects are two-dimensionally laid out in a UI image have mainly been described so far. However, the respective UI objects are not limited to having two-dimensional display positions and may have an attribute corresponding to depth. If the information processing apparatus 100 is capable of recognizing the distance between a camera and an operation object using a known method such as parallax, the operation control unit 172 may determine which UI object has been designated by the user also based on such recognized distance.
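A sketch of designation that also uses the recognized distance is shown below; the in-plane and depth thresholds are assumptions.

```python
# Sketch: a UI object counts as designated only if the hand is close to it both
# in the image plane and in depth; among candidates the shallowest one wins.
import math

def designated_object(hand_xy, hand_depth, ui_objects, xy_threshold=40.0, depth_threshold=0.2):
    """ui_objects: list of dicts with keys 'id', 'pos' (x, y) and 'depth'."""
    best = None
    for obj in ui_objects:
        dx = obj["pos"][0] - hand_xy[0]
        dy = obj["pos"][1] - hand_xy[1]
        close_in_plane = math.hypot(dx, dy) <= xy_threshold
        close_in_depth = abs(obj["depth"] - hand_depth) <= depth_threshold
        if close_in_plane and close_in_depth:
            if best is None or obj["depth"] < best["depth"]:
                best = obj
    return best
```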
<2-6. Example Window Compositions>
<2-7. Example Processing Flow>
The flowcharts in
As shown in
Next, the image recognition unit 152 recognizes the operation object appearing in the input image inputted from the image acquisition unit 120 (step S105). It is assumed here that the operation object is the user's hand. For example, the image recognition unit 152 recognizes a hand region in the input image and outputs position data showing the position of such recognized hand region to the control unit 170. The image recognition unit 152 also recognizes a user gesture based on movement of the hand region. In addition, a voice command may also be recognized by the voice recognition unit 154 based on an input voice.
Next, the operation control unit 172 determines an operation event based on an image recognition result inputted from the image recognition unit 152 and a voice recognition result that may be inputted as necessary from the voice recognition unit 154 (step S110). The subsequent processing branches in accordance with the operation event determined here.
In step S115, the operation control unit 172 determines whether a new set of UI objects is to be displayed (step S115). As examples, if a UI image is to be newly displayed or if the operation event described with reference to
In step S120, the operation control unit 172 determines whether any of the UI objects has been selected (step S120). As one example, if the operation event described with reference to
In step S125, in response to an operation event that selects a UI object, the operation control unit 172 causes the application unit 140 to carry out a process associated with the selected UI object (step S125). By increasing the operation frequency of the selected UI object for example, the priority setting unit 174 then updates the priority data (step S130). After this, the processing returns to step S100.
In step S135, the operation control unit 172 sets up the new set of UI objects (step S135). As one example, the operation control unit 172 specifies a set of UI objects belonging to a different category to the set of UI objects that were displayed in the previous frame. The operation control unit 172 then lays out the UI objects included in the new set at the default display positions (step S140). After this, the processing proceeds to step S145.
In step S145, the operation control unit 172 determines whether an operation object has been newly recognized (step S145). As examples, if the gesture G0 described with reference to
In step S150, the operation control unit 172 sets the approach speeds and other attributes of the UI objects (step S150). As one example, the operation control unit 172 may set the approach speed toward the user of an object with a higher priority at a higher speed. The operation control unit 172 may also set the display size of an object with a higher priority at a larger size.
As shown over in
In step S160, the operation control unit 172 updates the display positions of UI objects related to a special event (step S160). As one example, if the operation control unit 172 has detected the operation event described with reference to
The operation control unit 172 then updates the display positions of other UI objects based on their approach speeds (step S165). As one example, the display positions of objects that have faster approach speeds may be moved much closer toward the user.
After this, the operation control unit 172 generates a UI image by superimposing at least one UI object on the input image in accordance with the display positions and attributes decided via the processing so far (step S170). The operation control unit 172 displays an output image including a generated UI image on the screen of the display 108 (step S175). After this, the processing returns to step S100.
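The overall flow of steps S100 to S175 can be summarised as a single per-frame routine, sketched below; the unit objects and their method names are placeholders standing in for the functional blocks described above, not an actual API.

```python
# Per-frame sketch of the processing flow (S100-S175). `units` and `state` are
# hypothetical containers; every method name here is an assumed placeholder.
def process_frame(units, state):
    frame = units.image_acquisition.acquire()                        # S100
    recognition = units.image_recognition.recognize(frame)           # S105
    event = units.operation_control.determine_event(recognition)     # S110

    if event.selects_object():                                       # S120
        units.application.execute(event.selected_object)             # S125
        units.priority_setting.update(event.selected_object)         # S130
    elif event.requests_new_set():                                   # S115
        state.ui_objects = units.operation_control.new_object_set()  # S135, S140
    elif event.new_operation_object():                               # S145
        units.operation_control.set_approach_attributes(state.ui_objects)  # S150

    units.operation_control.update_positions(state.ui_objects)       # S160, S165
    ui_image = units.operation_control.compose(frame, state.ui_objects)    # S170
    units.display.show(ui_image)                                     # S175
```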
As described earlier, the technology according to an embodiment of the present disclosure is not limited to a television apparatus and can be applied to various types of apparatus. For this reason, an example where the technology according to an embodiment of the present disclosure is applied to the information processing apparatus 200 will now be described as a second embodiment. As was described with reference to
(1) Example Hardware Configuration
The camera 201 includes an image pickup element such as a CCD or a CMOS and picks up images. The images picked up by the camera 201 (frames that construct video) are treated as input images for processing by the information processing apparatus 200.
The sensor 202 may include various sensors such as a measurement sensor, an acceleration sensor, and a gyro sensor. The sensor data generated by the sensor 202 may be used by an application function of an information processing apparatus 200.
The input device 203 is a device used by the user to directly operate the information processing apparatus 200 or to input information into the information processing apparatus 200. As one example, the input device 203 may include a touch panel, buttons, switches, and the like. On detecting a user input, the input device 203 generates an input signal corresponding to the detected user input.
The communication I/F 204 acts as an intermediary for communication between the information processing apparatus 200 and another apparatus. The communication I/F 204 supports an arbitrary wireless communication protocol or wired communication protocol and establishes a communication connection with the other apparatus.
The memory 205 is constructed of a storage medium such as a semiconductor memory or a hard disk drive and stores programs and data for processing by the information processing apparatus 200, as well as content data. Note that some or all of the programs and data may not be stored by the memory 205 and instead may be acquired from an external data source (as examples, a data server, network storage, or an externally attached memory).
The display 208 has a screen constructed of an LCD, an OLED, or the like and displays images generated by the information processing apparatus 200. As one example, the same UI images as those described in the first embodiment may be displayed on the screen of the display 208.
The speaker 209 has a diaphragm and circuit elements such as an amplifier and outputs audio based on an output audio signal generated by the information processing apparatus 200. The volume of the speaker 209 is variable.
The bus 211 connects the camera 201, the sensor 202, the input device 203, the communication I/F 204, the memory 205, the display 208, the speaker 209, and the processor 212 to each other.
As examples, the processor 212 may be a CPU or a DSP. By executing a program stored in the memory 205 or on another storage medium, in the same way as the processor 112 of the information processing apparatus 100 according to the first embodiment, the processor 212 causes the information processing apparatus 200 to function in various ways. Aside from differences in the application function, the configuration of the logical functions realized by the memory 205 and the processor 212 of the information processing apparatus 200 may be the same as the configuration of the information processing apparatus 100 illustrated in
(2) Example Operation Scenario
In the output image ST91, the application image WAPP includes text written in a Web page. In
The next output image ST92 may be displayed after the hand of the user Ud that is the operation object is recognized, for example. In the output image ST92, UI objects B71 to B73 are superimposed on the UI image. The UI object B71 is associated with the keyword “XXX Computer Entertainment Inc.”. The UI object B72 is associated with the keyword “GameStation”. The UI object B73 is associated with the keyword “Christmas”.
In the next output image ST93, the user Ud's hand coincides with the UI object B72. Three screen regions R41, R42, and R43 are set in the UI image. The screen region R41 is associated with a Web search (text search) process. The screen region R42 is associated with an image search process. The screen region R43 is associated with a movie search process.
In the next output image ST94, the UI object B72 has moved so as to track movement of the user Ud's hand and has moved to a position that coincides with the screen region R41. In response to such an operation event, the operation control unit 172 of the information processing apparatus 200 causes the application unit 140 to carry out a Web search function that uses the keyword “GameStation” shown by the UI object B72.
Embodiments of the present disclosure have been described in detail so far with reference to
Also, according to the embodiments described above, the mode of approach toward the user of the UI objects may vary according to the priorities set for the respective UI objects. Accordingly, the user is capable of rapidly operating a UI object that has a higher priority (as examples, a UI object operated with higher frequency or a UI object determined to be suited to the user).
According to the embodiments described above, various operation events triggered by user gestures may be realized. Accordingly, the user is capable of flexibly operating an information appliance using UI objects that have approached the vicinity of the user, even when the user does not have a remote controller or other physical operation device.
Note that the series of processes carried out by the various apparatuses described as embodiments of the present disclosure are typically realized using software. As one example, programs composed of software that realizes such series of processes are stored in advance on a storage medium (non-transitory medium) provided internally in or externally to such apparatuses. As one example, during execution, such programs are then written into RAM (Random Access Memory) and executed by a processor such as a CPU.
Although preferred embodiments of the present disclosure are described in detail above with reference to the appended drawings, the technical scope of the disclosure is not limited thereto. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.
Additionally, the present technology may also be configured as below.
(1)
An information processing system comprising:
processing circuitry configured to
control a movement of a UI object on a display screen from a pre-recognition position toward a post-recognition position in response to recognition of an operation object initiated by a user, wherein the post-recognition position is spatially related to a displayed position of a predetermined displayed feature, and wherein the predetermined displayed feature is an image derived from a camera-captured image.
(2)
The information processing system of (1), wherein
the processing circuitry is configured to vary a mode of approach of the UI object in accordance with a parameter related to the UI object.
(3)
The information processing system of (2), wherein
the mode of approach is non-uniform for the displayed object and other displayed objects such that respective speeds of approach are different for different displayed objects.
(4)
The information processing system of (2), wherein
the parameter is a priority.
(5)
The information processing system of (4), wherein
the priority is based on an operation frequency or a recommendation.
(6)
The information processing system of (2), wherein
the mode of approach is non-uniform for the displayed object and other displayed objects such that respective post-recognition displayed positions are different for different displayed objects.
(7)
The information processing system of (2), wherein
a trigger for the movement of the displayed object and a trigger for the movement of another displayed object are different.
(8)
The information processing system of (7), wherein
a first detected gesture triggers a movement of the displayed object and a second detected gesture triggers a movement of the other displayed object.
(9)
The information processing system of (7), wherein
the displayed object and the other displayed object are displayed in a ring around the operation object.
(10)
The information processing system of (1), wherein the processing circuitry is configured to control a movement of a plurality of UI objects.
(11)
The information processing system of (1), wherein
the post-recognition position is different for the UI object than for a different UI object.
(12)
The information processing system of (11), wherein
the post-recognition position is closer to the operation object when the UI object is identified as a higher priority than the different UI object, and further from the operation object when the UI object is identified as a lower priority than the different UI object.
(13)
The information processing system of (1), wherein
the predetermined displayed feature is a body part of the user.
(14)
The information processing system of (1), wherein
the predetermined displayed feature is the operation object.
(15)
The information processing system of (1), wherein
the predetermined displayed feature is a feature of a user image.
(16)
The information processing system of (1), wherein
the predetermined displayed feature is a feature of an action of a user.
(17)
The information processing system of (16), wherein
the processing circuitry is also configured to implement an image recognition unit that recognizes a feature of the user as the operation object.
(18)
The information processing system of (1), wherein
the post-recognition position of the displayed object is closer to the operation object than the pre-recognition position, such that the displayed object moves toward the operation object.
(19)
An information processing method comprising: controlling with processing circuitry a movement of a UI object on a display screen from a pre-recognition position toward a post-recognition position in response to recognition of an operation object initiated by a user, wherein the post-recognition position is spatially related to a displayed position of a predetermined displayed feature, and the predetermined displayed feature is an image derived from a camera-captured image.
(20)
A non-transitory computer readable medium having computer readable instructions that when executed by a processing circuitry perform a method, the method comprising:
controlling with the processing circuitry a movement of a UI object on a display screen from a pre-recognition position toward a post-recognition position in response to recognition of an operation object initiated by a user, wherein the post-recognition position is spatially related to a displayed position of a predetermined displayed feature, and the predetermined displayed feature is an image derived from a camera-captured image.
Additionally, the present technology may also be configured as below.
(1)
An information processing apparatus including:
an image acquisition unit acquiring an input image;
a recognition unit recognizing, in the input image, an operation object used by a user; and
a control unit displaying, on a screen, an output image which corresponds to the input image and on which a plurality of objects to be operated by the user are superimposed and controlling displaying of at least one of the objects based on a recognition result of the operation object,
wherein the control unit causes display positions of the plurality of objects being displayed on the screen before recognition of the operation object to respectively approach toward the user after the recognition of the operation object.
(2)
The information processing apparatus according to (1),
wherein the control unit varies a mode of approach of the respective objects toward the user in accordance with priorities set for the respective objects.
(3)
The information processing apparatus according to (2),
wherein the control unit sets at least one out of an approach speed, an approach start timing, a display position after approach, a display size, a transparency, and a depth of the plurality of objects so as to make operation of an object that has a higher priority easier for the user.
(4)
The information processing apparatus according to (3),
wherein the control unit sets the approach speed of the object that has the higher priority toward the user to a higher speed.
(5)
The information processing apparatus according to any one of (2) to (4), further including:
a priority setting unit setting the priority for each of the plurality of objects in accordance with a setting standard relating to at least one of an operation history for each of the objects and an attribute of the user.
(6)
The information processing apparatus according to any one of (1) to (5), further including:
an application unit carrying out processes associated with the respective objects,
wherein the control unit is operable in response to a first event, to cause the application unit to carry out a process associated with an object designated by the operation object.
(7)
The information processing apparatus according to any one of (1) to (6),
wherein the control unit is operable in response to a second event, to stop movement of the plurality of objects on the screen.
(8)
The information processing apparatus according to (7),
wherein the control unit is operable in response to a third event after the movement of the plurality of objects is stopped, to rotate the display positions of the plurality of objects around a reference point in the image.
(9)
The information processing apparatus according to any one of (1) to (8),
wherein the control unit is operable in response to a fourth event, to move the display position of at least one object near the operation object away from the user.
(10)
The information processing apparatus according to any one of (1) to (9),
wherein the plurality of objects belong to a first category out of a plurality of categories defined in advance, and
wherein the control unit is operable in response to a fifth event, to replace the plurality of objects superimposed on the input image with objects belonging to a second category that is different from the first category.
(11)
The information processing apparatus according to any one of (6) to (10),
wherein the operation object is a hand of the user, and
wherein the first event is recognition by the recognition unit of a specific gesture of the user.
(12)
The information processing apparatus according to any one of (6) to (10),
wherein the first event is recognition of a specific voice command issued by the user.
(13)
The information processing apparatus according to (6),
wherein the control unit moves the display position of the object designated by the operation object together with the operation object, and
wherein the first event is movement of the object to a specific screen region.
(14)
The information processing apparatus according to (13),
wherein the control unit sets a number of screen regions on the screen equal to a number of processes associated with the object designated by the operation object.
(15)
The information processing apparatus according to (10),
wherein the first category and the second category are associated with different screen regions, and
wherein the fifth event is movement of the operation object from a first screen region associated with the first category to a second screen region associated with the second category.
(16)
The information processing apparatus according to any one of (1) to (15),
wherein the operation object is a right hand and a left hand of the user, and
wherein the control unit is operable, after recognition of one of the right hand and the left hand, to cause a first group out of the plurality of objects to approach toward the recognized one of the right hand and the left hand and is operable, after recognition of another of the right hand and the left hand, to cause a second group out of the plurality of objects to approach toward the recognized other of the right hand and the left hand.
(17)
The information processing apparatus according to any one of (1) to (15),
wherein the operation object is a hand of a first user and a hand of a second user, and
wherein the control unit is operable, after recognition of the hand of the first user, to cause a first group out of the plurality of objects to approach toward the first user and is operable, after recognition of the hand of the second user, to cause a second group out of the plurality of objects to approach toward the second user.
(18)
The information processing apparatus according to (6),
wherein the control unit is operable in response to a sixth event designating at least one object, to cause the application unit to carry out a process associated with the designated object, and
wherein the sixth event is recognition of a specific gesture of another user for the designated object.
(19)
An information processing method carried out by an information processing apparatus, the information processing method including:
acquiring an input image;
recognizing, in the input image, an operation object used by a user;
displaying, on a screen, an output image which corresponds to the input image and on which a plurality of objects to be operated by the user are superimposed; and
causing display positions of the plurality of objects being displayed on the screen before recognition of the operation object to respectively approach toward the user after the recognition of the operation object.
(20)
A program for causing a computer controlling an information processing apparatus to function as:
an image acquisition unit acquiring an input image;
a recognition unit recognizing, in the input image, an operation object used by a user; and
a control unit displaying, on a screen, an output image which corresponds to the input image and on which a plurality of objects to be operated by the user are superimposed and controlling displaying of at least one of the objects based on a recognition result of the operation object,
wherein the control unit causes display positions of the plurality of objects being displayed on the screen before recognition of the operation object to respectively approach toward the user after the recognition of the operation object.
Priority application: 2012-285025, filed Dec. 2012, JP (national).
Filing document: PCT/JP2013/006979, filed Nov. 27, 2013, WO.