In the past, users frequently had access to computers with keyboards and input devices commonly referred to as “mice.” Typically, standard keyboards are best suited for larger devices, and mice are best suited for desktop computers. More recently, computing devices such as small, mobile devices have made use of touch sensitive interfaces. However, such interfaces may be impractical for some electronic devices that exist today or are contemplated.
Recently, voice interfaces have been contemplated. Voice interfaces have the benefit of not requiring the user to have the device in their hands. However, voice interfaces have limitations such as their accuracy in human voice recognition.
Without access to conventional input devices such as keyboards, mice, and touch sensitive interfaces, it can be difficult to interface with electronic devices. One example is that a user could find it very difficult to navigate through large amounts of content stored on or accessible to an electronic device. For example, a user could find it difficult to navigate through a large number of sorted data records.
Technology is disclosed herein to help a user navigate through large amounts of content while wearing a see-through, near-eye, mixed reality display device such as a head mounted display (HMD). The HMD allows the user to view virtual objects overlaid in the user's field of view. The user can use a physical object such as a book to navigate through content being presented in the HMD. As one example, the book has markers that can be identified by the HMD so that the content can be presented in the HMD as the user flips through pages in the book.
One embodiment includes a method of navigating through content. Input is received that specifies what content is to be navigated by a user wearing a see-through, near-eye, mixed reality display. Markers are identified in a physical object using a camera as the user manipulates the physical object. The portions of the content that are associated with the identified markers are determined. Images representing the portions of the content are presented in the see-through, near-eye, mixed reality display device.
One embodiment includes a see-through, near-eye, display device system for navigating digital content. The system includes a see-through, near-eye display device; an image sensor; and logic in communication with the display device and the image sensor. The logic is configured to receive input that specifies what digital content is to be navigated by a user that is manipulating a physical object that has markers. The logic accesses the digital content to be navigated. The logic identifies the markers in the physical object using the image sensor as the user manipulates the physical object. The logic identifies portions of the digital content that are associated with the identified markers. The logic presents images representing the identified portions of the digital content in the see-through, near-eye, display device.
One embodiment includes a computer storage device having instructions stored thereon which, when executed on a processor, cause the processor to help a user navigate digital content using a book that has markers. The instructions cause the processor to receive input from the user that specifies what digital content is to be navigated, and to access the digital content. The instructions cause the processor to identify markers in the book using image data as the user turns pages in the book. The book has an ordered sequence of pages with the markers on the pages. The instructions cause the processor to identify portions of the digital content that are associated with the identified markers. The instructions cause the processor to present virtual images representing the identified portions of the digital content in a see-through, near-eye, display device being worn by the user. The instructions cause the processor to present a navigation aid to the user in the see-through, near-eye display as the user turns the pages in the book.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Technology is disclosed herein to help a user navigate through large amounts of content while wearing a see-through, near-eye, mixed reality display device such as a head mounted display (HMD). The user can use a physical object such as a book to navigate through content being presented in the HMD. In one embodiment, a book has markers on the pages that allow the system to organize the content. The markers could be ordinary text or images. The book could have real content, but it could be blank other than the markers. As the user flips through the book, the system recognizes the markers and presents content associated with the respective marker in the HMD.
The user can rapidly scan through the content by flipping through the pages of the book. One non-limiting example of the content is a large set of records. The user could search for content by flipping back and forth in the book. The user could sequentially advance through the content, perform random access in large jumps, perform a binary search for desired data based on page numbers, etc. In one embodiment, a table of contents is presented in the HMD to help the user find the content faster. In one embodiment, chapter headings are presented in the HMD as the user flips through the book to help the user find content faster.
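The binary-search style of navigation mentioned above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a sorted list with one record per page; the function name binary_search_pages and the sample records are hypothetical and do not come from the disclosure.

```python
def binary_search_pages(records, target):
    """Return (page_index, pages_opened) for a sorted list with one record per page.

    page_index is None when the target is not in the records.
    pages_opened lists the (0-based) pages the user would open along the way.
    """
    lo, hi = 0, len(records) - 1
    pages_opened = []
    while lo <= hi:
        mid = (lo + hi) // 2          # the page the user opens the book to
        pages_opened.append(mid)
        if records[mid] == target:
            return mid, pages_opened
        if records[mid] < target:
            lo = mid + 1              # desired record is later: flip forward
        else:
            hi = mid - 1              # desired record is earlier: flip backward
    return None, pages_opened


# Hypothetical sorted data set, one record per page.
records = ["Adams", "Baker", "Chen", "Diaz", "Evans", "Ford", "Gupta"]
page, opened = binary_search_pages(records, "Ford")
```

In this sketch the user reaches the desired record after opening two pages rather than flipping through six, which is the benefit the large-jump and binary-search navigation styles are meant to provide.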
A remote, network accessible computer system 12 may be leveraged for processing power and remote data access. An application may be executing on computing system 12 which interacts with or performs processing for display system 8, or may be executing on one or more processors in the see-through, mixed reality display system 8. An example of hardware components of a computing system 12 is shown in
The system 8, possibly with the aid of computer system 12, is able to help a user navigate content using a physical object 11. In one embodiment, the physical object 11 includes an ordered sequence of pages, such as a book. In one embodiment, the physical object 11 is a book. The book could be bound such that the order of the pages is fixed. For example, books commonly have glue or some other adhesive to bind the pages. Another technique could be used to fix the order of the pages. In one embodiment, the physical object 11 is a binder having pages. The binder helps to keep the pages ordered, but the order of the pages could be altered at some point. Another technique for binding the pages is to use a staple, paper clip, or other fastener. In one embodiment, the physical object 11 includes a number of cards or simply loose papers. Herein numerous examples will be provided in which the physical object 11 is a book having pages. However, it will be understood that the physical object 11 does not need to be a bound book.
As one example of navigating content, as the user turns pages in a book (an example of a physical object), their contacts in a contact list are presented to them in the HMD 2. Thus, the HMD 2 may be used to present some representation of the content. The content could be presented in the HMD 2 such that it appears to the user as if it is displayed in the book, but that is not required. Also, presenting the content could include playing audio, video, or rendering 2D/3D imagery.
In one embodiment, the physical object 11 has markers that can be identified using the HMD 2. A marker may be any text, symbol, image, etc. that is able to be uniquely identified. The marker could be visible to the human eye, as in text or an image. However, the marker might not be visible to the human eye. For example, the markers could be infrared (IR) retro-reflective markers. A retro-reflective marker is a passive element that reflects IR light when illuminated with IR light.
In one embodiment, the markers are associated with portions of the content to allow the user to navigate the content. For example, as the user turns pages in a book, the markers are identified on the pages and the associated contacts in a contact list are presented in the HMD 2. Many other types of data could be displayed.
The physical object 11 might not contain any visible elements. As noted, the markers may be IR retro-reflective markers. On the other hand, the physical object 11 might have visible text in it that serves as the markers. This text need not be related to the content to be navigated at all. For example, the text of any book could serve as the markers. Note that the markers may be used for whatever content the user wants to navigate. For example, any ordered data set could be navigated, in accordance with one embodiment.
Thus, note that the physical object 11 may be used to navigate different data sets, such as ordered data sets. For example, the same physical object 11 may be used to navigate a user's contact list, their list of audio albums, media files, emails, list of purchase orders, 3D models of items in a catalog, etc.
In
The use of the term “actual direct view” refers to the ability to see real world objects directly with the human eye, rather than seeing created image representations of the objects. For example, looking through glass at a room allows a user to have an actual direct view of the room, while viewing a video of a room on a television is not an actual direct view of the room. Each display optical system 14 is also referred to as a see-through display, and the two display optical systems 14 together may also be referred to as a see-through display.
Frame 115 provides a support structure for holding elements of the system in place as well as a conduit for electrical connections. In this embodiment, frame 115 provides a convenient eyeglass frame as support for the elements of the system discussed further below. The frame 115 includes a nose bridge portion 104 with a microphone 110 for recording sounds and transmitting audio data in this embodiment. A temple or side arm 102 of the frame rests on each of a user's ears. In this example, the right temple 102 includes control circuitry 136 for the display device 2. The HMD 2 may also include an audio transducer for presenting audio signals.
As illustrated in
The processing unit 4 may take various embodiments. In some embodiments, processing unit 4 is a separate unit which may be worn on the user's body, e.g. a wrist, or be a separate device like the illustrated mobile device 4 as illustrated in
In many embodiments, the two cameras 113 provide overlapping image data from which depth information for objects in the scene may be determined based on stereopsis. In some examples, the cameras may also be depth sensitive cameras which transmit and detect infrared light from which depth data may be determined. The processing identifies and maps the user's real world field of view. Some examples of depth sensing technologies that may be included on the head mounted display device 2 without limitation are SONAR, LIDAR, Structured Light, and/or Time of Flight.
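As a minimal sketch of how depth can follow from stereopsis, the Python example below uses the standard pinhole relation for a rectified stereo pair, depth = focal length × baseline / disparity. The rectified-pair assumption, the function name depth_from_disparity, and the sample numbers are illustrative assumptions and are not taken from the disclosure.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in meters of a point seen by two rectified, parallel cameras.

    focal_length_px: focal length expressed in pixels (assumed equal for both cameras)
    baseline_m:      distance between the two camera centers, in meters
    disparity_px:    horizontal shift of the point between the two images, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values: 700 px focal length, 6 cm baseline, 35 px disparity.
depth_m = depth_from_disparity(700.0, 0.06, 35.0)
```

Nearby objects, such as a book held in the user's hands, produce large disparities and are therefore resolved with finer depth precision than distant objects.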
Control circuits 136 provide various electronics that support the other components of head mounted display device 2. In this example, the right temple 102r includes control circuitry 136 for the display device 2 which includes a processing unit 210, a memory 244 accessible to the processing unit 210 for storing processor readable instructions and data, a wireless interface 137 communicatively coupled to the processing unit 210, and a power supply 239 providing power for the components of the control circuitry 136 and the other components of the display 2 like the cameras 113, the microphone 110 and the sensor units discussed below. The processing unit 210 may comprise one or more processors including a central processing unit (CPU) and a graphics processing unit (GPU).
Inside, or mounted to temple 102, are earphones 130, inertial sensors 132, one or more location or proximity sensors 144, some examples of which are a GPS transceiver, an infrared (IR) transceiver, or a radio frequency transceiver for processing RFID data. Optional electrical impulse sensor 128 detects commands via eye movements. In one embodiment, inertial sensors 132 include a three axis magnetometer 132A, three axis gyro 132B and three axis accelerometer 132C. The inertial sensors are for sensing position, orientation, and sudden accelerations of head mounted display device 2. From these movements, head position may also be determined. In this embodiment, each of the devices using an analog signal in its operation, like the sensor devices 144, 128, 130, and 132 as well as the microphone 110 and an IR illuminator 134A discussed below, includes control circuitry which interfaces with the digital processing unit 210 and memory 244 and which produces and converts analog signals for its respective device.
Mounted to or inside temple 102 is an image source or image generation unit 120 which produces visible light representing images. In one embodiment, the image source includes micro display 120 for projecting images of one or more virtual objects and coupling optics lens system 122 for directing images from micro display 120 to reflecting surface or element 124. The microdisplay 120 may be implemented in various technologies including transmissive projection technology, micro organic light emitting diode (OLED) technology, or a reflective technology like digital light processing (DLP), liquid crystal on silicon (LCOS) and Mirasol® display technology from Qualcomm, Inc. The reflecting surface 124 directs the light from the micro display 120 into a lightguide optical element 112, which directs the light representing the image into the user's eye. Image data of a virtual object may be registered to a real object meaning the virtual object tracks its position to a position of the real object seen through the see-through display device 2 when the real object is in the field of view of the see-through displays 14.
In some embodiments, the physical object 11 has markers. For example, a photograph in a magazine may be printed with IR retro-reflective markers. An IR unit 144 may detect the marker and send the data it contains to the control circuitry 136.
In the illustrated embodiment, the display optical system 14 is an integrated eye tracking and display system. The system includes a light guide optical element 112, opacity filter 114, and optional see-through lens 116 and see-through lens 118. The opacity filter 114 for enhancing contrast of virtual imagery is behind and aligned with optional see-through lens 116, lightguide optical element 112 for projecting image data from the microdisplay 120 is behind and aligned with opacity filter 114, and optional see-through lens 118 is behind and aligned with lightguide optical element 112. More details of the light guide optical element 112 and opacity filter 114 are provided below.
Light guide optical element 112 transmits light from micro display 120 to the eye 140 of the user wearing head mounted display device 2. Light guide optical element 112 also allows light from in front of the head mounted display device 2 to be transmitted through light guide optical element 112 to eye 140, as depicted by arrow 142 representing an optical axis of the display optical system 14r, thereby allowing the user to have an actual direct view of the space in front of head mounted display device 2 in addition to receiving a virtual image from micro display 120. Thus, the walls of light guide optical element 112 are see-through. Light guide optical element 112 includes a first reflecting surface 124 (e.g., a mirror or other surface). Light from micro display 120 passes through lens 122 and becomes incident on reflecting surface 124. The reflecting surface 124 reflects the incident light from the micro display 120 such that light is trapped inside a waveguide, a planar waveguide in this embodiment. A representative reflecting element 126 represents the one or more optical elements like mirrors, gratings, and other optical elements which direct visible light representing an image from the planar waveguide towards the user eye 140.
Infrared illumination and reflections also traverse the planar waveguide 112 for an eye tracking system 134 for tracking the position of the user's eyes. The position of the user's eyes and image data of the eye in general may be used for applications such as gaze detection, blink command detection and gathering biometric information indicating a personal state of being for the user. The eye tracking system 134 comprises an eye tracking illumination source 134A and an eye tracking IR sensor 134B positioned between lens 118 and temple 102 in this example. In one embodiment, the eye tracking illumination source 134A may include one or more infrared (IR) emitters such as an infrared light emitting diode (LED) or a laser (e.g. VCSEL) emitting about a predetermined IR wavelength or a range of wavelengths. In some embodiments, the eye tracking sensor 134B may be an IR camera or an IR position sensitive detector (PSD) for tracking glint positions.
The use of a planar waveguide as a light guide optical element 112 in this embodiment allows flexibility in the placement of entry and exit optical couplings to and from the waveguide's optical path for the image generation unit 120, the illumination source 134A and the IR sensor 134B. In this embodiment, a wavelength selective filter 123 passes through visible spectrum light from the reflecting surface 124 and directs the infrared wavelength illumination from the eye tracking illumination source 134A into the planar waveguide 112. Wavelength selective filter 125 passes through the visible illumination from the micro display 120 and the IR illumination from source 134A in the optical path heading in the direction of the nose bridge 104. Reflective element 126 in this example is also representative of one or more optical elements which implement bidirectional infrared filtering, which directs IR illumination towards the eye 140, preferably centered about the optical axis 142, and receives IR reflections from the user eye 140. Besides gratings and such mentioned above, one or more hot mirrors may be used to implement the infrared filtering. In this example, the IR sensor 134B is also optically coupled to the wavelength selective filter 125, which directs only infrared radiation from the waveguide, including infrared reflections of the user eye 140, preferably including reflections captured about the optical axis 142, out of the waveguide 112 to the IR sensor 134B.
In other embodiments, the eye tracking unit optics are not integrated with the display optics. For more examples of eye tracking systems for HMD devices, see U.S. Pat. No. 7,401,920, entitled “Head Mounted Eye Tracking and Display System,” issued Jul. 22, 2008 to Kranz et al., which is incorporated herein by reference.
Another embodiment for tracking the direction of the eyes is based on charge tracking. This concept is based on the observation that a retina carries a measurable positive charge and the cornea has a negative charge. Sensors 128, in some embodiments, are mounted by the user's ears (near earphones 130) to detect the electrical potential while the eyes move around and effectively read out what the eyes are doing in real time. Eye blinks may be tracked as commands. Other embodiments for tracking eye movements such as blinks, which are based on pattern and motion recognition in image data from the small eye tracking camera 134B mounted on the inside of the glasses, can also be used. The eye tracking camera 134B sends buffers of image data to the memory 244 under control of the control circuitry 136.
Opacity filter 114, which is aligned with light guide optical element 112, selectively blocks natural light from passing through light guide optical element 112 for enhancing contrast of virtual imagery. When the system renders a scene for the mixed reality display, it takes note of which real-world objects are in front of which virtual objects and vice versa. If a virtual object is in front of a real-world object, then the opacity is turned on for the coverage area of the virtual object. If the virtual object is (virtually) behind a real-world object, then the opacity is turned off, as well as any color for that display area, so the user will only see the real-world object for that corresponding area of real light. The opacity filter helps the image of a virtual object appear more realistic and represent a full range of colors and intensities. In this embodiment, electrical control circuitry for the opacity filter, not shown, receives instructions from the control circuitry 136 via electrical connections routed through the frame.
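The front/behind rule described for the opacity filter can be sketched per display area as a simple depth comparison. The Python example below is a minimal illustration under assumed inputs: two hypothetical depth grids of equal size, with None marking areas that no virtual object covers. The function name opacity_mask and the grid representation are assumptions, not part of the disclosure.

```python
def opacity_mask(virtual_depth, real_depth):
    """Return a grid of booleans: True where opacity should be turned on.

    virtual_depth: 2D list of distances to the virtual object, or None where
                   no virtual object covers that display area
    real_depth:    2D list of distances to the nearest real-world object
    Opacity is on only where a virtual object is closer than the real object.
    """
    mask = []
    for v_row, r_row in zip(virtual_depth, real_depth):
        mask.append([v is not None and v < r for v, r in zip(v_row, r_row)])
    return mask


# Hypothetical 2x2 display area, distances in meters.
virtual = [[None, 1.0],
           [2.0, 3.0]]
real = [[2.0, 2.0],
        [2.5, 1.0]]
mask = opacity_mask(virtual, real)
```

In the sample grids, the lower-right area stays transparent because the virtual object there is (virtually) behind the real-world object.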
Again,
Content navigation 197 includes marker identification 202, content presentation 204, and content to marker linkage 166. Content navigation 197 is able to help the user navigate through various content using a physical object 11, such as a book. In one embodiment, content navigation 197 is able to present an interface to the user for selecting what content to navigate.
Marker identification 202 is able to identify markers in the physical object 11. Marker identification 202 may use any type of data to identify the markers. In some embodiments, the marker identification 202 identifies reflected light. In one embodiment, marker identification 202 uses light intensity values in image data. The image data could be RGB data, which can allow identification of text, symbols, images, etc. In one embodiment, marker identification 202 uses IR data to detect, for example, retro-reflective markers. In one embodiment, the marker includes a light source, such as an LED. Thus, marker identification 202 is able to detect a light pattern from an LED or other light source, in one embodiment. Marker identification 202 may communicate with image processing and audio engine 191 to detect the markers.
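One simple way to use light intensity values, for example to locate a bright retro-reflective marker in IR image data, is to threshold the image and take the centroid of the bright pixels. The Python sketch below illustrates that idea only; the threshold value, the function names, and the single-marker assumption are hypothetical, and a practical detector would likely be considerably more involved.

```python
def find_bright_pixels(ir_image, threshold=200):
    """Return (row, col) coordinates of pixels at or above the intensity threshold."""
    return [
        (r, c)
        for r, row in enumerate(ir_image)
        for c, value in enumerate(row)
        if value >= threshold
    ]


def centroid(pixels):
    """Return the mean (row, col) of the given pixels, or None if there are none."""
    if not pixels:
        return None
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


# Hypothetical 3x3 IR intensity image (0-255) with one bright marker region.
ir_image = [
    [10, 12, 11],
    [9, 250, 240],
    [8, 245, 255],
]
bright = find_bright_pixels(ir_image)
marker_center = centroid(bright)
```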
Content presentation 204 is able to present content being navigated by the user in response to markers being detected. The content can be presented in the HMD 2. As one example, a hologram is presented in the HMD 2. The content could also be audio.
Content to marker linkage 166 is able to determine how to link content to markers in the physical object 11. For example, the physical object 11 may be a book containing a fixed number of pages. As one possibility, there may be a marker on each page. The content to marker linkage 166 is able to analyze the content and determine how to link it to each marker. As one example, the content to marker linkage 166 determines how to link a list of contacts to each marker. As another example, the content to marker linkage 166 determines how to link a video file to each marker.
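As a minimal sketch of one possible linkage policy, the Python example below spreads an ordered list of items evenly over a fixed set of markers. The even-split policy, the function name link_content_to_markers, and the marker identifiers are assumptions made for illustration; the disclosure does not prescribe a particular policy.

```python
def link_content_to_markers(items, marker_ids):
    """Map each marker id to a contiguous slice of the ordered items.

    Items are divided as evenly as possible; trailing markers may receive
    fewer items (or none) when the items do not divide evenly.
    """
    if not marker_ids:
        raise ValueError("at least one marker is required")
    # Ceiling division, with a floor of 1 so empty content still yields valid slices.
    per_marker = max(-(-len(items) // len(marker_ids)), 1)
    linkage = {}
    for i, marker in enumerate(marker_ids):
        linkage[marker] = items[i * per_marker:(i + 1) * per_marker]
    return linkage


# Hypothetical example: seven contacts linked to a three-page object.
contacts = ["c0", "c1", "c2", "c3", "c4", "c5", "c6"]
linkage = link_content_to_markers(contacts, ["m1", "m2", "m3"])
```

A dictionary keyed by marker identifier keeps the lookup independent of the order in which markers are later observed.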
Examples of the content include, but are not limited to, media files 198, content with versions 207, content with elements 211, and other content 209. This content could be stored anywhere. The processing unit 4 of the HMD 2 has some amount of storage that could be used. However, the content may well be external to the system 8. As one example, the content is on another electronic device, such as a cellular telephone, laptop computer, notepad computer, etc. Also, as previously noted, processing unit 4 of the system 8 could itself be a device such as a cellular telephone, laptop computer, notepad computer, etc. The content could be on (or accessible to) a server that is accessible over, for example, the Internet. A media file 198 could include a digital or analog file that may contain audio and/or visual data. Visual data could include video or images. As one example, the user can “scrub” through a video file by paging through a book (an example of a physical object). Note that in this example, the user could advance through the content by a certain amount of time for each page turn, as one example. In this case, the frames (or batches of frames) of video or samples of audio data may be considered to be a set of ordered records. As noted herein, embodiments allow the user to search through large sets of ordered records.
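The fixed-time-per-page scrubbing described above can be sketched as a mapping from page number to playback position. In the Python example below, the function name page_to_time, the 1-based page numbering, and the 30-second step are hypothetical choices made for illustration.

```python
def page_to_time(page_number, seconds_per_page, duration_s):
    """Return the playback position in seconds for a 1-based page number.

    Page 1 maps to the start of the media; positions are clamped to the
    range [0, duration_s] so pages past the end map to the end of the media.
    """
    position = (page_number - 1) * seconds_per_page
    return min(max(position, 0.0), duration_s)


# Hypothetical 10-minute video, advancing 30 seconds per page turn.
start = page_to_time(1, 30.0, 600.0)
fourth_page = page_to_time(4, 30.0, 600.0)
past_end = page_to_time(100, 30.0, 600.0)
```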
Many types of content can be broken down into various elements, as represented by content with elements 211. For example, each contact in a user's contact list can be considered to be an element. In this example, each turn of the page could advance the content by one contact. However, a different level of granularity could be used. Each page turn could show “n” contacts, where “n” is any positive integer. As another example, each page could contain all contacts with a letter of the alphabet. If there are too many contacts for a particular letter, then the contacts for that letter could be spread over multiple pages.
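The letter-per-page grouping with overflow onto additional pages can be sketched as follows. This Python example assumes non-empty contact names and a hypothetical per-page capacity; the function name paginate_by_letter is illustrative only.

```python
from itertools import groupby


def paginate_by_letter(contacts, max_per_page):
    """Return a list of (letter, names) pages.

    Contacts are sorted case-insensitively and grouped by first letter.
    A letter with more than max_per_page contacts spills onto extra pages.
    Assumes every contact name is a non-empty string.
    """
    ordered = sorted(contacts, key=str.casefold)
    pages = []
    for letter, group in groupby(ordered, key=lambda name: name[0].upper()):
        names = list(group)
        for i in range(0, len(names), max_per_page):
            pages.append((letter, names[i:i + max_per_page]))
    return pages


# Hypothetical contact list, at most two contacts per page.
pages = paginate_by_letter(["Bob", "alice", "Amy", "Ann", "Bill"], 2)
```

Here the three contacts under the letter A occupy two pages, while the two contacts under B fit on one.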
Some content has different versions, as represented by content with versions 207. For example, a document under revision may have any number of revisions. In one embodiment, each page of each version of the document is linked to one marker. The first marker could be page 1 of revision 1; the second marker could be page 1 of revision 2, etc. Thus, the user is able to advance through the document by revision by, for example, paging through a book.
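The marker ordering in this example (page 1 of revision 1, then page 1 of revision 2, and so on) can be expressed as a small index calculation. The Python sketch below assumes 0-based marker indices and the same number of revisions for every page; both are assumptions made for illustration, as is the function name.

```python
def marker_to_page_revision(marker_index, num_revisions):
    """Map a 0-based marker index to a 1-based (page, revision) pair.

    Markers are ordered so that all revisions of page 1 come first,
    followed by all revisions of page 2, and so on.
    """
    if num_revisions < 1:
        raise ValueError("num_revisions must be at least 1")
    page, revision = divmod(marker_index, num_revisions)
    return page + 1, revision + 1


# Hypothetical document with three revisions of each page.
first = marker_to_page_revision(0, 3)
second = marker_to_page_revision(1, 3)
fourth = marker_to_page_revision(3, 3)
```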
Many other types of content may be navigated, which is represented by other content 209.
Image and audio processing engine 191 includes object recognition engine 192, gesture recognition engine 193, sound recognition engine 194, virtual data engine 195, and, optionally, eye tracking software 196 if eye tracking is in use, all in communication with each other. Image and audio processing engine 191 processes video, image, and audio data received from a capture device such as the outward facing cameras 113. To assist in the detection and/or tracking of objects, an object recognition engine 192 of the image and audio processing engine 191 may access one or more databases of structure data 200 over one or more communication networks 50.
Virtual data engine 195 processes virtual objects and registers the position and orientation of virtual objects in relation to one or more coordinate systems. Additionally, the virtual data engine 195 performs the translation, rotation, scaling and perspective operations using standard image processing methods to make the virtual object appear realistic. A virtual object position may be registered or dependent on a position of a corresponding real object. The virtual data engine 195 determines the position of image data of a virtual object in display coordinates for each display optical system 14. The virtual data engine 195 may also determine the position of virtual objects in various maps of a real-world environment stored in a memory unit of the display device system 8 or of the computing system 12. One map may be the field of view of the display device with respect to one or more reference points for approximating the locations of the user's eyes. For example, the optical axes of the see-through display optical systems 14 may be used as such reference points. In other examples, the real-world environment map may be independent of the display device, e.g. a 3D map or model of a location (e.g. store, coffee shop, museum).
One or more processors of the computing system 12, or the display device system 8 or both also execute the object recognition engine 192 to identify real objects in image data captured by the environment facing cameras 113. For example, the object recognition engine 192 may implement pattern recognition based on structure data 200 to detect particular objects including a human. The object recognition engine 192 may also include facial recognition software which is used to detect the face of a particular person.
Structure data 200 may include structural information about targets and/or objects to be tracked. For example, a skeletal model of a human may be stored to help recognize body parts. In another example, structure data 200 may include structural information regarding one or more inanimate objects, such as a book, in order to help recognize the one or more inanimate objects. The structure data 200 may store structural information as image data or use image data as references for pattern recognition. The image data may also be used for facial recognition.
As printed material typically includes text, the structure data 200 may include one or more image datastores including images of numbers, symbols (e.g. mathematical symbols), letters and characters from alphabets used by different languages. Additionally, structure data 200 may include handwriting samples of the user for identification. Based on the image data, the marker identification 202 can identify various markers in a physical object.
The sound recognition engine 194 processes audio received via microphone 110.
The outward facing cameras 113 in conjunction with the gesture recognition engine 193 implement a natural user interface (NUI) in embodiments of the display device system 8. Blink commands or gaze duration data identified by the eye tracking software 196 are also examples of physical action user input. Voice commands may also supplement other recognized physical actions such as gestures and eye gaze.
The gesture recognition engine 193 can identify actions performed by a user indicating a control or command to an executing application. The action may be performed by a body part of a user, e.g., a hand or finger in some applications, but an eye blink sequence can also be a gesture. In one embodiment, the gesture recognition engine 193 includes a collection of gesture filters, each comprising information concerning a gesture that may be performed by at least a part of a skeletal model. The gesture recognition engine 193 compares a skeletal model and movements associated with it derived from the captured image data to the gesture filters in a gesture library to identify when a user (as represented by the skeletal model) has performed one or more gestures. In some examples, a camera, in particular a depth camera in the real environment separate from the display device 2 in communication with the display device system 8 or a computing system 12, may detect the gesture and forward a notification to the system 8, 12. In other examples, the gesture may be performed in view of the cameras 113 by a body part such as the user's hand or one or more fingers.
In some examples, matching of image data to image models of a user's hand or finger during gesture training sessions may be used rather than skeletal tracking for recognizing gestures.
More information about the detection and tracking of objects can be found in U.S. patent application Ser. No. 12/641,788, “Motion Detection Using Depth Images,” filed on Dec. 18, 2009; and U.S. patent application Ser. No. 12/475,308, “Device for Identifying and Tracking Multiple Humans over Time,” both of which are incorporated herein by reference in their entirety. More information about recognizer engine 454 can be found in U.S. Patent Publication 2010/0199230, “Gesture Recognizer System Architecture,” filed on Apr. 13, 2009, incorporated herein by reference in its entirety. More information about recognizing gestures can be found in U.S. Patent Publication 2010/0194762, “Standard Gestures,” published Aug. 5, 2010, and U.S. Patent Publication 2010/0306713, “Gesture Tool” filed on May 29, 2009, both of which are incorporated herein by reference in their entirety.
The computing environment 54 also stores data in image and audio data buffer(s) 199. The buffers provide memory for receiving image data captured from the outward facing cameras 113, image data from an eye tracking camera of an eye tracking assembly 134 if used, buffers for holding image data of virtual objects to be displayed by the image generation units 120, and buffers for audio data such as voice commands from the user via microphone 110 and instructions to be sent to the user via earphones 130.
In step 402, input is received from a user that indicates what content is to be navigated. For example, the user provides input that the physical object 11 is now to be used to navigate their contacts list. This input may be received in any number of ways including, but not limited to, the user selecting the data set from an interface in a navigation application. For example, a content navigation application 197 presents an interface in the HMD 2 that allows the user to select the content. Details of establishing content for navigation are discussed with respect to
In step 404, content that is to be navigated is accessed. The content may be accessed from any location. As noted above, the content may be on an electronic device other than the HMD 2, such as a cellular telephone, laptop computer, notepad computer, etc. The content could be on (or accessible to) a server that is accessible over, for example, the Internet.
In step 406, markers in (or on) the physical object 11 are identified using a camera as the user manipulates the physical object 11. In one embodiment, the physical object 11 is a book. The book could be bound such that the order of the pages is fixed. For example, books commonly have glue or some other adhesive to bind the pages. Another technique could be used to fix the order of the pages. In one embodiment, the physical object is a binder having pages. The binder helps to keep the pages ordered, but the order of the pages could be altered at some point. Another technique for binding the pages is to use a staple, paper clip, or other fastener. In one embodiment, the physical object includes a number of cards or simply loose papers. If the order of the cards/papers is changed, the content bound to them (by the markers) does not change in one embodiment. Herein numerous examples will be provided in which the physical object is a book having pages. However, it will be understood that the physical object does not need to be a book in the conventional sense, nor does the physical object need to have physical pages in the conventional sense.
The following describes a few examples of markers.
Note that in some cases there is more than one marker that is potentially visible to the camera at one time. For example, a book can have two pages visible at one time. One option is to process both markers at the same time. Thus, later in the process when images are presented in the HMD 2, one image might be presented on each page. As one example, each page would show one contact in a user's contact list.
In one embodiment, having two markers visible allows disambiguation of similar tags. For example, the system might be nearly certain that one marker is marker A. If the system detects that a second marker is either marker B or marker F, it may determine that it must be marker B based on its location relative to marker A.
Another possibility is to have only one marker potentially visible to the camera at one time. For example, the left page could have a marker and the right page not have one.
In step 408, portions of the content that are associated with the identified markers are themselves identified. A brief example is for each contact in a contact list to be associated with a page of the physical object 11.
In step 410, content that represents the identified portions of the content is presented in the HMD 2. In one embodiment, images are presented in the HMD 2 such that they appear as virtual images on the physical object 11. For example, an image appears as a hologram that rests on, and possibly extends above, the surface of a page of a book. The hologram might appear as being inside (e.g., below) the page. Note that the virtual images do not have to appear to be connected to the physical object 11. For example, the virtual images could appear to be on a table or wall.
Referring to
Also note that the virtual images 119 could be presented such that they are independent of the physical object 11. For example, the images could be presented wherever the user is looking. For example, the user could be looking at a wall instead of the physical object 11. Note that step 410 may include presenting an audio signal. Step 410 may include tracking the eye gaze of the user to determine where the content should appear to be located in the real world.
In one embodiment, content is presented in a display other than an HMD 2 in step 410. For example, the content could be presented on a display screen of a laptop computer, a notepad computer, a cellular telephone, a display screen connected to a personal desktop computer, etc.
After the user is finished navigating through the content for this data set, the user can choose to navigate some other content. This is reflected in process 400, by returning to step 402 to receive further input from the user so that other content can be navigated using the same physical object 11. Note that when other content is navigated, the way in which the markers are associated with the content could be completely different. For example, when navigating a media file, turning a page in the book may advance the media file by a certain time interval. However, when navigating a contacts list, turning a page in the book may advance by one, two, or a few contacts.
In one embodiment, steps of this process are performed by logic that may include any combination of hardware and/or software. Note that this logic could be spread out over more than one physical device. For example, some of the steps could be performed by logic residing within see-through, near-eye display device 2, and other steps performed by one or more computing devices 4, 12 in communication with the see-through, near-eye display device 2. A computing device may be in communication with the HMD 2 over a network 50, as one possibility.
In step 502, input is received identifying what content is to be set up for navigating. In one embodiment, a content navigation application 197 is able to interface with another program such as an email, calendar, or contact program. Thus, the user could, for example, request that their contact list be set up for navigating. However, note that the user does not need to make a specific request. In one embodiment, when the user opens their email program this triggers the process to form a binding of emails to pages in the book, as one example. As another example, when a calendar program is opened, this triggers the process of binding days of the month to pages in the book. There could be a relative time binding. For example, “today” is bound to one or more pages, “tomorrow” is bound to one or more pages, etc. As still another example, when a file system browser is open, this triggers the binding of directories to pages in the book. As one further example, the content navigation application 197 could allow the user to specify a media file, such as an audio or audio-visual file that is stored either locally or remotely.
Step 502 may also include accessing that content. As an alternative to accessing the content, some metadata about the content could be accessed. For example, it may not be necessary to access an entire media file since the media file does not need to be played at this time. For an audio file it may be sufficient to know titles and lengths of each song. For an audio-video file it may be sufficient to know how the file is segmented into scenes or the like.
In step 504, a determination is made as to how to associate the content with the markers. In one embodiment, each marker of the physical object 11 is assigned a number. This number may be used for whatever content is to be navigated. In step 504, the content can first be divided in some logical manner. A number may then be assigned to each of the divisions. Thus, each division may be assigned to one of the markers.
In step 506, the markers are associated with the content. The following examples will be used to illustrate. The book may have 300 pages, and thus 300 markers. Note that there may be more than one marker per page, as another alternative. Also, it is not required that each page have a marker.
As one example, the user might have 275 contacts on their contact list. In this case, the contacts could be assigned to markers 1-275. If the user has more contacts than there are markers, then more than one contact could be assigned to a given marker. However, note that some of the markers could be reserved for special navigation aids, such as a table of contents.
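The assignment of contacts to markers described above may be sketched as follows. This is a minimal illustration only and not the disclosed implementation. The function name `assign_items_to_markers`, the `reserved` parameter, and the policy of spreading items evenly in order are assumptions made for the example.

```python
import math


def assign_items_to_markers(items, marker_count, reserved=()):
    """Map 1-based marker numbers to lists of content items.

    Markers listed in `reserved` (e.g., one used for a table of contents)
    receive no items. When there are more items than usable markers,
    several consecutive items share one marker (an assumed policy).
    """
    reserved_set = set(reserved)
    usable = [m for m in range(1, marker_count + 1) if m not in reserved_set]
    if not usable:
        raise ValueError("no markers available for content")

    # Number of items each marker must hold so that all items fit.
    per_marker = max(1, math.ceil(len(items) / len(usable)))

    mapping = {}
    for index, item in enumerate(items):
        marker = usable[index // per_marker]
        mapping.setdefault(marker, []).append(item)
    return mapping
```

With 275 contacts and 300 markers, each of markers 1-275 receives one contact. With 650 contacts and marker 1 reserved, the sketch places up to three contacts on each marker.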
As another example, a media file might be 120 minutes long. Dividing 120 minutes into 300 sections equates to 24 second time intervals. In this case, each marker could correspond to a 24 second jump in the media. For example, marker 1 is 0 seconds into the file; marker 2 is 24 seconds into the file, etc.
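The arithmetic in this example may be expressed as a short sketch. The function name `marker_start_seconds` and its parameters are hypothetical and are used only to illustrate a uniform division of the media file among the markers.

```python
def marker_start_seconds(marker, media_length_seconds, marker_count):
    """Return the playback offset, in seconds, for a 1-based marker number.

    Assumes the media file is divided into equal intervals, one per marker.
    """
    if not 1 <= marker <= marker_count:
        raise ValueError("marker out of range")
    interval = media_length_seconds / marker_count
    return (marker - 1) * interval


# A 120 minute file (7200 seconds) divided among 300 markers.
print(marker_start_seconds(1, 7200, 300))  # 0.0
print(marker_start_seconds(2, 7200, 300))  # 24.0
```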
Many other ways of associating markers to the content are possible in step 506.
In step 508, special navigation aids are added. One example of a navigation aid is a table of contents. This might be assigned to the first marker, but could be anywhere.
Another example of a special navigation aid is to have some embellishment as the user turns the pages in a book. For example, conventional printed books may have chapter headings that delineate where each chapter in a novel or other book starts. Using the example of the contact list, the contacts could be presented in alphabetical order. For example, the letter of the alphabet can be made to appear in the book by presenting a suitable image in the HMD. This could be presented elsewhere than in the book. To be able to know where in the book to present each letter, navigational aids are assigned to markers in one embodiment.
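One possible way to decide which markers receive a letter heading is sketched below. The sketch assumes, for illustration only, that the contacts are already sorted alphabetically and that one contact is bound to each consecutive marker. The name `letter_headings` and the `first_marker` parameter are hypothetical.

```python
def letter_headings(sorted_names, first_marker=1):
    """Return {marker: letter} for the first marker at which each letter begins.

    Assumes one name per marker, assigned in order starting at `first_marker`.
    """
    headings = {}
    seen = set()
    for offset, name in enumerate(sorted_names):
        letter = name[:1].upper()
        if letter and letter not in seen:
            seen.add(letter)
            headings[first_marker + offset] = letter
    return headings
```

For example, if marker 1 is reserved and the names "Adams", "Allen", "Baker", "Chen", and "Cruz" are bound to markers 2-6, the headings "A", "B", and "C" are associated with markers 2, 4, and 5, respectively.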
In step 604, an association between the elements in the digital file and markers in the physical object 11 is accessed. This association may have been built in process 500 of
In step 606, a marker is identified in the physical object 11, as the user manipulates the physical object 11. Then, a determination is made whether the marker is a special navigation marker, in step 608. If it is, then a special navigation aid is presented in the HMD 2 in step 610. As one example, a table of contents is presented in the HMD 2 to help the user locate content faster. For a contact list, the contacts could be organized alphabetically in the physical object (e.g., a book). The beginning of the book could contain a table of contents with page numbers associated with letters. Thus, the user is able to quickly find the page. As another example, a letter of the alphabet is presented in the HMD 2 to help the user quickly navigate a contact list or other alphabetized list.
Note that special navigation aids can also be presented in the HMD 2 without reference to a certain marker. As one example, the HMD 2 makes it appear that there are tabs on the edges of pages of the book. These tabs can help the user quickly locate a certain letter of the alphabet, as one example. In one embodiment, there is a marker on the cover of the book to help determine the orientation of the closed book. Also, the system knows the thickness of the book in one embodiment to know where to render the tabs. The thickness could be determined by the system using camera data or, alternatively, the thickness might be provided to the system as an input parameter.
If the marker is not a special navigation marker, then it is determined what element in the digital file corresponds to the marker, in step 612. In step 614, an image representing that element is presented in the HMD 2. Note that a special navigation aid, such as a letter of the alphabet, could be presented on the page with the contact.
Note that in one embodiment, more than one marker is identified at a time. For example, a marker on each of two pages that are open is identified. One element of the digital file could be presented for each marker, in this case. The elements could be presented on the respective pages of the book.
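The decision made in steps 608 through 614 may be sketched as a simple lookup. The following is an illustrative sketch under stated assumptions: the two dictionaries stand in for the association accessed in step 604, the function names are hypothetical, and returning a "nothing" result for an unbound marker is an assumed behavior that the description does not specify.

```python
def present_for_marker(marker, navigation_aids, elements):
    """Decide what to present for one identified marker.

    Returns a (kind, payload) tuple rather than rendering anything.
    """
    if marker in navigation_aids:      # steps 608 and 610
        return ("navigation_aid", navigation_aids[marker])
    if marker in elements:             # steps 612 and 614
        return ("element", elements[marker])
    return ("nothing", None)           # unbound marker (assumed behavior)


def present_for_visible_markers(markers, navigation_aids, elements):
    """Handle the case where more than one marker is identified at a time."""
    return [present_for_marker(m, navigation_aids, elements) for m in markers]
```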
In step 624, an association between points in the media file and markers in the physical object 11 is accessed. This association may have been built in process 500 of
In step 626, a marker is identified in the physical object 11, as the user manipulates the physical object 11. A determination is made whether the marker is a special navigation marker, in step 628. If it is, then a special navigation aid is presented in the HMD 2 in step 630. One example is to present a table of contents. The table of contents may specify what page of the physical object 11 a user should turn to in order to access certain sections of the media file. For example, if the media file is a movie, the movie could be broken down into different scenes. The table of contents may specify which page each scene can be found at. Thus, the user will know where to quickly turn to in order to access a particular scene.
If the marker is not a special navigation marker, then the time that corresponds to the marker is determined, in step 632. In step 634, the media file is presented in the HMD 2 starting at the time determined in step 632. Note that process 620 is a way of “scrubbing” through the media file. For example, by flipping through pages of the book, the user is able to quickly scan through the media file for a point of interest.
Another way to associate markers with a media file is by a segment of the media file. Examples of segments are songs on a compact disc, scenes in a movie, and episodes on a disc having multiple episodes of a show. Also note that more than one media file, such as a number of compact discs, MP3 files, etc., can be navigated using the physical object 11. For example, the user could scan through their entire collection of music by flipping through pages of the book.
In step 646, a marker is identified in the physical object 11, as the user manipulates the physical object 11. A determination is made whether the marker is a special navigation marker, in step 648. If it is, then a special navigation aid is presented in the HMD 2 in step 650. One example is to present a table of contents. The table of contents may specify what page of the physical object 11 a user should turn to in order to access a certain segment of the media file(s). For example, if the media file is a movie, the movie could be broken down into different scenes. The table of contents may specify which page each scene can be found at. Thus, the user will know where to quickly turn to in order to access a particular scene. If the user is navigating their music collection, the table of contents could let them know what page a song or album is at.
If the marker is not a special navigation marker, then the segment that corresponds to the marker is determined, in step 652. In step 654, the media file(s) is presented in the HMD 2 starting at the segment determined in step 652. As one example, the user turns to page 85 and a certain song associated with the marker on page 85 starts to play. The HMD 2 might also present some virtual image, such as cover art, concert footage, a music video, etc. As another example, the HMD 2 starts to play a certain scene in a movie that is associated with the marker on the open page in the book. To provide a greater viewing field, the user might look at a wall instead of the book.
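A segment-based association, together with a table of contents derived from it, may be sketched as follows. This is only an illustration. It assumes one marker per page, that marker 1 is reserved for the table of contents, and that each segment is described by a title and a start time in seconds. The name `build_segment_index` is hypothetical.

```python
def build_segment_index(segments, first_marker=2):
    """Bind consecutive markers to media segments.

    `segments` is a list of (title, start_seconds) tuples. Returns a pair:
    a {marker: (title, start_seconds)} mapping and a table of contents as a
    list of (title, marker) entries.
    """
    marker_to_segment = {}
    table_of_contents = []
    for offset, (title, start) in enumerate(segments):
        marker = first_marker + offset
        marker_to_segment[marker] = (title, start)
        table_of_contents.append((title, marker))
    return marker_to_segment, table_of_contents
```

When a marker bound to a segment is identified, playback would start at the stored start time. The table of contents entries indicate which marker, and hence which page, the user should turn to for a given scene or song.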
The process 680 provides further details of one embodiment of process 400 of
In step 684, an association of the links between each marker and a particular page of a particular version of the digital content is accessed. Note that a unit other than a page may be used. As one example, the physical object 11 is a book having an ordered sequence of pages. Each page has one marker, in one embodiment. One or more of the pages could be used for a special navigation page, such as a table of contents. Other pages could be used for a page of one of the versions of the digital content.
In step 686, a marker is identified in the physical object 11, as the user manipulates the physical object 11. A determination is made whether the marker is a special navigation marker, in step 688. If it is, then a special navigation aid is presented in the HMD 2 in step 690. One example is to present a table of contents. The table of contents may specify what page of the physical object 11 a user should turn to in order to access a particular page of a particular version of the digital content.
If the marker is not a special navigation marker, then the page and version of the digital content that corresponds to the marker is determined, in step 692. In step 693, the page for that version of the digital content is presented in the HMD 2. Note that the organization might be to present one page of each version after another. Thus, the user could move from one version to the next to compare how the digital content was changed. For example, the user could turn to page 1 to see the first edit of a document, and then turn to page 2 to see the second edit of that document. Presenting one page of the digital content is just one example of a unit for display. A unit other than a page could be presented. Also, the presentation could be such that one version is on one page of the physical object 11 and the next version is on the opposite page. Therefore, the user can do a side-by-side comparison. Note that the presentation does not need to be on the page of the physical object 11.
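The organization in which one page of each version is presented after another may be sketched as a simple index calculation. The function name `version_and_page` and its parameters are hypothetical, and the interleaved ordering is only one of the organizations contemplated above.

```python
def version_and_page(marker, version_count, first_marker=1):
    """Return (version, page), both 1-based, for a marker number.

    Assumes markers are laid out as: page 1 of version 1, page 1 of
    version 2, ..., then page 2 of version 1, and so on.
    """
    index = marker - first_marker
    if index < 0:
        raise ValueError("marker precedes the first content marker")
    page, version = divmod(index, version_count)
    return version + 1, page + 1
```

With two versions, markers 1 and 2 correspond to page 1 of the first and second versions, so facing pages of the book can show the two versions side by side.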
Note that the process of identifying markers is made more accurate in one embodiment by identifying more than one marker at a time. For example, when a book is open typically two pages can be viewed by the camera. The system can attempt to identify the marker on each page. Since the system knows what markers are expected to be paired together, the system can attempt to resolve any uncertainty based on possible combinations of markers that are allowed.
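Resolving uncertainty from allowed marker combinations may be sketched as follows. This is a minimal sketch and not the disclosed method. It assumes that the marker detector reports, for each visible page, candidate markers with confidence values between 0 and 1, and that the set of allowed (left, right) pairs is known in advance. The function name and the scoring by product of confidences are assumptions.

```python
def resolve_marker_pair(left_candidates, right_candidates, allowed_pairs):
    """Pick the most likely (left, right) marker pair that is allowed.

    `left_candidates` and `right_candidates` map marker labels to detector
    confidences. Returns None when no allowed combination exists.
    """
    best_pair = None
    best_score = 0.0
    for left, left_conf in left_candidates.items():
        for right, right_conf in right_candidates.items():
            if (left, right) not in allowed_pairs:
                continue
            score = left_conf * right_conf
            if score > best_score:
                best_pair, best_score = (left, right), score
    return best_pair


# Marker A is nearly certain; the second marker is either B or F.
allowed = {("A", "B"), ("E", "F")}
print(resolve_marker_pair({"A": 0.95}, {"B": 0.5, "F": 0.5}, allowed))  # ('A', 'B')
```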
In one embodiment, the process of accessing and presenting data is made more efficient by pre-fetching, pre-rendering, etc. of content. For example, if the user is flipping through the pages of the book sequentially, then the next marker(s) can be predicted. Therefore, the next content can be predicted, and pre-fetching, pre-rendering, and other anticipatory steps can be taken.
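Predicting the next marker(s) during sequential flipping may be sketched as follows. The sketch assumes, for illustration, that markers are numbered in page order and that the direction of flipping can be inferred from the two most recently identified markers. The function name and the default parameter values are hypothetical.

```python
def predict_next_markers(recent_markers, markers_per_spread=2, marker_count=300):
    """Predict which markers are likely to be seen next.

    `recent_markers` is the history of identified marker numbers, oldest
    first. Returns an empty list when no direction can be inferred.
    """
    if len(recent_markers) < 2:
        return []
    step = recent_markers[-1] - recent_markers[-2]
    if step == 0:
        return []
    direction = 1 if step > 0 else -1
    start = recent_markers[-1] + direction
    predicted = [start + direction * i for i in range(markers_per_spread)]
    # Discard predictions that fall outside the physical object.
    return [m for m in predicted if 1 <= m <= marker_count]
```

The content bound to the predicted markers could then be fetched or rendered before the corresponding pages become visible to the camera.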
Device 800 may also contain communications connection(s) 812 such as one or more network interfaces and transceivers that allow the device to communicate with other devices. Device 800 may also have input device(s) 814 such as keyboard, mouse, pen, voice input device, touch input device, etc. Output device(s) 816 such as a display, speakers, printer, etc. may also be included. All these devices are well known in the art and need not be discussed at length here.
As discussed above, the processing unit 4 may be embodied in a mobile device 5.
Mobile device 900 may include, for example, one or more processors 912 and memory 910 including applications and non-volatile storage. The processor 912 can implement communications, as well as any number of applications, including the interaction applications discussed herein. Memory 910 can be any variety of memory storage media types, including non-volatile and volatile memory. A device operating system handles the different operations of the mobile device 900 and may contain user interfaces for operations, such as placing and receiving phone calls, text messaging, checking voicemail, and the like. The applications 930 can be any assortment of programs, such as a camera application for photos and/or videos, an address book, a calendar application, a media player, an internet browser, games, other multimedia applications, an alarm application, other third party applications like a skin application and image processing software for processing image data to and from the display device 2 discussed herein, and the like. The non-volatile storage component 940 in memory 910 contains data such as web caches, music, photos, contact data, scheduling data, and other files.
The user is able to navigate the various data stored on the mobile device 900 using a physical object 11, such as a book, in accordance with embodiments described herein. As noted, the mobile device 900 could be used as the processing unit 4. As another alternative, system 8 has access to mobile device 900 and data stored thereon.
The processor 912 also communicates with RF transmit/receive circuitry 906 which in turn is coupled to an antenna 902, with an infrared transmitter/receiver 908, with any additional communication channels 960 like Wi-Fi, WUSB, RFID, infrared or Bluetooth, and with a movement/orientation sensor 914 such as an accelerometer. Accelerometers have been incorporated into mobile devices to enable such applications as intelligent user interfaces that let users input commands through gestures, indoor GPS functionality which calculates the movement and direction of the device after contact is broken with a GPS satellite, and detection of the orientation of the device so that the display automatically changes from portrait to landscape when the phone is rotated. An accelerometer can be provided, e.g., by a micro-electromechanical system (MEMS) which is a tiny mechanical device (of micrometer dimensions) built onto a semiconductor chip. Acceleration direction, as well as orientation, vibration and shock can be sensed. The processor 912 further communicates with a ringer/vibrator 916, a user interface keypad/screen, biometric sensor system 918, a speaker 920, a microphone 922, a camera 924, a light sensor 921 and a temperature sensor 927.
The processor 912 controls transmission and reception of wireless signals. During a transmission mode, the processor 912 provides a voice signal from microphone 922, or other data signal, to the RF transmit/receive circuitry 906. The transmit/receive circuitry 906 transmits the signal to a remote station (e.g., a fixed station, operator, other cellular phones, etc.) for communication through the antenna 902. The ringer/vibrator 916 is used to signal an incoming call, text message, calendar reminder, alarm clock reminder, or other notification to the user. During a receiving mode, the transmit/receive circuitry 906 receives a voice or other data signal from a remote station through the antenna 902. A received voice signal is provided to the speaker 920 while other received data signals are also processed appropriately.
Additionally, a physical connector 988 can be used to connect the mobile device 900 to an external power source, such as an AC adapter or powered docking station. The physical connector 988 can also be used as a data connection to a computing device. The data connection allows for operations such as synchronizing mobile device data with the computing data on another device.
A GPS receiver 965 uses satellite-based radio navigation to relay the position of the user to applications that are enabled for such service.
The example computer systems illustrated in the figures include examples of computer readable storage devices. Computer readable storage devices are also processor readable storage devices. Such devices may include volatile and nonvolatile, removable and non-removable memory devices implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Some examples of processor or computer readable storage devices are RAM, ROM, EEPROM, cache, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, memory sticks or cards, magnetic cassettes, magnetic tape, a media drive, a hard disk, magnetic disk storage or other magnetic storage devices, or any other device which can be used to store the desired information and which can be accessed by a computer.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
Number | Date | Country | |
---|---|---|---|
20130321255 A1 | Dec 2013 | US |