Touchscreen technology can be used to facilitate display interaction on mobile devices such as smart phones and tablets, as well as on personal computers (“PCs”) with larger screens, e.g., desktop computers. However, as touchscreen sizes increase, the cost of touchscreen technology may increase exponentially. Moreover, large touchscreens may result in “gorilla arm,” in which the human arm, held in an unsupported horizontal position, rapidly becomes fatigued and painful. A separate interactive touch surface, such as a trackpad, may be used as an indirect touch device that connects to a host computer and controls a mouse pointer when a single finger is used. The trackpad can also be used with gestures, including scroll, swipe, pinch, zoom, and rotate gestures.
Features of the present disclosure are illustrated by way of example and not limited in the following figure(s), in which like numerals indicate like elements.
For simplicity and illustrative purposes, the present disclosure is described by referring mainly to an example thereof. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be readily apparent, however, that the present disclosure may be practiced without limitation to these specific details. In other instances, some methods and structures have not been described in detail so as not to unnecessarily obscure the present disclosure. As used herein, the terms “a” and “an” are intended to denote at least one of a particular element, the term “includes” means includes but not limited to, the term “including” means including but not limited to, and the term “based on” means based at least in part on.
Additionally, it should be understood that the elements depicted in the accompanying figures may include additional components and that some of the components described in those figures may be removed and/or modified without departing from the scope of the elements disclosed herein. It should also be understood that the elements depicted in the figures may not be drawn to scale and thus, the elements may have sizes and/or configurations other than those shown in the figures.
Referring now to
Controller 112 may take various forms. In some examples, controller 112 takes the form of a processor, or central processing unit (“CPU”), or even multiple processors, such as a multi-core processor. Such a processor may execute instructions stored in memory (not depicted in
Controller 112 is operably coupled with 3D vision sensor 106, e.g., using various types of wired and/or wireless data connections, such as universal serial bus (“USB”), wireless local area networks (“LAN”) that employ technologies such as the Institute of Electrical and Electronics Engineers (“IEEE”) 802.11 standards, personal area networks, mesh networks, and so forth. Accordingly, vision data 116 captured by 3D vision sensor 106 is provided to controller 112. Controller 112 is likewise operably coupled with touch interaction surface 102—which in this example takes the form of a touch sensor or “interactive touch surface”—using the same type of connection as was used for 3D vision sensor 106 or a different type of data connection. Accordingly, touch data 118 captured by touch interaction surface 102 is provided to controller 112. However, in other examples, touch interaction surface 102 may be passive, and physical contact with touch interaction surface 102, e.g., by a hand 120 of a user 122, may be detected using vision data 116 alone. For example, touch interaction surface 102 may simply be a portion of a desktop or other work surface that is within FOV 104 of 3D vision sensor 106.
In some examples in which touch interaction surface 102 is interactive and generates touch data 118, touch interaction surface 102 may include a screen. For example, touch interaction surface 102 may take the form of a touchscreen tablet. In some such examples, a user may operate the tablet, e.g., using a hard or soft input element, or a gesture, to transition stylus/touch interactivity from the tablet to a separate display, such as display 110. This may include examples in which touch interaction surface 102 itself is a computer, with controller 112 integrated therein, as may be the case when touch interaction surface 102 takes the form of a laptop computer that is convertible to a tablet form factor.
3D vision sensor 106 may take various forms. In some examples, 3D vision sensor 106 may operate in various ranges of the electromagnetic spectrum, such as visible, infrared, etc. In some examples, 3D vision sensor 106 may detect 3D/depth information. For example, 3D vision sensor 106 may include an array of sensors to triangulate and/or interpret depth information. In some examples, 3D vision sensor 106 may take the form of a multi-camera apparatus such as a stereoscopic and/or stereographic camera. In some examples, 3D vision sensor 106 may take the form of a structured illumination apparatus that projects known patterns of light onto a scene, e.g., in combination with a single camera or multiple cameras. In some examples, 3D vision sensor 106 may include a time-of-flight apparatus with or without single or multiple cameras. In some examples, vision data 116 may take the form of two-and-a-half-dimensional (“2.5D”) (2D with depth) image(s), where each of the pixels of the 2.5D image defines an X, Y, and Z coordinate of a surface of a corresponding object, and optionally color values (e.g., R, G, B values) and/or other parameters for that coordinate of the surface. In some examples, 3D vision sensor 106 may take the form of a 3D laser scanner.
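As a concrete illustration of the 2.5D data format described above, the following sketch converts a single depth-image pixel into an (X, Y, Z) surface coordinate. It assumes a pinhole camera model with intrinsic parameters fx, fy, cx, and cy; these parameters and their example values are not specified in this disclosure and are used here only for illustration.

```python
import numpy as np

def pixel_to_point(u, v, depth, fx, fy, cx, cy):
    """Convert one 2.5D pixel (column u, row v, depth in meters) into an
    (X, Y, Z) coordinate in the sensor frame, assuming a pinhole camera
    with focal lengths fx, fy and principal point (cx, cy)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# Example: the pixel at column 320, row 240 with a depth reading of 0.5 m.
point = pixel_to_point(320, 240, 0.5, fx=600.0, fy=600.0, cx=319.5, cy=239.5)
```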
In some examples, 3D vision sensor 106 may capture vision data 116 at a framerate and/or accuracy that is sufficient to generate, in “real time,” a 3D representation of a hand 120 of a user 122. In some examples, this 3D representation of hand 120 may take the form of a skeletal representation that includes, for instance, wrist and finger joints. In other examples, it may take the form of a 3D point cloud, a wireframe structure, and so forth.
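One minimal way to organize such a skeletal representation in software is sketched below. The joint naming scheme and the use of one 3D coordinate per joint are illustrative assumptions; the disclosure does not prescribe a particular data layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]  # (X, Y, Z) in the sensor frame

@dataclass
class SkeletalHand:
    """Illustrative skeletal representation of a tracked hand: a wrist joint
    plus an ordered list of joints per finger, each stored as a 3D point."""
    wrist: Point3D
    # e.g., {"index": [mcp, pip, dip, tip], "thumb": [...], ...}
    fingers: Dict[str, List[Point3D]] = field(default_factory=dict)

    def fingertip(self, finger: str) -> Point3D:
        """Return the last (tip) joint of the named finger."""
        return self.fingers[finger][-1]
```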
Additionally or alternatively, in some examples, multiple sensors may be employed in tandem to determine a position, size, and/or pose of hand 120, from which a 3D representation of hand 120 may be generated. For example, one 2D vision sensor may be positioned over touch interaction surface 102 to capture a silhouette of hand 120. At the same time, touch data 118 may indicate locations of touch events on touch interaction surface 102. These signals may be combined to estimate a size, position, and/or pose of hand 120. Additionally or alternatively, ultrasound sensors may be deployed to detect, for instance, a height of hand 120.
Based on vision data 116 received from 3D vision sensor 106 and/or touch data 118 received from touch interaction surface 102, controller 112 may cause a virtual hand 124 to be rendered on display 110. Virtual hand 124 may be transparently or translucently overlaid on other displayed elements (not depicted in
In some examples, including that of
Also depicted in
In some examples, a placement and/or configuration of 3D vision sensor 106 may be selected so that FOV 104 captures at least the extent of touch interaction surface 102, e.g., so that 3D vision sensor 106 is able to detect when hand 120 extends over touch interaction surface 102. In some examples, FOV 104 of 3D vision sensor 106 may cover a volume extending some distance vertically above touch interaction surface 102, e.g., a few inches. This may allow for detection of, for instance, a user's fingers hovering an inch above the lower edge of touch interaction surface 102. Additionally or alternatively, in some examples, FOV 104 of 3D vision sensor 106 may extend farther towards user 122 such that the entirety of hand 120 is captured even when user 122 only extends hand 120 over the lower portion of touch interaction surface 102. In some examples, FOV 104 may extend even farther towards user 122 such that 3D vision sensor 106 is able to see the whole of the user's hand 120 when the user's fingertips are at a lower edge of touch interaction surface 102.
In
In some examples, a calibration routine may be implemented to establish a location of 3D vision sensor 106 with respect to touch interaction surface 102. If 3D vision sensor 106 is physically coupled to touch interaction surface 102, as is depicted in
As described previously, 3D vision sensor 106 generates vision data 116 and touch interaction surface 102 generates touch data 118. Vision data 116 is provided to a hand recognition and tracking module 212. Hand recognition and tracking module 212 processes vision data 116—and in some examples, other data from other sensors, such as touch data 118—to generate a 3D representation of the user's hand 120. As noted previously, in some examples the 3D representation of the user's hand 120 takes the form of a skeletal model.
One example of a skeletal hand model 324 is depicted in
The size of the user's hand 120 relative to touch interaction surface 102 may or may not be desirable for recreation on display 110. For example,
Accordingly, and referring back to
Rendering module 244 causes virtual hand 124 to be rendered on display 110. In many examples, rendering module 244 renders virtual hand 124, and a virtual stylus if stylus 140 is detected, from a viewpoint above touch interaction surface 102. In some examples, the rendering may be orthographic, e.g., so that vertical movement of hand 120 towards/away from touch interaction surface 102 does not result in any change in virtual hand 124. Alternatively, the user raising their hand vertically may change the scaling of virtual hand 124, e.g., increasing its displayed size by 10%, without affecting its position. Changes in vertical height of hand 120 from touch interaction surface 102 may also be visually indicated in other ways, such as fading, blurring, or changing a color of virtual hand 124, or adding some indication mechanism to virtual hand 124, such as shapes at each fingertip that expand and fade with the vertical height of hand 120 from touch interaction surface 102.
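The height-based visual cues described above might be implemented along the following lines. The specific maximum height, the 10% size change, and the opacity ramp are illustrative choices, not values taken from this disclosure.

```python
def hover_appearance(height_mm: float, max_height_mm: float = 75.0):
    """Map the vertical height of the hand above the touch interaction
    surface to rendering parameters for the virtual hand.

    Returns a (scale, opacity) pair: the hand is drawn slightly larger and
    more faded the farther it hovers above the surface."""
    h = max(0.0, min(height_mm, max_height_mm))
    fraction = h / max_height_mm
    scale = 1.0 + 0.10 * fraction      # up to 10% larger when fully raised
    opacity = 1.0 - 0.5 * fraction     # fade toward 50% opacity
    return scale, opacity

# Example: a hand hovering 25 mm above the surface.
scale, opacity = hover_appearance(25.0)
```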
Because of the scaling performed by scaling system 230, rendering module 244 renders virtual hand 124 to occupy a smaller portion of display 110 than it would unscaled, rather than dominating nearly all of display 110. Consequently, in some examples, virtual hand 124 may appear closer to life-sized, providing user 122 with a better and/or more intuitive experience.
In various examples, virtual hand 124 may be rendered in various ways based on the 3D representation of the user's hand 120. A user may be able to select how virtual hand 124 is rendered from these options. For example, a user may be able to select whether virtual hand 124 is rendered to appear realistic or abstract. In one example, the 3D representation itself is rendered on display 110 as virtual hand 124. Additionally or alternatively, in some examples, virtual hand 124 may be rendered by projecting the 3D representation of the user's hand onto the display as a 2D projection, which may be rendered variously as a silhouette, a shadow hand, a cartoon outlined hand, a wireframe hand, etc. In yet other examples, virtual hand 124 may be rendered as a skeletal hand. In some examples, virtual hand 124—and the virtual stylus if actual stylus 140 is detected—may be alpha-blended with underlying content already rendered on display 110. Consequently, virtual hand 124 may appear at least partially transparent so that the underlying display content is still visible.
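Alpha blending of the virtual hand with underlying display content can be expressed with the standard “over” compositing operation, sketched below. The per-pixel hand color, hand alpha, and background arrays are placeholders for whatever the rendering pipeline actually produces.

```python
import numpy as np

def blend_over(hand_rgb: np.ndarray, hand_alpha: np.ndarray,
               background_rgb: np.ndarray) -> np.ndarray:
    """Composite the virtual-hand layer over already-rendered display
    content using straight (non-premultiplied) alpha.

    hand_rgb, background_rgb: HxWx3 float arrays in [0, 1].
    hand_alpha: HxW float array in [0, 1]; 0 means fully transparent."""
    a = hand_alpha[..., np.newaxis]
    return a * hand_rgb + (1.0 - a) * background_rgb
```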
In
Scaling center engine 232 may identify a scaling center at various locations. In some examples, scaling center engine 232 may identify, as a scaling center, a primary point of physical interaction between user 122 and touch interaction surface 102. This might correspond, for example, with the finger or fingers most commonly used for touch operations, which might vary from one user to another, e.g., depending on which types of touch gestures a given user employs most frequently. In
Referring back to
In some examples, an equation combining two terms may be employed to determine the scaling factor SF. The first term relates the whole display area D_D to all or part of the area D_T of touch interaction surface 102. This relationship may include accommodating aspect ratio mismatches between display 110 and touch interaction surface 102, as well as allowing user 122 to map all or a portion of touch interaction surface 102 onto display 110. The second term ensures that virtual hand 124/324 rendered on the display subtends a similar visual angle for user 122 as the user's hand 120 on touch interaction surface 102. As noted previously, the distance 134 between user 122 and display 110 may be determined using, for instance, vision data captured by camera 130. In some examples, user 122 may have the ability to adjust and save a preferred scaling factor and/or scaling center. In some such examples, user 122 may associate these preferences with preset options such as “desktop,” “presentation,” and so forth.
In other examples, scaling center engine 232 may determine the scaling factor based on non-physical, or “virtual” rendering constraints. One type of virtual rendering constraint may be an application window having a current focus; such an application window may occupy less than the entirety of display 110. Alternatively, suppose that instead of viewing a display that is more or less perpendicular to touch interaction surface 102, as is depicted in
Note that the scale factor applied to the 3D representation of the user's hand, described by the equation above, may be different from the scale factor used to transform the position of that representation on touch interaction surface 102 to a position on the display 110. The latter scale factor may only include the first term in the above, i.e., the term relating display area D_D to touch interaction surface area D_T.
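One way to make the two terms and the position mapping concrete is sketched below. Here D_D and D_T denote the display dimension and the corresponding mapped dimension of touch interaction surface 102, d_D denotes distance 134 between user 122 and display 110, h_hand and h_render denote the physical and rendered hand sizes, and d_T denotes the distance between user 122 and touch interaction surface 102; d_T is introduced here only for illustration, since it is needed to state the visual-angle condition.

```latex
% First term (also the position scale factor): a point p_T on touch
% interaction surface 102 maps to a display position p_D.
p_D \;\approx\; \frac{D_D}{D_T}\, p_T

% Second term, expressed as the visual-angle condition it enforces: the
% rendered hand subtends roughly the same angle at distance d_D as the
% physical hand does at distance d_T.
\frac{h_{\mathrm{render}}}{d_D} \;\approx\; \frac{h_{\mathrm{hand}}}{d_T}
\qquad\Longrightarrow\qquad
h_{\mathrm{render}} \;\approx\; h_{\mathrm{hand}}\,\frac{d_D}{d_T}
```

Under this reading, the scale factor SF applied to the 3D hand representation accounts for both relationships, while the position mapping uses only the first term, consistent with the note above.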
Blending engine 236 receives the scaled 3D representation of the user's hand and, if applicable, blends it with other 3D data. For example, and as will be described below, if user 122 grasps stylus 140 over touch interaction surface 102, a 3D representation of stylus 140 may be generated, e.g., based on a detected pose of stylus 140. This 3D representation of stylus 140 may then be blended with the 3D representation of the user's hand 120 by blending engine 236.
As noted previously, in some examples, touch interaction surface 102 generates touch data 118. In
A stylus detection and tracking module 256 may receive stylus data 258 from stylus 140, and/or from touch interaction surface 102 in examples in which stylus 140 and touch interaction surface 102 operate in cooperation. As described herein, in some examples, when stylus 140 is detected as being grasped by user 122, e.g., by stylus detection and tracking module 256 or by scaling system 230, the scaling center may be identified as nib 142 of stylus 140. Data indicative of stylus data 258, such as stylus position and/or pose, may be provided to scaling system 230.
In
It can be seen in
For multi-touch gestures such as that represented by 460 and 462, the scaling that is applied to the 3D representation of the user's hand might result in the finger touch locations appearing closer together on the display than where they physically occur on touch interaction surface 102. Accordingly, the touch events generated by touch interaction surface 102 may be scaled, e.g., by scaling system 230, in the same or similar manner as the 3D representation of the user's hand before being passed on to controller 112, so that scaled touch events 460′, 462′ correspond to the locations of the fingers on virtual hand 124. In
When stylus 140 is detected in the user's grasp, e.g., from vision data 116, from touch data 118, or from other sensor(s) such as stylus 140 itself, virtual hand 124 may be rendered differently to represent the user's hand holding an avatar of stylus 140. As noted previously, in various examples, the pose of stylus 140, which may include its position, tilt, etc., may be determined from any of the aforementioned data sources and used to render virtual hand 124 holding an avatar of stylus 140. Referring now to
Because virtual stylus 546 is scaled about the scaling center 550 at its tip, the location at which nib 142 contacts touch interaction surface 102 is unaffected by scaling applied to virtual stylus 546, and thus, the location can be passed directly to, for instance, an operating system of the computing device. In some examples, if a change in scaling center 550 is significant when starting or ending stylus use, that is, when transitioning between a hand-based scaling center and a stylus-based scaling center, the change in the scaling center's position may be animated over some small interval of time to make the change less visually abrupt.
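The animated transition between a hand-based scaling center and a stylus-based scaling center could be realized with a simple interpolation over a short, fixed duration, as in the sketch below; the 200 ms duration and the linear easing are illustrative assumptions.

```python
def interpolate_scaling_center(old_center, new_center, t_since_switch_s,
                               duration_s=0.2):
    """Move the effective scaling center from old_center to new_center over
    duration_s seconds so the change is not visually abrupt.

    old_center, new_center: (x, y) coordinates on the touch interaction
    surface; t_since_switch_s: seconds elapsed since the transition began."""
    progress = min(1.0, max(0.0, t_since_switch_s / duration_s))
    x = old_center[0] + progress * (new_center[0] - old_center[0])
    y = old_center[1] + progress * (new_center[1] - old_center[1])
    return (x, y)
```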
In some examples, virtual stylus 546 may be rendered as a user-selected tool. For example, a user operating a graphic design or photo editing application may have access to a number of drawing tools, such as airbrushes, paintbrushes, erasers, pencils, pens, etc. Rather than rendering virtual stylus 546 to appear similar to actual stylus 140, in some examples, virtual stylus 546 may be rendered to appear as the user-selected tool. Thus, a user who selects an airbrush will see virtual hand 124 holding an airbrush. In some examples, other aspects of the user-selected tool may be incorporated into virtual stylus 546. For example, a user may vary an amount of pressure applied to touch interaction surface 102 by stylus 140, and this may be represented visually by virtual stylus 546, e.g., with a color change or, in the case of a virtual paintbrush tool, a change in the shape of the brush tip.
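A pressure-dependent appearance of this kind might be computed as follows; the normalized pressure range and the mapping to brush-tip radius are illustrative assumptions rather than values from this disclosure.

```python
def brush_tip_radius(pressure: float, min_radius: float = 1.0,
                     max_radius: float = 12.0) -> float:
    """Map normalized stylus pressure (0.0-1.0) reported for stylus 140 to
    the rendered radius of a virtual paintbrush tip."""
    p = min(1.0, max(0.0, pressure))
    return min_radius + p * (max_radius - min_radius)
```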
In some examples, system 100 may detect the special case of a user using a computer mouse on touch interaction surface 102. The mouse's position and the location of the cursor on display 110 may not be directly related. Accordingly, in this special case, system 100 may render the scaled representation of the mouse and the user's hand (scaled, for example, about the front edge of the mouse) at the cursor location, irrespective of the location of the physical mouse on touch interaction surface 102. Alternatively, the system may not render a representation of the mouse, or the hand holding it, at all.
Examples described herein are not limited to rendering a single virtual hand of a user. Techniques described herein may be employed to detect, scale, and render virtual representations of multiple hands of a single user, or even multiple hands of multiple users. Moreover, if any of the multiple detected hands is holding stylus 140, that may be detected and included in the virtual representation. In some examples in which multiple hands are detected, resulting in rendition of multiple virtual hands 124, the 3D representations of the multiple hands may be scaled together about a single scaling center. This may ensure that when fingers from different hands touch each other, which the user will feel, the fingers of the virtual hands will also appear to touch. Additionally or alternatively, in some examples, each virtual hand may be scaled separately about its own scaling center when the virtual hands are farther apart than some threshold, such as a fixed distance, a percentage of the width of touch interaction surface 102, etc. When the user's hands are brought closer together, the multiple scaling centers may be transitioned to a single scaling center.
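The choice between a shared scaling center and per-hand scaling centers described above might be made with logic like the following; the distance threshold, expressed here as a fraction of the touch interaction surface width, is an illustrative assumption.

```python
import math

def choose_scaling_centers(center_left, center_right, threshold=0.25):
    """Given per-hand candidate scaling centers (x, y), expressed here as
    fractions of the touch interaction surface width, return either one
    shared center (hands close together) or one center per hand."""
    dx = center_right[0] - center_left[0]
    dy = center_right[1] - center_left[1]
    if math.hypot(dx, dy) < threshold:
        shared = ((center_left[0] + center_right[0]) / 2.0,
                  (center_left[1] + center_right[1]) / 2.0)
        return {"left": shared, "right": shared}
    return {"left": center_left, "right": center_right}
```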
Referring now to
In
At block 702, the system may receive, from 3D vision sensor 106, vision data 116 capturing at least a portion of a user 122 in an environment. In various examples, the vision data may include data representing the user's hand 120 relative to touch interaction surface 102. At block 704, the system may process the vision data 116 to generate a 3D representation of the user's hand. This 3D representation may take the form of a 3D point cloud, a 3D skeletal model, etc.
At block 706, the system may identify a scaling center on touch interaction surface 102 to scale the 3D representation of the user's hand. Various examples of scaling centers are described herein, including those locations referenced by 350, 550, and 650. As noted herein, scaling centers may be identified based on fingertip locations, offset from a user's wrist, location of nib 142 of stylus 140, etc.
At block 708, the system may scale, using a scaling factor, the 3D representation of the user's hand with respect to (e.g., about) the scaling center identified at block 706. In various examples, the scaling factor may be based on various rendering constraints. Rendering constraints include but are not limited to physical dimensions of a display, physical dimensions of touch interaction surface 102, distance of the user from the display and/or touch interaction surface, orientation of virtual surfaces on which a virtual hand is to be rendered, an application window size, an orientation of the display, and so forth.
At block 710, the system may render a virtual hand. Rendering as used herein may refer to causing a virtual hand to be rendered on an electronic display, such as display 110, a display of an HMD, a projection screen, and so forth. However, rendering is not limited to causing output on a physical display. In some examples, rendering may include rendering data in a two-dimensional buffer and/or in a two-dimensional memory array, e.g., forming part of a graphics processing unit (“GPU”). In various examples, the virtual hand may be rendered based on the scaled 3D representation of the user's hand, and may be rendered realistically and/or abstractly, e.g., as a skeletal model, an outline/silhouette, a cartoon, etc. The virtual hand may be rendered transparently to avoid occluding content already rendered on the display, e.g., by blending alpha channels.
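Blocks 702-710 can be summarized in a short end-to-end sketch; the same scale-about-center operation can also be reused for the touch-event scaling of blocks 802-806, described below. The function names, the use of numpy arrays of joint coordinates, the externally supplied scale factor, and the index-fingertip choice of scaling center are illustrative assumptions rather than elements of the disclosure.

```python
import numpy as np

def scale_about_center(points: np.ndarray, center: np.ndarray,
                       scale_factor: float) -> np.ndarray:
    """Scale 3D (or 2D touch) coordinates about a scaling center so that the
    center itself is unchanged; usable for both the hand representation
    (block 708) and touch events (block 806)."""
    return center + scale_factor * (points - center)

def process_frame(hand_joints: np.ndarray, touch_points: np.ndarray,
                  scale_factor: float):
    """Blocks 702-710 in miniature: given hand joints derived from the 3D
    vision data and touch coordinates from the touch interaction surface,
    scale both about a common scaling center and return them for rendering."""
    # Block 706: identify a scaling center, here the index fingertip
    # (joint 8 in this illustrative joint layout).
    scaling_center = hand_joints[8]

    # Block 708: scale the 3D hand representation about that center.
    scaled_joints = scale_about_center(hand_joints, scaling_center, scale_factor)

    # Block 806 analog: scale touch events with the same factor and center
    # (using only the in-plane coordinates of the center).
    scaled_touches = scale_about_center(touch_points, scaling_center[:2],
                                        scale_factor)

    # Block 710: hand the scaled geometry to the renderer (not shown here).
    return scaled_joints, scaled_touches
```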
At block 802, the system may receive, from touch interaction surface 102, data representing a touch input event from the user's hand, such as touch data 118. For example, the touch input event may include coordinates on touch interaction surface 102 at which physical contact is detected from user 122. Touch inputs may come in various forms, such as a tap or swipe, or multi-touch input events such as pinches, etc. Touch events may also be caused by various physical objects, such as one or more fingers of the user, a stylus, or other implements such as brushes (which may not include paint but instead may be intended to mimic the act of painting), forks, rulers, protractors, compasses, or any other implement brought into physical contact with touch interaction surface 102.
At block 804, the system may process the data representing the touch input event to generate a representation of the touch input event. Non-limiting examples of representations of touch input events were indicated at 460 and 462 of
At block 806, the system may scale the representation(s) of the touch input event(s) with respect to the identified scaling center using the same scaling factor as was used at block 708 of
At block 902, the system may detect a stylus proximate touch interaction surface 102, e.g., based on wireless communication between the stylus and touch interaction surface 102, based on a detected position of the stylus relative to a known position of touch interaction surface 102, and/or based on the vision data 116 generated by 3D vision sensor 106. At block 904, which may occur alongside or in place of block 706 of
At block 906, the system may detect a pose of the stylus, e.g., based on information provided by the stylus about its orientation, or based on an orientation of the stylus detected in vision data 116. At block 908, the system may generate a 3D representation of the stylus based on the pose of the stylus detected at block 906. At block 910, the system may scale, e.g., using the same scaling factor as described previously, the 3D representation of the stylus with respect to the nib of the stylus.
At block 912, the system may render virtual stylus 546 on the display in conjunction with the virtual hand. In various examples, virtual stylus 546 may be based on the scaled 3D representation of actual stylus 140. In some examples, blending engine 236 may blend the 3D representation of the user's hand with the 3D representation of stylus 140 to generate a single 3D representation, which is then used to render a virtual hand holding a virtual stylus or other tool.
User interface input devices 1022 may include input devices such as a keyboard, pointing devices such as a mouse, trackball, touch interaction surface 102 (which may take the form of a graphics tablet), a scanner, a touchscreen incorporated into the display, audio input devices such as voice recognition systems, microphones, 3D vision sensor 106, 2D camera 130, stylus 140, and/or other types of input devices. In general, use of the term “input device” is intended to include all possible types of devices and ways to input information into computer system 1010 or onto a communication network.
User interface output devices 1020 may include a display subsystem that includes display 110, a printer, a fax machine, or non-visual displays such as audio output devices. The display subsystem may include a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), a projection device, or some other mechanism for creating a visible image. The display subsystem may also provide non-visual display such as via audio output devices. In general, use of the term “output device” is intended to include all possible types of devices and ways to output information from computer system 1010 to the user or to another machine or computer system.
Storage subsystem 1026 stores programming and data constructs that provide the functionality of some or all of the modules described herein. For example, the storage subsystem 1026 may include the logic to perform selected aspects of methods 700-900.
These machine-readable instruction modules are generally executed by processor 1014 alone or in combination with other processors. Memory 1025 used in the storage subsystem 1026 can include a number of memories including a main random access memory (RAM) 1030 for storage of instructions and data during program execution and a read only memory (ROM) 1032 in which fixed instructions are stored. A file storage subsystem 1026 can provide persistent storage for program and data files, and may include a hard disk drive, a floppy disk drive along with associated removable media, a CD-ROM drive, an optical drive, or removable media cartridges. The modules implementing the functionality of certain examples may be stored by file storage subsystem 1026 in the storage subsystem 1026, or in other machines accessible by the processor(s) 1014.
Bus subsystem 1012 provides a mechanism for letting the various components and subsystems of computer system 1010 communicate with each other as intended. Although bus subsystem 1012 is shown schematically as a single bus, alternative examples of the bus subsystem may use multiple busses.
Computer system 1010 can be of varying types including a workstation, server, computing cluster, blade server, server farm, or any other data processing system or computing device. Due to the ever-changing nature of computers and networks, the description of computer system 1010 depicted in
Although described specifically throughout the entirety of the instant disclosure, representative examples of the present disclosure have utility over a wide range of applications, and the above discussion is not intended and should not be construed to be limiting, but is offered as an illustrative discussion of aspects of the disclosure.
What has been described and illustrated herein is an example of the disclosure along with some of its variations. The terms, descriptions and figures used herein are set forth by way of illustration only and are not meant as limitations. Many variations are possible within the scope of the disclosure, which is intended to be defined by the following claims—and their equivalents—in which all terms are meant in their broadest reasonable sense unless otherwise indicated.