The disclosure below relates generally to registering and using a hand-held non-electronic object as a controller to control the position, orientation, and game state of a displayed graphical element.
As understood herein, electronic video game controllers can be very complex and prevent children below a certain age from effectively using them to play a video game.
Present principles also recognize that video games, such as virtual reality (VR) video games, can be played by a wider array of users by detecting movement of non-electronic objects about the real world and using that movement as input to the video game to alter game state. Thus, a player can pick out one of their own toys, such as a plush doll, and use that toy as a game controller instead of an electronic gamepad. The player can thus move the toy itself to control the game's characters and other features. A depth-sensing camera may be used to detect the pre-registered object, determine the position and orientation of the object, and provide that data to the game. The game can thus receive object pose information continuously during gameplay and map the data to various control inputs.
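As a non-limiting illustration of the mapping just described, the sketch below translates an estimated object pose into abstract game inputs. It assumes the tracking pipeline already yields a pose in camera coordinates; the field names, thresholds, and the pose_to_game_input() mapping are purely illustrative rather than part of any particular implementation.

```python
from dataclasses import dataclass

@dataclass
class ObjectPose:
    """Pose of the tracked non-electronic object, estimated from depth-camera frames."""
    x: float      # meters, camera frame (left/right)
    y: float      # meters (up/down)
    z: float      # meters (toward/away from the camera)
    yaw: float    # degrees
    pitch: float  # degrees
    roll: float   # degrees

def pose_to_game_input(pose: ObjectPose, dead_zone: float = 0.05) -> dict:
    """Map the object's pose to abstract game inputs (hypothetical mapping).

    Lateral motion steers, motion toward/away from the camera sets throttle,
    and tilting the toy past a threshold triggers an 'action' input.
    """
    steer = 0.0 if abs(pose.x) < dead_zone else max(-1.0, min(1.0, pose.x / 0.3))
    throttle = 0.0 if abs(pose.z) < dead_zone else max(-1.0, min(1.0, -pose.z / 0.3))
    action = abs(pose.roll) > 45.0
    return {"steer": steer, "throttle": throttle, "action": action}

# Example: a toy held 10 cm to the right of center and rolled 60 degrees.
print(pose_to_game_input(ObjectPose(0.10, 0.0, 0.0, 0.0, 0.0, 60.0)))
```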
Further, note that any object can be registered using the camera and machine learning, so that kids and other people may use their own toys or other real-world objects in an intuitive way to play the VR or other type of video game. Thus, an object can be scanned and used for gameplay without that object communicating via wireless or wired analog or digital signals, providing a video game controller for kids and others who wish to use one.
Accordingly, in one aspect an apparatus includes at least one processor configured to receive input from a camera and, based on the input, identify position data related to a non-electronic object. The processor is also configured to control a graphical element of a video game based on the position data related to the non-electronic object.
In certain example embodiments, the at least one processor may also be configured to register three-dimensional (3D) features of the non-electronic object through a setup process prior to controlling the graphical element of the video game based on the position data. So, for example, the processor may be configured to execute the setup process, where the setup process includes prompting a user to position the non-electronic object in view of the camera, using images from the camera that show the non-electronic object to identify the 3D features, and storing the 3D features in storage accessible to the processor.
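One possible sketch of such a setup process is shown below. It assumes OpenCV is available and uses classical feature extraction for brevity (a machine-learning feature extractor could be substituted); the function name register_object() and the storage format are illustrative only.

```python
import cv2
import numpy as np

def register_object(frames: list, out_path: str = "object_features.npz") -> None:
    """Extract and persist feature descriptors for a non-electronic object.

    `frames` is a list of grayscale images (numpy arrays) showing the object
    from different angles, captured during the guided setup process.
    """
    orb = cv2.ORB_create(nfeatures=500)
    all_descriptors = []
    for frame in frames:
        keypoints, descriptors = orb.detectAndCompute(frame, None)
        if descriptors is not None:
            all_descriptors.append(descriptors)
    if not all_descriptors:
        raise ValueError("No features detected; reposition the object and retry.")
    # Persist the stacked descriptors so they can be matched against live frames later.
    np.savez(out_path, descriptors=np.vstack(all_descriptors))
```

During gameplay, the stored descriptors can then be matched against incoming camera frames to recognize the registered object and estimate its pose.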
Also in various example embodiments, the at least one processor may be configured to, based on the position data, control a location and/or orientation of the graphical element within a scene of the video game.
Still further, if desired the apparatus may include the camera, and in certain examples the camera may be a depth-sensing camera. The apparatus may also include a display accessible to the at least one processor, and the at least one processor may be configured to present the graphical element of the video game on the display according to the position data. Also if desired, the graphical element may include a 3D representation of the non-electronic object itself. For example, the 3D representation may be generated using data from the setup process where the non-electronic object is positioned in front of the camera to register 3D features of the non-electronic object.
Still further, in certain example implementations the processor may be configured to, based on the position data related to the non-electronic object, control the graphical element of the video game to hover over/overlay on and then select a selector that is presented as part of the video game.
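A minimal sketch of such hover-and-select behavior is shown below, assuming the object's tracked position has been projected into screen coordinates; the dwell time, screen mapping, and class names are assumptions for illustration.

```python
import time

def project_to_screen(x: float, y: float, screen_w: int = 1920, screen_h: int = 1080,
                      range_m: float = 0.5) -> tuple:
    """Map object position (meters, centered on the camera axis) to screen pixels."""
    px = int((x / range_m + 0.5) * screen_w)
    py = int((0.5 - y / range_m) * screen_h)
    return max(0, min(screen_w - 1, px)), max(0, min(screen_h - 1, py))

class DwellSelector:
    """Select a selector/button when the cursor hovers over it for `dwell_s` seconds."""
    def __init__(self, dwell_s: float = 1.0):
        self.dwell_s = dwell_s
        self._hover_start = None

    def update(self, cursor: tuple, button_rect: tuple) -> bool:
        """`button_rect` is (x, y, width, height) in pixels; returns True on selection."""
        bx, by, bw, bh = button_rect
        inside = bx <= cursor[0] <= bx + bw and by <= cursor[1] <= by + bh
        if not inside:
            self._hover_start = None
            return False
        if self._hover_start is None:
            self._hover_start = time.monotonic()
        return time.monotonic() - self._hover_start >= self.dwell_s
```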
In another aspect, a method includes receiving input from a camera and, based on the input, identifying position data related to a non-electronic object. The method also includes controlling a graphical element of a computer simulation based on the position data related to the non-electronic object.
In one example, the computer simulation may include a video game. Additionally or alternatively, the computer simulation may represent the non-electronic object as the graphical element on a spatial reality display.
Still further, if desired the method may include, prior to controlling the graphical element of the computer simulation based on the position data, registering three-dimensional (3D) features of the non-electronic object through a setup process. Then during the computer simulation, the method may include controlling a location and/or orientation of the graphical element within a scene of the computer simulation based on the position data. Also if desired, the method may include controlling the graphical element of the computer simulation to hover over and select a button that is presented as part of the computer simulation based on the position data related to the non-electronic object.
In still another aspect, a device includes at least one computer storage that is not a transitory signal. The computer storage includes instructions executable by at least one processor to receive, at a device, input from a camera. Based on the input, the instructions are executable to identify position data related to an object that is not communicating with the device via signals sent wirelessly or through a wired connection. Based on the position data related to the object, the instructions are executable to control a graphical element of a computer simulation.
In some example implementations, the instructions may also be executable to, prior to controlling the graphical element of the computer simulation based on the position data, register three-dimensional (3D) features of the object through a setup process so that the object can be represented in the computer simulation as the graphical element according to the 3D features.
Also, if desired the object may be a first object and the instructions may be executable to use input from the camera to determine that the first object contacts, in the real world, a second object. Here the instructions may then be executable to present audio as part of the computer simulation based on the determination, with the audio mimicking a real world sound of the first and second objects contacting each other according to an object type associated with the first object and/or the second object.
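One simple way to approximate the real-world contact determination described above is a bounding-sphere overlap test on the two tracked object positions, as in the illustrative sketch below (the radii and margin are assumed values, not prescribed ones).

```python
import math

def objects_in_contact(pos_a: tuple, pos_b: tuple, radius_a: float, radius_b: float,
                       margin: float = 0.01) -> bool:
    """Approximate contact test between two tracked real-world objects.

    Each object is approximated by a bounding sphere (center position in meters
    plus radius); contact is declared when the spheres overlap within a margin.
    """
    return math.dist(pos_a, pos_b) <= radius_a + radius_b + margin

# Example: two toys 5 cm apart, each roughly 8 cm in radius -> contact.
print(objects_in_contact((0.0, 0.0, 0.5), (0.05, 0.0, 0.5), 0.08, 0.08))
```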
The details of the present application, both as to its structure and operation, can best be understood in reference to the accompanying drawings, in which like reference numerals refer to like parts, and in which:
This disclosure relates generally to computer ecosystems including aspects of consumer electronics (CE) device networks such as but not limited to computer game networks. A system herein may include server and client components which may be connected over a network such that data may be exchanged between the client and server components. The client components may include one or more computing devices including game consoles such as Sony PlayStation® or a game console made by Microsoft or Nintendo or other manufacturer, extended reality (XR) headsets such as virtual reality (VR) headsets, augmented reality (AR) headsets, portable televisions (e.g., smart TVs, Internet-enabled TVs), portable computers such as laptops and tablet computers, and other mobile devices including smart phones and additional examples discussed below. These client devices may operate with a variety of operating environments. For example, some of the client computers may employ, as examples, Linux operating systems, operating systems from Microsoft, or a Unix operating system, or operating systems produced by Apple, Inc., or Google, or a Berkeley Software Distribution or Berkeley Standard Distribution (BSD) OS including descendants of BSD. These operating environments may be used to execute one or more browsing programs, such as a browser made by Microsoft or Google or Mozilla or other browser program that can access websites hosted by the Internet servers discussed below. Also, an operating environment according to present principles may be used to execute one or more computer game programs.
Servers and/or gateways may be used that may include one or more processors executing instructions that configure the servers to receive and transmit data over a network such as the Internet. Or a client and server can be connected over a local intranet or a virtual private network. A server or controller may be instantiated by a game console such as a Sony PlayStation®, a personal computer, etc.
Information may be exchanged over a network between the clients and servers. To this end and for security, servers and/or clients can include firewalls, load balancers, temporary storages, proxies, and other network infrastructure for reliability and security. One or more servers may form an apparatus that implements methods of providing a secure community such as an online social website or gamer network to network members.
A processor may be a single- or multi-chip processor that can execute logic by means of various lines such as address lines, data lines, and control lines, as well as registers and shift registers. A processor including a digital signal processor (DSP) may be an embodiment of circuitry.
Components included in one embodiment can be used in other embodiments in any appropriate combination. For example, any of the various components described herein and/or depicted in the Figures may be combined, interchanged, or excluded from other embodiments.
“A system having at least one of A, B, and C” (likewise “a system having at least one of A, B, or C” and “a system having at least one of A, B, C”) includes systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together.
Referring now to
Accordingly, to undertake such principles the AVD 12 can be established by some or all of the components shown. For example, the AVD 12 can include one or more touch-enabled displays 14 that may be implemented by a high definition or ultra-high definition “4K” or higher flat screen. The touch-enabled display(s) 14 may include, for example, a capacitive or resistive touch sensing layer with a grid of electrodes for touch sensing consistent with present principles.
The AVD 12 may also include one or more speakers 16 for outputting audio in accordance with present principles, and at least one additional input device 18 such as an audio receiver/microphone for entering audible commands to the AVD 12 to control the AVD 12. Other example input devices include gamepads or mice or keyboards.
The example AVD 12 may also include one or more network interfaces 20 for communication over at least one network 22 such as the Internet, a WAN, a LAN, etc. under control of one or more processors 24. Thus, the interface 20 may be, without limitation, a Wi-Fi transceiver, which is an example of a wireless computer network interface, such as but not limited to a mesh network transceiver. It is to be understood that the processor 24 controls the AVD 12 to undertake present principles, including the other elements of the AVD 12 described herein such as controlling the display 14 to present images thereon and receiving input therefrom. Furthermore, note the network interface 20 may be a wired or wireless modem or router, or other appropriate interface such as a wireless telephony transceiver, or Wi-Fi transceiver as mentioned above, etc.
In addition to the foregoing, the AVD 12 may also include one or more input and/or output ports 26 such as a high-definition multimedia interface (HDMI) port or a universal serial bus (USB) port to physically connect to another CE device and/or a headphone port to connect headphones to the AVD 12 for presentation of audio from the AVD 12 to a user through the headphones. For example, the input port 26 may be connected via wire or wirelessly to a cable or satellite source 26a of audio video content. Thus, the source 26a may be a separate or integrated set top box, or a satellite receiver. Or the source 26a may be a game console or disk player containing content. The source 26a when implemented as a game console may include some or all of the components described below in relation to the CE device 48.
The AVD 12 may further include one or more computer memories/computer-readable storage media 28, such as disk-based or solid-state storage, that are not transitory signals. The storage media 28 may in some cases be embodied in the chassis of the AVD as standalone devices, or as a personal video recording device (PVR) or video disk player either internal or external to the chassis of the AVD for playing back AV programs, or as removable memory media, or in the below-described server. Also, in some embodiments, the AVD 12 can include a position or location receiver such as but not limited to a cellphone receiver, GPS receiver, and/or altimeter 30 that is configured to receive geographic position information from a satellite or cellphone base station and provide the information to the processor 24 and/or determine an altitude at which the AVD 12 is disposed in conjunction with the processor 24.
Continuing the description of the AVD 12, in some embodiments the AVD 12 may include one or more cameras 32 that may be a thermal imaging camera, a digital camera such as a webcam, an IR sensor, an event-based sensor, and/or a camera integrated into the AVD 12 and controllable by the processor 24 to gather pictures/images and/or video in accordance with present principles. Also included on the AVD 12 may be a Bluetooth® transceiver 34 and other Near Field Communication (NFC) element 36 for communication with other devices using Bluetooth and/or NFC technology, respectively. An example NFC element can be a radio frequency identification (RFID) element.
Further still, the AVD 12 may include one or more auxiliary sensors 38 that provide input to the processor 24. For example, one or more of the auxiliary sensors 38 may include one or more pressure sensors forming a layer of the touch-enabled display 14 itself and may be, without limitation, piezoelectric pressure sensors, capacitive pressure sensors, piezoresistive strain gauges, optical pressure sensors, electromagnetic pressure sensors, etc. Other sensor examples include a pressure sensor, a motion sensor such as an accelerometer, gyroscope, cyclometer, or magnetic sensor, an infrared (IR) sensor, an optical sensor, a speed and/or cadence sensor, an event-based sensor, and a gesture sensor (e.g., for sensing gesture commands). The sensor 38 thus may be implemented by one or more motion sensors, such as individual accelerometers, gyroscopes, and magnetometers and/or an inertial measurement unit (IMU) that typically includes a combination of accelerometers, gyroscopes, and magnetometers to determine the location and orientation of the AVD 12 in three dimensions, or by an event-based sensor such as an event detection sensor (EDS). An EDS consistent with the present disclosure provides an output that indicates a change in light intensity sensed by at least one pixel of a light sensing array. For example, if the light sensed by a pixel is decreasing, the output of the EDS may be −1; if it is increasing, the output of the EDS may be +1. A change in light intensity below a certain threshold (including no change) may be indicated by an output of 0.
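The per-pixel EDS output just described can be summarized in the brief sketch below; the threshold value is illustrative.

```python
def eds_output(prev_intensity: float, new_intensity: float, threshold: float = 0.05) -> int:
    """Per-pixel event signal of an event detection sensor (EDS).

    Returns +1 when sensed light intensity increases beyond the threshold,
    -1 when it decreases beyond the threshold, and 0 otherwise.
    """
    delta = new_intensity - prev_intensity
    if delta > threshold:
        return 1
    if delta < -threshold:
        return -1
    return 0
```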
The AVD 12 may also include an over-the-air TV broadcast port 40 for receiving OTA TV broadcasts providing input to the processor 24. In addition to the foregoing, it is noted that the AVD 12 may also include an infrared (IR) transmitter and/or IR receiver and/or IR transceiver 42 such as an IR data association (IRDA) device. A battery (not shown) may be provided for powering the AVD 12, as may be a kinetic energy harvester that may turn kinetic energy into power to charge the battery and/or power the AVD 12. A graphics processing unit (GPU) 44 and field programmable gate array (FPGA) 46 also may be included. One or more haptics/vibration generators 47 may be provided for generating tactile signals that can be sensed by a person holding or in contact with the device. The haptics generators 47 may thus vibrate all or part of the AVD 12 using an electric motor connected to an off-center and/or off-balanced weight via the motor's rotatable shaft so that the shaft may rotate under control of the motor (which in turn may be controlled by a processor such as the processor 24) to create vibration of various frequencies and/or amplitudes as well as force simulations in various directions.
A light source such as a projector such as an infrared (IR) projector also may be included.
In addition to the AVD 12, the system 10 may include one or more other CE device types. In one example, a first CE device 48 may be a computer game console that can be used to send computer game audio and video to the AVD 12 via commands sent directly to the AVD 12 and/or through the below-described server while a second CE device 50 may include similar components as the first CE device 48. In the example shown, the second CE device 50 may be configured as a computer game controller manipulated by a player or a head-mounted display (HMD) worn by a player. The HMD may include a heads-up transparent or non-transparent display for respectively presenting AR/MR content or VR content (more generally, extended reality (XR) content). The HMD may be configured as a glasses-type display or as a bulkier VR-type display vended by computer game equipment manufacturers.
In the example shown, only two CE devices are shown, it being understood that fewer or more devices may be used. A device herein may implement some or all of the components shown for the AVD 12. Any of the components shown in the following figures may incorporate some or all of the components shown in the case of the AVD 12.
Now in reference to the aforementioned at least one server 52, it includes at least one server processor 54, at least one tangible computer readable storage medium 56 such as disk-based or solid-state storage, and at least one network interface 58 that, under control of the server processor 54, allows for communication with the other illustrated devices over the network 22, and indeed may facilitate communication between servers and client devices in accordance with present principles. Note that the network interface 58 may be, e.g., a wired or wireless modem or router, Wi-Fi transceiver, or other appropriate interface such as, e.g., a wireless telephony transceiver.
Accordingly, in some embodiments the server 52 may be an Internet server or an entire server “farm” and may include and perform “cloud” functions such that the devices of the system 10 may access a “cloud” environment via the server 52 in example embodiments for, e.g., network gaming applications. Or the server 52 may be implemented by one or more game consoles or other computers in the same room as the other devices shown or nearby.
The components shown in the following figures may include some or all of the components shown herein. Any user interfaces (UI) described herein may be consolidated and/or expanded, and UI elements may be mixed and matched between UIs.
Present principles may employ various machine learning models, including deep learning models. Machine learning models consistent with present principles may use various algorithms trained in ways that include supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, feature learning, self-learning, and other forms of learning. Examples of such algorithms, which can be implemented by computer circuitry, include one or more neural networks, such as a convolutional neural network (CNN), a recurrent neural network (RNN), and a type of RNN known as a long short-term memory (LSTM) network. Support vector machines (SVM) and Bayesian networks also may be considered to be examples of machine learning models. In addition to the types of networks set forth above, models herein may be implemented by classifiers.
As understood herein, performing machine learning may therefore involve accessing and then training a model on training data to enable the model to process further data to make inferences. An artificial neural network/artificial intelligence model trained through machine learning may thus include an input layer, an output layer, and multiple hidden layers in between that are configured and weighted to make inferences about an appropriate output.
Referring now to
A non-electronic object 206 in the form of a stuffed animal is also shown. An end-user may thus hold the object 206 within view of the camera 204 during a setup process for the computer 202 to register the object 206, including its colors, shapes, 3D feature points, etc. This data about the object 206 may then be used to generate a 3D model representing the object 206 for incorporation of the 3D model into a scene of the video game consistent with present principles and also to control the video game itself consistent with present principles.
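As one non-limiting illustration of how depth-camera data may contribute to such a 3D model, the sketch below back-projects a depth image into a point cloud using assumed camera intrinsics; aligned point clouds from several views could then be accumulated into a rough model of the object.

```python
import numpy as np

def depth_to_points(depth_m: np.ndarray, fx: float, fy: float,
                    cx: float, cy: float) -> np.ndarray:
    """Back-project a depth image (in meters) into a 3D point cloud in the camera frame.

    fx, fy, cx, cy are the depth camera's intrinsic parameters. Pixels with no
    valid depth reading (zero) are dropped.
    """
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]
```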
Now in reference to
Turning to
Before moving on to other figures, note that what is shown in
Now in reference to
Then, while the selected game instance is loading, the computer may present visual aids to demonstrate, using the element 1014, different actions the user may take with the ducky 1010 to provide different types of game inputs to the game. For example, aids similar to those described above in reference to
Now in reference to
Assuming the first letter “m” has already been gathered,
Accordingly, in response to determining that the two real-world objects have contacted each other in the real world, the computer may show the representations 1014 and 1402 making contact in the same way as the corresponding physical objects themselves, according to real-world location, orientation, speed of approach, etc. Also in response to determining that the two physical objects have contacted each other, the computer may present audio as part of the video game so that the audio is timed for real-time playout at the same moment the corresponding elements 1014 and 1402 are shown on screen as contacting each other. The audio may mimic a real-world sound of the objects 1010, 206 contacting each other according to object types respectively associated with each object.
For example, upon recognizing each object using object recognition, the computer may access a relational database indicating respective object types for respective objects to identify an object type for the recognized object through the relational database. Additionally or alternatively, the object recognition result itself may sometimes indicate object type, such as “rubber” for the ducky 1010 or “fabric” for the animal 206. The computer may then access a database of audio files to locate a particular audio file tagged with metadata indicating that it pertains to a sound of objects of the rubber and fabric types contacting each other. The computer may then either present the corresponding audio from the file as-is, or may even alter the audio using audio processing software to even better match the actual type of contact that was identified (e.g., increase the volume based on a smash of the objects 1010, 206 together, or draw the audio out over a longer period of presentation time based on the objects 1010, 206 being rubbed together for the same amount of real world time as the presentation time). Further note that in examples where an artificial intelligence-based audio generation model might be used, the sounds of the two objects 1010, 206 contacting each other may be dynamically generated by the model as already trained to render conforming sound outputs based on two different materials contacting each other (with the two different material/object types being used as the input to the model along with the type of contact that was detected).
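The object-type-to-audio lookup described above might be sketched as follows, with a hypothetical table keyed by an unordered pair of object/material types; the file paths and type labels are illustrative only.

```python
from typing import Optional

# Hypothetical lookup table keyed by an unordered pair of object/material types.
CONTACT_SOUNDS = {
    frozenset(["rubber", "fabric"]): "sounds/rubber_fabric_tap.wav",
    frozenset(["rubber"]): "sounds/rubber_squeak.wav",   # rubber-on-rubber contact
    frozenset(["fabric"]): "sounds/soft_thud.wav",       # fabric-on-fabric contact
}

def contact_sound(type_a: str, type_b: str) -> Optional[str]:
    """Return the audio file associated with two object types contacting each other."""
    return CONTACT_SOUNDS.get(frozenset([type_a, type_b]))

print(contact_sound("fabric", "rubber"))  # sounds/rubber_fabric_tap.wav
```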
Continuing the detailed description in reference to
Note that the computer may superimpose a virtual pose box 1604 over the ducky 1600 per the video feed 1006. Also note that the representation 1602 may be a 3D representation of the ducky 1600 as taken from a 3D model of the ducky 1600, where the 3D model may be generated during a registration/training process as described above.
Thus, the user 1008 may move the physical objects 1010, 206, and 1600 to command the representations 1014, 1402, and 1602 to move correspondingly within the game scene 1610. In the present example, this entails controlling the representations to collide with the letter “P” 1620 as the representations approach it within the scene 1610. Also note before moving on that the present example might represent a multi-player game instance if, e.g., some of the objects 1010, 206, 1600 are controlled by different end-users within the same area or even by remotely-located users (each with their own depth-sensing camera).
In any case, per
Additionally, in some examples the representation 1905 may be animated to change viewing perspective based on real-world angle of view of the user themselves, giving a spatial reality effect to the representation 1905. For example, the user 1900 may leave the animal 206 stationary on a table and then walk up to the display 1906 to inspect the representation 1905 from different angles of view. Accordingly, note that to control the spatial reality display 1906, the camera 1904 may also be used to image the user's eyes so that the computer 1902 can perform eye tracking and head position tracking to change the virtual perspective of the representation 1905 according to the user's angle of view with respect to the display 1906 itself. Thus, the representation 1905 may change its presented orientation to mimic the user's actual viewing perspective toward the representation 1905 as if the representation 1905 existed in the real world and was stationary within the box mimicked via the display 1906 so that the user could simply move around the box in real-world 3D space to inspect different angles and aspects of the representation 1905 just as if inspecting the animal 206 itself from different angles. In the present example, the display 1906 may therefore be thought of as a digital toy box when representing the animal 206.
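A minimal sketch of the angle-of-view computation that could drive such a spatial reality effect is shown below; it assumes the user's head position is reported in meters relative to the display center, and the function name is illustrative.

```python
import math

def view_angles(head_pos: tuple, display_center: tuple = (0.0, 0.0, 0.0)) -> tuple:
    """Compute horizontal and vertical viewing angles (degrees) of the user's head
    relative to the display center, which can be used to re-render the representation
    so it appears fixed in space behind the screen."""
    dx = head_pos[0] - display_center[0]
    dy = head_pos[1] - display_center[1]
    dz = head_pos[2] - display_center[2]
    yaw = math.degrees(math.atan2(dx, dz))    # left/right angle of view
    pitch = math.degrees(math.atan2(dy, dz))  # up/down angle of view
    return yaw, pitch

# A user 1 m in front of the display and 0.3 m to the right views it from roughly 17 degrees.
print(view_angles((0.3, 0.0, 1.0)))
```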
Now in reference to
From block 2104 the logic may then proceed to block 2106. At block 2106 the computer may synchronize/control a graphical element of the simulation based on the position data related to the non-electronic object. For example, at block 2106 the computer may control the location/orientation of the graphical element to move or select a button within a scene of the video game. Thereafter the logic may proceed to block 2108 where the computer may in some examples also present audio responsive to non-electronic objects being identified as contacting each other as described above. Thus, the audio may mimic the real-world sound of the corresponding real-world objects colliding according to object type as described above.
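The flow just described may be summarized by the skeleton below. The tracker, game, and audio objects are placeholders for whatever pose-estimation, rendering, and audio back ends are actually used, so the method names are assumptions rather than a prescribed API.

```python
def run_simulation_loop(tracker, game, audio, frames):
    """Per-frame skeleton: identify the object's pose, control the graphical element
    (block 2106), and present contact audio when objects touch (block 2108)."""
    for frame in frames:
        pose = tracker.estimate_pose(frame)       # position data for the non-electronic object
        if pose is None:
            continue                              # object not visible in this frame
        game.update_element(pose)                 # synchronize/control the graphical element
        contact = tracker.detect_contact(frame)   # e.g., ("rubber", "fabric") or None
        if contact is not None:
            audio.play_contact(*contact)          # audio mimicking the real-world sound
```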
As shown in
The GUI 2200 may also include a prompt 2206 for the user to select from already-registered objects to use one of those objects in an ensuing computer simulation as described herein. Thus, option 2208 may be selected to select the rubber ducky 1010 from above, with a thumbnail image 2210 of the ducky 1010 also being presented. Option 2212 may be selected to select the stuffed animal 206 from above, with a thumbnail image 2214 of the animal 206 also being presented.
Before concluding, note that object recognition may be executed in some instances to identify the type of real-world non-electronic object being held to then identify corresponding game movements to implement. For example, if a rubber ducky were being held according to the examples above, the computer may recognize as much and then enable the corresponding graphical element to have flying capability with animated wings that change motion based on real-world object pose (even if the real-world ducky itself does not have moveable wings). As another example, a real-world soldier figurine might be recognized to enable the corresponding graphical element to have walking and running capability with animated legs that change pace and crouch based on real-world object pose. So animations in the computer simulation can be triggered by not only the pose of the real-world object itself but also the type of real-world object.
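As an illustrative sketch of such type-dependent behavior, a lookup table might map recognized object labels to the capabilities and animations enabled for the corresponding graphical element (the labels and capability names below are hypothetical).

```python
CAPABILITIES = {
    "rubber duck": {"fly": True, "walk": False, "animation": "flapping_wings"},
    "soldier figurine": {"fly": False, "walk": True, "animation": "walk_run_crouch"},
}

def capabilities_for(object_label: str) -> dict:
    """Return the capability set for a recognized object type, with a safe default."""
    return CAPABILITIES.get(object_label, {"fly": False, "walk": False, "animation": "idle"})

print(capabilities_for("rubber duck"))
```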
Also note that video games and other computer simulations that may be used consistent with present principles are not limited to the examples above. For example, virtual reality and augmented reality video games and other types of simulations may also employ present principles. Also note that the graphical element controlled via the real-world non-electronic object need not necessarily be a representation of the non-electronic object itself. Instead, it might be a preexisting/pre-canned video game character that is nonetheless moveable via the non-electronic object, for example.
While the particular embodiments are herein shown and described in detail, it is to be understood that the subject matter which is encompassed by the present invention is limited only by the claims.