Artificial Intelligence (AI) is a broad term used to describe computer systems that improve with the processing of more data, giving them the appearance of having human-like intelligence. More specific industry terms are Machine Learning and a subset of machine learning called Deep Learning, which uses Deep Neural Networks (DNNs). Currently, these systems are trained on physical data and are deployed and tested in real environments. For example, developing a retail store AI system that understands product stock availability, proper product merchandising, and shopper behavior requires physical retail store mockups or test stores, actors or others performing the shopping tasks, and a very large number of product types, product shelf positions, and stock in/out configurations, plus the physical cameras, shelf sensors, and other sensors that comprise the AI system. Providing the data variability needed to train the AI system requires that months or years of camera or sensor data be collected while the product types, stock levels, and shelf positions are randomly varied. The test data must represent many years of store operation in order to create an AI system that understands situations and actions it has not been exposed to before.
The present disclosure will be more readily understood from a detailed description of some example embodiments taken in conjunction with the following figures:
Various non-limiting embodiments of the present disclosure will now be described to provide an overall understanding of the principles of the structure, function, and use of AI development environments as disclosed herein. One or more examples of these non-limiting embodiments are illustrated in the accompanying drawings. Those of ordinary skill in the art will understand that systems and methods specifically described herein and illustrated in the accompanying drawings are non-limiting embodiments. The features illustrated or described in connection with one non-limiting embodiment may be combined with the features of other non-limiting embodiments. Such modifications and variations are intended to be included within the scope of the present disclosure.
Reference throughout the specification to “various embodiments,” “some embodiments,” “one embodiment,” “some example embodiments,” “one example embodiment,” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with any embodiment is included in at least one embodiment. Thus, appearances of the phrases “in various embodiments,” “in some embodiments,” “in one embodiment,” “some example embodiments,” “one example embodiment,” or “in an embodiment” in places throughout the specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
Throughout this disclosure, references to components or modules generally refer to items that logically can be grouped together to perform a function or group of related functions. Like reference numerals are generally intended to refer to the same or similar components. Components and modules can be implemented in software, hardware, or a combination of software and hardware. The term software is used expansively to include not only executable code, but also data structures, data stores, and computing instructions in any electronic format, firmware, and embedded software. The terms information and data are used expansively and can include a wide variety of electronic information, including but not limited to machine-executable or machine-interpretable instructions; content such as text, video data, and audio data, among others; and various codes or flags. The terms information, data, and content are sometimes used interchangeably when permitted by context.
The examples discussed herein are examples only and are provided to assist in the explanation of the systems and methods described herein. None of the features or components shown in the drawings or discussed below should be taken as mandatory for any specific implementation of any of these systems and methods unless specifically designated as mandatory. For ease of reading and clarity, certain components, modules, or methods may be described solely in connection with a specific figure. Any failure to specifically describe a combination or sub-combination of components should not be understood as an indication that any combination or sub-combination is not possible. Also, for any methods described, regardless of whether the method is described in conjunction with a flow diagram, it should be understood that unless otherwise specified or required by context, any explicit or implicit ordering of steps performed in the execution of a method does not imply that those steps must be performed in the order presented but instead may be performed in a different order or in parallel.
AI systems need to be trained and tested before deployment. As provided above, physical data and real environments are conventionally used in the development of AI systems. The use of purely physical data and physical test environments in the development of AI systems presents many limitations. For example, the physical data needed to train the AI system must already exist or be created. Although in some cases public datasets of image-based data may exist, this data is not typically tailored to the specific use case. For example, autonomous automobiles (also known as “self-driving cars”) require training data from billions of miles of driver experiences, and this training data is currently being created by competing companies at great expense of time and money.
Once sufficient data is obtained, it has to be manually annotated. This process of labeling the data informs the AI model what each image or group of images contains, such as cars, pedestrians, bicyclists, roads, buildings, landscaping, or traffic signage, as the case may be. Human labor is typically used to manually draw bounding boxes around each pertinent object in the scene and associate the appropriate label with the rectangular region. This process is inherently slow, costly, and prone to errors and inaccuracies, as training datasets often contain hundreds of thousands or even millions of images.
With labeled training data the AI system can be trained and validated. The validation process consists of testing the model with data it has not seen before. Often this data is a subset of the training dataset but not used in training. If the validation process does not meet the required system accuracy specification, the AI model can be “tuned” and/or more training data can be utilized (with the associated time and cost to gather and label the additional data). Once the AI model passes the validation stage it is deployed into the test environment. The test environment could be a mock retail store, public roadways, or the homes of test volunteers, among others.
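The validation step described above, in which the model is tested on labeled data withheld from training and compared against an accuracy specification, can be sketched as follows. This is a minimal illustration only: the function names (`split_holdout`, `accuracy`, `passes_validation`), the 25% holdout fraction, the 0.95 accuracy threshold, and the toy stock-level data are all assumptions made for the example and do not come from the disclosure.

```python
import random


def split_holdout(samples, holdout_fraction=0.2, seed=0):
    """Shuffle labeled samples and reserve a fraction that is never used for training."""
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    # First n_holdout shuffled samples become the validation set; the rest train the model.
    return shuffled[n_holdout:], shuffled[:n_holdout]


def accuracy(predict, labeled_samples):
    """Fraction of samples whose predicted label matches the annotated label."""
    correct = sum(1 for features, label in labeled_samples if predict(features) == label)
    return correct / len(labeled_samples)


def passes_validation(predict, validation_set, required_accuracy):
    """True when held-out accuracy meets the (hypothetical) system specification."""
    return accuracy(predict, validation_set) >= required_accuracy


# Toy labeled data (assumed): units on a shelf mapped to a stock label.
samples = [(units, "in_stock" if units >= 5 else "out_of_stock") for units in range(20)]
train_set, validation_set = split_holdout(samples, holdout_fraction=0.25)


def toy_model(units_on_shelf):
    """Stand-in for a trained model; a real system would be learned from train_set."""
    return "in_stock" if units_on_shelf >= 5 else "out_of_stock"


ok = passes_validation(toy_model, validation_set, required_accuracy=0.95)
```

If `passes_validation` returns false, the options named in the text apply: tune the model or gather and label more training data, then repeat the check.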
There are many inefficiencies in the conventional AI system development method. Any changes to the project goals or specifications can require repetition of the entire process, and physical environments, products, and objects need to be constructed. By way of example, testing a retail AI system in a grocery store juice section instead of the cereal aisle requires that the physical mockup store be reconfigured, new products brought in, and the entire test process repeated. Testing with a wide range of shopper types often requires hiring human actors of different sizes, shapes, ethnicities, ages, shopping behaviors, etc. Moreover, camera-based AI systems are sensitive to lighting, camera positioning, lens parameters, and other factors that are difficult to create and vary physically. Data variability is essential for training AI models, but creating that variability with physical systems is extremely time-consuming and costly, and it forces compromises that could lead to system failure when the AI system is deployed outside of the specific physical development environment.
As described in more detail below, virtual AI development processes are presented where the AI model training data, system validation, system deployment, and system testing can be performed within a real-time three-dimensional (3D) virtual environment incorporating objects, camera systems, sensors, and human-driven avatars. Generally, a virtual 3D spatial environment in accordance with the present disclosure can be networked with external computer resources to simulate the end-use environment of the AI system. This environment can include various sub-systems that feed data into the AI system, such as, but not limited to, force, weight, capacitance, temperature, position, and motion sensors, LiDAR, infrared and depth-sensing 3D mapping systems, and video and still camera output. This data can be captured and utilized to train and validate the AI system, which can then itself be deployed into the same real-time virtual environment. Finally, real-time motion capture techniques and human actors can be used to drive humanoid avatars within the virtual environment, thus simulating all aspects of the physical space, such as spatial accuracy and content, human behavior, sensor and camera output, and AI system response.
Referring now to
Thus, the retail environment depicted in the virtual environment 102 depicted in
The virtual environment 102 can incorporate a camera system 112, such as an RGB video camera system and/or other suitable camera system, and a humanoid avatar 114. Additionally or alternatively, the virtual environment 102 can include a virtual sensor system, which can model the operation of various sensors from the corresponding real world physical environment. Example sensors in a sensor system can include, without limitation, weight sensors, optical sensors, capacitance sensors, proximity sensors, temperature sensors, and so forth. As is to be appreciated, the number and type of virtual sensors incorporated into any virtual environment 102 can depend on the particular real-world physical environment that is being modeled. By way of example, a sensor system associated with a retail environment may be different from a sensor system associated with a medical environment or a manufacturing environment. As such, the virtual environments associated with each of the different real-world environments can model the operation of different types of sensor networks.
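One way a virtual sensor of the kind described above might model its real-world counterpart is sketched below for a shelf weight sensor. The class name `VirtualWeightSensor`, its methods, the Gaussian noise model, and the reading format are hypothetical choices made for illustration; the disclosure does not specify how any particular sensor is modeled.

```python
import random
from dataclasses import dataclass, field


@dataclass
class VirtualWeightSensor:
    """Hypothetical virtual shelf weight sensor (illustrative, not from the disclosure)."""

    sensor_id: str
    noise_std_grams: float = 0.0  # assumed Gaussian measurement noise; 0 disables it
    seed: int = 0
    _items: dict = field(default_factory=dict)

    def __post_init__(self):
        # Seeded generator so simulated readings are reproducible between runs.
        self._rng = random.Random(self.seed)

    def place(self, item_id, grams):
        """Record a virtual object resting on the sensed shelf."""
        self._items[item_id] = grams

    def remove(self, item_id):
        """Remove a virtual object, e.g. when an avatar picks it up."""
        return self._items.pop(item_id)

    def read(self):
        """Return one reading that could be streamed to an AI processing system."""
        true_weight = sum(self._items.values())
        noise = self._rng.gauss(0.0, self.noise_std_grams) if self.noise_std_grams > 0 else 0.0
        return {"sensor_id": self.sensor_id, "grams": true_weight + noise}


shelf = VirtualWeightSensor(sensor_id="shelf-1")
shelf.place("cereal", 500.0)
shelf.place("juice", 1000.0)
before = shelf.read()
shelf.remove("juice")
after = shelf.read()
```

A medical or manufacturing environment would swap in different sensor models behind the same kind of `read` interface, which is consistent with the point that the sensor network depends on the environment being modeled.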
In the illustrated embodiment, the humanoid avatar 114 is a shopper within the retail environment. The virtual environment 102 of the illustrated embodiment also includes virtual product display 116 and a human actor 118. A real-time motion capture system 120 can be used to drive the motion of the humanoid avatar 114. The human actor 118 can be physically positioned within a studio 160. The studio 160 can be any suitable venue or location with equipment to present the human actor 118 with a virtual reality experience. Actions of the humanoid avatar 114 and positions of other objects in the virtual environment 102 can be recorded by the camera system 112, and the data stream can be fed into and processed by an AI processing computer system 124. Additionally, as the humanoid avatar 114 moves within the virtual environment 102 and interacts with various virtual objects, such as the virtual product display 116, various virtual sensors within the virtual environment 102 can stream information for the AI processing computer system 124 to process.
The humanoid avatar 114 can be controlled in real-time by the human actor 118. A virtual reality (VR) device 150, such as a VR headset or other suitable VR system, can enable the human actor 118 to visualize and experience the virtual environment 102 through a virtual reality interface of the VR device 150. In some embodiments, the VR device 150 can also include one or more hand controls, as shown in
The physical motions of the human actor 118 can be captured by the real-time motion capture system 120 in the studio 160, converted into data to drive the humanoid avatar 114, and transmitted to the virtual environment computer system 122. In some embodiments, the human actor 118 can wear active trackers 172 to aid in the tracking of the human actor's movements. While the active trackers 172 are schematically shown as elbow and ankle cuffs in
During a testing session, the human actor 118 in the studio 160 can interact with virtual objects in the virtual environment 102. In the case of a retail virtual environment 102, the human actor 118 can interact with, for example, retail products. In this fashion, through movements of the human actor 118 in the studio 160, the humanoid avatar 114 in the virtual environment 102 can, for example, select products from the product display 116 and put them in a shopping cart (not shown). It can be determined whether the AI processing computer system 124 correctly tracked the selected product through the shopping event. Such feedback regarding the successful or unsuccessful tracking of the selected product, as well as other aspects of the shopping event, can be used to further train the AI processing computer system 124. Thus, important performance metrics can be identified through the virtual AI development environment 100, and calibrations to the AI system can be implemented before deployment of the AI system to the real-world physical environment. Moreover, the presently disclosed embodiments can provide the data variability in the virtual environment 102 that is required to train the AI system. By way of example, for a retail environment, the product display 116 can be varied, the product types can be varied, the stock levels can be varied, and the lighting levels can be varied, among a wide variety of other variables.
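The data variability described above (varying product types, shelf positions, stock levels, and lighting) can be sketched as random sampling of scene configurations. The `SceneConfig` fields, the parameter ranges, and the product names below are assumptions chosen for illustration; an actual virtual environment would expose its own set of variables.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneConfig:
    """Hypothetical set of scene variables for one simulated retail display."""

    product_type: str
    shelf_position: int
    stock_level: int
    lighting_lux: float


def sample_scene(rng, product_types, num_shelf_positions, max_stock, lux_range):
    """Draw one random scene configuration from the allowed ranges."""
    return SceneConfig(
        product_type=rng.choice(product_types),
        shelf_position=rng.randrange(num_shelf_positions),
        stock_level=rng.randint(0, max_stock),  # 0 models an out-of-stock shelf
        lighting_lux=rng.uniform(*lux_range),
    )


def generate_scenes(count, seed=0, **scene_ranges):
    """Generate a reproducible list of varied scene configurations."""
    rng = random.Random(seed)
    return [sample_scene(rng, **scene_ranges) for _ in range(count)]


# Assumed example ranges, not values from the disclosure.
ranges = dict(
    product_types=["cereal", "juice", "soup"],
    num_shelf_positions=6,
    max_stock=12,
    lux_range=(100.0, 1000.0),
)
scenes = generate_scenes(50, seed=42, **ranges)
```

Because the generator is seeded, the same set of scenes can be regenerated later, which is useful when a tracking failure needs to be reproduced after the AI system is recalibrated.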
An alternative embodiment of a virtual AI development environment 200 is illustrated in
In the example embodiment shown in
An alternative embodiment of a virtual AI development environment 300 is illustrated in
The physical training object 330 can either be a mock-up of the real-world physical object or the real-world physical object itself. With regard to using a mock-up as a physical training object 330, a relatively quickly produced physical training object 330 can beneficially be used, such as one made out of wood or Styrofoam, 3D printed, or produced by another method. The surgical device (or other type of device) for presentment to the human actor 318 through the VR device 350 can be modeled to the specifications of the actual surgical device. When the human actor 318 physically actuates the physical actuator 334 on the mock-up device, the human actor 318 will view an actuation 338 of the virtual object 332. Thus, as the human actor 318 physically handles the physical training object 330 in the studio 360, such manipulation can be tracked by the motion capture system 320 and translated into the humanoid avatar 314 virtually handling the virtual object 332 within the virtual environment 302.
In some embodiments, to aid in motion capture by the motion capture system 320, a plurality of markers 340 can be worn by the human actor 318 in the studio 360. The markers 340 can be passive markers or active trackers. Additionally or alternatively, the human actor 318 can wear motion capture gloves 344. Such motion capture gloves 344 can assist with, for example, the tracking of individual digits of the human actor 318. Moreover, the motion capture system 320 can be an optical (i.e., camera-based) and/or a non-optical motion capture system. In any event, the motion capture system 320 can be used to track various movements and gestures of the human actor 318, including the appendages of the human actor 318. In some embodiments, individual digits of the human actor 318 can also be tracked.
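The mapping from a tracked physical training object to its virtual counterpart can be sketched as a per-sample update: the tracked position is carried into the virtual environment, and a press of the physical actuator triggers an actuation of the virtual object. The types `TrackedObjectSample` and `VirtualObject`, the pure-translation mapping between studio and virtual coordinates, and the press-edge counting are all simplifying assumptions for illustration; a real system would likely use a full pose (position and orientation) and its own event model.

```python
from dataclasses import dataclass


@dataclass
class TrackedObjectSample:
    """One hypothetical motion-capture sample for the physical training object."""

    position: tuple          # (x, y, z) in studio coordinates, meters
    actuator_pressed: bool   # state of the physical actuator on the mock-up


@dataclass
class VirtualObject:
    """Hypothetical virtual counterpart shown to the actor through the VR device."""

    position: tuple = (0.0, 0.0, 0.0)
    actuated: bool = False
    actuation_count: int = 0


def apply_sample(virtual_obj, sample, studio_to_virtual_offset=(0.0, 0.0, 0.0)):
    """Update the virtual object from one tracked sample of the physical object."""
    # Assumed mapping: a fixed translation from studio space to virtual space.
    virtual_obj.position = tuple(
        p + o for p, o in zip(sample.position, studio_to_virtual_offset)
    )
    # Count an actuation only on the transition from released to pressed.
    if sample.actuator_pressed and not virtual_obj.actuated:
        virtual_obj.actuation_count += 1
    virtual_obj.actuated = sample.actuator_pressed
    return virtual_obj


tool = VirtualObject()
stream = [
    TrackedObjectSample((1.0, 2.0, 3.0), False),
    TrackedObjectSample((1.0, 2.0, 3.0), True),
    TrackedObjectSample((1.0, 2.0, 3.0), True),   # still held: no new actuation
    TrackedObjectSample((1.0, 2.0, 3.0), False),
    TrackedObjectSample((1.5, 2.0, 3.0), True),
]
for sample in stream:
    apply_sample(tool, sample, studio_to_virtual_offset=(10.0, 0.0, 0.0))
```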
While
Similar to
In this example embodiment, the virtual environment 402 is a surgical environment and the virtual object 432 is a surgical tool. Furthermore, each virtual object 432 can be a different surgical tool, as shown, although this disclosure is not so limited. The virtual environment 402 can include a virtual patient 442 and other objects or devices found in a surgical environment, for example. Furthermore, similar to previous embodiments, the physical training objects 430 can either be a mock-up of the real-world physical object or the real-world physical object itself.
Through the VR interface provided to the first human actor 418A, at least a portion of a first humanoid avatar 414A can be presented that can replicate the real-time physical motion of the first human actor 418A. For example, the first human actor 418A can see their extended humanoid arms and legs, and movement thereof, through the VR interface of their VR device 450. In addition, the first human actor 418A can be presented with the second humanoid avatar 414B that is replicating the real-time physical motion of the second human actor 418B. Thus, while in the studio 460, both human actors 418A-B can simultaneously participate in the same virtual environment 402, while interacting with objects therein and observing each other's actions.
While
In general, it will be apparent to one of ordinary skill in the art that at least some of the embodiments described herein can be implemented in many different embodiments of software, firmware, and/or hardware. The software and firmware code can be executed by a processor or any other similar computing device. The software code or specialized control hardware that can be used to implement embodiments is not limiting. For example, embodiments described herein can be implemented in computer software using any suitable computer software language type, using, for example, conventional or object-oriented techniques. Such software can be stored on any type of suitable computer-readable medium or media, such as, for example, a magnetic or optical storage medium. The operation and behavior of the embodiments can be described without specific reference to specific software code or specialized hardware components. The absence of such specific references is feasible, because it is clearly understood that artisans of ordinary skill would be able to design software and control hardware to implement the embodiments based on the present description with no more than reasonable effort and without undue experimentation.
Moreover, the processes described herein can be executed by programmable equipment, such as computers or computer systems and/or processors. Software that can cause programmable equipment to execute processes can be stored in any storage device, such as, for example, a computer system (nonvolatile) memory, an optical disk, magnetic tape, or magnetic disk. Furthermore, at least some of the processes can be programmed when the computer system is manufactured or stored on various types of computer-readable media.
It can also be appreciated that certain portions of the processes described herein can be performed using instructions stored on a computer-readable medium or media that direct a computer system to perform the process steps. A computer-readable medium can include, for example, memory devices such as diskettes, compact discs (CDs), digital versatile discs (DVDs), optical disk drives, or hard disk drives. A computer-readable medium can also include memory storage that is physical, virtual, permanent, temporary, semi-permanent, and/or semi-temporary.
A “computer,” “computer system,” “host,” “server,” or “processor” can be, for example and without limitation, a processor, microcomputer, minicomputer, server, mainframe, laptop, personal data assistant (PDA), wireless e-mail device, cellular phone, pager, fax machine, scanner, or any other programmable device configured to transmit and/or receive data over a network. Computer systems and computer-based devices disclosed herein can include memory for storing certain software modules used in obtaining, processing, and communicating information. It can be appreciated that such memory can be internal or external with respect to operation of the disclosed embodiments.
In various embodiments disclosed herein, a single component can be replaced by multiple components and multiple components can be replaced by a single component to perform a given function or functions. Except where such substitution would not be operative, such substitution is within the intended scope of the embodiments. The computer systems can comprise one or more processors in communication with memory (e.g., RAM or ROM) via one or more data buses. The data buses can carry electrical signals between the processor(s) and the memory. The processor and the memory can comprise electrical circuits that conduct electrical current. Charge states of various components of the circuits, such as solid state transistors of the processor(s) and/or memory circuit(s), can change during operation of the circuits.
Some of the figures can include a flow diagram. Although such figures can include a particular logic flow, it can be appreciated that the logic flow merely provides an exemplary implementation of the general functionality. Further, the logic flow does not necessarily have to be executed in the order presented unless otherwise indicated. In addition, the logic flow can be implemented by a hardware element, a software element executed by a computer, a firmware element embedded in hardware, or any combination thereof.
The foregoing description of embodiments and examples has been presented for purposes of illustration and description. It is not intended to be exhaustive or limiting to the forms described. Numerous modifications are possible in light of the above teachings. Some of those modifications have been discussed, and others will be understood by those skilled in the art. The embodiments were chosen and described in order to best illustrate principles of various embodiments as are suited to particular uses contemplated. The scope is, of course, not limited to the examples set forth herein, and the principles described can be employed in any number of applications and equivalent devices by those of ordinary skill in the art. Rather, it is hereby intended that the scope of the invention be defined by the claims appended hereto.
This application claims the benefit of U.S. provisional patent application Ser. No. 62/870,326, filed on Jul. 3, 2019, entitled A VIRTUAL AI DEVELOPMENT ENVIRONMENT TO TRAIN, DEPLOY AND TEST ARTIFICIAL INTELLIGENCE, MACHINE LEARNING, AND DEEP LEARNING SYSTEMS, the disclosure of which is incorporated herein by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
20210004076 A1 | Jan 2021 | US |
Number | Date | Country | |
---|---|---|---|
62870326 | Jul 2019 | US |