This application claims priority under 35 U.S.C. § 119 or 365 to European Application No. 20306099.1, filed Sep. 25, 2020. The entire contents of the above application(s) are incorporated herein by reference.
The disclosure relates to the field of computer programs and systems, and more specifically to methods, devices and programs pertaining to a deep-learning generative model that outputs 3D modeled objects each representing a mechanical part or an assembly of mechanical parts.
A number of systems and programs are offered on the market for the design, the engineering and the manufacturing of objects. CAD is an acronym for Computer-Aided Design, e.g., it relates to software solutions for designing an object. CAE is an acronym for Computer-Aided Engineering, e.g., it relates to software solutions for simulating the physical behavior of a future product. CAM is an acronym for Computer-Aided Manufacturing, e.g., it relates to software solutions for defining manufacturing processes and operations. In such computer-aided design systems, the graphical user interface plays an important role as regards the efficiency of the technique. These techniques may be embedded within Product Lifecycle Management (PLM) systems. PLM refers to a business strategy that helps companies to share product data, apply common processes, and leverage corporate knowledge for the development of products from conception to the end of their life, across the concept of the extended enterprise. The PLM solutions provided by Dassault Systèmes (under the trademarks CATIA, ENOVIA and DELMIA) provide an Engineering Hub, which organizes product engineering knowledge, a Manufacturing Hub, which manages manufacturing engineering knowledge, and an Enterprise Hub, which enables enterprise integrations and connections into both the Engineering and Manufacturing Hubs. Altogether, the system delivers an open object model linking products, processes and resources to enable dynamic, knowledge-based product creation and decision support that drives optimized product definition, manufacturing preparation, production and service.
In this context and other contexts, deep-learning and in particular deep-learning generative models of 3D modeled objects are gaining wide importance.
The following papers relate to this field and are referred to hereunder:
Papers [1,2,3] relate to methods to retrieve and synthesize new shapes by borrowing and combining parts from an existing patrimony (e.g., dataset) of 3D objects. Generative methods have demonstrated their potential to synthesize 3D modeled objects (i.e., to come up with new content that goes beyond combining and retrieving the existing patrimony), thereby proposing conceptually novel objects. Prior-art generative methods learn and apply generative models of 3D content [4,5], focusing only on either geometric or structural features of 3D modeled objects. Paper [4] introduces a model which is a generative decoder that maps noise samples (e.g., uniform or Gaussian) to a voxel grid whose distribution is close to the data-generating distribution. Paper [5] surveys learning 3D generative models that are aware of the structure of the data, e.g., the arrangement of and relations between parts of the shape of the 3D modeled object.
However, there is still a need for improved solutions with respect to outputting 3D modeled objects each representing a mechanical part or an assembly of mechanical parts.
It is therefore provided a computer-implemented method for training a deep-learning generative model that outputs 3D modeled objects each representing a mechanical part or an assembly of mechanical parts. The training method comprises providing a dataset of 3D modeled objects. Each 3D modeled object represents a mechanical part or an assembly of mechanical parts. The training method further comprises training the deep-learning generative model based on the dataset. The training of the deep-learning generative model includes minimization of a loss. The loss includes a term that penalizes, for each output respective 3D modeled object, one or more functional scores of the respective 3D modeled object. Each functional score measures an extent of non-respect of a respective functional descriptor among one or more functional descriptors, by the mechanical part or the assembly of mechanical parts.
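As a non-limiting illustration (not part of the claimed method), the additive structure of such a loss may be sketched as follows; the function name and the weighting scheme below are hypothetical:

```python
def training_loss(shape_term, functional_scores, weights):
    """Hypothetical total loss: a shape-consistency term plus a weighted
    penalty for each functional score. Each score measures the extent of
    NON-respect of one functional descriptor, so lower is better and
    minimizing the loss pushes outputs toward functional validity."""
    penalty = sum(w * s for w, s in zip(weights, functional_scores))
    return shape_term + penalty
```

For instance, with a shape term of 1.0 and two functional scores 0.5 and 0.2 weighted by 2.0 and 1.0, the total loss evaluates to 2.2.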
The training method may comprise one or more of the following:
It is further provided a method of use of a deep-learning generative model that outputs 3D modeled objects each representing a mechanical part or an assembly of mechanical parts, and trained (i.e., having been trained) according to the training method. The method of use comprises providing the deep-learning generative model and applying the deep-learning generative model to output one or more 3D modeled objects each representing a respective mechanical part or an assembly of mechanical parts.
It is further provided a machine-learning process comprising the training method, and then the method of use.
It is further provided a computer program comprising instructions for performing the training method, the method of use, and/or the machine-learning process.
It is further provided a device comprising a data storage medium having recorded thereon the program and/or the trained deep-learning generative model.
The device may form a non-transitory computer-readable medium. The device may alternatively comprise a processor coupled to the data storage medium. The device may thus form a system. The system may further comprise a graphical user interface coupled to the processor.
Embodiments of the disclosure will now be described, by way of non-limiting example, and in reference to the accompanying drawings, where:
Referring to
The training method constitutes an improved solution with respect to outputting 3D modeled objects each representing a mechanical part or an assembly of mechanical parts.
Notably, the training method allows obtaining a generative model configured to output 3D modeled objects. The generative model allows generating (i.e., outputting or synthesizing) one or more 3D modeled objects automatically, thus improving ergonomics in the field of 3D modeling. Moreover, the training method obtains the generative model via a training S20 based on the dataset provided at S10, and the generative model is a deep-learning generative model. Thus, the training method leverages improvements provided by the field of machine-learning. In particular, the deep-learning generative model may learn to replicate the data distribution of a diversity of 3D modeled objects present in the dataset and output 3D modeled objects based on the learning.
In addition, the deep-learning generative model outputs particularly accurate 3D modeled objects, in terms of the use or purpose characteristics of the represented mechanical part or assembly of mechanical parts in the real world, within an intended context. This is thanks to the training S20 including minimization of a particular loss. The loss specifically includes a term that penalizes, for each output respective 3D modeled object, one or more functional scores of the respective 3D modeled object. With this specific term included in the loss, the deep-learning generative model is taught to output 3D objects that fulfill a functional validity measured by the one or more functional scores. Indeed, each functional score measures an extent of non-respect of a respective functional descriptor among one or more functional descriptors, by the mechanical part or the assembly of mechanical parts. In examples, such functional descriptors may involve any cue related to the shape, the structure, or any type of interaction of the represented mechanical part or assembly of mechanical parts with other objects or mechanical forces within the intended context. Therefore, the training S20 may evaluate the functional validity of the 3D modeled objects of the dataset in the intended context, for example, in terms of shape or structural soundness, physical realizability and interaction quality in the intended context. Thus, the deep-learning generative model outputs particularly accurate 3D modeled objects that are functionally valid.
In addition, the training S20 may explore the functional features of 3D modeled objects of the dataset. In examples, in addition to geometric and structural features of said 3D objects, the training S20 may explore the mutual dependencies between geometric, structural, topological and physical features of said 3D objects. Moreover, the training S20 may be used for outputting 3D modeled objects that refine or correct the functionality of respective 3D modeled objects of the dataset, thereby improving the accuracy of the output 3D modeled objects with respect to the dataset.
Any method herein is computer-implemented. This means that all steps of the training method (including S10 and S20) and all steps of the method of use are executed by at least one computer, or any similar system. Thus, steps of the method are performed by the computer, possibly fully automatically or semi-automatically. In examples, the triggering of at least some of the steps of the method may be performed through user-computer interaction. The level of user-computer interaction required may depend on the level of automatism foreseen, balanced against the need to implement the user's wishes. In examples, this level may be user-defined and/or pre-defined.
A typical example of computer-implementation of a method herein is to perform the method with a system adapted for this purpose. The system may comprise a processor coupled to a memory and a graphical user interface (GUI), the memory having recorded thereon a computer program comprising instructions for performing the method. The memory may also store a database. The memory is any hardware adapted for such storage, possibly comprising several physically distinct parts (e.g., one for the program, and possibly one for the database).
A modeled object is any object defined by data stored e.g. in the database. By extension, the expression “modeled object” designates the data itself. According to the type of the system, the modeled objects may be defined by different kinds of data. The system may indeed be any combination of a CAD system, a CAM system, a PDM system and/or a PLM system. In those different systems, modeled objects are defined by corresponding data. One may accordingly speak of CAD object, PLM object, PDM object, CAE object, CAM object, CAD data, PLM data, PDM data, CAM data, CAE data. However, these systems are not exclusive of one another, as a modeled object may be defined by data corresponding to any combination of these systems. A system may thus well be both a CAD and PLM system.
By CAD system, it is additionally meant any system adapted at least for designing a modeled object on the basis of a graphical representation of the modeled object, such as CATIA. In this case, the data defining a modeled object comprise data allowing the representation of the modeled object. A CAD system may for example provide a representation of CAD modeled objects using edges or lines, in certain cases with faces or surfaces. Specifications of a modeled object may be stored in a single CAD file or multiple ones. The typical size of a file representing a modeled object in a CAD system is in the range of one Megabyte per part. A modeled object may typically be an assembly of thousands of parts.
In the context of CAD, a modeled object may typically be a 3D modeled object. By “3D modeled object”, it is meant any object which is modeled by data allowing its 3D representation. A 3D representation allows the viewing of the part from all angles. For example, a 3D modeled object, when 3D represented, may be handled and turned around any of its axes, or around any axis in the screen on which the representation is displayed.
Any 3D modeled object herein, including any 3D modeled object output by the deep learning generative model, may represent the geometry of a product to be manufactured in the real world such as a (e.g. mechanical) part or assembly of parts (or equivalently an assembly of parts, as the assembly of parts may be seen as a part itself from the point of view of the training method, or the training method may be applied independently to each part of the assembly), or more generally any rigid body assembly (e.g. a mobile mechanism). Any of said 3D modeled objects may thus represent an industrial product which may be any mechanical part, such as a part of a motorized or non-motorized terrestrial vehicle (including e.g. car and light truck equipment, racing cars, motorcycles, truck and motor equipment, trucks and buses, trains), a part of an aerial vehicle (including e.g. airframe equipment, aerospace equipment, propulsion equipment, defense products, airline equipment, space equipment), a part of a naval vehicle (including e.g. navy equipment, commercial ships, offshore equipment, yachts and workboats, marine equipment), a general mechanical part (including e.g. industrial manufacturing machinery, heavy mobile machinery or equipment, installed equipment, industrial equipment product, fabricated metal product, tire manufacturing product), an electro-mechanical or electronic part (including e.g. consumer electronics, security and/or control and/or instrumentation products, computing and communication equipment, semiconductors, medical devices and equipment), a consumer good (including e.g. furniture, home and garden products, leisure goods, fashion products, hard goods retailers' products, soft goods retailers' products), a packaging (including e.g. food and beverage and tobacco, beauty and personal care, household product packaging). 
Any of said 3D modeled objects, including any 3D modeled object output by the deep learning generative model, may be subsequently integrated as part of a virtual design with for instance a CAD software solution or CAD system. A CAD software solution allows the subsequent design of products in various and unlimited industrial fields, including: aerospace, architecture, construction, consumer goods, high-tech devices, industrial equipment, transportation, marine, and/or offshore oil/gas production or transportation.
By PLM system, it is additionally meant any system adapted for the management of a modeled object representing a physical manufactured product (or product to be manufactured). In a PLM system, a modeled object is thus defined by data suitable for the manufacturing of a physical object. These may typically be dimension values and/or tolerance values. For a correct manufacturing of an object, it is indeed better to have such values.
By CAM solution, it is additionally meant any solution, software or hardware, adapted for managing the manufacturing data of a product. The manufacturing data generally includes data related to the product to manufacture, the manufacturing process and the required resources. A CAM solution is used to plan and optimize the whole manufacturing process of a product. For instance, it can provide the CAM users with information on the feasibility, the duration of a manufacturing process or the number of resources, such as specific robots, that may be used at a specific step of the manufacturing process, thus allowing decisions on management or required investment. CAM is a subsequent process after a CAD process and potential CAE process. Such CAM solutions are provided by Dassault Systèmes under the trademark DELMIA®.
The GUI 2100 may be a typical CAD-like interface, having standard menu bars 2110, 2120, as well as bottom and side toolbars 2140, 2150. Such menu- and toolbars contain a set of user-selectable icons, each icon being associated with one or more operations or functions, as known in the art. Some of these icons are associated with software tools, adapted for editing and/or working on the 3D modeled object 2000 displayed in the GUI 2100. The software tools may be grouped into workbenches. Each workbench comprises a subset of software tools. In particular, one of the workbenches is an edition workbench, suitable for editing geometrical features of the modeled product 2000. In operation, a designer may for example pre-select a part of the object 2000 and then initiate an operation (e.g. change the dimension, color, etc.) or edit geometrical constraints by selecting an appropriate icon. For example, typical CAD operations are the modeling of the punching or the folding of the 3D modeled object displayed on the screen. The GUI may for example display data 2500 related to the displayed product 2000. In the example of the figure, the data 2500, displayed as a “feature tree”, and their 3D representation 2000 pertain to a brake assembly including brake caliper and disc. The GUI may further show various types of graphic tools 2130, 2070, 2080, for example for facilitating 3D orientation of the object, for triggering a simulation of an operation of an edited product, or for rendering various attributes of the displayed product 2000. A cursor 2060 may be controlled by a haptic device to allow the user to interact with the graphic tools.
The client computer of the example comprises a central processing unit (CPU) 1010 connected to an internal communication BUS 1000, and a random access memory (RAM) 1070 also connected to the BUS. The client computer is further provided with a graphical processing unit (GPU) 1110 which is associated with a video random access memory 1100 connected to the BUS. Video RAM 1100 is also known in the art as frame buffer. A mass storage device controller 1020 manages accesses to a mass memory device, such as hard drive 1030. Mass memory devices suitable for tangibly embodying computer program instructions and data include all forms of nonvolatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM disks 1040. Any of the foregoing may be supplemented by, or incorporated in, specially designed ASICs (application-specific integrated circuits). A network adapter 1050 manages accesses to a network 1060. The client computer may also include a haptic device 1090 such as a cursor control device, a keyboard or the like. A cursor control device is used in the client computer to permit the user to selectively position a cursor at any desired location on display 1080. In addition, the cursor control device allows the user to select various commands, and input control signals. The cursor control device includes a number of signal generation devices for inputting control signals to the system. Typically, a cursor control device may be a mouse, the button of the mouse being used to generate the signals. Alternatively or additionally, the client computer system may comprise a sensitive pad, and/or a sensitive screen.
The computer program may comprise instructions executable by a computer, the instructions comprising means for causing the above system to perform any of the methods. The program may be recordable on any data storage medium, including the memory of the system. The program may for example be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. The program may be implemented as an apparatus, for example a product tangibly embodied in a machine-readable storage device for execution by a programmable processor. Method steps may be performed by a programmable processor executing a program of instructions to perform functions of the method by operating on input data and generating output. The processor may thus be programmable and coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. The application program may be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language if desired. In any case, the language may be a compiled or interpreted language. The program may be a full installation program or an update program. Application of the program on the system results in any case in instructions for performing any method herein.
The deep learning generative model trained according to the training method may be part of a process for designing a 3D modeled object. “Designing a 3D modeled object” designates any action or series of actions which is at least part of a process of elaborating a 3D modeled object. For example, the process may comprise applying the deep-learning generative model to output one or more 3D modeled objects, and optionally performing one or more design modifications to each respective output 3D modeled object. The process may comprise displaying a graphical representation of each respective output 3D modeled object, and the optional one or more design modifications may comprise applying CAD operations with a CAD software, for example by user-graphical interaction.
The designing may comprise using the deep-learning generative model trained according to the training method for creating the 3D modeled object from scratch, thus improving the ergonomics of the design process. Indeed, the user does not need to use cumbersome tools to create the 3D modeled object from scratch and may concentrate on other tasks in the design process. Alternatively, the designing may comprise providing a 3D modeled object (having been previously created) to the deep-learning generative model trained according to the training method. The deep-learning generative model may map the input modeled object to an output 3D modeled object, and the subsequent design is performed on said output 3D modeled object of the deep-learning generative model. The output 3D modeled object is more accurate than the input 3D modeled object, in terms of functionality. In other words, the deep-learning generative model may be used to correct or improve/optimize an input 3D modeled object, functionally speaking, yielding an improved design process. The input 3D modeled object may be geometrically realistic, yet functionally invalid or unoptimized. The method of use in that case may remedy this.
The design process may be included in a manufacturing process, which may comprise, after performing the design process, producing a physical product corresponding to the modeled object. In any case, the modeled object designed by the method may represent a manufacturing object. The modeled object may thus be a modeled solid (i.e., a modeled object that represents a solid). The manufacturing object may be a product, such as a part, or an assembly of parts. Because the deep learning generative model improves the design process of elaborating the modeled object, the deep learning generative model also improves the manufacturing of a product and thus increases productivity of the manufacturing process.
By “deep-learning model”, it is meant any data structure that represents a series of computations, wherein at least a part of said data structure can be trained based on a dataset of 3D modeled objects. The training can be performed by means of any set of techniques known from the field of machine learning, notably within the field of deep-learning. The trained deep-learning model, i.e., having been trained based on the dataset of 3D modeled objects, is “generative”, i.e., configured to generate and output one or more 3D objects. In examples, the deep-learning generative model may include a Variational Autoencoder or a Generative Adversarial Network, and/or any other Neural Network (e.g., Convolutional Neural Network, Recurrent Neural Network, Discriminative Models). In such examples, the deep-learning generative model may generate a family of synthesized 3D objects, with at least some degree of accuracy with respect to geometrical shape.
The minimization of the loss may be performed in any way, e.g., gradient descent, or any other minimization known within the field. The loss includes a term that penalizes, for each output respective 3D modeled object, one or more functional scores of the respective 3D modeled object.
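For instance, a plain gradient-descent minimization may be sketched as follows (a minimal one-parameter illustration; real trainings typically use stochastic variants such as SGD or Adam over all network weights):

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Minimize a scalar loss by repeatedly stepping against its gradient.
    `grad` maps a parameter value to the loss gradient at that value."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Minimizing f(x) = (x - 3)^2, whose gradient is 2 * (x - 3),
# converges toward the minimizer x = 3:
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
```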
By “functional score” it is meant any indicator (e.g., a number, a vector) that is attributed to each object, and that measures an extent of non-respect of a respective functional descriptor among one or more functional descriptors, by the mechanical part or the assembly of mechanical parts. By convention, a lower functional score may signify a better agreement to (or respect of) the respective functional descriptor.
By “functional descriptor” it is meant any functional attribute of the 3D modeled object, i.e., a feature which characterizes (i.e., represents) a physical functionality of the represented real-world object (the mechanical part or the assembly of mechanical parts). By physical functionality it is meant the quality (of the real-world object) of being functional, that is, the capability of the represented real-world object to perform or be able to perform in the way it is supposed to within a physical context of use, i.e., its use or purpose. A physical functionality may be a physical property that the represented real-world object possesses and that may be influenced by a physical interaction. Such physical properties may include, for example, electrical conductivity, thermal conductivity, drag in a fluid environment, density, rigidity, elastic deformation, tenacity or resilience of the represented real-world object. A physical interaction may be an interaction with another mechanical object or plurality of objects, or the presence of environmental stimuli that influences a physical property of the represented real-world object. An environmental stimulus may be any physical input (to the represented object) coming from at least part of the intended environment, for example, heat sources from the environment or an environment comprising fluid flows. The physical functionality may be a mechanical functionality. A mechanical functionality may be a property of spatial structure and mechanical soundness, that the represented real-world object possesses and that may be influenced under a mechanical interaction. Properties of spatial structure and mechanical soundness of the represented object may include the connectedness of the components of the represented real-world object, the stability and durability of the represented real-world object when in use, or any affordance of the represented real-world object when subject to a mechanical interaction.
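As a non-limiting illustration of a functional score for the connectedness descriptor mentioned above, the following sketch counts connected components of filled cells on a 2D occupancy grid (hypothetical; an implementation for a 3D modeled object would operate on, e.g., a 3D voxel grid with 6-neighbour connectivity):

```python
def connectedness_score(grid):
    """Hypothetical functional score: 0.0 when the filled cells form a
    single connected component (descriptor respected), and larger when
    the shape breaks into several disconnected pieces (non-respect)."""
    filled = {(i, j) for i, row in enumerate(grid)
              for j, v in enumerate(row) if v}
    if not filled:
        return 0.0
    components, seen = 0, set()
    for cell in filled:
        if cell in seen:
            continue
        components += 1          # flood-fill one component
        stack = [cell]
        seen.add(cell)
        while stack:
            i, j = stack.pop()
            for nb in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if nb in filled and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
    return float(components - 1)
```

A connected L-shape thus scores 0.0, while a grid with an isolated filled cell scores 1.0.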
Said mechanical interaction may include any mechanical interaction of the represented object with other mechanical objects and/or external mechanical stimuli. Examples of mechanical interaction of the represented object with other mechanical objects may include collisions (including, e.g., any type of impact with other objects and/or any type of friction with other objects), contention of another object within an enclosed volume, supporting another object, holding another object, or being hung via a mechanical connection. Said mechanical interaction of the represented object may include mechanical constraints of motion on the represented object, for example, static or dynamic constraints of motion. Said mechanical interaction may further include any mechanical response of the represented object to external mechanical stimuli. An external mechanical stimulus may consist of the application of a mechanical force to the represented object, for example, the application of forces changing from a state of rest to a state of motion, stress forces or vibration forces. A functional descriptor may involve multiple physical or mechanical functionalities, which may be combined in a single-dimensional descriptor or as a multi-dimensional vector.
Therefore, the functional scores quantify an extent to which the represented real-world object possesses a physical functionality related to its use or purpose, as represented by a respective functional descriptor among one or more functional descriptors. Thus, the loss penalizes output 3D modeled objects that do not have good agreement with respect to the use or purpose characteristics of the represented real-world object. Therefore, the loss allows the learning to explore the mutual dependency between one or more physical and/or mechanical functionalities of the represented real-world object characterized by the one or more functional descriptors, e.g., the learning explores the mutual dependency between geometry, structure and physical and/or mechanical interaction of the object within the intended context. Consequently, the generative model trained according to the training method outputs 3D modeled objects that are particularly accurate representations of the corresponding represented mechanical part or assembly of mechanical parts, in terms of its mechanical functionality, e.g., being geometrically sound, physically realizable and with an improved interaction quality with the intended context of use.
The loss may further include another term that penalizes, for each output respective 3D modeled object, a shape inconsistency of the respective 3D modeled object with respect to the dataset. By “shape inconsistency of the respective 3D modeled object with respect to the dataset” it is meant any disagreement or mismatch between geometry of the respective 3D modeled object and geometry of at least some elements of the dataset. The disagreement or mismatch may consist in a difference or distance (exact or approximate) between shape of the respective 3D modeled object and shape of the at least some elements of the dataset. Therefore, the learning may explore the mutual dependency between the functionality (as captured by penalizing the functional scores) and the geometry of the 3D modeled object, thus outputting accurate and realistic 3D modeled objects where the shape is best adapted with the functionality of the represented mechanical part or assembly of mechanical parts.
The other term may include a reconstruction loss, an adversarial loss, or a mapping distance. Therefore, the learning may focus on improving the consistency of the shape according to either the reconstruction loss, the adversarial loss or the mapping distance. When the other term includes a reconstruction loss between the respective 3D modeled object and a corresponding ground-truth 3D modeled object of the dataset, it may consist of a term that penalizes a geometric dissimilarity between the respective 3D modeled object and the corresponding ground-truth 3D modeled object. Thus, a generative model trained according to the provided method may be used for an improved 3D shape reconstruction that respects the intended functionality. An adversarial loss relative to the dataset includes or consists of a term that minimizes a discrepancy between the distribution of the dataset and that of the generated output 3D modeled objects. Thus, the minimization of the discrepancy improves the shape consistency of the generated output 3D objects with respect to the distribution of the dataset. The mapping distance measures a shape dissimilarity between the respective 3D modeled object and a corresponding modeled object of the dataset. Therefore, the generative model outputs 3D modeled objects wherein a 3D modeled object may be more accurate (at least in terms of its functionality and shape consistency) with respect to the corresponding modeled object of the dataset.
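As a non-limiting illustration, one possible shape-dissimilarity measure of the kind discussed above is a symmetric chamfer distance between point samplings of two shapes; the following 2D sketch is hypothetical (a real implementation would typically use 3D point clouds and accelerated nearest-neighbour search):

```python
def chamfer_distance(a, b):
    """Symmetric chamfer distance between two point sets: for each point,
    the squared distance to its nearest neighbour in the other set,
    averaged over each set and summed. Equals 0.0 when the sets coincide."""
    def sq(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    d_ab = sum(min(sq(p, q) for q in b) for p in a) / len(a)
    d_ba = sum(min(sq(q, p) for p in a) for q in b) / len(b)
    return d_ab + d_ba
```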
The deep-learning generative model may include a 3D generative neural network. By “3D generative neural network” it is meant a deep-learning generative model forming a neural network, the neural network being trainable according to machine-learning-based optimization, and wherein all of the learnable elements of the neural network (e.g., all weights of the neural network) are trained together (e.g., during a single machine-learning-based optimization). Therefore, the deep-learning generative model may leverage such 3D generative neural networks for improving aspects of the accuracy of the output 3D modeled objects.
The 3D generative neural network may include a Variational Autoencoder or a Generative Adversarial Network (e.g., a classic Generative Adversarial Network, or a latent Generative Adversarial Network followed by a 3D converter). Such 3D generative neural networks improve the accuracy of the output 3D models.
A Variational Autoencoder is composed of two parts: an encoder and a decoder. The encoder takes an input and outputs a distribution probability. The decoder reconstructs the input given a sample from the distribution output by the encoder. For example, the distribution may be set to be Gaussian, such that the encoder outputs two vectors of a same size representing, e.g., the mean and variance of the distribution probability, and the decoder reconstructs the input based on the distribution. The Variational Autoencoder is trained by training jointly the encoder and decoder via a variational loss and a reconstruction loss.
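The encoder/decoder interplay and the two training terms can be sketched with a toy example (NumPy only; the linear encoder and decoder, the dimensions and the random weights are illustrative assumptions, not the networks of an actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W_mu, W_logvar):
    """Toy linear encoder: returns mean and log-variance of q(z|x)."""
    return x @ W_mu, x @ W_logvar

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode(z, W_dec):
    """Toy linear decoder reconstructing the input from z."""
    return z @ W_dec

def vae_loss(x, x_rec, mu, logvar):
    """Reconstruction (squared error) plus variational KL(q(z|x) || N(0, I))."""
    rec = np.mean(np.sum((x - x_rec) ** 2, axis=1))
    kl = -0.5 * np.mean(np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1))
    return float(rec + kl)

# Illustrative dimensions: 8-D inputs, 3-D latent space, random weights.
x = rng.standard_normal((4, 8))
W_mu = rng.standard_normal((8, 3))
W_logvar = 0.01 * rng.standard_normal((8, 3))
W_dec = rng.standard_normal((3, 8))

mu, logvar = encode(x, W_mu, W_logvar)
z = reparameterize(mu, logvar, rng)
loss = vae_loss(x, decode(z, W_dec), mu, logvar)
```

A real Variational Autoencoder would replace the linear maps by deep networks and train all weights jointly by gradient descent on this same combination of reconstruction and variational losses.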
A Generative Adversarial Network (henceforth called as well GAN) is composed of two networks: a generator and a discriminator. The generator takes as input a low-dimensional latent variable which is sampled from a Gaussian distribution. The output of the generator may be a sample of the same type as the data within the training dataset (e.g., classic GAN), or as the data within the latent space (e.g., latent GAN). The generator may be trained using a discriminator which is trained to perform binary classification on its input between two classes: “real” or “fake”. Its input must be classified as “real” if it comes from the training dataset and as “fake” if it comes from the generator. During the training stage, while the discriminator is trained to perform its binary classification task, the generator is trained to “fool” the discriminator by producing samples which are classified as “real” by the discriminator. To train both networks jointly, the GAN may be trained through an adversarial loss.
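The two adversarial objectives described above can be sketched as follows (toy linear networks stand in for real generator and discriminator architectures; the non-saturating generator loss is one common choice among the losses mentioned below):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def generator(z, W_g):
    """Toy linear generator mapping a latent vector z to a sample."""
    return z @ W_g

def discriminator(x, w_d):
    """Toy linear discriminator: probability that x is 'real'."""
    return sigmoid(x @ w_d)

def d_loss(real_p, fake_p):
    """Binary cross-entropy: classify dataset samples as 'real', generated as 'fake'."""
    return float(-np.mean(np.log(real_p + 1e-8) + np.log(1.0 - fake_p + 1e-8)))

def g_loss(fake_p):
    """Non-saturating generator loss: 'fool' D into outputting 'real'."""
    return float(-np.mean(np.log(fake_p + 1e-8)))

z = rng.standard_normal((16, 4))           # latent samples ~ N(0, I)
real = rng.standard_normal((16, 8)) + 2.0  # toy "training dataset"
W_g = rng.standard_normal((4, 8))
w_d = rng.standard_normal(8)

fake = generator(z, W_g)
Ld = d_loss(discriminator(real, w_d), discriminator(fake, w_d))
Lg = g_loss(discriminator(fake, w_d))
```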
The 3D generative neural network may alternatively include a hybrid generative network, i.e., a network built on the basis of both the Variational Autoencoder and the Generative Adversarial Network.
The deep-learning generative model may consist of the 3D generative neural network. Thus, the learning method trains all of the elements of the 3D generative neural network during the machine-learning-based optimization.
In such a case, the 3D generative neural network may include a Variational Autoencoder, and the other term may include the reconstruction loss and a variational loss. Accordingly, the training method trains the encoder and decoder of the Variational Autoencoder together while minimizing the functional score. The deep-learning generative model thus outputs functionally valid 3D modeled objects while leveraging the advantages of a Variational Autoencoder. Such advantages include, for example, sampling two latent vector representations from the latent space, interpolating between them (or performing other types of arithmetic in the latent space) and outputting a functional 3D modeled object based on said interpolation, thanks to the decoder trained according to the training method.
Alternatively to a Variational Autoencoder, the 3D generative neural network may include a Generative Adversarial Network, and the other term may include the adversarial loss. The adversarial loss may be of any kind, e.g., a Discriminator Loss, a Mini-Max Loss or a Non-Saturating Loss. Accordingly, the deep-learning generative model outputs functionally valid 3D modeled objects while leveraging the advantages of a Generative Adversarial Network.
The deep learning generative model trained according to the method may be used within a method for synthesizing 3D modeled objects (i.e., creating the 3D modeled object from scratch), which may subsequently be integrated as part of a design process, notably within a CAD process, thereby increasing ergonomics of the design process while the user of the CAD system is provided with an accurate and realistic 3D modeled object. The synthesizing may comprise obtaining a latent representation from a latent space of the trained 3D generative neural network and generating the 3D modeled object from the obtained latent representation. The latent representation for generating the 3D modeled object may be obtained from any kind of sampling from the latent space, for example, as the result of performing arithmetic operations, or interpolation between at least two sampled latent representations from latent space. In further examples, the latent representation may be obtained as a result of providing an input 3D modeled object to the deep learning generative model. Therefore, the deep learning generative model may output a 3D modeled object that has an improved functionality over the input 3D modeled object.
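The sampling and interpolation operations described above can be sketched as follows (a hypothetical linear `generate` stands in for the trained decoder/generator; the latent and output dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(rng, dim):
    """Draw a latent representation from the generator's prior N(0, I)."""
    return rng.standard_normal(dim)

def interpolate(z_a, z_b, t):
    """Linear interpolation between two latent representations."""
    return (1.0 - t) * z_a + t * z_b

def generate(z, W):
    """Stand-in for the trained decoder/generator (here a linear map)."""
    return z @ W

z_a, z_b = sample_latent(rng, 64), sample_latent(rng, 64)
W = rng.standard_normal((64, 300))  # toy decoder weights

# Synthesize shapes at the endpoints and at the midpoint of the interpolation.
shapes = [generate(interpolate(z_a, z_b, t), W) for t in (0.0, 0.5, 1.0)]
```

With a linear stand-in the midpoint shape is exactly the average of the endpoint shapes; a real trained decoder is nonlinear, and the interest of the method is precisely that intermediate latent vectors still decode to functionally valid objects.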
Alternatively to the deep-learning generative model consisting of the 3D generative neural network, the deep-learning generative model may consist of a mapping model followed by the 3D generative neural network. In such a case, the 3D generative neural network may be pre-trained, and the other term may include the mapping distance. In such an alternative, the 3D generative neural network may optionally include a Variational Autoencoder or a Generative Adversarial Network. The mapping model may be trained by means of the mapping distance penalizing the shape dissimilarity between each respective output 3D modeled object and a corresponding modeled object of the dataset. The corresponding modeled object of the dataset may have been sampled, or may correspond to a provided 3D modeled object (and thus have been added to the dataset). The provided 3D modeled object may thus have been output from another generative model or may result from a prior design process. Thus, the training method may exploit the mutual dependency between the functionality and the shape of the 3D modeled objects to output 3D modeled objects with improved functionality. Therefore, the deep-learning generative model focuses on providing particularly accurate output 3D modeled objects, with improved functionality and shape consistency, while leveraging the generative advantages of 3D generative neural networks: it may synthesize 3D modeled objects from a latent space of the 3D generative neural network, or it may output a 3D modeled object with an improved functionality over a provided 3D modeled object that may already benefit from improvements due to, e.g., being part of a prior design process or being output by another generative model.
The deep-learning generative model trained according to the method may output 3D modeled objects based on a corresponding 3D modeled object of the dataset, for example, based on a latent vector representation from the latent space of the 3D generative model. The latent vector representation may be computed, for example, by sampling a latent vector from the latent space of the deep-learning generative model, a 3D modeled object being output based on such a latent vector. Alternatively, the latent representation may correspond to a 3D modeled object provided to the deep-learning generative model and projected (e.g., represented as a latent vector) into the latent space. The mapping distance specifies the manner in which the deep-learning generative model outputs the 3D modeled object based on the corresponding 3D modeled object of the dataset, by penalizing the shape dissimilarity between each respective output 3D modeled object and a corresponding modeled object of the dataset. In examples, the mapping distance penalizes such shape dissimilarity by penalizing a distance between the latent vector representation of the corresponding modeled object of the dataset and the latent vector representation resulting from applying the mapping model to that latent vector representation. The mapping model may be any neural network, for example, two fully connected layers, and is therefore learnable according to machine-learning techniques. In examples, the minimum found thanks to the penalization of the mapping distance thus corresponds to the latent vector closest to the latent vector representation of the corresponding 3D modeled object of the dataset. Accordingly, the deep-learning generative model outputs the 3D modeled object with the best shape consistency, that is, the 3D modeled object corresponding to the latent vector found thanks to the penalization of the mapping distance.
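A minimal sketch of such a mapping model, assuming two fully connected layers with a ReLU in between and a squared latent-space distance (all weights and dimensions are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def mapping_model(z, W1, b1, W2, b2):
    """Two fully connected layers with a ReLU in between (an illustrative
    choice of architecture for the mapping model)."""
    h = np.maximum(z @ W1 + b1, 0.0)
    return h @ W2 + b2

def mapping_distance(z_mapped, z_target):
    """Squared latent-space distance penalizing shape dissimilarity."""
    return float(np.sum((z_mapped - z_target) ** 2))

dim = 16
W1 = 0.1 * rng.standard_normal((dim, dim)); b1 = np.zeros(dim)
W2 = 0.1 * rng.standard_normal((dim, dim)); b2 = np.zeros(dim)

z_in = rng.standard_normal(dim)  # latent code of the corresponding dataset object
z_out = mapping_model(z_in, W1, b1, W2, b2)
d = mapping_distance(z_out, z_in)
```

Training would adjust W1, b1, W2, b2 by gradient descent so that the mapped latent vector stays close to the target latent vector while the decoded object minimizes the functional score.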
Thus, the deep-learning generative model outputs a 3D modeled object with an improved functionality and shape consistency with respect to the corresponding 3D modeled object of the dataset (which may have been sampled or obtained from a provided 3D object), resulting in a more accurate output 3D model object with respect to the corresponding 3D modeled object of the dataset, with optimized functionality (thanks to the functional loss) and improved shape consistency (thanks to the mapping distance).
The training may comprise computing, for each respective 3D modeled object, a functional score of the 3D modeled object, the computing being performed by applying to the 3D modeled object one or more among a deterministic function, a simulation-based engine or a deep-learning function. Therefore, the functional scores may populate a dataset of 3D modeled objects with functional annotations.
By “deterministic function” it is meant any function (including at least a set of computations) that provides an explicit deterministic theoretical model, i.e., it produces the same output from a given starting condition. This excludes notably any method consisting of probabilistic or stochastic components. The deterministic function evaluates the 3D object (e.g., provided as an input to the explicit deterministic theoretical model) and outputs at least a functional score corresponding to a functional descriptor.
By “simulation-based engine” it is meant any data structure that represents a series of computations, wherein at least part evaluates the behavior of the 3D object under a variety of interactions, as supported by the simulation engine (e.g., thanks to a physics engine). A simulation-based engine takes as input the 3D object and outputs as a functional score a quantity related to the simulation output. Thus, the simulation-based engine evaluates the 3D object within a relevant context, that is, tracking the behavior of the 3D object when subjected to a variety of expected interactions, e.g., the expected response of the mechanical part or assembly of mechanical parts under the action of gravity.
By “deep-learning function” it is meant any data structure that represents a series of computations, wherein at least part comprises learning a functional score based on the dataset, and which may be at least partly based on one or more among a deterministic function or the simulation-based engine. Thus, a 3D object is fed to the deep-learning function, which predicts a functional score that may be used as an annotation for the 3D modeled object.
In the case wherein the computing is performed by the deep-learning function, the deep-learning function may have been trained on the basis of another dataset. The other dataset may include 3D objects each associated with a respective functional score. The respective functional score may have been computed by using one or more among a deterministic function, a simulation-based engine or a deep-learning function. Therefore, the method leverages the patrimony of 3D objects generated using existing computer programs, which exhibit very heterogeneous compliance with functional requirements.
The training method may therefore compute the functional scores leveraging a deterministic function, a simulation engine and/or a deep-learning function, or a combination of such computational tools. In examples, the training may augment a dataset of 3D modeled objects with a functional validity for each 3D modeled object of the dataset, by using the computed functional scores as annotations.
Reference is made to
The following papers are referred hereunder:
In these examples, for the category of chairs, the computation tools consist of a deep-learning function to compute a connectivity loss, and simulation engines to compute physical stability and sitting affordance descriptors. The one or more functional descriptors provided herein are just examples. In this example, the connectivity loss is referred to as a "topological loss" as it is computed via topological priors, e.g., a loss incorporated in a deep-learning function as in paper [6]. Moreover, a simulation engine is used to compute the functional score corresponding to a physical stability descriptor by applying gravity forces to the object. In addition, a simulation engine as in paper [10] is used for computing the functional score corresponding to a sitting affordance descriptor. In these examples, for the category of airplanes, the functional score corresponding to the connectivity descriptor may be obtained using topological priors as above. Additionally, a score corresponding to a drag coefficient descriptor may be computed via a simulation engine performing a computational fluid dynamics simulation as in paper [11]. In these examples, for the category of pushcart vehicles, a functional score corresponding to a contain affordance may further be computed, among others, by means of a simulation engine as in paper [12]. In these examples, further categories of vehicles include bicycle objects, for which functional scores corresponding to a human support and/or a pedal affordance may be computed, among others. In this example, a functional energy such as the one used in paper [13] may be incorporated into a simulation engine to provide an affordance model, which may be used to compute such descriptors by simulating the application of forces to the seat of the bicycle, corresponding to the force exerted by a human model, as well as by simulating the dynamics of the bicycle under pedaling effects.
The one or more functional descriptors are now discussed.
The one or more functional descriptors may include a connectivity descriptor. By “connectivity descriptor” it is meant any variable (single dimensional or multi-dimensional) describing the number of connected components of the 3D modeled object. Therefore, the training method may focus on penalizing disconnected elements of the 3D model. Consequently, a deep-learning generative model trained according to the training method creates 3D modeled objects consisting of a single connected component. Thus, the resulting 3D object is more accurate as the resulting object is a single 3D modeled object without disconnected components, a functionality required notably in the design of mechanical parts, or assembly of mechanical parts.
The one or more functional descriptors may include one or more geometrical descriptors and/or one or more affordances. By “geometrical descriptor” it is meant any variable (single dimensional or multi-dimensional) representing the spatial structure of the 3D object. By “affordance” it is meant any variable (single dimensional or multi-dimensional) representing the interaction of the object within an intended context (i.e., any type of relation that provides cues on the use of the 3D object within a specific context). Therefore, the training method may focus on penalizing geometric aberrations. Consequently, a deep-learning generative model trained according to the training method provides geometrically sound 3D modeled objects, i.e., having a more accurate spatial structure and/or a better functionality in terms of the interaction of the 3D modeled object within its intended context.
The one or more geometrical descriptors may include one or more among a physical stability descriptor and/or a durability descriptor. The physical stability descriptor may represent, for a mechanical part or an assembly of mechanical parts, a stability of the mechanical part or the assembly of mechanical parts, e.g., a capability of the mechanical part or the assembly of mechanical parts to remain in equilibrium at a spatial position, under the application of gravity force only. Therefore, the learning method may focus on penalizing any deviation of the position of the mechanical part from its initial position under the application of gravity force. Accordingly, a deep-learning generative model trained according to the training method outputs 3D modeled objects that maintain mechanical stability under the action of gravity, as expected in the context of a 3D object representing a mechanical part or an assembly of mechanical parts.
In examples, the physical stability descriptor of the 3D object may be computed via a deterministic function, a simulation engine, or a deep-learning function. In such examples, the descriptor may correspond to the position of the center of mass of the mechanical part or assembly of mechanical parts, and the response of the 3D object under the application of gravity forces would be recorded over a time interval, at least at that position. Therefore, a functional score measuring an extent of non-respect of the stability descriptor would be the difference between the initial and final spatial positions of the center of mass over the time interval. The physical stability descriptor may be further used to define a functional loss such as:
f_stability = |p_{i_0} − p_{i_f}|,
wherein (p_i) denotes the recorded positions of the center of mass, p_{i_0} being the initial position and p_{i_f} the final position over the time interval.
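A minimal sketch of computing such a stability score from a recorded center-of-mass trajectory (the trajectories below are hypothetical stand-ins for the output of a simulation engine applying gravity to the 3D object):

```python
import numpy as np

def stability_score(com_positions):
    """f_stability = |p_{i_0} - p_{i_f}|: displacement of the center of
    mass between the first and last recorded simulation steps."""
    com_positions = np.asarray(com_positions, dtype=float)
    return float(np.linalg.norm(com_positions[-1] - com_positions[0]))

# Hypothetical recorded trajectories (100 simulation steps, 3-D positions).
stable = np.tile([0.0, 0.0, 0.5], (100, 1))                     # never moves
unstable = np.linspace([0.0, 0.0, 0.5], [0.3, 0.0, 0.0], 100)   # topples over
```

A stable object yields a score of zero, while an object whose center of mass drifts under gravity is penalized proportionally to its displacement.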
The durability descriptor may represent, for a mechanical part or an assembly of mechanical parts, a capability of the mechanical part or the assembly of mechanical parts to withstand the application of gravity force and external mechanical forces. The external mechanical forces may correspond to forces applied at random positions, which may perturb the object by varying multiples of mass times gravity at each of the random positions. The external mechanical forces may further correspond to forces featuring some direct contact with another mechanical object, including, e.g., any type of friction between the two objects. Therefore, the training method may focus on penalizing deviations in the spatial position of the 3D modeled object due to the perturbations. Consequently, a deep-learning generative model trained according to the training method outputs a 3D modeled object that provides a more accurate representation of a mechanical part or assembly of mechanical parts subject to the application of gravity force and external mechanical forces, which may represent, for example, stress forces, vibrational forces or any similar external mechanical perturbation.
In examples, the response of the 3D object may be computed via a deterministic function, a simulation engine, or a deep-learning function. In such examples, the durability descriptor may be the center of mass positions of the mechanical part or assembly of mechanical parts, wherein, in addition, the initial and final spatial positions of the object's center of mass would be recorded to evaluate the durability of the 3D modeled object. The durability descriptor would thus be used to define a functional loss term that penalizes deviations from the spatial position of the center of mass. The durability descriptor may be used to define a functional loss term as:
wherein α is an annealing coefficient and p_i denotes the center of mass positions.
The one or more affordances may include one or more among a support affordance descriptor, a drag coefficient descriptor, a contain affordance descriptor, a holding affordance descriptor and/or a hanging affordance descriptor. The support affordance descriptor may represent, for a mechanical part or an assembly of mechanical parts, a capability of the mechanical part or the assembly of mechanical parts to withstand the application of external mechanical forces only. Such a descriptor may be a position (e.g., a top position) of the mechanical part or assembly of mechanical parts, and it can be recorded, e.g., by a simulation engine or an explicit theoretical model wherein one or more forces are applied on the top of the mechanical part or assembly of mechanical parts. The drag coefficient descriptor may represent, for a mechanical part or an assembly of mechanical parts, an influence of a fluid environment on the mechanical part or the assembly of mechanical parts. The descriptor may be a drag coefficient, i.e., a dimensionless quantity used to represent drag or resistance of the 3D modeled object in a fluid environment, as commonly known in the field of fluid dynamics. Such a descriptor may be computed, e.g., via a simulation engine performing a computational fluid dynamics simulation. The contain affordance descriptor may represent, for a mechanical part or an assembly of mechanical parts, a response of the mechanical part or the assembly of mechanical parts while containing another object in an inside volume of the mechanical part. Such a descriptor may be computed, e.g., via a simulation engine. The holding affordance descriptor may represent, for a mechanical part or an assembly of mechanical parts, the capability of the mechanical part or the assembly of mechanical parts to support another object via a mechanical connection. Such a descriptor may be defined at the position wherein the mechanical connection is located.
The hanging affordance descriptor may represent, for a mechanical part or an assembly of mechanical parts, the capability of the mechanical part or the assembly of mechanical parts for being supported through a mechanical connection.
Each 3D modeled object of the dataset may represent a piece of furniture, a motorized vehicle, a non-motorized vehicle or a tool. The deep learning generative model trained according to the method thus provides a more accurate representation of the specified class of objects of the dataset. The learning may use any loss that penalizes a combination of functional scores, for example, by penalizing a combination of one or more among the connectivity descriptor, the one or more geometrical descriptors or the one or more affordances. The class of furniture may have as functional descriptors at least a connectivity descriptor, a physical stability descriptor and one of a sitting affordance or an object support affordance. The class of motorized vehicles may have as functional descriptors at least a connectivity descriptor and a drag coefficient descriptor. The class of tools may have as functional descriptors at least a connectivity descriptor, a durability descriptor and one of a holding affordance or an object support affordance.
The training method can combine one or more functional scores of each descriptor, in accordance with the intended context of the outputted 3D modeled object. Reference is made to
Examples are now discussed with reference to
Examples are now discussed with reference to
In these examples, each 3D modeled object of the dataset represents a piece of furniture, in particular a chair. In other words, the training method and the method of use are applied to the category of “chairs”. The 3D data representation consists of structured point cloud objects. In other words, each 3D modeled object is a point cloud.
The following examples further relate to the computation of functional descriptors and scores in the specific category of chairs, which are a class of furniture objects, computing connectivity, stability, durability and sitting affordance descriptors. The functional descriptors are to be used to generate structured point cloud objects representing the category.
Connectivity descriptor: Within the category of chairs, connectivity encourages that there are no floating segments of the object. With reference to
In these examples, to assess point cloud connectivity, the method of paper [6] is used for computing connected and disconnected components of the 3D modeled object, using 0-dimensional persistent homology. According to the method of paper [6], a "Persistence Diagram" may be computed showing when topological features (connected components) appear (birth time b_i) and disappear (death time d_i).
In these examples, the connectivity loss is defined as
This loss function sums the lifetimes (d_i − b_i) of the connected components beginning with i=1, thereby excluding the most persistent component. Hence the loss penalizes disconnected components.
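A minimal sketch of such a connectivity loss, assuming the persistence diagram (birth and death times of connected components) has already been computed, e.g., by the 0-dimensional persistent homology method of paper [6]:

```python
import numpy as np

def connectivity_loss(births, deaths):
    """Sum of lifetimes (d_i - b_i) of all connected components except the
    most persistent one (the single desired component): after sorting the
    lifetimes in decreasing order, the sum begins with i = 1."""
    lifetimes = np.sort(np.asarray(deaths, dtype=float)
                        - np.asarray(births, dtype=float))[::-1]
    return float(np.sum(lifetimes[1:]))

# Hypothetical 0-dimensional persistence diagrams (birth/death times).
connected = connectivity_loss([0.0], [1.0])                  # one component
disconnected = connectivity_loss([0.0, 0.0, 0.0], [1.0, 0.6, 0.4])
```

A single connected component yields a zero loss, while each extra floating segment contributes its lifetime to the penalty.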
Stability descriptor: the 3D object is input to a simulation engine [8] simulating the stability of an object in the chair category. The simulation engine evaluates the stability of the 3D object by evaluating whether it remains static when subjected to gravity, once placed on a flat plane in the common orientation. Accordingly, the simulation records its center of mass positions p_i when subject to gravity, for i ranging from i_0 = 0 to i_{2.5s} = 2.5 seconds with a simulation step of 2.5·10^−4 seconds. The corresponding functional loss is:
f_stability = |p_{i_0} − p_{i_{2.5s}}|
With reference to
Durability descriptor: ensures that the chair remains stable when subjected to small perturbations. With reference to
where α is an annealing coefficient between 0 and 1 ensuring that the object of the category of chairs is more penalized if it fails to remain static under smaller perturbations. In the current example, α = 0.9 and M = 10.
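One plausible sketch of such an annealed durability score, under the assumption that the M perturbations are ordered from smallest to largest and that the corresponding center-of-mass displacements are weighted by powers of α (the exact combination used in the example is not fully specified here, so this form is an illustrative assumption):

```python
import numpy as np

def durability_score(displacements, alpha=0.9):
    """Annealed sum of center-of-mass displacements over M perturbations
    ordered from smallest to largest: weights alpha**j decrease with j,
    so failing under a smaller perturbation is penalized more."""
    d = np.asarray(displacements, dtype=float)
    weights = alpha ** np.arange(1, len(d) + 1)
    return float(np.sum(weights * d))

# Hypothetical displacements for M = 10 perturbations of increasing magnitude.
robust = [0.0] * 10           # the chair never moves
fragile = [0.5] + [0.0] * 9   # it moves even under the smallest force
```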
The following example shows how to further combine functional descriptors.
The descriptors fstability and fdurability are now combined as:
In further examples, all of the three scores may be further combined as:
f_physical = f_connectivity + f_stability + f_durability.
The following example discusses the computation of the functional scores for the training method wherein the loss includes the combination of the three functional scores f_physical and, additionally, a functional score f_affordance measuring an extent of non-respect of a respective sitting affordance descriptor. Reference is made to
f_affordance = ∥C_res − C_key∥_2.
The following example refers to an implementation of the deep-learning generative model trained according to the training method.
The implemented deep-learning model takes as input a 3D object and outputs the functional scores f_physical and f_affordance. To build the model, a database of N = 5·10^4 chairs is generated by using 3D generative neural networks such as the ones disclosed by papers [9, 10]. Specifically, given an object category, pre-trained 3D generative neural networks from the prior art are used to sample new content of this category as instructed by each of these papers. Typically, for a GAN-based 3D generative neural network such as [8], trained to learn the data distribution of the target object category, many sampled latent vectors are mapped by the model's generator to a voxel grid output. Each of these outputs constitutes a new instance. For each chair object O_i, the functional scores f_i are computed as described above. This creates a new labeled dataset {O_i, f_i}_{1≤i≤N}. A functional predictor estimates a score vector f_i′ for each object and is trained to reduce the distance between this estimated vector f_i′ and the ground truth functional score f_i. The distance designates a loss function such as the Euclidean distance.
In this example, the functional score is defined as
f_i = f_{i,physical} + f_{i,affordance};
All scores are normalized between 0 and 1.
To train the deep-learning generative model, the objects {O_i}_{1≤i≤N} are mapped to point clouds, and a PointNet architecture as in paper [9] is used. In this example, the training loss for the functional predictor is:
L = Σ_{i=1}^{N} ∥f_i − f_i′∥_2.
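A toy sketch of training the functional predictor on the labeled dataset {O_i, f_i} (a linear map stands in for the PointNet-based predictor, a squared per-object loss is used for differentiability, and the dimensions and learning rate are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(O, W):
    """Toy linear stand-in for the PointNet-based functional predictor."""
    return O @ W

def predictor_loss(F_true, F_pred):
    """Squared variant of the training loss: sum_i ||f_i - f_i'||^2."""
    return float(np.sum((F_true - F_pred) ** 2))

# Toy labeled dataset: 32 "objects" as 16-D features and 2 functional
# scores each (f_physical, f_affordance), normalized to [0, 1].
O = rng.standard_normal((32, 16))
F = rng.uniform(0.0, 1.0, (32, 2))

W = np.zeros((16, 2))
loss_before = predictor_loss(F, predict(O, W))
grad = -2.0 * O.T @ (F - predict(O, W))  # gradient of the loss in W
W -= 0.001 * grad                        # one gradient-descent step
loss_after = predictor_loss(F, predict(O, W))
```

Each gradient step moves the predicted scores f_i′ closer to the ground truth scores f_i, which is exactly what the training loss above enforces.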
The following examples discuss methods of use of the deep-learning generative model. Hereunder, the part of the deep-learning generative model trained according to the training method that outputs the functional scores of a corresponding output 3D modeled object is referred to as the "functional predictor". In this example, the deep-learning generative model comprises a 3D generative neural network followed by the functional predictor. In this example, the 3D generative neural network corresponds to the PQ-Net decoder [8].
The PQ-Net decoder takes a latent vector from the learned latent space of objects and maps it to a 3D object, one part at a time, resulting in a sequential assembly. The decoder performs shape generation by training a generator using a GAN strategy as described in paper [8]. The GAN generator maps random vectors sampled from the standard Gaussian distribution N(0,1) to latent vectors in the latent space of objects, from which the sequential decoder generates new objects.
First method of use.
The first method of use is an example of an implementation of the deep-learning generative model, wherein the deep learning generative model consists of the 3D generative neural network, wherein the 3D generative neural network is a Generative Adversarial Network and the other term includes the adversarial loss. In this example, the 3D generative neural network is trained to map a latent vector into a 3D object while ensuring that the output content has a low functional score.
At each training iteration, a latent vector z_in is sampled from the latent space and fed to the generator, which maps z_in to a 3D object O_i. The 3D object O_i is fed to the functional predictor, which predicts its functional score f_i. The network is trained to generate geometrically plausible 3D objects O_i while minimizing their functional scores f_i. The model is hence endowed with functional reasoning. The latent representation jointly includes geometric (with structure here) and functional dimensions of 3D objects. The training loss includes a term that penalizes the functional score, L_f = f_i, and an adversarial loss:
L_Train = L_f + L_GAN
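A minimal sketch of evaluating such a combined training loss, with toy linear stand-ins for the generator, the discriminator and the functional predictor (the non-saturating form of L_GAN is an assumption; a real implementation would use the actual networks and backpropagate through this loss):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def generator(z, W_g):
    """Toy linear stand-in for the 3D generative network."""
    return z @ W_g

def functional_predictor(O, w_f):
    """Toy stand-in returning a functional score in [0, 1] per object."""
    return sigmoid(O @ w_f)

def training_loss(O, w_f, w_d):
    """L_Train = L_f + L_GAN: penalize the predicted functional scores and
    add a (non-saturating) adversarial loss on the generated objects."""
    L_f = float(np.mean(functional_predictor(O, w_f)))
    L_gan = float(-np.mean(np.log(sigmoid(O @ w_d) + 1e-8)))
    return L_f + L_gan

z_in = rng.standard_normal((8, 4))   # latent vectors sampled per iteration
W_g = rng.standard_normal((4, 12))   # generator weights (being trained)
w_f = rng.standard_normal(12)        # functional predictor (frozen)
w_d = rng.standard_normal(12)        # discriminator
L = training_loss(generator(z_in, W_g), w_f, w_d)
```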
The deep-learning generative model trained according to this method synthesizes new objects, as in
Second Method of Use.
The second method of use shows an implementation wherein the deep-learning generative model consists of a mapping model followed by the 3D generative neural network, the 3D generative neural network being pre-trained, the other term including the mapping distance, and the 3D generative neural network optionally being a Variational Autoencoder. An overview of the deep-learning generative model is described in
The mapping distance is thus used for training the mapping model. This includes minimizing a loss that includes the mapping distance and penalizes the functional score:
L_map(z_out) = ∥z_out − z_in∥_2 + f_i.
The remaining models' weights (generative model and functional predictor) are frozen.
The network returns the optimal zout and the corresponding 3D object produced by the generative model.
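A minimal sketch of this latent optimization with frozen weights, assuming a smooth toy functional score (standing in for the frozen functional predictor composed with the frozen generator) and a squared variant of the mapping distance (all names, dimensions and the step size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def functional_score(z, w):
    """Toy frozen stand-in for predictor-of-generator, expressed directly
    on the latent vector: a smooth, non-negative score."""
    return float(np.sum((w * z) ** 2))

def L_map(z_out, z_in, w):
    """L_map(z_out) = ||z_out - z_in||^2 + f_i (squared mapping-distance
    variant); generator and functional predictor weights stay frozen."""
    return float(np.sum((z_out - z_in) ** 2)) + functional_score(z_out, w)

dim = 8
z_in = rng.standard_normal(dim)   # latent code of the provided object
w = rng.standard_normal(dim)      # frozen "model" parameters
z_out = z_in.copy()
lr = 0.05
for _ in range(200):
    # Gradient of L_map with respect to z_out only (frozen models).
    grad = 2.0 * (z_out - z_in) + 2.0 * (w ** 2) * z_out
    z_out -= lr * grad

loss_init = L_map(z_in, z_in, w)
loss_final = L_map(z_out, z_in, w)
```

The optimization trades off staying close to z_in against lowering the functional score, and the resulting z_out is the latent code from which the frozen generative model produces the returned 3D object.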
Number | Date | Country | Kind |
---|---|---|---|
20306099.1 | Sep 2020 | EP | regional |