METHOD FOR CONSTRUCTING DESIGN CONCEPT GENERATION NETWORK (DCGN) AND METHOD FOR AUTOMATICALLY GENERATING CONCEPTUAL SCHEME

Information

  • Patent Application
  • 20240005130
  • Publication Number
    20240005130
  • Date Filed
    March 13, 2023
  • Date Published
    January 04, 2024
Abstract
A method for constructing a design concept generation network (DCGN) and a method for automatically generating a conceptual scheme are provided. A DCGN includes a Transformer encoder, a Transformer decoder, an importance constraint matrix generation module, an importance constraint embedding layer, a cross-attention (CA) layer, and an optimization module. A word importance constraint is ingeniously introduced based on an attention mechanism of a Transformer to record input word constraint information contained in a generated text sequence. This can effectively ensure the reliability and effectiveness of a generated conceptual scheme and is conducive to capturing potential semantic importance information and implementing semantic knowledge reasoning.
Description
TECHNICAL FIELD

The present disclosure belongs to the technical field of product design, relates to the automatic generation of a conceptual scheme in product design, and particularly relates to the construction of a design concept generation network (DCGN) and automatic generation of a conceptual scheme based on the DCGN.


BACKGROUND

Innovative design is the basis of product development. As the core of innovative design, concept design determines most of the costs, quality, and performance during product development and is essential to product innovation. For example, in view of the problem of how to provide a usable water source for residents in coastal areas, a system for purifying seawater into drinking water or a design concept of using solar energy to desalinate seawater to produce canned drinking water or beverage products is proposed. As another example, when exploring a future public transportation system, a design concept of providing a personalized positioning seat service in the future public transportation system is put forward, which is beneficial for taking better care of vulnerable people. These design concepts provide designers or enterprises with design ideas in an early stage of product development and are more conducive to the generation of a product conceptual design scheme.


Prior design data is an important innovation source. As a core of innovative product concept design, conceptual scheme generation is a process of extracting valuable design knowledge from the prior design data and further transferring and reorganizing cross-field design knowledge to generate a creative conceptual scheme. With the advent of an era of big data and big knowledge, engineering data applied to concept design is increasing. This brings abundant innovation sources to the research of conceptual scheme generation. Fully applying the data to the conceptual scheme generation is beneficial to expanding design space and producing more design concepts. However, there are more severe challenges, mainly in two aspects. First, with the explosive growth of design data, the amount of knowledge applied to concept design is also gradually increasing. It is increasingly difficult to reason, transfer, and reorganize a large amount of design knowledge to produce creative conceptual schemes based on the manual experience and design heuristics of the designers. Second, design knowledge mainly comes from descriptions of existing product design schemes in different fields and is often complex and diverse with various knowledge types, such as functions, structures, scientific effects, cases, and others. In addition, an association relationship between knowledge is more complex and flexible. It is increasingly difficult to obtain valuable design knowledge based on design problems or design constraints and to combine multi-type cross-field design knowledge to generate new conceptual schemes.


As deep learning technology develops rapidly, many automatic generation technologies have been developed and have successfully completed various intelligent tasks, such as machine translation, image generation, and speech recognition. Recent deep generative models have made important breakthroughs in many aspects of engineering design, such as structure optimization, material design, and shape synthesis. There are also studies that use topology optimization and generative models, such as generative adversarial networks, to automatically generate design concepts in the form of images, spatial shapes, and the like. However, these design concepts are either too abstract to understand or too detailed, and they are not suitable for conceptual scheme design exploration in the early stage.


It is found through research that text is the most general and common form of describing design concepts and can cover rich and valuable design knowledge. How to learn potential combination rules of reasoning, transferring, and reorganizing design knowledge from massive cross-field text data through a simple and effective model and generating conceptual schemes suitable for the early stage is an important problem to be resolved in current product design.


SUMMARY

In view of the current lack of a method for automatically generating a conceptual scheme in the field of product design, an objective of the present disclosure is to provide a method for constructing a DCGN and a method for automatically generating a conceptual scheme through a DCGN. Reasoning, transfer, reorganization, and other potential rules of cross-field design knowledge can be adaptively learned from massive text data based on design problems, and conceptual schemes in a text form can be automatically generated. In this way, dependence on the manual experience of a designer is reduced, and design efficiency is improved.


An idea of the present disclosure is as follows: First, a DCGN is constructed. Then, the DCGN is trained. Finally, a design problem is inputted into a trained DCGN to automatically generate a conceptual scheme.


To achieve the foregoing objective, the present disclosure adopts the following technical solutions:


In a method for constructing a DCGN provided in the present disclosure, a word importance constraint is ingeniously introduced based on a self-attention mechanism of a Transformer network to construct a new generative network. A DCGN includes a Transformer encoder, a Transformer decoder, an importance constraint matrix generation module, an importance constraint embedding layer, a cross-attention (CA) layer, and an optimization module. In the present disclosure, training sample set data is used to train the DCGN. The training sample set data includes a plurality of samples. Each sample includes input words and a target sequence. The method for constructing a DCGN includes the following steps:

    • S1: obtaining, by the Transformer encoder, a feature of a hidden layer of the encoder based on input words in a sample;
    • S2: obtaining, by the Transformer decoder, a feature of a hidden layer of the decoder based on a target sequence in the sample;
    • S3: obtaining, by the importance constraint matrix generation module, an importance constraint matrix based on the input words and the target sequence in the sample;
    • S4: mapping, by the importance constraint embedding layer, the importance constraint matrix to a distributed vector space to obtain two input word importance embedding features;
    • S5: obtaining, by the CA layer, a generated sequence based on the feature of the hidden layer of the encoder, the feature of the hidden layer of the decoder, and the two input word importance embedding features; and
    • S6: constructing a loss function based on the generated sequence and the target sequence, and adjusting, by the optimization module, network parameters based on the loss function; and repeating S1 to S6 until the loss function meets a specified requirement to obtain the DCGN.


In S1, the Transformer encoder maps the discrete input words $x=\{x_1, x_2, \dots, x_m\} \in \mathbb{R}^{m \times n}$ (where $m$ represents the number of input words in the current sample and $n$ represents the dimension of an input word embedding vector) to a distributed feature representation through a self-attention layer to obtain the feature $h_e \in \mathbb{R}^{m \times d}$ of the hidden layer of the encoder (where $d$ represents the number of neurons of the hidden layer; in the present disclosure, the hidden layers of the Transformer encoder and the Transformer decoder are designed to have the same number of neurons).






$$h_e = \mathrm{SA}(W_e^K x,\ W_e^V x,\ W_e^Q x) \tag{1}$$


where $\mathrm{SA}(\cdot)$ represents self-attention, and $W_e^K$, $W_e^V$, and $W_e^Q$ represent weight matrices of the self-attention layer of the Transformer encoder. Because the input words $x$ are discrete and unordered, no position embedding is incorporated when $h_e$ is calculated, and the output $h_e$ does not contain any position information. When the number of rows $m$ of the calculated $h_e$ is less than $M$, zero vectors are used for padding such that $h_e \in \mathbb{R}^{M \times d}$, where $M \ge m > 1$ and $M$ represents the maximum number of input words contained in a sample of the entire training sample set.


In S2, the Transformer decoder maps a target sequence $y_{:t-1}=[y_0, y_1, \dots, y_{t-1}]$ at a moment $t-1$ to a distributed feature representation through a self-attention layer to obtain the feature $h_d^t$ of the hidden layer of the decoder.






$$h_d^t = \mathrm{SA}(W_d^K y_{:t-1},\ W_d^V y_{:t-1},\ W_d^Q y_{:t-1}) \tag{2}$$


where $\mathrm{SA}(\cdot)$ represents self-attention; $W_d^K$, $W_d^V$, and $W_d^Q$ represent weight matrices of the self-attention layer of the Transformer decoder; and $y_{:t-1}$ represents the target sequence at a moment $t-1$ during training.


An SA( ) function in formulas (1) and (2) may be calculated by using the following formula:







$$\mathrm{SA}(K, V, Q) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d}}\right) V.$$






For the encoder, $K$ denotes $W_e^K x$, $V$ denotes $W_e^V x$, and $Q$ denotes $W_e^Q x$. For the decoder, $K$ denotes $W_d^K y_{:t-1}$, $V$ denotes $W_d^V y_{:t-1}$, and $Q$ denotes $W_d^Q y_{:t-1}$.
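The SA function can be illustrated with a minimal NumPy sketch. It is not the disclosed implementation: it assumes a single attention head, a row-per-token layout (so projections are written `x @ W` rather than $Wx$), and projection matrices of shape $(n, d)$. The names `softmax`, `self_attention`, and `encoder_hidden` are hypothetical.

```python
import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)

def self_attention(K: np.ndarray, V: np.ndarray, Q: np.ndarray) -> np.ndarray:
    """SA(K, V, Q) = softmax(Q K^T / sqrt(d)) V, with one token per row."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)   # (tokens, tokens) similarity scores
    return softmax(scores) @ V      # weighted sum of the value rows

def encoder_hidden(x, W_K, W_V, W_Q):
    """Formula (1) under the row-per-token assumption: x is (m, n), each W is (n, d)."""
    return self_attention(x @ W_K, x @ W_V, x @ W_Q)
```

Because no position embedding is added to `x`, permuting the rows of `x` only permutes the rows of the result, which matches the statement that $h_e$ carries no position information.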


In S3, the importance constraint matrix in the present disclosure is represented by $C$, which is a function of the input word information and the target sequences $\{y_{:t}\}_{t=0}^{T}$ at different moments and can be expressed as follows:






$$C = F(x, \bar{w}, y_0, y_{:1}, y_{:2}, \dots, y_{:T}) = [f(x,\bar{w},y_0);\ f(x,\bar{w},y_{:1});\ \dots;\ f(x,\bar{w},y_{:t});\ \dots;\ f(x,\bar{w},y_{:T})] \tag{3}$$


where $y_0$ represents a given sequence at a start moment, which may be generated by using a special character, such as <EOS>; $f(x,\bar{w},y_{:t})$ represents the input word importance constraint vector $C_{:t}$ contained in the target sequence $y_{:t}$; $y_{:t}$ represents the target sequence up to and including the moment $t$ in the sample; and $T$ represents the length of the target sequence in the sample.

$f(x,\bar{w},y_{:t})$ is calculated as follows:






$$f(x,\bar{w},y_{:t}) = \bar{w} \cdot c_t \tag{4}$$


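where $\cdot$ represents a dot product operation of a vector or a matrix (applied element by element here, so that the result is again a vector of length $m$), and $\bar{w} = [\bar{w}_1, \bar{w}_2, \dots, \bar{w}_i, \dots, \bar{w}_m] \in \mathbb{R}^m$ represents the relative importance vector of the input words $x$ in the target sequence $y_{:t}$ and is calculated as follows: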












$$\bar{w}_i = \left[\frac{w_i - w_{\min}}{w_{\max} - w_{\min}} \times (M-1)\right], \quad i \in \{1, 2, \dots, m\} \tag{5}$$







where $\bar{w}_i$ represents the relative importance of the $i$th input word in the target sequence $y_{:t}$; $w_i$ represents the absolute importance of the $i$th input word in the target sequence $y_{:t}$; $w_{\min}$ represents the minimum absolute importance of the input words in the target sequence $y_{:t}$; $w_{\max}$ represents the maximum absolute importance of the input words in the target sequence $y_{:t}$; and $[\,\cdot\,]$ represents a rounding operation.


The relative importance $\bar{w}_i \in \{0, 1, \dots, M-1\}$ obtained after the foregoing normalization process is an integer.
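Formula (5) is a min-max scaling to the integer range $0$ to $M-1$. The following is a minimal sketch under two assumptions that the text does not settle: rounding is taken as rounding half up, and the case in which all absolute importance values coincide (where the denominator is zero) is treated as an error. The function name `relative_importance` is hypothetical.

```python
import math

def relative_importance(w, M):
    """Map absolute importance values w to integers in {0, ..., M-1} per formula (5)."""
    w_min, w_max = min(w), max(w)
    if w_max == w_min:
        # Assumption: the source does not say how formula (5) is evaluated
        # when the denominator w_max - w_min is zero, so this sketch refuses.
        raise ValueError("all importance values are equal; formula (5) is undefined")
    scale = (M - 1) / (w_max - w_min)
    # floor(x + 0.5) implements round-half-up (an assumed rounding rule)
    return [math.floor((w_i - w_min) * scale + 0.5) for w_i in w]
```

The most important word always maps to $M-1$ and the least important word to $0$, so the resulting integers can be used directly as row indices in a table with $M$ rows.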


$c_t \in \mathbb{R}^m$ represents the input word constraint contained in the target sequence $y_{:t}$. When the target sequence $y_{:t}$ contains the $i$th input word, the $i$th element of the vector $c_t$ is 1; otherwise, it is 0. The vector is calculated as follows:










$$c_t = \left(c_t^i\right)_{i=1}^{m}, \qquad c_t^i = \begin{cases} 0, & \text{if } x_i \notin y_{:t} \\ 1, & \text{if } x_i \in y_{:t} \end{cases} \tag{6}$$







$f(x,\bar{w},y_{:t})$ calculated by using formula (4) is an integer vector of the relative importance.
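Formulas (3), (4), and (6) can be sketched together in plain Python. This sketch assumes that the target sequence is already split into word tokens, that an input word counts as contained only on an exact token match, and that the start symbol is stored as the first token. The function names are hypothetical.

```python
def constraint_vector(input_words, prefix):
    """Formula (6): entry i is 1 if the i-th input word occurs in the prefix y_{:t}, else 0."""
    seen = set(prefix)
    return [1 if word in seen else 0 for word in input_words]

def importance_constraint_matrix(input_words, rel_importance, target):
    """Formula (3): column t is f(x, w_bar, y_{:t}), the element-wise product of formula (4)."""
    columns = []
    for t in range(len(target)):
        c_t = constraint_vector(input_words, target[: t + 1])
        columns.append([w * c for w, c in zip(rel_importance, c_t)])
    # Transpose so that rows are input words and columns are time steps.
    return [list(row) for row in zip(*columns)]

# Usage with illustrative values: relative importance [4, 2, 0] for three keywords.
C = importance_constraint_matrix(
    ["sensor", "device", "sowing"],
    [4, 2, 0],
    ["<EOS>", "a", "sensor", "device", "for"],
)
# C[0] is the row for "sensor", C[1] for "device", C[2] for "sowing".
```

Once an input word has appeared, its entry stays at its relative importance for all later columns, because each column is computed from the whole prefix rather than from the latest token alone.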


In S4, two new importance constraint embedding matrices $W_c^K \in \mathbb{R}^{M \times d}$ and $W_c^V \in \mathbb{R}^{M \times d}$ are introduced in the present disclosure. The constructed importance constraint matrix $C$ is mapped to the distributed vector space to obtain the two input word importance embedding features $h_{:t}^K$ and $h_{:t}^V$. At a moment $t$ during generation, the features are as follows:






$$h_{:t}^K = W_c^K(C_{:t-1}) = W_c^K[f(x,\bar{w},y_{:t-1})] \tag{7}$$

$$h_{:t}^V = W_c^V(C_{:t-1}) = W_c^V[f(x,\bar{w},y_{:t-1})] \tag{8}$$


where $t \in \{1, 2, \dots, T\}$. In formulas (7) and (8), the corresponding rows of the importance constraint embedding matrices $W_c^K$ and $W_c^V$ are indexed based on the relative importance $f(x,\bar{w},y_{:t-1})$, and a default row is zeroed, to obtain the features $h_{:t}^K, h_{:t}^V \in \mathbb{R}^{M \times t \times d}$.
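The row indexing in formulas (7) and (8) behaves like an embedding lookup. The following NumPy sketch assumes that the "default row" is the row for relative importance 0, which is an interpretation rather than something the text states, and that $C_{:t-1}$ is stored as an integer array of shape $(M, t)$. The function name and the `zero_index` parameter are hypothetical.

```python
import numpy as np

def importance_embedding(C: np.ndarray, W_c: np.ndarray, zero_index: int = 0) -> np.ndarray:
    """Look up one row of W_c (shape (M, d)) for every integer entry of C (shape (M, t))."""
    table = W_c.copy()        # leave the caller's matrix untouched
    table[zero_index] = 0.0   # assumed default row: importance 0 maps to the zero vector
    return table[C]           # integer-array indexing: (M, t) -> (M, t, d)
```

With this convention, input words that have not yet appeared in the sequence, and padded positions, contribute a zero vector, so they leave the keys and values of the attention layer unchanged.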


In S5, the CA layer fuses the feature $h_e$ of the hidden layer of the encoder, the feature $h_d^t$ of the hidden layer of the decoder, and the two input word importance embedding features (preferably, the two input word importance embedding features $h_{:t}^K$ and $h_{:t}^V$ in the present disclosure) to obtain a generated sequence $y_{:t}^o$ at a moment $t$.






$$y_{:t}^o = \mathrm{CA}(W_d^K h_e,\ W_d^V h_e,\ h_{:t}^K,\ h_{:t}^V,\ W_d^Q h_d^t) \tag{9}$$


where $W_d^K$, $W_d^V$, and $W_d^Q$ represent weight matrices of a self-attention layer of the decoder.


In a specific implementation, a jth element in a CA function may be expressed as follows:












$$\mathrm{CA}(q, k, h^k, h^v, v)_j = \sum_{i=0}^{M-1} \alpha_{ij}\left(v + h_{ij}^v\right) \tag{10}$$

where

$$q = W_d^K h_e; \quad k = W_d^V h_e; \quad v = W_d^Q h_d^t;$$

$$\alpha_{ij} = \mathrm{softmax}(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{l=0}^{d-1} \exp(e_{il})};$$

$$e_{ij} = \frac{q_j \left(k_i + h_{ij}^k\right)^{T}}{\sqrt{d}}; \quad i = 0, 1, \dots, M-1; \quad j \text{ and } l = 0, 1, \dots, d-1.$$
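The effect of adding the importance embeddings to the keys and values can be illustrated with a NumPy sketch. It deliberately departs from the letter of formula (10) and should be read as an assumption-laden illustration only: queries are taken from the decoder positions, keys and values from the $M$ (padded) input words, and the softmax normalizes over the input words. The function name and argument layout are hypothetical.

```python
import numpy as np

def constrained_cross_attention(q, k, v, h_k, h_v):
    """
    q:        (t, d) one query per decoder position (assumed role)
    k, v:     (M, d) one key and one value per input word (assumed role)
    h_k, h_v: (M, t, d) importance embeddings per input word and decoder position
    returns:  (t, d) one fused feature per decoder position
    """
    d = q.shape[-1]
    keys = k[:, None, :] + h_k      # (M, t, d): keys shifted by the importance embedding
    values = v[:, None, :] + h_v    # (M, t, d): values shifted by the importance embedding
    # e[i, j] = q_j . (k_i + h_k[i, j]) / sqrt(d)
    e = np.einsum("jd,ijd->ij", q, keys) / np.sqrt(d)
    e = e - e.max(axis=0, keepdims=True)            # numerical stability
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=0, keepdims=True)  # normalize over input words i
    return np.einsum("ij,ijd->jd", alpha, values)
```

When `h_k` and `h_v` are all zeros, the sketch reduces to ordinary scaled dot-product cross-attention, which makes the role of the two embedding features easy to isolate in experiments.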





Over time, S2 to S5 are repeated. When $t=T$, the DCGN obtains a final generated text sequence $y_{:T}^o$.


For samples in the training sample set, S1 to S5 are repeated to obtain generated sequences corresponding to the samples.


In S6, for $N$ given samples $\{x^{(n)}, y^{(n)}\}_{n=1}^{N}$, the loss function of the DCGN constructed based on the generated sequence and the target sequence is as follows:











$$\mathrm{Loss}_{DCGN} = \frac{1}{TN} \sum_{n=1}^{N} \sum_{t=1}^{T} \mathrm{err}\left(y_{:t}^o, y_{:t}\right) \tag{11}$$







where $\mathrm{err}(y_{:t}^o, y_{:t})$ represents the error between the generated sequence $y_{:t}^o$ and the target sequence $y_{:t}$ at a moment $t$ and is usually calculated through cross-entropy.
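Formula (11) can be sketched in NumPy as follows. The sketch assumes that the error at a moment $t$ is the cross-entropy of the token predicted at that moment, that all samples share the same length $T$, and that the network output is available as a $(T, \text{vocabulary})$ array of probabilities per sample. The function names are hypothetical.

```python
import numpy as np

def cross_entropy(probs: np.ndarray, target_ids) -> float:
    """Sum over positions of -log p(target token); probs is (T, vocabulary)."""
    picked = probs[np.arange(len(target_ids)), target_ids]
    return float(-np.log(picked).sum())

def dcgn_loss(batch_probs, batch_targets) -> float:
    """Formula (11): per-token error averaged over N samples of common length T."""
    N = len(batch_probs)
    T = len(batch_targets[0])
    total = sum(cross_entropy(p, y) for p, y in zip(batch_probs, batch_targets))
    return total / (T * N)
```

As a sanity check, a model that assigns uniform probability over a vocabulary of size $V$ yields a loss of $\ln V$ regardless of the targets.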


The network parameters are adjusted and optimized based on the loss function by using an Adam optimization algorithm. Then, S1 to S6 are repeated until the loss function meets the specified requirement, for example, the loss function tends to be stable and basically unchanged, to complete the construction of the DCGN. The network parameters are mainly the weight matrices of the self-attention layer of the encoder that are used to obtain the feature of the hidden layer of the encoder, the weight matrices of the self-attention layer of the decoder that are used to obtain the feature of the hidden layer of the decoder, and the importance constraint embedding matrices. Initialization parameters of the importance constraint embedding matrices may be obtained through random initialization. Initialization parameters of the weight matrices of the self-attention layer of the encoder that are used to obtain the feature of the hidden layer of the encoder and the weight matrices of the self-attention layer of the decoder that are used to obtain the feature of the hidden layer of the decoder may be obtained through random initialization. In a preferred implementation, a common knowledge text database is used to train a conventional Transformer network (such as Text-to-Text Transfer Transformer (T5) or Generative Pre-trained Transformer (GPT)) to obtain the initialization parameters of the weight matrices of the self-attention layer of the encoder and the weight matrices of the self-attention layer of the decoder. In this way, the DCGN provided in the present disclosure can understand common knowledge, and fluency of a design concept generated by the DCGN is ensured. The DCGN is further trained by using the method provided in the present disclosure. The DCGN can perform intelligent reasoning on engineering design knowledge to ensure the reasonableness of the generated design concept.


The present disclosure further provides a method for automatically generating a conceptual scheme. A constructed DCGN is used to perform the following steps:

    • L1: obtaining, by a Transformer encoder, a feature of a hidden layer of the encoder based on input words;
    • L2: obtaining, by a Transformer decoder, a feature of a hidden layer of the decoder at a moment t based on a generated sequence at a moment t−1;
    • L3: obtaining, by an importance constraint matrix generation module, an importance constraint matrix based on the input words in a sample and the generated sequence at the moment t−1;
    • L4: mapping, by an importance constraint embedding layer, the importance constraint matrix to a distributed vector space to obtain two input word importance embedding features; and
    • L5: obtaining, by a CA layer, a generated sequence based on the feature of the hidden layer of the encoder, the feature of the hidden layer of the decoder, and the two input word importance embedding features.


In L1, the input words may be keywords constructed based on a design problem, at least one design incentive, keywords constructed based on design requirements, or a combination of at least two of the foregoing input word sources.


In L2, the feature $h_d^t$ of the hidden layer of the decoder at the moment $t$ is calculated based on the generated sequence at the moment $t-1$ by using the following formula:






$$h_d^t = \mathrm{SA}(W_d^K y_{:t-1},\ W_d^V y_{:t-1},\ W_d^Q y_{:t-1}) \tag{12}$$


$y_{:t-1}$ represents the input sequence of the decoder at a moment $t$ during generation, $y_{:t-1} = [y_0^o, y_{:t-1}^o]$; $y_0^o$ represents a given sequence at a start moment and may be denoted by a special character such as <EOS>; and $y_{:t-1}^o$ represents the generated sequence at the moment $t-1$.


In L3, during the generation of a conceptual scheme, the constraint matrix is calculated based on a time step and an actual sequence generated at each moment.


The importance constraint matrix $C_{:t-1}$ is calculated based on the input words in the sample and the generated sequence at the moment $t-1$ by using the following formula:






$$C_{:t-1} = f(x, \bar{w}, y_{:t-1}) \tag{13}$$


$x$ represents the input words. $y_{:t-1}$ represents the input sequence of the decoder at the moment $t$ during the generation. $\bar{w} = [\bar{w}_1, \bar{w}_2, \dots, \bar{w}_i, \dots, \bar{w}_m] \in \mathbb{R}^m$ represents the relative importance vector of the input words $x$ in the input sequence $y_{:t-1}$ of the decoder and may be calculated by using formula (5). The absolute importance of the input words in the input sequence $y_{:t-1}$ of the decoder may be preset based on the importance order of the input words or may be set to be the same.


In L4, the two input word importance embedding features $h_{:t}^K$ and $h_{:t}^V$ at the moment $t$ are calculated by using formulas (7) and (8).


In L5, a generated sequence at the moment t is calculated by using formulas (9) and (10).


L1 to L5 are repeated until a length of the generated sequence meets a specified requirement or the end identifier <EOS> is generated to obtain a final generated sequence, namely, the conceptual scheme.
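The loop over L1 to L5 can be sketched as a plain autoregressive decoding routine. The network itself is abstracted behind a caller-supplied function `next_word`, which is hypothetical and stands in for the encoder, the decoder, the embedding layer, and the CA layer; only the bookkeeping of the sequence and of the latest constraint column is shown, and the names and the `max_len` default are assumptions.

```python
def generate(input_words, rel_importance, next_word, max_len=20, eos="<EOS>"):
    """Repeat the generation steps until <EOS> is produced or max_len words exist."""
    sequence = [eos]  # y_0^o: the start symbol
    while len(sequence) - 1 < max_len:
        seen = set(sequence)
        # Latest column of the constraint matrix, f(x, w_bar, y_{:t-1}) as in formula (13):
        # relative importance for input words already present, 0 for the others.
        constraint = [w if x in seen else 0 for x, w in zip(input_words, rel_importance)]
        word = next_word(input_words, sequence, constraint)  # stands in for the network
        if word == eos:
            break
        sequence.append(word)
    return sequence[1:]  # drop the start symbol
```

A real `next_word` would run the trained DCGN and pick a token from its output distribution (for example greedily or by sampling); that choice is outside what the text specifies.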


The present disclosure has the following beneficial effects over the prior art:


(1) The present disclosure ingeniously introduces a word importance constraint based on an attention mechanism of Transformer to construct a new DCGN.


(2) The importance constraint matrix proposed in the present disclosure records input word constraint information contained in a generated text sequence. The reliability and effectiveness of the generated conceptual scheme can be effectively ensured.


(3) The importance constraint embedding layer proposed in the present disclosure maps the constructed importance constraint matrix to the distributed vector space. Continuous real-number vectors are used to represent the relative importance of the input words in the generated sequence or the target sequence. This is conducive to capturing potential semantic importance information and implementing semantic knowledge reasoning.


(4) The CA layer constructed in the present disclosure maps the input word importance embedding features to the generated sequence to supervise the generation of a text sequence containing input word importance information.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a schematic diagram of a framework for constructing and using a DCGN according to an embodiment of the present disclosure;



FIG. 2 is a schematic principle diagram of a method for constructing a DCGN according to an embodiment of the present disclosure; and



FIG. 3 is a schematic diagram of a conceptual scheme generation process according to an embodiment of the present disclosure.





DETAILED DESCRIPTION OF THE EMBODIMENTS

The technical solutions in the embodiments of the present disclosure are clearly and completely described below by referring to the accompanying drawings. The described embodiments are merely some, rather than all, of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present disclosure without creative efforts shall fall within the present disclosure.


Embodiment 1

As shown in FIG. 1, text data is obtained through a web crawler technology and preprocessed. Subsequently, a DCGN model is constructed and trained. Finally, a conceptual scheme is automatically generated by inputting design keywords as input words into a trained DCGN model.


In this embodiment, a conventional web crawler technology is used to crawl massive text data, such as scientific papers and patents, from websites, and the acquired text data is filtered to obtain sentences with a specific length as a corpus for this study. Then, the text data is preprocessed, and a keyword extraction algorithm is used to extract a specific number of keywords (excluding stop words) and their importance from each sentence. Finally, each sentence and its corresponding keyword information are combined into a sample pair, and a sample set consisting of all sample pairs is constructed for subsequent network training. In each sample, the extracted keywords are used as an input sequence, and the corresponding sentence is used as a target sequence.
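The construction of sample pairs can be sketched as follows. The embodiment does not name the keyword extraction algorithm, so this sketch swaps in a simple frequency-based score (term count weighted by inverse sentence frequency) purely as a hypothetical stand-in; the stop-word list, the length limits, and all function names are likewise illustrative assumptions.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list only; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "for", "while", "to", "and", "in", "is"}

def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())

def build_samples(sentences, num_keywords=3, min_len=5, max_len=40):
    """Filter sentences by length and pair each with its top-scoring keywords."""
    kept = [s for s in sentences if min_len <= len(tokenize(s)) <= max_len]
    doc_freq = Counter(w for s in kept for w in set(tokenize(s)))
    samples = []
    for s in kept:
        counts = Counter(w for w in tokenize(s) if w not in STOP_WORDS)
        # Stand-in importance score: count times log-scaled inverse sentence frequency.
        scores = {w: c * math.log(1 + len(kept) / doc_freq[w]) for w, c in counts.items()}
        top = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:num_keywords]
        samples.append({
            "input_words": [w for w, _ in top],   # input sequence of the sample
            "importance": [sc for _, sc in top],  # absolute importance of each keyword
            "target": s,                          # target sequence of the sample
        })
    return samples
```

Each resulting dictionary corresponds to one sample pair: the keywords and their importance values form the input side, and the original sentence is the target sequence.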


(I) Construction of a DCGN


In this embodiment, a word importance constraint is ingeniously introduced based on a self-attention mechanism of a Transformer network to construct a new generation network. A DCGN includes a Transformer encoder, a Transformer decoder, an importance constraint matrix generation module, an importance constraint embedding layer, a CA layer, and an optimization module. The Transformer encoder is configured to obtain a feature of a hidden layer of the encoder. The Transformer decoder is configured to obtain a feature of a hidden layer of the decoder. The importance constraint matrix generation module is configured to generate an importance constraint matrix. The importance constraint embedding layer is configured to map the importance constraint matrix to a distributed vector space to obtain two input word importance embedding features. The CA layer is configured to obtain a generated sequence. The optimization module is configured to optimize network parameters based on a loss function.


In a method for constructing a DCGN provided in this embodiment, the sample set is used for training to obtain weight matrices of a self-attention layer of the encoder that are used to obtain the feature of the hidden layer of the encoder, weight matrices of a self-attention layer of the decoder that are used to obtain the feature of the hidden layer of the decoder, and two importance constraint embedding matrices.


In this embodiment, a common knowledge text database (selected from Wikipedia) is used to train a conventional Transformer network (T5) to obtain initialization parameters of the weight matrices of the self-attention layer of the encoder that are used to obtain the feature of the hidden layer of the encoder and the weight matrices of the self-attention layer of the decoder that are used to obtain the feature of the hidden layer of the decoder. Initialization parameters of the two importance constraint embedding matrices are obtained through random initialization.


1. The T5 network is trained by using the common knowledge text database.


The T5 network is trained by using the common knowledge text database to obtain the weight matrices ($W_e^K$, $W_e^V$, and $W_e^Q$) of the self-attention layer of the encoder and the weight matrices ($W_d^K$, $W_d^V$, and $W_d^Q$) of the self-attention layer of the decoder. Interpretation of the encoder and the decoder is as described above. A specific process of training the T5 network can be found in literature, such as “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (Colin Raffel et al., Journal of Machine Learning Research 21 (2020) 1-67)”. The weight matrices ($W_e^K$, $W_e^V$, and $W_e^Q$) of the self-attention layer of the encoder and the weight matrices ($W_d^K$, $W_d^V$, and $W_d^Q$) of the self-attention layer of the decoder in the trained T5 network are used as initialization parameters of the DCGN in the present disclosure.


2. The DCGN is constructed.


As shown in FIG. 2, the method for constructing a DCGN provided in this embodiment includes the following steps:


S1: The feature of the hidden layer of the encoder is obtained by the Transformer encoder based on input words in a sample.


In this step, the feature $h_e$ of the hidden layer of the encoder is calculated based on the input words $x=\{x_1, x_2, \dots, x_m\}$ in the sample by using formula (1).


S2: The feature of the hidden layer of the decoder is obtained by the Transformer decoder based on a target sequence in the sample.


In this step, a feature $h_d^t$ of the hidden layer of the decoder at a moment $t$ is calculated based on the target sequence $y_{:t-1}=[y_0, y_1, \dots, y_{t-1}]$ in the sample by using formula (2).


S3: The importance constraint matrix is obtained by the importance constraint matrix generation module based on the input words and the target sequence in the sample.


The importance constraint matrix C is determined by using formula (3).


An importance constraint matrix $C_{:t-1}=f(x,\bar{w},y_{:t-1})$ at a moment $t-1$ is calculated based on the input words and the target sequence $y_{:t-1}$ at the moment $t-1$ by using formulas (4) to (6).


The following uses a specific example to describe a detailed process of calculating $C$ during training of the DCGN. It is assumed that the input of the DCGN is a set of three keywords {“sensor”, “device”, “sowing”}, and the generated target sequence is “a sensor device for determining a position of seeds while sowing.” It is assumed that the absolute importance of the input words in the target sequence is $w=[0.9, 0.7, 0.5]$ and $M=5$. The relative importance vector $\bar{w}=[4, 2, 0]$ may be calculated by using formula (5) and represents the relative importance of the three input words in the target sequence. The following describes a procedure for calculating $C$, as shown in Table 1.


(a) The target sequence at the moment when the start identifier <EOS> is generated does not contain any input word. Therefore, $c_0$ is an all-zero vector at this moment. $C_0=f(x,\bar{w},y_0)$ calculated by using formula (4) is also an all-zero vector, corresponding to the values in the first column in Table 1.


(b) The second generated target word is “a”, and the target sequence at this moment still does not contain any input word. Therefore, $c_1$ is an all-zero vector at this moment. $C_{:1}=f(x,\bar{w},y_{:1})$ calculated by using formula (4) is also an all-zero vector, corresponding to the values in the second column in Table 1.


(c) The third generated target word is “sensor”, and the target sequence at this moment contains only “sensor” among the input words. Therefore, $c_2=[1;0;0]$. $C_{:2}=f(x,\bar{w},y_{:2})=[4;0;0]$ may be calculated by using formula (4), corresponding to the values in the third column in Table 1.


(d) The fourth generated target word is “device”, and the target sequence at this moment contains “sensor” and “device” among the input words. Therefore, $c_3=[1;1;0]$. $C_{:3}=f(x,\bar{w},y_{:3})=[4;2;0]$ may be calculated by using formula (4), corresponding to the values in the fourth column in Table 1.


(e) The rest may be deduced by analogy until an end identifier <EOS> is generated.









TABLE 1
Process of generating C during the construction of the DCGN

Keyword   Importance   <EOS>   a   sensor   device   . . .   sowing   <EOS>
sensor    0.9          0       0   4        4        . . .   4        4
device    0.7          0       0   0        2        . . .   2        2
sowing    0.5          0       0   0        0        . . .   0        0

$\bar{w} = [4, 2, 0]$. From left to right, the value columns are $f(x,\bar{w},y_0)$, $f(x,\bar{w},y_{:1})$, $f(x,\bar{w},y_{:2})$, $f(x,\bar{w},y_{:3})$, . . . , $f(x,\bar{w},y_{:11})$, and $f(x,\bar{w},y_{:12})$; the first four are the 1st, 2nd, 3rd, and 4th columns referred to in steps (a) to (d).
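The numbers in this example can be checked by hand. As a sketch, assuming the rounding in formula (5) is rounding to the nearest integer, with $w_{\min}=0.5$, $w_{\max}=0.9$, and $M=5$:

$$\bar{w}_1 = \left[\frac{0.9-0.5}{0.9-0.5}\times 4\right] = 4, \quad \bar{w}_2 = \left[\frac{0.7-0.5}{0.9-0.5}\times 4\right] = 2, \quad \bar{w}_3 = \left[\frac{0.5-0.5}{0.9-0.5}\times 4\right] = 0$$

Applying formula (4) with the constraint vectors $c_0=c_1=[0;0;0]$, $c_2=[1;0;0]$, and $c_3=[1;1;0]$ then gives the first four columns of the importance constraint matrix:

$$[f(x,\bar{w},y_0);\ f(x,\bar{w},y_{:1});\ f(x,\bar{w},y_{:2});\ f(x,\bar{w},y_{:3})] = \begin{bmatrix} 0 & 0 & 4 & 4 \\ 0 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

The row for “sowing” stays at 0 even after the word appears, because its relative importance $\bar{w}_3$ is 0.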









S4: The importance constraint matrix is mapped by the importance constraint embedding layer to the distributed vector space to obtain the two input word importance embedding features.


In this step, the two input word importance embedding features $h_{:t}^K$ and $h_{:t}^V$ at the moment $t$ are calculated by using formulas (7) and (8).


S5: The generated sequence is obtained by the CA layer based on the feature of the hidden layer of the encoder, the feature of the hidden layer of the decoder, and the two input word importance embedding features.


In this step, a generated sequence $y_{:t}^o$ at the moment $t$ is calculated by using formulas (9) and (10).


Over time, S2 to S5 are repeated. When $t=T$, the DCGN obtains a final generated text sequence $y_{:T}^o$.


For the $N$ given samples $\{x^{(n)}, y^{(n)}\}_{n=1}^{N}$ in a training sample set, S1 to S5 are repeated to obtain generated sequences corresponding to the $N$ samples.


S6: The loss function is constructed based on the generated sequence and the target sequence, and the network parameters are adjusted based on the loss function. S1 to S6 are repeated until the loss function meets a specified requirement to obtain the DCGN.


In this step, for the given N samples, the loss function of the DCGN is calculated by using formula (11). The network parameters are adjusted and optimized based on the loss function by using a conventional Adam optimization algorithm. Then, S1 to S6 are repeated until the loss function meets the specified requirement, for example, the loss function tends to be stable and basically unchanged, to complete the construction of the DCGN.


After sufficient training, the DCGN has capabilities of knowledge expression and reasoning and can adaptively extract, transfer, and reorganize cross-field design knowledge. In this stage, relevant design concept descriptions can be automatically generated by inputting well-defined design problems, valuable knowledge incentives, or the like into the trained DCGN. The DCGN combines the design knowledge from different fields such that a generated design concept contains input design information, and novelty and inventiveness of the generated design concept are ensured.


(II) Testing of the DCGN


The effectiveness and practicality of a proposed method for automatically generating a conceptual scheme are tested by inputting design problems (namely, keywords) as follows.


In the method for automatically generating a conceptual scheme provided in this embodiment, the constructed DCGN is used to perform the following steps:


L1: The feature of the hidden layer of the encoder is obtained by the Transformer encoder based on the input words.


In this step, the feature $h_e$ of the hidden layer of the encoder is calculated by using formula (1).


L2: A feature of the hidden layer of the decoder at a moment t is obtained by the Transformer decoder based on a generated sequence at a moment t−1.


In this step, the feature $h_d^{t}$ of the hidden layer of the decoder is calculated by using formula (12).


L3: The importance constraint matrix is obtained by the importance constraint matrix generation module based on the input words in the sample and the generated sequence at the moment t−1.


In this step, the importance constraint matrix $C_{:t-1}$ is calculated by using formula (13).


In this embodiment, the absolute importance of input words in the input sequence $y_{:t-1}$ of the decoder is set to be the same, and the value of each $w_i$ is 1.


L4: The importance constraint matrix is mapped by the importance constraint embedding layer to the distributed vector space to obtain the two input word importance embedding features.


In this step, the two input word importance embedding features $h_{:t}^{K}$ and $h_{:t}^{V}$ at the moment t are calculated by using formulas (7) and (8).


L5: The generated sequence is obtained by the CA layer based on the feature of the hidden layer of the encoder, the feature of the hidden layer of the decoder, and the two input word importance embedding features.


In this step, a generated sequence at the moment t is calculated by using formulas (9) and (10).


L1 to L5 are repeated until a length of the generated sequence meets a specified requirement or the end identifier <EOS> is generated to obtain a final generated sequence, namely, the conceptual scheme.
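As an illustration of the attention computation behind L1 and L2, the following sketch evaluates a function of the form SA(K, V, Q) as in formulas (1) and (12). It assumes that SA( ) is the standard scaled dot-product attention, which this section does not state explicitly; the toy matrices, the row-vector convention (x W instead of W x), and all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    peak = max(row)                      # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(keys, values, queries):
    """Scaled dot-product attention; argument order follows SA(K, V, Q)."""
    d = len(keys[0])
    keys_t = [list(col) for col in zip(*keys)]
    scores = matmul(queries, keys_t)     # one row of scores per query
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, values), weights

# Toy input: m = 3 input words embedded in d = 2 dimensions.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_k = [[1.0, 0.0], [0.0, 1.0]]
W_v = [[0.5, 0.0], [0.0, 0.5]]
W_q = [[1.0, 1.0], [0.0, 1.0]]

h_e, weights = attention(matmul(x, W_k), matmul(x, W_v), matmul(x, W_q))
```

Each row of `weights` is a probability distribution over the input words, and `h_e` holds one feature vector per input word.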


Therefore, in the stage of generating the specific conceptual scheme, output words at the moment t−1 are used as a new part of the input at the moment t, and new words are generated in turn until the end identifier <EOS> is generated. The process is shown in FIG. 3. Let $x$ = {drone, deliver, life, preserver} be the input words, and let the special character <EOS> represent the generated sequence $y_0^{o}$ at the start moment. L1 to L5 are repeated until the end identifier <EOS> is generated to obtain a generated sequence $y_{:T}^{o}$ = {the drone delivers the life preserver to a . . . }.
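The word-by-word loop described above can be sketched as follows. The trained DCGN is replaced by a hypothetical stub, `canned_next_word`, which simply replays the visible prefix of the drone example; the function names and the `max_len` limit are assumptions made for illustration.

```python
EOS = "<EOS>"

def generate(input_words, next_word, max_len=20):
    """Generate words one at a time until <EOS> or the length limit is reached."""
    generated = [EOS]                    # start symbol y_0^o, as in the text
    while len(generated) - 1 < max_len:
        word = next_word(input_words, generated)
        if word == EOS:                  # end identifier stops generation
            break
        generated.append(word)           # output at t-1 becomes input at t
    return generated[1:]                 # drop the start symbol

# Hypothetical stand-in for L1 to L5 of the trained network.
CANNED = "the drone delivers the life preserver".split()

def canned_next_word(input_words, prefix):
    position = len(prefix) - 1
    return CANNED[position] if position < len(CANNED) else EOS

words = ["drone", "deliver", "life", "preserver"]
full = generate(words, canned_next_word)
capped = generate(words, canned_next_word, max_len=3)
```

Here `full` ends because the stub emits `<EOS>`, while `capped` ends because the length limit is reached first, which mirrors the two stopping conditions named in the text.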


In the generation stage, C is calculated at each time step from the sequence actually generated up to that moment and does not depend on the target sequence. This differs from the training stage, in which C is calculated from the target sequence.
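The difference between the two stages can be made concrete with a small sketch. Formula (13) is not reproduced in this section, so the definition used here is an assumption: each entry of the constraint vector is the weight $w_i$ if input word i already occurs in the sequence prefix and 0 otherwise, with all $w_i$ = 1 as in this embodiment. The prefix-based word match is a crude stand-in for proper lemmatization, and both function names are hypothetical.

```python
def constraint_vector(input_words, weights, prefix):
    """Assumed form of f(x, w, y): weight of each input word seen so far, else 0."""
    return [
        w if any(token.startswith(word) for token in prefix) else 0
        for word, w in zip(input_words, weights)
    ]

def constraint_matrix(input_words, weights, sequence):
    """Stack the constraint vectors for every prefix of the sequence."""
    return [
        constraint_vector(input_words, weights, sequence[:t])
        for t in range(1, len(sequence) + 1)
    ]

x = ["drone", "deliver", "life", "preserver"]
w = [1, 1, 1, 1]                         # equal importance, as in this embodiment
sequence = ["the", "drone", "delivers", "the", "life", "preserver"]
C = constraint_matrix(x, w, sequence)
```

During training, `sequence` would be the target sequence of the sample; during generation, it would be the words produced so far, so the same function serves both stages with different inputs.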


The following describes specific examples of generating conceptual schemes from different input word sources.


1. A design problem in this example is to provide drinkable water for residents in coastal areas. To express the design problem accurately and concisely, 10 graduate students majoring in mechanical engineering are invited to define it by using a limited number of keywords. Considering the abundant sunshine in coastal areas, the design team agrees to use the keywords “purification” or “purify”, “desalination” or “desalinate”, “solar”, “seawater”, and “drink” to define the design problem. Combinations of different keywords are used as design input, and the corresponding design concepts are automatically generated by the constructed DCGN using the method for automatically generating a conceptual scheme. Results are shown in Table 2. The automatically generated design concepts are specific and feasible, for example, a system for purifying seawater into drinking water, or the use of solar energy to desalinate seawater to produce canned drinking water or beverage products. These design concepts provide residents of coastal areas or enterprises with design ideas in an early stage of product development.
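Enumerating keyword combinations as design input can be sketched with the standard library. The keyword list below takes one variant of each keyword named above; the size limits and the function name `keyword_combinations` are hypothetical, and this section does not state how the combinations were actually chosen.

```python
from itertools import combinations

KEYWORDS = ["purify", "desalinate", "solar", "seawater", "drink"]

def keyword_combinations(keywords, min_size=2, max_size=4):
    """Yield every keyword subset whose size lies in [min_size, max_size]."""
    for size in range(min_size, max_size + 1):
        for combo in combinations(keywords, size):
            yield list(combo)

design_inputs = list(keyword_combinations(KEYWORDS))
```

Each element of `design_inputs` could then be passed to the trained DCGN as the input words x. In practice a designer would likely filter out combinations that pair alternative verbs such as "purify" and "desalinate".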









TABLE 2
Automatic generation of conceptual schemes with different design problem keywords as input

| Sequence number | Design problem keywords | Automatically generated conceptual schemes |
|---|---|---|
| 1 | purification, seawater, drink, water | a) the present invention relates to a water purification system for purifying seawater, drink water, and/or distilled water. |
| | | b) the present invention relates to a water purification system for purifying seawater, drink water, and other liquids. |
| 2 | desalinate, seawater | a) the present invention relates to a method of desalinating seawater, and to an apparatus for preparing the same. |
| | | b) the present invention relates to a method of desalinating seawater, and the use of the method. |
| 3 | purify, seawater, solar | a) a method to purify seawater using solar energy is provided. |
| | | b) a method to purify seawater using solar energy is described. |
| 4 | purify, seawater, solar, drink | a) solar power is used to purify seawater from a water source for use in preparing alcoholic beverages and drinks. |
| | | b) solar power is used to purify seawater from a water source for use in preparing bottled water for drinking. |
| 5 | desalinate, seawater, solar, drink | a) the seawater is desalinated by solar energy and is used to produce water for drinking. |
| | | b) the seawater is desalinated using solar energy to produce a water drink. |









2. Design problems involved in the present disclosure may also be composed of design incentives. During product innovation concept design, design incentives provide rich and valuable design inspiration. In a conventional process of manually generating a conceptual scheme, design incentives for the conceptual scheme often rely on the rich experience and knowledge of a designer. In addition, the efficiency of generating the conceptual scheme is very low. The process becomes very difficult for inexperienced novice designers. Some obtained design incentives regarding a drone in this embodiment are shown in Table 3. Combinations of different design incentives are input to the DCGN to automatically generate conceptual schemes, as shown in Table 4. Due to a wide variety of combinations, only some valuable conceptual schemes are shown and analyzed herein. Examples:


(1) Design incentives “drone”, “bio”, “radar”, and “rescue” are combined, and the DCGN automatically generates a design concept “a drone rescue radar system is disclosed that is capable of detecting the presence of an animal in the vicinity of the drone using bio”.


(2) Design incentives “drone”, “fire”, “ground”, and “data” are combined, and the DCGN automatically generates design concepts “the drone may also be configured to receive ground fire data from the ground drone and to determine a location of the fire in response to detecting the resulting fire” and “the drone may also be configured to receive ground fire data from the ground drone and to determine a location of the fire in response to determining the terrain”.









TABLE 3
Design incentives retrieved in the drone example (in no particular order)

| Sequence number | Design incentive |
|---|---|
| 1 | drone |
| 2 | package_delivery |
| 3 | flying_time |
| 4 | wirelessly_charge |
| 5 | jointly_optimize_trajectory |
| 6 | unmanned_drone |
| 7 | fireground |
| 8 | evacuation_rescue |
| 9 | firefighting |
| 10 | fire_rescue |
| 11 | powerline_inspection |
| 12 | smoke_alarm |
| 13 | vegetation_encroachment |
| 14 | facade_cleaning |
| 15 | wall_climb |
| 16 | ultra-wideband_localization |
| 17 | intelligence_surveillance |
| 18 | military_reconnaissance |
| 19 | bio-radar |
| 20 | search_rescue |
| 21 | disaster_stricken |
| 22 | surveillance_search_rescue |
| 23 | rescue_mission |
| 24 | data_mule |
















TABLE 4
Design concepts automatically generated by combining different design incentives in the drone example

| Sequence number | Design incentive combination | Automatically generated design concept |
|---|---|---|
| 1 | drone, wirelessly, charge | the drone includes a wireless charging module that is configured to charge the battery and to provide power to the uav. |
| 2 | drone, jointly, optimize, trajectory | a) the drone may jointly optimize the trajectory of each of the plurality of drones based on the estimated trajectory. |
| | | b) the drone may jointly optimize the trajectory of each of the plurality of drones based on the first trajectory. |
| | | c) the drone may jointly optimize the trajectory of the vehicle based on the estimated trajectory. |
| 3 | drone, evacuation, rescue | a drone rescue and evacuation system and method of using the drone are provided. |
| 4 | drone, firefighting | the present disclosure relates to a firefighting drone and to systems and methods for using the same in order to control fire fighting. |
| 5 | drone, powerline, inspection | a powerline inspection drone is provided that includes at least one control unit configured to control an inspection system of the drone. |
| 6 | drone, vegetation, encroachment | the drone may also include a vegetation encroachment detection system that detects the vegetation. |
| 7 | drone, facade, cleaning | a facade cleaning drone includes at least one cleaning unit configured to clean the facade of the drone. |
| 8 | drone, wall, climb | the drone climbs up and down the wall, thereby reducing the amount of time it takes to complete the climb. |
| 9 | drone, wideband, localization | a) a wideband localization drone is provided that is capable of detecting and transmitting data to and from the uav. |
| | | b) a wideband localization drone is provided that is capable of detecting and transmitting data to and from one or more receivers. |
| | | c) a wideband localization drone is provided that is capable of detecting and transmitting data to and from one or more remote locations. |
| 10 | drone, intelligent, surveillance | an intelligent surveillance drone and a method for controlling the same are provided. |
| 11 | drone, military, reconnaissance | a) a military reconnaissance drone and method of deploying the same are provided. |
| | | b) a military reconnaissance drone and method of using the same are provided. |
| 12 | drone, bio, radar | a) the drone may also include a bio-sensing radar (bsr) module configured to detect the presence of the snr. |
| | | b) the drone may also include a bio-sensing radar (bsr) module configured to detect the presence of the sdr. |
| | | c) the drone may also include a bio-sensing radar (bsr) module configured to detect the presence of the smr. |
| 13 | drone, search, rescue | a search and rescue (srs) drone is provided that can be used to search for and locate objects in an area. |
| 14 | drone, surveillance, search, rescue | a drone for search, rescue, surveillance, and/or surveillance is provided. |
| 15 | drone, rescue, mission | the drone may include a rescue mission controller configured to receive data associated with the mission. |
| 16 | drone, fireground, data | a) the drone may also include a fireground sensor configured to receive fire ground data and to generate fire surface data. |
| | | b) the drone may also include a fireground sensor configured to generate fire ground data. |
| | | c) the drone may also include a fireground sensor configured to receive fire ground data. |
| 17 | drone, fire, ground, data | a) the drone may also be configured to receive ground fire data from the ground drone and to determine a location of the fire in response to detecting the resulting fire. |
| | | b) the drone may also be configured to receive ground fire data from the ground drone and to determine a location of the fire in response to determining the terrain. |
| 18 | drone, fireground, smoke, alarm | a fireground smoke alarm system for use with drones is disclosed herein for detecting fires. |
| 19 | drone, bio, radar, rescue | a drone rescue radar system is disclosed that is capable of detecting the presence of an animal in the vicinity of the drone using bio. |









3. To further supplement the design problems involved, some design problems may be defined based on design requirements. In an early stage of product design, the design requirements are critical to determining the design direction of a new product. Online product review data provides accurate, reliable, and truthful information for analyzing the design requirements and is easy to access. The text of 20,918 user comments on a bottle sterilization cabinet is extracted from an e-commerce platform by using conventional crawler technology. Keywords and their word frequencies are obtained through the data preprocessing described above. The 30 most frequent keywords are shown in Table 5. The analysis shows that users mainly express clear requirements in terms of function, disinfection, capacity, temperature, and the like. To turn the design requirements into design problems, the keywords “disinfection” or “sterilization”, “temperature”, “function”, and “capacity” are used as the design problem input of the DCGN. Automatically generated conceptual schemes are shown in Table 6. It can be seen that different conceptual schemes are generated with different combinations of input keywords. More importantly, all automatically generated conceptual schemes contain the input design problem keywords, and some feasible and creative conceptual schemes are generated, such as using an ion exchanger to improve sterilization and disinfection capabilities. The design requirements are met to some extent.
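The keyword-frequency analysis can be sketched as a simple token count. The three reviews below are invented toy data and not the crawled comments; the regular-expression tokenizer assumes English text (the original comments would presumably need word segmentation first), and the stopword list and function name are hypothetical.

```python
import re
from collections import Counter

def keyword_frequencies(reviews, stopwords=frozenset(), top_k=30):
    """Count lowercase word tokens across reviews and return the top_k entries."""
    counts = Counter()
    for review in reviews:
        tokens = re.findall(r"[a-z]+", review.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts.most_common(top_k)

# Invented toy reviews, used only to exercise the function.
toy_reviews = [
    "The disinfection function is convenient",
    "Large capacity and good disinfection",
    "Convenient, and the temperature is stable",
]
toy_stopwords = frozenset({"the", "is", "and", "good", "large", "stable"})

ranked = keyword_frequencies(toy_reviews, toy_stopwords, top_k=30)
frequency = dict(ranked)
```

The resulting `ranked` list has the same shape as Table 5: one (keyword, frequency) pair per row, sorted by descending frequency.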









TABLE 5
Top 30 requirement keywords with high frequencies in the online user review data

| Sequence number | User requirement | Frequency |
|---|---|---|
| 1 | Convenience | 6753 |
| 2 | Function | 5707 |
| 3 | Disinfection or sterilization | 4485 |
| 4 | Baby | 3506 |
| 5 | Capacity | 3145 |
| 6 | Appearance | 2918 |
| 7 | Milk bottle | 2577 |
| 8 | Operation | 2504 |
| 9 | Simple | 2284 |
| 10 | Temperature | 2252 |
| 11 | Constant temperature | 2207 |
| 12 | Use | 2033 |
| 13 | Shape | 1990 |
| 14 | Size | 1965 |
| 15 | Milk powder | 1937 |
| 16 | Drying | 1741 |
| 17 | Jingdong | 1454 |
| 18 | Practicality | 1413 |
| 19 | Fast | 1375 |
| 20 | Quality | 1356 |
| 21 | Kettle | 1254 |
| 22 | Purchase | 1148 |
| 23 | Child | 1142 |
| 24 | White bear | 1120 |
| 25 | Logistics | 999 |
| 26 | Home | 994 |
| 27 | Time | 940 |
| 28 | Little bear | 929 |
| 29 | Heat preservation | 913 |
| 30 | Satisfied | 889 |
















TABLE 6
Automatic generation of design concepts with different design requirement keywords as input

| Sequence number | Design requirement keywords | Automatically generated design concept |
|---|---|---|
| 1 | sterilization, function, capacity | the present invention relates to a method and apparatus for sterilization, which is capable of enhancing the function, capacity, and/or performance of an ion exchanger. |
| | | the present invention relates to a method and apparatus for sterilization, which is capable of enhancing the function, capacity, and/or effectiveness of the disinfection apparatus. |
| 2 | sterilization, capacity, temperature | the present invention relates to a sterilization apparatus capable of reducing the temperature, thereby increasing the sterilizing capacity, and improving the sterility of the product. |
| | | the present invention relates to a sterilization apparatus capable of reducing the temperature, thereby increasing the sterilizing capacity, and improving the thermal stability of the device. |
| 3 | disinfection, function, capacity | the present invention relates to a disinfection apparatus capable of enhancing the function, capacity, and/or efficacy of an iodine disinfectant. |
| | | the present invention relates to a disinfection apparatus capable of enhancing the function, capacity, and/or efficacy of an antimicrobial agent. |
| 4 | disinfection, capacity, temperature | the disinfection capacity of the present invention can be improved by reducing the oxidation temperature. |
| | | the disinfection capacity of the present invention can be improved by adjusting the temperature. |
| 5 | disinfection, function, capacity, temperature | the disinfection capacity is a function of the temperature at which the vaporizer is heated and/or cooled. |
| | | the disinfection capacity is a function of the temperature at which the fluid is heated and the vaporization capacity. |
| 6 | sterilization, function, capacity, temperature | a function of the sterilization temperature is to determine if the sanitizer has the capacity to sterilize and/or not. |
| | | a function of the sterilization temperature is to determine if the sanitizer has the capacity to sterilize the product. |









In summary, if designers think about these design problems and rely only on human experience to produce conceptual schemes, it is difficult and inefficient to create innovative conceptual schemes. In view of the problem that it is difficult to transfer and reorganize cross-field design knowledge and automatically generate design conceptual schemes during the generation of product conceptual schemes, the present disclosure provides the method for automatically generating a conceptual scheme through a DCGN. The DCGN can adaptively learn reasoning, transfer, reorganization, and other potential rules of the cross-field design knowledge from massive text data and automatically generate the product conceptual schemes based on the design problems. The burden of manually generating conceptual schemes is reduced, design efficiency is improved, and new ideas are provided for intelligent conceptual design.

Claims
  • 1. A method for constructing a design concept generation network (DCGN), wherein the DCGN comprises a Transformer encoder, a Transformer decoder, an importance constraint matrix generation module, an importance constraint embedding layer, a cross-attention (CA) layer, and an optimization module; and the method comprises the following steps:
    S1: obtaining, by the Transformer encoder, a feature of a hidden layer of the Transformer encoder based on input words in a sample;
    S2: obtaining, by the Transformer decoder, a feature of a hidden layer of the Transformer decoder based on a target sequence in the sample;
    S3: obtaining, by the importance constraint matrix generation module, an importance constraint matrix based on the input words and the target sequence in the sample;
    S4: mapping, by the importance constraint embedding layer, the importance constraint matrix to a distributed vector space to obtain two input word importance embedding features;
    S5: obtaining, by the CA layer, a generated sequence based on the feature of the hidden layer of the Transformer encoder, the feature of the hidden layer of the Transformer decoder, and the two input word importance embedding features; and
    S6: constructing a loss function based on the generated sequence and the target sequence, and adjusting, by the optimization module, network parameters based on the loss function; and repeating S1 to S6 until the loss function meets a specified requirement to obtain the DCGN.
  • 2. The method according to claim 1, wherein in S1, the Transformer encoder obtains the feature $h_e$ of the hidden layer of the Transformer encoder by using the following formula: $h_e = \mathrm{SA}(W_e^{K} x, W_e^{V} x, W_e^{Q} x)$  (1), wherein $x$ represents the input words; SA( ) represents a spatial attention; and $W_e^{K}$, $W_e^{V}$, and $W_e^{Q}$ represent weight matrices of a self-attention layer of the Transformer encoder.
  • 3. The method according to claim 1, wherein in S2, the Transformer decoder maps a target sequence $y_{:t-1} = [y_0, y_1, \ldots, y_{t-1}]$ at a moment t−1 to a distributed feature representation through a self-attention layer to obtain the feature $h_d^{t}$ of the hidden layer of the Transformer decoder: $h_d^{t} = \mathrm{SA}(W_d^{K} y_{:t-1}, W_d^{V} y_{:t-1}, W_d^{Q} y_{:t-1})$  (2), wherein SA( ) represents a spatial attention; and $W_d^{K}$, $W_d^{V}$, and $W_d^{Q}$ represent weight matrices of the self-attention layer of the Transformer decoder.
  • 4. The method according to claim 3, wherein in S3, $f(x, w, y_{:t})$ represents an input word importance constraint vector $C_{:t}$ contained in the target sequence $y_{:t}$; $f(x, w, y_{:t})$ is calculated as follows: $f(x, w, y_{:t}) = w \cdot c_t$  (4), wherein $\cdot$ represents a dot product operation of a vector or a matrix; and $w = [w_1, w_2, \ldots, w_i, \ldots, w_m] \in \mathbb{R}^{m}$ represents a relative importance vector of the input words $x$ in the target sequence $y_{:t}$ and is calculated as follows:
  • 5. The method according to claim 1, wherein in S4, two importance constraint embedding matrices $W_c^{K}$ and $W_c^{V}$ are introduced, and the importance constraint matrix is mapped to the distributed vector space to obtain the two input word importance embedding features $h_{:t}^{K}$ and $h_{:t}^{V}$, wherein $h_{:t}^{K}$ and $h_{:t}^{V}$ are as follows at a moment t during generation:
    $h_{:t}^{K} = W_c^{K}(C_{:t-1}) = W_c^{K}[f(x, w, y_{:t-1})]$  (7)
    $h_{:t}^{V} = W_c^{V}(C_{:t-1}) = W_c^{V}[f(x, w, y_{:t-1})]$  (8).
  • 6. The method according to claim 5, wherein in S5, the CA layer fuses the feature $h_e$ of the hidden layer of the Transformer encoder, the feature $h_d^{t}$ of the hidden layer of the Transformer decoder, and the two input word importance embedding features $h_{:t}^{K}$ and $h_{:t}^{V}$ to obtain a generated sequence $y_{:t}^{o}$ at the moment t: $y_{:t}^{o} = \mathrm{CA}(W_d^{K} h_e, W_d^{V} h_e, h_{:t}^{K}, h_{:t}^{V}, W_d^{Q} h_d^{t})$  (9), wherein $W_d^{K}$, $W_d^{V}$, and $W_d^{Q}$ represent weight matrices of a self-attention layer of the Transformer decoder; in a specific implementation, a jth element in a CA function is expressed as follows:
  • 7. The method according to claim 1, wherein in S6, for given N samples $\{x^{(n)}, y^{(n)}\}_{n=1}^{N}$, the loss function constructed based on the generated sequence and the target sequence is as follows:
  • 8. A method for automatically generating a conceptual scheme, performed by the DCGN constructed by using the method according to claim 1 and comprising the following steps:
    L1: obtaining, by the Transformer encoder, the feature of the hidden layer of the Transformer encoder based on the input words;
    L2: obtaining, by the Transformer decoder, the feature of the hidden layer of the Transformer decoder at a moment t based on the generated sequence at a moment t−1;
    L3: obtaining, by the importance constraint matrix generation module, the importance constraint matrix based on the input words in the sample and the generated sequence at the moment t−1;
    L4: mapping, by the importance constraint embedding layer, the importance constraint matrix to the distributed vector space to obtain the two input word importance embedding features; and
    L5: obtaining, by the CA layer, the generated sequence based on the feature of the hidden layer of the Transformer encoder, the feature of the hidden layer of the Transformer decoder, and the two input word importance embedding features.
  • 9. The method for automatically generating the conceptual scheme according to claim 8, wherein in L1, the input words are keywords constructed based on a design problem, at least one design incentive, keywords constructed based on design requirements, or a combination of at least two of these input word sources.
  • 10. The method for automatically generating the conceptual scheme according to claim 8, wherein in L2, the feature $h_d^{t}$ of the hidden layer of the Transformer decoder at the moment t is calculated based on the generated sequence at the moment t−1 by using the following formula: $h_d^{t} = \mathrm{SA}(W_d^{K} y_{:t-1}, W_d^{V} y_{:t-1}, W_d^{Q} y_{:t-1})$  (12), wherein $y_{:t-1}$ represents an input sequence of the Transformer decoder at the moment t during generation, $y_{:t-1} = [y_0^{o}, y_{:t-1}^{o}]$, $y_0^{o}$ represents a given sequence at a start moment, and $y_{:t-1}^{o}$ represents the generated sequence at the moment t−1.
  • 11. The method according to claim 4, wherein in S4, two importance constraint embedding matrices $W_c^{K}$ and $W_c^{V}$ are introduced, and the importance constraint matrix is mapped to the distributed vector space to obtain the two input word importance embedding features $h_{:t}^{K}$ and $h_{:t}^{V}$, wherein $h_{:t}^{K}$ and $h_{:t}^{V}$ are as follows at a moment t during generation:
    $h_{:t}^{K} = W_c^{K}(C_{:t-1}) = W_c^{K}[f(x, w, y_{:t-1})]$  (7)
    $h_{:t}^{V} = W_c^{V}(C_{:t-1}) = W_c^{V}[f(x, w, y_{:t-1})]$  (8).
  • 12. The method for automatically generating the conceptual scheme according to claim 8, wherein in S1 of the method for constructing the DCGN, the Transformer encoder obtains the feature $h_e$ of the hidden layer of the Transformer encoder by using the following formula: $h_e = \mathrm{SA}(W_e^{K} x, W_e^{V} x, W_e^{Q} x)$  (1), wherein $x$ represents the input words; SA( ) represents a spatial attention; and $W_e^{K}$, $W_e^{V}$, and $W_e^{Q}$ represent weight matrices of a self-attention layer of the Transformer encoder.
  • 13. The method for automatically generating the conceptual scheme according to claim 8, wherein in S2 of the method for constructing the DCGN, the Transformer decoder maps a target sequence $y_{:t-1} = [y_0, y_1, \ldots, y_{t-1}]$ at the moment t−1 to a distributed feature representation through a self-attention layer to obtain the feature $h_d^{t}$ of the hidden layer of the Transformer decoder: $h_d^{t} = \mathrm{SA}(W_d^{K} y_{:t-1}, W_d^{V} y_{:t-1}, W_d^{Q} y_{:t-1})$  (2), wherein SA( ) represents a spatial attention; and $W_d^{K}$, $W_d^{V}$, and $W_d^{Q}$ represent weight matrices of the self-attention layer of the Transformer decoder.
  • 14. The method for automatically generating the conceptual scheme according to claim 13, wherein in S3 of the method for constructing the DCGN, $f(x, w, y_{:t})$ represents an input word importance constraint vector $C_{:t}$ contained in the target sequence $y_{:t}$; $f(x, w, y_{:t})$ is calculated as follows: $f(x, w, y_{:t}) = w \cdot c_t$  (4), wherein $\cdot$ represents a dot product operation of a vector or a matrix; and $w = [w_1, w_2, \ldots, w_i, \ldots, w_m] \in \mathbb{R}^{m}$ represents a relative importance vector of the input words $x$ in the target sequence $y_{:t}$ and is calculated as follows:
  • 15. The method for automatically generating the conceptual scheme according to claim 8, wherein in S4 of the method for constructing the DCGN, two importance constraint embedding matrices $W_c^{K}$ and $W_c^{V}$ are introduced, and the importance constraint matrix is mapped to the distributed vector space to obtain the two input word importance embedding features $h_{:t}^{K}$ and $h_{:t}^{V}$, wherein $h_{:t}^{K}$ and $h_{:t}^{V}$ are as follows at the moment t during generation:
    $h_{:t}^{K} = W_c^{K}(C_{:t-1}) = W_c^{K}[f(x, w, y_{:t-1})]$  (7)
    $h_{:t}^{V} = W_c^{V}(C_{:t-1}) = W_c^{V}[f(x, w, y_{:t-1})]$  (8).
  • 16. The method for automatically generating the conceptual scheme according to claim 15, wherein in S5 of the method for constructing the DCGN, the CA layer fuses the feature $h_e$ of the hidden layer of the Transformer encoder, the feature $h_d^{t}$ of the hidden layer of the Transformer decoder, and the two input word importance embedding features $h_{:t}^{K}$ and $h_{:t}^{V}$ to obtain a generated sequence $y_{:t}^{o}$ at the moment t: $y_{:t}^{o} = \mathrm{CA}(W_d^{K} h_e, W_d^{V} h_e, h_{:t}^{K}, h_{:t}^{V}, W_d^{Q} h_d^{t})$  (9), wherein $W_d^{K}$, $W_d^{V}$, and $W_d^{Q}$ represent weight matrices of a self-attention layer of the Transformer decoder; in a specific implementation, a jth element in a CA function is expressed as follows:
  • 17. The method for automatically generating the conceptual scheme according to claim 8, wherein in S6 of the method for constructing the DCGN, for given N samples $\{x^{(n)}, y^{(n)}\}_{n=1}^{N}$, the loss function constructed based on the generated sequence and the target sequence is as follows:
  • 18. The method for automatically generating the conceptual scheme according to claim 9, wherein in L2, the feature $h_d^{t}$ of the hidden layer of the Transformer decoder at the moment t is calculated based on the generated sequence at the moment t−1 by using the following formula: $h_d^{t} = \mathrm{SA}(W_d^{K} y_{:t-1}, W_d^{V} y_{:t-1}, W_d^{Q} y_{:t-1})$  (12), wherein $y_{:t-1}$ represents an input sequence of the Transformer decoder at the moment t during generation, $y_{:t-1} = [y_0^{o}, y_{:t-1}^{o}]$, $y_0^{o}$ represents a given sequence at a start moment, and $y_{:t-1}^{o}$ represents the generated sequence at the moment t−1.
Priority Claims (1)
Number Date Country Kind
202210780085.4 Jul 2022 CN national
CROSS REFERENCE TO THE RELATED APPLICATIONS

This application is a continuation-in-part application of International Application No. PCT/CN2022/125347, filed on Oct. 14, 2022, which is based upon and claims priority to Chinese Patent Application No. 202210780085.4, filed on Jul. 4, 2022, the entire contents of which are incorporated herein by reference.

Continuations (1)
Number Date Country
Parent PCT/CN2022/125347 Oct 2022 US
Child 18120434 US