This application claims the priority benefit of China application serial no. 202211289031.4, filed on Oct. 20, 2022. The entirety of the above-mentioned patent application is hereby incorporated by reference herein and made a part of this specification.
The present invention relates to the technical field of computers, in particular to electrical power system data processing, and provides a deep learning-based method for fusing multi-source urban energy data and a storage medium.
In the era of big data, the sources of urban energy data are increasingly wide-ranging, and data types are increasingly diversified. However, because of the 5Vs of big data, namely volume, variety, value, veracity, and velocity, it is difficult to adequately mine the implicit information in urban energy big data. Therefore, an effective means is urgently needed to fuse global multi-source heterogeneous urban energy data and extract valuable information for use. Data fusion technology is widely used as an important data processing means. In various fields, data fusion technology effectively improves the human capability to process and use industrial big data. By using data fusion technology, a huge amount of high-dimensional, multi-source heterogeneous, and noisy industrial data goes through denoising, integrated modeling, and multi-scale classification to provide reliable data resources for subsequent correlation analysis, performance prediction, and decision-making optimization. Therefore, data fusion technology greatly supports the development of the fields to which it is applied.
Commonly used data fusion methods include probability-based fusion methods, Dempster-Shafer (D-S) evidence theory-based fusion methods, knowledge-based fusion methods, etc. The probability-based fusion methods include Bayesian reasoning, the Kalman filtering model, and the Markov model. The key mathematical theory behind such fusion methods is Bayesian reasoning, in which a probability distribution and a probability density function are introduced to indicate the dependence between random variables and thereby establish a relationship between different data sets. The evidence theory-based fusion methods mainly include the D-S evidence theory, in which a belief function and a plausibility function are introduced to indicate the uncertainty of data, reasoning is performed dynamically, and data fusion is performed by using a specified combination rule. As an extension of Bayesian reasoning, the D-S theory has the advantage of not requiring a prior probability of data. The knowledge-based fusion methods include support vector machines, clustering, and other methods. Such methods assume that data contains a large amount of useful knowledge, and the key is to find the knowledge contained in the data and measure the correlation and similarity between pieces of knowledge.
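As an illustration of the evidence theory-based fusion described above, the following is a minimal sketch of Dempster's rule of combination together with the two uncertainty measures (commonly called belief and plausibility). The hypothesis labels `rise` and `fall` and the mass values are hypothetical and are not taken from this specification.

```python
from typing import Dict, FrozenSet

# A mass function assigns a weight to each subset of hypotheses; weights sum to 1.
Mass = Dict[FrozenSet[str], float]


def dempster_combine(m1: Mass, m2: Mass) -> Mass:
    """Fuse two mass functions with Dempster's rule of combination."""
    combined: Mass = {}
    conflict = 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            common = b & c
            if common:
                combined[common] = combined.get(common, 0.0) + mass_b * mass_c
            else:
                # Mass assigned to contradictory hypotheses is discarded.
                conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("the two sources are in total conflict")
    # Renormalize so that the remaining masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


def belief(m: Mass, hypothesis: FrozenSet[str]) -> float:
    """Total mass of all subsets contained in the hypothesis."""
    return sum(v for k, v in m.items() if k <= hypothesis)


def plausibility(m: Mass, hypothesis: FrozenSet[str]) -> float:
    """Total mass of all subsets that intersect the hypothesis."""
    return sum(v for k, v in m.items() if k & hypothesis)


# Hypothetical example: two sources reporting on whether energy demand will rise or fall.
RISE = frozenset({"rise"})
FALL = frozenset({"fall"})
EITHER = RISE | FALL
source_1: Mass = {RISE: 0.6, EITHER: 0.4}
source_2: Mass = {FALL: 0.3, EITHER: 0.7}
fused = dempster_combine(source_1, source_2)
```

No prior probability appears anywhere in the computation, which is the advantage over Bayesian reasoning noted above; the difficulty lies in choosing the mass values themselves.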
However, the foregoing common data fusion methods have many disadvantages when fusing big data. For example, the probability-based fusion method has difficulty obtaining a prior probability and processing complicated high-dimensional data; the evidence theory-based fusion method has difficulty estimating a mass function; and the knowledge-based fusion method is sensitive to missing data and noisy data. With the development of computation hardware and data processing technologies, the computation capability of a computing device is no longer an obstacle to deep learning. This has brought new opportunities for using deep learning in data fusion. Deep learning can perform self-learning based on training data and does not require problem-specific programming to solve each problem. A deep learning model is intended to model data to obtain in-depth correlations in the data and establish a knowledge framework, such that the model can finally be used for prediction, classification, feature extraction, and so forth. In recent years, researchers have tried to use deep learning in data fusion, hoping to enhance the performance of fusion algorithms in processing big data. There are still some difficulties in using a deep learning-based data fusion method, especially regarding the heterogeneity of the data to be fused, and heterogeneity is a key feature of big data. At present, urban energy big data comes from various areas, and includes not only structured data sheets but also, to a greater extent, unstructured data, for example, texts, images, and audio. The heterogeneity among these multi-source data indicates a difference between the features of these data. How to correlate and cross heterogeneous data, and finally obtain a correlative relationship between the data, is significant for studying the fusion of current urban energy data.
The purpose of the present invention is to provide a transformer-based method for fusing multi-source urban energy data that can be used to perform data fusion on different types of urban energy data from various industries and facilitate the development of a smart urban energy system, efficient management of urban energy, and revolutionary development of the urban energy industry.
The technical solution of the present invention is a deep learning-based method for fusing multi-source urban energy data. The method includes the following steps:
S1, converting obtained multi-source urban energy data into a multimodal input sequence, where three types of heterogeneous data of the urban energy data include a text XW∈RT
S2, performing one-dimensional time convolution once on the data in the multimodal input sequence in S1 to obtain time information and obtain an urban energy data feature with the time information;
S3, performing positional encoding (PE) on an output in S2 to ensure that the time information is retained in subsequent calculation;
S4, performing multi-scale and multimodal information fusion on an output in S3 by using a cross-modal transformer to implement cross-modal mutual fusion of the three types of heterogeneous data that is represented by [ZI→W[D], ZA→W[D]]∈RT
S5, putting [ZI→W[D], ZA→W[D]]∈RT
The present invention further provides a computer-readable storage medium. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of the foregoing method are implemented.
Compared with a conventional technology, the present invention has the following beneficial effects:
In the present invention, a manner of fusing multi-source data that is more applicable to urban energy data is proposed. Unlike existing transformer-based technologies, in this manner, multi-scale and multimodal information fusion is performed on an output in S3 by using a cross-modal transformer. In particular, the independently designed MCMulT module places more emphasis on fusion between the modes that need to be fused for a prediction result, thereby improving the quality of the representation learned from an unaligned multimodal sequence and enhancing the applicability to extensive data fusion. Both performance and efficiency are taken into account, and urban energy heterogeneous data in different dimensions and different modes can be fused, so that the difficulty of fusing heterogeneous data is solved. As the problem of heterogeneity of urban energy data is solved, the high dimension caused by direct tensor multiplication for fusion is avoided, and the calculation complexity for an urban energy big data platform is reduced.
The technical solution of the present invention is clearly and completely described below in conjunction with the accompanying drawings. Obviously, the described embodiments are merely a part of the embodiments of the present invention and not all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those ordinarily skilled in the art without creative effort shall fall within the protection scope of the present invention.
As shown in
The method of the present invention includes the following steps:
S1: Sources of the global multi-source urban energy data include the water, coal, electricity, heating power, and oil industries. Textual data is sourced from the production data, management data, and marketing data found in urban energy big data from the various energy industries, and from data about the consumption of various types of energy. Image data is sourced from geographic information system (GIS) information and meteorograms found in energy big data from the coal, oil, and electricity industries, traffic flow image information about the consumption of oil and electricity, and the like. Audio data is sourced from energy use-related audio report information obtained from various energy industries through Internet big data mining, interview audio information related to energy use in various industries, and audio data of interviews with people working in various energy industries about the future usage amount of a type of energy. The global multi-source urban energy data that is found includes, but is not limited to, the foregoing data. The global multi-source urban energy data found in each energy industry is separately converted to a multimodal input sequence X, and three types of heterogeneous data, that is, a text XW∈RT
S2: One-dimensional time convolution is performed once separately on the urban energy data in S1 to obtain time information and obtain an urban energy data feature with the time information, where a manner of the time convolution is as follows:
X′α = Conv1D(Xα, kα) ∈ RT

where Xα represents the data in the multimodal input sequence, kα represents the size of the convolution kernel in the corresponding mode α, and d represents the dimension of the feature.

S3: Positional encoding (PE) is performed on the output of S2 to ensure that the time information is retained in subsequent calculation.
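Steps S2 and S3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the convolution is applied without padding (so the output has T − k + 1 time steps, which may differ from the length used in the specification), the positional encoding is assumed to be the common sinusoidal form (the specification does not state which encoding is used), and the function names `conv1d_time`, `sinusoidal_positional_encoding`, and `add_positional_encoding` are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows are time steps, columns are features


def conv1d_time(x: Matrix, kernel: List[Matrix]) -> Matrix:
    """1D convolution along the time axis, without padding.

    x has shape (T, d_in); kernel has shape (k, d_in, d);
    the result has shape (T - k + 1, d).
    """
    k = len(kernel)
    d_in = len(kernel[0])
    d_out = len(kernel[0][0])
    out: Matrix = []
    for t in range(len(x) - k + 1):
        row = [0.0] * d_out
        for dt in range(k):
            for i in range(d_in):
                for o in range(d_out):
                    row[o] += x[t + dt][i] * kernel[dt][i][o]
        out.append(row)
    return out


def sinusoidal_positional_encoding(length: int, d: int) -> Matrix:
    """ASSUMPTION: sinusoidal encoding; sine on even columns, cosine on odd ones."""
    pe = [[0.0] * d for _ in range(length)]
    for pos in range(length):
        for i in range(d):
            angle = pos / (10000 ** (2 * (i // 2) / d))
            pe[pos][i] = math.sin(angle) if i % 2 == 0 else math.cos(angle)
    return pe


def add_positional_encoding(features: Matrix) -> Matrix:
    """Add the encoding element-wise so each row carries its time position."""
    pe = sinusoidal_positional_encoding(len(features), len(features[0]))
    return [[v + p for v, p in zip(f_row, p_row)] for f_row, p_row in zip(features, pe)]
```

Applying `conv1d_time` to each mode with its own kernel size kα but a shared output dimension d puts the text, image, and audio sequences into a common feature space before cross-modal fusion, even though their lengths still differ.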
S4: Multi-scale and multimodal information fusion is performed on an output in S3 by using a cross-modal transformer to implement cross-modal mutual fusion of the three types of heterogeneous data. In the present invention, the multi-scale cooperative multimodal transformer (MCMulT) architecture is built to implement multimodal and multi-scale information fusion. The MCMulT module focuses more on directional cross-modal interactions between modes, and the target mode shown in the multi-scale mechanism forms a multi-scale feature of the mode by fusing the directional cross-modal interactions. Each type of integration is implemented by gathering a multi-scale feature of a source mode. As shown in
A core unit of the transformer block is a MACT block that includes three subnetwork layers: a multi-scale multi-head cross-modal layer, a multi-scale attention layer, and a position-wise feedforward layer. The CT block handles only a single-scale representation of the source mode and performs no attention processing, so it can be considered a simpler MACT block. As shown in
For the target mode α and the source mode β that are to be fused, Zα∈RT
The foregoing equation is an expression of multi-head, both the MACT and the CT include multi-head, showing multi-scale crossing, where WQ
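$$H^{[i]} = \left\{ \mathrm{CM}_{\beta\to\alpha}\left(Z_{\beta\to\alpha}^{[i-1]},\, Z_{\alpha\to\beta}^{[j]}\right) \right\}_{j=0,1,\ldots,i-1} \tag{3}$$

$$Z_{\alpha\to\beta}^{[0]} = Z_{\beta}^{[0]} \tag{4}$$

where $Z_{\beta}^{[0]}$ is a low-level feature of the mode β. $Z_{\beta\to\alpha}^{[i]}$ is finally output after preliminary fusion in the corresponding feedforward layer:

$$P_{\beta\to\alpha}^{[i]} = f_{\theta}\left(\mathrm{LN}\left(A_{\beta\to\alpha}^{[i]} + \mathrm{LN}\left(Z_{\beta\to\alpha}^{[i-1]}\right)\right)\right) \tag{5}$$

$$Z_{\beta\to\alpha}^{[i]} = \left\{A_{\beta\to\alpha}^{[i]} + \mathrm{LN}\left(Z_{\beta\to\alpha}^{[i-1]}\right)\right\} + P_{\beta\to\alpha}^{[i]} \tag{6}$$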
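The data flow of equations (3), (5), and (6) can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the specification: the attention is single-head with the learned projection weights omitted, layer normalization has no learnable parameters, and the aggregated feature A is formed by a plain average over scales as a stand-in for the multi-scale attention layer. The function names are hypothetical.

```python
import math
from typing import Callable, List

Matrix = List[List[float]]  # rows are time steps, columns are features


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a: Matrix, b: Matrix) -> Matrix:
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def layer_norm(z: Matrix, eps: float = 1e-5) -> Matrix:
    """LN over the feature axis, without learnable scale or shift."""
    out: Matrix = []
    for row in z:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out


def softmax_rows(scores: Matrix) -> Matrix:
    out: Matrix = []
    for row in scores:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([v / total for v in exps])
    return out


def cross_modal_attention(target: Matrix, source: Matrix) -> Matrix:
    """CM: queries come from the target stream, keys and values from the source.

    The two sequences may have different lengths but must share dimension d;
    the result has the length of the target.
    """
    d = len(target[0])
    scores = matmul(target, [list(col) for col in zip(*source)])
    scores = [[v / math.sqrt(d) for v in row] for row in scores]
    return matmul(softmax_rows(scores), source)


def fuse_level(z_prev: Matrix, source_scales: List[Matrix],
               f_theta: Callable[[Matrix], Matrix]) -> Matrix:
    """One fusion level: z_prev is Z[i-1]; source_scales holds the scales j = 0..i-1."""
    # Eq. (3): one cross-modal interaction per source scale.
    h = [cross_modal_attention(z_prev, z_src) for z_src in source_scales]
    # ASSUMPTION: average over scales in place of the multi-scale attention layer.
    a = [[sum(h_j[t][c] for h_j in h) / len(h) for c in range(len(h[0][0]))]
         for t in range(len(h[0]))]
    residual = add(a, layer_norm(z_prev))
    p = f_theta(layer_norm(residual))  # Eq. (5)
    return add(residual, p)            # Eq. (6)


def relu_feedforward(z: Matrix) -> Matrix:
    """Placeholder for f_theta: an element-wise ReLU with no learned weights."""
    return [[max(0.0, v) for v in row] for row in z]
```

Because the attention weights form a matrix of shape (target length, source length), the two modes do not need to be aligned in time, which is what allows unaligned text, image, and audio sequences to be fused.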
Cross-modal mutual fusion of the three types of heterogeneous data is implemented according to Zβ→α[i] to obtain fusion results [ZI→W[D], ZA→W[D]]∈RT
S5: [ZI→W[D], ZA→W[D]]∈RT
Those skilled in the art should understand that the embodiments of the present invention can be provided as a method, a system, or a computer program product. Therefore, the present invention may use complete hardware embodiments or complete software embodiments, or have a form combining the embodiments in aspects of software and hardware. Further, the present invention may be in a form of a computer program product that is executed on one or more computer-usable storage media including computer-usable program code.
These computer program instructions may also be stored in a computer-readable memory that can instruct the computer or any other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory generate an artifact that includes an instruction apparatus. The instruction apparatus implements a function specified in one or more procedures of the flowchart and/or in one or more blocks of the block diagram.
These computer program instructions can be loaded onto a computer or another programmable data processing device so that the computer or the other programmable device executes a series of operation steps to generate computer-implemented processing, so that a step for implementing functions specified in one or more procedures of a flowchart and/or one or more blocks of a block diagram is provided by the instructions executed on the computer or the other programmable device.
| Number | Date | Country | Kind |
|---|---|---|---|
| 202211289031.4 | Oct 2022 | CN | national |

| Number | Date | Country |
|---|---|---|
| 20240134939 A1 | Apr 2024 | US |