The present disclosure relates generally to climate projections, and more specifically to a method and system for dynamic generation of high-resolution climate projections.
Accurate climate projection is important because climate change is one of the most pressing challenges facing our planet today. As global temperatures rise and weather patterns become increasingly erratic, the need for accurate climate projections has become paramount. Such projections are vital for various sectors, including agriculture, infrastructure planning, disaster management, and policymaking, to effectively mitigate the risks associated with climate change.
General Circulation Models (GCMs), which are part of the Coupled Model Intercomparison Project (CMIP), are the foundational basis for simulating climate change projections through the end of the century and provide climate projections at a global scale. However, the major limitation of GCM-based climate projections is the low geospatial resolution of the output, which ranges from 100 km to 250 km. This complicates estimations of the local-scale exposure of asset-level physical risk and forward-looking financial impact analysis. To address this limitation, NASA Earth Exchange (NEX) Global Daily Downscaled Projections (GDDP) provide climate projections at 25 km resolution by enhancing coarse-grained GCM projections up to the end of the century using traditional statistical downscaling techniques. While this approach provides useful insights at a global or regional level, it fails to provide the higher geospatial resolution needed for applications in agriculture, infrastructure planning, disaster management, and risk management.
An illustrative embodiment provides a computer-implemented method for dynamic generation of climate projections. The method comprises receiving past climate data of a first spatial resolution. The past climate data includes a plurality of climatological variables. The past climate data of the first spatial resolution is converted to past climate data of a second spatial resolution. A first subset of the past climate data of the first spatial resolution and corresponding past climate data of the second spatial resolution is allocated to a training set of pairs of the past climate data of the first and second spatial resolutions. A second subset of remaining pairs of past climate data of the first spatial resolution and the corresponding past climate data of the second spatial resolution is allocated to a validation set. The method trains a machine learning model with a deep learning algorithm using the training set to generate a trained model object that maps a relationship between the past climate data of the first spatial resolution and the past climate data of the second spatial resolution. The trained model object is then validated with the validation set.
In another illustrative embodiment, the method comprises receiving climate projections of the second spatial resolution and applying the trained model object to the climate projections of the second spatial resolution to generate climate projections of the first spatial resolution. The first spatial resolution is higher than the second spatial resolution.
In another illustrative embodiment, the method comprises pre-processing the past climate data of the first spatial resolution and the past climate data of the second spatial resolution prior to training the machine learning model. The pre-processing comprises at least one of normalizing, augmenting, or enhancing the past climate data of the first and second resolutions prior to training the machine learning model.
Another illustrative embodiment provides a system for dynamic generation of climate projections. The system comprises a storage device configured to store program instructions and one or more processors operably connected to the storage device and configured to execute the program instructions to cause the system to: receive past climate data of a first spatial resolution, wherein the past climate data comprises a plurality of climatological variables; convert the past climate data of the first spatial resolution to past climate data of a second spatial resolution; allocate a first subset of the past climate data of the first spatial resolution and corresponding past climate data of the second spatial resolution to a training set of pairs of the past climate data of the first and second spatial resolutions; allocate a second subset of remaining past climate data of the first spatial resolution and corresponding past climate data of the second spatial resolution to a validation set of pairs of the past climate data of the first and second spatial resolutions; train a machine learning model with a deep learning algorithm using the training set of pairs of the past climate data of the first and second spatial resolutions to generate a trained model object that maps a relationship between the past climate data of the first spatial resolution and the past climate data of the second spatial resolution; and validate the trained model object with the validation set of pairs of the past climate data of the first and second spatial resolutions.
Another illustrative embodiment provides a computer program product for dynamic generation of climate projections. The computer program product comprises a computer readable storage medium having program instructions embodied thereon to perform the steps of: receiving past climate data of a first spatial resolution comprising a plurality of climatological variables; converting the past climate data of the first spatial resolution to past climate data of a second spatial resolution; allocating a first subset of the past climate data of the first spatial resolution and corresponding past climate data of the second spatial resolution to a training set of pairs of the past climate data of the first and second spatial resolutions; allocating a second subset of remaining pairs of past climate data of the first spatial resolution and the corresponding past climate data of the second spatial resolution to a validation set; training a machine learning model with a deep learning algorithm using the training set of pairs of the past climate data of the first and second resolutions to generate a trained model object that maps a relationship between the past climate data of the first spatial resolution and the past climate data of the second spatial resolution; and validating the trained model object using the validation set of pairs of the past climate data of the first and second spatial resolutions.
The novel features believed characteristic of the illustrative embodiments are set forth in the appended claims. The illustrative embodiments, however, as well as a preferred mode of use, further objectives and features thereof, will best be understood by reference to the following detailed description of an illustrative embodiment of the present disclosure when read in conjunction with the accompanying drawings, wherein:
The illustrative embodiments recognize and take into account one or more different considerations. The illustrative embodiments recognize and take into account that the need for accurate climate projections has become paramount. Such projections are vital for various sectors, including agriculture, infrastructure planning, disaster management, and policymaking, to effectively mitigate the risks associated with climate change.
The illustrative embodiments also recognize and take into account that current climate projection models suffer from a drawback: their low spatial resolution. Traditional climate projection models typically operate at coarse spatial scales, often ranging from tens to hundreds of kilometers. While these models provide some insights at a global or regional level, they fail to provide details of climate changes occurring at smaller scales, such as local weather patterns and microclimates.
The illustrative embodiments provide a method and system for dynamic generation of high-resolution climate projections. In the present disclosure, the term “climate projections” generally refers to climate projection data and/or climate projection maps. The illustrative embodiments address the limitations associated with current climate projections, particularly their low resolution, in order to provide more accurate and reliable information on climate change.
In some illustrative embodiments, a machine learning model is trained using a deep learning algorithm with a training set of past climatological data of a first resolution (e.g., high resolution) and past climatological data of a second resolution (e.g., low resolution). Once the training process is completed, the resulting optimized parameters form a trained model object. The trained model object represents a mapping function which is capable of making accurate predictions or decisions on new or unseen data. In the illustrative embodiments, the trained model object is a mapping function that maps a relationship between the past climatological data of the second resolution and the past climatological data of the first resolution. The trained model object is validated using a validation set of past climatological data of the first and second resolutions. Finally, the trained model object is applied to climate projections of the second resolution. The trained model object enhances the climate projections of the second resolution to generate the climate projections of the first resolution.
With reference to
In the depicted example, server computers 104 and 106 and storage unit 108 connect to network 102. In addition, client devices 110 connect to network 102. In the depicted example, server computers 104 and 106 provide information, such as boot files, operating system images, and applications to client devices 110. Client devices 110 can be, for example, computers, workstations, or network computers. As depicted, client devices 110 include client computers 112, 114, and 116. Client devices 110 can also include other types of client devices such as mobile phone 118, tablet 120, and smart glasses 122.
In the illustrative example of
Program code located in network data processing system 100 can be stored on a computer-recordable storage medium and downloaded to a data processing system or other device for use. For example, the program code can be stored on a computer-recordable storage medium on server computers 104 and 106 and storage unit 108 and downloaded to client devices 110 over network 102 for use on client devices 110.
In the illustrative example of
System 200 comprises computer system 204. Computer system 204 is a physical hardware system and includes one or more data processing systems. When more than one data processing system is present in computer system 204, those data processing systems are in communication with each other using a communications medium. The communications medium can be a network. The data processing systems can be selected from at least one of a computer, a server computer, a tablet computer, or some other suitable data processing system.
Computer system 204 includes application programming interface (API) 206. The API includes a set of definitions and protocols for building and integrating application software and allowing various programs to query and exchange data. In some example embodiments, API 206 may be an asynchronous REST (Representational State Transfer) interface which maintains and services a queue of user requests. API 206 queues and processes user inputs entered by a client (not illustrated in
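The queueing behavior described above can be sketched as a minimal first-in, first-out request queue. This is an illustrative sketch only; the class and field names below are hypothetical and not part of the disclosure.

```python
from collections import deque

class RequestQueue:
    """Minimal sketch of the request queue that an asynchronous REST
    interface such as API 206 might maintain (names hypothetical)."""

    def __init__(self):
        self._pending = deque()

    def enqueue(self, request):
        # New user requests are accepted immediately and queued for
        # later servicing, so the interface remains responsive.
        self._pending.append(request)
        return {"status": "accepted", "position": len(self._pending)}

    def process_next(self):
        # A worker drains the queue in first-in, first-out order.
        if not self._pending:
            return None
        return self._pending.popleft()

queue = RequestQueue()
queue.enqueue({"variable": "tasmax", "location": (40.7, -74.0)})
queue.enqueue({"variable": "pr", "location": (51.5, -0.1)})
first = queue.process_next()  # the earliest request is serviced first
```

In a full implementation, `process_next` would be invoked by background workers that perform the data retrieval and model steps described below.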
Based on user inputs processed by API 206, processing unit 208 retrieves satellite and climatological images and data from external data sources 232. The climatological images and data from external data sources are also referred to as past climatological data of the first spatial resolution (e.g., high resolution). The past climatological data of the first spatial resolution may comprise historical or observed climatological images and data which may have been previously acquired by satellites or other external sources. In some example embodiments, the first spatial resolution is around 1 km.
External data sources 232 may include multiple sources of geo-spatial data, such as, but not limited to, NASA, ESA, the National Center for Atmospheric Research, and the US National Geophysical Data Center. These sources provide historical time series of satellite images on climate hazards; images of climate projections either directly produced by GCMs or downscaled versions such as NEX GDDP projections, depending on the climate hazard or climatological variable; as well as multiple additional informative data elements, such as, but not limited to, elevation and land burnability per geo-location. Data retrieval from these external sources may involve using APIs with associated authentication methods, such as the NASA MODIS API and Google Earth Engine service API, web scraping, batch downloads from external servers such as MODIS S3 buckets, file transfer protocols, or other data retrieval methods, from both public data sources and paid vendors.
In some illustrative examples, computer system 204 may, in advance of and independent of a user query, retrieve required data elements in available resolutions at a global scale and across relevant climatological variables, GCM models, and CMIP scenarios from respective data providers and store them in internal data storage 210 (e.g., hard drive, database, cloud storage). Based on user input, API 206 may then filter and retrieve a subset of data elements specific to the user query from internal data storage 210.
In some illustrative examples, instead of prior data retrieval which may have a large data footprint, computer system 204 may dynamically retrieve only required data elements from multiple data providers from external data sources 232 based on a specific set of user requests as processed by API 206.
Data retrieved from external data sources 232 are stored in internal data storage 210 for persistent and faster access. Internal data storage 210 may include an internal database, such as, but not limited to, PostGIS; cloud data storage, such as, but not limited to, Amazon S3; or other internal data storage.
Computer system 204 includes image processor 212 configured to pre-process the past climatological data of the first resolution retrieved from internal data storage 210. The past climatological data of the first spatial resolution may, for example, include raw data elements such as satellite data, satellite images and additional informative data elements, such as elevation. Image processor 212 generates past climatological data of the second spatial resolution based on the past climatological data of the first spatial resolution. The past climatological data of the second spatial resolution may be generated by, for example, converting or transforming the past climatological data of the first spatial resolution by interpolating and resampling the data. Thus, low-resolution data (e.g., past climatological data of the second spatial resolution) is generated from high-resolution data (e.g., past climatological data of the first spatial resolution) by interpolating and resampling the high-resolution data. In some example embodiments, the first spatial resolution is around 1 km, and the second spatial resolution is around 25 km. Image processor 212 prepares a set of training and validation data 214 in the form of pairs of the past climatological data of the first spatial resolution and the past climatological data of the second spatial resolution for deep learning models.
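One simple way to realize the interpolation and resampling described above is block averaging, in which each low-resolution cell is the mean of the high-resolution cells it covers. The sketch below is illustrative and assumes the grid is a plain list of lists; a production image processor would operate on georeferenced rasters.

```python
def downsample(grid, factor):
    """Convert a high-resolution grid (e.g., ~1 km cells) to a
    low-resolution grid (e.g., ~25 km cells) by averaging each
    factor-by-factor block of high-resolution cells."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, factor):
        row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

high_res = [[1.0, 2.0, 3.0, 4.0],
            [5.0, 6.0, 7.0, 8.0],
            [9.0, 10.0, 11.0, 12.0],
            [13.0, 14.0, 15.0, 16.0]]
low_res = downsample(high_res, 2)  # 4x4 grid -> 2x2 grid
```

Each (high_res, low_res) pair produced this way supplies one training or validation example for the super-resolution model.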
In some example embodiments, multiple image processing routines may be implemented in image processor 212. Image processor 212 may prepare the set of training and validation data 214 as input channels for training a machine learning model. In some example embodiments, a machine learning model is trained using a deep learning algorithm.
Computer system 204 includes artificial intelligence (AI) system 216. Machine learning component 218 may be implemented as a part of AI system 216. In some example embodiments, machine learning component 218 is configured to train on-demand a deep learning computer vision algorithm in the domain of image super-resolution to learn a trained model object which represents a mapping function from the past climatological data of the second spatial resolution to the past climatological data of the first spatial resolution. Thus, the deep learning computer vision algorithm is trained to learn a mapping function from the low-resolution data to the high-resolution data.
Deep learning is a machine learning method that focuses on training artificial neural networks to learn and make predictions or decisions based on input data. It may involve training an algorithm or model, allowing it to automatically learn hierarchical representations of the data. The deep learning algorithm may comprise convolutional neural networks (CNN) as building blocks, a type of neural network architecture particularly suited for image data.
In some example embodiments, the mapping function may incorporate learning from relevant additional observables, such as but not limited to elevation. These additional observables may be included as additional channels into the deep-learning algorithm.
In some example embodiments, the deep learning algorithm may utilize deep residual networks wherein blocks of CNN modules are repeated with residual learning implemented as shortcut connections. The architecture may also use generative approaches, such as generative-adversarial networks and diffusion models.
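The shortcut connection at the heart of residual learning can be illustrated in miniature: a block computes a correction from its input and adds that correction back to the input. The one-dimensional convolution below is a hypothetical stand-in for the CNN building blocks; a real architecture would use 2D convolutions with learned kernels.

```python
def conv1d(signal, kernel):
    """Same-length 1D convolution with zero padding; a toy stand-in
    for the CNN modules described above."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [sum(padded[i + k] * kernel[k] for k in range(len(kernel)))
            for i in range(len(signal))]

def residual_block(signal, kernel):
    # Residual learning: the block predicts a correction that is
    # added back to its input via a shortcut connection, so the
    # network only has to learn the difference from the identity.
    correction = conv1d(signal, kernel)
    return [x + c for x, c in zip(signal, correction)]

zero_kernel = [0.0, 0.0, 0.0]  # untrained weights: correction is zero
out = residual_block([1.0, 2.0, 3.0], zero_kernel)  # input passes through
```

The pass-through behavior with zero weights is exactly why residual blocks are easy to stack deeply: each block starts near the identity and learns only an incremental refinement.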
Subsequently, the trained model object (e.g., mapping function) is applied on climate projections of the second spatial resolution (e.g., low resolution) to enhance the climate projections of the second spatial resolution and generate the climate projections of the first spatial resolution (e.g., high resolution). In some example embodiments, the first spatial resolution is around 1 km, and the second spatial resolution is around 25 km. The climate projections of the second spatial resolution may, for example, be provided by NEX GDDP.
In some example embodiments, depending on user provided geo-location and prior instances of model training, a pre-trained model object may also be used instead of an on-demand model training.
Computer system 204 can be configured to perform at least one of the steps, operations, or actions described in the different illustrative examples using software, hardware, firmware, or a combination thereof. In some example embodiments, AI system 216 combined with API 206 transforms computer system 204 into a special purpose computer system for on-demand deep-learning for enhancing climate projections. In this example, computer system 204 operates as a tool that can increase at least one of speed, accuracy, or usability of the system. In particular, this increase in performance of computer system 204 can be used for the generation of high-resolution climate projections 220. In one illustrative example, AI system 216 in conjunction with API 206 provides for high-resolution climate projections 220 for multiple climatological variables or hazards per GCM climate scenarios, geo-location and other attributes tailored to user inputs.
The illustration of system 200 in
Turning now to
System 300 comprises image processor 310 configured to read data (e.g., past climatological data of the first resolution) from internal database 308 and pre-process raw data elements, such as satellite data, satellite images and additional informative data elements, such as elevation. Image processor 310 generates past climatological data of the second spatial resolution based on the past climatological data of the first spatial resolution. The past climatological data of the second spatial resolution may be generated by, for example, converting or transforming the past climatological data of the first spatial resolution by interpolating and resampling the data. Thus, low-resolution data (e.g., 25 km) is generated from high-resolution data (e.g., 1 km) by interpolating and resampling the high-resolution data.
Image processor 310 prepares a set of training and validation data in the form of pairs of the past climatological data of the first resolution and the past climatological data of the second resolution for a deep learning model. Thus, the set of training and validation data comprise pairs of low-resolution data and/or images and high-resolution data and/or images. The training and validation data are stored in internal database 308.
System 300 comprises model training engine 312 which can be implemented as an artificial intelligence engine. Model training engine 312 is configured to retrieve the training data and the validation data from internal storage. Model training engine 312 trains deep learning model 314 using the training data. Model training engine 312 then validates deep learning model 314 (e.g., trained model object) based on validation metrics provided by evaluation model 316 and using the validation data retrieved from internal database 308. Based on the validation metrics from evaluation model 316, multiple training iterations may be implemented to train deep learning model 314 (e.g., trained model object) until a desired accuracy is achieved. Finally, a trained model object and associated artifacts are tagged 318 and stored in internal database 308.
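The iterate-train-validate loop run by model training engine 312 can be sketched as follows. The loop structure reflects the disclosure; the stand-in model, the percentage-based metric, and all names are hypothetical.

```python
def train_until_accurate(train_step, validate, max_iters=100, target=0.9):
    """Repeat training iterations, checking a validation metric after
    each one, until a desired accuracy is reached or an iteration
    budget is exhausted (a sketch of model training engine 312)."""
    score = validate()
    for iteration in range(1, max_iters + 1):
        train_step()          # one training pass over the training set
        score = validate()    # validation metric from the evaluation model
        if score >= target:
            return iteration, score
    return max_iters, score

# Toy stand-in: each training step improves the validation score by
# ten percentage points, starting from 50%.
state = {"percent": 50}

def step():
    state["percent"] = min(100, state["percent"] + 10)

def check():
    return state["percent"] / 100

iterations, final_score = train_until_accurate(step, check)
```

In the depicted system, `train_step` would correspond to an optimization pass over deep learning model 314 and `validate` to the metrics supplied by evaluation model 316.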
System 300 comprises climate projection system 320 configured to generate climate projections of the first spatial resolution (e.g., high resolution) from climate projections of the second spatial resolution using the trained model object. Climate projections include climate projection maps and/or climate projection data. Climate projection system 320 comprises processing module 322 which retrieves climate projections of the second spatial resolution (e.g., low resolution) from internal database 308 and processes the climate projections of the second spatial resolution. Climate projections of the second spatial resolution may include images for climate projections per user specified geo-location, CMIP climate scenarios and GCM models. Climate projection system 320 comprises projection module 324 which retrieves the trained model object from internal database 308 and utilizes the trained model object to generate the climate projections of the first spatial resolution (e.g., high resolution) from the climate projections of the second spatial resolution (e.g., low resolution). Climate projection system 320 may implement user provided specifications on trained model object, such as aggregation schemes across multiple GCM forecasts. The climate projections of the first spatial resolution are saved in internal database 308. Subsequently, data and images associated with climate projections of the first spatial resolution are returned to API 306 which may notify client 328 with appropriate download links.
With reference next to
With reference to
The past climatological data (e.g., past observable climatological data) of the first spatial resolution is pre-processed and utilized to generate past climatological data of the second spatial resolution (step 608). The second resolution is a low resolution (e.g., 25 km). In some example embodiments, the past climatological data of the first spatial resolution is pre-processed by combining tiled satellite images, masking satellite images and aggregating satellite images. The past climatological data of the second spatial resolution may be generated by, for example, converting or transforming the past climatological data of the first resolution by interpolating and resampling the data. Thus, low-resolution data (e.g., 25 km) is generated by interpolating and resampling high-resolution observable data (e.g., 1 km).
Next, pairs of past climatological data of the first and second resolutions are formed for on-demand model training (step 612). Each pair comprises data of the first resolution and corresponding data of the second resolution based on user specifications. The pairs of past climatological data of the first and second resolutions are divided or segregated into a training set of pairs and a validation set of pairs (step 616). For example, if there are a total of 100 pairs of past climatological data of the first and second resolutions, 70 pairs may be allocated to the training set and the remaining 30 pairs may be allocated to the validation set. Process 600 then ends at step 620.
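The 70/30 allocation described above can be sketched as a shuffled split over the list of pairs. The function and variable names are illustrative; a fixed seed is used here so the split is reproducible.

```python
import random

def split_pairs(pairs, train_fraction=0.7, seed=0):
    """Divide (low_res, high_res) pairs into a training set and a
    validation set, mirroring the 70/30 example above."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # fixed seed: reproducible split
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# 100 placeholder pairs, as in the example: 70 train, 30 validation.
pairs = [(f"low_{i}", f"high_{i}") for i in range(100)]
train_set, validation_set = split_pairs(pairs)
```

Shuffling before the cut avoids allocating, for example, all early-date imagery to training and all late-date imagery to validation.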
With reference to
With reference next to
Based on the training set, a deep-learning computer algorithm is trained to learn a trained model object which represents a mapping function from the low-resolution images to the high-resolution images for a given climatological variable (step 908). In some illustrative embodiments, this mapping function may incorporate learning from relevant additional observables, such as but not limited to elevation. These additional observables may be included as additional channels into the deep learning architecture. In some illustrative embodiments, the deep learning architecture may comprise convolutional neural networks (CNN) as building blocks, a type of neural network architecture suited for image data. In some illustrative embodiments, the architecture may utilize deep residual networks wherein blocks of CNN modules are repeated with residual learning implemented as shortcut connections. The architecture may also use generative approaches, such as generative adversarial networks and diffusion models. The mapping function is also referred to as the trained model object. The accuracy of the trained model is determined utilizing the validation set (step 912). The trained model object is applied to the validation set to produce high-resolution images. The accuracy of the trained model object is determined by comparing the high-resolution images generated by the trained model object to the high-resolution images in the validation set.
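The accuracy comparison in step 912 requires a quantitative metric between the generated and true high-resolution images. The disclosure does not fix a specific metric; root-mean-square error over pixel values, sketched below, is one plausible choice.

```python
import math

def rmse(predicted, reference):
    """Root-mean-square error between a model-generated high-resolution
    image and the true high-resolution image from the validation set,
    both flattened to lists of pixel values. One possible validation
    metric; the choice of metric is an assumption here."""
    assert len(predicted) == len(reference)
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))

# Two pixels match exactly; the third is off by 2.0.
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Lower values indicate that the trained model object reproduces the held-out high-resolution imagery more faithfully; the training loop can stop once the metric falls below a chosen threshold.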
The trained model object along with relevant artifacts are saved in the internal storage (step 916). The above steps are repeated for each additional climatological variable (step 920). If there are no additional climatological variables, the process ends at step 924.
With reference to
In some illustrative embodiments, based on the user query, on-demand model training may not be necessary if a pre-trained model object from process 900 can be used. In such scenarios,
With reference next to
In this example, LST images 1204 (illustrated in
Next, the trained model object is applied to enhance climate projections of a low resolution to generate climate projections of a high resolution. In this illustrative example, the trained model object is applied to NEX GDDP projections 1220 (illustrated in
Turning now to
Processor unit 1304 serves to execute instructions for software that may be loaded into memory 1306. Processor unit 1304 may be a number of processors, a multi-processor core, or some other type of processor, depending on the particular implementation. In an embodiment, processor unit 1304 comprises one or more conventional general-purpose central processing units (CPUs). In an alternate embodiment, processor unit 1304 comprises one or more graphics processing units (GPUs).
Memory 1306 and persistent storage 1308 are examples of storage devices 1316. A storage device is any piece of hardware that is capable of storing information, such as, for example, without limitation, at least one of data, program code in functional form, or other suitable information either on a temporary basis, a permanent basis, or both on a temporary basis and a permanent basis. Storage devices 1316 may also be referred to as computer readable storage devices in these illustrative examples. Memory 1306, in these examples, may be, for example, a random access memory or any other suitable volatile or non-volatile storage device. Persistent storage 1308 may take various forms, depending on the particular implementation.
For example, persistent storage 1308 may contain one or more components or devices. For example, persistent storage 1308 may be a hard drive, a flash memory, a rewritable optical disk, a rewritable magnetic tape, or some combination of the above. The media used by persistent storage 1308 also may be removable. For example, a removable hard drive may be used for persistent storage 1308. Communications unit 1310, in these illustrative examples, provides for communications with other data processing systems or devices. In these illustrative examples, communications unit 1310 is a network interface card.
Input/output unit 1312 allows for input and output of data with other devices that may be connected to data processing system 1300. For example, input/output unit 1312 may provide a connection for user input through at least one of a keyboard, a mouse, or some other suitable input device. Further, input/output unit 1312 may send output to a printer. Display 1314 provides a mechanism to display information to a user.
Instructions for at least one of the operating system, applications, or programs may be located in storage devices 1316, which are in communication with processor unit 1304 through communications framework 1302. The processes of the different embodiments may be performed by processor unit 1304 using computer-implemented instructions, which may be located in a memory, such as memory 1306.
These instructions are referred to as program code, computer-usable program code, or computer readable program code that may be read and executed by a processor in processor unit 1304. The program code in the different embodiments may be embodied on different physical or computer readable storage media, such as memory 1306 or persistent storage 1308.
Program code 1318 is located in a functional form on computer readable media 1320 that is selectively removable and may be loaded onto or transferred to data processing system 1300 for execution by processor unit 1304. Program code 1318 and computer readable media 1320 form computer program product 1322 in these illustrative examples. In one example, computer readable media 1320 may be computer readable storage media 1324 or computer readable signal media 1326.
In these illustrative examples, computer readable storage media 1324 is a physical or tangible storage device used to store program code 1318 rather than a medium that propagates or transmits program code 1318. Computer readable storage media 1324, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Alternatively, program code 1318 may be transferred to data processing system 1300 using computer readable signal media 1326. Computer readable signal media 1326 may be, for example, a propagated data signal containing program code 1318. For example, computer readable signal media 1326 may be at least one of an electromagnetic signal, an optical signal, or any other suitable type of signal. These signals may be transmitted over at least one of communications links, such as wireless communications links, optical fiber cable, coaxial cable, a wire, or any other suitable type of communications link.
The different components illustrated for data processing system 1300 are not meant to provide architectural limitations to the manner in which different embodiments may be implemented. The different illustrative embodiments may be implemented in a data processing system including components in addition to or in place of those illustrated for data processing system 1300. Other components shown in
As used herein, “a number of,” when used with reference to items, means one or more items. For example, “a number of different types of networks” is one or more different types of networks.
Further, the phrase “at least one of,” when used with a list of items, means different combinations of one or more of the listed items can be used, and only one of each item in the list may be needed. In other words, “at least one of” means any combination of items and number of items may be used from the list, but not all of the items in the list are required. The item can be a particular object, a thing, or a category.
For example, without limitation, “at least one of item A, item B, or item C” may include item A, item A and item B, or item B. This example also may include item A, item B, and item C or item B and item C. Of course, any combinations of these items can be present. In some illustrative examples, “at least one of” can be, for example, without limitation, two of item A; one of item B; and ten of item C; four of item B and seven of item C; or other suitable combinations.
The flowcharts and block diagrams in the different depicted embodiments illustrate the architecture, functionality, and operation of some possible implementations of apparatuses and methods in an illustrative embodiment. In this regard, each block in the flowcharts or block diagrams can represent at least one of a module, a segment, a function, or a portion of an operation or step. For example, one or more of the blocks can be implemented as program code, hardware, or a combination of the program code and hardware. When implemented in hardware, the hardware may, for example, take the form of integrated circuits that are manufactured or configured to perform one or more operations in the flowcharts or block diagrams. When implemented as a combination of program code and hardware, the implementation may take the form of firmware. Each block in the flowcharts or the block diagrams may be implemented using special purpose hardware systems that perform the different operations or combinations of special purpose hardware and program code run by the special purpose hardware.
In some alternative implementations of an illustrative embodiment, the function or functions noted in the blocks may occur out of the order noted in the figures. For example, in some cases, two blocks shown in succession may be performed substantially concurrently, or the blocks may sometimes be performed in the reverse order, depending upon the functionality involved. Also, other blocks may be added in addition to the illustrated blocks in a flowchart or block diagram.
The different illustrative examples describe components that perform actions or operations. In an illustrative embodiment, a component may be configured to perform the action or operation described. For example, the component may have a configuration or design for a structure that provides the component an ability to perform the action or operation that is described in the illustrative examples as being performed by the component.
Many modifications and variations will be apparent to those of ordinary skill in the art. Further, different illustrative embodiments may provide different features as compared to other illustrative embodiments. The embodiment or embodiments selected are chosen and described in order to best explain the principles of the embodiments, the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.