Sargassum is a genus of macroalgae (seaweed) prevalent in oceans, including shallow coastal waters. Sargassum can present problems for local environments, tourism, and economies when washed ashore. It can also be a desirable raw material useful for the production of fertilizers, papers, cosmetics and other commercial products. The presence of Sargassum may also indicate opportunities for commercial fishing. Likewise, other forms of floating matters (e.g., marine debris) can also have beneficial or adverse impacts on the environment.
Accordingly, systems, methods, and media for automatic detection and quantification of marine macroalgae and other floating matters are desirable.
In accordance with some embodiments of the disclosed subject matter, systems, methods, and media for transforming geospatial images include: receiving, from a user device, a request for geospatial data indicating concentrations of aquatic macroalgae for a geographic target region identified by the request; accessing multispectral aerial images of the target region; preprocessing the aerial images, wherein areas obscured by cloud cover in the aerial images are masked and brightness values are adjusted to compensate for atmospheric scattering; determining, using one or more characteristics of the aerial images, an image type for each of the aerial images; generating one or more geospatial data images by: providing preprocessed aerial images of each image type to a deep convolutional neural network (DCNN) trained using images having that image type; receiving, as outputs from each DCNN, image data indicating whether macroalgae are present in regions corresponding to each pixel of the aerial images; altering pixel values of the aerial images to visually indicate the presence of macroalgae in regions corresponding to the altered pixel values; and providing the one or more geospatial data images to the user device via a user interface.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
Various objects, features, and advantages of the disclosed subject matter can be more fully appreciated with reference to the following detailed description of the disclosed subject matter when considered in connection with the following drawings, in which like reference numerals identify like elements.
In recent years, massive blooms of pelagic Sargassum have occurred in the Atlantic Ocean, Caribbean Sea, and Gulf of Mexico, and satellite imagery has been used operationally to monitor and track the blooms. However, limited by the coarse resolution and other confounding factors, there is often a data gap in nearshore waters, and the uncertainties in the estimated Sargassum abundance in offshore waters are also unclear. Higher-resolution satellite data may overcome these limitations, yet this potential is hindered by the lack of reliable methods to accurately detect and quantify Sargassum in an automatic fashion.
Systems and methods disclosed herein can address this challenge by combining large quantities of high-resolution satellite data with deep learning. For example, data from resources such as the Multispectral Instrument (MSI, 10-20 m), Operational Land Imager (OLI, 30 m), WorldView-II (WV-2, 2 m), and/or PlanetScope/Dove (3 m) can be used with a deep neural network, such as a deep convolutional neural network (DCNN), to extract Sargassum features and quantify Sargassum biomass density or areal coverage. One type of DCNN, known as U-net, can be leveraged and implemented in certain methods and embodiments described herein.
In one experiment, the inventors were able to implement an automated method that could extract Sargassum features while discarding other confusing features (waves, currents, phytoplankton blooms, clouds, cloud shadows, or striping noise). For Sargassum biomass estimated from OLI and MSI images, results indicated an accuracy of ~92% and 90%, respectively, when evaluated using images from the same sensor. When Sargassum areal coverage was estimated from WV-2 and Dove images, accuracies of ~98% and 80% were obtained, respectively. When images from different sensors were cross-compared, methods using OLI images revealed ~35% more Sargassum biomass than methods using MODIS, in 14 OLI images collected in the Caribbean Sea (path/row: 001/050). On 3 Jun. 2019 and 5 Jun. 2018, methods using Dove images showed ~150% more Sargassum coverage than the same-day MODIS images for their common observation area (~230,000 km²) in the Gulf of Mexico. Compared to quasi-simultaneous Dove images, the MSI and OLI images may have underestimated Sargassum by at least ~360% and ~70%, respectively. The morphological characteristics of Sargassum features from these high-resolution data are also reported to facilitate management actions. The findings here not only fill the knowledge gaps and coverage gaps from previous studies, but more importantly pave the road toward operational monitoring and tracking of Sargassum features in nearshore waters.
The image analysis system 100 may operate within a computing environment 199 which may be accessible from a communications network 104 such as the Internet. The computing environment may include one or more data stores 130 accessible to the image analysis system 100. The computing environment may also provide a user interface 108 allowing users such as the user 102 to interact with the image analysis system 100, including transmitting analysis requests (e.g., the request 105) via the network 104. The image analysis system 100 may be configured to retrieve geospatial images such as the geospatial images 190 shown in
The image analysis system 100 may be configured to process geospatial images 190 received as inputs and to transform them into geospatial data images (GSDI, e.g., the geospatial data images 195) which visually represent the results of analyses performed by the system in human-perceivable form in response to requests (e.g., the request 150 received from the user 102 via the user interface 108).
It will be appreciated that
Satellite remote sensing provides timely Sargassum monitoring information, and thus is a useful tool helping resource managers make decisions and develop mitigation strategies. Currently, coarse-resolution sensors including the Moderate Resolution Imaging Spectroradiometer (MODIS), Visible Infrared Imaging Radiometer Suite (VIIRS), MEdium Resolution Imaging Spectrometer (MERIS), and Ocean and Land Color Instrument (OLCI) have been successfully utilized to observe the large-scale Sargassum distributions across the Atlantic Ocean. Correspondingly, a satellite-based near real-time Sargassum Watch System (SaWS) has been established to use both MODIS and VIIRS imagery to monitor Sargassum distributions and to predict Sargassum transport in the Caribbean Sea, as shown in
However, measurements derived from these coarse-resolution sensors often suffer from several limitations. First, uncertainties in the Sargassum estimates are unclear. Sargassum in the ocean can take the form of clumps, mats, or rafts, often smaller than a pixel size. Each sensor has its own lower detection limit. For example, with a signal-to-noise ratio (SNR) of 200:1, the areal detection limit is, according to some estimates, about 1% of a pixel size. From MODIS 1-km observations, the lower detection limit was estimated to be 0.2% of a pixel size (i.e., 2000 m²). It is unclear how much Sargassum these coarse-resolution sensors may “miss” due to such lower detection limits, for example in the weekly Sargassum density images, as can be seen in
The data quality of the coarse-resolution pixels is compromised in coastal waters due to interference of the shallow-water bottom, high amounts of suspended particles, or land adjacency effects. Therefore, in the SaWS, pixels within 30 km of shoreline are often masked to avoid false positives. However, the lack of data in nearshore waters can greatly hinder management actions.
To overcome the limitations of conventional approaches, various high-resolution sensors should be utilized. For example, the 10-30 m resolution Multispectral Instrument (MSI) and Operational Land Imager (OLI) sensors carried by the Sentinel-2 and Landsat-8 satellites are equipped with spectral bands to detect the enhanced Near Infrared (NIR) reflectance caused by floating vegetation including Sargassum.
The floating algae index, or FAI, is defined as the difference between reflectance at 859 nm (vegetation “red edge”) and a linear baseline between the red band (645 nm) and short-wave infrared band (1240 or 1640 nm). Through data comparison and model simulations, FAI has shown advantages over the traditional NDVI (Normalized Difference Vegetation Index) or EVI (Enhanced Vegetation Index) because FAI is less sensitive to changes in environmental and observing conditions (aerosol type and thickness, solar/viewing geometry, and sun glint) and can “see” through thin clouds. The baseline subtraction method provides a simple yet effective means for atmospheric correction, through which floating algae can be easily recognized and delineated in various ocean waters, including the North Atlantic Ocean, Gulf of Mexico, Yellow Sea, and East China Sea. Because similar spectral bands are available on many existing and planned satellite sensors such as Landsat TM/ETM+ and VIIRS (Visible Infrared Imager/Radiometer Suite), the FAI concept is extendable to establish a long-term record of these ecologically important ocean plants.
While these high-resolution sensors are designed primarily for land-based applications, some measurements are also taken over the ocean; however, Sargassum detection often suffers from confusing features induced by clouds, surface waves, or variable image background due to sensor artifacts or changes in water's optical properties. To make things worse, different sensors may have different noise characteristics. Thus, a reliable algorithm to extract Sargassum features automatically from images of an individual sensor type, not to mention a unified algorithm applicable to all high-resolution sensors, is lacking. Because of these technical difficulties, existing systems for Sargassum detection using aerial imagery, such as the Sargassum Early Advisory System (SEAS), often use human intervention in the form of visual interpretation and manual delineation to locate the Sargassum slicks in Landsat imagery to predict potential beaching events.
Certain denoising and feature extraction methods for MSI images will now be described, which may be used in association with the systems described herein. These methods may be used on FAI images that use at least two spectral bands in the NIR or SWIR wavelengths. However, other methods would be more suitable for use with 3-band data (e.g., from the PlanetScope constellation of satellites). In addition, the threshold-based image segmentation methods rely on the accurate estimation of the image background variations, and may need tuning to operate on different image types.
One aspect of the systems and methods described herein that helps overcome certain disadvantages of the prior art is that these systems and methods employ deep-learning techniques that work with the specific datasets (which may be denoised or otherwise pre-processed, as further described herein) to provide automated Sargassum detection and benefit from the ability to process high-resolution imagery. Systems disclosed herein and methods for their use do not require human intervention, but may be configured to be adaptable and refinable to allow users to guide or supervise how the algorithms process data. Automated methods described herein could use a U-structured DCNN model to enable a unified approach to extracting Sargassum features from aerial imagery and quantifying Sargassum abundance from multi-sensor high-resolution imagery in order to fill in data gaps for Sargassum abundance in nearshore waters.
For purposes of illustration, aspects of systems and methods disclosed herein will first be explained below with reference to particular example embodiments and experiments performed to validate the performance of such example embodiments. Additional and alternative embodiments and features of those embodiments will then be described in light of the examples.
A first example Sargassum quantification workflow and the details of a DCNN suitable for use in performing Sargassum extraction according to embodiments herein will be described below, followed by performance evaluations using the MSI, OLI, WV-2, and Dove image datasets. Sargassum biomasses/coverages quantified from these high-resolution images were compared with results obtained using concurrent MODIS measurements to establish empirical relationships between Sargassum estimates generated from these image types. Sargassum morphologies generated from the OLI, MSI, and Dove images are described as examples. Operational considerations for near real-time Sargassum monitoring using satellite imagery will be discussed in light of the examples presented.
Fifty-three Sentinel-2 MSI and twenty-one Landsat-8 OLI Level-1C (top-of-atmosphere (TOA) reflectance) images collected near the Lesser Antilles Islands and Gulf of Mexico (GOM) in 2018 and 2019 were downloaded from the USGS EarthExplorer (https://earthexplorer.usgs.gov/), and processed to yield Rayleigh-corrected reflectance values (Rrc, unitless) at 10-m and 30-m resolution, respectively. Using the multispectral Rrc data, the FAI products were generated to quantify the enhanced reflectance of Sargassum in the near-infrared (NIR) wavelengths by comparing to the nearby red and shortwave-infrared (SWIR) bands using the following equations:
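$$\mathrm{FAI} = R_{rc,\mathrm{NIR}} - R'_{rc,\mathrm{NIR}}$$
$$R'_{rc,\mathrm{NIR}} = R_{rc,\mathrm{RED}} + \left(R_{rc,\mathrm{SWIR}} - R_{rc,\mathrm{RED}}\right) \times \frac{\lambda_{\mathrm{NIR}} - \lambda_{\mathrm{RED}}}{\lambda_{\mathrm{SWIR}} - \lambda_{\mathrm{RED}}} \qquad \text{(Eq. 1)}$$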
where λRED=665 nm, λNIR=865 nm, and λSWIR=1610 nm were selected for Sentinel-2 MSI image data, while λRED=655 nm, λNIR=865 nm, and λSWIR=1610 nm were selected for Landsat-8 OLI image data. In MSI FAI images, pixels with large Rrc1610 (>0.10) were pre-masked to exclude the land and bright cloud pixels and treated as invalid observations (Eq. 2). Similar thresholds were also applied to mask the OLI FAI images before Sargassum extraction.
$$R_{rc,1610} > 0.1,\quad R_{rc,442} > 0.1,\quad \text{and}\quad R_{rc,560} > 0.1 \qquad \text{(Eq. 2)}$$
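As a minimal sketch of Eq. 1 and the Eq. 2 pre-mask, the following NumPy code computes FAI from Rayleigh-corrected reflectance arrays using the band-center wavelengths quoted above. The function names, the dictionary layout, and the choice to represent masked pixels as NaN are illustrative assumptions, and reading the three Eq. 2 conditions as jointly required is one interpretation of the equation rather than a confirmed detail.

```python
import numpy as np

# Band-center wavelengths (nm) quoted in the text for each sensor.
WAVELENGTHS_NM = {
    "MSI": {"red": 665.0, "nir": 865.0, "swir": 1610.0},
    "OLI": {"red": 655.0, "nir": 865.0, "swir": 1610.0},
}


def floating_algae_index(rrc_red, rrc_nir, rrc_swir, sensor="MSI"):
    """Eq. 1: NIR reflectance minus a linear baseline between red and SWIR."""
    wl = WAVELENGTHS_NM[sensor]
    slope = (wl["nir"] - wl["red"]) / (wl["swir"] - wl["red"])
    baseline = rrc_red + (rrc_swir - rrc_red) * slope
    return rrc_nir - baseline


def premask_invalid(fai, rrc_1610, rrc_442, rrc_560, threshold=0.1):
    """Set FAI to NaN where the Eq. 2 thresholds flag land or bright cloud.

    Assumption: all three conditions must hold for a pixel to be masked.
    """
    invalid = (rrc_1610 > threshold) & (rrc_442 > threshold) & (rrc_560 > threshold)
    return np.where(invalid, np.nan, fai)


# Three illustrative pixels: clear water, floating vegetation, bright cloud.
red = np.array([0.02, 0.02, 0.30])
nir = np.array([0.02, 0.06, 0.30])
swir = np.array([0.01, 0.01, 0.25])
blue = np.array([0.05, 0.05, 0.30])
green = np.array([0.04, 0.04, 0.30])

fai = floating_algae_index(red, nir, swir, sensor="MSI")
fai_masked = premask_invalid(fai, swir, blue, green)
```

The vegetation pixel yields a clearly larger FAI than the water pixel because only its NIR reflectance rises above the red-to-SWIR baseline, while the cloud pixel is removed by the pre-mask.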
Worldview-2 Data: four Worldview-2 (WV-2) images collected in the northern GOM during 2014 to 2015 containing partial Sargassum coverage were acquired from DigitalGlobe. The data were processed into TOA reflectance and the FAI products were generated using the TOA reflectance centered on 659 nm, 833 nm, and 949 nm. FIG. 3C shows an example of the Sargassum features observed on the WV-2 FAI images. Three images were used for model training and one image was selected for validation.
Dove Data: a total of 4,567, 1, and 7,457 three-band Dove images collected on 3 Jun. 2019, 1 Jun. 2019, and 5 Jun. 2018, respectively, in the GOM were acquired from Planet Labs to test the applicability of the Sargassum extraction method. Considering the difficulties in conducting accurate atmospheric correction, the radiance data were directly used to detect Sargassum. Note that the four-band Dove data, which contain the NIR wavelengths, were mostly unavailable in the open water area within the GOM; therefore, only the three-band RGB data were used in this example. Table 1 below summarizes the high-resolution satellite images used for detecting and quantifying Sargassum in this example. The three-band Dove image data provided daily coverage over the GOM, whereas the four-band data (with the fourth band in the NIR wavelength) cover coastal waters only.
MODIS Data: To estimate the amount of Sargassum missed by coarse-resolution sensors, MODIS data collected in the GOM on 3 Jun. 2019 and 5 Jun. 2018 and in the Central West Atlantic in 2018 were processed to compare with quasi-simultaneous and co-located MSI, OLI, and Dove observations. MODISA and MODIST Level-0 data were obtained from the U.S. National Aeronautics and Space Administration (NASA) Goddard Space Flight Center (http://oceancolor.gsfc.nasa.gov), and processed to generate Rrc data using SeaDAS software (version 7.5). The corresponding MODIS FAI images were generated using Rrc data centered at 667 nm, 748 nm, and 869 nm (Eq. 1). The Sargassum-containing pixels were extracted, and the fractional areal coverages were quantified. These area coverages were converted to biomass densities using a biomass model.
In one embodiment, training data were prepared by human experts using a semi-automated IDL GUI. Specifically, the locations of the Sargassum features were first identified through visual inspection, and then these features were extracted using adjusted thresholding to define bounded regions, subject to morphological constraints optimized by human inspection. These images along with the corresponding extraction results were used as the training data for the AI algorithm. Once the location of the features and the optimal extraction parameters were selected, the extraction process was automated.
An example training data preparation process, as illustrated in
In various embodiments, Sargassum extraction and quantification can employ the following workflow or similar workflows. First, data under cloudy conditions and other unfavorable observing conditions could be discarded or treated as “no observations.” Then, the Sargassum-containing pixels are extracted using a deep convolutional neural network trained for the specific data types (these data types are described below). Finally, the corresponding biomass densities/areal coverages are quantified from all Sargassum-containing pixels.
Compared to background seawater, cloud pixels also show enhanced signals on FAI images and thus need to be masked before applying the Sargassum extraction process. Because clouds normally show higher reflectance in the SWIR wavelengths, a simple threshold can remove most thick clouds (Eq. 2). However, this preliminary mask cannot identify thin clouds, and a unified threshold may over-mask valid water observations under strong sun glint. For MSI and OLI data, an H_SWIR cloud mask was applied to mask the cloud-contaminated pixels. Instead of directly applying a single threshold over the entire image, the H_SWIR cloud masking method performs segmentation on the scaled reflectance obtained by subtracting the background reflectance, according to Eq. 3:
$$R_{rc,\mathrm{SWIR}} - \overline{R}_{rc,\mathrm{SWIR}} > T_{\mathrm{SWIR}} \qquad \text{(Eq. 3)}$$
where RrcSWIR
In Dove images, cloud features are highly variable, making it challenging to effectively identify them. However, Sargassum detection is still possible, even through moderately thick clouds. Therefore, only those pixels with blue radiance greater than 17 W·sr⁻¹·m⁻² were masked as invalid observations. The WV-2 images used in this example were mostly cloud free, and cloud masking was not considered.
In some embodiments, as part of the process of preprocessing images, a system selects the best available satellite images for a given area and time period based on the image acquisition time, location, resolution, cloud cover, sun glint, or noise level. In some embodiments, images collected by drones can also be used. Prior to Sargassum extraction the H_SWIR cloud masking algorithm can be applied to mask pixels corresponding to cloud cover, as described briefly above and in greater detail below.
H_SWIR Cloud Masking: Because pixels corresponding to clouds often result in higher FAI values than adjacent pixels corresponding to water (similarly to Sargassum), it is desirable to exclude such “cloud pixels” from the Sargassum extraction process to avoid false positives. Because of the relatively lower variation of water reflectance across different pixels and less confusing bright surface structures, a threshold-based cloud detection approach was developed to identify pixels with high reflectance in MSI bands 11 and 12 (1610 and 2202 nm).
Because there are many “noise” patterns from the wave-induced glint, total variation (TV) filtering (weight=0.05) can be applied to Rrc images in both MSI bands 11 and 12. Cloud detection can then be performed on the denoised Rrc images. To capture the large-scale image variations, the background ocean reflectance in bands 11 and 12 can be estimated using an iterative mean background filter with a 200×200 window size. Then, the pixels with locally high reflectance can be extracted by subtracting the background and applying the corresponding segmentation threshold. Here, the cloud masking method is named the H_SWIR method:
$$R_{rc,1610} - \overline{R}_{rc,1610} > T_{1610} \quad \text{and} \quad R_{rc,2202} \ldots$$
where Rrc1610
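The background-subtraction idea behind the H_SWIR mask can be sketched as follows, assuming the SWIR reflectance image has already been denoised (the TV filtering step is not reproduced here). The threshold value, the number of iterations, and the way bright pixels are excluded between iterations are hypothetical choices for illustration, since the text does not specify them. An even window such as 200 is rounded to the nearest odd box (201) by this implementation.

```python
import numpy as np


def _box_sum(a, half):
    """Sum of `a` over a (2*half+1)-wide square box, using a summed-area table."""
    w = 2 * half + 1
    padded = np.pad(a, half, mode="constant")
    s = np.pad(padded.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))
    return s[w:, w:] - s[:-w, w:] - s[w:, :-w] + s[:-w, :-w]


def background_mean(rrc, window=200, n_iter=3, threshold=0.02):
    """Iteratively estimate the local mean background, ignoring bright pixels."""
    half = window // 2
    valid = np.isfinite(rrc)
    filled = np.where(valid, rrc, 0.0)
    keep = valid.copy()
    background = np.full(rrc.shape, np.nan)
    for _ in range(n_iter):
        total = _box_sum(np.where(keep, filled, 0.0), half)
        count = _box_sum(keep.astype(float), half)
        background = np.where(count > 0, total / np.maximum(count, 1.0), np.nan)
        # Exclude locally bright pixels so they do not bias the next estimate.
        keep = valid & ~((filled - background) > threshold)
    return background


def h_swir_cloud_mask(rrc_swir, threshold=0.02, window=200, n_iter=3):
    """Flag pixels whose reflectance exceeds the local background by `threshold`."""
    background = background_mean(rrc_swir, window, n_iter, threshold)
    return (rrc_swir - background) > threshold


# Synthetic example: dim, slightly noisy water with one small bright cloud.
rng = np.random.default_rng(0)
image = 0.01 + 0.001 * rng.standard_normal((60, 60))
image[20:25, 30:35] += 0.05
cloud_mask = h_swir_cloud_mask(image, threshold=0.02, window=20)
```

Masks computed separately for the 1610 nm and 2202 nm bands could then be combined by the caller; the combination rule is left open here.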
In some examples, a DCNN with both noisy images and “clean” images denoised using TNRD-based denoising as training images can be used. In regions with strong noise, it is often very challenging to manually delineate the real Sargassum features, making it difficult to prepare training samples for those noisy regions. As a substitute, MSI FAI images with water-containing pixels (i.e., no Sargassum, no clouds) having known noise features may be combined with clean training images with Sargassum-containing pixels to simulate noisy data (e.g., including wave-induced glint patterns and cloud residuals). The DCNN may then be trained to detect Sargassum in noisy images.
To create typical noise patterns, FAI images can be cropped to 400×400 sub-images, as an example. Water images with various wave and cloud residual patterns may be chosen. Instead of extracting the features in these “noisy” images, pure noise components of the water images can be superimposed on the images with delineated true Sargassum features to create simulated “noisy” images for training the DCNN. Clean images can be generated by adding the median-filtered noise image to the background-subtracted Sargassum images, while the corresponding noisy images can be generated by adding the original noisy water images, as given by Equation 5 below:
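$$\mathrm{FAI}_{\mathrm{clean}} = \mathrm{FAI}_{s} + \mathrm{FAI}_{\mathrm{water\_clean}}$$
$$\mathrm{FAI}_{\mathrm{noisy}} = \mathrm{FAI}_{s} + \mathrm{FAI}_{\mathrm{water}} \qquad \text{(Eq. 5)}$$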
where FAIclean and FAInoisy represent the clean and noisy images, respectively. FAIs is the Sargassum image after subtracting the image background. The FAIwater_clean and FAIwater are the median filtered noisy water image and the original noisy water image, respectively.
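The construction in Eq. 5 can be illustrated with a short NumPy sketch that builds one clean/noisy training pair from a background-subtracted Sargassum image and a noisy water image. The median filter size of 5 pixels and the function names are assumptions made for illustration; the text does not state the filter size that was used.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter(image, size=5):
    """Simple 2-D median filter with reflected edges (size must be odd)."""
    half = size // 2
    padded = np.pad(image, half, mode="reflect")
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))


def make_training_pair(fai_sargassum, fai_water, size=5):
    """Eq. 5: return (FAI_clean, FAI_noisy) for one simulated training sample."""
    fai_water_clean = median_filter(fai_water, size)
    fai_clean = fai_sargassum + fai_water_clean
    fai_noisy = fai_sargassum + fai_water
    return fai_clean, fai_noisy


# Synthetic inputs: a thin Sargassum slick and a noisy water background.
rng = np.random.default_rng(1)
water = 0.001 * rng.standard_normal((40, 40))
slick = np.zeros((40, 40))
slick[10:12, 5:30] = 0.02
fai_clean, fai_noisy = make_training_pair(slick, water)
```

Both images share the same Sargassum signal, so the pair differs only in how much of the water noise is retained.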
In embodiments herein, Sargassum extraction on satellite images can be performed without including an independent denoising process. This is because most noise signals can be ruled out by the DCNN and would not affect the extraction performance.
In the example being discussed, a deep learning framework (e.g., a DCNN as discussed above) combining a U-net structure and a VGG-16-based encoder was designed for Sargassum extraction from high-resolution satellite images. This unique architecture is able to capture context, as well as to precisely locate targeted features. Using a pre-trained encoder optimized on the ImageNet dataset can further improve the segmentation performance. Therefore, pre-trained weights (see the purple arrows in
The input to the DCNN can be either single-band or multi-spectral images. Because Sargassum shows enhanced signals and distinctive spatial patterns on the FAI images, FAI images were selected as the model input to determine the Sargassum locations on MSI, OLI, and WV-2 images. For Dove, the three-band RGB images were used as the model input due to the lack of NIR bands. The model outputs pixels forming parts of detected features, i.e., pixels corresponding to areas where Sargassum is present in this example.
At 504A, the process 500A retrieves data for a geographic area identified by the request (e.g., geospatial images 190 for the identified area or other suitable data).
At 506A the process 500A pre-processes the data retrieved at 504A. The pre-processing may include masking pixels in images corresponding to cloud cover and/or correcting pixel values to account for atmospheric scattering and absorption at various wavelengths (e.g., calculating Rayleigh-corrected reflectance values as described above). The pre-processing may also include calculating FAI values as described above.
At 508A the process 500A determines one or more data types of the data. As an example, the data retrieved at 504A for the geographic area may include multiple image types (e.g., from different satellite imaging systems having different imaging characteristics such as spectral ranges, resolution, and so on).
At 510A pre-processed data is provided as input to one or more DCNNs (e.g., DCNNs 115) to determine Sargassum quantities indicated by the images. When the input data includes more than one data type, each data type may be provided to a distinct DCNN optimized for that data type.
At 512A the process 500A outputs geospatial data images 590A (e.g., the geospatial data images 195) that visually indicate quantities of Sargassum across the geographic region identified by the request. In some embodiments, any other suitable output data may be generated in addition to the geospatial data images or as an alternative.
At 514A, the process 500A optionally retrieves updated data for the geographic region identified by the request and returns to 506A. For example, the data initially retrieved may correspond to a first time period and the updated data may correspond to a second time period before or after the first time period.
At 516A, the process generates additional geospatial data images that visually represent results of comparing the geospatial data images 595A generated for the first time period and data for the second time period. As one non-limiting example, the additional geospatial data images may indicate changes in Sargassum quantities in the geographic region identified by the request between the first time period and the second time period. As another non-limiting example, the additional geospatial data images may indicate motion of Sargassum rafts between the first time period and the second time period and/or predicted motion of the Sargassum rafts in the future based on motion between the first time period and the second time period.
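The routing of each image type to its own detector (510A) and the comparison of results from two time periods (516A) can be sketched as below. The detector registry, the function names, and the simple threshold functions standing in for trained DCNNs are all hypothetical placeholders; they only illustrate the control flow, not the trained models themselves.

```python
import numpy as np


def detect_by_type(images, detectors):
    """Run each (image_type, pixels) pair through the detector for its type."""
    results = []
    for image_type, pixels in images:
        if image_type not in detectors:
            raise KeyError(f"no detector registered for image type {image_type!r}")
        results.append(detectors[image_type](pixels))
    return results


def change_map(first_mask, second_mask):
    """Return +1 where Sargassum appeared, -1 where it disappeared, 0 otherwise."""
    return second_mask.astype(np.int8) - first_mask.astype(np.int8)


# Placeholder detectors; in practice each entry would wrap a DCNN trained
# on images of that type (FAI input for MSI, three-band RGB input for Dove).
detectors = {
    "MSI": lambda fai: fai > 0.01,
    "Dove": lambda rgb: rgb[..., 1] > 0.5,
}

msi_fai = np.array([[0.0, 0.02], [0.0, 0.0]])
dove_rgb = np.zeros((2, 2, 3))
dove_rgb[1, 1, 1] = 0.9

masks = detect_by_type([("MSI", msi_fai), ("Dove", dove_rgb)], detectors)
later_mask = np.array([[True, False], [False, False]])
changes = change_map(masks[0], later_mask)
```

Keeping one detector per image type mirrors the statement that each data type may be provided to a distinct DCNN optimized for that type.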
At step 502B, the system 100 obtains an image. In some examples, the image can include a satellite image including multiple spectral pixel values for each pixel of the image. However, the image can be any other suitable image. In further examples, the multiple spectral pixel values can correspond to multiple wavelengths. In some examples, a wavelength can include a specific wavelength, a band, or a wavelength range between two wavelengths. In further examples, the image can include an object to be detected with the trained deep learning model described below. The object can be Sargassum or other macroalgae, marine debris (i.e., marine litter, both plastic and non-plastic), pollen, sea snot (i.e., marine mucilage), or any other suitable sea floating object.
In some examples, the image can include a preprocessed image. For example, the system 100 can receive an original satellite image collected from a satellite sensor and preprocess the original satellite image to generate the preprocessed image. In some examples, the satellite sensor can include Multispectral Instrument (MSI, 10-60 m), Operational Land Imager (OLI, 30 m), WorldView-II (WV-2, ~2 m), PlanetScope/Dove (Dove, 3 m), Moderate Resolution Imaging Spectroradiometer (MODIS, 250-1000 m) (e.g., mounted on Terra and/or Aqua satellites), Visible Infrared Imaging Radiometer Suite (VIIRS, 375 m-750 m), Ocean and Land Color Instrument (OLCI, 300 m-1000 m), Medium Resolution Imaging Spectrometer (MERIS, 300 m), Hyperspectral Imager for the Coastal Ocean (HICO, 353 m-1080 m), or any other suitable sensor capturing medium or high-resolution satellite images. In even further examples, the original satellite image can be obtained from any suitable database (e.g., the USGS EarthExplorer, DigitalGlobe, Planet Labs, etc.).
For some sensors (e.g., MSI, OLI, MODIS, VIIRS, OLCI, MERIS, Dove, HICO, etc.), the multiple spectral pixel values of the image can include or indicate multiple corrected reflectance values of the preprocessed image corresponding to the multiple wavelengths. For example, the system 100 can preprocess the original satellite image to generate corrected reflectance data (e.g., Rayleigh-corrected reflectance (Rrc(λ), dimensionless)). In some examples, the preprocessing of the original satellite image can be performed using suitable software (e.g., ACOLITE, NASA's SeaDAS software, etc.). In further examples, each pixel of the image can include multiple corrected reflectance values corresponding to the multiple wavelengths. For example, each pixel of the image can have a first corrected reflectance value (e.g., Rrc, RED) at a first wavelength (e.g., λRED=665 nm), a second corrected reflectance value (e.g., Rrc, NIR1) at a second wavelength (e.g., λNIR1=865 nm), a third corrected reflectance value (e.g., Rrc, NIR2 or SWIR) at a third wavelength (e.g., λNIR2 or SWIR=1610 nm), and/or any other suitable corrected reflectance value at a suitable wavelength. It should be appreciated that any other suitable corrected reflectance data (e.g., atmospherically corrected surface reflectance) can also be used for the preprocessed image.
For some sensors (e.g., Dove, etc.), the top-of-atmosphere (TOA) radiance data can be directly used to detect a remote object (e.g., Sargassum or other macroalgae, marine debris (plastic or non-plastic), pollen, sea snot, etc.). In further examples, the spectral pixel values at RGB wavelengths of the image can be used to detect the remote object.
In further examples, the system 100 can mask first pixels in the image. The first pixels can correspond to a cloud area, a land area in the image, a strong sun glint, or any other invalid observations. In some examples, to mask the first pixels in the image, the system 100 can mask a subset pixel of the first pixels in the image when a difference between a first corrected reflectance value at a short-wave infrared (SWIR) or a near-infrared (NIR) wavelength of the multiple wavelengths and a second corrected reflectance value at the SWIR or NIR wavelength is higher than a threshold. In some examples, the NIR wavelength can be a wavelength between 700 nm and 1000 nm while the SWIR wavelength can be a wavelength between 1000 nm and 2200 nm. In other examples, the NIR wavelength can be a wavelength between 700 nm and 2200 nm to include the SWIR wavelength. In further examples, the first corrected reflectance value can include a denoised reflectance value. In further examples, the second corrected reflectance value can include a background reflectance value. For example, the background reflectance value can include an average of a subset of the image, and the subset of the image can include the subset pixel. For example, cloud pixels can show enhanced signals in the image, and it is thus desirable to mask them before applying the remote object extraction process. In some examples, a simple threshold can remove most thick clouds. In other examples, the system can mask clouds by conducting the image segmentation after estimating the scaled reflectance by subtracting background reflectance (e.g., RrcSWIR
At step 504B, the system 100 determines a spectral differencing value for each pixel of the image. In some examples, the spectral differencing value for each pixel of the image can be used to determine whether the respective pixel contains the object (e.g., Sargassum or other macroalgae, marine debris (i.e., marine litter, both plastic and non-plastic), pollen, sea snot (i.e., marine mucilage), or any other suitable sea floating object). The spectral differencing value can be defined as or include:
$$\Delta R = R_{T} - R_{W} = \left[\chi R_{FM} + (1 - \chi) R_{W}\right] - R_{W} = \chi \left(R_{FM} - R_{W}\right) \qquad \text{(Eq. 6)}$$
where “T” stands for the target pixel, “FM” (floating matter) is for the object, “W” is for water, χ (0.0%-100%) is the subpixel proportion of floating matter, RFM is the floating matter surface reflectance at χ=100% (i.e., endmember reflectance), and RW is the water surface reflectance from pixels near the floating matter. For brevity, the wavelength dependence of R is omitted; in some examples, the wavelength can be at least one of a red, NIR, or SWIR wavelength. In some examples, to be able to stand out in the image, a pixel needs to be significantly different from the surrounding pixels.
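Rearranging Eq. 6 gives the subpixel proportion directly as χ = ΔR/(RFM − RW). The following sketch applies that rearrangement to an array of target-pixel reflectances. The endmember and water reflectance values are made-up illustrative numbers, and clipping the result to the range 0 to 1 is an added assumption to keep noisy estimates physically meaningful.

```python
import numpy as np


def subpixel_fraction(r_target, r_water, r_fm):
    """Invert Eq. 6: chi = (R_T - R_W) / (R_FM - R_W), clipped to [0, 1]."""
    contrast = r_fm - r_water
    if np.any(contrast == 0):
        raise ValueError("floating-matter and water reflectance must differ")
    chi = (np.asarray(r_target, dtype=float) - r_water) / contrast
    return np.clip(chi, 0.0, 1.0)


# Hypothetical reflectances at a single NIR wavelength.
r_water = 0.02   # nearby water pixels
r_fm = 0.30      # floating-matter endmember (chi = 100%)

# Forward-mix a pixel with 25% coverage, then recover the fraction.
r_mixed = 0.25 * r_fm + 0.75 * r_water
chi = subpixel_fraction(np.array([r_water, r_mixed, r_fm]), r_water, r_fm)

# Areal coverage for a 10 m pixel (100 m^2 per pixel).
covered_area_m2 = chi * 10.0 * 10.0
```

Multiplying χ by the pixel footprint, as in the last line, converts the per-pixel fraction into an areal coverage estimate.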
In some examples, the spectral differencing value can include a floating algae index (FAI) value. The FAI value can be generated to quantify the enhanced reflectance of the object in the near-infrared (NIR) wavelengths by comparing the nearby RED and another NIR or shortwave-infrared (SWIR) bands. In some examples, to determine the FAI value, the system 100 can determine the FAI value for each pixel of the image based on a difference between a first corrected reflectance value of the multiple corrected reflectance values at a first wavelength of the multiple wavelengths and a second corrected reflectance value at the first wavelength of the multiple wavelengths. In some examples, the first wavelength can include a first near-infrared (NIR) wavelength. In further examples, the system 100 can determine the second corrected reflectance value at the first wavelength based on a second corrected reflectance value of the plurality of corrected reflectance values at a second NIR wavelength of the multiple wavelengths, and a third corrected reflectance value of the plurality of corrected reflectance values at a red wavelength of the plurality of wavelengths.
For example, the FAI is a measure of the vegetation red-edge reflectance. To minimize the impact of aerosols as well as thin clouds and moderate sun glint, the FAI can be calculated as the reflectance in a NIR band referenced against a baseline formed linearly between two neighboring bands with Equation 7:
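$\mathrm{FAI} = R_{rc,\mathrm{NIR1}} - R'_{rc,\mathrm{NIR1}}$
$R'_{rc,\mathrm{NIR1}} = R_{rc,\mathrm{RED}} + (R_{rc,\mathrm{NIR2}} - R_{rc,\mathrm{RED}}) \times (\lambda_{\mathrm{NIR1}} - \lambda_{\mathrm{RED}}) / (\lambda_{\mathrm{NIR2}} - \lambda_{\mathrm{RED}})$ (Eq. 7)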
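As a non-limiting illustration, the FAI of Eq. 7 can be sketched for a single pixel as follows. The function name and argument names are hypothetical and are introduced only for this illustration; the band centers in the usage note are the WV-2 values mentioned elsewhere in this description.

```python
def floating_algae_index(r_red, r_nir1, r_nir2, wl_red, wl_nir1, wl_nir2):
    """Return the FAI of Eq. 7 for one pixel.

    r_*  : corrected reflectance values at the RED, NIR1, and NIR2 bands
    wl_* : band-center wavelengths in nm (hypothetical argument names)
    """
    # Baseline reflectance at NIR1, interpolated linearly between RED and NIR2.
    baseline = r_red + (r_nir2 - r_red) * (wl_nir1 - wl_red) / (wl_nir2 - wl_red)
    # FAI is the NIR1 reflectance in excess of that baseline.
    return r_nir1 - baseline
```

For instance, `floating_algae_index(0.02, 0.05, 0.02, 659, 833, 949)` evaluates a pixel whose NIR1 reflectance rises above a flat RED-to-NIR2 baseline, which yields a positive FAI.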
In some examples, for the image from the MSI sensor, the pixels with large Rrc1610 (>0.10) can be pre-masked to exclude the land and bright cloud pixels and treated as invalid observations (Eq. 8). Similar thresholds were also applied to mask the image from the OLI sensor before the remote object extraction.
$R_{rc,1610} > 0.1$, $R_{rc,442} > 0.1$, and $R_{rc,560} > 0.1$ (Eq. 8)
In some examples, for the image from the WV-2 sensor, the original satellite image can be processed to generate a top-of-atmosphere (TOA) reflectance value (e.g., centered on 659 nm, 833 nm, and 949 nm) for each pixel of the original image. Then, the system can determine the FAI value based on the TOA reflectance value.
In some examples, the spectral differencing value for each pixel of the image can be used to determine whether the pixel contains a certain type of the object. The system 100 achieves this by comparing the spectral differencing value with a referencing spectral value of each possible object using a spectral angle mapper index (SAM). The SAM can be obtained by:
$\mathrm{SAM}\ (\text{degrees}) = \cos^{-1}\left[ \dfrac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2}\,\sqrt{\sum_i y_i^2}} \right]$ (Eq. 9)
where x and y represent two spectra (two different spectral values) and the summation is for band number i from 1 to N. SAM=0° means two parallel spectra in log space (i.e., identical spectral shapes), while SAM=90° means perpendicular spectra (i.e., different types of objects having different spectral values or shapes). SAM<5° indicates that the two spectra are very similar. In some examples, the SAM can be determined separately from the deep learning model.
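As a non-limiting illustration, the spectral angle of Eq. 9 can be sketched as follows. The function name is hypothetical, and the clamping of the cosine is an added numerical safeguard rather than part of Eq. 9.

```python
import math


def spectral_angle_degrees(x, y):
    """Return the spectral angle (Eq. 9), in degrees, between spectra x and y."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("spectra must be non-empty and have the same number of bands")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.degrees(math.acos(cosine))
```

Two spectra that differ only by a scale factor give an angle near 0°, so the index compares spectral shape rather than magnitude.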
At step 506B, the system 100 applies the multiple spectral pixel values of the image and the spectral differencing value for each pixel of the image to multiple corresponding input channels of a trained deep learning model to obtain a probability value for each pixel of the image via an output channel of the trained deep learning model. In some examples, the trained deep learning model can include a U-Net deep learning model. In further examples, the trained deep learning model can include an encoder associated with a VGG16 model and a decoder associated with a U-Net model. In such examples, a sigmoid activation function is used for a final output layer in the U-Net model to produce the probability value for each pixel of the image. In further examples, an original U-Net model is modified to a deep residual U-Net model to take advantage of both the architecture of a convolutional neural network and deep residual learning. For example, for the image from the MODIS sensor, the spectral pixel values in 7 wavelengths (412, 443, 488, 547, 678, 748 and 869 nm) and the spectral differencing value for each pixel of the image can be used as the input of the trained deep learning model to extract the probability value (e.g., Sargassum features (pixels)). For VIIRS, the corresponding wavelengths can be 410, 443, 486, 551, 671, 745 and 862 nm. For OLCI, the corresponding wavelengths can be 412, 442, 490, 560, 674, 754, and 865 nm. Because the spectral differencing value is derived from 3 of the 7 bands, one may question whether the use of spectral differencing values in addition to the 7 bands is redundant. A sensitivity test was carried out to verify whether the omission of the spectral differencing values from the model input could lead to similar model performance, and the answer was negative. Therefore, both the spectral differencing value and the multiple spectral pixel values in 7 wavelengths can be used as the model input.
However, the model input is not limited to the 7 spectral pixel values in 7 wavelengths and the spectral differencing value. The input can include fewer or more than 7 spectral pixel values corresponding to wavelengths, with or without the spectral differencing value.
In some examples, the system 100 can further divide the image into multiple sub-images. In some examples, an edge of each sub-image of the multiple sub-images can overlap an adjacent sub-image of the multiple sub-images. In some examples, to apply the multiple spectral pixel values of the image and the spectral differencing value for each pixel of the image, the system 100 can apply the spectral pixel values for each sub-image of the multiple sub-images and the floating algae index value for each sub-image of the multiple sub-images to obtain the probability value for a subset of each sub-image of the plurality of sub-images. In some examples, the subset can exclude an overlap between the respective sub-image and the adjacent sub-image. For example, to use the VGGUnet model or a similar Res-Unet model for the remote object detection, input large satellite images (FAI or RGB) were cut into 416×416 (or other size, depending on memory availability) sub-images. As the prediction accuracy could decrease along image edges, these sub-images were prepared with redundant edges (8 pixels outward on four directions) and only the prediction results from the image center (with 400×400 pixels) were merged back to generate the final extraction results.
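As a non-limiting illustration, the overlapping partition described above (400×400 cores with 8 redundant pixels on each side, giving 416×416 sub-images) can be sketched as follows. The function name is hypothetical, and clipping the padded window at the image border is an assumption; the description does not state how border tiles are handled.

```python
def tile_windows(height, width, core=400, margin=8):
    """List (core, padded) windows for tiling a height x width image.

    Each window is a (row_start, row_stop, col_start, col_stop) tuple.
    The padded window is fed to the model; only the core region of its
    prediction is merged back into the final extraction result.
    """
    windows = []
    for r0 in range(0, height, core):
        r1 = min(r0 + core, height)
        for c0 in range(0, width, core):
            c1 = min(c0 + core, width)
            # Assumption: padded windows are clipped at the image border.
            padded = (
                max(r0 - margin, 0),
                min(r1 + margin, height),
                max(c0 - margin, 0),
                min(c1 + margin, width),
            )
            windows.append(((r0, r1, c0, c1), padded))
    return windows
```

Because the cores do not overlap and together cover the whole image, merging the core predictions produces exactly one prediction per pixel.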
In even further examples, the system can obtain extracted feature pixels (e.g., for an image from a Dove sensor) for Sargassum on beaches and in nearshore waters. In the examples, to distinguish Sargassum on beaches and in nearshore waters, the system can generate base maps to contain three types: water, beach, and non-beach land. In a non-limiting scenario, this can be through a simple K-means unsupervised classification with/without human interpretation on the image. In some instances, the base maps can be used for training a deep learning model to produce outputs including Sargassum on beach and Sargassum on water.
At step 508B, the system can provide object information of an object based on the probability value for each pixel of the image. In some examples, when the continuous probability value for each pixel of the image is more than 0.5, the respective pixel can be considered a Sargassum-containing pixel. In further examples, the probability value for each pixel of the image can be classified into three classes (detected object (e.g., Sargassum) pixels, object-free pixels, and invalid pixels). However, the probability value can be classified into fewer or more than three classes based on any suitable thresholds.
In further examples, the trained deep learning model applied to independent sub-images can be used to assess the extraction accuracy using indices such as F1-score, false positive rate (FPR), false negative rate (FNR), Matthews Correlation Coefficient (MCC), and Intersection over Union (IoU), defined in Equations 10-14 below.
where TP is the number of true positive pixels, FP is the number of false positive pixels, TN is the number of true negative pixels, and FN is the number of false negative pixels. The independent images were selected to cover different seasons and regions to make them representative of the entire dataset, and they were not used in the U-Net model training.
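The defining equations for these indices are not reproduced in this text. As a non-limiting illustration, the sketch below uses the standard definitions of each index in terms of TP, FP, TN, and FN, which are assumed (not confirmed) to match the equations referenced above. The function name and the zero-denominator conventions are hypothetical.

```python
import math


def segmentation_metrics(tp, fp, tn, fn):
    """Return F1, FPR, FNR, MCC, and IoU from pixel counts (standard definitions)."""

    def ratio(numerator, denominator):
        # Assumed convention: report 0.0 when a denominator is zero.
        return numerator / denominator if denominator else 0.0

    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "fpr": ratio(fp, fp + tn),
        "fnr": ratio(fn, fn + tp),
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
        "iou": ratio(tp, tp + fp + fn),
    }
```

For example, `segmentation_metrics(80, 20, 880, 20)` describes a case in which 80 of 100 object pixels are detected and 20 of 900 background pixels are falsely flagged.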
In some examples, to provide object information, the system can quantify the object in the probability values or the extracted feature pixels. For example, the system can quantify Sargassum biomass density in the extracted feature pixels. To quantify Sargassum biomass density, the system can estimate the background values (e.g., background FAI values) to account for the reflectance variations of the background water. The background values can then be subtracted to calculate the scaled value (e.g., scaled FAI) to estimate the corresponding biomass density. For the OLI data, the background estimation parameters and the FAI-biomass models can be similarly applied, through an iterative median filtering (with a 200×200 window) and the following FAI-biomass model (i.e., y=22.89x for 0<x≤0.05 and y=57.42(1.18x−0.06)²+36.00(1.18x−0.06)+1.17 for x>0.05, where x is the OLI FAI value and y is the modeled Sargassum biomass density (kg/m²)). In other examples, when the image is collected from a Dove or WV-2 sensor, the system can assign 100% subpixel Sargassum areal coverage for extracted feature pixels (i.e., Sargassum-containing pixels). For example, on the 3-m resolution Dove images, each extracted Sargassum-containing pixel was assumed to have 9 m² of Sargassum.
In some examples for monitoring Sargassum on beaches and in nearshore waters, some of the beach and nearshore water pixels may be covered by clouds. To minimize such potential biases, the images with >50% cloud coverage over beaches and nearshore waters can be discarded. For the remaining images, the Sargassum area can be scaled up using the following equation: SSargs_norm=SSargs*C, where SSargs_norm is the normalized Sargassum area after scaling up to account for cloud coverage, SSargs is the original Sargassum area estimated from extraction results, and C (≥1.0) is the scaling factor calculated from the UDM file, which is equal to the ratio between the total number of beach and water pixels (from the base maps) and those not covered by clouds.
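As a non-limiting illustration, the cloud-coverage normalization described above can be sketched as follows. The function and argument names are hypothetical; the pixel counts are assumed to be obtained from the base maps and the UDM file.

```python
def normalized_sargassum_area(area, total_pixels, clear_pixels, max_cloud_fraction=0.5):
    """Scale an extracted Sargassum area up to account for cloud cover.

    area         : Sargassum area estimated from the extraction results
    total_pixels : number of beach and water pixels in the base map
    clear_pixels : number of those pixels not covered by clouds
    Returns None when the image should be discarded (>50% cloud coverage).
    """
    if total_pixels <= 0 or not 0 <= clear_pixels <= total_pixels:
        raise ValueError("pixel counts are inconsistent")
    if clear_pixels == 0:
        return None  # fully cloud-covered: nothing to scale
    cloud_fraction = 1.0 - clear_pixels / total_pixels
    if cloud_fraction > max_cloud_fraction:
        return None  # discard heavily cloud-covered images
    scaling_factor = total_pixels / clear_pixels  # C >= 1.0
    return area * scaling_factor
```

With 1000 beach and water pixels of which 800 are cloud free, C is 1.25, so an extracted area of 900 m² is normalized to 1125 m².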
In some examples, steps 502C and 504C can be substantially similar to steps 502B and 504B in
At step 506C, the system 100 can obtain a ground truth label for each pixel of the training image. For example, the system 100 can obtain ground truth labels to corresponding pixels including target objects (e.g., Sargassum, Ulva, Trichodesmium, any other suitable macroalgae, sea pollen, sea snot, or marine debris) in the training image. In some examples, the ground truth label can be generated using a semi-automatic IDL feature extraction Graphic User Interface. In further examples, the system can generate a ground truth label by delineating the target pixels using their normalized difference vegetation index (NDVI) values and/or by visually inspecting the spectral shapes and training image.
In some examples, the ground truth labels can include marine debris, pollen, sea snot, floating algae (e.g., Sargassum, Ulva, Trichodesmium, etc.), or any other suitable indication. For example, the endmember spectra of the marine debris and the floating algae differ in that various forms of marine debris are all spectrally flat without narrow-band features in the visible-NIR wavelengths (400-900 nm), while the floating algae show the typical reflectance trough around 670 nm due to chlorophyll a absorption. Thus, the prominent difference between marine debris and floating algae can occur between 670 nm and 750 nm: the former is spectrally flat while the latter has a sharp increase in the NIR. Based on the difference between marine debris and floating algae, the ground truth indications can be generated to be used for classification of extracted feature pixels in the training image. In the examples, the training image can be from the MSI sensor and include a Red-Green-Blue (RGB) composite image and a false-color RGB (FRGB) composite image based on Rayleigh-corrected reflectance. In the FRGB image, a NIR band can be used to replace the green band in the RGB image, making it suitable to detect Sargassum and other floating matters with enhanced NIR reflectance. In some examples, marine debris can be detected using the deep learning model if the number of classifications is at least one (e.g., macroalgae, marine debris).
At step 508C, the system can train a deep learning model by applying the ground truth label, the multiple spectral pixel values, and the spectral differencing value for each pixel of the training image to the deep learning model. In some examples, the deep learning model is the model used at step 506B in
The system 100 can reduce a degree of prediction inconsistency between the extracted feature pixels for the training image and the ground truth label of the extracted feature pixels labeled in the training image. For example, the system can determine the similarity between the deep learning model output and the training image. Then, the degree of prediction inconsistency can be calculated. The system 100 can reduce the degree of prediction inconsistency by adjusting parameters of the deep learning model using a loss function. In some examples, the loss function can be calculated based on the degree of prediction inconsistency with a binary cross-entropy term. However, it should be appreciated that any other suitable loss function (e.g., mean square error (MSE), mean absolute error (MAE), likelihood loss, etc.) can be used to reduce the degree of prediction inconsistency. Then, the system can iterate steps 504B-508B with the same training image or different training images.
Each blue cube represents a multichannel feature map. The white cubes represent the copied feature maps (indicated by the yellow dashed lines and the gray arrows). The image size in each of the 5 rows is marked in the first column (e.g., 100×100). The number of channels of each feature map is annotated on the upper right corner. For example, the N marked on top of the input image means that there are N spectral bands in the input image and the 1 marked on top of the output image means that there is only one channel in the output layer. Batch normalization was applied to normalize each convolutional block. The Rectified Linear Unit (ReLU) was used as the primary activation function. The sigmoid activation function was selected in the final output layer to determine the segmentation results. Note that the model input is flexible, where multispectral data can be used.
To optimize the DCNN model for the specific feature extraction tasks, the corresponding training datasets (consisting of the input images and the segmentation results) were prepared. Here, the ground truth of the Sargassum extraction results was generated using a semi-automatic IDL feature extraction Graphic User Interface. A total of 3,289 sub-images were prepared for MSI FAI images (from 14 MSI image tiles), 1,444 sub-images were selected for OLI FAI images (from 4 OLI image scenes), 682 sub-images were selected for WV-2 FAI images (from 3 WV-2 images), and 1,791 sub-images were prepared on Dove RGB images (from 12 Dove images). These extraction results were cut into 400×400 sub-images to train the extraction model. Because there are already sufficient training images prepared for each sensor under various conditions, data augmentation techniques were not used.
During model optimization, the Jaccard Index (JI, Eq. 15) was monitored to determine the similarity between the model outputs and the training data:
where ypred is the continuous prediction probability value (ypred∈[0, 1]) and ytrue is the binary value from the ground-truth results (ytrue∈{0, 1}). The smooth term is 1. Then, the degree of prediction inconsistency can be defined as the Jaccard Distance (JD) shown in Eq. 16:
JD(ypred,ytrue)=−log JI(ypred,ytrue) (Eq. 16)
Because image segmentation can be a one-class classification problem, the loss function L was defined as the JD after adding the binary cross-entropy term H, as in Eqs. 17 and 18:
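The expressions for the Jaccard Index and the loss function are not reproduced in this text. As a non-limiting illustration, the sketch below assumes the commonly used smoothed form of the Jaccard Index (intersection plus the smooth term, divided by union plus the smooth term), the Jaccard Distance of Eq. 16, a mean binary cross-entropy term H, and an unweighted sum of the two as the loss. All function names, the smoothed form, the averaging, and the clipping constant `eps` are assumptions introduced for this illustration.

```python
import math


def jaccard_index(y_pred, y_true, smooth=1.0):
    """Smoothed Jaccard Index between predicted probabilities and binary labels."""
    intersection = sum(p * t for p, t in zip(y_pred, y_true))
    union = sum(y_pred) + sum(y_true) - intersection
    return (intersection + smooth) / (union + smooth)


def jaccard_distance(y_pred, y_true, smooth=1.0):
    """Jaccard Distance as in Eq. 16: the negative log of the Jaccard Index."""
    return -math.log(jaccard_index(y_pred, y_true, smooth))


def binary_cross_entropy(y_pred, y_true, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, t in zip(y_pred, y_true):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(y_pred)


def combined_loss(y_pred, y_true):
    """Assumed loss: Jaccard Distance plus the binary cross-entropy term."""
    return jaccard_distance(y_pred, y_true) + binary_cross_entropy(y_pred, y_true)
```

Both terms approach zero for a perfect prediction and grow as the prediction diverges from the labels, so minimizing the sum penalizes region-level mismatch and per-pixel errors together.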
The Adaptive Moment Estimation (Adam) optimizer was applied for model optimization. The initial learning rate was 0.001. When the loss function failed to improve for two consecutive epochs, the learning rate was reduced by 20% for finer tuning. In experiments, all models were trained for 200-300 epochs, as stable performance was often achieved by that time, with high JI values in the training and validation datasets. Table II summarizes the estimated training time used on each dataset. In all four cases, the models can be effectively optimized within 24 hours.
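As a non-limiting illustration, the learning-rate reduction described above can be sketched as follows. The sketch covers only the schedule, not the Adam optimizer itself. The function name is hypothetical, and interpreting "failed to improve" as "did not fall below the best loss seen so far" is an assumption.

```python
def learning_rate_schedule(epoch_losses, initial_lr=0.001, patience=2, factor=0.8):
    """Return the learning rate used in each epoch.

    The rate is multiplied by `factor` (a 20% reduction) whenever the loss
    has not improved on its best value for `patience` consecutive epochs.
    """
    learning_rate = initial_lr
    best_loss = float("inf")
    stale_epochs = 0
    schedule = []
    for loss in epoch_losses:
        schedule.append(learning_rate)  # rate in effect during this epoch
        if loss < best_loss:
            best_loss = loss
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                learning_rate *= factor
                stale_epochs = 0
    return schedule
```

Given the losses `[1.0, 0.9, 0.95, 0.92, 0.8]`, the third and fourth epochs fail to improve on 0.9, so the fifth epoch runs at the reduced rate.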
Table II shows the approximate training time of a DCNN model (e.g., as described in connection with
To use the DCNN models described herein for Sargassum detection, in one example, large satellite input images (FAI or RGB) are partitioned into 416×416 sub-images. Since prediction accuracy may decrease along image edges, these sub-images were prepared with redundant edges (8 pixels outward in four directions), and only prediction results from the image centers (400×400 pixels) were merged back to generate the final extraction results.
To quantify Sargassum biomass density, background FAI values were first estimated to account for the reflectance variations of the background water. The background FAI values were then subtracted to calculate the scaled FAI to estimate the corresponding biomass density. For the OLI data, the background estimation parameters and the FAI-biomass models were similarly applied, through an iterative median filtering (with a 200×200 window) and the following FAI-biomass model according to the following equations:
$y = 22.89x$ for $0 < x \le 0.05$
$y = 57.42(1.18x - 0.06)^2 + 36.00(1.18x - 0.06) + 1.17$ for $x > 0.05$ (Eq. 19)
where x is the OLI FAI values and y is the modeled Sargassum biomass density (kg/m²).
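As a non-limiting illustration, the piecewise FAI-biomass model of Eq. 19 can be sketched as follows. The function name is hypothetical. The input is assumed to be the scaled (background-subtracted) FAI, and returning zero for non-positive values is an assumption because Eq. 19 is defined only for x>0; the iterative median filtering used for background estimation is not shown.

```python
def oli_biomass_density(scaled_fai):
    """Return modeled Sargassum biomass density (kg/m^2) from OLI scaled FAI (Eq. 19)."""
    if scaled_fai <= 0.0:
        # Assumption: Eq. 19 is undefined for x <= 0, so treat as no biomass.
        return 0.0
    if scaled_fai <= 0.05:
        # Linear branch for low FAI values.
        return 22.89 * scaled_fai
    # Quadratic branch for FAI values above the 0.05 turning point.
    u = 1.18 * scaled_fai - 0.06
    return 57.42 * u ** 2 + 36.00 * u + 1.17
```

For instance, `oli_biomass_density(0.02)` uses the linear branch and `oli_biomass_density(0.1)` uses the quadratic branch.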
Considering the high spatial resolution of Dove and WV-2 and the difficulties of conducting accurate biomass quantification, all the Sargassum-containing pixels extracted on these two sensors were assigned 100% Sargassum areal coverage. For example, on the 3-meter resolution Dove images, each extracted Sargassum-containing pixel was assumed to have 9 m² of Sargassum.
To compare with the Dove-derived Sargassum measurements, the Sargassum areal coverages derived from OLI and MSI were quantified through linear unmixing using a full coverage threshold. Pixels with biomass densities lower than the threshold were linearly unmixed to calculate the fractional coverage, while pixels with higher biomass densities were treated as having 100% Sargassum coverage (i.e., 900 m² for a 30-m OLI Sargassum-containing pixel and 100 m² for a 10-m MSI Sargassum-containing pixel). The full coverage thresholds were selected to be the biomass densities at which FAI equals 0.05 (the turning point from the linear to the nonlinear relationship in the FAI-biomass model), and the values for the Sentinel-2A MSI, Sentinel-2B MSI, and Landsat-8 OLI data are 0.96, 1.24, and 1.17 kg/m², respectively. For MODIS data, the areal coverages were estimated by a linear unmixing that references the FAI value to a local upper bound (representing 100% Sargassum coverage within a pixel) and a local lower bound (representing 0% Sargassum coverage within a pixel).
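As a non-limiting illustration, the areal coverage of a single OLI or MSI pixel can be sketched as follows. The description does not give the unmixing formula, so the sketch assumes that the fractional coverage is the ratio of the pixel's biomass density to the full coverage threshold, capped at 100%. The function and argument names are hypothetical.

```python
def sargassum_area_m2(biomass_density, full_coverage_threshold, pixel_size_m):
    """Return the Sargassum area (m^2) within one pixel.

    biomass_density         : modeled biomass density of the pixel (kg/m^2)
    full_coverage_threshold : density treated as 100% coverage (kg/m^2)
    pixel_size_m            : pixel edge length in meters (e.g., 30 for OLI)
    """
    if full_coverage_threshold <= 0 or pixel_size_m <= 0:
        raise ValueError("threshold and pixel size must be positive")
    # Assumed linear unmixing: coverage is proportional to density below the threshold.
    fraction = min(max(biomass_density / full_coverage_threshold, 0.0), 1.0)
    return fraction * pixel_size_m ** 2
```

With the Landsat-8 OLI threshold of 1.17 kg/m², a 30-m pixel at or above the threshold contributes 900 m², and under this assumption a pixel at half the threshold contributes about 450 m².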
Extraction accuracy of one embodiment was validated using a separate group of representative MSI, OLI, Dove, and WV-2 images. The extraction results were then compared with the manually extracted “ground truth” features to generate the corresponding F1 score.
On MSI FAI images, the overall Sargassum extraction accuracy, after weighting by the biomass density, is ˜90%, which is an improvement over an alternative Sargassum extraction method based on a Trainable Nonlinear Reaction Diffusion (TNRD) approach (86%). Most detection errors (either false positives or false negatives) are from pixels of relatively low biomass densities. The precision and recall rates are both >85%, suggesting that most of the Sargassum-containing pixels can be accurately detected, and most detected candidate pixels contain Sargassum.
On most OLI FAI images with large Sargassum coverages, the extraction accuracy is >95% in terms of Sargassum biomass densities. The precision and recall rates are both higher than those from the MSI FAI images. The higher accuracy is likely due to the larger pixel size and less noise interference (such as wave glitters) than found in MSI FAI images.
Due to the higher spatial resolution and larger image size, for WV-2 FAI images and Dove RGB images, only a limited number of images were selected to evaluate the extraction accuracy. The areal coverage (as opposed to biomass density) was used to evaluate the accuracy. Table III shows that the accuracy for WV-2 is almost perfect (F1 score=0.98). Even with three spectral bands in the visible wavelengths, Dove images still show satisfactory performance, with F1 score greater than 0.8.
Overall, when evaluated using similar image types (i.e., using the same satellite image sensor), the example embodiment being discussed achieved an F1-score of ≥0.90 except for the 3-band Dove images. Even for these images, which exclude the NIR bands, the F1-score is still 0.82, demonstrating that embodiments disclosed herein are suitable for performing automatic identification and quantification of Sargassum in aerial imagery such as satellite images. Table III below shows Sargassum extraction accuracy on MSI, OLI, WV-2, and Dove images using the methods described herein. Note that for Dove and WV-2 data, the pixel coverage was used to evaluate the accuracy, while for MSI and OLI, the biomass density was compared. Here the number of images means number of original images, not the 400×400 sub-images.
Using systems and methods disclosed herein on high-resolution imagery allows quantification of how much Sargassum is likely to go undetected when coarse-resolution satellite sensors such as MODIS are used as the source of image data.
As shown in
The underestimation resulting from the use of MODIS image data can also be quantified statistically, as shown in
Similar comparisons can also be obtained between Dove and MSI image data, and between Dove and OLI image data for common valid areas (
The MSI and OLI images were also compared with MODIS observations to evaluate the cross-sensor uncertainties in Sargassum estimates. Forty-five MSI images (tile: T20PNC) and fourteen OLI images collected in 2018 near the Lesser Antilles Islands were compared with the same-day MODIS measurements over their common valid areas. The total Sargassum biomass in the match-up areas from MODIS and MSI or OLI were summarized in
Overall, the relationship between MSI and MODIS is less clear (R2=0.65,
In addition to Sargassum abundance and distribution, characteristics of individual Sargassum features are also important for a number of reasons, for example to help implement plans for physical removal. This example uses the following parameters to characterize individual features: biomass (kg), size (m2), length (m), and length/width ratio.
As shown in
Because of the large spatial and temporal coverages, satellite remote sensing is perhaps the most reliable technique to observe large-scale Sargassum distributions and long-term changes. However, because many Sargassum clumps or rafts are small and moving in the ocean, it is nearly impossible to measure Sargassum size and biomass in the field to match satellite pixels, and therefore it is extremely difficult to validate satellite estimates in a quantitative way through field measurements.
Assuming that high-resolution sensors may provide estimates closer to the “truth”, one way to quantify uncertainties in coarse-resolution estimates is through comparison of the two. The PlanetScope constellation is the only data source at 3-m resolution with daily coverage of the entire GOM, thus providing an excellent opportunity to evaluate uncertainties in the Sargassum estimates from coarse-resolution sensors. Using 12,024 Dove images as the reference, it was determined that all MODIS, MSI, and OLI sensors underestimated Sargassum coverage and biomass. Overall, Dove showed at least ˜150%, ˜360%, and ˜70% more Sargassum than MODIS, MSI, and OLI, respectively.
The same argument also applies to Dove images, as some small Sargassum features may still be undetected in the 3-m Dove images. For this reason, the Dove estimates are not the “truth” itself, but can only be regarded as being closer to the “truth”. In fact, Dove estimates should only represent a lower bound of the true (actual) Sargassum abundance in the natural environment. In future studies, sensors with higher resolution or higher SNRs than Dove may be explored further to push the limit of satellite remote sensing of Sargassum and other macroalgae.
The ability of systems and methods herein to process 3-band Dove images and other high-resolution images enables many of the detection improvements described. Otherwise, due to the lack of spectral bands in the NIR wavelengths, it is nearly impossible to extract accurate Sargassum features from the 12,025 images where confusion features such as clouds and cloud shadows are often found. Compared to the traditional methods, the deep learning techniques suitable for use in embodiments herein have the advantage of being a fast and reliable way to interpret vast amounts of satellite data. Using a unique network structure, systems and methods disclosed herein show robust performance even with limited spectral bands, large background variations, and various confusing targets. This is especially important for high-resolution images where “noise” is highly variable, for example on the Dove images. It is also noted that even when there are small errors in the training data, methods disclosed herein can still be optimized to achieve satisfactory performance without bias. This is attributed to training that utilizes not only the spectral information, but also the spatial context.
Another advantage of systems and methods disclosed herein is flexibility. As illustrated by the examples above, systems and methods disclosed herein are easy to adapt to different types of satellite data or features. For instance, the input image data can be either single-band (e.g., FAI) or multispectral (e.g., RGB) images, depending on the specific feature characteristics. Furthermore, when appropriately trained, deep learning models disclosed herein can detect other image features that are not Sargassum (such as clouds and oil slicks). Moreover, the extraction models described have no lower threshold for detecting Sargassum features. The decision is made purely with the optimized model weights learned from the training processes. This reduces the potential for bias that results from the selection of extraction thresholds when traditional threshold-based segmentation methods are used.
The availability of the various types of high-resolution data, combined with the success of methods using the DCNNs (e.g., the DCNN(s) 115 of
Table IV below summarizes the approximate processing speed for Sargassum extraction from individual MSI, OLI, and Dove images. For an MSI FAI image with 10,000×10,000 pixels, the Sargassum extraction time using methods disclosed herein is about 2 minutes (123.0 seconds), much shorter than the time needed by the previous method, where the TNRD denoising process alone takes about 11 minutes. For OLI and Dove images, because the image sizes in terms of number of pixels are slightly smaller than MSI images, less time is needed to extract the Sargassum features using the methods disclosed herein (see Table IV). For a coastal region of 1°×1° in the tropical or subtropical ocean, about 42 Dove images are needed, and processing all of them takes about 71 minutes, thus meeting the condition of near real-time monitoring. For the same 1°×1° region, it takes only 2 minutes and 22 seconds to process one MSI and one OLI image, respectively.
A near real-time monitoring system also uses frequent data coverage. While MSI and OLI show better Sargassum extraction accuracy than Dove, only the latter can provide daily coverage. The 3-m resolution also makes it possible to see cloud-free pixels among small clouds, thus improving the spatial coverage. Therefore, a combination of all available Dove, MSI, and OLI images should be able to meet the critical condition of a near real-time Sargassum monitoring and tracking system for targeted nearshore waters.
Using deep convolutional neural networks (DCNN), systems and methods herein (as illustrated by the examples to follow) enable a fully automatic neural-network-based approach to detect and quantify Sargassum macroalgae from various high-resolution images. Even with the complex ocean background and variable “noise,” experiments using MSI, OLI, WV-2, and Dove images all achieved high detection accuracy with fast processing speeds. Systems and methods herein may also be used to provide generic (i.e., applicable to other features such as oil slicks), concise, and effective tools for extracting Sargassum and other features from high-resolution satellite images, and to satisfy the needs of near real-time Sargassum bloom monitoring. Depending on location, previous approaches using MSI, OLI, and MODIS sensor data may result in considerable underestimation of Sargassum quantities when compared with the concurrent and co-located Dove (3-m resolution) estimation methods disclosed herein. Systems and methods herein, using high-resolution MSI, OLI, and Dove images, may be incorporated into the existing Sargassum Watch System (SaWS) to significantly improve Sargassum estimation in nearshore waters.
In some examples, the image analysis system 100 can monitor Sargassum or other macroalgae on beaches and nearshore waters (e.g., using PlanetScope/Dove imagery). Sargassum beaching events have been reported in recent years around the Caribbean Sea and Florida, USA, causing numerous environmental and economic problems. Satellite remote sensing has been widely used to monitor Sargassum blooms in open waters, yet due to either coarse spatial resolution or low-revisit frequency, it is difficult to provide timely information on Sargassum inundation from traditional satellite instruments. In the present disclosure, the capacity of 3-m resolution daily PlanetScope/Dove imagery is demonstrated in monitoring Sargassum beaching events (e.g., in Miami beach (Florida, USA) and Cancun beach (Mexico)). In some examples, a U-net deep learning computer model can be developed to extract Sargassum features from Dove imagery over beaches and nearshore waters. Application of the model to Dove image sequences between May and August 2019 shows two major inundation events on both Miami Beach and Cancun beach, consistent with local reports. Thus, with the availability of 3-m resolution PlanetScope/Dove and PlanetScope/SuperDove data around the globe, the image analysis system 100 can monitor dynamic inundation events of not only Sargassum but also other macroalgae in many other regions.
In some examples, recent developments of small, affordable satellites known as CubeSats take advantage of both high spatial resolution and frequent revisits to meet some conditions of monitoring small, dynamic features. For example, the complete PlanetScope constellation can include ˜180 satellites (CubeSat 3U dimensions are 10×10×30 cm³) equipped with Dove sensors (and recently augmented by SuperDove sensors), making it possible to image the entire land surface and nearshore waters every day at 3-m resolution. However, their utility for monitoring Sargassum beaching events has not been addressed. In the present disclosure, the use of Dove imagery is demonstrated in monitoring Sargassum beaching events, including the timing, location, and amount of Sargassum on the beaches and adjacent waters. For example, two study regions (Miami beach, U.S. and Cancun beach, Mexico) were selected to evaluate the performance.
In total, 227 and 501 four-band Dove images from May 1, 2019 to Aug. 31, 2019 were downloaded for Miami beach (25.76˜25.87° N, 80.14˜80.08° W) and Cancun beach (21.02˜21.175° N, 86.822˜86.728° W), respectively, from the Planet Labs data portal (https://developers.planet.com/docs/apis/data/). The ancillary files, including the XML metadata files and the usable data bit masks (UDM), were also obtained. The detailed spatial distributions of satellite re-visit times for each region are shown in
1) Image pre-processing: The top-of-atmosphere (TOA) radiance data can be converted to TOA reflectance by multiplying by the coefficient provided in the metadata files. For each region of interest, all available Dove tiles from different satellites on the same day can be clipped and mosaicked to provide complete coverage. The red, green, and blue bands can be used to compose RGB quick-look images. For example, two such images are shown in
Although the UDM file corresponding to each Dove image provides information on pixels of usable data within an image (e.g., clear, snow, shadow, light haze, heavy haze, and cloud), after inspection it was found that such pre-defined pixel-wise classifications are not accurate as some Sargassum pixels are falsely masked as clouds. Therefore, the UDM files were only used for calculating the scaling factor (step (3)) rather than masking cloudy pixels when extracting the Sargassum pixels (step (2)).
Specifically, a dataset can be prepared to train the U-net model to extract Sargassum pixels. This training dataset can include the 4-band TOA reflectance images and the Sargassum label images. For creating the Sargassum label images, the Sargassum pixels can be roughly delineated based on their normalized difference vegetation index (NDVI) values. Then, by visually inspecting the spectral shapes and RGB images, the Sargassum labels can be fine-tuned. In some experiments, a total of 177 sub-images (400×400) were prepared. The maximum number of training epochs was set to be 400. The U-net model can be optimized until the number of iterations reaches the maximum training epoch number.
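The rough NDVI-based delineation used to seed the Sargassum label images can be sketched as follows. This is a minimal illustration only: it assumes NumPy is available, the function names are hypothetical, and the threshold value is a placeholder that is not given in the disclosure and would be tuned before the visual refinement step.

```python
import numpy as np

def ndvi(red, nir):
    """NDVI = (NIR - red) / (NIR + red); pixels with a zero denominator are set to 0."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    out = np.zeros_like(denom)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

def rough_sargassum_mask(red, nir, ndvi_threshold=0.0):
    """First-pass label mask; ndvi_threshold is a hypothetical starting value."""
    return ndvi(red, nir) > ndvi_threshold

# Hypothetical 2x2 TOA reflectance tiles (red and NIR bands).
red = np.array([[0.05, 0.04], [0.00, 0.10]])
nir = np.array([[0.02, 0.12], [0.00, 0.10]])
mask = rough_sargassum_mask(red, nir)
```

Only the pixel with NIR reflectance well above red reflectance is flagged here; in practice such a mask would still be corrected by inspecting spectral shapes and RGB images as described above.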
To distinguish Sargassum on beaches and in nearshore waters, base maps can be created to contain three types: water, beach, and non-beach land. This can be through a simple K-means unsupervised classification with human interpretation on cloud free Dove images (e.g., on Aug. 13, 2019 and Jul. 5, 2019 for Miami beach and Cancun beach, respectively). Then, the base maps can be applied to all images to determine beach locations.
$S_{\mathrm{Sargs\_norm}} = S_{\mathrm{Sargs}} \times C$ (1)
where $S_{\mathrm{Sargs\_norm}}$ is the normalized Sargassum area after scaling up to account for cloud coverage, $S_{\mathrm{Sargs}}$ is the original Sargassum area estimated from extraction results, and C (≥1.0) is the scaling factor calculated from the UDM file, which is equal to the ratio between the total number of beach and water pixels (from the base maps) and those not covered by clouds.
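The cloud-coverage scaling in Equation (1) can be sketched as follows. This is an illustrative sketch under stated assumptions: the class codes, function names, and the 9 m² pixel area (a 3-m pixel) are hypothetical choices, and the cloud mask is assumed to be a boolean array derived from the UDM file.

```python
import numpy as np

WATER, BEACH, LAND = 0, 1, 2  # hypothetical base-map class codes

def scaling_factor(base_map, cloud_mask):
    """C = (beach + water pixels) / (beach + water pixels not covered by clouds)."""
    roi = (base_map == WATER) | (base_map == BEACH)
    total = np.count_nonzero(roi)
    clear = np.count_nonzero(roi & ~cloud_mask)
    if clear == 0:
        raise ValueError("no cloud-free beach or water pixels; C is undefined")
    return total / clear

def normalized_area(sargassum_mask, base_map, cloud_mask, pixel_area_m2=9.0):
    """Equation (1): extracted Sargassum area scaled up by C."""
    area = np.count_nonzero(sargassum_mask) * pixel_area_m2
    return area * scaling_factor(base_map, cloud_mask)

# Hypothetical 2x4 scene: 6 beach/water pixels, 2 of them cloudy.
base_map = np.array([[0, 0, 1, 2], [0, 1, 1, 2]])
cloud_mask = np.array([[True, False, False, False], [True, False, False, True]])
sargassum_mask = np.array([[False, True, True, False], [False, False, False, False]])
c = scaling_factor(base_map, cloud_mask)
area_norm = normalized_area(sargassum_mask, base_map, cloud_mask)
```

Cloudy land pixels do not affect C because only beach and water pixels enter the ratio.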
While the image sequence from the DL-based Sargassum extraction is presented in the supplemental materials,
To quantitatively evaluate the extraction results, the numbers of true positive pixels (TP), false positive pixels (FP), true negative pixels (TN) and false negative pixels (FN) are listed in Table V. Statistical measures such as false positive rate (FPR), false negative rate (FNR), and F1 score are reported in Table V as well. In this analysis, the “ground truth” images were prepared in the same way as used in preparing the training dataset for the U-net model. The “ground truth” images are independent from the training dataset.
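The statistical measures named above can be computed from the four pixel counts as sketched below. The standard definitions are assumed to be the ones intended (FPR = FP/(FP+TN), FNR = FN/(FN+TP), F1 as the harmonic mean of precision and recall); the function name and the example counts are hypothetical and are not values from Table V.

```python
def extraction_metrics(tp, fp, tn, fn):
    """Return FPR, FNR, and F1 score from pixel counts (all denominators assumed non-zero)."""
    fpr = fp / (fp + tn)
    fnr = fn / (fn + tp)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"FPR": fpr, "FNR": fnr, "F1": f1}

# Hypothetical counts for illustration only.
metrics = extraction_metrics(tp=80, fp=20, tn=880, fn=20)
```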
With the extraction results validated,
The dynamic changes in Sargassum area on beaches and nearshore waters are also shown in the time-series data from May 1, 2019 to Aug. 31, 2019 (
In further examples, the image analysis system 100 can remotely detect marine debris using satellite observations in the visible and near infrared spectral range. By definition, marine debris refers to any persistent solid material that is disposed of (or abandoned) in the marine environment by natural processes (including natural disasters such as Tsunami) and human activities, for example microplastic particles, plastic bags or bottles, cigarette butts, foam take-out containers, balloons, fishing gear, tree branches/leaves, wood, among others.
Despite the importance of remote detection of marine debris, nearly all published studies are focused on either controlled experiments or Sentinel-2 data with mixed band resolutions that are subject to large uncertainties. To date, key questions such as the following have not been addressed adequately: To what extent can the various forms of marine debris be remotely detected and differentiated through satellite observations in the visible and near infrared (NIR) spectral range, and how? Here, using published reflectance spectra of various types of floating matters, these questions can be addressed through sensitivity analyses, simulations, and spectral analyses of satellite images. While the descriptions herein are not limited to the examples disclosed in the present disclosure, several observations can still be made. First, detecting macroplastics and other debris is possible when they form large patches along ocean fronts or windrows; assuming an SNR of 200, such detection is only possible with a subpixel coverage of >0.2%. Second, under the same SNR, discriminating large patches of marine debris from floating algae is only possible with a subpixel coverage of >0.3%. These threshold values are based on the sensor SNRs only, and they represent the lower bounds of detection and discrimination, respectively. The real threshold values above which a detection or discrimination is possible also depend on the observing conditions, and therefore could be higher. Third, currently, Sentinel-2 MSI (Multi Spectral Instrument) sensors can provide an optimal trade-off between resolution and coverage, yet MSI sensors have SNRs <200, and interpretation of the MSI spectra calls for extra caution due to variable spatial resolutions in different bands, among other factors. From the perspective of pure spectroscopy, it is possible to discriminate floating algae from non-algae floating matters but difficult to differentiate the type of the latter (either plastic or non-plastic debris, foam, etc.) because different non-algae floating matters all show relatively flat reflectance spectral shapes in the vis-NIR spectral range. Finally, based on these results, recommendations can be made on algorithm designs and sensor designs; for example, spectral analysis should be performed over the difference spectra to minimize the impact of variable subpixel coverage, and certain spectral bands are more important than others for the remote detection of marine debris.
The current disclosure addresses whether and under what conditions various forms of marine debris can be detected and discriminated from other floating matters using the vis-NIR wavelengths. The current disclosure can use the endmember spectra, sensor sensitivity, simulation experiments, and spectral analysis of Sentinel-2 data for demonstration purposes. The current disclosure can provide example sensor designs as well as algorithms and approaches toward vis-NIR remote sensing of marine debris. The use of SWIR wavelengths is also discussed.
Endmember spectra: To date, laboratory or in situ measurements of optical properties of marine debris are scarce, with the exception of some artificial (man-made) garbage patches and field-collected micro plastic particles (
In the ocean, other forms of floating matters also exist. These include Sargassum fluitans and Sargassum natans, Sargassum horneri, Ulva, cyanobacteria Trichodesmium, emulsified oil, green Noctiluca, red Noctiluca, pumice rafts, foams (whitecaps), etc. Some of these have been measured in the field, with in situ hyperspectral reflectance being available, but others have only been assessed using multi-band satellite data (e.g., pumice rafts). Some of these spectra are compiled in
Satellite Data: While there are currently many satellite sensors in orbit, Sentinel-2 MSI can be selected to represent high-resolution optical sensors because it provides a trade between spatial resolution (10-20 m) and revisit frequency (2-3 days). MSI can cover wavelengths of vis-NIR-SWIR, suitable for detecting and differentiating small floating matters. MSI can have the following spectral bands: 443 (60), 492 (10), 560 (10), 665 (10), 704 (20), 741 (20), 783 (20), 841 (10), 865 (20), 1614 (20), and 2202 nm (20), where the numbers in the parentheses represent their ground resolutions in meters.
In the examples, Level-1 MSI data were downloaded from the U.S. Geological Survey, and processed using the Acolite software to generate Rayleigh corrected reflectance (Rrc(λ), dimensionless), from which Red-Green-Blue (RGB) and False-color RGB (FRGB) composite imagery were generated. In the FRGB imagery, a NIR band was used to replace the green band in the RGB imagery, making it suitable to detect floating matters with enhanced NIR reflectance.
Regardless of the floating matter type (either marine debris, floating algae, or other types of floating matters), remote detection can use two steps. Step 1 is to detect a spatial anomaly, i.e., some pixels “stand out” from their nearby background waters. Step 2 is to spectrally differentiate the pixel type from the spatial anomaly. Step 2 can be performed after Step 1. In simpler words, the two steps can be shortened as: 1) is there “something”? 2) what is that “something”? If the amount of floating matter is to be quantified, then a third step is to address the question of how much is that “something.”
Using image examples and simulations, Step 1 can rely on the sensor's sensitivity (i.e., signal-to-noise ratio or SNR), while Step 2 can use specific spectral bands depending on the targeted floating matter type and on the selected algorithms. These steps can be used for the sensitivity analysis below.
Sensitivity Analysis: To be able to “stand out” in an image, a pixel should be significantly different from the surrounding pixels. Mathematically, this can be expressed as:
$\Delta R > 2\sqrt{2}\,\sigma$ (Eq. 20)
where σ is the sensor noise in a pixel that is inherent to a given sensor, √2 accounts for noise propagation in pixel differencing between the target pixel and a nearby reference pixel (the noise of the difference is the square root of the sum of squares from the two pixels, hence the √2 term), the factor of 2 makes the difference statistically significant (i.e., 2 times the noise), and ΔR is the difference between the target pixel and a nearby reference pixel (i.e., a water pixel):
$\Delta R = R_T - R_W = [\chi R_{FM} + (1-\chi) R_W] - R_W = \chi (R_{FM} - R_W)$ (Eq. 21)
where “T” stands for target, “FM” is for floating matter, “W” is for water, and χ (0.0%-100%) is the subpixel proportion of floating matter. For simplicity, the wavelength dependence of R can be omitted. In some examples, sensor noise could be inherent for a sensor, which can be obtained from either the sensor specification document, or estimated in other ways.
From Eqs. 20 and 21, once σ is known, the subpixel detection limit, χdet, can be estimated as:
$\chi_{det} \geq 2\sqrt{2}\,\sigma / (R_{FM} - R_W)$ (Eq. 22)
From Eq. 21, assuming the endmember spectra of RFM and RW are relatively stable, the spectral shape of ΔR can be determined between RFM and RW with equal weights, and the shape does not change with χ. In other words, both RFM and RW can contribute to ΔR with the same weights regardless of χ. In contrast, their weights to RT might not be equal but can be determined by χ and (1−χ), respectively. This can cause the spectral shape of RT to be dominated by RW when χ is very small (e.g., <5%), as is the case for marine debris.
In practice, because the spectral contrast between floating matters and water is mostly in the red-NIR-SWIR wavelengths, χdet can be estimated using a single wavelength in the NIR (Eq. 22), or a combination of these wavelengths (e.g., floating algae index). Because the latter involves more bands and therefore more noise, the lower detection limit can be from a single band in the NIR, where the spatial contrast can be the highest between floating matters and water.
Once a pixel is determined to contain a certain type of floating matter, there are several ways to discriminate the type, including a similarity index between the pixel's ΔR and (RFM-RW) where RFM and RW are from the established spectral library (e.g.,
Here, NRD stands for NIR-red difference. For a pixel containing χ floating matter and (1−χ) water, there is:
$\Delta NRD_{FM} = \Delta R_{NIR} - \Delta R_{red} = \chi[(R_{FM}^{NIR} - R_{FM}^{red}) - (R_{W}^{NIR} - R_{W}^{red})]$ (Eq. 23)
Then, to be able to separate marine debris (MD) and floating algae (FA), their difference in ΔNRD should be significantly higher than noise, i.e.,
$\Delta NRD_{FA} - \Delta NRD_{MD} = \chi[(R_{FA}^{NIR} - R_{FA}^{red}) - (R_{MD}^{NIR} - R_{MD}^{red})] \gg \mathrm{noise} = 2 \times 2\sigma$ (Eq. 24)
Here the first 2 in Eq. 24 represents statistical significance, and the second 2 represents the cumulative noise using the square-root rule (two bands, two types, therefore √4). The discrimination limit is therefore:
$\chi_{dis} \geq 4\sigma / [(R_{FA}^{NIR} - R_{FA}^{red}) - (R_{MD}^{NIR} - R_{MD}^{red})]$ (Eq. 25)
Compared with Eq. 22, Eq. 25 is very similar except for the factor of 4 instead of 2√2, because more spectral bands are involved.
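The detection and discrimination limits of Eqs. 22 and 25 can be evaluated numerically as sketched below. The function names are hypothetical, and the inputs (σ = 2×10−4, a floating-matter NIR reflectance of 0.25, a water NIR reflectance of 0, and a NIR-red difference contrast of 0.25) are illustrative assumptions for an SNR-200 sensor rather than prescribed values.

```python
import math

def detection_limit(sigma, r_fm, r_w):
    """Eq. 22: smallest subpixel fraction that stands out from nearby water."""
    return 2.0 * math.sqrt(2.0) * sigma / (r_fm - r_w)

def discrimination_limit(sigma, nrd_fa, nrd_md):
    """Eq. 25: smallest subpixel fraction at which algae and debris separate.

    nrd_fa and nrd_md are the NIR-minus-red reflectance of the floating-algae
    and marine-debris endmembers, respectively.
    """
    return 4.0 * sigma / (nrd_fa - nrd_md)

# Illustrative inputs (assumptions, see lead-in).
chi_det = detection_limit(sigma=2e-4, r_fm=0.25, r_w=0.0)
chi_dis = discrimination_limit(sigma=2e-4, nrd_fa=0.25, nrd_md=0.0)
print(f"chi_det = {100 * chi_det:.2f}%, chi_dis = {100 * chi_dis:.2f}%")
```

With these inputs the limits come out at roughly 0.2% and 0.3% of a pixel; a noisier sensor raises both in direct proportion to σ.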
Simulation-Experiment: In the experiment, reflectance of a pixel covered by both floating matter and water (i.e., mixed pixel) was estimated with their endmember spectra and χ.
Then, the pixel's spectra were compared with the endmember spectra to determine their spectral similarity.
The similarity between two spectra was estimated using a spectral angle measure (SAM). The choice of SAM over other similarity measures is because SAM is based on spectral shape only. χ for marine debris or other floating matters is often very small and also variable, thus the reflectance magnitude of the mixed pixel should be deemphasized.
Mathematically, SAM is the angle between two spectral vectors, defined as:
$\mathrm{SAM\ (degrees)} = \cos^{-1}\left[\dfrac{\sum x_i y_i}{\sqrt{\sum x_i^2}\,\sqrt{\sum y_i^2}}\right]$ (Eq. 26)
where x and y represent two spectra and the summation is for band number i from 1 to N. SAM=0° means two parallel spectra in log space (i.e., identical spectral shapes), while SAM=90° means perpendicular spectra (i.e., completely different spectral shapes). SAM<5° indicates that the two spectra are very similar.
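Eq. 26 can be implemented in a few lines, as in the following sketch. The function name is hypothetical, and the clamping of the cosine to [−1, 1] is an added safeguard against floating-point round-off rather than part of the definition.

```python
import math

def spectral_angle_degrees(x, y):
    """Spectral angle (Eq. 26) between two spectra sampled at the same N bands."""
    if len(x) != len(y):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.degrees(math.acos(cosine))

# Scaling a spectrum does not change its shape, so the angle stays near 0.
same_shape = spectral_angle_degrees([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = spectral_angle_degrees([1.0, 0.0], [0.0, 1.0])
```

Because the angle ignores magnitude, a mixed pixel with a small, variable χ can still be compared against a full-coverage endmember.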
Four (4) endmember spectra were selected in the experiment: Sargassum (
The hyperspectral data of the 4 endmembers were first resampled to MSI wavelengths using their relative spectral response (RSR) functions, and then mixed using different subpixel coverage (χ from 1% to 20%). Then, both the mixed spectra, Rχ, and their contrasts from water, ΔRχ, were compared with the endmembers to determine their similarity using Eq. 26.
In the above simulation experiment, because MSI bands have different spatial resolutions, the same experiment was conducted twice. The first used imaginary MSI bands where their resolutions were all set to 10 m. The second used realistic resolutions for individual bands (either 60, 10, or 20 m). In the latter case, if a 10-m band had χ=20%, the 20-m band had χ=20%/4=5%. Therefore, χ varied between bands in the same mixed-pixel spectra, causing distorted spectral shapes (see below).
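The two versions of the mixing experiment can be sketched as follows. This is a simplified illustration: the function name is hypothetical, the endmember spectra are flat placeholder values rather than the measured spectra, and the floating matter is assumed to sit within a single 10-m pixel so that χ in a coarser band shrinks by the ratio of pixel areas (e.g., 20%/4 = 5% for a 20-m band).

```python
# MSI vis-NIR band centers (nm) and ground resolutions (m), as listed in the text.
MSI_BANDS_NM = [443, 492, 560, 665, 704, 741, 783, 841, 865]
MSI_RES_M = [60, 10, 10, 10, 20, 20, 20, 10, 20]

def mix_pixel(r_fm, r_w, chi_10m, resolutions_m=None):
    """Return (R_T, delta_R) per band from Eq. 21.

    chi_10m is the subpixel fraction within a 10-m pixel (0 to 1). When
    resolutions_m is given, chi in each band is scaled by (10 / resolution)^2;
    otherwise all bands are treated as 10-m bands.
    """
    if resolutions_m is None:
        resolutions_m = [10] * len(r_fm)
    r_t, delta_r = [], []
    for fm, w, res in zip(r_fm, r_w, resolutions_m):
        chi = chi_10m * (10.0 / res) ** 2
        r_t.append(chi * fm + (1.0 - chi) * w)
        delta_r.append(chi * (fm - w))
    return r_t, delta_r

# Placeholder flat endmembers (hypothetical values, for illustration only).
r_fm = [0.25] * len(MSI_BANDS_NM)
r_w = [0.01] * len(MSI_BANDS_NM)
_, dr_uniform = mix_pixel(r_fm, r_w, chi_10m=0.20)
_, dr_realistic = mix_pixel(r_fm, r_w, chi_10m=0.20, resolutions_m=MSI_RES_M)
```

Even with perfectly flat endmembers, the realistic-resolution ΔR is no longer flat: the 20-m and 60-m bands are depressed relative to the 10-m bands, which is the spectral distortion described above.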
Sensitivity: Table VI shows σ from the proposed NASA mission (HyspIRI, currently Surface Biology and Geology or SBG) assuming an SNR of 200, and σ estimated from MSI measurements over clear-water scenes. Here, σ represents noise estimated from R instead of total at-sensor radiance. Only several MSI bands in the green, red, and NIR wavelengths are listed because these are the most relevant bands to detect floating matters. Rt,typical is the typical total reflectance over oceans under cloud-free and glint-free conditions. In some examples, MSI SNRs are lower than the proposed HyspIRI SNRs, and the corresponding σ is 2-4 times higher than the proposed HyspIRI σ. For simplicity, σMSI in the NIR is assumed to be the mean + 2 standard deviations (6×10−4); likewise, σH in the NIR is assumed to be 2×10−4.
Table VI. Reflectance noise (σ) used in the sensitivity analysis. Rt,typical is typical top-of-atmosphere reflectance over the ocean. “H” is for HyspIRI specification. σMSI is estimated from clear-water scenes. For simplicity, σMSI in the NIR is assumed to be mean +2 standard deviations, about 6×10−4. Likewise, σH in the NIR is assumed to be 2×10−4.
Then, for a HyspIRI-like sensor, Eq. 20 suggests that in order for a pixel to “stand out” from the nearby background water pixels, ΔR needs to be >˜6×10−4. For MSI-like sensors with σ˜6×10−4, ΔR can be >˜2×10−3. Assuming RFMNIR≈0.25 (
$\chi_{det}^{H} \geq 2\sqrt{2}\,\sigma_{H} / (R_{FM} - R_W) \approx 0.2\%$
$\chi_{det}^{MSI} \geq 2\sqrt{2}\,\sigma_{MSI} / (R_{FM} - R_W) \approx 0.8\%$ (Eq. 27)
Similarly, assuming RFANIR−RFAred≈0.25 (
$\chi_{dis}^{H} \geq 4\sigma_{H} / 0.25 \approx 0.3\%$
$\chi_{dis}^{MSI} \geq 4\sigma_{MSI} / 0.25 \approx 1.0\%$ (Eq. 28)
These estimates are based on the assumption that 1) for both floating plastics (or other debris) and floating algae, their NIR reflectance is ˜0.25 (
Microplastics: From 11,854 surface trawls between 1971 and 2013, microplastics distributions in global oceans (microplastics defined by particle size <5 mm) were compiled and analyzed. The dominant majority showed surface density of <1M pieces km−2, and nearly the entire data archive showed <10M pieces km−2. So far, the highest reported density is 26M pieces km−2. In other examples, a dataset of marine plastic debris measured at 1,571 stations from 680 net tows and 891 visual survey transects was compiled. The dominant majority of all compiled particle density and modeled particle density is <1M pieces km−2 for particles <4.75 mm in size. Therefore, the maximum density of microplastics in natural waters is ˜10M pieces km−2. In another example, the size of plastics from a compiled dataset can show a log-normal distribution with most particles of <5 mm and the histogram mode of ˜2 mm.
Assuming a mean size of 2.5 mm per piece, the maximum density of 10M pieces km−2 is equivalent to about 50 m2 of microplastics km−2 (i.e., χ=0.005% of a pixel) if all pieces are laid on the very surface without blocking each other. Clearly, χ=0.005% is <<χdetH (0.2%) and also <<χdetMSI (0.8%, Eq. 27). Indeed, for χ=0.005%, ΔR=0.25×χ≈1×10−5. Such a signal, corresponding to the maximum microplastics density reported in the literature, is 60 times lower than 6×10−4 and 20 times lower than the sensor noise for a sensor with SNRs of 200. In turn, for microplastics particles to be detected, their density would desirably be at least >600M pieces km−2, or 600 pieces m−2. Even then, all these particles need to be aggregated on the very surface without blocking each other in order to achieve a maximum reflectance signal.
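The areal-coverage arithmetic above can be reproduced with the following sketch. It assumes, purely for illustration, that each piece is a flat disc with a 2.5-mm diameter lying on the surface without overlap; the function name is hypothetical, and the disc geometry is an assumption chosen because it gives a figure close to the 50 m2 km−2 stated in the text.

```python
import math

def areal_fraction(pieces_per_km2, piece_diameter_mm):
    """Fraction of the sea surface covered by non-overlapping disc-shaped pieces."""
    piece_area_m2 = math.pi * (piece_diameter_mm * 1e-3 / 2.0) ** 2
    covered_m2_per_km2 = pieces_per_km2 * piece_area_m2
    return covered_m2_per_km2 / 1.0e6  # 1 km^2 = 1e6 m^2

chi = areal_fraction(pieces_per_km2=10e6, piece_diameter_mm=2.5)
delta_r = 0.25 * chi  # Eq. 21 with R_FM - R_W assumed to be 0.25 in the NIR
print(f"chi = {100 * chi:.4f}%, delta R = {delta_r:.1e}")
```

The result is a coverage of about 0.005% and a reflectance contrast on the order of 1×10−5, far below the noise level of an SNR-200 sensor.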
In the marine environment, because microplastics particles may actually be below the water surface due to mixing or other processes, their NIR reflectance can be much lower than when they are all aggregated at the ocean surface. Therefore, the density desirable for detection is even higher than shown above, reinforcing the argument that remote detection of microplastics can be approached with a different technique.
In further examples, when the particles are heavily concentrated along narrow ocean fronts, windrows, or in small-scale eddy convergence zones so that particle density is >0.2% (i.e., >600 m−2), the reflectance anomalies and therefore presence/absence of microplastics particles can be detected.
The estimates above are based on the assumed SNR of ˜200, as proposed for the HyspIRI mission (currently SBG). High spatial-resolution sensors typically have SNRs much lower than 200, resulting in much higher sensor noise (Table VI). In such realistic cases, the detection limit is also higher, for example χdet>0.8% (or particle density>2400 m−2) for MSI.
Finally, the above arguments are purely from the perspective of instrument sensitivity. In some special cases, microplastics may aggregate among other larger floating matters, for example Sargassum. Because Sargassum density and distributions can be estimated using both coarse- and medium-resolution satellite sensors, if a relationship between microplastics and Sargassum density can be established from field surveys, the relationship may be applied to the synoptic observations of Sargassum in the Atlantic to estimate the total amount of microplastics around these large macroalgae mats.
Macroplastics and other debris: Although made of different materials, both macroplastics (>5 mm) and other non-plastic debris have broad-band spectral response (e.g.,
Similar to the detection of microplastics, Step 1 in macro debris detection can also be to detect a spatial anomaly, where the desirable condition on subpixel coverage is the same: χdet>0.2% for an SNR of 200, assuming macro debris is on the very surface as opposed to being submerged in water. For a 10-m pixel, this means that the macro debris patch within the pixel is desirably larger than 0.2 m2. For MSI, the detection limit can be χdet>0.8% or 0.8 m2. Once a spatial anomaly is detected, spectral analysis can be performed in Step 2 to tell whether the anomaly is due to macro debris or other floating matters (i.e., floating algae). From Eq. 28, χdis is desirably >0.3% for a HyspIRI-like sensor, and >1% for MSI. For a 10-m pixel, this means 0.3 m2 and 1 m2, respectively.
Although still difficult, these detection and discrimination limits can certainly be met under certain circumstances, for example around river mouths, in frontal convergence zones, or along windrows of the ocean. However, in practice, as shown below, while detection and discrimination of floating algae and non-algae features are possible, discrimination of macro debris is actually more demanding than shown above due to spectral similarity among different types of floating matters.
Simulation experiment: Spectral shape variations: While the sensor's sensitivity or SNRs to detect and discriminate marine debris is described above, this section illustrates how spectral shape changes with spectral endmember, water type (clear or turbid), and χ.
In both figures (
In further examples,
In even further examples, when the blue bands of 443-nm and 492-nm are excluded, the spectral shape in ΔR of mixed pixels is also stable between clear and turbid waters (i.e., the empty and solid symbols almost overlap with each other). This can indicate that in applications of satellite imagery in different water environments, water type (i.e., clear or turbid) may be excluded from consideration when performing spectral analysis.
Simulation experiment: Spectral similarity: While
From these results, the following can be summarized. One, consistent with the findings from
Two, such an ability is compromised for MSI spectra with mixed band resolutions because their spectral shapes are distorted due to the variable χ in different bands (
Three, in contrast, if all MSI bands are forced to have the same 10-m resolution, Sargassum-containing pixels and plastic-containing pixels can be easily separated through comparing their SAM values to both endmembers, and such a separation is possible down to at least 1% subpixel coverage, a result consistent with the sensitivity analysis above. This is shown by the solid bars between the two colors in
Although the simulation experiment used only two endmembers of Sargassum and plastic, because their spectral shapes can represent floating algae and macro debris, respectively, the findings above can help guide spectral analysis and algorithm development when applying MSI imagery to detect marine debris and other floating matters, as shown below. Indeed, most floating algae have similar red-edge reflectance and similar 670-nm reflectance trough as in the Sargassum endmember (
Practical considerations: The above sensitivity analysis (Eqs. 27 and 28) is based on ideal situations where image noise is assumed to come from the sensor noise only, and spectral shapes in either the marine debris endmembers or the floating algae endmembers are stable. Therefore, both χdet and χdis represent lower bounds: detection and discrimination are impossible below them, while above them whether they are possible still depends on other factors.
For example, the detection limit can be estimated from a single band in the NIR because this is where the maximum contrast occurs between floating matters and water, but single-band images are difficult to interpret as reflectance of the water background may change substantially across the image. The use of floating algae index (FAI) or other indexes (e.g., FRGB) may facilitate image visualization, but the detection limit may be compromised to a higher value, for example to ˜1% for floating algae with a SNR of 200. Likewise, in practice, a single pixel above the detection limit is difficult to interpret (see
Furthermore, different sensors have different artifacts, which can be considered carefully when analyzing the spatial and spectral anomalies. One example is the hardware parallax in push-broom sensors such as MSI (ESA) and Landsat-8 OLI, where different bands are not co-registered in time for a given pixel. Such an effect can create colorful pepper noise in RGB composite imagery over moving targets (e.g., waves,
Similar to the sensitivity analysis, the simulation results described above are also simplified to demonstrate the concept of 1) why ΔR is preferred over R to differentiate floating matter type and 2) for MSI, why the use of single pixels is not a practical way for spectral discrimination even if ΔR is used. In real applications, other types of floating matters as well as spectral modulations by image noise also need to be considered. However, even such conceptual demonstrations may provide some guidance on how to perform the spectral analysis.
For example, to avoid MSI spectral distortion in single pixels, mean spectra from 5×5 pixels instead of a single pixel may be used to derive the spectral shape in a more reliable way, as shown in
In
Similar observations are obtained in
The results in
In some examples, the reflectance magnitude in both cases can be very small. Assuming a NIR reflectance of 0.25 in both endmembers for χ=100%,
Example Case Study over West Florida Shelf: With the findings above, an example case study using MSI data is presented here to demonstrate how to detect image features (spatial anomaly) and how to discriminate the feature type.
For simplicity, the first step in remote detection of marine debris, i.e., detecting a spatial anomaly, can be through visual inspection while sophisticated image segmentation may be implemented in the future.
Spectral analysis of representative pixels from the slicks, through the use of ΔRrc as in
In contrast, although it is clear that the slicks in Areas 1 (2702) & 2 (2704) are not floating algae, it is very difficult to discriminate their type based on the spectral shapes. The 443-nm band may be excluded because the spectral distortion by this 60-m band cannot be removed even after 5×5 pixel averaging. Then, except for the residual band-resolution effect in several NIR bands (704-nm, 741-nm, 783-nm, 865-nm, all having 20-m resolution), all spectra are featureless (
First, foams or white caps may be ruled out because these slicks show red-rich spectra (in contrast, whitecaps are blue-rich) and because the ocean was very calm (wind is mostly <3 m s−1 for two consecutive days before the imaging time,
Which wavelengths (bands) to use: The current disclosure comprises embodiments that exploit the vis-NIR bands rather than the SWIR bands for several reasons. First, the spectral contrast between floating matters and background waters is mostly in the NIR, regardless of the type of floating matters (either Sargassum, Ulva, Trichodesmium, or plastics, see references in
Furthermore, when implementing a detection scheme, although the use of a single NIR band may maximize the pixel-to-pixel contrast at local scale, interpreting single-band images is usually difficult because of relatively large gradients (compared to noise) across the image. Therefore, band-combination indexes, such as FAI, may be used to mitigate such effects at the price of reduced sensitivity to detect spatial anomalies. This is a reason why the same SNR of 200 leads to ~1% detectability when FAI is used but ~0.2% for a single NIR band in the current disclosure. The FAI design can be changed to use different band combinations depending on band availability and application needs, for example through the alternative FAI (AFAI) or the floating debris index (FDI). In the end, a combination of a red band (665 nm), a NIR band (754 nm), and another NIR band (865 nm) or a SWIR band (1.2 μm or 1.6 μm) can be sufficient in detecting the presence of floating matters.
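A band-combination index of the FAI type can be sketched as a baseline subtraction, as below. The formula is not spelled out in this disclosure; the sketch assumes the commonly used form in which the NIR reflectance is compared against a straight line drawn between the red band and a longer-wavelength band. The function name and the default band centers (665, 754, and 865 nm, taken from the band combination mentioned above) are illustrative choices.

```python
def baseline_index(r_red, r_nir, r_long, wl_red=665.0, wl_nir=754.0, wl_long=865.0):
    """NIR reflectance minus a linear baseline between the red and long-wavelength bands."""
    baseline = r_red + (r_long - r_red) * (wl_nir - wl_red) / (wl_long - wl_red)
    return r_nir - baseline

# A spectrum that is linear across the three bands gives an index of zero ...
water_like = baseline_index(r_red=0.020, r_nir=0.0289, r_long=0.040)
# ... while enhanced NIR reflectance (e.g., floating matter) gives a positive value.
anomaly = baseline_index(r_red=0.020, r_nir=0.0789, r_long=0.040)
```

Because three bands (and therefore three noise terms) enter the index, its detection limit is expected to be higher than that of a single NIR band, consistent with the trade-off described above.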
Then, the inclusion of a green band (560 nm) will, together with the 665-nm band and NIR bands, make it easy to calculate SAM in order to discriminate between floating algae and non-algae floating matters (
Discriminating the type of non-algae floating matters: Once non-algae floating matters are identified using the above SAM-based or other similar approach, further discriminating the type of non-algae floating matters can represent a technical challenge as most non-algae debris appear to be similar in spectral shapes (
One exception might be the separation of plastic versus non-plastic marine debris, as the former shows specific, narrow, hydrocarbon absorption features in the SWIR wavelengths. Once hyperspectral data at those wavelengths are available, it might be possible to fingerprint floating plastics. This is similar to the use of these features in detecting and quantifying emulsified oil on the ocean surface. However, one drawback is the small magnitude of these features from marine debris, which may be extremely difficult to detect for the reasons outlined above.
Despite these difficulties, discriminating marine debris from other non-algae floating matters may still be possible through non-spectroscopy methods. For example, most of the non-algae floating matters are rare in the ocean, with often known locations, thus can be easily ruled out with some a priori knowledge. Inspection of wind data can also help rule out the possibility of whitecaps.
On the other hand, although discrimination between floating algae and floating debris (and further discrimination of the type of floating debris) is desirable, it is not always necessary in an ecological perspective. This is because biofouling of marine debris is common, which has implications on both the ecosystem and the fate of marine debris.
Automation: Based on these observations, it is possible to implement a step-wise approach to automate the detection and quantification of both floating algae and non-algae floating matters (including macro debris), for example:
Step 1: detecting a spatial anomaly and delineating image features. This can be based on a single band or a combination of bands using image segmentation techniques.
Step 2: discriminating between floating algae and non-algae floating matters. This can be based on their difference around 665 nm, which can be quantified through the use of SAM or other indexes. In this step, the spectral shape around 665 nm can be derived from the reflectance difference (ΔRrc) in order to maximize the spectral contrast, where the background water pixels for the individual slick pixels can be found using a nearest-neighbor approach.
Step 3: quantifying χ in each pixel, through the use of a locally tuned lower-bound threshold to represent χ=0% and a pre-defined upper-bound threshold to represent χ=100% (e.g., upper bound for a single NIR band may be 0.25, but for AFAI may be 0.1 according to the endmember spectra presented in
In all steps above, a fundamental property is the spectral reflectance derived from satellite measurements. Ideally, atmospherically corrected surface reflectance can be used to remove the variable effects due to Rayleigh scattering, gaseous absorption, aerosol scattering, sun glint, and solar/viewing geometry. However, this is not always possible from a pixel-wise atmospheric correction approach (e.g., currently being used in the NASA processing software SeaDAS and ESA processing software SNAP) because the presence of floating matter can violate atmospheric correction assumptions on negligible or predictable (predicted from the red band) NIR-SWIR surface reflectance. Therefore, a nearest-neighbor atmospheric correction or a dark-target based image-wise atmospheric correction approach, such as that implemented in the Acolite software, may be used. On the other hand, because ΔR rather than R is used in Steps 2 and 3, both Rrc and Rt can be used. This is because atmospheric effects over adjacent pixels are assumed to be the same, leading to ΔRrc=ΔRt=ΔR.
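The quantification in Step 3 above can be sketched as a scaling between the two thresholds. The linear interpolation between the lower and upper bounds is an assumption made for illustration, as are the function name and the example values; NumPy is assumed to be available.

```python
import numpy as np

def subpixel_fraction(index, lower, upper):
    """Map an index (e.g., delta R in a NIR band, or AFAI) to chi in [0, 1].

    lower: locally tuned threshold representing chi = 0%.
    upper: pre-defined threshold representing chi = 100%.
    Linear scaling between the two bounds is an illustrative assumption.
    """
    chi = (np.asarray(index, dtype=float) - lower) / (upper - lower)
    return np.clip(chi, 0.0, 1.0)

# Hypothetical NIR delta R values; the upper bound of 0.25 follows the text.
delta_r_nir = np.array([0.0, 0.025, 0.25, 0.30])
chi = subpixel_fraction(delta_r_nir, lower=0.0, upper=0.25)
```

Values above the upper bound are clipped to full coverage, and values at or below the lower bound are treated as water.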
Automation also can use cloud masking and other steps to mask pixels for which it is impossible to determine whether they contain floating matters. These pixels are treated as no-observation pixels to avoid biasing statistics. This is more difficult than implementing the above steps, as there is no universal way to mask these pixels. For example, the standard cloud-masking algorithm in SeaDAS uses a threshold of Rrc(865 nm)>0.027 to mask clouds, and the modified algorithm uses a threshold of Rrc(2130 nm)>0.0175 to mask clouds. Because floating matters enhance NIR and SWIR reflectance, these cloud-masking schemes may falsely mask some floating matter pixels as clouds. To overcome this difficulty, customized cloud-masking algorithms have been used for different sensors, yet their global applicability remains to be evaluated. Likewise, cloud shadows in high-resolution imagery, among other “artifacts”, also can be identified and masked. In some examples, a regional near real-time system can be developed to monitor both floating algae and non-algae floating matters, similar to the Sargassum Watch System (SaWS) established for the Atlantic Ocean and Gulf of Mexico.
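By way of example only, the two threshold tests mentioned above can be sketched in Python as follows, to show how a bright floating-matter pixel may be treated as a no-observation pixel. The thresholds are those quoted above; the function names and the sample Rrc values are hypothetical and chosen only to illustrate the failure mode.

```python
# Thresholds quoted in the text above.
RRC_865_CLOUD_THRESHOLD = 0.027    # standard test on Rrc(865 nm)
RRC_2130_CLOUD_THRESHOLD = 0.0175  # modified test on Rrc(2130 nm)


def is_masked_as_cloud(rrc_865, rrc_2130, use_modified=False):
    """Return True if the selected threshold test flags the pixel as cloud."""
    if use_modified:
        return rrc_2130 > RRC_2130_CLOUD_THRESHOLD
    return rrc_865 > RRC_865_CLOUD_THRESHOLD


def label_pixel(rrc_865, rrc_2130, use_modified=False):
    """Label a pixel so that masked pixels can be excluded from statistics."""
    if is_masked_as_cloud(rrc_865, rrc_2130, use_modified):
        return "no_observation"
    return "valid"


# Hypothetical Rrc values, for illustration only.
clear_water = {"rrc_865": 0.010, "rrc_2130": 0.005}
dense_floating_matter = {"rrc_865": 0.060, "rrc_2130": 0.010}

print(label_pixel(**clear_water))
print(label_pixel(**dense_floating_matter))
print(label_pixel(**dense_floating_matter, use_modified=True))
```

With these hypothetical values, the floating-matter pixel exceeds the 865-nm threshold and is masked, although no cloud is present. Whether the 2130-nm test also misflags such a pixel depends on its SWIR reflectance.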
Optical sensors: In some examples, an initial review of the current satellite sensor capability can be provided, including SAR and LIDAR, with the focus on sensors equipped with NIR and SWIR bands to emphasize the contrast between marine debris and water in these bands. For passive optical sensors, none of the existing satellite sensors was designed to monitor floating matters, especially for the case of marine debris. The observations above may provide some general guide for an “optimal” sensor.
The characteristics of passive optical remote sensing can be generalized in the following 4 resolutions: spatial, spectral, radiometric, and temporal resolutions, with the last two defined by SNRs and site revisit frequency. The first three resolutions are discussed in this disclosure, and the last resolution depends on both sensor and satellite orbital designs. Because there is always a trade-off between all four resolutions, for a coastal region, an optimal sensor can detect and discriminate non-algae floating matter patches of several m² in size every few days. Such a capacity can enable the sensor to search for missing fishing gear or large solid objects in the ocean due to a tsunami or other disasters. Thus, the following may serve as an optimal trade-off: 3-4 m spatial resolution, 4-6 spectral bands (443 nm, 560 nm, 620 nm, 665 nm, 754 nm, 865 nm, with the 665-nm band having 10-20 nm bandwidth), SNRs of 50-100 at typical ocean radiance inputs, and revisit frequency of 3-4 days. The 865-nm band can be used to form a baseline with the 670-nm band to calculate AFAI. If only 5 bands are allowed, the 865-nm band may be removed because the 754-nm and 670-nm bands can be used to calculate the normalized difference vegetation index (NDVI). The 620-nm band can differentiate floating algae colors (green-rich or orange-rich) but can be sacrificed if only 4 bands are allowed. The other 4 bands (443, 560, 670, and 754 nm) can represent the core conditions for effective detection and discrimination of floating algae and non-algae floating matters.
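As an illustrative sketch of the two indexes named above, AFAI may be computed as the 754-nm reflectance minus a linear baseline drawn between the 670-nm and 865-nm bands, and NDVI from the 754-nm and 670-nm bands. The baseline-subtraction form follows the commonly used floating algae index formulation and is an assumption here, as are the function names and sample values.

```python
def afai(r_red, r_nir, r_nir2, wl_red=670.0, wl_nir=754.0, wl_nir2=865.0):
    """Baseline-subtraction index evaluated at the wl_nir band.

    The baseline is the straight line between (wl_red, r_red) and
    (wl_nir2, r_nir2). Default wavelengths follow the bands named above.
    """
    slope_fraction = (wl_nir - wl_red) / (wl_nir2 - wl_red)
    baseline = r_red + (r_nir2 - r_red) * slope_fraction
    return r_nir - baseline


def ndvi(r_red, r_nir):
    """Normalized difference vegetation index from red and NIR reflectance."""
    total = r_nir + r_red
    if total == 0:
        raise ValueError("sum of red and NIR reflectance is zero")
    return (r_nir - r_red) / total


# Hypothetical reflectance values for a pixel with a red-edge peak near 754 nm.
r670, r754, r865 = 0.02, 0.06, 0.03
print(f"AFAI = {afai(r670, r754, r865):.4f}")
print(f"NDVI = {ndvi(r670, r754):.2f}")
```

If the 865-nm band is dropped in a 5-band design, only `ndvi` remains computable in this sketch, which matches the band-removal reasoning above.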
Currently, the Sentinel-2 MSI sensors almost meet these conditions, yet the mixed band resolutions degrade their capacity in discriminating floating matter types, and their spatial resolution may also be improved. On the other hand, the PlanetScope/Dove constellation of hundreds of miniature satellite sensors (CubeSats) provides 3-4 m resolution data in 3-4 spectral bands with 2-3-day revisit frequency in many coastal areas, thus almost meeting the conditions above. Unfortunately, Dove spectral bands are too wide (60-90 nm) to discriminate floating algae from non-algae floating matters. Nevertheless, these existing satellite sensors, combined with other high-resolution sensors, can provide a “practical” solution to meet sensor conditions.
From the above sensitivity analysis, simulation experiments, and case studies using MSI imagery, the following observations and suggestions can be generalized:
Regardless of the floating matter type, whether marine debris, oil slicks, or floating vegetation, from the perspective of spectroscopy, remote detection can be done through spatial anomaly analysis and remote discrimination can be done through analysis of spectral shape similarity, as opposed to spectral magnitude. Both depend on a sensor's SNRs and band settings.
From a theoretical basis, with only 4 bands around 560 nm, 665 nm, and two NIR wavelengths, both χdet and χdis depend only on sensor sensitivity (SNRs). They are 0.2% and 0.3% for a sensor with SNR of 200, and 0.8% and 1.0% for MSI. Below these limits, floating matter may not be detectable. Considering other practical conditions (e.g., a spatially coherent image feature instead of a single pixel can “stand out” from the background), these limits may be increased by 2-3 times in order to detect and discriminate floating matters. For example, χdet and χdis for MSI may actually be 2% and 3%, respectively (
While the detection of presence/absence of floating matter can be performed through single-band images or band-combination indexes, each having its own strengths and weaknesses, spectral similarity (or anomaly) needs to be analyzed through the difference spectra (ΔR in Eq. 21) because this is the only way to retain the spectral shape of the floating matter endmember when χ is typically small (e.g., <10%). Such a practice actually started when differentiating Sargassum from Trichodesmium in the Gulf of Mexico. For the same reason, using spectral shapes derived from mixed pixels as endmembers can be subject to large uncertainties because these shapes depend not only on the floating matter endmember, but also on the unknown χ as well as on changes in the water endmember in the real environment.
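To illustrate why the difference spectrum rather than the mixed-pixel spectrum is compared against a reference, the following Python sketch computes a spectral angle (as in SAM) for both. It assumes a simple linear mixing model, R = χ·R_fm + (1−χ)·R_w, which stands in for the relationship referenced above and is not reproduced from it; the four-band spectra and function names are hypothetical.

```python
import math


def spectral_angle(a, b):
    """Angle in degrees between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("spectra must be non-zero")
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))


def difference_spectrum(pixel, background):
    """Band-by-band difference between a pixel and nearby water."""
    return [p - w for p, w in zip(pixel, background)]


# Hypothetical 4-band spectra (two visible bands, two NIR bands).
endmember = [0.05, 0.04, 0.30, 0.28]   # floating matter at 100% cover
water = [0.02, 0.01, 0.005, 0.003]     # nearby water
chi = 0.05                             # 5% sub-pixel cover

# Assumed linear mixing of endmember and water within one pixel.
mixed = [chi * e + (1 - chi) * w for e, w in zip(endmember, water)]
delta_r = difference_spectrum(mixed, water)

print(f"angle(mixed R, endmember)  = {spectral_angle(mixed, endmember):.1f} deg")
print(f"angle(delta R, endmember)  = {spectral_angle(delta_r, endmember):.1f} deg")
```

Under this assumed model, ΔR equals χ times the difference between the endmember and water spectra, so its shape does not change with χ, whereas the shape of the mixed spectrum does.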
While detecting macro debris appears possible from MSI, spectral shapes of a mixed pixel can be modulated by the variable χ, variable water reflectance, mismatch in band spatial resolution, artifacts caused by waves, and sensor artifacts (e.g., parallax effect). This is true even for the same type of floating matter, not to mention the possibility that more than one type of floating matter may exist in the same region. For the reasons mentioned above, in the absence of a “pure” spectral endmember (i.e., χ=100%), it is better to use ΔR rather than R to represent the endmember.
Because of the spectral similarity between macro plastics and non-plastic debris (i.e., relatively flat, wide-band reflectance in the vis-NIR wavelengths), it appears difficult to separate them spectrally. This is also partially due to the lack of a relatively complete spectral library of various marine debris. More measurements are therefore desirable to complement those reported in previous studies focused on plastics. On the other hand, because marine debris may be a mixture of different types of materials, discriminating a specific type may be unnecessary.
The spectral analysis here using Sargassum and plastic is for demonstration purposes only. In the marine environment, other known (e.g.,
Finally, the disclosure uses MSI data to demonstrate the concept of remote sensing of marine debris in marine environments. The arguments may change substantially for remote sensing of marine debris that is heavily concentrated on beaches. In both environments, sensors with finer spatial resolution such as WorldView (2 m) or PlanetScope/Dove (3-4 m) offer a better capacity for detecting smaller debris patches, although detecting such features in marine environments takes more effort because these sensors were not designed for marine applications and are therefore subject to higher noise.
In some examples, while the approach can be applied to any multi-band or hyperspectral satellite-borne or airborne sensors, in practice the following steps may be used to implement a system to provide near real-time Sargassum maps or other floating matter maps.
First, a best available type of image data for a region is determined, based on image quality, resolution, and acquisition time. In some instances, different image types will be determined for different portions of a larger area. Next, the image analysis methods described above are applied to the best available data to generate the best-quality Sargassum maps. If the above is not possible due to lack of satellite data coverage or cloud cover, then other recent data (which may be inferior to images of the preferred type) are used.
Next, a narrow time window (e.g., one week) is selected, and all available images from that window may be used to fill data gaps due to cloud cover and other artifacts. Finally, based on sequential images from the past week and past month as well as the current time of year, future presence and abundance are predicted. For example, during spring, if the Sargassum amount in a certain region increased in the last month, the amount is likely to continue increasing in the short term.
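As one non-limiting way to fill data gaps from images within a time window, the following Python sketch forms a per-pixel composite that ignores no-observation pixels. Representing no-observation pixels as `None` and combining valid observations with a mean are assumptions made for illustration; other statistics (e.g., a maximum or median) may be used.

```python
def composite(images):
    """Per-pixel mean over a stack of images, skipping no-observation pixels.

    images: list of 2-D lists with identical shape; each entry is a
    Sargassum cover fraction, or None where the pixel was masked
    (cloud, cloud shadow, or other artifacts).
    Returns a 2-D list; a pixel stays None if no image observed it.
    """
    n_rows, n_cols = len(images[0]), len(images[0][0])
    result = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            valid = [img[r][c] for img in images if img[r][c] is not None]
            row.append(sum(valid) / len(valid) if valid else None)
        result.append(row)
    return result


# Hypothetical 2x2 maps from two days within the same one-week window.
day_1 = [[0.10, None], [None, None]]
day_2 = [[0.30, 0.20], [None, None]]

for row in composite([day_1, day_2]):
    print(row)
```

Pixels that remain `None` after compositing were not observed in any image of the window and can be excluded from regional statistics.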
In an example, systems and methods herein may be used to provide Sargassum estimation and forecasting as a service to users, in a cloud computing environment, or in a software package that may be installed on a user's local device. Non-limiting examples of services which may be provided include regional guidance for harvesters, fishermen, and coastal resource managers. Recursively updated near real-time Sargassum maps may be provided as well as historical Sargassum maps for the same region and same season. Real-time (or near real-time) boat positions may be overlaid on Sargassum maps. Surface currents and other information may also be overlaid on Sargassum maps. Systems and methods herein may also be used to answer questions on demand for specific patches/locations (e.g., total Sargassum quantity at a particular location or within a predefined radius of a specified location) and may also provide predictions of answers to similar questions at future times.
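As an illustrative example of an on-demand query, the following Python sketch totals the Sargassum-covered area within a given radius of a specified location. The pixel record layout (latitude, longitude, cover fraction, pixel area), the use of great-circle distance, and the function names are hypothetical and not part of any particular embodiment.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def sargassum_area_within(pixels, lat, lon, radius_m):
    """Sum cover fraction times pixel area for pixels inside the radius.

    pixels: iterable of (lat, lon, cover_fraction, pixel_area_m2) tuples.
    """
    return sum(
        fraction * area_m2
        for p_lat, p_lon, fraction, area_m2 in pixels
        if haversine_m(lat, lon, p_lat, p_lon) <= radius_m
    )


# Hypothetical pixels: (lat, lon, cover fraction, pixel area in m^2).
pixels = [
    (25.00, -80.00, 0.10, 100.0),
    (25.01, -80.00, 0.50, 100.0),
    (26.00, -80.00, 0.90, 100.0),  # roughly 111 km away, outside the radius
]
total_m2 = sargassum_area_within(pixels, 25.00, -80.00, radius_m=5000.0)
print(f"Sargassum-covered area within 5 km: {total_m2:.1f} m^2")
```

The same query may be run against a predicted map for a future date to answer the corresponding forecast question.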
In some embodiments, any suitable computer readable media can be used for storing instructions for performing the functions and/or processes described herein. For example, in some aspects, computer readable media can be transitory or non-transitory. For example, non-transitory computer readable media can include media such as magnetic media (such as hard disks, floppy disks, etc.), optical media (such as compact discs, digital video discs, Blu-ray discs, etc.), semiconductor media (such as RAM, Flash memory, electrically programmable read only memory (EPROM), electrically erasable programmable read only memory (EEPROM), etc.), any suitable media that is not fleeting or devoid of any semblance of permanence during transmission, and/or any suitable tangible media. As another example, transitory computer readable media can include signals on networks, in wires, conductors, optical fibers, circuits, or any suitable media that is fleeting and devoid of any semblance of permanence during transmission, and/or any suitable intangible media.
It should be noted that, as used herein, the term mechanism can encompass hardware, software, firmware, or any suitable combination thereof.
In some scenarios, various companies can utilize the example techniques for remote detection of marine debris and/or macroalgae in various locations (e.g., on beaches, in nearshore waters, in offshore waters, etc.) disclosed herein. For example, companies collecting Sargassum rather than plastic can exploit the example techniques. In other examples, companies collecting Sargassum to make a product based on the Sargassum can use the example techniques. In further examples, research institutes or universities researching Sargassum on beaches, in nearshore waters, and/or in offshore waters or marine debris in offshore waters can use the example techniques. In even further examples, locations and quantities of Sargassum identified by the example techniques can be transmitted to a website, a phone, or any other suitable communication channel. In further examples, the locations and quantities of Sargassum can be tracked to find the trajectories of Sargassum and be shown along with directions of surface currents. It should be appreciated that the use cases described herein are not limiting.
It should be understood that steps of processes described above can be executed or performed in any suitable order or sequence not limited to the order and sequence shown and described in the figures. Also, some of the above process steps can be executed or performed substantially simultaneously where appropriate, or in parallel to reduce latency and processing times.
Although the invention has been described and illustrated in the foregoing illustrative aspects, it is understood that the present disclosure has been made only by way of example, and that numerous changes in the details of implementation of the invention can be made without departing from the spirit and scope of the invention, which is limited only by the claims that follow. Features of the disclosed embodiments can be combined and rearranged in various ways.
This application claims the benefit of U.S. Provisional Application No. 63/352,166 filed Jun. 14, 2022, the entirety of which is herein incorporated by reference.
This invention was made with government support under NNX16AR74G, NNX17AF57G, 80NSSC20M0264, and 80NSSC21K0422, all awarded by the National Aeronautics and Space Administration of the United States. The Government may have certain rights in the invention.
Number | Date | Country
---|---|---
63352166 | Jun 2022 | US