1. Technical Field
This disclosure relates to the analysis of settlements and more particularly to the extraction and characterization of settlement structures through high resolution imagery.
2. Related Art
Land use is subject to rapid change. Change may occur because of weather conditions, urbanization, and unplanned settlements that may include slums, shantytowns, barrios, etc. The variance found in land use may be caused by cultural changes, population changes, and changes in geography. In practice, the study and analysis of such change relies on either aerial photographs or topographic mapping. These tools are costly and time intensive and may not reflect the dynamic and continuous change that occurs as settlements develop.
The use of satellite imagery has not been effective in assessing certain settlement changes or identifying settlements quickly and inexpensively. For some satellite imagery, limited spatial resolution creates mixed pixel signatures, making the imagery unsuitable for detailed analysis. Roads, buildings, and farmlands may not be entirely discernible because their small spatial extents may blend some features of these objects with adjacent objects. Efficient scene recognition from image data is a challenge.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
This disclosure introduces technology that analyzes high resolution bitmapped, satellite, and aerial images. It discloses a settlement mapping system (settlement mapping system/tool or SMTool) that automatically detects, maps, and characterizes land use. The system includes a settlement extraction engine and a settlement characterization engine. The settlement extraction engine identifies settlement regions from high resolution satellite and aerial images through a graphic element. The settlement characterization engine allows users to analyze and characterize settlement regions interactively and in real time through a graphical user interface. The system extracts features representing structural and textural patterns in real time. A real time operation may comprise an operation matching a human's perception of time, or a virtual process that is processed at the same rate (or a perceived same rate) as, or at a faster rate than, the physical or an external process.
The extracted features are processed by the classification engine to identify settlement regions in a given image object that may be based on low-level image feature patterns. The classification engine may be built on a discriminative random field (DRF) framework. The settlement characterization engine may provide feature computation, image labeling, training data compilation, discriminative modeling and learning, and software applications that characterize and color code settlement regions based on empirical data and/or statistical extraction algorithms. Some settlement mapping systems execute Support Vector Machines (SVMs) and Multiview Classifiers as choices for discriminative model generation. Some systems allow users to generate different file types, including shape files and Keyhole Markup Language (KML) files. KML files may specify place marks, images, polygons, three-dimensional (3D) models, textual descriptions, etc., that identify settlement regions and classes. The settlement mapping system may export data and/or files that are visualized on a virtual globe, map, and/or geographical information system such as Google Earth™. The virtual globe may map the earth and highlight settlements through the superimposition of images obtained from satellite images, aerial imagery, Geographic Information System 3D globes (GIS 3D), and KML and KML-like files generated, discriminated, and highlighted by the settlement mapping system.
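By way of illustration only, the following is a minimal sketch of how a detected settlement region might be written to a KML file of the kind described above. It assumes the third-party Python package simplekml; the boundary coordinates, class name, and output file name are hypothetical.

```python
# Illustrative sketch only: exporting one detected settlement region as a KML
# polygon. Assumes the third-party simplekml package; the boundary coordinates,
# class name, and output file name are hypothetical.
import simplekml

kml = simplekml.Kml()
boundary = [(116.38, 39.90), (116.40, 39.90), (116.40, 39.92),
            (116.38, 39.92), (116.38, 39.90)]          # (lon, lat) ring
pol = kml.newpolygon(name="Settlement A", outerboundaryis=boundary)
pol.style.polystyle.color = simplekml.Color.changealphaint(
    120, simplekml.Color.red)                          # translucent class color
kml.save("settlements.kml")  # may then be opened in a virtual globe viewer
```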
Activating the detect graphic object (or element) under the classification function activates the settlement extraction engine on the loaded image. The extraction engine may manage and execute programs and functions, including those programmed and linked to text objects in the pull-down menu adjacent to the detect object. On a large image, the settlement extraction engine may operate in block mode. The spacing of the edges of a selected image object, the relationship of the edges of the image object to surrounding materials or other image objects, the co-occurrence distribution of the image, etc., for example, may allow the extraction engine to identify discrete settlement structures within images, as shown in the highlighted detections.
The settlement extraction output generated by the extraction engine may be saved in many file formats, including, for example, a vector-format shape file.
The settlement extraction system may execute one, two, or more multi-scale low-level feature analyses (or, in alternative systems, and/or high-level feature analyses) to generate the discriminative models based on user-defined image training data.
The grey-level co-occurrence matrix (GLCM) takes into account the different directional components of the textural signal and is invariant to rotation and radiometric changes. The pixel-wise minimum of the contrast measure over twelve displacement vectors is computed by the system at many scales, such as scales of 25×25, 50×50, and 100×100, for example. In addition to the ten displacement vectors that may be processed, the settlement extraction system may also process (2,−2) displacement vectors, corresponding to the X and Y pixel shifts, respectively. These additions may allow the pixel block approach to account for nearly every pixel within the given neighborhood. A PanTex index feature (or texture-derived built-up index feature) may be generated that may be described as

BuiltUp(b_i) = min_{j ∈ [1 . . . n]} tx_j(b_i)

where BuiltUp(b_i) is the PanTex feature at block b_i, tx_j(b_i) is the GLCM contrast computed for the j-th displacement vector at block b_i, and n is the number of displacement vectors.
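By way of illustration only, the following is a minimal sketch of a PanTex-style built-up index under the definition above: the minimum GLCM contrast over a set of displacement vectors, computed per image block. It assumes scikit-image; the block size, gray-level count, and exact displacement set are illustrative.

```python
# Illustrative sketch only of a PanTex-style built-up index: the minimum GLCM
# contrast over a set of displacement vectors, computed for each image block.
# Assumes scikit-image; block size, gray levels, and displacements are illustrative.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def pantex_block(block, levels=64):
    """Minimum GLCM contrast over several displacement vectors for one block."""
    q = (block.astype(float) / max(block.max(), 1) * (levels - 1)).astype(np.uint8)
    # (distance, angle) pairs stand in for displacement vectors such as
    # (1,0), (1,1), (0,1), (-1,1), (2,0), (0,2), (2,2), (2,-2), ...
    glcm = graycomatrix(q, distances=[1, 2],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=levels, symmetric=True, normed=True)
    contrast = graycoprops(glcm, 'contrast')   # shape: (n_distances, n_angles)
    return contrast.min()                      # BuiltUp(b_i) = min over shifts

def pantex_map(image, block=25):               # 25x25 is one scale from the text
    h, w = image.shape
    return np.array([[pantex_block(image[r:r + block, c:c + block])
                      for c in range(0, w - block + 1, block)]
                     for r in range(0, h - block + 1, block)])
```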
The Line Support Regions may provide an intermediate representation of a neighborhood based on captured local line parameters such as the size, shape, and spatial layout. The settlement extraction system may extract straight line segments from an image by grouping spatially contiguous pixels with consistent orientations. Following one or more straight line extractions, the system normalizes the image intensity range between about 0 and about 1, and computes the pixel gradients and orientations. The orientations may be quantized into a number of bins, such as eight bins, for example, ranging from about 0 to about 360 degrees in 45-degree intervals. To avoid line fragmentation attributed to the quantization of orientations, the system may quantize the orientations into more bins, such as another eight bins starting from 22.5 degrees to (360 degrees + 22.5 degrees), at 45-degree intervals. Spatially contiguous pixels falling in the same orientation bin may form the line supporting regions. Regions may be generated separately based on the different quantization schemes, and the results may be integrated by selecting line regions based on an automatic pixel voting scheme. One such voting scheme may ignore pixels with gradients below a predetermined threshold (about 0.5 for image intensity ranging between about 0 and about 1) to reduce noisy line regions. The system may compute the line centroid, length, and orientation from a Fourier series approximation of the line region boundary.
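By way of illustration only, the following is a minimal sketch of the line-support-region grouping described above: normalize the image, quantize gradient orientations into eight 45-degree bins, and group spatially contiguous pixels that fall in the same bin. It assumes NumPy and SciPy; the 0.5 gradient threshold follows the text.

```python
# Illustrative sketch only of line-support-region extraction: quantize gradient
# orientations into eight 45-degree bins and group contiguous same-bin pixels.
# Assumes NumPy and SciPy; the 0.5 gradient threshold follows the text.
import numpy as np
from scipy import ndimage

def line_support_regions(image, n_bins=8, offset_deg=0.0, grad_thresh=0.5):
    img = (image - image.min()) / (np.ptp(image) + 1e-9)   # normalize to [0, 1]
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)
    orientation = (np.degrees(np.arctan2(gy, gx)) - offset_deg) % 360.0
    bins = (orientation // (360.0 / n_bins)).astype(int)   # eight 45-degree bins
    strong = magnitude > grad_thresh                       # drop noisy pixels
    labels = np.zeros(img.shape, dtype=int)
    next_label = 1
    for b in range(n_bins):
        lab, n = ndimage.label(strong & (bins == b))       # contiguous same-bin pixels
        labels[lab > 0] = lab[lab > 0] + (next_label - 1)
        next_label += n
    return labels

# A second pass with offset_deg=22.5 reproduces the shifted quantization scheme;
# the two label maps may then be integrated by a pixel voting scheme.
```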
The Scale-Invariant Feature Transform (SIFT) may be used to characterize formal and informal settlements. The settlement extraction system may apply a dense SIFT extraction routine on each image to compute a vector, such as a 128-dimensional feature vector, for each pixel. The system may randomly sample a fixed number of features, such as one hundred thousand SIFT features, from the imagery and apply clustering to generate a SIFT codebook. The SIFT codebook may consist of quantized SIFT feature vectors, which are the cluster centers identified by the clustering. The cluster centers may be referred to as code words. In some implementations, the settlement extraction system employs K-means clustering with K=32. The SIFT feature computed at each pixel is assigned a codeword-id ([1 to K]) based on the proximity of the SIFT feature to the pre-computed code words. Some systems may use the Euclidean distance as the proximity measure. To compute the SIFT feature at a block, the settlement extraction system may render a 32-bin histogram at each scale by considering different windows around the block. The settlement extraction system may compute a number of SIFT features, such as ninety-six SIFT features (SIFT(b_i)) from three scales. For dense SIFT feature computation, the system may apply the algorithms found in an open and portable library of computer vision algorithms available at http://www.vlfeat.org/ (2008).
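By way of illustration only, the following is a minimal sketch of the dense-SIFT codebook step: compute a 128-dimensional descriptor on a regular pixel grid, cluster a random sample into K=32 code words, and assign each descriptor the id of its nearest code word. The text cites the VLFeat library; OpenCV's SIFT is substituted here as a comparable implementation, and the grid step and file name are hypothetical.

```python
# Illustrative sketch only of the dense-SIFT codebook step. The text cites
# VLFeat; OpenCV's SIFT is substituted here as a comparable implementation.
# Assumes OpenCV, NumPy, and scikit-learn; grid step and tile are hypothetical.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def dense_sift(gray, step=4, size=8):
    sift = cv2.SIFT_create()
    grid = [cv2.KeyPoint(float(x), float(y), size)
            for y in range(0, gray.shape[0], step)
            for x in range(0, gray.shape[1], step)]
    _, desc = sift.compute(gray, grid)             # one 128-dim vector per point
    return desc

gray = cv2.imread("tile.png", cv2.IMREAD_GRAYSCALE)  # hypothetical image tile
desc = dense_sift(gray)
idx = np.random.choice(len(desc), min(100_000, len(desc)), replace=False)
codebook = KMeans(n_clusters=32, n_init=10).fit(desc[idx])  # K=32 code words
codeword_ids = codebook.predict(desc)     # nearest code word (Euclidean distance)
# Per-block 32-bin histograms of codeword_ids at three scales yield SIFT(b_i).
```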
The settlement extraction system may apply oriented feature energy (textons) or texton frequencies at each pixel block to characterize different settlements based on their texture measures. The settlement extraction system may execute a set of oriented filters at each pixel. The system may use a predetermined number of filters, such as eight oriented even-symmetric and eight odd-symmetric Gaussian derivative filters (a total of about 16 filters) and a Difference-of-Gaussians (DoG) filter. Thus, each pixel may be mapped to a 17-dimensional filter response. The system may execute K-means clustering on a random number of responses, such as one hundred thousand randomly sampled filter responses from the imagery. The resulting cluster centers may define the set of quantized filter response vectors, called textons, based on empirical data. The system assigns each pixel in the imagery a texton-id, which is an integer between [1, K], based on the proximity of the filter response vector to the pre-computed textons. As with the SIFT features, the system may use the Euclidean distance as the proximity measure, and the pixel is assigned the texton-id of the texton with a minimal distance from the filter response vector. At each block, the settlement extraction system computes the local texton frequency by producing a K-bin texton histogram. The system may generate the K-bin texton histogram at three different scales with three different windows. For each block, by concatenating the histograms produced at the three different scales, the system may generate a ninety-six-dimensional texture feature vector (TEXTON(b_i)).
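By way of illustration only, the following is a minimal sketch of the texton pipeline described above: a 17-filter bank (eight even- and eight odd-symmetric oriented Gaussian derivative filters plus a Difference-of-Gaussians), K-means quantization of the responses, and a per-pixel texton-id map. It assumes NumPy, SciPy, and scikit-learn; the kernel construction, filter sizes, and sigmas are illustrative rather than the system's exact filters.

```python
# Illustrative sketch only of the texton pipeline: a 17-filter bank, K-means
# quantization of the responses, and per-pixel texton-ids. Assumes NumPy,
# SciPy, and scikit-learn; kernel shapes and sigmas are illustrative.
import numpy as np
from scipy import ndimage
from sklearn.cluster import KMeans

def oriented_kernel(theta, sigma=2.0, odd=True, half=8):
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    u = x * np.cos(theta) + y * np.sin(theta)           # axis across orientation
    v = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(u**2 + (3 * v)**2) / (2 * sigma**2))   # elongated Gaussian
    k = g * u if odd else g * (u**2 / sigma**2 - 1)     # 1st vs. 2nd derivative
    return k - k.mean()

thetas = np.arange(8) * np.pi / 8
bank = ([oriented_kernel(t, odd=False) for t in thetas]    # 8 even-symmetric
        + [oriented_kernel(t, odd=True) for t in thetas])  # 8 odd-symmetric

def filter_responses(image):
    dog = ndimage.gaussian_filter(image, 1.0) - ndimage.gaussian_filter(image, 2.0)
    responses = [ndimage.convolve(image, k) for k in bank] + [dog]
    return np.stack(responses, axis=-1).reshape(-1, 17)    # 17-dim per pixel

def texton_ids(image, K=32, n_sample=100_000):
    resp = filter_responses(image.astype(float))
    idx = np.random.choice(len(resp), min(n_sample, len(resp)), replace=False)
    km = KMeans(n_clusters=K, n_init=10).fit(resp[idx])    # textons = centers
    return km.predict(resp).reshape(image.shape)           # texton-id in [0, K-1]
```

Per-block K-bin histograms of these texton-ids, concatenated over three scales with K=32, yield the ninety-six-dimensional TEXTON(b_i) vector described above.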
The feature computation for each pixel block may result in a two-hundred-thirty-dimensional feature vector:

f(b_i) = {GLCM(b_i)^3, HoG(b_i)^15, LSR(b_i)^9, LFD(b_i)^6, Lac(b_i)^3, rgNDVI(b_i)^1, rbNDVI(b_i)^1, SIFT(b_i)^96, TEXTON(b_i)^96}, i = 1, 2, . . . , N

where N is the total number of pixel blocks and the superscript on each feature denotes the feature length (3 + 15 + 9 + 6 + 3 + 1 + 1 + 96 + 96 = 230).
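By way of illustration only, the following is a minimal sketch of assembling f(b_i) by concatenation; each input array is assumed to be precomputed with the length given by its superscript above.

```python
# Illustrative sketch only: the per-block feature vector f(b_i) assembled by
# concatenating the nine feature groups, each assumed precomputed.
import numpy as np

def block_feature(glcm3, hog15, lsr9, lfd6, lac3,
                  rgndvi1, rbndvi1, sift96, texton96):
    f = np.concatenate([glcm3, hog15, lsr9, lfd6, lac3,
                        rgndvi1, rbndvi1, sift96, texton96])
    assert f.shape == (230,)   # 3+15+9+6+3+1+1+96+96 = 230
    return f
```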
The settlement extraction system's classification engine may be built on a discriminative random field (DRF) framework. The DRF framework may classify image regions by incorporating neighborhood spatial interactions in the labels as well as the observed empirical data.
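In the conventional DRF formulation (a sketch of the general form such a framework takes, not necessarily the exact potentials used by the system), the posterior over the block labels y given the observed image data x is

P(y | x) = (1 / Z(x)) exp( Σ_i A_i(y_i, x) + Σ_i Σ_{j ∈ N_i} I_ij(y_i, y_j, x) )

where A_i is the association potential tying the label y_i of block i to the observed features, I_ij is the interaction potential that encourages label agreement among the neighboring blocks N_i, and Z(x) is the normalizing partition function.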
Once a settlement extraction is completed, the settlement characterization engine may execute multiple functions, including (1) label image, (2) train data, (3) model generation and learning, and (4) detecting one, two, or more settlement classes using generated/learned models. The settlement extraction system allows a user to label or associate portions of images with a certain settlement class. The labeled image portions are processed in a training data compilation. To generate the training data, a user may label a portion of an image. A user first selects a label button or graphic object on the graphical user display and provides a class name, such as “Settlement A”.
To compile the settlement extraction system's training data, a user may select the feature sets that are needed for settlement characterization, as shown in the train model portion of the display.
To generate a discriminative model to identify the “Settlement A” and “Settlement B” regions across the entire displayed image, a user provides a unique name (e.g., Beijing-level2-model).
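By way of illustration only, the following is a minimal sketch of this model-generation step using a Support Vector Machine, one of the discriminative-model choices named above. It assumes scikit-learn and joblib; the features and labels are random placeholders standing in for the 230-dimensional block features of the user-labeled patches, and the saved file name mirrors the user-chosen model name.

```python
# Illustrative sketch only of model generation with an SVM, one of the
# discriminative-model choices named in the text. Assumes scikit-learn and
# joblib; X and y are random placeholders for labeled block features.
import joblib
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 230))                        # placeholder block features
y = np.repeat(["Settlement A", "Settlement B"], 100)   # placeholder class labels

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X, y)
joblib.dump(model, "Beijing-level2-model.pkl")         # user-chosen model name
```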
To detect the settlement classes, the settlement extraction system applies the learned model to the entire image to identify the “Settlement A” and “Settlement B” classes. In operation, a user may select the model from a pull-down menu positioned adjacent to the detect object to activate the classification engine, which applies the learned attributes that discriminate the settlement classes from the limited training samples, such as the two polygonal-like portions/patches designated by the user. The settlement mapping system (e.g., trained on Settlement A and Settlement B) may detect, then identify and characterize an entire image into settlements and non-settlements in seconds based on spatial and structural patterns, and the classes may be color coded, highlighted, or differentiated by different intensities or animations.
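By way of illustration only, the following is a minimal sketch of this detection step: score every block with the learned model and render a color-coded class map. It assumes scikit-learn, joblib, NumPy, and Matplotlib; the per-block features here are random placeholders.

```python
# Illustrative sketch only of the detection step: score every block with the
# learned model and render a color-coded class map. Assumes scikit-learn,
# joblib, NumPy, and Matplotlib; per-block features are random placeholders.
import joblib
import numpy as np
import matplotlib.pyplot as plt

model = joblib.load("Beijing-level2-model.pkl")        # model from the prior step
rng = np.random.default_rng(1)
features = rng.normal(size=(40, 40, 230))              # placeholder block features
rows, cols = features.shape[:2]
labels = model.predict(features.reshape(-1, 230)).reshape(rows, cols)
class_ids = np.searchsorted(np.unique(labels), labels) # class names -> integer ids
plt.imshow(class_ids, cmap="tab10")                    # one color per class
plt.title("Detected settlement classes")
plt.show()
```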
The methods, devices, systems, and logic described above may be implemented in many different ways and in many different combinations of hardware, software, or both hardware and software. For example, all or parts of the system may detect and identify settlements through one or more controllers, one or more microprocessors (CPUs), one or more signal processors (SPUs), one or more graphics processors (GPUs), one or more application specific integrated circuits (ASICs), one or more programmable media, or any and all combinations of such hardware. All or part of the logic described above may be implemented as instructions for execution by multi-core processors (e.g., CPUs, SPUs, and/or GPUs), a controller, or another processing device, including exascale computers, and may be displayed through a display driver in communication with a remote or local display, or stored in a tangible or non-transitory machine-readable or computer-readable medium such as flash memory, random access memory (RAM) or read only memory (ROM), erasable programmable read only memory (EPROM), or other machine-readable medium such as a compact disc read only memory (CDROM), or a magnetic or optical disk. Thus, a product, such as a computer program product, may include a storage medium and computer readable instructions stored on the medium, which, when executed in an endpoint, computer system, or other device, cause the device to perform operations according to any of the description above.
The settlement extraction systems may evaluate images shared and/or distributed among multiple system components, such as among multiple processors and memories (e.g., non-transient media), including multiple distributed processing systems.
Parameters, databases, mapping software, pre-generated models, and data structures used to evaluate and analyze or pre-process the high and/or low resolution images may be separately stored and managed, may be incorporated into a single memory block or database, may be logically and/or physically organized in many different ways, and may be implemented in many ways, including data structures such as linked lists, hash tables, or implicit storage mechanisms. Programs may be parts (e.g., subroutines) of a single program, separate programs, application programs distributed across several memories and processor cores and/or processing nodes, or implemented in many different ways, such as in a library or a shared library accessed through a client-server architecture across a private network or a public network like the Internet. The library may store detection and classification model software code that performs any of the system processing described herein.
The term “coupled” disclosed in this description may encompass both direct and indirect coupling. Thus, first and second parts are said to be coupled together when they directly contact one another, as well as when the first part couples to an intermediate part which couples either directly or via one or more additional intermediate parts to the second part. The term “substantially” or “about” may encompass a range that is largely, but not necessarily wholly, that which is specified; it encompasses all but an insignificant amount. When devices are responsive to commands, events, and/or requests, the actions and/or steps of the devices, such as the operations that the devices are performing, necessarily occur as a direct or indirect result of the preceding commands, events, actions, and/or requests. In other words, the operations occur as a result of the preceding operations. A device that is responsive to another device requires more than an action (i.e., the device's response) merely following another action.
While various embodiments of the invention have been described, it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the invention. Accordingly, the invention is not to be restricted except in light of the attached claims and their equivalents.
The invention was made with United States government support under Contract No. DE-AC05-00OR22725 awarded by the United States Department of Energy. The United States government has certain rights in the invention.