The present invention relates to a method for eliminating processing window artifacts that may occur when partitioning spatially related data into sections or regions for processing. More particularly, the present invention describes a split-remerge method for identifying and splitting possible artifacts of data and then merging that data with the most appropriate region.
Segmentation, the partitioning of data into related sections or regions, is a key first step in a number of approaches to data analysis and compression. In data analysis, the group of data points contained in each region provides a statistical sampling of data values for more reliable labeling based on data feature values. In data compression, the regions form a basis for compact representation of the data. The quality of the prerequisite data segmentation is a key factor in determining the level of performance of most of these data analysis and compression approaches.
Most segmentation approaches can be placed in one of three categories:
Segmentation is often used in the analysis of imagery data. The techniques described can be applied to image data and to any other data that has spatial characteristics. A data set has spatial characteristics if it can be represented on an n-dimensional grid, and when so represented, data points that are nearer to each other in the grid generally have a higher statistical correlation to each other than data points further away. For remotely sensed images of the earth, an example of a segmentation would be a labeled map that divides the image into areas covered by distinct earth surface covers such as water, snow, types of natural vegetation, types of rock formations, types of agricultural crops and types of other man created development. In unsupervised image segmentation, the labeled map may consist of generic labels such as region 1, region 2, etc., which may be converted to meaningful labels by a post-segmentation analysis. In image analysis, the group of image points contained in each region provides a good statistical sampling of image values for more reliable labeling based on region mean feature values. In addition, the region shape or texture can be analyzed for additional clues to the appropriate labeling of the region.
A segmentation hierarchy is a set of several segmentations of the same data at different levels of detail in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. This is useful for applications that require different levels of segmentation detail depending on the particular data objects segmented. A unique feature of a segmentation hierarchy that distinguishes it from most other multilevel representations is that the segment or region boundaries are maintained at the finest data granularity for all levels of the segmentation hierarchy.
In a segmentation hierarchy, an object of interest may be represented by multiple segments in finer levels of detail in the segmentation hierarchy, and may be merged into an encompassing region at coarser levels of detail in the segmentation hierarchy. If the segmentation hierarchy has sufficient resolution, the object of interest will be represented as a single region segment at some intermediate level of segmentation detail. The segmentation hierarchy may be analyzed to identify the hierarchical level at which the object of interest is represented by a single region segment. The object may then be identified through its spectral and region characteristics, such as shape and texture. Additional clues for object identification may be obtained from the behavior of the segmentations at the hierarchical segmentation levels above and below the level at which the object of interest is represented by a single region.
In U.S. Pat. No. 6,895,115, which is incorporated herein by reference, a segmentation approach is described that automatically provides hierarchical segmentations for data at several levels of detail. This approach, called HSEG, is a hybrid of region growing and spectral clustering that produces a hierarchical set of segmentations based on detected natural convergence points. Because of the inclusion of spectral clustering, the HSEG algorithm is very computationally intensive, and cannot be performed in less than a day on moderately sized data sets, even with the most powerful single processor computer currently available. The processing time problem was addressed through a recursive formulation of HSEG, called RHSEG. RHSEG can process moderately sized data sets in a reasonable amount of time on currently available PCs and workstations. Larger data sets required the use of a parallel implementation of RHSEG on a parallel computing system.
However, a problem with the RHSEG algorithm, and with certain other data processing algorithms that similarly subdivide and subsequently recombine data during processing, is that processing window artifacts can be introduced by the division and recombination of the data. An example of these processing artifacts can be demonstrated on an 896×896 pixel section of Landsat ETM+ (Enhanced Thematic Mapper) data displayed in
In the prior art, a “contagious clusters” or “contagious regions” concept has been used to attempt to reduce or eliminate the processing window artifacts. The contagious regions concept can be described as follows:
Flag any region that touches a boundary between processing windows and suppress any merging between flagged regions and any other region.
If a non-flagged or “non-contagious” region attempts to merge with a flagged or “contagious” region, the previously non-flagged region becomes flagged or “contagious.”
Thus, the contagious property of the flagged regions is literally contagious. Unfortunately, when the contagious regions concept is applied to the RHSEG algorithm, and when more than two or three levels of recursion are utilized, the RHSEG algorithm is unable to process the data effectively. The algorithm effectively stalls: so many regions become contagious, and the number of regions in the processing window becomes so large, that the processing time required is not sufficiently advantageous over a non-segmented image processing approach.
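The two rules of the contagious regions concept can be illustrated with a minimal sketch. The `Region` class and `try_merge` function are hypothetical names introduced only for this illustration and are not taken from any RHSEG implementation.

```python
from dataclasses import dataclass


@dataclass
class Region:
    label: int
    contagious: bool = False  # True if flagged (e.g. touches a window boundary)


def try_merge(a: Region, b: Region) -> bool:
    """Return True if the merge may proceed; otherwise spread the flag."""
    if a.contagious or b.contagious:
        # Merging with a flagged region is suppressed, and the flag
        # spreads to whichever region was not yet flagged.
        a.contagious = True
        b.contagious = True
        return False
    return True


regions = {1: Region(1, contagious=True), 2: Region(2), 3: Region(3)}
first = try_merge(regions[2], regions[3])   # neither region is flagged
second = try_merge(regions[2], regions[1])  # region 1 is flagged
third = try_merge(regions[2], regions[3])   # region 2 is now flagged as well
```

In this small example the flag spreads from region 1 to region 2 and then to region 3, which is the behavior that causes merging to stall when many regions touch processing window boundaries.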
Indirect mechanisms, however, also exist. For example, increasing the number of regions at which convergence is achieved at intermediate levels of the recursive processing may indirectly cause a reduction in processing window artifacts. A larger value may delay some region merging decisions that would have involved regions on the borders of processing windows to occur after those regions are no longer on the borders of processing windows. This indirect method, however, is inefficient because processing time increases with larger values of the number of regions needed to achieve convergence. Further, processing artifacts are not always eliminated via this method. Other approaches to reducing window artifacts may manipulate other parameters in the recursive hierarchical segmentation processing, but also increase processing time and resources, such that the approaches become impractical for large sets of data.
Indeed, all previously developed techniques for splitting inappropriately merged pixels or processing image data in a fashion to avoid creating window artifacts unacceptably increase the processing time required. In prior application Ser. No. 10/845,419, a switch-pixels method of addressing window artifacts was disclosed; however, this technique had no mechanism for giving priority to spatial adjacency in switching pixels from one region to another.
Accordingly, it is an object of the present invention to implement a split-remerge process for eliminating processing window artifacts in recursive hierarchical segmentation of data. The foregoing object of the invention is achieved by identifying candidate pixels or data points or regions of points that have been inappropriately merged, identifying a candidate region with which those data points or sets might be more appropriately merged and then evaluating the best merger candidate giving due weight to spatial distance between the flagged data and the candidate region for merger. While initially designed for the analysis of single-band, multispectral or hyperspectral remotely sensed imagery data for earth science applications, the software innovation also has applications to image data compression of image data archives, data mining (searching for particular shapes of objects with certain feature vector characteristics), and data fusion (based on matching region features between data sets from different times and/or different sensors). Applications outside of remote sensing are the analysis of imagery for medical applications and for nondestructive evaluation in manufacturing quality control. A possible military application is land mine detection.
The file of this patent contains at least one drawing executed in color. Copies of this patent with color drawing(s) will be provided by the Patent and Trademark Office upon request and payment of the necessary fee.
These and other advantages of the invention will be more fully appreciated from the following description of preferred embodiments taken in conjunction with the accompanying drawings, of which:
Most prior techniques for region growing data segmentation are based on the classic definition of 2-spatial dimension image segmentation:
A problem with the classic definition of image segmentation is that the segmentation so defined is not unique. The number, N, and shape of the partitions, X1, X2, . . . , XN, depend on the order in which the image pixels are processed. In addition, there is no concept of optimality contained in this definition of image segmentation. Under this classic definition, all partitions that satisfy the conditions represent equally good or valid segmentations of the image.
A less commonly used, more computationally intensive but often more accurate approach to 2-spatial dimension image segmentation is the Hierarchical Stepwise Optimization (HSWO) algorithm of Beaulieu and Goldberg. HSWO can be defined recursively as follows:
If HSWO segmentation is not implemented recursively, it will not be subject to problems with processing window artifacts.
A description of the HSWO algorithm follows and is shown in
HSWO Basic Algorithm Description 10:
Beaulieu and Goldberg did not specify a tie-breaking method for step 2, i.e., no procedure is specified for handling the case when more than one pair of regions was found to have the smallest dissimilarity criterion value. A suitable tie-breaking criterion is as follows: If more than one pair of regions has the smallest dissimilarity criterion, the merge involving the region with the largest region label is arbitrarily performed first. If this region with the largest region label also has the smallest dissimilarity criterion relative to more than one region, all of these regions are merged together (effectively, in parallel). This tie-breaking criterion is arbitrary, and is based on convenience of implementation. An additional arbitrary convention is that the region with the larger region label is merged into the region with the smaller region label (i.e., the smaller region label survives the merge).
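The tie-breaking convention described above can be sketched as follows. The function name `select_merge` and the dictionary layout of the dissimilarity values are assumptions made for this illustration.

```python
def select_merge(dissim):
    """Pick the next merge from a mapping of (label_a, label_b) -> dissimilarity.

    Returns (surviving_label, absorbed_labels) under the tie-breaking
    convention described in the text.
    """
    best = min(dissim.values())
    tied = [pair for pair, d in dissim.items() if d == best]
    # Among the tied pairs, start with the region that has the largest label.
    pivot = max(max(pair) for pair in tied)
    # Merge the pivot with every region it ties with (effectively in parallel).
    group = {pivot}
    for a, b in tied:
        if pivot in (a, b):
            group.update((a, b))
    survivor = min(group)  # the smaller region label survives the merge
    return survivor, sorted(group - {survivor})


dissim = {(1, 2): 0.5, (2, 5): 0.2, (3, 5): 0.2, (1, 4): 0.2}
result = select_merge(dissim)
```

Here three pairs tie at 0.2. Region 5 has the largest label among them and ties with regions 2 and 3, so regions 2, 3 and 5 are merged together and label 2 survives; the tied pair (1, 4) is left for a later iteration.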
Beaulieu and Goldberg show that the HSWO algorithm produces the globally optimal segmentation result if the statistics at all iterations are independent. Even though the statistics at all iterations will generally not be independent for natural images, the HSWO approach is still shown to produce excellent results. Beaulieu and Goldberg also point out that the sequence of partitions generated by this iterative approach reflect the hierarchical structure of the imagery data: the partitions obtained in the early iterations preserve the small details and objects in the image, while the partitions obtained in the latter iterations preserve only the most important components of the image. They further note that these hierarchical partitions may carry information that may help in identifying the objects in the imagery data.
Beaulieu and Goldberg developed a dissimilarity criterion (which they call a stepwise criterion) based on the piecewise approximation of an image by constant value regions, with the value given by the mean value of the region. For the gray scale (single spectral band) images they considered, their stepwise merging criterion, Ci,j, is:
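The equation itself is not reproduced in this text. The stepwise criterion of Beaulieu and Goldberg is commonly stated as Ci,j = Ni·Nj/(Ni+Nj)·(μi−μj)², where Ni and Nj are the region pixel counts and μi and μj are the region means; the sketch below assumes that form and checks it against the increase in squared error caused by a merge. The function names are hypothetical.

```python
def stepwise_criterion(n_i, mean_i, n_j, mean_j):
    """Assumed form: N_i * N_j / (N_i + N_j) * (mean_i - mean_j) ** 2."""
    return (n_i * n_j) / (n_i + n_j) * (mean_i - mean_j) ** 2


def squared_error(values):
    """Sum of squared deviations from the mean of a single region."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


region_i = [1.0, 3.0]  # mean 2.0
region_j = [5.0]       # mean 5.0
cost = stepwise_criterion(len(region_i), 2.0, len(region_j), 5.0)
# Increase in squared error when the two regions are approximated
# by one constant value instead of two.
increase = (squared_error(region_i + region_j)
            - squared_error(region_i) - squared_error(region_j))
```

Under this assumed form, the criterion value for a pair of regions equals the increase in total squared error that results from replacing the two region means with the mean of the merged region.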
While HSWO has advantages over the classical image segmentation approaches in terms of optimality, it runs into problems when applied to even moderately large data sets. This is because HSWO does not allow the merging of non-spatially connected regions. The large number of regions required by HSWO for a complete representation of the segmentation detail leads to processing inefficiencies in two different ways:
HSWO and HSEG without spectral clustering are impractical for even moderately sized images because of the number of regions required to represent a useful level of segmentation detail. The optional spectral clustering step in the HSEG algorithm is what makes practical use of the HSEG algorithm possible for even very large images, with the approximation of HSEG in the RHSEG recursive implementation. (A recursive implementation of HSEG is not useful without the spectral clustering step, due to the high value of min_nregions required for even moderately sized data sets.) This is similar to the multiple spectral band “square root of the band sum mean squared error” criterion used in HSEG and explained under the parameter dissim_crit below.
HSEG is an elaboration of HSWO in which an option for merging spatially non-adjacent regions is added along with a method for selecting a subset of segmentation results to form the segmentation hierarchy output 20 shown in
HSEG Algorithm Description:
Note that spclust_wght, chk_nregions, convfact, and conv_nregions are user specified parameters (however, default values chk_nregions=64, convfact=1.01, and conv_nregions=2 are usually satisfactory). If spclust_wght=0.0, HSEG is essentially identical to HSWO with convergence checking (step 11) for segmentation hierarchy selection. The associated region information mentioned in step 11 above is the region number of pixels list, and optionally includes the boundary region map, the region number of boundary pixels list, the region mean vector list, the region standard deviation list, and the region maximum merging threshold list (for each region, the maximum merging threshold at which any merge occurred in the formation of the region).
A recursive formulation of HSEG, called RHSEG 50, is shown in
An outline of the “switch pixels” rhseg(level,X) algorithm 60 is shown in
In step 4, data points on the seam between the data subsections are analyzed to find pairs of regions that may contain pixels that are more similar to the other region. All of the regions are also compared to each other directly, and regions that are relatively similar to each other are assumed to contain pixels that may be more similar to the other region. The first approach identifies pairs of regions that occur as neighbors across the processing window seam that contain pixels that are more similar to the other region, no matter how similar the region pair is to each other. This approach is most effective at eliminating obvious blocky artifacts at the processing window seam. The second approach identifies pairs of regions that are relatively similar to each other, and thus may contain pixels that are more similar to the other region, even though the pair of regions may not occur as neighbors across a processing window seam. This approach is most effective at eliminating more diffuse processing window artifacts.
This is very efficient because it focuses specifically on the regions that are involved in the processing window artifacts. In addition, performing the pixel switching after performing the HSEG algorithm as in step 3 enhances processing efficiency. Pixel switching (as in step 4) could be performed before step 3, but then a number of pixels would be switched between pairs of regions that would end up getting merged together in step 3.
There are two main reasons for the difference in the nature of the segmentation results in these two cases. The first reason is evidenced by the fact that 96 regions are required in
In the process that produced
However, the second reason for the difference in the nature of the segmentation results is not as positive. The hierarchical data segmentation algorithm, HSEG, utilizes the parameter spclust_wght to control the relative importance of merges between spatially adjacent regions and merges between spatially non-adjacent regions. When spclust_wght=0.0, only merges between spatially adjacent regions are allowed. When spclust_wght=1.0, merges between spatially adjacent and spatially non-adjacent regions are given equal priority. For values of spclust_wght between 0.0 and 1.0, spatially adjacent merges are given priority over spatially non-adjacent merges by a factor of 1.0/spclust_wght.
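One way to picture the effect of spclust_wght is to scale the dissimilarity of non-adjacent candidate merges by 1.0/spclust_wght before comparing them with adjacent candidates. This is only an illustrative reading of the priority factor, not the HSEG implementation; `effective_dissim`, `pick_merge` and the candidate tuple layout are hypothetical.

```python
import math


def effective_dissim(dissim, adjacent, spclust_wght):
    """Penalize non-adjacent merges by a factor of 1.0 / spclust_wght."""
    if adjacent:
        return dissim
    if spclust_wght == 0.0:
        return math.inf  # only spatially adjacent merges are allowed
    return dissim / spclust_wght


def pick_merge(candidates, spclust_wght):
    """candidates: list of (pair, dissim, adjacent) tuples; returns the chosen pair."""
    best = min(candidates,
               key=lambda c: effective_dissim(c[1], c[2], spclust_wght))
    return best[0]


candidates = [((1, 2), 4.0, True),    # spatially adjacent pair
              ((1, 7), 3.0, False)]   # spatially non-adjacent pair
equal_priority = pick_merge(candidates, 1.0)
adjacent_favored = pick_merge(candidates, 0.5)
adjacent_only = pick_merge(candidates, 0.0)
```

With spclust_wght=1.0 the non-adjacent pair wins on raw dissimilarity; with spclust_wght=0.5 its dissimilarity is effectively doubled and the adjacent pair is chosen; with spclust_wght=0.0 non-adjacent merges are excluded entirely.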
A problem with the “switch pixels” approach to eliminating processing artifacts is that it does not have any mechanism for giving priority to spatial adjacency in its process for switching pixels from one region to another. Because of this, the segmentation results produced by this approach for all values of spclust_wght greater than 0.0 are very similar to the results for spclust_wght=1.0. As an example of this,
The current disclosure describes an innovative “split-remerge” approach to eliminating processing artifacts that splits certain pixels out from regions they are assigned to and remerges them into perhaps another, more similar, region, while giving the appropriate priority to spatial adjacency.
Comparing the segmentation result from
Besides the elimination of processing window artifacts, the difference between the segmentation results shown in
The software innovation described below provides hierarchical segmentations of data that are free of processing window artifacts and with maximum flexibility in controlling the nature of the segmentation results with the spclust_wght parameter. The innovation allows the spclust_wght parameter to have full effect over its permissible value range while producing results free of processing window artifacts. The RHSEG algorithm shown in
An outline of the split-remerge rhseg(level,X) algorithm 160 is shown in
Notice the differences between this split-remerge version of RHSEG and the switch pixels version in
In this split-remerge version, a new step 4(iii) is added between step 4(ii) and step 4(iii) of the switch pixel version. In the case where spclust_wght=1.0, each pixel that was flagged as being “split out” in step 4(ii) is simply merged into the region for which it has the minimum other_region_dissim value among the regions in the candidate_region_label_set of the region it previously belonged to. This makes the combination of steps 4(ii) and 4(iii) for the current version equivalent to step 4(ii) of the previous version for spclust_wght=1.0. However, for spclust_wght<1.0, the current version of RHSEG is very different from the previous version.
For spclust_wght<1.0, the current version executes remerge(level,max_threshold,X) on the current region labeling. The remerge( ) function is one of the key innovations, which, both directly and through its use of a restricted HSEG algorithm, introduces contagious regions. This recursive remerge(level,max_threshold,X) function 80 is outlined below and illustrated in
An outline of restricted HSEG 90 follows and is illustrated in FIGS. 13A and 13B:
The remerge( ) function is called recursively from the recursive level it was initially called from in step 4(iii) of the rhseg( ) function. At recursive level rnb_levels, specially flagged single pixel regions are formed from the pixels that were split out from their region assignment in the current processing window. These new regions have their new_region flag set true, and any of these new regions that are located on an interior processing window boundary have their contagious flag set true. In addition, each of these new regions has its candidate_region_label_set set equal to the candidate_region_label_set of the region it previously belonged to, augmented by the label of that region (to allow for the possibility of merging back into the previous region).
In the restricted version of HSEG, nearest neighbor merges are considered first, up to the maximum merge threshold previously encountered in the last call to the unrestricted version of HSEG. These nearest neighbor merges are constrained by delaying merges involving regions touching the interior processing window boundaries until those regions are no longer on a processing window boundary or until the recursive level is reached from which the remerge( ) function was initially called.
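The formation of a flagged single pixel region from a split-out pixel can be sketched as below. The field names new_region, contagious and candidate_region_label_set follow the text; the `Region` class, the `split_out_pixel` helper and the label values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    label: int
    new_region: bool = False
    contagious: bool = False
    candidate_region_label_set: set = field(default_factory=set)


def split_out_pixel(new_label, old_region, on_interior_boundary):
    """Form a single pixel region from a pixel split out of old_region."""
    return Region(
        label=new_label,
        new_region=True,
        # Flag as contagious only on an interior processing window boundary.
        contagious=on_interior_boundary,
        # Inherit the old candidates and add the old region itself, so the
        # pixel can merge back into the region it came from.
        candidate_region_label_set=(
            set(old_region.candidate_region_label_set) | {old_region.label}
        ),
    )


old = Region(label=12, candidate_region_label_set={3, 8})
pixel_region = split_out_pixel(501, old, on_interior_boundary=True)
```

The candidate set of the old region is copied rather than shared, so later changes to the new region's candidates do not alter the region it was split from.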
If the number of regions converge_nregions is not reached during the processing of the nearest neighbor merges, more general merges are considered. These merges are not constrained by the contagious flag, and consider possible merges in each region's candidate_region_label_set in addition to the neighboring regions. In this case the merges are constrained by last_nghbr_threshold*spclust_wght as well as converge_nregions.
The contagious region aspect of the restricted version of HSEG is a very important part of this new innovation in the processing window artifact elimination process. Without it, processing window artifacts are reintroduced, as demonstrated by the results displayed in
When the contagious region concept is applied as in the prior art, it is less computationally efficient than the current approach. This is because when the contagious region idea is applied to the region growing process starting at single pixel regions over the entire processing window, considering contagious regions at the interior processing window boundaries invariably causes merging to stop at a number of regions much higher than the usual value of min_nregions utilized in the current version of RHSEG. Larger values of min_nregions have a deleterious effect on processing time. However, the contagious region approach does not lead to these problems in the current implementation, since it is applied only to a subset of regions corresponding to a very limited subset of image pixels, and since, after nearest neighbor merges are stopped by all new regions becoming contagious, additional general merges may be performed. These additional merges do not lead to artifacts because, for the most part, they are merges of the new regions into globally defined “old” regions, and because a large number of nearest neighbor merges have already occurred.
As shown in Table 1, the processing times required by HSEG and RHSEG for spclust_wght>0.0 are generally longer than required with spclust_wght=0.0. The contrast in processing times is most pronounced for HSEG (rnb_levels=1) as the image size grows. Even for a very moderate image size, 256×256 pixels, the processing time required by HSEG for spclust_wght>0.0 is about 235 times longer than that required by HSEG for spclust_wght=0.0. This is because the number of regions that need to be compared in the early stages of HSEG is on the order of N² when spclust_wght>0.0, while the number of regions that need to be compared in the early stages of HSEG is on the order of N*n when spclust_wght=0.0, where N is the number of pixels in the image and n is the size of a pixel neighborhood (eight neighbors, by default). This increase in processing time is much less pronounced for RHSEG using the default values for rnb_levels. In this case, for the image sizes tested, the processing time for the parallel version of RHSEG with spclust_wght>0.0 is up to about 4 to 5 times longer than that for RHSEG with spclust_wght=0.0. This is because the number of regions that need to be compared in the early stages of RHSEG is limited, by default, to no more than 2048 regions.
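The orders of magnitude quoted above can be worked through for the 256×256 pixel case. The figures below count candidate comparisons only, under the stated orders of growth; they are not timing predictions and are not expected to match the measured ratios.

```python
rows = cols = 256
N = rows * cols          # number of pixels in the image
n = 8                    # default pixel neighborhood size

all_pairs = N * N        # order of comparisons when spclust_wght > 0.0
neighbor_pairs = N * n   # order of comparisons when spclust_wght = 0.0
ratio = all_pairs // neighbor_pairs  # equals N / n
```

For this image size the comparison counts differ by a factor of N/n = 8192, which illustrates why limiting the number of regions compared at any one time, as RHSEG does, matters for processing time.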
Functional Operation of the Algorithm in Software and Parameters
Provided in this section is a user's guide-like description of the parameters used in the HSEG and RHSEG algorithms (HSEG is simply RHSEG with rnb_levels=1) embedded in a software program. This implementation assumes the input data is one or two-spatial dimension single-band, multispectral or hyperspectral image (or image-like) data. Selection between the single processor version and the multiprocessor (parallel) version is made through use of a compiler preprocessing directive.
The single processor version of RHSEG is run from the command line with the command:
where parameter_file_name is the name of the input parameter file (for contents, see below). The other command line parameters for the parallel version are:
The value of inb_levels determines the number of processors (numprocs) utilized in the parallel version. For two-spatial dimension data as input, numprocs=4inb
At program initialization, the input data is parceled out to the numprocs processors, and each processor independently processes each of the numprocs data sections. After each process finishes with its section of data, it transfers the results to the appropriate processor working at recursive level inb_levels−1, etc. until the final stage of processing at recursive level 1.
The lowest recursive level at which the input data and other pixel oriented data (such as the region label map) is maintained. As such, it is the recursive level from which the region label map and other pixel-oriented outputs are output from. This parameter is used to optimize the parallel processing efficiency. The optimal value depends on the parallel processing hardware and the size of the data being processed.
*Default value for rnb_levels for this image size.
When RHSEG reaches processing at recursive level onb_levels, the input data, region label map, and any other pixel oriented information is maintained in the processors operating at recursive level onb_levels, and not transmitted up the recursive processing chain. When information relating to the input data or region label map is required at processing levels<onb_levels, this information is passed from the processors active at the onb_levels recursive level through interprocessor communication up to the appropriate process at the current recursive level.
Setting onb_level>1 distributes the pixel-oriented data over 4onb
See
NOTE: The parameters inb_levels and onb_levels are used only in the parallel version and are not defined or used in the single processor version.
The following required parameters specify and describe the input data:
The input image data file from which a hierarchical image segmentation is to be produced. This image data file is assumed to be a headerless binary two-spatial dimension image or image-like data file in band sequential format. The number of columns, rows, spectral bands and the data type are specified by other required parameters (see below). Data types “unsigned char (byte)” and “short unsigned int” are supported.
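As an illustration of the band sequential layout, the sketch below indexes a pixel value in a headerless buffer. It assumes the usual band sequential convention in which each band is stored in full, row by row; the function name `bsq_pixel` and the tiny in-memory buffer are hypothetical. For “short unsigned int” data the array typecode "H" could be used instead of "B", with the byte order of the file being a further assumption.

```python
from array import array


def bsq_pixel(data, ncols, nrows, col, row, band):
    """Return the value at (col, row, band) in band sequential order."""
    return data[(band * nrows + row) * ncols + col]


# A tiny 2-column x 3-row x 2-band "image" of byte values 0..11.
ncols, nrows, nbands = 2, 3, 2
data = array("B", bytes(range(ncols * nrows * nbands)))

first_of_band_1 = bsq_pixel(data, ncols, nrows, col=0, row=0, band=1)
last_value = bsq_pixel(data, ncols, nrows, col=1, row=2, band=1)
```

All values for band 0 precede all values for band 1, so the first pixel of band 1 sits at offset ncols*nrows in the buffer.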
The following required parameters specify output files:
The region label map at the finest level of segmentation detail (hierarchical level 0). Together with regmerges (see below), this forms the main output of RHSEG. Region label values of “0” correspond to invalid input data values in the input image data. Valid region label values range from 1 through 65535. The data is of data type “short unsigned int.”
The region merges list file consists of the renumberings of the region label map required to obtain the region label map for the second most detailed level (hierarchical level 1) through the coarsest (last) level of the segmentation hierarchy from rlblmap (see above). The data type is “short unsigned int.” The data is stored as rows of values, with the column location (with counting starting at 1) corresponding to the region label value in rlblmap (the region label map at the finest level of detail of the segmentation hierarchy) and the row location corresponding to the segmentation hierarchy level (the lth row is the renumberings required to obtain the (l+1)th level of the segmentation hierarchy).
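The use of a regmerges row can be sketched as a simple relabeling of the finest-level region label map. The sketch assumes that each row maps the labels of rlblmap directly to the labels of a coarser level, as the column convention above suggests; the `relabel` function and the sample values are hypothetical.

```python
def relabel(label_map, renumbering):
    """Apply one regmerges row to a region label map.

    label_map: list of rows of region labels (0 marks invalid data).
    renumbering: list in which renumbering[v - 1] is the new label for
                 label v (column counting starts at 1).
    """
    return [[0 if v == 0 else renumbering[v - 1] for v in row]
            for row in label_map]


rlblmap = [[1, 1, 2],
           [3, 0, 4]]
row = [1, 1, 2, 2]  # hypothetical: regions 1 and 2 merge, regions 3 and 4 merge
coarser = relabel(rlblmap, row)
```

Because only labels are renumbered, region boundaries at the coarser level remain at the finest data granularity, and the invalid-data label 0 is passed through unchanged.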
The region number of pixels list consists of the number of pixels (of data type “unsigned int”) in each region stored as rows of values, with the column location (with counting starting at 1) corresponding to the region label value and the row location corresponding to the segmentation hierarchy level (with counting starting at 0).
The output parameter file contains (in ASCII form) all the output parameters from RHSEG. This parameter file is formatted in the same way as the input parameter file for RHSEG and contains most of the same parameters. Additional parameters are the number of hierarchical segmentation levels (nb_levels) in the hierarchical segmentation output and the number of regions (level0_nregions) in the hierarchical segmentation with the finest segmentation detail. These additional parameter values are required to interpret the rnpixlist, regmerges, rmeanlist, rthreshlist and boundary_npix output files (see below).
The following parameters specify recommended, but optional, output files (no defaults):
The region mean list file is an optional output of RHSEG. This list consists of the region mean value (of data type “double”) of each region stored as rows of values and groups of rows, with the column location (with counting starting at 1) corresponding to the region label value, the row location (in each row group) corresponding to the spectral band and the row group corresponding to the segmentation hierarchy level (with counting starting at 0).
The region maximum merge threshold list file is an optional output of RHSEG. This list consists of the maximum merge threshold encountered in all merges involving each region. The values (of data type “double”) are stored as rows of values, with the column location (with counting starting at 1) corresponding to the region label value and the row location corresponding to the segmentation hierarchy level (with counting starting at 0).
The region number of boundary pixels list is an optional output of RHSEG. This list consists of the number of boundary pixels in each region (of data type “unsigned int”) stored as rows of values, with the column location (with counting starting at 1) corresponding to the region label value and the row location corresponding to the segmentation hierarchy level (with counting starting at 0).
The hierarchical boundary map is an optional output of RHSEG. The data values of this map are (of type unsigned char (byte))
The region standard deviation value list is an optional output of RHSEG. This list consists of the region's standard deviation value (of data type “double”) stored as rows of values, with the column location (with counting starting at 1) corresponding to the region label value and the row location corresponding to the segmentation hierarchy level (with counting starting at 0).
The following parameters specify optional input files and associated parameters (with defaults, if any):
The optional input data mask must match the input image data in number of columns and rows. Even if the input image data has more than one spectral band, the input data mask need only have one spectral band. If the input data mask has more than one spectral band, only the first spectral band is used and is assumed to apply to all spectral bands for the input image data. If the data value of the input data mask is not equal to mask_value (see the next parameter definition), the corresponding value of the input image data object is taken to be a valid data value. If the data value of the input data mask object is equal to mask_value, the corresponding value of the input image data object is taken to be invalid and a region label of “0” is assigned to that spatial location in the output region label map data. The input data mask data type is assumed to be “unsigned char.”
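The masking rule can be sketched as follows: locations where the mask equals mask_value are treated as invalid and receive region label 0. The `apply_mask` function and the sample arrays are hypothetical and use plain nested lists for brevity.

```python
def apply_mask(label_map, mask, mask_value):
    """Assign label 0 wherever the mask equals mask_value."""
    return [[0 if m == mask_value else lab
             for lab, m in zip(label_row, mask_row)]
            for label_row, mask_row in zip(label_map, mask)]


labels = [[1, 2],
          [3, 4]]
mask = [[1, 0],
        [1, 1]]
masked = apply_mask(labels, mask, mask_value=0)
```

Only the location whose mask value equals mask_value is relabeled; every other location keeps its original region label.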
The optional region label map must match the input image data in number of columns and rows. If provided, the image segmentation is initialized according to the input region label map instead of the default of each pixel as a separate region. Wherever a region label of “0” is given by the input region label map, the region labeling is assumed to be unknown and the region label map is initialized to one-pixel regions at those locations (except see rlblmap_mask_value below).
The following optional parameters are recommended for variation by all users (defaults provided):
Criterion for evaluating the dissimilarity of one region from another.
Dissimilarity criteria 1, 2 and 3 are based on vector norms. The 1-Norm of the difference between the region mean vectors, ui and uj, of regions Xi and Xj, each with B spectral bands, is:
where μib and μjb are the mean values for regions i and j, respectively, in spectral band b. The dissimilarity function for regions Xi and Xj, based on the vector 1-Norm, is given by:
d1-Norm(Xi,Xj)=∥ui−uj∥1. (4b)
The vector 2-Norm of the difference between the region mean vectors, ui and uj, of regions Xi and Xj is:
The dissimilarity function for regions Xi and Xj, based on the vector 2-Norm, is given by:
d2-Norm(Xi,Xj)=∥ui−uj∥2. (5b)
The vector ∞-Norm of the difference between the region mean vectors, ui and uj, of regions Xi and Xj is:
∥ui−uj∥∞=max(|μib−μjb|, b=1, 2, . . . , B). (6a)
The dissimilarity function for regions Xi and Xj, based on the vector ∞-Norm, is given by:
d∞-Norm(Xi, Xj)=∥ui−uj∥∞. (6b)
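By way of non-limiting illustration, the three vector-norm criteria can be sketched in Python as follows. The function names are hypothetical and are not part of the RHSEG program; the sketch assumes the standard definitions of the vector 1-Norm and 2-Norm for equations (4a) and (5a), and the band maximum of equation (6a) for the maximum norm of criterion 3.

```python
import math

def d_1norm(u_i, u_j):
    """Criterion 1: sum over bands of |mu_ib - mu_jb|."""
    return sum(abs(a - b) for a, b in zip(u_i, u_j))

def d_2norm(u_i, u_j):
    """Criterion 2: square root of the sum over bands of (mu_ib - mu_jb)^2."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u_i, u_j)))

def d_maxnorm(u_i, u_j):
    """Criterion 3: maximum over bands of |mu_ib - mu_jb|, as in (6a)."""
    return max(abs(a - b) for a, b in zip(u_i, u_j))

# Region mean vectors for two regions with B = 3 spectral bands (made-up values).
u_i = [10.0, 20.0, 30.0]
u_j = [13.0, 16.0, 30.0]

print(d_1norm(u_i, u_j))    # 7.0
print(d_2norm(u_i, u_j))    # 5.0
print(d_maxnorm(u_i, u_j))  # 4.0
```

All three functions operate only on region mean vectors, so they can be evaluated without revisiting the individual pixels of either region.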
Dissimilarity criteria 6 and 7 are based on minimizing the increase of mean squared error between the region mean image and the original image data (9) (10) (11). The sample estimate of the mean squared error for the segmentation of band b of the image X into R disjoint subsets X1, X2, . . . , XR is given by:
is the mean squared error contribution for band b from segment Xi. Here, xp is a pixel vector (in this case, a pixel vector in data subset Xi), and χpb is the image data value for the bth spectral band of the pixel vector, xp. A dissimilarity function based on a measure of the increase in mean squared error due to the merge of regions Xi and Xj is given by:
BSMSE refers to “band sum MSE.” Instead of summing over the bands in (7a) one could take the maximum over the spectral bands, resulting in a “band maximum MSE:”
dBMMSE(Xi, Xj)=max{ΔMSEb(Xi, Xj), b=1, 2 . . . , B}. (8c)
Using (7b) and exchanging the order of summation, (8b) can be manipulated to produce an efficient dissimilarity function based on aggregated region features:
where μijb is the mean value for the bth spectral band of the mean vector, uij, of the region represented by Xij=Xi∪Xj.
Since
an alternate form for Equation (9a) is:
Combining Equations (8a) and (9c),
Similarly combining Equations (8c) and (9c),
The dimensionality of the dBSMSE and the dBMMSE dissimilarity criteria is equal to the square of the dimensionality of the image pixel values, while the dimensionality of the vector norm based dissimilarity criteria is equal to the dimensionality of the image pixel values. To keep the dissimilarity criteria dimensionalities consistent, HSEG uses the square root of these dissimilarity criteria. The “Square Root of Band Sum Mean Squared Error” criterion is:
and the “Square Root of Band Maximum Mean Squared Error” criterion is:
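The idea that the squared-error increase of a merge can be computed from aggregated region features alone can be checked numerically. The following Python sketch is offered only as an illustration under stated assumptions: it uses the standard identity that merging two groups of n_i and n_j values increases the summed squared error by n_i*n_j/(n_i+n_j) times the squared difference of their means, and it omits any normalization constant (such as division by the total number of pixels) that the equations of this disclosure may apply. The resulting values are therefore proportional to, and not necessarily equal to, the band sum and band maximum criteria. All function and variable names are hypothetical.

```python
import math

def sse(values):
    """Sum of squared deviations of one band's pixel values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def delta_sse_direct(band_i, band_j):
    """Squared-error increase for one band, computed from the pixels themselves."""
    return sse(band_i + band_j) - sse(band_i) - sse(band_j)

def delta_sse_closed(band_i, band_j):
    """Same increase, computed from region sizes and region means only."""
    n_i, n_j = len(band_i), len(band_j)
    mu_i = sum(band_i) / n_i
    mu_j = sum(band_j) / n_j
    return n_i * n_j / (n_i + n_j) * (mu_i - mu_j) ** 2

# Two regions with B = 2 bands; each band is a list of pixel values (made-up data).
region_i = [[1.0, 2.0, 3.0], [10.0, 10.0, 13.0]]
region_j = [[6.0, 8.0], [11.0, 11.0]]

per_band = [delta_sse_closed(bi, bj) for bi, bj in zip(region_i, region_j)]
sqrt_band_sum = math.sqrt(sum(per_band))  # unnormalized analogue of the band sum form
sqrt_band_max = math.sqrt(max(per_band))  # unnormalized analogue of the band maximum form
print(per_band, sqrt_band_sum, sqrt_band_max)
```

Because the closed form needs only the pixel counts and mean values of the two regions, a candidate merge can be scored without touching the underlying pixels.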
Dissimilarity criterion 9 is based on the “SAR Speckle Noise Criterion” from Beaulieu. The criterion is:
NOTE: Other dissimilarity criteria can be included as additional options without changing the nature of the RHSEG implementation.
The following optional parameters may need to be modified depending on your operating system (defaults provided):
The default values should be used for the following optional parameters, except in special circumstances (defaults provided):
Setting spatial_wght=1.0 weights the spatial feature equally with the spectral band features, spatial_wght<1.0 weights the spatial feature less and spatial_wght>1 weights the spatial feature more. If D is the dissimilarity function value before combination with the spatial feature value, the combined dissimilarity function value (comparing regions i and j), Dc, is:
Dc=D+spatial_wght*|sfi−sfj| (13)
where sfi and sfj are the spatial feature values for regions i and j, respectively.
The spatial feature employed is the spectral band maximum region standard deviation. For regions consisting of 2 or more pixels, the region standard deviation for spectral band b of region i is:
where ni is the number of pixels in the region and μib is the region mean for spectral band b of region i:
The spatial feature value for region i is then defined as:
sfi=σi=B*max{σib:b=1, 2, . . . , B} (15)
where B is the number of spectral bands.
The region standard deviation is not defined for regions consisting of only one pixel. Further, the region standard deviation as calculated by equation (14) can only be considered a rough estimate for small regions (say, regions of fewer than 9 pixels). Thus, if one of the regions being compared consists of fewer than 9 pixels, the spatial_wght factor is modified by a std_dev_factor as follows:
spatial_wght′=std_dev_factor*spatial_wght, (16a)
where
std_dev_factor=(min_npix−1.0)/8.0, (16b)
and min_npix is the number of pixels in the smaller of the two regions being compared. Note that for min_npix=1, std_dev_factor=0.0. Thus, std_dev_factor serves to gradually phase in the standard deviation spatial feature as the regions get larger.
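The combination of equations (13), (16a) and (16b) can be illustrated with the following minimal Python sketch. The function names and argument order are hypothetical; the spatial feature values sf_i and sf_j are taken as inputs, so the sketch makes no assumption about how the region standard deviation itself is computed.

```python
def std_dev_factor(min_npix):
    """Equation (16b): phases in the spatial feature for regions under 9 pixels."""
    return (min_npix - 1.0) / 8.0

def combined_dissim(d, sf_i, sf_j, npix_i, npix_j, spatial_wght):
    """Equation (13), with spatial_wght adjusted per (16a) for small regions."""
    min_npix = min(npix_i, npix_j)
    wght = spatial_wght
    if min_npix < 9:
        wght = std_dev_factor(min_npix) * spatial_wght
    return d + wght * abs(sf_i - sf_j)

# D = 2.0 and |sf_i - sf_j| = 2.0, with the smaller region at 1, 5 and 9 pixels.
print(combined_dissim(2.0, 1.0, 3.0, 1, 100, 1.0))  # 2.0 (spatial feature ignored)
print(combined_dissim(2.0, 1.0, 3.0, 5, 100, 1.0))  # 3.0 (half weight)
print(combined_dissim(2.0, 1.0, 3.0, 9, 100, 1.0))  # 4.0 (full weight)
```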
based on the following neighborhood chart, where the focal pixel is marked “X”:
Using this chart, N Nearest Neighbors include pixels 1, 2, . . . N.
For 1-spatial dimension data, cases 1 and 2 degenerate to “Two Nearest Neighbors” and cases 3, 4 and 5 degenerate to “Four Nearest Neighbors,” according to the following neighborhood chart:
Let χpb be the original value for the pth pixel (out of N pixels) in the bth band (out of B bands). The sample mean and sample variance of the bth band are
respectively. The following transformation of the data, χpb, will produce image data, ξpb, with mean, M, and variance, Σ:
Usually, the data is normalized so that Σ2(=Σ)=1, and M=0.
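A per-band normalization of this kind can be sketched in Python as shown below. This is an illustration under stated assumptions rather than the program's implementation: the sample variance is assumed to use an N−1 divisor, the transformation is assumed to be the linear rescaling that maps the band mean to the target mean and the band variance to the target variance, and a constant band is mapped to the target mean. The function and argument names are hypothetical.

```python
import math

def normalize_band(values, target_mean=0.0, target_var=1.0):
    """Linearly rescale one band so it has the requested mean and sample variance."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: sample variance with an N-1 divisor.
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    if var == 0.0:
        # Assumption: a constant band carries no contrast, so map it to the mean.
        return [target_mean] * n
    scale = math.sqrt(target_var / var)
    return [(v - mean) * scale + target_mean for v in values]

band = [2.0, 4.0, 6.0, 8.0]
normalized = normalize_band(band)
print(normalized)
```

Applying the function to each spectral band separately corresponds to the band-by-band normalization described above.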
As written above, the normalization is applied to each spectral band separately. It can also be defined to apply equally across all spectral bands. For this case, use σ=max{σb:b=1, 2, . . . , B} in (18) and (19). However, this type of normalization will produce the same hierarchical segmentation result as no normalization at all: the dissim_val's will change, but the tratio values will end up being the same (see step 11 of the “HSEG Basic Algorithm Description” described in connection with
The number of recursive levels. The default is calculated such that the number of data points in the subsections of data processed at recursion level rnb_levels is no more than 2048. The number of columns and rows at recursion level rnb_levels is sub_ncols=ncols/2rnb
If not specified, the default is calculated to be min_nregions=sub_ncols*sub_nrows/4 (for sub_ncols and sub_nrows, see the rnb_levels parameter).
The default for spclust_start is calculated such that spectral clustering is utilized only part of the time when spclust_wght>0.0. If spclust_wght=0.0, spclust_start=0. Otherwise, spclust_start=2*min_nregions.
Criterion for evaluating the quality of the image segmentations based on the global dissimilarity of the region mean image versus the original image data.
The global dissimilarity criteria 1, 2 and 3 are based on vector norms. The global dissimilarity function, based on the vector 1-Norm, for the R region segmentation of the N pixel data set X is given by:
The global dissimilarity function, based on the vector 2-Norm, for the R region segmentation of the N pixel data set X is given by:
The global dissimilarity function, based on the vector ∞-Norm, for the R region segmentation of the N pixel data set X is given by:
The global dissimilarity criteria 6 and 7 are based on the square root of the mean squared error between the region mean image and the original image data. The global dissimilarity criterion “Square Root of Band Sum Mean Squared Error” is:
The global dissimilarity criterion “Square Root of Band Maximum Mean Squared Error” is:
Dissimilarity criterion 9 is based on the “SAR Speckle Noise Criterion” from M. Beaulieu. The global criterion is:
This parameter is used to calculate min_npixels as follows:
min_npixels=⌊npixels*min_npixels_pct/100.0⌋, (26a)
where npixels is the number of pixels in the current section of data being processed. The value of min_npixels is then used to calculate a merge acceleration factor, factor, which is multiplied by the dissimilarity criterion value. If small_npix is the number of pixels in the smaller of the two regions being compared, factor=1.0 if small_npix≥min_npixels and
factor=1.0−((min_npixels−small_npix)/min_npixels), (26b)
otherwise.
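Equations (26a) and (26b) can be illustrated with the following minimal Python sketch. The function names are hypothetical. Note that (26b) simplifies algebraically to small_npix/min_npixels, so the factor shrinks the dissimilarity value, and thereby favors merging, in proportion to how far the smaller region falls below min_npixels.

```python
import math

def compute_min_npixels(npixels, min_npixels_pct):
    """Equation (26a): floor of the given percentage of the section's pixel count."""
    return math.floor(npixels * min_npixels_pct / 100.0)

def merge_accel_factor(small_npix, min_npixels):
    """Equation (26b), with factor = 1.0 once the smaller region is large enough."""
    if small_npix >= min_npixels:
        return 1.0
    return 1.0 - (min_npixels - small_npix) / min_npixels

min_npixels = compute_min_npixels(10000, 0.5)  # 50 pixels
print(min_npixels)
print(merge_accel_factor(10, min_npixels))     # approximately 0.2
print(merge_accel_factor(50, min_npixels))     # 1.0
```

When min_npixels evaluates to 0, the first branch always applies, so the division in (26b) is never reached.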
During the processing window elimination process, a “candidate region label” set is accumulated for use in considering whether or not a pixel is to be split out of its current region. A candidate region is either a region that may contain pixels that should be split out and possibly assigned to a different region, or a region to which a split-out pixel may be assigned. Consider the data points that are in the pairs of rows and columns along the seam between the data quadrants reassembled in step 2 of the RHSEG algorithm. For each of these pixels, calculate the dissimilarity between the pixel and its current region (own_region_dissim), and calculate the dissimilarity between the pixel and the region of the pixel across the seam (other_region_dissim). If own_region_dissim>seam_threshold_factor*other_region_dissim, add the region label of the region of the pixel across the seam to the “candidate region label” set of the region the pixel belongs to.
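The seam test just described can be sketched in Python as follows. This is a simplified illustration, not the RHSEG implementation: the data layout (a list of facing pixel pairs, each pixel given with its region label), the use of a 1-Norm between a pixel vector and a region mean as the pixel-to-region dissimilarity, the skipping of pairs that already share a region, and the threshold value 1.5 are all assumptions, and every name other than seam_threshold_factor is hypothetical.

```python
from collections import defaultdict

def pixel_region_dissim(pixel, region_mean):
    """Assumed measure: 1-Norm between a pixel vector and a region mean vector."""
    return sum(abs(p - m) for p, m in zip(pixel, region_mean))

def seam_candidates(seam_pairs, region_means, seam_threshold_factor):
    """Accumulate candidate region label sets from pixels facing each other across a seam."""
    candidates = defaultdict(set)
    for side_a, side_b in seam_pairs:
        # Test the pixel on each side of the seam against the region on the other side.
        for (pixel, own), (_, other) in ((side_a, side_b), (side_b, side_a)):
            if own == other:
                continue  # assumption: nothing to record when both sides share a region
            own_dissim = pixel_region_dissim(pixel, region_means[own])
            other_dissim = pixel_region_dissim(pixel, region_means[other])
            if own_dissim > seam_threshold_factor * other_dissim:
                candidates[own].add(other)
    return candidates

region_means = {1: [10.0], 2: [20.0]}
seam_pairs = [
    (([19.0], 1), ([20.0], 2)),  # the pixel labeled 1 looks far more like region 2
    (([10.0], 1), ([21.0], 2)),  # both pixels fit their own regions
]
result = seam_candidates(seam_pairs, region_means, seam_threshold_factor=1.5)
print(dict(result))  # {1: {2}}
```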
During the processing window elimination process, a “candidate region label” set is accumulated for use in considering whether or not a pixel is to be split out of its current region. Compare each region to every other region. If the dissimilarity between a pair of regions is less than region_threshold_factor*max_threshold, add each region label to the “candidate region label” set for the other region. NOTE: max_threshold is the maximum merging threshold encountered in the previous merging iterations.
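The region-pair comparison described above can be sketched in Python as follows. The sketch assumes that regions are summarized by mean vectors held in a dictionary keyed by region label and that a 1-Norm is the region dissimilarity in use; any of the dissimilarity criteria could be passed in instead. All names other than region_threshold_factor and max_threshold are hypothetical, and the numeric values are made up.

```python
from itertools import combinations

def d_1norm(u_i, u_j):
    """Assumed region dissimilarity: 1-Norm between region mean vectors."""
    return sum(abs(a - b) for a, b in zip(u_i, u_j))

def region_pair_candidates(region_means, max_threshold, region_threshold_factor,
                           dissim=d_1norm):
    """Add each label of a sufficiently similar region pair to the other's candidate set."""
    candidates = {label: set() for label in region_means}
    limit = region_threshold_factor * max_threshold
    for a, b in combinations(sorted(region_means), 2):
        if dissim(region_means[a], region_means[b]) < limit:
            candidates[a].add(b)
            candidates[b].add(a)
    return candidates

region_means = {1: [10.0], 2: [12.0], 3: [40.0]}
pair_result = region_pair_candidates(region_means, max_threshold=5.0,
                                     region_threshold_factor=0.5)
print(pair_result)  # {1: {2}, 2: {1}, 3: set()}
```

Comparing every pair of regions grows quadratically with the number of regions, which is one reason a practical implementation might restrict or cache these comparisons; the sketch makes no attempt to do so.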
For each region with a non-empty “candidate region label” set, compute the dissimilarity of each pixel in that region to its current region (own_region_dissim) and to each region in the region's “candidate region label” set (other_region_dissim). If a pixel is found to have own_region_dissim>split_pixels_factor*other_region_dissim, the pixel is split out from its current region.
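The pixel splitting test can be sketched in Python as follows. The sketch assumes that a pixel is split out as soon as the inequality holds for any one region in the candidate set, and it uses a 1-Norm between a pixel vector and a region mean as the pixel-to-region dissimilarity; both choices are assumptions made for illustration. All names other than split_pixels_factor are hypothetical.

```python
def pixel_region_dissim(pixel, region_mean):
    """Assumed measure: 1-Norm between a pixel vector and a region mean vector."""
    return sum(abs(p - m) for p, m in zip(pixel, region_mean))

def find_split_pixels(pixels, labels, region_means, candidates, split_pixels_factor):
    """Return the indices of pixels that should be split out of their current regions."""
    split = []
    for idx, (pixel, own) in enumerate(zip(pixels, labels)):
        others = candidates.get(own, ())
        if not others:
            continue  # only regions with a non-empty candidate set are examined
        own_dissim = pixel_region_dissim(pixel, region_means[own])
        # Assumption: one qualifying candidate region is enough to split the pixel out.
        if any(own_dissim > split_pixels_factor * pixel_region_dissim(pixel, region_means[o])
               for o in others):
            split.append(idx)
    return split

pixels = [[10.0], [19.0], [21.0]]
labels = [1, 1, 2]
region_means = {1: [10.0], 2: [20.0]}
candidates = {1: {2}}
split_result = find_split_pixels(pixels, labels, region_means, candidates,
                                 split_pixels_factor=1.5)
print(split_result)  # [1]
```

The pixels identified in this way would subsequently be remerged with the most appropriate region; that remerging step is outside the scope of the sketch.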
When spclust_wght>0.0, the following optional parameters may be used to output information on closed connected regions (no defaults, ignored if spclust_wght=0.0):
Similar to the output parameter file oparam.
Similar to rlblmap, but with closed connected regions.
Similar to regmerges, but for closed connected regions.
Similar to rnpixlist, but for closed connected regions.
Similar to rmeanlist, but for closed connected regions.
Similar to rthreshlist, but for closed connected regions.
Similar to boundary_npix, but for closed connected regions.
Similar to reg_std_dev, but for closed connected regions.
The following optional parameters control the run-time screen and log file outputs:
Must be specified if debug>0 (ignored if debug=0):
At a minimum (for debug=1), the output log file records program parameters and the number of regions and maximum merge ratio value for each level of the region segmentation hierarchy.
The parameters that have the most effect on the nature of the segmentation results are spclust_wght, dissim_crit, chk_nregions and convfact. The default values are recommended for the other optional parameters for routine use of HSEG and RHSEG, with the exception that specification of the output file name parameters regmerges, rthreshlist, boundary_npix and boundary_map is also recommended. In addition, the file name parameter rmeanlist is recommended when the number of spectral bands is less than 10 or so. Of course, if some input data elements are invalid, then some method of data masking should also be employed.
The following paragraphs give some guidance on the setting of the spclust_wght, dissim_crit, chk_nregions and convfact parameters:
spclust_wght: The user may want to vary the value of spclust_wght to modify the overall nature of the segmentation results. For spclust_wght=0.0, one obtains relatively coherent closed connected regions. For spclust_wght=1.0, one obtains relatively varied regions, each possibly consisting of several spatially disjoint subsections. For other values of spclust_wght, one obtains results intermediate between the spclust_wght=0.0 and spclust_wght=1.0 results.
dissim_crit: The user may also want to vary the value of dissim_crit to modify the overall nature of the segmentation results. Different dissimilarity criteria will result in different merge orderings. NOTE: If any of the vector norm dissimilarity criteria is chosen (selections 1, 2 or 3), the user may also want to specify the value of min_npixels_pct (small values in the range of 0.1 to 1.0 are suggested).
chk_nregions: The user may want to vary the value of chk_nregions to vary the level of segmentation detail in the most detailed level of the segmentation hierarchy. Higher values will increase the detail (the segmentation will have more regions) and lower values will decrease the detail (the segmentation will have fewer regions).
convfact: The user may want to vary the value of convfact to control the number of hierarchical segmentation levels contained in the final segmentation hierarchy. Lower values of convfact will produce more segmentation levels and higher values of convfact will produce fewer segmentation levels. Too high a value for convfact will produce only two hierarchical levels: one with chk_nregions number of regions and one with conv_nregions number of regions.
Varying the other optional parameter values away from the default values requires a thorough understanding of the inner workings of the software implementation of the HSEG and RHSEG algorithms.
This disclosure has been written in a general way as to include alternate embodiments of the algorithms in software. Varying the parameters will bring about various alternate embodiments of the innovation.
The extension of the basic RHSEG algorithm to 3-spatial dimensions is conceptually straightforward. One just needs to deal in data “voxels” instead of “pixels,” extend the definition of conn_type to include 3-spatial dimension neighborhoods, and recursively divide the data into eight equal sections (halving each spatial dimension) rather than four.
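The division of a 3-spatial dimension data volume into eight sections can be sketched in Python as follows. The function names are hypothetical, and the handling of odd dimension sizes (the second half receives the extra element) is an assumption made for illustration, since the sections are exactly equal only when every dimension is even.

```python
from itertools import product

def halves(n):
    """Split the index range [0, n) into two half-open ranges."""
    mid = n // 2
    return [(0, mid), (mid, n)]

def octants(ncols, nrows, ndepths):
    """Return eight (col_range, row_range, depth_range) sections of the volume."""
    return list(product(halves(ncols), halves(nrows), halves(ndepths)))

sections = octants(8, 8, 4)
print(len(sections))  # 8
print(sections[0])    # ((0, 4), (0, 4), (0, 2))
```

Applying octants again to each returned section yields the next level of recursion, in the same way that the 2-spatial dimension version repeatedly divides each section into quadrants.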
Concerning conn_type:
For 2-spatial dimensions, we defined neighbors based on the following neighborhood chart, where the focal pixel is marked “X”:
Using this chart, N Nearest Neighbors include pixels 1, 2, . . . N for 2-spatial dimension data. A similar chart could be developed for 3-spatial dimensions, but printing would require 3-dimensional plotting. It is easier to present a table of values. For 2-spatial dimensions, the table corresponding to the above 2-spatial dimension chart, for up to Eight Nearest Neighbors, is:
where the focal pixel is at location (col,row), and nbcol is the neighboring pixel column location and nbrow is the neighboring pixel row location.
The corresponding chart for 3-spatial dimensions, for up to Twenty-Six Nearest Neighbors, is given on the next page. In this chart, the focal pixel is at location (col,row,depth), nbcol is the neighboring voxel column location, nbrow is the neighboring voxel row location and nbdepth is the neighboring voxel depth location. Note that the equivalent of Eight Nearest Neighbors (the default) in 2-spatial dimensions is Twenty-Six Nearest Neighbors in 3-spatial dimensions.
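The relationship between the Eight Nearest Neighbors of 2-spatial dimensions and the Twenty-Six Nearest Neighbors of 3-spatial dimensions can be illustrated by generating the neighbor offsets programmatically. The Python sketch below is an illustration only: the offsets are sorted by squared distance from the focal element, and the ordering among neighbors at equal distance is an assumption that need not match the numbering in the charts of this disclosure. The function name is hypothetical.

```python
from itertools import product

def neighbor_offsets(ndims):
    """Offsets to all immediately adjacent elements, nearest first."""
    offsets = [o for o in product((-1, 0, 1), repeat=ndims) if any(o)]
    return sorted(offsets, key=lambda o: sum(c * c for c in o))

offsets_2d = neighbor_offsets(2)  # (col, row) offsets
offsets_3d = neighbor_offsets(3)  # (col, row, depth) offsets
print(len(offsets_2d))  # 8
print(len(offsets_3d))  # 26
```

In the sorted 3-spatial dimension list, the first 6 offsets share a face with the focal voxel, the next 12 share only an edge, and the last 8 share only a corner, so taking the first 6, 18 or 26 entries gives progressively larger neighborhoods.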
The extension to 3-spatial dimensions of RHSEG is also conceptually straightforward. The seams between the data sections that are reassembled after processing at the previous level of recursion now become pairs of surfaces rather than pairs of lines.
The extension of the parallel implementation of RHSEG to 3-spatial dimensions is also straightforward conceptually. The number of processors utilized by the 3-spatial dimension implementation is numprocs=8inb
At program initialization, the input data would again be parceled out to numprocs processors, and each processor would independently process each of the numprocs data sections. Again similar to the 2-spatial dimension version, upon completion of processing at recursive levels rnb_levels through inb_levels, the 3-spatial dimension version would transfer its results to the appropriate processor at recursive level inb_levels−1.
The parallel implementation scheme for the 3-spatial dimension version of RHSEG could be charted similarly to
While the parallel implementation of the modified version of RHSEG is noted to be conceptually straightforward, the amount of detail that needs to be kept track of by the program is increased substantially, complicating the programming task substantially.
All publications, patents, and patent documents are incorporated by reference herein as though individually incorporated by reference. Although preferred embodiments of the present invention have been disclosed in detail herein, it will be understood that various substitutions and modifications may be made to the disclosed embodiment described herein without departing from the scope and spirit of the present invention as recited in the appended claims.
The present application is a continuation in part of U.S. Ser. No. 10/845,419 filed May 11, 2004, and which is incorporated herein by reference.
The invention described herein was made by an employee of the United States Government, and may be manufactured and used by and for the Government or for governmental purposes without the payment of any royalties thereon or therefor.
| | Number | Date | Country |
|---|---|---|---|
| Parent | 10845419 | May 2004 | US |
| Child | 11251530 | Sep 2005 | US |