Image segmentation can be used to identify objects within images. For example, an image can be segmented to identify a particular object such as a person (or representation of a person) within the image. The representation of the person can then be copied or extracted from the image and added to another image.
Some image segmentation systems provide a user interface via which a user can provide input to identify a portion of an object within an image that should be segmented (or extracted from the image). Such image segmentation systems then sample the portion (or sample region) of the object, generate a model of the object using the identified portion of the object, and identify the object using the model.
Image segmentation systems that rely on a model of an object generated from a sample region can fail to accurately identify the object. For example, if the sample region includes not only a portion of the object, but also other portions of the image, the model can fail to accurately represent the object. That is, the model includes information about or derived from both the object and the other portions of the image included in the sample region because the model is derived from samples (e.g., pixel values) within the sample region. The model, therefore, represents both the portion of the object included in the sample region and the other portions of the image included in the sample region, which impairs the accuracy of the model as a representation of the object.
Additionally, a model generated from samples taken from one portion of an object within an image may not accurately represent the object when the object is visually spatially diverse. In other words, if the visual appearance of the object varies across the object (i.e., one segment of the object looks different than another segment of the object), a model generated from a sample region that includes one portion of the object, but not other portions with differing visual appearances, may not accurately represent the entire object.
These limitations can cause such image segmentation systems to be overly inclusive (e.g., identify portions of the image that do not include parts of a particular object as including parts of the object) or overly exclusive (e.g., fail to identify portions of the image that are parts of the object). As a result, users often refine the segmentation performed by such image segmentation systems by manually adjusting which portions of the image are identified as including particular objects.
Moreover, image segmentation systems that rely on user input to select sample regions of an object within an image are unable to perform image segmentation independent of user input. Such image segmentation systems can be particularly problematic when implemented as network services (e.g., applications that are accessed via communications links such as the Internet), because such image segmentation systems require a user interface such as a graphical user interface (GUI) via which a user can identify or select a sample region of an image. Often, such image segmentation systems can appear slow or unresponsive to users when a GUI is provided for user input such as selection of a sample region due to latency, limited throughput, and other limitations of communications links.
Implementations discussed herein identify objects within images using discriminative classifiers. For example, implementations discussed herein select sample regions of an image independent of user input. Such sample regions include foreground sample regions (i.e., sample regions that are targeted to or intended to include a portion of a particular object) and background sample regions (i.e., sample regions that are targeted to or intended to include portions of an image that do not include a particular object). The sample regions are then used to generate discriminative classifiers that identify an object against a background within the image without generating a model that represents the object. Additionally, the sample regions are associated with particular segments of an object within an image. Accordingly, the discriminative classifiers are also associated with, or tuned to, those segments. Thus, such implementations can identify segments of the object with enhanced accuracy when compared to other methodologies. In some implementations, the object is a person (or human being), and a discriminative classifier is generated for each of a variety of segments of the person such as a hair segment, a pant segment, and a shirt segment.
As used herein, a functionality or operation is or is performed “independent of user input” if user input does not provide arguments, parameters, or other data to that functionality or operation. For example, a functionality or operation is performed independent of user input if the functionality or operation is invoked, initiated, or started by user input, but the user input does not provide data used as an operand within the functionality or operation. As another example, a functionality or operation is performed independent of user input if the functionality or operation is selected or specified by user input, but the user input does not provide data used as an operand within the functionality or operation.
Objects often include multiple segments. A segment is a portion or part of an object. For some objects, one segment of the object is visually distinct from other segments of the object. For example, a person depicted within an image can include a face segment, a hair segment, a shirt segment, and a pant segment, each of which is visually distinct from the other segments. As another example, a flower can have a stem segment, one or more leaf segments, and a corolla segment. Moreover, some segments can include other segments. As an example, a corolla segment of a flower can include one or more petal segments.
A reference segment is a segment of an object that is used as a reference to identify other segments of the object. As such, reference segments can typically be identified with confidence. Said differently, a reference segment can be distinguished from other portions of an image with a low error rate. Often, reference segments are visually distinctive or have visual properties that distinguish them from other portions of an image. As examples, the face of a person, the corolla of a flower, and the license plate of an automobile can be reference segments.
As discussed above, the reference segment can be identified at block 110 independent of user input. For example, a user of an image segmentation system implementing process 100 can provide an image (e.g., a file including image data encoded according to any of a variety of formats such as a JPEG format, a GIF format, a bitmap format, or a PNG format) to the image segmentation system via a communications link. In response to receiving the image, the image segmentation system can analyze the image for reference segments. As a specific example, the image segmentation system can analyze the image to identify any of a group of segments as reference segments. More specifically, with reference to the examples above, the image segmentation system can analyze the image to identify a face of a person, a corolla of a flower, or a license plate of an automobile as a reference segment. In other words, the image segmentation system can apply various image processing techniques or processes such as template comparison, edge or other feature detection, or Hamming distance analysis to the image to identify the reference segment.
In some implementations, the image processing techniques or processes are directed to or tuned to identify reference segments of a particular type or class or of any from a group of types or classes of reference segments. In some implementations, the user can indicate to the image segmentation system for which class or type of reference segment the image should be analyzed. For example, the image segmentation system can provide an interface via which the user can indicate (e.g., select a checkbox or radio button) that the picture includes people, and the image segmentation system can identify a face of a person as the reference segment independent of user input. That is, user input identifies the type of reference segment (here, a face) for which the image segmentation system should analyze the image, but the user input does not include data such as a location, position, or area within the image that is used to identify the reference segment.
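As a concrete illustration of identifying a face as a reference segment independent of user input, the following Python sketch uses OpenCV's bundled Haar-cascade frontal-face detector and treats the largest detection as the reference segment. The detector choice, parameter values, and function name are illustrative assumptions rather than requirements of the implementations discussed herein.

```python
# A minimal, hypothetical sketch: identify a face as a reference segment
# independent of user input, using OpenCV's bundled Haar-cascade detector.
# The cascade file name and detection parameters are illustrative assumptions.
import cv2

def find_reference_segment(image_bgr):
    """Return the bounding box (x, y, w, h) of the most prominent face, or None."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    # Treat the largest detection as the reference segment.
    return max(faces, key=lambda box: box[2] * box[3])
```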
A foreground sample region and a background sample region for a first segment of the object of which the reference segment is a part are then selected (or identified or determined) at block 120 independent of user input. For example, the reference segment can be used at block 120 to identify a foreground sample region and a background sample region for the first segment. Additionally, a foreground sample region and a background sample region for a second segment of the object are identified at block 120. Similar to the foreground and background sample regions for the first segment, the foreground and background sample regions for the second segment can be selected independent of user input.
Foreground and background sample regions can be (or be said to be) associated with particular segments of an object within an image. That is, the foreground and background sample regions can be positioned, sized, and/or oriented within the image (or portions of the image at particular positions and with particular sizes and/or orientations can be selected from the image) to be located within the image relative to the associated segments. As discussed in more detail herein, a discriminative classifier for each segment is generated from the foreground sample region and the background sample region associated with that segment. Accordingly, the discriminative classifier associated with a particular segment can be used to identify that segment with enhanced accuracy.
Furthermore, the locations of the foreground sample region and the background sample region of the first segment can depend on a physical property or attribute of the object. For example, a physical relationship can exist between the reference segment and the first segment, and the locations of the foreground sample region and the background sample region of the first segment can depend on that relationship.
As a more specific example, the reference segment can be a face of a person, and the first segment can be an upper-body or shirt segment of the person. A length of the face (e.g., a measure of the distance between the mouth and eyes) can be defined, and the foreground sample region for the shirt segment (i.e., the first segment) can be placed approximately three lengths of the face vertically below the top of the face (i.e., the reference segment). That is, a physical attribute (here, a relationship) between the size of the face and the location of the shirt segment can be used to determine where in an image to place or from where in an image to select the foreground sample region for the shirt segment. In some implementations, the foreground sample region for the shirt segment can be offset horizontally from the center of the face if the face is oriented to the left or right of the image. For example, if the face is oriented toward the left of the image, the foreground sample region for the shirt segment can be offset an absolute distance or a distance proportional to the orientation of the face to the left. Moreover, in some implementations, a size of the foreground sample region for the shirt segment can vary based on a size of the reference segment, a location of the reference segment, an orientation of the reference segment, some other attribute or property of the reference segment, or a combination thereof.
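The placement heuristic just described can be made concrete with a short sketch. The following Python function is a hypothetical illustration: it places the shirt foreground sample region roughly three face lengths below the top of the face and shifts it horizontally in the direction the face is oriented. Using the face height as a stand-in for the mouth-to-eyes distance, and the specific scale factors, are assumptions for illustration only.

```python
# Hypothetical sketch of the placement heuristic described above. The use of
# face height as a stand-in for the mouth-to-eyes distance, and the specific
# offsets and scale factors, are assumptions for illustration only.
def shirt_foreground_region(face_box, face_orientation=0.0):
    """face_box: (x, y, w, h) of the face (reference segment).
    face_orientation: 0.0 for a frontal face; negative if the face points
    toward the left of the image, positive if toward the right.
    Returns (x, y, w, h) of the shirt foreground sample region."""
    fx, fy, fw, fh = face_box
    face_length = fh  # proxy for a measured face length
    # Approximately three face lengths vertically below the top of the face.
    region_y = fy + 3 * face_length
    # Centered on the face horizontally, shifted toward the facing direction.
    region_x = fx + int(face_orientation * fw)
    # The region's size scales with the size of the reference segment.
    return (region_x, region_y, fw, face_length)
```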
The background sample region for the first segment can also be selected based on the reference segment. For example, similar to the foreground sample region for the shirt segment, the background sample region for the shirt segment can be positioned relative to the foreground sample region for the shirt segment, and a size of the background sample region for the shirt segment can vary based on a physical attribute of the object (here, a person), a size of the reference segment, a location of the reference segment, an orientation of the reference segment, some other attribute or property of the reference segment, or a combination thereof. Although some portions of an image other than an object may be included in foreground sample regions and some portions of an object may be included in background sample regions, as discussed above, foreground sample regions are placed or selected to primarily include one or more portions of an object, and background sample regions are placed or selected to primarily include portions of an image other than the object. Moreover, in some implementations, a background sample region can partially overlap with a foreground sample region.
Accordingly, the background sample region for the first segment can be offset vertically and/or horizontally from the reference segment and/or foreground sample region for the first segment. Said differently, the location and/or size of the background sample region for the first segment can be determined solely based on the reference segment, or based on the reference segment and the foreground sample region for the first segment. Although the background sample region for the first segment may be determined based on the foreground sample region for the first segment, the background sample region for the first segment can be said to be based on the reference segment because the foreground sample region for the first segment is based on the reference segment.
As a more specific example, the location of the background sample region for the first segment can be determined as a vertical offset and a horizontal offset from a location and an orientation of the reference segment; and the size of the background sample region for the first segment can be determined from the size of the reference segment. As another specific example, the location of the background sample region for the first segment can be determined as a vertical offset and a horizontal offset from a location of the foreground sample region of the first segment; and the size of the background sample region for the first segment can be determined from the size of the reference segment.
As another example, the foreground sample region can be described by (X, Y), where x1<X<x2 and y1<Y<y2. That is, the image can be described in a Cartesian coordinate system where X includes a number of x-coordinates in one dimension (the x dimension) between points x1 and x2, and Y includes a number of y-coordinates in another dimension (the y dimension) between points y1 and y2. The background sample region (or a group of background sample regions) can be selected by identifying the background of the image for the segment as follows. The foreground sample region can be extended by m in the x dimension and n in the y dimension to (X′, Y′), where x1−m<X′<x2+m and y1−n<Y′<y2+n, and the background of the image can be defined by (X″, Y″), where X″<x1−m or X″>x2+m, and Y″<y1−n or Y″>y2+n. One or more background sample regions can then be selected from the background, (X″, Y″), of the image for the segment. In some implementations, m and n are equal.
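A minimal sketch of this background selection in Python/NumPy follows; it assumes axis-aligned pixel rectangles and treats the background as the complement of the expanded foreground region (X′, Y′), from which background sample regions could then be drawn. The margins m and n are illustrative parameters.

```python
import numpy as np

def background_pixels(image_shape, fg_box, m=10, n=10):
    """Return a boolean mask that is True at background pixels for a segment:
    pixels outside the foreground box (x1, y1, x2, y2) expanded by m in the
    x dimension and n in the y dimension (the region (X', Y') above)."""
    h, w = image_shape[:2]
    x1, y1, x2, y2 = fg_box
    ys, xs = np.mgrid[0:h, 0:w]
    inside_expanded = ((xs > x1 - m) & (xs < x2 + m) &
                       (ys > y1 - n) & (ys < y2 + n))
    return ~inside_expanded  # background: everything outside (X', Y')
```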
After the foreground sample region and the background sample region for the first segment and the second segment are selected at block 120, discriminative classifiers are defined for the first segment at block 130 and the second segment at block 140. In other words, using samples such as pixel values within the sample regions, discriminative classifiers for or associated with each of the first segment and the second segment are generated. As discussed above, because a discriminative classifier for each segment is generated from the foreground sample region and the background sample region associated with that segment, that discriminative classifier can be said to be tuned to that segment (or the specific visual characteristics or traits of that segment). Accordingly, such discriminative classifiers can be used to identify segments with enhanced accuracy.
A discriminative classifier is a framework (e.g., data and operations for that data) to distinguish between two or more classes or classifications. In contrast with generative models, a discriminative classifier does not define a description or approximation of members of the classes the discriminative classifier classifies. For example, generative models can define a class in terms of a probabilistic or statistical distribution. Such models can be said to be generative models because a model can be used to generate a sample that is a member of the class modeled by that model. A discriminative classifier, rather, determines to which class a sample (or input value or collection of values) belongs by modeling the differences between or among the classes. Said differently, a model defines (or attempts to define) what a class is, whereas a discriminative classifier describes differences between two or more classes.
As examples of discriminative classifiers, support vector machines, conditional random fields, and some neural networks are discriminative classifiers. As another example, random forests (or random decision forests) are discriminative classifiers. Once trained for two or more classes (e.g., groups of data sets), such discriminative classifiers can accept a descriptor of a sample, and output an indication of to which class the sample belongs.
As a specific example of blocks 130 and 140, samples of the object and other portions of the image (i.e., the background of the image) can be accessed from the foreground sample region and the background sample region for the first segment, respectively. The samples can be, for example, pixel values in a color space such as RGB, CMYK, or YCbCr. Additionally, information such as texture information or gradient information can be generated from the samples, and can be included with the pixel values in a descriptor for each sample. That is, a descriptor including a color component, a texture component, and a gradient component can be defined for each sample within the foreground sample region and the background sample region for the first segment.
The descriptors for samples in the foreground sample region and the descriptors for samples in the background sample region for the first segment can be used as training data to define (e.g., train) at block 130 a discriminative classifier for the first segment that classifies other descriptors defined from sample values (e.g., pixel values) within the image as belonging to the foreground (here, the first segment of the object to which the reference segment belongs) or the background (i.e., everything in the image other than the first segment of the object). The same methodology can be applied to the descriptors for samples in the foreground sample region and the descriptors for samples in the background sample region for the second segment at block 140 to define the discriminative classifier for the second segment, which classifies other descriptors defined from sample values (e.g., pixel values) within the image as belonging to the foreground (here, the second segment of the object to which the reference segment belongs) or the background (here, everything in the image other than the second segment of the object).
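As a hedged sketch of this training step, the following Python code uses scikit-learn's RandomForestClassifier (random forests being among the discriminative classifier families named above) to train a per-segment foreground/background classifier. The descriptor arrays and the number of trees are illustrative assumptions.

```python
# A hedged sketch of blocks 130 and 140. The descriptor arrays, with one row
# per sample, are assumed to come from the foreground and background sample
# regions of a single segment.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_segment_classifier(descriptors_fg, descriptors_bg):
    """Train a foreground/background classifier for one segment."""
    X = np.vstack([descriptors_fg, descriptors_bg])
    y = np.concatenate([np.ones(len(descriptors_fg)),    # 1 = segment (foreground)
                        np.zeros(len(descriptors_bg))])  # 0 = background
    clf = RandomForestClassifier(n_estimators=100)
    clf.fit(X, y)
    return clf
```

The function would be invoked once per segment: once with the first segment's descriptors (block 130) and once with the second segment's (block 140).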
In some implementations, process 100 can be combined with other processes.
At block 210, a first discriminative classifier is defined for a first segment of an object within an image, and a second discriminative classifier is defined for a second segment of the object at block 220.
Descriptors are defined or derived from the samples within a foreground sample region (e.g., either for the first segment or the second segment) at block 291. A descriptor is a data set that describes a sample. For example, as discussed above, a sample can be a pixel value, and a descriptor for that pixel value can include one or more color components, texture components, gradient components, and/or other components determined from that pixel value and/or neighboring pixel values. That is, a descriptor can include information different from and/or in addition to raw sample values such as information derived or synthesized from the sample described or represented by that descriptor and other samples in some proximity to that sample. Said differently, a descriptor for a sample can include more dimensions or have a higher order (i.e., a number of dimensions) than the dimensions or order of the sample.
Some dimensions of a descriptor can include texture information (e.g., variations in color or texture within a portion of an image). As an example, for each sample, a local binary patterns (LBP) histogram can be generated using a sample window such as a 7×7 sample (e.g., pixel) window, a 9×9 sample window, or a 5×5 sample window around that sample. In some implementations, the LBP histogram can be quantized into a number of bins, and the number of values in each bin can be included within the descriptor. As a specific example, the LBP histogram can be quantized into four bins (e.g., one bin each for values between 0 and 63, between 64 and 127, between 128 and 191, and between 192 and 255), and the number of values in each of the four bins can be included within the descriptor. Such a histogram has four dimensions—one for each bin. Thus, in this example, the texture component of a descriptor is four-dimensional. In other implementations, a texture component of a descriptor can be defined using other methodologies and/or can include more or fewer dimensions.
Moreover, some dimensions of a descriptor can include gradient information (e.g., variations in color or intensity of an image along a particular direction or vector). As an example, for each sample, a histogram of oriented gradients (HOG) can be generated using a sample region such as a 7×7 sample region, a 9×9 sample region, or a 5×5 sample region around that sample. In some implementations, the HOG can be quantized into a number of bins, and the number of values in each bin can be included within the descriptor. As a specific example, the HOG can be quantized into four bins (e.g., one bin each for values between 0 and 89 degrees, between 90 and 179 degrees, between 180 and 269 degrees, and between 270 and 359 degrees), and the number of values in each of the four bins can be included within the descriptor. Thus, in this example, the gradient component of a descriptor is four-dimensional. In other implementations, a gradient component of a descriptor can be defined using other methodologies and/or can include more or fewer dimensions.
As a specific example, each sample (e.g., pixel value) within a sample region can include a value between 0 and 255 for each component (i.e., red, green, and blue) of an RGB color space. A descriptor for each sample can include the three color space values for that sample (e.g., a three-dimensional color component), four LBP texture values (e.g., a four-dimensional texture component as discussed above), and four HOG values (e.g., a four-dimensional gradient component as discussed above). Thus, each sample has 3 dimensions (i.e., an order of 3), and the descriptor for each sample has 11 dimensions (i.e., an order of 11).
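The following Python sketch assembles such an 11-dimensional descriptor for one pixel: a 3-dimensional RGB color component, a 4-bin LBP histogram over a 7×7 window, and a 4-bin gradient-orientation histogram over the same window. The window size and bin edges follow the example above; the use of scikit-image's local_binary_pattern, the window clipping at image borders, and the function name are illustrative assumptions.

```python
# Illustrative sketch of the 11-dimensional descriptor described above.
import numpy as np
from skimage.feature import local_binary_pattern

def pixel_descriptor(image_rgb, gray, x, y, half_win=3):
    """Return an 11-dimensional descriptor for the pixel at (y, x);
    half_win=3 yields a 7x7 window around the pixel."""
    h, w = gray.shape
    y0, y1 = max(0, y - half_win), min(h, y + half_win + 1)
    x0, x1 = max(0, x - half_win), min(w, x + half_win + 1)

    # Color component: the pixel's raw RGB values (3 dimensions).
    color = image_rgb[y, x].astype(float)

    # Texture component: LBP codes in the window, quantized into 4 bins.
    lbp = local_binary_pattern(gray[y0:y1, x0:x1], P=8, R=1, method="default")
    texture, _ = np.histogram(lbp, bins=[0, 64, 128, 192, 256])

    # Gradient component: gradient orientations in the window, in 4 bins.
    gy, gx = np.gradient(gray[y0:y1, x0:x1].astype(float))
    angles = np.degrees(np.arctan2(gy, gx)) % 360
    gradient, _ = np.histogram(angles, bins=[0, 90, 180, 270, 360])

    return np.concatenate([color, texture, gradient])  # 3 + 4 + 4 = 11 dims
```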
Descriptors are defined or derived from the samples within a background sample region (e.g., either for the first segment or the second segment) at block 292 similarly as at block 291. A discriminative classifier based on the descriptors for the foreground sample region and the descriptors for the background sample region is then generated (or defined) at block 293. For example, as discussed above, the discriminative classifier for a segment can be defined by training the discriminative classifier (e.g., a random forest) using the descriptors for the foreground sample region and the descriptors for the background sample region. That is, the descriptors for the foreground sample region and the descriptors for the background sample region are provided to a framework representing the discriminative classifier for a segment to train the discriminative classifier based on the data (e.g., the samples and information derived or synthesized from the samples) within the descriptors defined at blocks 291 and 292.
After the first and second discriminative classifiers are defined at block 210 and 220, the first segment (or at least a portion thereof) and the second segment (or at least a portion thereof) are identified at blocks 230 and 240 using the first discriminative classifier and the second discriminative classifier, respectively. In other words, pixel values (or descriptors derived from those pixel values) from the image are provided to the first discriminative classifier to determine whether those pixel values are part of the first segment, and pixel values (or descriptors derived from those pixel values) from the image are provided to the second discriminative classifier to determine whether those pixel values are part of the second segment.
In some implementations, descriptors for each pixel value of the image are provided to the first discriminative classifier, and are marked, flagged, or annotated as part of the first segment (or in the foreground) or not part of the first segment (or in the background) based on output values from the first discriminative classifier. Such descriptors can be generated using the same or similar methodologies discussed above in relation to blocks 291 and 292. In some implementations, descriptors for each pixel value in the image are provided to the first and second discriminative classifiers. In other implementations, only descriptors for pixel values within some proximity of the foreground and background sample regions of the first segment are provided to the first discriminative classifier, and only descriptors for pixel values within some proximity of the foreground and background sample regions of the second segment are provided to the second discriminative classifier. In other words, in some implementations, only portions of the image local to the sample regions for a segment are analyzed by the discriminative classifier for that segment.
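Combining the earlier sketches, blocks 230 and 240 could look like the following: a descriptor is defined for each pixel and classified, and the results are collected into a binary mask. This is a sketch under the assumption that the classifier and descriptor function from the earlier examples are used; for brevity it classifies every pixel, although, per the note above, an implementation could restrict classification to pixels near the segment's sample regions.

```python
# Sketch of blocks 230 and 240, assuming the classifier and descriptor
# function from the earlier sketches.
import numpy as np

def identify_segment(image_rgb, gray, clf, descriptor_fn):
    """Return a boolean mask marking pixels classified as part of the segment."""
    h, w = gray.shape
    descriptors = np.array([descriptor_fn(image_rgb, gray, x, y)
                            for y in range(h) for x in range(w)])
    labels = clf.predict(descriptors)  # 1 = segment, 0 = background
    return labels.reshape(h, w).astype(bool)
```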
A description of the first segment can then be defined based on the pixel values that are determined to be part of or included within the first segment. Similarly, a description of the second segment can be defined based on the pixel values that are determined to be part of or included within the second segment. The descriptions of the first segment and second segment can be represented in a variety of forms or formats. For example, a description can be a binary bit map or mask with an element (e.g., bit) for each pixel value in the image. If the element has a true value (e.g., a value of 1), the pixel value that corresponds to that element is included in the segment represented by that description. Similarly, if the element has a false value (e.g., a value of 0), the pixel value that corresponds to that element is not included in the segment represented by that description. In other implementations, a description of a segment can include a list of vertices of one or more polygons that define a perimeter of that segment, a definition of a shape or shapes that define a perimeter of that segment, or a list of coordinates (e.g., Cartesian coordinates relative to an origin of the image) of pixel values that are included in the segment represented by that description.
In some implementations, the first and second discriminative classifiers may output false negatives (e.g., determine that some pixel values that are included in the first or second segment, respectively, are not included in that segment) and false positives (e.g., determine that some pixel values that are not included in the first or second segment, respectively, are included in that segment). Thus, the descriptions of the first and second segments may have errors or defects (e.g., include pixel values that are not included in a particular segment or exclude pixel values that are included in a particular segment) or not be entirely accurate.
The segmentation is then refined at block 250. For example, the description of the first segment and the description of the second segment can be combined to define a description of the object. The description of the object can be provided to a segmentation refinement engine to refine the identification of the object (or the first and second segments). For example, some segmentation methodologies such as graph cuts can produce highly refined segmentation in localized portions of an image if provided with an accurate description of the object to be segmented. Because the description of the object describes much of the object, this description can be provided to the segmentation refinement engine as the description of the object to be segmented, and the segmentation refinement engine can refine the identification of pixel values included in the object (e.g., in the first segment and second segment).
Processes 100 and 200 can be implemented or performed at a variety of modules (i.e., combinations of hardware and software), such as the modules of the image segmentation system discussed below. Furthermore, functionalities discussed in relation to particular modules can, in other implementations, be included at different modules, engines, or elements.
Sample engine 310 selects sample regions (i.e., foreground sample regions and background sample regions) for segments of an object within an image. For example, sample engine 310 can select sample regions based on physical attributes of an object (or a class of objects) and/or attributes of a reference segment. As discussed above, sample engine 310 can use a reference segment and/or such physical properties or attributes to select sample regions independent of user input. Additionally, sample engine 310 provides a description of sample regions to descriptor module 330.
Reference module 320 identifies a reference segment of an object within an image. For example, reference module 320 can implement various image processing methodologies such as edge detection, character recognition, skin tone or texture recognition, facial feature recognition, and/or template matching to identify a reference segment. Moreover, reference module 320 can provide a description of the reference segment to sample engine 310, and sample engine 310 can use attributes of the reference segment to select sample regions.
Descriptor module 330 defines descriptors for sample regions. For example, as discussed above in relation to blocks 291 and 292, descriptor module 330 can define a descriptor including a color component, a texture component, and/or a gradient component for each sample within the foreground and background sample regions described by sample engine 310.
Discriminative classifier generator 340 receives the descriptors generated at descriptor module 330, and defines (or generates) a discriminative classifier for segments of the object within the image using descriptors for each segment. In some implementations, discriminative classifier generator 340 can generate and train a random forest for each segment using the descriptors defined using the samples from the foreground and the background sample regions. In other implementations, other frameworks such as support vector machines can be generated and/or trained for each segment at discriminative classifier generator 340.
Analysis module 350 analyzes the image using discriminative classifiers defined at discriminative classifier generator 340. In other words, analysis module 350 applies portions of the image (e.g., pixel values) to the discriminative classifiers to determine which portions of the image are included in the segments associated with those discriminative classifiers. Said differently, analysis module 350 applies the discriminative classifiers to portions of the image to identify the segments (or portions thereof) associated with those discriminative classifiers.
In some implementations, analysis module 350 identifies other segments of an object using methodologies other than discriminative classifiers. For example, if the object is a person and a face segment is the reference segment for the object, analysis module 350 can generate a model such as a Gaussian mixture model (GMM) (or use a GMM generated at a different module) for skin tones or textures. The model can then be applied to the image to identify skin segments of the person (e.g., portions of the image that include parts of the person with exposed skin).
As another example, subtraction methodologies such as background subtraction can be used to identify other segments of an object. As an example with a person as the object and a lower-leg segment (e.g., below the knees of the person), the lower-leg segment can be identified by background subtraction. More specifically for this example, a background region for the lower-leg segment can be defined by masking skin segments below a pant (here, knee-length pants or shorts) segment. In some implementations, the skin segments can be dilated, and the dilated skin segments and a central region of the image below the pant segment can be masked (or ignored). The unmasked or remaining portions or regions of the image in some proximity to the masked dilated skin segments and masked central region can then be used to generate a model. That is, for example, pixel values in those remaining portions can be used to train a GMM for the background of the lower-leg segment. The model can then be applied to the image to subtract the lower-leg segment from the background. For example, the portions of the image below the pant segment that match or satisfy the GMM for the background can be subtracted from the image (or marked or flagged as not part of the lower-leg segment).
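As a hedged sketch of the GMM-based approach, the following Python code fits a Gaussian mixture model to skin-tone samples drawn from the face (reference) segment and scores every pixel of the image; the number of mixture components and the log-likelihood threshold are illustrative assumptions. The same machinery, trained on background samples instead of skin samples, could support the background-subtraction approach described above.

```python
# Hedged sketch of a GMM skin model derived from the face segment.
import numpy as np
from sklearn.mixture import GaussianMixture

def skin_mask(image_rgb, face_mask, n_components=3, threshold=-12.0):
    """face_mask: boolean mask of the face segment within image_rgb.
    Returns a boolean mask of pixels whose colors fit the skin model."""
    skin_samples = image_rgb[face_mask].astype(float)
    gmm = GaussianMixture(n_components=n_components).fit(skin_samples)
    scores = gmm.score_samples(image_rgb.reshape(-1, 3).astype(float))
    return (scores > threshold).reshape(image_rgb.shape[:2])
```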
Additionally, analysis module 350 can generate descriptions of the segments identified at analysis module 350, which can be provided to segmentation refinement engine 370 or to combination module 360, at which the descriptions of the segments are combined or joined to define a description of the object that is provided to segmentation refinement engine 370. As discussed above, the segments identified at analysis module 350 and described by those descriptions can include errors or defects (e.g., false positive and/or false negative data).
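Under the assumption that segment descriptions are binary masks (one of the description formats discussed above), combination module 360 reduces to a union of masks, as in this minimal sketch:

```python
# Trivial sketch of combination module 360, assuming segment descriptions are
# binary masks.
import numpy as np

def combine_segment_descriptions(*segment_masks):
    """Union of per-segment boolean masks: a description of the whole object."""
    return np.logical_or.reduce(segment_masks)
```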
Segmentation refinement engine 370 further refines the segmentation to more accurately extract the object. For example, segmentation refinement engine 370 can implement graph cut methodologies to segment areas of the image in close proximity to the object. More specifically, for example, segmentation refinement engine 370 can receive the descriptions of the segments identified at analysis module 350 (or a description of the object defined from those descriptions) as input, and define a portion of the image adjacent to (or about) those segments as an unknown area. For example, analysis module 350 can define a description of a 10-pixel-wide periphery around the segments identified at analysis module 350, and can provide that description to segmentation refinement engine 370. Segmentation refinement engine 370 can then apply, for example, a graph cut process to the description of the segments and the description of the periphery to determine which pixels in the periphery and at the edges of the identified segments should be included in the segments. Segmentation refinement engine 370 can then output a refined description of the object. This description can be used, for example, to copy the object from the image (e.g., copy the pixel values of the object); and that copy of the object can be stored separate from the image, inserted into another image, or otherwise manipulated separately from the image.
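A hedged sketch of such a refinement step follows, using OpenCV's grabCut (a graph-cut implementation) in mask-initialization mode. The 10-pixel unknown periphery follows the example above; treating an eroded interior as sure foreground, and the use of grabCut itself, are implementation assumptions rather than requirements.

```python
# Hedged sketch of a graph-cut refinement in the spirit of segmentation
# refinement engine 370, using OpenCV's grabCut in mask-initialization mode.
import cv2
import numpy as np

def refine_object_mask(image_bgr, object_mask, periphery=10, iters=5):
    """object_mask: boolean mask of the object as identified by the classifiers."""
    obj = object_mask.astype(np.uint8)
    band = np.ones((2 * periphery + 1, 2 * periphery + 1), np.uint8)
    outer = cv2.dilate(obj, band) > 0  # the object plus its periphery
    inner = cv2.erode(obj, band) > 0   # interior treated as sure foreground

    gc_mask = np.full(obj.shape, cv2.GC_BGD, np.uint8)  # sure background
    gc_mask[outer] = cv2.GC_PR_FGD  # unknown: periphery and segment edges
    gc_mask[inner] = cv2.GC_FGD     # sure foreground

    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, gc_mask, None, bgd, fgd, iters, cv2.GC_INIT_WITH_MASK)
    return np.isin(gc_mask, (cv2.GC_FGD, cv2.GC_PR_FGD))
```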
As a specific example of the operation of an image segmentation system, an image 400 including an object 410 (here, a person with a face segment 414 as a reference segment) can be segmented as follows. Foreground and background sample regions 491-496 are selected for segments of object 410 based on the reference segment, and foreground descriptors 481, 483, and 485 and background descriptors 482, 484, and 486 are defined from the samples within those sample regions.
Foreground descriptors 481, background descriptors 482, foreground descriptors 483, background descriptors 484, foreground descriptors 485, and background descriptors 486 are then provided to discriminative classifier generator 340, which defines discriminative classifiers 471, 472, and 473 from the descriptors for the respective segments.
Discriminative classifiers 471, 472, and 473 and image 400 are then provided to analysis module 350, which applies the discriminative classifiers to image 400 to identify the segments of object 410 associated with those discriminative classifiers.
In some implementations, analysis module 350 also identifies arm segments 415 and 416 and/or leg segments 417 and 418, and defines description 504 of those segments. For example, a description of face segment 414 can be provided to analysis module 350, and analysis module 350 can identify arm segments 415 and 416 and/or leg segments 417 and 418 based on face segment 414. As a more specific example, analysis module 350 can define a model of skin tone and/or texture of object 410 based on face segment 414, and can identify portions of image 400 that fit the model as arm segments 415 and 416 and/or leg segments 417 and 418 based on physical attributes of object 410. For example, analysis module 350 can be configured to identify objects that are persons, and can identify arm segments 415 and 416 and/or leg segments 417 and 418 based on a model derived from face segment 414 and the anatomy or physiology of the human body.
As another example, analysis module 350 can define a discriminative classifier for face segment 414 using, for example, a foreground sample region that includes face segment 414 and one or more of sample regions 491-496 and/or other sample regions as background sample regions. That discriminative classifier can then be applied to image 400 at analysis module 350 to identify arm segments 415 and 416 and/or leg segments 417 and 418.
The image segmentation system defines a portion of image 400 adjacent to object 410 as an unknown region of image 400. This portion of image 400 is designated region 610.
Description 601 and descriptions of regions 610 and 620 can be provided to segmentation refinement engine 370 to generate description 701 of object 410.

An image segmentation system such as image segmentation system 533 can be hosted at a computing system such as computing system 500, which includes processor 510, communications interface 520, and memory 530.
Processor 510 is any combination of hardware and software that executes or interprets instructions, codes, or signals. For example, processor 510 can be a microprocessor, an application-specific integrated circuit (ASIC), a distributed processor such as a cluster or network of processors or computing systems, a multi-core or multi-processor processor, or a virtual or logical processor of a virtual machine.
Communications interface 520 is a module via which processor 510 can communicate with other processors or computing systems via a communications link. For example, communications interface 520 can include a network interface card and a communications protocol stack hosted at processor 510 (e.g., instructions or code stored at memory 530 and executed or interpreted at processor 510 to implement a network protocol) to communicate with clients to receive images. As specific examples, communications interface 520 can be a wired interface, a wireless interface, an Ethernet interface, a Fibre Channel interface, an InfiniBand interface, an IEEE 802.11 interface, or some other communications interface via which processor 510 can exchange signals or symbols representing data to communicate with other processors or computing systems.
Memory 530 is a processor-readable medium that stores instructions, codes, data, or other information. As used herein, a processor-readable medium is any medium that stores instructions, codes, data, or other information non-transitorily and is directly or indirectly accessible to a processor. Said differently, a processor-readable medium is a non-transitory medium at which a processor can access instructions, codes, data, or other information. For example, memory 530 can be a volatile random access memory (RAM), a persistent data store such as a hard disk drive or a solid-state drive, a compact disc (CD), a digital video disc (DVD), a Secure Digital™ (SD) card, a MultiMediaCard (MMC) card, a CompactFlash™ (CF) card, or a combination thereof or other memories. Said differently, memory 530 can represent multiple processor-readable media. In some implementations, memory 530 can be integrated with processor 510, separate from processor 510, or external to computing system 500.
Memory 530 includes instructions or codes that when executed at processor 510 implement operating system 531 and image segmentation system 533 (and the components or modules of image segmentation system 533). Said differently, image segmentation system 533, or the modules that define image segmentation system 533, is hosted at computing system 500.
In some implementations, computing system 500 can be a virtualized computing system. For example, computing system 500 can be hosted as a virtual machine at a computing server. Moreover, in some implementations, computing system 500 can be a virtualized computing appliance, and operating system 531 is a minimal or just-enough operating system to support (e.g., provide services such as a communications protocol stack and access to components of computing system 500 such as communications interface 520) image segmentation system 533.
Image segmentation system 533 can be accessed or installed at computing system 500 from a variety of memories or processor-readable media. For example, computing system 500 can access image segmentation system 533 at a remote processor-readable medium via communications interface 520. As a specific example, computing system 500 can be a thin client that accesses operating system 531 and image segmentation system 533 during a boot sequence.
As another example, computing system 500 can include a processor-readable medium access device (not illustrated), such as a DVD drive, and can access image segmentation system 533 at a processor-readable medium such as a DVD via that device.
In some implementations, image segmentation system 533 can be accessed at or installed from multiple sources, locations, or resources. For example, some component of image segmentation system 533 can be installed via a communications link, and other components of image segmentation system 533 can be installed from a DVD.
In other implementations, image segmentation system 533 can be distributed across multiple computing systems. That is, some components of image segmentation system 533 can be hosted at one computing system and other components of image segmentation system 533 can be hosted at another computing system or computing systems. As a specific example, image segmentation system 533 can be hosted within a cluster of computing systems where each component of image segmentation system 533 is hosted at multiple computing systems, and no single computing system hosts each component of image segmentation system 533.
While certain implementations have been shown and described above, various changes in form and details may be made. For example, some features that have been described in relation to one implementation and/or process can be related to other implementations. In other words, processes, features, components, and/or properties described in relation to one implementation can be useful in other implementations. As another example, functionalities discussed above in relation to specific modules or elements can be included at different modules, engines, or elements in other implementations. Furthermore, it should be understood that the systems, apparatus, and methods described herein can include various combinations and/or sub-combinations of the components and/or features of the different implementations described. Thus, features described with reference to one or more implementations can be combined with other implementations described herein.
As used herein, the term “module” refers to a combination of hardware (e.g., a processor such as an integrated circuit or other circuitry) and software (e.g., machine- or processor-executable instructions, commands, or code such as firmware, programming, or object code). A combination of hardware and software includes hardware only (i.e., a hardware element with no software elements), software hosted at hardware (e.g., software that is stored at a memory and executed or interpreted at a processor), or hardware and software hosted at hardware.
Additionally, as used herein, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, the term “module” is intended to mean one or more modules or a combination of modules. Moreover, the term “provide” as used herein includes push mechanisms (e.g., sending data via a communications path or channel), pull mechanisms (e.g., delivering data in response to a request), and store mechanisms (e.g., storing data at a data store or service at which the data can be accessed). Furthermore, as used herein, the term “based on” means based at least in part on. Thus, a feature that is described as based on some cause, stimulus, or data can be based only on that cause, stimulus, or data, or based on that cause, stimulus, or data and on one or more other causes, stimuli, or data.