This disclosure relates to computer graphics and specifically to raw drawings or sketches as represented on a computer or similar processing environment. Particular embodiments provide automated methods for consolidating the strokes typically present in raw sketches into a number of artist-intended curves. Particular embodiments provide automated methods for converting computer representations of raw sketches into computer representations of artist-intended curves.
Freehand line drawing provides a natural avenue for artists to quickly communicate shapes, ideas and images. When creating line drawings from scratch, artists often employ oversketching, using groups of multiple raw strokes to depict their intended, aggregate, curves.
There is a general desire to provide automated methods for consolidating the strokes 12, 22 typically present in raw sketches 10, 20 into a number of artist-intended curves 112, 122 to provide consolidated drawings 110, 120. There is a general desire to provide automated methods for converting computer representations of raw sketches 10, 20 and their pluralities of strokes 12, 22 into computer representations of artist-intended curves 112, 122 to provide computer representations of consolidated drawings 110, 120.
The foregoing examples of the related art and limitations related thereto are intended to be illustrative and not exclusive. Other limitations of the related art will become apparent to those of skill in the art upon a reading of the specification and a study of the drawings.
The following embodiments and aspects thereof are described and illustrated in conjunction with systems, tools and methods which are meant to be exemplary and illustrative, not limiting in scope. In various embodiments, one or more of the above-described problems have been reduced or eliminated, while other embodiments are directed to other improvements.
One aspect of the invention provides a method for converting a raw drawing comprising a plurality of strokes into an artist-intended curve drawing. The method comprises: obtaining, at a computer system, a computer representation of a raw drawing, the raw drawing comprising a plurality of strokes, each stroke represented in a vector format in the computer system; clustering, by the computer system, the plurality of strokes into one or more clusters, each cluster comprising a corresponding group of one or more strokes; for each of the one or more clusters, performing a curve fitting, by the computer system, to thereby determine a computer representation of a corresponding aggregate curve that is fitted to the group of strokes in the cluster; and generating, by the computer system, a computer representation of an artist-intended curve drawing corresponding to the raw drawing, the artist-intended curve drawing comprising the aggregate curve corresponding to each cluster in place of the group of one or more strokes corresponding to each cluster. Clustering, by the computer system, the plurality of strokes into one or more clusters comprises performing, by the computer system, a plurality of iterative procedures to either group strokes into precursor clusters or to divide precursor clusters into precursor sub-clusters based on one or more models of human perception of raw drawings comprising pluralities of strokes.
The plurality of iterative procedures to either group strokes into precursor clusters or to divide precursor clusters into precursor sub-clusters may comprise: generating, by the computer system, a precursor aggregate curve corresponding to a precursor cluster by performing, by the computer system, a curve fitting to fit the precursor aggregate curve to one or more strokes within the precursor cluster; for each of a plurality of discrete points on the precursor aggregate curve, determining, by the computer system, at least one parameter of the one or more models; and evaluating, by the computer system, whether the precursor cluster should be divided into precursor sub-clusters or combined with other strokes based on the at least one parameter determined at at least some of the plurality of discrete points on the precursor aggregate curve.
For each of the plurality of discrete points on the precursor aggregate curve, determining, by the computer system, the at least one parameter may comprise: determining, by the computer system, a first tangent t′ to the precursor aggregate curve at the point p′ on the precursor aggregate curve; determining, by the computer system, a second tangent t to a stroke in the precursor cluster at a point p on the stroke closest to the point p′ on the precursor aggregate curve; and determining, by the computer system, an angular distance between the first and second tangents (t, t′).
Evaluating, by the computer system, whether the precursor cluster should be divided into precursor sub-clusters or combined with other strokes based on the at least one parameter determined at at least some of the plurality of discrete points on the precursor aggregate curve may comprise determining, by the computer system, an aggregate angular distance between the stroke in the precursor cluster and the precursor aggregate curve based at least in part on a sum of the angular distances between the first and second tangents determined at the at least some of the plurality of discrete points on the precursor aggregate curve.
Evaluating, by the computer system, whether the precursor cluster should be divided into precursor sub-clusters or combined with other strokes based on the at least one parameter determined at at least some of the plurality of discrete points on the precursor aggregate curve may comprise determining, by the computer system, an aggregate angular distance between the stroke in the precursor cluster and a second stroke in the precursor cluster based on: the aggregate angular distance between the stroke in the precursor cluster and the precursor aggregate curve; and an aggregate angular distance between the second stroke in the precursor cluster and the precursor aggregate curve.
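By way of non-limiting illustration, the per-stroke aggregate angular distance described above may be sketched as follows. The sketch assumes strokes and the precursor aggregate curve are polylines given as lists of (x, y) tuples, estimates tangents by finite differences, approximates the closest point p on the stroke by its closest vertex, and takes the angular distance to be the unsigned angle between unit tangents. All function names are hypothetical, and the rule for combining two such distances into a stroke-to-stroke distance is left out because it is not specified here.

```python
import math

def unit_tangents(polyline):
    """Finite-difference unit tangent at each vertex of a polyline."""
    n = len(polyline)
    tangents = []
    for k in range(n):
        x0, y0 = polyline[max(k - 1, 0)]
        x1, y1 = polyline[min(k + 1, n - 1)]
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy) or 1.0  # guard against repeated vertices
        tangents.append((dx / norm, dy / norm))
    return tangents

def closest_vertex(polyline, point):
    """Index of the polyline vertex nearest to `point` (vertex-level approximation of p)."""
    return min(range(len(polyline)), key=lambda k: math.dist(polyline[k], point))

def angular_distance(t, t_prime):
    """Unsigned angle, in radians, between two unit tangents (an assumed definition)."""
    dot = max(-1.0, min(1.0, t[0] * t_prime[0] + t[1] * t_prime[1]))
    return math.acos(dot)

def aggregate_angular_distance(stroke, aggregate_curve):
    """Sum of angular distances between t' at each curve point p' and t at the closest stroke point p."""
    stroke_t = unit_tangents(stroke)
    curve_t = unit_tangents(aggregate_curve)
    total = 0.0
    for k, p_prime in enumerate(aggregate_curve):
        j = closest_vertex(stroke, p_prime)
        total += angular_distance(stroke_t[j], curve_t[k])
    return total
```

A stroke running parallel to the aggregate curve yields a value near zero, while a stroke crossing it at right angles accumulates roughly π/2 per sampled point.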
The plurality of iterative procedures to either group strokes into precursor clusters or to divide precursor clusters into precursor sub-clusters may comprise: determining, by the computer system, for each of a plurality of pairs of strokes (Si,Sj) within the plurality of strokes of the raw drawing, an aggregate angular distance between the pair of strokes (Si,Sj); assigning, by the computer system, an angular compatibility score ComA(Si,Sj) to each of the plurality of pairs of strokes (Si,Sj) based on the aggregate angular distance between the pair of strokes (Si,Sj); and performing, by the computer system, an optimization which maximizes ΣijComA(Si,Sj)Yij, where Yij=1 if the pair of strokes (Si,Sj) is grouped into a precursor cluster and Yij=0 otherwise.
The plurality of iterative procedures to either group strokes into precursor clusters or to divide precursor clusters into precursor sub-clusters may comprise: determining, by the computer system, for each of a plurality of pairs of strokes (Si,Sj) within the plurality of strokes of the raw drawing, an aggregate angular distance between the pair of strokes (Si,Sj); assigning, by the computer system, an angular compatibility score ComA(Si,Sj) to each of the plurality of pairs of strokes (Si,Sj) based on the aggregate angular distance between the pair of strokes (Si,Sj); and performing, by the computer system, an optimization, which maximizes ΣijComA(Si,Sj)Yij, where Yij=1 if the pair of strokes (Si,Sj) is grouped into a precursor cluster and Yij=0 otherwise, to group strokes into precursor clusters.
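The objective ΣijComA(Si,Sj)Yij may be illustrated with an exhaustive search over all groupings of a handful of strokes. This is only a sketch: the solver actually used is not specified here, brute-force enumeration is practical only for very small inputs, and the sketch assumes that ComA is signed (positive for compatible pairs, negative for incompatible pairs) so that placing every stroke in one cluster is not trivially optimal. The names `partitions`, `grouping_score` and `best_grouping`, and the dictionary keyed by `frozenset` pairs, are hypothetical.

```python
from itertools import combinations

def partitions(items):
    """Yield every partition of `items` into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in partitions(rest):
        # Place `first` into each existing group in turn ...
        for k in range(len(partial)):
            yield partial[:k] + [[first] + partial[k]] + partial[k + 1:]
        # ... or open a new group containing only `first`.
        yield [[first]] + partial

def grouping_score(groups, com_a):
    """Sum of ComA over all pairs that share a group (i.e. pairs with Yij = 1)."""
    return sum(com_a[frozenset(pair)]
               for group in groups
               for pair in combinations(group, 2))

def best_grouping(stroke_ids, com_a):
    """Grouping that maximizes the summed compatibility of co-clustered pairs."""
    return max(partitions(list(stroke_ids)),
               key=lambda groups: grouping_score(groups, com_a))

# Illustrative scores only: strokes 0 and 1 are compatible, stroke 2 is not.
example_scores = {
    frozenset((0, 1)): 0.9,
    frozenset((0, 2)): -0.8,
    frozenset((1, 2)): -0.7,
}
example_groups = best_grouping([0, 1, 2], example_scores)
```

Because group membership is transitive, the enumeration over partitions implicitly enforces that Yij=1 and Yjk=1 imply Yik=1.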
For each of the plurality of discrete points on the precursor aggregate curve, determining, by the computer system, the at least one parameter may comprise: projecting, by the computer system, a first ray from the point p′ on the precursor aggregate curve and extending to a left of the precursor aggregate curve in a first orientation orthogonal to the precursor aggregate curve at the point p′; projecting, by the computer system, a second ray from the point p′ on the precursor aggregate curve and extending to a right of the precursor aggregate curve in a second orientation orthogonal to the precursor aggregate curve at the point p′; and determining, by the computer system, the at least one parameter based on a first intersection of the first ray with a first stroke Si from among the plurality of strokes of the raw drawing and a second intersection of the second ray with a second stroke Sj from among the plurality of strokes of the raw drawing.
For each of the plurality of discrete points on the precursor aggregate curve, determining, by the computer system, the at least one parameter may comprise determining an inter-stroke distance between a point p at the first intersection of the first ray with the first stroke Si and a point q at the second intersection of the second ray with the second stroke Sj according to ∥p−q∥.
Evaluating, by the computer system, whether a precursor cluster should be divided into precursor sub-clusters or combined with other strokes based on the at least one parameter determined at at least some of the plurality of discrete points on the precursor aggregate curve may comprise determining a stroke separation parameter Di,j(I1) between the first stroke Si and the second stroke Sj based, at least in part, on a sum, over the discrete points on the precursor aggregate curve in a section I1 of the precursor aggregate curve where the first stroke Si and the second stroke Sj are side-by-side, of the inter-stroke distances ∥p−q∥ between the point p at the first intersection and the point q at the second intersection.
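The ray projection and the resulting stroke separation parameter may be sketched as follows, by way of example only. The sketch assumes polyline strokes, takes the left unit normal at each sampled point p′ as an input, treats the side-by-side section I1 as the set of samples at which both rays hit their strokes, and normalizes the sum of the distances ∥p−q∥ by the number of such samples. The normalization and the names `ray_hit` and `stroke_separation` are assumptions made for illustration.

```python
import math

def ray_hit(origin, direction, polyline):
    """Nearest intersection of a ray with a polyline, or None if the ray misses."""
    ox, oy = origin
    dx, dy = direction
    best_t, best_point = None, None
    for (ax, ay), (bx, by) in zip(polyline, polyline[1:]):
        ex, ey = bx - ax, by - ay
        denom = dx * ey - dy * ex
        if abs(denom) < 1e-12:
            continue  # ray is parallel to this segment
        t = ((ax - ox) * ey - (ay - oy) * ex) / denom  # distance along the ray
        u = ((ax - ox) * dy - (ay - oy) * dx) / denom  # position along the segment
        if t >= 0 and 0 <= u <= 1 and (best_t is None or t < best_t):
            best_t, best_point = t, (ox + t * dx, oy + t * dy)
    return best_point

def stroke_separation(curve_points, left_normals, stroke_i, stroke_j):
    """Mean of ||p - q|| over the curve samples where Si is hit on the left and Sj on the right."""
    distances = []
    for p_prime, (nx, ny) in zip(curve_points, left_normals):
        p = ray_hit(p_prime, (nx, ny), stroke_i)    # first ray, to the left
        q = ray_hit(p_prime, (-nx, -ny), stroke_j)  # second ray, to the right
        if p is not None and q is not None:
            distances.append(math.dist(p, q))
    return sum(distances) / len(distances) if distances else None
```

Samples at which either ray misses its stroke are skipped, so strokes that only partially overlap along the aggregate curve are compared over their shared extent.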
Evaluating, by the computer system, whether the precursor cluster should be divided into precursor sub-clusters or combined with other strokes based on the at least one parameter determined at at least some of the plurality of discrete points on the precursor aggregate curve may comprise: for each precursor cluster C, determining, by the computer system, an internal precursor cluster proximity parameter Dc based on the stroke separation parameters of the nearest neighbor strokes within the precursor cluster; for each pair of precursor clusters C, C′ determining, by the computer system, an intercluster spacing parameter Dc,c′ based on the smallest stroke separation parameter between any two strokes where one of the two strokes belongs to the first precursor cluster C and the second one of the two strokes belongs to the second precursor cluster C′; and determining, by the computer system, that the pair of precursor clusters C, C′ should be merged into a single cluster based on evaluation of one or more merge criteria, the merge criteria based on the internal precursor cluster proximity parameters Dc, Dc′ for each of the precursor clusters C, C′ and the intercluster spacing parameter Dc,c′.
Determining, by the computer system, that the pair of precursor clusters C, C′ should be merged into a single cluster based on evaluation of one or more merge criteria may comprise: merging, by the computer system, the pair of precursor clusters C, C′ if both of:
$D_{c,c'} < T'_d \cdot \max(D_c, D_{c'})$
$\max(D_c, D_{c'}) < T'_d \cdot \min(D_c, D_{c'})$
are true, where T′d is a constant; and maintaining the pair of precursor clusters C, C′ separate otherwise.
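The two merge inequalities may be expressed as a short predicate. This is a sketch only: no value for the constant T′d is given here, so it is passed in as a parameter, and the function and argument names are hypothetical.

```python
def should_merge(d_c, d_c_prime, d_between, t_d):
    """Return True when both merge criteria hold for clusters C and C'.

    d_c, d_c_prime: internal proximity parameters Dc and Dc'.
    d_between:      intercluster spacing parameter Dc,c'.
    t_d:            the constant T'd (caller-supplied; value not specified in the text).
    """
    spacing_ok = d_between < t_d * max(d_c, d_c_prime)
    similar_density = max(d_c, d_c_prime) < t_d * min(d_c, d_c_prime)
    return spacing_ok and similar_density
```

The first test requires the gap between the clusters to be comparable to the spacing inside them; the second requires the two clusters to have comparable internal spacing to begin with.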
Evaluating, by the computer system, whether a precursor cluster should be divided into precursor sub-clusters or combined with other strokes based on the at least one parameter determined at at least some of the plurality of discrete points on the precursor aggregate curve may comprise: for each precursor cluster C, determining, by the computer system, an internal precursor cluster proximity parameter Dc based on the stroke separation parameters of the nearest neighbor strokes within the precursor cluster; for each pair of precursor clusters C, C′ determining, by the computer system, an intercluster spacing parameter Dc,c′ based on a smallest stroke separation parameter between any two strokes where one of the two strokes belongs to the first precursor cluster C and the second one of the two strokes belongs to the second precursor cluster C′; and determining, by the computer system, that the pair of precursor clusters C, C′ should be merged into a single cluster based on evaluation of one or more merge criteria, the merge criteria based on the internal precursor cluster proximity parameters Dc, Dc′ for each of the precursor clusters C, C′ and the intercluster spacing parameter Dc,c′.
The method may comprise: determining, by the computer system, that the first stroke Si and the second stroke Sj are candidates for merger into a precursor cluster if, for any of the plurality of discrete points p′ on the precursor aggregate curve, the inter-stroke distance between a point p at the first intersection of the first ray with the first stroke Si and a point q at the second intersection of the second ray with the second stroke Sj is less than a width threshold, the width threshold based on a width Ws of at least one of the first and second strokes; and otherwise, determining by the computer system, that the first stroke Si and the second stroke Sj are not candidates for merger into the precursor cluster.
For each first stroke Si and second stroke Sj determined to be candidates for merger into the precursor cluster, the method may comprise: determining, by the computer system, that the first stroke Si and the second stroke Sj remain candidates for merger into the precursor cluster based, at least in part, on a sum, over the plurality of discrete points p′ on the precursor aggregate curve, of the angular distances between the tangents t at the point p at the first intersection of the first ray with the first stroke Si and at the point q at the second intersection of the second ray with the second stroke Sj being less than an angular compatibility threshold, where p′=Mi(p)=Mj(q) and Mi and Mj are the mappings from the first and second strokes Si, Sj to the precursor aggregate curve; and otherwise, determining by the computer system, that the first stroke Si and the second stroke Sj are not candidates for merger into the precursor cluster.
The method may comprise for each first stroke Si and second stroke Sj determined to remain candidates for merger into the precursor cluster: determining, by the computer system, that the first stroke Si and the second stroke Sj should be merged into the precursor cluster based, at least in part, on determining, by the computer system, that a length to width ratio of the precursor aggregate curve in a section where the first stroke Si and the second stroke Sj are side by side is greater than a narrowness threshold and, if it is determined that the first stroke Si and the second stroke Sj should be merged into the precursor cluster, merging the first stroke Si and the second stroke Sj into the precursor cluster; and otherwise, determining by the computer system, that the first stroke Si and the second stroke Sj are not candidates for merger into the precursor cluster.
Determining, by the computer system, that a length to width ratio of the precursor aggregate curve in a section where the first stroke Si and the second stroke Sj are side by side is greater than a narrowness threshold comprises traversing the discrete points p′ on the precursor aggregate curve and determining the farthest left and right intersections il(p) and ir(p) with first and second strokes Si, Sj and determining width Wc,ij of the precursor aggregate curve according to Wc,ij=max(Ws,medianp∈I
Performing, by the computer system, the plurality of iterative procedures to either group strokes into precursor clusters or to divide precursor clusters into precursor sub-clusters may comprise dividing precursor clusters into precursor sub-clusters. Dividing precursor clusters into precursor sub-clusters may comprise evaluating whether a precursor cluster should be divided into precursor sub-clusters and evaluating whether a precursor cluster should be divided into precursor sub-clusters may comprise: assessing, by the computer system, one or more separability criteria for the precursor cluster; and if the one or more separability criteria are satisfied: assigning, by the computer system, strokes from the precursor cluster to one of a pair of potential sub-clusters CL, CR; otherwise determining, by the computer system, that the precursor cluster is not separable.
Assessing the one or more separability criteria for the precursor cluster may comprise: generating, by the computer system, a precursor aggregate curve corresponding to the precursor cluster by performing, by the computer system, a curve fitting to fit the precursor aggregate curve to the strokes from the precursor cluster; for each of a plurality of discrete points on the precursor aggregate curve: projecting, by the computer system, a first ray from the point p′ on the precursor aggregate curve and extending to a left of the precursor aggregate curve in a first orientation orthogonal to the precursor aggregate curve at the point p′; projecting, by the computer system, a second ray from the point p′ on the precursor aggregate curve and extending to a right of the precursor aggregate curve in a second orientation orthogonal to the precursor aggregate curve at the point p′; and determining, by the computer system, intersections between the first and second rays and the strokes in the precursor cluster and determining, for each pair of adjacent intersections, a gap corresponding to the distance between the adjacent intersections. The one or more separability criteria at each point p′ may be based at least in part on one or more of the gaps on the first and second rays projecting from the point p′.
Assigning, by the computer system, strokes from the precursor cluster to one of a pair of potential sub-clusters CL, CR may comprise, for a point on the aggregate curve where the one or more separability criteria are satisfied: assigning, by the computer system, strokes in the cluster intersected by the first ray to the potential sub-clusters CL and assigning strokes in the cluster intersected by the second ray to the potential sub-clusters CR; and considering, by the computer system, adjacent points on the aggregate curve and, at such adjacent points, assigning previously unassigned strokes to one of the potential sub-clusters CL, CR by an assignment that maximizes an average gap between the potential sub-clusters CL, CR.
Assessing, by the computer system, the one or more separability criteria for the precursor cluster may comprise: determining, by the computer system, that the pair of potential sub-clusters CL, CR is separable based at least in part on a gap ratio r determined for at least some of the points p′ on the precursor aggregate curve, the gap ratio r determined according to:
where g is a gap between a rightmost intersection with the first ray and the leftmost intersection with the second ray, gL is an average of the gaps between the intersections between the strokes from the precursor cluster and the first ray and gR is an average of the gaps between the intersections between the strokes from the precursor cluster and the second ray.
Determining, by the computer system, that the pair of potential sub-clusters CL, CR is separable based at least in part on a gap ratio r determined for at least some of the points p′ on the precursor aggregate curve may comprise: determining an aggregate gap ratio R based on the gap ratios r for a subset of the points p′ on the precursor aggregate curve; if the aggregate gap ratio R is greater than a gap-ratio threshold, then determining, by the computer system, that the pair of potential sub-clusters CL, CR is separable and separating the potential sub-clusters CL, CR into new precursor clusters; otherwise determining, by the computer system, that the pair of potential sub-clusters CL, CR should remain in the same precursor cluster.
Assessing, by the computer system, the one or more separability criteria for the precursor cluster may comprise: comparing a narrowness of the precursor cluster to a narrowness threshold, the narrowness comprising a ratio of a length of the precursor cluster to a maximal gap of the precursor cluster; if the narrowness of the precursor cluster is less than the narrowness threshold, determining, by the computer system, that the pair of potential sub-clusters CL, CR is separable and separating the potential sub-clusters CL, CR into new precursor clusters; otherwise determining, by the computer system, that the pair of potential sub-clusters CL, CR should remain in the same precursor cluster.
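The narrowness-based separability test may be sketched as follows. The sketch assumes that the length of the precursor cluster is measured as the polyline length of its precursor aggregate curve and that the gaps along the cluster have already been measured; the threshold value is not specified here and is supplied by the caller. The function names are hypothetical.

```python
import math

def polyline_length(points):
    """Total length of a polyline given as a list of (x, y) tuples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def is_separable_by_narrowness(aggregate_curve, gaps, narrowness_threshold):
    """True when the length-to-maximal-gap ratio falls below the threshold.

    aggregate_curve:      polyline standing in for the precursor cluster (assumption).
    gaps:                 non-empty list of gap distances measured within the cluster.
    narrowness_threshold: caller-supplied threshold (value not given in the text).
    """
    narrowness = polyline_length(aggregate_curve) / max(gaps)
    return narrowness < narrowness_threshold
```

A long cluster with only small gaps has a high narrowness and stays together, while a short cluster with a large internal gap has a low narrowness and is treated as separable.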
Performing, by the computer system, the plurality of iterative procedures to either group strokes into precursor clusters or to divide precursor clusters into precursor sub-clusters may comprise: assessing one or more unification criteria for a pair of precursor clusters; if the one or more unification criteria are satisfied grouping the pair of precursor clusters into a single precursor cluster; and otherwise determining that the pair of precursor clusters should remain as separate precursor clusters.
Assessing one or more unification criteria for the pair of precursor clusters may comprise at least one of: assessing, by the computer system, narrowness criteria between the pair of precursor clusters, the narrowness criteria based on a length to width ratio of a combined precursor aggregate curve fit to strokes within the pair of precursor clusters; assessing, by the computer system, an angular compatibility criteria: between a first precursor aggregate curve fit to strokes within a first one of the pair of precursor clusters and the combined precursor aggregate curve; and between the combined precursor aggregate curve and a second precursor aggregate curve fit to strokes within a second one of the pair of precursor clusters; and assessing, by the computer system, a proximity criteria which comprises: defining a first envelope comprising a region that includes all of the strokes in the first one of the pair of precursor clusters; defining a second envelope comprising a region that includes all of the strokes in the second one of the pair of precursor clusters; and determining a distance between the first and second envelopes.
Another aspect of the invention provides a computer system comprising one or more processors, the processors configured to: obtain a computer representation of a raw drawing, the raw drawing comprising a plurality of strokes, each stroke represented in a vector format in the computer system; cluster the plurality of strokes into one or more clusters, each cluster comprising a corresponding group of one or more strokes; for each of the one or more clusters, perform a curve fitting to thereby determine a computer representation of a corresponding aggregate curve that is fitted to the group of strokes in the cluster; and generate a computer representation of an artist-intended curve drawing corresponding to the raw drawing, the artist-intended curve drawing comprising the aggregate curve corresponding to each cluster in place of the group of one or more strokes corresponding to each cluster; wherein the one or more processors are configured to cluster the plurality of strokes into one or more clusters by performing a plurality of iterative procedures to either group strokes into precursor clusters or to divide precursor clusters into precursor sub-clusters based on one or more models of human perception of raw drawings comprising pluralities of strokes.
In addition to the exemplary aspects and embodiments described above, further aspects and embodiments will become apparent by reference to the drawings and by study of the following detailed descriptions.
Exemplary embodiments are illustrated in referenced figures of the drawings. It is intended that the embodiments and figures disclosed herein are to be considered illustrative rather than restrictive.
Throughout the following description specific details are set forth in order to provide a more thorough understanding to persons skilled in the art. However, well known elements may not have been shown or described in detail to avoid unnecessarily obscuring the disclosure. Accordingly, the description and drawings are to be regarded in an illustrative, rather than a restrictive, sense.
Aspects of the invention provide automated methods for consolidating the strokes typically present in raw sketches into a number of artist-intended curves. Particular aspects provide automated methods for converting computer representations of raw sketches into computer representations of artist-intended curves. In some embodiments, the strokes present in the input raw sketch are clustered according to an automated method which attempts to mimic the mental processes that human viewers apply to consolidate the strokes in a raw sketch. Then, once clustered, consolidated curves are generated for each cluster to arrive at the corresponding output drawing with artist-intended curves.
Aspects of the invention provide a method for converting a raw drawing comprising a plurality of overdrawn strokes into an artist-intended curve drawing. The method comprises: obtaining, at a computer system, a computer representation of a raw drawing, the raw drawing comprising a plurality of strokes, each stroke represented in a vector format in the computer system; clustering, by the computer system, the plurality of strokes into one or more clusters, each cluster comprising a corresponding group of one or more strokes; for each of the one or more clusters, performing a curve fitting, by the computer system, to thereby determine a computer representation of a corresponding aggregate curve that is fitted to the group of strokes in the cluster; and generating, by the computer system, a computer representation of an artist-intended curve drawing corresponding to the raw drawing, the artist-intended curve drawing comprising the aggregate curve corresponding to each cluster in place of the group of one or more strokes corresponding to each cluster. Clustering the plurality of strokes into one or more clusters comprises performing, by the computer system, a plurality of iterative procedures to either group strokes into precursor clusters or to divide precursor clusters into precursor sub-clusters based on one or more models of human perception of raw drawings comprising overdrawn strokes.
Method 200 starts in block 202, which comprises procuring a raw drawing 204 as input. In some embodiments, raw drawing 204 is received as input to computer system 250. In some embodiments, raw drawing 204 may be obtained in block 202 in a computer graphic representation known as a vector formatted graphic representation. In some embodiments, raw drawing 204 may be converted (e.g. in block 202 or otherwise) from another format (e.g. a raster format) into a vector formatted representation. In some embodiments, raw drawing 204 may be generated on paper and scanned or generated on some other computer and/or the like and may be provided to computer system 250 as input. In some such embodiments, raw drawing 204 may be converted from a raster format into a vector format. In some embodiments, raw drawing 204 may be generated in computer system 250 and may be generated using the same software as that which implements drawing conversion method 200 or using different software (e.g. commercial drawing and/or sketching software (Adobe Illustrator, Autodesk Sketchbook, Inkscape and/or the like) and/or the like).
Exemplary and non-limiting raw drawings 204 are illustrated as drawings 10, 20 in
In some embodiments, raw drawing 204 may be created by an artist using a tablet or other stylus-sensitive display which, together with suitable drawing software (not shown), can be used to create line drawings 204 within the software and have the strokes of the raw drawing 204 recorded in vector format. For example, raw drawing 204 may be obtained (in block 202) using a standard stylus interface (e.g. pen tablets such as the Wacom Intuos™ series and Bamboo™ series and/or the like or pen displays such as the Wacom Cintiq™ series and/or the like), where each stroke is represented by a polyline. In some such embodiments, each stroke of raw drawing 204 (see strokes 12, 22 of raw drawings 10, 20 of
Raw strokes captured via a stylus interface are often noisy due to a combination of involuntary hand movement and capture software inaccuracy. In some embodiments, data obtained directly from a stylus interface may optionally be pre-processed to remove such noise-related artefacts by smoothing and densely resampling individual strokes. Such optional pre-processing may be performed as part of obtaining raw drawing 204 in block 202 or prior to obtaining raw drawing 204 in block 202.
In some such embodiments, such smoothing and dense resampling may be performed using the Cornucopia method suggested by Baran, et al. 2010. Sketching clothoid splines using shortest paths. In Computer Graphics Forum, Vol. 29. 655-664, which is hereby incorporated herein by reference. To preserve the input stroke shape as much as possible, in some embodiments, the Cornucopia “error cost” may be set to 5, which keeps the output stroke close to the input. In some embodiments, the original strokes are cut at Cornucopia-detected C0 discontinuities, as well as at sharp curvature extrema, where the curvature is both high (e.g. larger than 0.125 in some embodiments) and distinct from that along the rest of the curve (e.g. at least three times the median curvature in some embodiments).
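The sharp-curvature cut criterion may be illustrated with the following sketch, which does not reproduce Cornucopia itself. It assumes that curvature has already been sampled along a stroke (in whatever units the 0.125 example value presumes), compares curvature magnitudes, and treats a sample as an extremum when it is no smaller than both of its neighbors. The function name and the default parameter names are hypothetical; the default values are the example values given above.

```python
from statistics import median

def sharp_extrema(curvatures, min_curvature=0.125, median_factor=3.0):
    """Indices of curvature extrema that are both high and distinct from the rest of the curve."""
    magnitudes = [abs(k) for k in curvatures]
    typical = median(magnitudes)
    cuts = []
    for k in range(1, len(magnitudes) - 1):
        is_local_max = (magnitudes[k] >= magnitudes[k - 1]
                        and magnitudes[k] >= magnitudes[k + 1])
        is_high = magnitudes[k] > min_curvature              # e.g. larger than 0.125
        is_distinct = magnitudes[k] >= median_factor * typical  # e.g. 3x the median
        if is_local_max and is_high and is_distinct:
            cuts.append(k)
    return cuts
```

Requiring both conditions means a uniformly tight curve, whose curvature is high everywhere, is not cut, whereas an isolated corner on an otherwise gentle stroke is.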
The hanging portions of so-called hooks, which are commonly present at the ends of strokes, represent capture artifacts rather than part of the intended artist input and, consequently, may be deleted during optional pre-processing. A segment between a detected discontinuity and an end point may be classified as the hanging portion of a hook if the segment is both short in absolute terms (e.g. less than 15Ws in some embodiments, where Ws represents a width of the stroke) and forms less than a threshold percentage (e.g. less than 15% in some embodiments) of the overall stroke length.
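The hook test may be sketched as follows for a hook at the end of a stroke. The sketch assumes a polyline stroke and a known vertex index at which a discontinuity was detected; the function names are hypothetical and the default thresholds are the example values given above (15Ws and 15%).

```python
import math

def polyline_length(points):
    """Total length of a polyline given as a list of (x, y) tuples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def trim_end_hook(stroke, cut_index, stroke_width,
                  abs_factor=15.0, rel_fraction=0.15):
    """Drop the tail after `cut_index` when it qualifies as the hanging portion of a hook.

    stroke:       polyline as a list of (x, y) tuples.
    cut_index:    vertex index of the detected discontinuity (assumed given).
    stroke_width: the stroke width Ws.
    """
    tail_length = polyline_length(stroke[cut_index:])
    total_length = polyline_length(stroke)
    short_in_absolute_terms = tail_length < abs_factor * stroke_width
    short_in_relative_terms = tail_length < rel_fraction * total_length
    if short_in_absolute_terms and short_in_relative_terms:
        return stroke[:cut_index + 1]  # keep everything up to the discontinuity
    return stroke
```

A hook at the start of a stroke could be handled the same way by reversing the vertex order before and after the call.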
Method 200 then proceeds to block 210 which involves generating a clustering map 212. The block 210 clustering map 212 assigns the strokes of raw drawing 204 to corresponding clusters. For example, each stroke of raw drawing 204 may be assigned to a particular cluster in clustering map 212. The block 210 clustering process may be based on mimicking the mental processes that human viewers apply to consolidate the strokes in a raw sketch. As explained in more detail below, the block 210 clustering process may be implemented as a coarse-to-fine gradual clustering algorithm which may comprise: forming initial coarse clusters based on angular compatibility between strokes; refining the initial coarse clusters based on average pairwise distance between them, to form clusters of roughly evenly spaced strokes; assessing intra-cluster stroke spacing to detect and separate stroke branches; and analyzing the internal consistency of the computed clusters to resolve ambiguities and to merge clusters which are both angle and spacing compatible.
Once clustering map 212 is obtained in block 210, method 200 proceeds to block 220 which comprises fitting shape-preserving artist-intended curves to the individual clusters in clustering map 212 to arrive at output drawing 222 which corresponds with raw drawing 204 but which comprises artist-intended curves in the place of raw strokes. Specifically, output drawing 222 comprises artist-intended aggregate curves. These aggregate curves in output drawing 222 replace clusters of strokes that jointly depict individual artist-intended curves (within raw drawing 204). In some embodiments, the aggregate curves in output drawing 222 may be represented in vector format as polylines with an optional associated width.
The clustering techniques used in block 210 of method 200 in accordance with particular embodiments of the invention are now described in more detail. The disclosure starts with a discussion of how humans perceive oversketched strokes in raw drawings (e.g. strokes 12, 22 in drawings 10, 20 of
Humans are capable of perceiving aggregated curved lines (e.g. curves 112, 122 of drawings 110, 120 of
Angular Compatibility. Studies indicate that viewers rely on angular compatibility between strokes when grouping nearby strokes. Angular compatibility between strokes may be represented by the degree of similarity between stroke tangents. Strokes with similarly oriented tangents over the length of the stroke may be interpreted by humans to belong together (i.e. because they represent the same aggregate line).
Relative Proximity.
Perceptual literature strongly suggests that humans group objects based on relative proximity, or relative distance. Given a set of shapes, humans tend to visually group objects if the spacing between them is much smaller than the space between them and other objects. An example of this grouping based on relative proximity is shown in
It may be observed that proximity based grouping is contextual—for example, similarly spaced objects 258 (
Using proximity as a criterion for stroke grouping poses several challenges. First, it requires context, since one cannot assess the relative proximity of any individual pair of strokes. Second, relative proximity tends to be a negative rather than positive property. That is, relative proximity indicates when objects do not belong together—when both or one of them have much more proximate objects—not when objects do belong together. For roughly evenly spaced strokes, relative proximity alone does not tend to provide a cue as to whether these strokes should, or should not, belong together. Lastly, distances between side-by-side strokes tend to vary at different points along them, raising a question of how to assess proximity locally.
Narrowness.
Humans tend to intuitively understand curves as being narrow, namely having a small width to length ratio. This intuition may help to distinguish between equally spaced strokes that jointly depict aggregate curves and those that do not.
Connectedness.
The connectedness principle highlighted by perception research suggests that humans group objects (e.g. strokes) that are inter-connected, such as points connected by lines. For strokes, this principle argues for grouping intersecting or near intersecting strokes when doing so does not contradict other cues. This is illustrated, for example, in
Strength in Numbers.
Even with the aforementioned cues in place, there remain stroke configurations which, from a human-perception driven perspective, are ambiguous.
This is illustrated, for example, in
Method 300 may be performed by computer system 250 (schematically depicted using dashed lines in
Method 300 begins in block 302 which comprises using the strokes that form part of raw input drawing 204 to implement a coarse clustering based on angular compatibility and to arrive at a first stage clustering map 304.
Method 400 comprises a first loop 402A which (in the schematic illustration of
The angular compatibility between a stroke pair (Si,Sj) provides a first cue about whether these strokes are meant to depict a common aggregate curve. Two nearby strokes Si and Sj are more likely to depict the same aggregate curve when they are fully or partially parallel and are less likely to depict the same aggregate curve when they are orthogonal to one another. As described in more detail below, method 400 may involve determining an angular compatibility score ComA(Si,Sj) between each stroke pair (Si,Sj). This angular compatibility score ComA(Si,Sj) may be set to be positive for stroke pairs that are angle compatible, and negative for those which are not. The value (e.g. magnitude) of the angular compatibility score ComA(Si,Sj) may reflect the degree of (in)compatibility.
For a particular pair of strokes (Si,Sj), block 406 comprises evaluating a proximity threshold inquiry. In general, the angular compatibility being assessed in block 302 (method 400) only impacts clustering decisions for nearby strokes. If the pair of strokes (Si,Sj) are spaced far apart, it may be expected that such strokes will only end up in the same cluster if there are some other criteria (e.g. intermediate proximate and angularly compatible strokes) for grouping the spaced apart strokes. Accordingly, block 406 involves assessing whether the current pair of strokes (Si,Sj) is within (closer to one another than) a proximity threshold.
The block 406 assessment may take place between the closest pair of points on the current stroke pair (Si,Sj).
The block 406 proximity threshold may be a configurable (e.g. user-configurable) parameter of method 400. In some embodiments, the block 406 proximity threshold may be defined relative to the stroke width Ws of the ith or jth stroke. By way of non-limiting example, the block 406 proximity threshold may be defined to be AWs, where A is a multiplying factor (e.g. A=10, A=20, A=50 or the like). If the current pair of strokes (Si,Sj) is within the block 406 proximity threshold, then method 400 proceeds via block 406 YES branch to block 408. Otherwise, method 400 proceeds via block 406 NO branch directly to block 414.
Assuming, for the moment, that method 400 proceeds to block 408 for the current stroke pair (Si,Sj), block 408 involves determining an aggregate curve SijA for the current pair of strokes (Si,Sj). The block 408 aggregate curve SijA may comprise a curve fitted between the current pair of strokes (Si,Sj). In general, the block 408 aggregate curve SijA may be determined by any suitable curve fitting technique. A particular embodiment for determining aggregate curves for a plurality of strokes (including the block 408 aggregate curve SijA) is described in more detail below.
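By way of non-limiting illustration, the block 406 proximity inquiry may be sketched as follows. This sketch assumes that each stroke is available as a list of sampled (x, y) points and approximates the closest pair of points using those samples only; the function names and the default factor A=10 are assumptions made for illustration.

```python
import math

def closest_distance(stroke_a, stroke_b):
    """Smallest Euclidean distance between any two sample points of the strokes."""
    return min(math.dist(p, q) for p in stroke_a for q in stroke_b)

def within_proximity(stroke_a, stroke_b, stroke_width, factor=10.0):
    """Block 406 style test: is the stroke pair closer than factor * Ws?"""
    return closest_distance(stroke_a, stroke_b) < factor * stroke_width
```

A denser sampling gives a better approximation of the true closest distance; an exact segment-to-segment distance could be substituted without changing the threshold test.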
Method 400 then proceeds to block 410, which involves determining an aggregate angular distance metric Da(Si,SijA) between each of the current pair of strokes (Si,Sj) and the aggregate curve SijA. The block 410 procedure for determining the aggregate angular distance metric Da(Si,SijA) between each of the current pair of strokes (Si,Sj) and the aggregate curve SijA may comprise the following steps. For a point p∈Si on the stroke Si, we may define the corresponding point p′∈SijA as its closest point on the aggregate curve SijA. Given this correspondence mapping p′=Mi(p), we may compute the pointwise angular difference at p′ as Ai(p′)=arccos(t·t′), where t and t′ are unit tangents to Si and SijA at points p and p′ respectively.
Because we intend to use the aggregate angular distance metric Da(Si,SijA) to evaluate whether two strokes (Si,Sj) are roughly parallel, in some embodiments, instead of integrating angular distances along the entire aggregate curve SijA, the block 410 determination may be narrowed to sections I1 of interest (e.g. where points on the aggregate curve SijA have corresponding points on both of the current pair of strokes (Si,Sj)). Such a section I1 is illustrated in
where |I1| represents the number of samples along the section I1.
Method 400 then proceeds to block 412 which may involve choosing one angular distance metric Da(Si,Sj) to be the angular distance metric between the current pair of strokes (Si,Sj). In some embodiments, the block 412 angular distance metric Da(Si,Sj) between the current pair of strokes (Si,Sj) may be set according to
Da(Si,Sj)=max(Da(Si,SijA),Da(Sj,SijA)) (2)
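By way of non-limiting illustration, the block 410 and block 412 computations may be sketched as follows. This sketch assumes that unit tangents have already been sampled at corresponding points p and p′ within the section of interest I1, and that the aggregate angular distance is the mean of the pointwise angular differences over those |I1| samples (an assumption consistent with the description above). All function names are hypothetical.

```python
import math

def pointwise_angle(t, t_prime):
    """Angle in radians between two unit tangents: arccos(t . t')."""
    dot = t[0] * t_prime[0] + t[1] * t_prime[1]
    # Clamp to [-1, 1] to guard against floating point drift.
    return math.acos(max(-1.0, min(1.0, dot)))

def aggregate_angular_distance(stroke_tangents, curve_tangents):
    """Mean pointwise angle over paired samples in the section of interest."""
    angles = [pointwise_angle(t, tp)
              for t, tp in zip(stroke_tangents, curve_tangents)]
    if not angles:
        raise ValueError("section of interest has no samples")
    return sum(angles) / len(angles)

def pair_angular_distance(d_i, d_j):
    """Equation (2): the larger of the two stroke-to-aggregate distances."""
    return max(d_i, d_j)
```

Taking the maximum means that a pair is only treated as angularly close when both strokes follow the aggregate curve.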
Since each point p on each of the current pair of strokes (Si,Sj) has a corresponding point p′ on the fitted curve SijA, the equation (1A) formulation addresses all possible stroke configurations, providing a unified measure.
Method 400 may then proceed to block 414 where the block 412 angular distance metric Da(Si,Sj) between the current pair of strokes (Si,Sj) may be converted to an angular compatibility score ComA(Si,Sj) between the current stroke pair (Si,Sj). In some embodiments, this angular compatibility score ComA(Si,Sj) may be determined according to:
where, for brevity, we have used ϕ=Da(Si,Sj).
The parameters of the equation (3) function for angular compatibility score ComA(Si,Sj) reflect cues from perception research, which indicate that viewers use approximately Ta=20° as a threshold distinguishing between perceived similar and dissimilar tangents. Equation (3) is therefore centered around this angular threshold value Ta. The parameters σ1 and σ2 represent the standard deviation parameters of a pair of Gaussian curves. The spread of these Gaussians (as determined by the parameters σ1 and σ2) may be set to create a smooth dropoff. By way of non-limiting example, in some embodiments, σ1=2.57°=9°/3.5 and σ2=2°=7°/3.5. In some embodiments, at this stage, block 414 involves a conservative assessment of clusters, and, consequently, equation (3) involves the use of a higher amplitude negative than positive maximal correlation score (1 vs. −1.5).
It will be appreciated by those skilled in the art that any of the constants in equation (3) and the parameters σ1 and σ2 may be configurable (e.g. user configurable) to achieve different outcomes.
Method 400 may also arrive in block 414 via the NO branch of block 406. In such a circumstance, it has been determined that the current stroke pair (Si,Sj) is unlikely to be clustered together merely by the angular compatibility in the absence of some other criteria (e.g. intermediate proximate and angularly compatible strokes). In such cases, the angular compatibility score ComA(Si,Sj) between that pair of strokes may be set to a small amplitude negative number (e.g. −10⁻⁶). This small amplitude negative number may be selected to be small enough to allow strokes to be grouped together if they exhibit some other criteria indicative of clustering (e.g. they share angle compatible intermediate strokes), but to push the strokes apart (i.e. to be non-clustered) otherwise.
While not expressly shown in
Method 400 exits from block 414 along one of paths 416, 418, or 420, depending on the particular state of the loop indices for loops 402A, 404A. If it is desired to iterate loop 404A again for the same stroke Si, but for a different stroke Sj (e.g. to increment the index j), then method 400 proceeds back to block 404 via path 416 (where the index j is incremented and another iteration of loop 404A is performed). If stroke Si has been completely evaluated and it is desired to increment the index i to evaluate a new stroke Si, then method 400 proceeds back to block 402 via path 418 (where the index i is incremented, the index j is reset and another iteration of loop 402A is performed). If the angular compatibility score ComA(Si,Sj) has been evaluated for each stroke pair, then method 400 proceeds to block 422.
Block 422 involves performing a clustering optimization. It will be noted that block 422 is performed once (i.e. is not part of loop 402A or loop 404A). Given the angular compatibility scores ComA(Si,Sj), the block 422 clustering optimization may be used to group stroke pairs with positive scores, to separate strokes with negative scores, and to resolve ambiguities by considering the magnitude of the scores. This set of requirements naturally fits into a correlation clustering framework. An advantage of using correlation clustering over other clustering formulations is that the number of clusters emerges directly from the input scores and does not need to be estimated a priori. In some embodiments, the block 422 clustering optimization may be formulated according to maximizing the objective function:
ΣijComA(Si,Sj)Yij (4)
where Yij∈{0,1}, Yij=1 if the ith and jth strokes are determined to be in the same cluster and Yij=0 otherwise. Obtaining an optimal correlation clustering in block 422 is an NP-complete problem, which is computationally expensive. Consequently, in some embodiments, any one of a variety of methods (e.g. the method of Keuper et al. in Keuper et al. 2015. Efficient decomposition of image and mesh graphs by lifted multicuts. In Proc. ICCV. 1751-1759, which is hereby incorporated herein by reference) may be used to efficiently provide an approximate solution to the equation (4) optimization problem.
The output of block 422 (and method 400 and block 302) is a first stage clustering map 304. First stage clustering map 304 may assign a cluster label to each of strokes Si where i∈{1, 2, 3 . . . N} in the raw drawing 204. Strokes which share a common cluster label in first stage clustering map 304 may be said to belong to or be in the same cluster in first stage clustering map 304. Conversely, strokes with different cluster labels in first stage clustering map 304 may be said to belong to or be in different clusters in first stage clustering map 304.
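By way of non-limiting illustration, the equation (4) objective may be evaluated for a candidate labelling as sketched below. The exhaustive search shown here is not the method of Keuper et al.; it is a hypothetical brute-force stand-in that is practical only for a handful of strokes and is included solely to show how the number of clusters emerges from the signs and magnitudes of the scores. The score dictionary, keyed by unordered stroke index pairs, is an assumed data layout.

```python
def objective(scores, labels):
    """Equation (4): sum of ComA over stroke pairs that share a cluster label."""
    return sum(s for (i, j), s in scores.items() if labels[i] == labels[j])

def partitions(n):
    """Yield every clustering of n strokes as a tuple of cluster labels."""
    def grow(prefix, max_label):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        # A stroke may join any existing cluster or open one new cluster.
        for label in range(max_label + 2):
            yield from grow(prefix + [label], max(max_label, label))
    yield from grow([], -1)

def best_clustering(n, scores):
    """Exhaustively maximize equation (4); exponential in n."""
    return max(partitions(n), key=lambda labels: objective(scores, labels))

# Strokes 0 and 1 are compatible, strokes 0 and 2 are strongly incompatible.
example_scores = {(0, 1): 1.0, (0, 2): -1.5, (1, 2): 0.5}
example_labels = best_clustering(3, example_scores)
```

In this example the strongly negative score between strokes 0 and 2 outweighs the weak positive score between strokes 1 and 2, so stroke 2 is left in its own cluster.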
Returning now to method 300 (
Method 450 comprises a loop 452A which (in the schematic illustration of
In block 454, method 450 performs an initialization procedure for the current first stage cluster. The block 454 initialization analyzes the current first stage cluster and outputs a number of initialization sub-clusters 456 within the current first stage cluster.
Method 460 (block 454) bases initialization sub-clusters 456 on the connectedness principle discussed above, which suggests that humans often perceive intersecting or near-intersecting strokes as being grouped together. To consider whether the current stroke pair (Si,Sj) is intersecting or connected, block 466 involves a consideration of the distance between the strokes Si, Sj in the current stroke pair (Si,Sj). As discussed above, we may define an aggregate curve SijA which is fitted between the strokes Si, Sj in the current stroke pair (Si,Sj). The details of this aggregate curve fitting are discussed in more detail below. The aggregate curve SijA between the current stroke pair (Si,Sj) provides the common parameterization between this stroke pair (Si,Sj). For a point p∈Si on the stroke Si and a point q∈Sj on the stroke Sj, we may define the corresponding point p′∈SijA as the closest point on the aggregate curve SijA, and a correspondence mapping q=Mij(p) may be defined where Mi(p)=p′=Mj(q) are the mappings from the strokes Si, Sj to the aggregate curve SijA.
It will be appreciated that by this construction, the points p′, q, p are colinear and the line connecting them is orthogonal to the aggregate curve SijA. This construction is illustrated schematically in
The average distance between the current stroke pair (Si,Sj) may then be defined as:
where I1 is a section of interest (e.g. where points on the aggregate curve SijA have corresponding points on both of the current pair of strokes (Si,Sj)) and has the meaning described above and in
It may be observed at this stage, that the equation (5) formulation of the average distance between stroke pairs (Si,Sj) directly employs the mapping between the stroke points, since at this point in the computation, the side-by-side portions (e.g. sections of interest I1) of the strokes Si, Sj being considered are roughly parallel, ensuring reliable correspondences. This was not the case for the angle difference computation (equation (1)), where, to obtain reliable values, strokes Si, Sj were mapped to the aggregate curve SijA instead of to one another.
In block 466, the average distance Di,j between the current stroke pair (Si,Sj) is compared to an initialization threshold to determine if the current stroke pair (Si,Sj) is sufficiently close to be considered for a potential sub-cluster. The block 466 initialization threshold may be a configurable (e.g. user-configurable) parameter. In some embodiments, the block 466 initialization threshold is based on the stroke width Ws of one or both of strokes Si, Sj in the current stroke pair. In one particular embodiment, the block 466 initialization threshold is based on a multiple of the larger one of the stroke widths Ws (e.g. 2Ws, 5Ws or the like) of strokes Si, Sj in the current stroke pair. If the block 466 inquiry determines that the current stroke pair (Si,Sj) is farther apart than the initialization threshold (block 466 NO branch), then method 460 proceeds to block 468 and concludes that the current stroke pair (Si,Sj) is not suitable for being an initial sub-cluster, before reaching block 480 and looping back to evaluate the next stroke pair. However, if the block 466 inquiry determines that the current stroke pair (Si,Sj) is closer together than the initialization threshold (block 466 YES branch), then method 460 evaluates a number of other criteria before deciding that the current stroke pair (Si,Sj) should form an initial sub-cluster.
Specifically, if the block 466 inquiry determines that the current stroke pair (Si,Sj) is closer together than the initialization threshold (block 466 YES branch), then method 460 proceeds to block 470. Block 470 involves determining the angular compatibility of the current stroke pair (Si,Sj). This angular compatibility Ai,j(I1) may be determined in block 470 in accordance with:
where p′=Mi(p)=Mj(q) and t(p), t(q) are their respective unit tangents.
In block 472, the block 470 angular compatibility Ai,j(I1) is compared to a suitable threshold. The block 472 angular compatibility threshold may be a configurable (e.g. user-configurable) parameter. In some embodiments, the block 472 angular compatibility threshold is based on the same angular compatibility parameter Ta discussed above in connection with block 414. In some embodiments, the block 472 angular compatibility threshold is the same angular compatibility parameter Ta (e.g. Ta=20°) discussed above in connection with block 414. If the block 472 inquiry determines that the angular compatibility Ai,j(I1) of the current stroke pair (Si,Sj) is greater than this angular compatibility threshold (block 472 NO branch), then method 460 proceeds to block 468 and concludes that the current stroke pair (Si,Sj) is not suitable for being an initial sub-cluster, before reaching block 480 and looping back to evaluate the next stroke pair. However, if the block 472 inquiry determines that the angular compatibility Ai,j(I1) of the current stroke pair (Si,Sj) is less than this angular compatibility threshold (block 472 YES branch), then method 460 proceeds to block 474.
Block 474 involves determining a width parameter Wc,ij of the joint aggregate curve of the current stroke pair (Si,Sj). This width parameter Wc,ij may be determined by shooting left and right orthogonal rays from densely sampled points p∈I1 on the section of interest I1 on the aggregate curve SijA and locating the farthest left and right intersections il(p) and ir(p) with cluster strokes Si, Sj along each ray. This width parameter Wc,ij may be determined (in some embodiments) according to:
Method 460 then proceeds to block 476 which involves comparing a length to width ratio (narrowness) of an aggregate curve section I1 (i.e. a ratio of the length of aggregate curve section I1 to the block 474 width parameter Wc,ij) to a suitable width threshold Tn. The block 476 width threshold Tn may be a configurable (e.g. user-configurable) parameter. In some embodiments, the block 476 width threshold Tn is in a range of 5≤Tn≤10. In some embodiments, this width threshold Tn is set to be Tn=8.5. If the block 476 inquiry determines that the length to width ratio of an aggregate curve section I1 is less than this width threshold Tn (block 476 NO branch), then method 460 proceeds to block 468 and concludes that the current stroke pair (Si,Sj) is not suitable for being an initial sub-cluster, before reaching block 480 and looping back to evaluate the next stroke pair. However, if the block 476 inquiry determines that the length to width ratio of an aggregate curve section I1 is greater than this width threshold Tn (block 476 YES branch), then method 460 proceeds to block 478 where the current stroke pair (Si,Sj) is grouped together as an initial sub-cluster. After block 478, method 460 loops back to block 464 via block 480.
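By way of non-limiting illustration, the sequence of inquiries in blocks 466, 472 and 476 may be sketched as a single predicate. The sketch assumes that the average distance, average angular difference, section length and width parameter have already been computed; the function name, the parameter names and the choice of 2Ws for the initialization threshold are assumptions, and the treatment of exact equality with a threshold (here, failing the test) is likewise an assumption because the description above leaves it open.

```python
def is_initial_subcluster(avg_distance, avg_angle_deg,
                          section_length, section_width,
                          width_i, width_j,
                          distance_factor=2.0,
                          angle_threshold_deg=20.0,
                          narrowness_threshold=8.5):
    """Return True if a stroke pair passes the block 466, 472 and 476 tests."""
    # Block 466: average distance against a multiple of the larger stroke width.
    if avg_distance >= distance_factor * max(width_i, width_j):
        return False
    # Block 472: average angular difference against Ta.
    if avg_angle_deg >= angle_threshold_deg:
        return False
    # Block 476: length to width ratio must exceed Tn (written without
    # division so that a zero width does not raise an error).
    return section_length > narrowness_threshold * section_width
```

A pair is grouped as an initial sub-cluster only when all three tests pass, which keeps the initialization conservative.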
The method 460 loop 464A repeats for each pair of strokes to generate a number of initial sub-clusters 456 (see
Returning to
Intuitively, the equation (8) intra-sub-cluster spacing parameter Dc returns the size of the largest gap between strokes in each sub-cluster.
Method 450 then proceeds to block 492 which involves determining an inter-sub-cluster spacing parameter Dc,c′ that is representative of a distance between two distinct sub-clusters C and C′. In some embodiments, the inter-sub-cluster spacing parameter Dc,c′ between two distinct sub-clusters C, C′ may be ascertained by finding the closest distance between any two strokes, where each stroke belongs to a different sub-cluster:
Once the inter-sub-cluster spacing parameter Dc,c′ is determined for each pair of sub-clusters C, C′ in the current set of initial sub-clusters, then method 450 proceeds to block 494 which involves an inquiry into whether any sub-clusters should be merged into a single sub-cluster. In one particular embodiment, the block 494 evaluation ascertains whether both of the following criteria are met for a sub-cluster pair C, C′ under consideration:
Dc,c′<T′d·max(Dc,Dc′) (10)
max(Dc,Dc′)<T′d·min(Dc,Dc′) (11)
Equation (10) evaluates whether the spacing between sub-cluster pairs C, C′ is less than a parameter T′d times the maximum of the intra-sub-cluster spacings for each of the sub-cluster pair C, C′ under consideration. Equation (11) evaluates whether the maximum of the intra-sub-cluster spacings for each of the sub-cluster pair C, C′ under consideration is less than a parameter T′d times the minimum of the intra-sub-cluster spacings for each of the sub-cluster pair C, C′ under consideration.
The parameter T′d may be a configurable (e.g. user-configurable) parameter. In one particular embodiment, the parameter T′d may be determined based on research into human proximity perception. The inventors conducted such a study which indicated that humans separate lines when the ratio of intra-cluster to inter-cluster distances reaches approximately Td=2.1. The distances employed in method 450 are averaged along the sections of interest I1, and are thus only approximating closest inter- and intra-sub-cluster distances. As discussed in more detail below, a more fine-grained analysis may be performed during subsequent local separation; therefore, to avoid over-segmentation at this stage, the block 494 evaluation may use T′d=γ·Td where γ is a configurable (e.g. user-configurable) parameter. In currently preferred embodiments, γ is set to be somewhere in a range of 1.1≤γ≤1.5.
If no sub-clusters meet the block 494 merge criteria, then method 450 proceeds to block 498. Block 498 involves outputting the initial sub-clusters 456 to be the second stage clustering map 308 for the current first stage cluster label. If any pairs (or larger groups) of sub-clusters meet the block 494 merge criteria, then method 450 proceeds to block 496 which involves merging the applicable sub-clusters to provide the second stage clustering map 308 for the current first stage cluster label. Where there are sub-clusters that meet the block 494 merge criteria, the merging of sub-clusters between blocks 494 and 496 may be performed incrementally. That is, once a pair of sub-clusters is merged, the merged pair of sub-clusters becomes a candidate sub-cluster for further merging with other sub-clusters. In some embodiments, the merging computation of method 450 may be performed using the HDBSCAN algorithm described in Campello et al. Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection. ACM Trans. Knowl. Discov. Data 10, 1 (2015), 5:1-5:51, which is hereby incorporated herein by reference.
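By way of non-limiting illustration, the block 494 merge inquiry of equations (10) and (11) may be sketched as follows. The function and parameter names are hypothetical, and the default γ=1.3 is simply an assumed value from within the 1.1≤γ≤1.5 range mentioned above.

```python
def should_merge(d_between, d_c, d_c_prime, t_d=2.1, gamma=1.3):
    """Return True if sub-clusters C and C' satisfy equations (10) and (11).

    d_between: inter-sub-cluster spacing Dc,c'.
    d_c, d_c_prime: intra-sub-cluster spacings Dc and Dc'.
    """
    t_d_relaxed = gamma * t_d  # T'd = gamma * Td
    larger = max(d_c, d_c_prime)
    smaller = min(d_c, d_c_prime)
    close_enough = d_between < t_d_relaxed * larger   # equation (10)
    similar_spacing = larger < t_d_relaxed * smaller  # equation (11)
    return close_enough and similar_spacing
```

Equation (10) rejects sub-clusters separated by a comparatively large gap, while equation (11) rejects pairs whose internal spacings differ too much to read as one evenly spaced group.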
Whether by way of block 496 or 498, method 450 reaches block 499 when the analysis of the current first stage cluster label has been completed. Method 450 then loops back to block 452 to evaluate another first stage cluster label. When loop 452A has been performed for all of the first stage cluster labels, then method 450 outputs second stage clustering map 308 as the output of block 306 (see
Returning to
Method 510 starts in block 512, which involves assessing potentially separable clusters 514 from within second stage clustering map 308.
Method 530 comprises a loop 532A which (in the schematic illustration of
Block 534 involves assessing separability candidacy criteria for the current second stage cluster by looking at the local spacing between strokes in the current second stage cluster. The block 534 separability candidacy assessment may be understood with reference to
In one particular embodiment, the block 534 separability candidacy assessment comprises detecting candidate gaps g which indicate possible cluster separation using the ratio between the length of the gap g and the average lengths of the gaps to its left gL and right gR as a cue to its potential separability. Specifically, in some embodiments, a gap g may be considered to be a candidate for separation if:
If the gap g under consideration is the leftmost or rightmost gap, its size may only be compared against the average size of the gaps to the right (gR), or left (gL), respectively. If there is only one gap g (i.e. only two participating strokes), block 534 may involve setting gL=gR=2Ws, which is the same lower bound on gap size as in the block 454 (method 460,
If the block 534 separability candidacy assessment is negative (i.e. the current cluster is not a candidate for separability), then method 530 proceeds to block 538 via the block 534 NO branch and loops back to block 532 to evaluate the next cluster. If, on the other hand, the block 534 separability candidacy assessment is positive (i.e. the current cluster is a candidate for separability), then method 530 proceeds to block 536 via the block 534 YES branch. Block 536 involves splitting the potentially separable cluster into a pair of left and right sub-clusters CL, CR.
The schematic illustrations of
In some embodiments, for a gap g at a point pj on aggregate curve 570 corresponding to a cluster that is determined in block 534 to be potentially separable, block 536 may assign the strokes to the left and right of the gap g into the separate left and right sub-clusters, CL and CR, respectively. This is shown in
In some embodiments, at each encountered point on aggregate curve 570, the unassigned strokes may be split locally based on the largest gap between the previously assigned strokes 572, 574. Intuitively, the optimal assignment of the remaining strokes 576 is one that maximizes the average gap between the left and right sub-clusters CL, CR. In some embodiments, making such an assignment involves assessing a number of alternatives and selecting the assignment that produces the largest average gap ratio. Some embodiments involve testing three alternatives: assigning each stroke to the nearest, left or right, sub-cluster CL, CR based on shortest distance (e.g. between the unassigned stroke and the nearest stroke already assigned to a sub-cluster CL, CR); assigning all unassigned strokes to CL, or assigning all unassigned strokes to CR. An example of such an assignment is shown in
While it is not expressly shown in
Returning to
Returning now to method 510 (
Block 518 involves assessing separability criteria for the current potentially separable cluster. In some embodiments, there are two separability criteria assessed in block 518: an evaluation of the evenness of the spacing between cluster strokes; and cluster narrowness. The evenness of the spacing between cluster strokes may be assessed as follows. Given a pair of sub-clusters CL, CR (i.e. a pair of sub-clusters C*L, C*R output from block 536 for a particular current potentially separable cluster, where we drop the asterisk notation for simplicity), block 518 may involve analyzing a parameter which is referred to herein as the gap ratio r. Determining the gap ratio r may involve iterating over the points p on the aggregate curve, where orthogonal left and right rays intersect both sub-clusters CL, CR and, for each ray, locating the leftmost intersection with the right sub-cluster CR and the rightmost intersection with the left sub-cluster CL. If these intersections are immediately next to one another, then block 518 may involve determining the ratio r between the size of the middle gap g and the average size of the left and right gaps gL, gR, as above:
This circumstance is shown in
If the intersection order is flipped, the sub-clusters are locally connected. In this case, the gap ratio may be set to r=0. This circumstance is illustrated in
If one of the sub-clusters CL, CR includes only a single stroke at a given point p on the aggregate curve, then the left and right gap values gL, gR may be ill-defined. Also, these gap values gL, gR can be arbitrarily small at a location where two or more strokes intersect; a division by a value close to zero would result in an arbitrarily large ratio value r. In some embodiments, both of these cases may be resolved by rounding the average of the left and right gaps gL, gR (the denominator of equation (13)) up to a lower bound. This lower bound may be determined by the average inter-stroke distances dL and dR within the left and right sub-clusters CL, CR. The inter-stroke distances may be determined in accordance with equation (5) discussed above. The denominator of equation (13) may be set to max((gL+gR)/2, min(max(dL, dR), 2Ws)).
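By way of non-limiting illustration, the local gap ratio with its lower-bounded denominator may be sketched as follows. This sketch assumes that the denominator of the gap ratio is the average of the left and right gaps gL, gR, rounded up to the bound min(max(dL, dR), 2Ws) as described above; the function and parameter names are hypothetical.

```python
def gap_ratio(g, g_left, g_right, d_left, d_right, stroke_width):
    """Ratio of the middle gap g to the (lower-bounded) average side gap."""
    average_side_gap = 0.5 * (g_left + g_right)
    # Lower bound from the average inter-stroke distances, capped at 2 * Ws.
    lower_bound = min(max(d_left, d_right), 2.0 * stroke_width)
    denominator = max(average_side_gap, lower_bound)
    if denominator <= 0:
        raise ValueError("gap ratio is undefined for a zero denominator")
    return g / denominator
```

With the bound in place, near-zero side gaps at stroke intersections no longer inflate the ratio: the second assertion-style case, gap_ratio(6, 0.01, 0.01, 3, 5, 1), divides by 2Ws rather than by 0.01.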
These computed gap ratios r may be used in block 518 to determine if the left and right sub-clusters CL, CR are separable. In theory, if each of the left and right sub-clusters CL, CR had uniform internal spacing, the averages of the local ratios r could be compared to the proximity threshold Td to determine if the two sub-clusters CL, CR should be separated (i.e. if r>Td, then a potentially separable cluster could be separated and the block 518 inquiry would be positive). However, the current potentially separable cluster could have multiple branches. Thus, either of the left or right sub-clusters CL, CR may consist of more than one branch and, as a result, may have large internal gaps; this makes direct assessment of the gap ratio r less reliable.
To nevertheless separate such right and left sub-clusters, a more relaxed gap ratio assessment may be used by defining a parameter R to be the average of the largest 10% of the ratio values r and comparing R to the threshold Td as a separation criterion (i.e. if R>Td, then a potentially separable cluster may be separated and the block 518 inquiry may be positive). While this approach may occasionally lead to over-segmentation, the resulting split clusters may be merged back by subsequent analysis. As discussed above in connection with block 536, for a given cluster, there may be multiple potential left and right sub-clusters CL, CR and the techniques of one or both of methods 510 (e.g. block 518) and 530 (e.g. block 536) may choose the left and right sub-clusters C*L, C*R with the highest R value (Rmax).
The block 518 inquiry may also involve examining if the current potentially separable cluster is sufficiently wide to merit separation. A cluster may be classified as wide or not based on its length l and its maximal gap gm (i.e. the largest gap in the union of GL, GR). In some embodiments, a cluster may be considered to be wide if the ratio of its length l to its maximal gap gm is less than some multiple of the narrowness threshold Tn. For example, in some embodiments, a cluster may be considered to be wide if
If it is determined, in the block 518 inquiry, that a potentially separable cluster should not be separated (block 518 NO branch), then method 510 proceeds to block 526 which involves looping back to block 516 and testing the next potentially separable cluster. If, on the other hand, it is determined, in the block 518 inquiry, that a potentially separable cluster should be separated (block 518 YES branch), then method 510 proceeds to block 520, where the sub-clusters CL, CR are separated into new clusters. Method 510 then proceeds to block 522 where each of the newly created clusters is assessed for potential separability. Other than for acting on the clusters newly created in block 520, the block 522 assessment may be similar to block 512 (method 530) discussed above. If neither of the newly created clusters is potentially separable (block 522 NO branch), then method 510 proceeds to block 526 which involves looping back to block 516 and testing the next potentially separable cluster. If, on the other hand, it is determined, in the block 522 inquiry, that a newly created cluster is potentially separable, then that newly created and potentially separable cluster is recursively added to the set of potentially separable clusters before looping back to block 516 (via block 526) and testing the next potentially separable cluster.
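By way of non-limiting illustration, the relaxed gap ratio R and the associated separation test may be sketched as follows. The sketch assumes the local gap ratios r have already been collected into a list; the function names are hypothetical, and rounding the 10% sample count up to at least one value is an assumption made so that short lists remain usable.

```python
import math

def relaxed_gap_ratio(ratios, top_fraction=0.10):
    """R: the average of the largest top_fraction of the local gap ratios r."""
    if not ratios:
        raise ValueError("at least one gap ratio is required")
    count = max(1, math.ceil(top_fraction * len(ratios)))
    largest = sorted(ratios, reverse=True)[:count]
    return sum(largest) / count

def is_separable_by_gap(ratios, t_d=2.1):
    """Separation criterion: R > Td."""
    return relaxed_gap_ratio(ratios) > t_d
```

Because only the largest ratios contribute, a cluster with one clearly open gap can pass the test even when most of its local ratios are small, which is the source of the occasional over-segmentation noted above.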
Loop 516A continues operating until all of the potentially separable clusters 514 (and all of the newly created clusters that themselves may be potentially separable) are evaluated and, if applicable, separated into new clusters. At the conclusion of loop 516A, method 510 outputs third stage clustering map 312.
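The control flow of loop 516A can be sketched as a worklist. In the sketch below, the three callables are hypothetical stand-ins for the separability assessment (blocks 512, 522), the separation inquiry (block 518) and the separation itself (block 520); none of their internals are taken from this description.

```python
from collections import deque


def separate_clusters(clusters, is_potentially_separable, should_separate, split):
    """Worklist sketch of loop 516A; the callables are hypothetical placeholders."""
    final = []
    pending = deque()
    for cluster in clusters:
        (pending if is_potentially_separable(cluster) else final).append(cluster)
    while pending:
        cluster = pending.popleft()
        if not should_separate(cluster):      # block 518 NO branch
            final.append(cluster)
            continue
        for child in split(cluster):          # block 520
            if is_potentially_separable(child):  # block 522 YES branch
                pending.append(child)         # re-examined on a later pass
            else:
                final.append(child)
    return final
```

The returned list plays the role of the third stage clustering map once every original and newly created cluster has been examined.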
In one particular embodiment, the procedures of block may be performed in accordance with a specific algorithm, which is reproduced below.
Returning to
Method 600 comprises a loop 602A which (in the schematic illustration of
The current third stage clustering pair in loop 602A may be referred to herein using CL, CR to refer to the two clusters in the pair. In loop 602A, block 604 performs a narrowness assessment to evaluate whether the current third stage clustering pair CL, CR should be merged. The block 604 narrowness assessment may be similar to the narrowness assessment discussed above (e.g. in blocks 474, 476 of
Block 608 involves assessing the angular compatibility within the section where the two clusters of the current cluster pair are side-by-side. Given the aggregate curve SlrA of the union of the current cluster pair CL, CR and the aggregate curves Sl, Sr of the individual clusters in the current cluster pair CL, CR, block 608 may involve determining the average angle difference as described above by equation (6) (see blocks 470, 472 of
Block 610 involves an assessment of the proximity between the current cluster pair CL, CR. The block 610 assessment may attempt to overcome local noise by computing distances between clusters CL, CR that account for their average (rather than pointwise) width. This average width may be used when computing distances between clusters CL, CR in regions where the pointwise width is smaller than the average. To this end, the notion of a cluster envelope may be introduced and determined based on the cluster's width. A cluster envelope may be designed to reflect an average width of the cluster and to contain all cluster strokes. To determine a cluster envelope, an aggregate curve SA may be fit to each cluster in the current cluster pair CL, CR and the widths Wc of these curves may be determined in accordance with equation (7). Then, orthogonal rays may be projected left and right from densely ordered sample points on the cluster's aggregate curve SA. For each point on the cluster's aggregate curve SA, if the distance from the curve SA to the outermost intersection with a cluster stroke is larger than half the curve's width Wc, this intersection may be used as an envelope vertex; otherwise a point along the orthogonal ray at a half-width distance (Wc/2) may be used as the envelope vertex. The vertices may then be connected on the left and right of the cluster's aggregate curve SA to provide two envelope boundaries. The left and right vertices on the ends of the cluster may be connected to form a closed envelope. An example cluster envelope 630 is shown in
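The envelope vertex rule can be illustrated with a short sketch. It assumes the ray casting has already been done, so that for each sample on the aggregate curve the distance to the outermost stroke intersection on each side is known (0.0 where a ray hits nothing). The function name and argument layout are hypothetical.

```python
def cluster_envelope(samples, normals, left_hits, right_hits, curve_width):
    """Build a closed envelope polygon around an aggregate curve.

    samples    -- (x, y) points sampled along the aggregate curve SA
    normals    -- unit normals pointing to the left of the curve at each sample
    left_hits  -- distance to the outermost stroke intersection on the left
    right_hits -- distance to the outermost stroke intersection on the right
    """
    half = curve_width / 2.0
    left, right = [], []
    for (x, y), (nx, ny), d_left, d_right in zip(samples, normals, left_hits, right_hits):
        # Use the outermost intersection if it lies beyond Wc/2, else Wc/2.
        off_l = max(d_left, half)
        off_r = max(d_right, half)
        left.append((x + off_l * nx, y + off_l * ny))
        right.append((x - off_r * nx, y - off_r * ny))
    # Left boundary forward, right boundary backward: a closed polygon.
    return left + right[::-1]
```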
Block 610 may involve determining the median gap g within each cluster of the current cluster pair CL, CR. The median gap g for each cluster CL, CR may be determined by considering all gaps between adjacent intersections along all orthogonal rays emanating from sampled points on the aggregate curve and selecting the median gap. For median computation, rays that intersect only a single stroke and intersections which are less than a stroke width apart may be ignored.
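A minimal sketch of the median gap computation follows. It assumes each ray is supplied as a list of (position along the ray, stroke identifier) intersections, which is a hypothetical data layout, and it interprets "intersections which are less than a stroke width apart may be ignored" as discarding gaps smaller than the stroke width.

```python
from statistics import median


def median_gap(rays, stroke_width):
    """Median gap g between adjacent intersections over all rays of one cluster.

    rays -- list of rays; each ray is a list of (position, stroke_id) tuples.
    Returns None when no usable gap exists (e.g. a single-stroke cluster).
    """
    gaps = []
    for hits in rays:
        if len({stroke_id for _, stroke_id in hits}) < 2:
            continue  # ray intersects only a single stroke
        positions = sorted(position for position, _ in hits)
        for a, b in zip(positions, positions[1:]):
            if b - a >= stroke_width:  # assumption: drop near-coincident hits
                gaps.append(b - a)
    return median(gaps) if gaps else None
```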
In some embodiments, the block 610 proximity inquiry is positive if the local distance between the envelopes of the clusters in the current cluster pair CL, CR is less than Td·g everywhere along their side-by-side sections (block 610 YES branch). Otherwise, method 600 proceeds to block 612 (block 610 NO branch). The local distances between the two envelopes along the rays may be computed in block 610 for each cluster pair and compared to the threshold. Noise may be accounted for in the block 610 assessment by ignoring sequences of gaps larger than this threshold if the length of this sequence (measured as distance between the originating samples of the rays) is less than min(5Ws, 0.1·L) where L is the length of the aggregate curve SlrA of the union of the current cluster pair CL, CR.
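The noise-tolerant proximity test can be sketched as a scan over the rays in order along the side-by-side section. The argument names are hypothetical. The threshold is passed in by the caller, and the length of a run of over-threshold gaps is taken as the arc-length distance between the first and last originating samples of that run.

```python
def clusters_are_proximate(arc_positions, distances, threshold, stroke_width, curve_length):
    """Return True if envelope distances stay below `threshold`, up to short noisy runs.

    arc_positions -- increasing arc-length position of each ray's originating sample
    distances     -- local envelope-to-envelope distance measured along each ray
    """
    tolerance = min(5.0 * stroke_width, 0.1 * curve_length)
    run = []  # arc positions of the current run of over-threshold gaps
    for s, d in zip(arc_positions, distances):
        if d >= threshold:
            run.append(s)
            continue
        if run and run[-1] - run[0] >= tolerance:
            return False  # the run was too long to be treated as noise
        run = []
    return not (run and run[-1] - run[0] >= tolerance)
```

With this discretization an isolated over-threshold sample has run length zero and is always treated as noise; a denser sampling makes that approximation tighter.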
As mentioned above, relative proximity assessment of the type performed in block 610 encounters challenges in circumstances where there are fewer than three strokes being considered. This can be a challenge when attempting to assess the proximity of a single-stroke cluster in block 610. Moreover, when drawing free-hand, artists do occasionally draw outlier strokes, i.e. strokes that are intended to depict a target aggregate curve but are sufficiently inaccurate to be visually separate from the other strokes in their intended cluster. Such ambiguous configurations may be handled by leveraging the strength in numbers principle. If the current cluster pair CL, CR has one cluster with multiple strokes and one cluster with only one stroke, a relaxed version of the proximity test may be applied in block 610. In such a relaxed proximity test, the block 610 inquiry may be negative if the shortest distance between the single stroke and the envelope of the multi-stroke cluster is greater than the median gap g of the multi-stroke cluster (determined as discussed above), in which case method 600 proceeds to block 612 via the block 610 NO branch. Otherwise, as before, the gaps between the multi-stroke cluster's envelope and the single stroke may be measured and compared to Td·g. The strict proximity requirement discussed above may be relaxed: the block 610 inquiry may be positive (block 610 YES branch) if at least half the gaps within the side-by-side region between the multi-stroke cluster's envelope and the single stroke are below the threshold. If more than half of these gaps are above the threshold, the block 610 inquiry may be negative and method 600 may proceed to block 612 via the block 610 NO branch.
When both clusters of the current cluster pair CL, CR are single stroke clusters, block 610 may involve the same process as the multiple stroke-single stroke process described above, except that the stroke width Ws may be used instead of the gap size g.
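The relaxed proximity test for pairs involving single-stroke clusters can be sketched as follows. The function and parameter names are hypothetical; `reference_gap` stands for the median gap g of the multi-stroke cluster, or for the stroke width Ws when both clusters contain a single stroke.

```python
def relaxed_proximity(shortest_distance, gaps, reference_gap, t_d):
    """Relaxed block 610 test for pairs that include a single-stroke cluster.

    shortest_distance -- shortest distance between the single stroke and the
                         other cluster's envelope (or the other single stroke)
    gaps              -- gaps measured along rays within the side-by-side region
    reference_gap     -- median gap g, or stroke width Ws for two single strokes
    """
    if shortest_distance > reference_gap:
        return False  # block 610 NO branch
    threshold = t_d * reference_gap
    below = sum(1 for gap in gaps if gap < threshold)
    # Positive when at least half of the gaps fall below the threshold.
    return 2 * below >= len(gaps)
```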
If any one of the evaluations of blocks 604, 608 and 610 is negative, then the current cluster pair CL, CR is not merged and method 600 proceeds to block 612, where it loops back to block 602 to evaluate another cluster pair. If, on the other hand, all of the block 604, 608, 610 inquiries are positive, then method 600 proceeds to block 606 where the current cluster pair CL, CR is merged into a single cluster before looping back (via block 612) to block 602 to evaluate another cluster pair. In some embodiments, the newly formed cluster (formed by merging the current cluster pair in block 606) may be added to the third stage clustering map as a new cluster for consideration; that is, method 600 is recursive in the sense that a newly formed cluster (formed by merging a cluster pair in block 606) may itself be merged with a third cluster.
Loop 602A concludes when all of the cluster pairs (optionally including any newly formed clusters) have been evaluated, in which case method 600 proceeds from block 612 to block 614 which involves an inquiry into whether there are outliers in the current set of clusters. Outliers represent a common artifact present in raw artist drawings. When artists draw clearly erroneous strokes, instead of deleting them, they frequently simply hide them underneath wide clusters of overdrawn strokes. To detect such outliers, for each pair of single-stroke and multi-stroke clusters, containment may be assessed as follows. The single stroke S may be intersected with the multi-stroke cluster's envelope and the portion of the single stroke S which is outside the envelope may be measured. The stroke S may be classified as an outlier (block 614 YES branch) and removed from the clustering map (block 616) if the portion of the stroke S which is outside the envelope is less than some threshold percentage (e.g. 10% or 15% of its length).
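The containment test for outlier strokes can be approximated with a short sketch. As a simplifying assumption, the portion of the stroke outside the envelope is estimated by classifying each polyline segment by its midpoint with a standard even-odd ray-casting test, rather than by computing exact stroke-envelope intersections. All function names are hypothetical.

```python
import math


def point_in_polygon(point, polygon):
    """Even-odd ray-casting test for a point against a closed polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def outside_fraction(stroke, envelope):
    """Approximate fraction of the stroke's length lying outside the envelope."""
    total = outside = 0.0
    for a, b in zip(stroke, stroke[1:]):
        seg = math.dist(a, b)
        mid = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)
        total += seg
        if not point_in_polygon(mid, envelope):
            outside += seg
    return outside / total if total else 0.0


def is_outlier(stroke, envelope, max_outside=0.10):
    """Outlier if less than `max_outside` of the stroke lies outside the envelope."""
    return outside_fraction(stroke, envelope) < max_outside
```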
In the illustrated embodiment of
At the conclusion of method 600, the output is a final clustering map 212.
Returning now to
It will be appreciated that block 220 may be performed for each cluster in final clustering map 212. In determining the aggregate curve corresponding to a particular cluster, block 220 may seek to capture the curve's artist intended shape, and to explicitly preserve the slopes, or tangents, of the input strokes in the particular cluster. While the input points may be ordered along each given stroke, there is a challenge in block 220 in that there is, in general, no order between points on different strokes. Some curve fitting frameworks are not well designed for such data: traditional polyline or parametric curve fitting techniques for unordered data typically do not account for tangents, while implicit frameworks that use normals or tangents are typically designed for closed curves.
In some embodiments, block 220 involves the use of a modified Moving-Least-Squares (MLS) fitting algorithm (such as, for example, one of those described by Lee. 2000. Curve reconstruction from unorganized points. Computer aided geometric design 17, 2 (2000), 161-177 or Levin. 2004. Mesh-independent surface interpolation. In Geometric modeling for scientific visualization. 37-49, both of which are hereby incorporated herein by reference). The standard MLS formulation does not support tangent optimization, since tangent processing requires point order information which is not available in the MLS setting. To provide an ordering, the block 220 curve fitting may, in some embodiments, be split into three stages. First, an initial MLS optimization may be performed, where positions and tangents may be separately solved for; second, these positions and tangents may be used to compute an initial aggregate polyline; and third, the edges of this polyline may be aligned with the desired tangent directions.
To perform meaningful operations on point tangents in block 220, it is desirable for their orientations to be consistent. More specifically, it is desirable for point tangents along parallel or near-parallel strokes to have similar directions. This goal may be achieved as part of block 220 using a pair-based orientation method. The longest stroke in the cluster may be picked, and its orientation set as defined (for example, in the input or by some other orientation definition technique, e.g. user input). The orientations of all other strokes may be set to be undefined. The closest pair of one defined stroke and one undefined stroke may then be repeatedly selected based on a distance computed as described below. An orientation may be assigned to the undefined stroke such that t(p)·t(p′)>0 using their respective representative points (p, p′). A distance value may be assigned to each pair of strokes as follows. If the midpoint tangents of the two strokes are near perpendicular (e.g. the angle between them is larger than 60° in some embodiments), their orientation with respect to one another may not be well defined. In such cases, the distance between the two strokes may be set to be arbitrarily large. This choice delegates the orientation decision to other more reliable pairs if these exist. If the midpoint tangents of the two strokes are not nearly perpendicular, close and representative pairs of points on the two strokes may be located. To avoid points with unreliable normals, some embodiments involve only considering points on each stroke whose tangents are within some threshold (e.g. within 60°) of the midpoint tangent. The closest pair of such sample points (p, p′) may then be selected and the distance between them may be used as the pairwise stroke distance. This process worked well for the data tested by the inventors and involves less computational expense than more complex alternatives such as the eigenspace analysis described by Orbay et al. 2011. Beautification of design sketches using trainable stroke clustering and curve fitting. IEEE Trans. Vis. Comput. Graph 17, 5 (2011), 694-708, which is hereby incorporated herein by reference.
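The pair-based orientation method can be sketched as a greedy propagation from the longest stroke. The sketch assumes strokes are polylines of (x, y) points, estimates tangents by finite differences, and leaves a stroke in its input orientation when no reliable pair exists. These choices and all function names are illustrative assumptions.

```python
import math


def tangents(stroke):
    """Unit tangents of a polyline, estimated by finite differences."""
    n, out = len(stroke), []
    for i in range(n):
        (x0, y0), (x1, y1) = stroke[max(i - 1, 0)], stroke[min(i + 1, n - 1)]
        norm = math.hypot(x1 - x0, y1 - y0) or 1.0
        out.append(((x1 - x0) / norm, (y1 - y0) / norm))
    return out


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def pair_distance(s1, s2, max_angle_deg=60.0):
    """Return (distance, i, j) of the closest reliable point pair, or (inf, None, None)."""
    cos_max = math.cos(math.radians(max_angle_deg))
    t1, t2 = tangents(s1), tangents(s2)
    m1, m2 = t1[len(s1) // 2], t2[len(s2) // 2]
    if abs(dot(m1, m2)) < cos_max:  # near perpendicular: orientation ill-defined
        return math.inf, None, None
    best = (math.inf, None, None)
    for i, p in enumerate(s1):
        if dot(t1[i], m1) < cos_max:
            continue  # tangent deviates too much from the midpoint tangent
        for j, q in enumerate(s2):
            if dot(t2[j], m2) >= cos_max and math.dist(p, q) < best[0]:
                best = (math.dist(p, q), i, j)
    return best


def orient_strokes(strokes):
    """Greedily orient strokes so that nearby tangents point the same way."""
    strokes = [list(s) for s in strokes]
    lengths = [sum(math.dist(a, b) for a, b in zip(s, s[1:])) for s in strokes]
    defined = {max(range(len(strokes)), key=lengths.__getitem__)}
    undefined = set(range(len(strokes))) - defined
    while undefined:
        candidates = [(*pair_distance(strokes[a], strokes[b]), a, b)
                      for a in defined for b in undefined]
        _, i, j, a, b = min(candidates, key=lambda c: c[0])
        if i is not None and dot(tangents(strokes[a])[i], tangents(strokes[b])[j]) < 0:
            strokes[b].reverse()  # enforce t(p)·t(p') > 0
        defined.add(b)
        undefined.remove(b)
    return strokes
```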
The initial block 220 fitting step uses Moving-Least-Squares (MLS) with adaptive neighborhood size [see Lee 2000; Levin 2004, supra]. The basic MLS framework may be adapted to simultaneously compute both position and tangent values. MLS takes a point cloud as the input and projects these points to the position-error-minimized manifold (the position stroke SP in the case of the block 220 curve fitting process). To conduct the MLS projection step, each point may be associated with a local neighborhood. Following the method in Lee [2000], the neighborhood may be constructed by adaptively increasing the radius of a disk centered at each point. The radius may be increased until all points in this disk are adequately co-linear; that is, until the correlation reaches a minimum value ρ. Some embodiments may involve the use of an initial neighborhood size based on the stroke width Ws (e.g. h0=10Ws) and setting the minimum correlation to ρ=0.7. The point positions on SP may be obtained using the standard MLS projection.
The corresponding tangents may be determined as follows. Let p be the position of an input sample and t be its corresponding tangent. With the final neighborhood size h, the averaging kernel may now be defined for a position p0 with tangent t0 as:
The neighborhood N(p0) may be defined to include all the points p that satisfy ∥p−p0∥<αh and t·t0>β. Here, the neighborhood size h may be scaled by some suitable scaling factor α (e.g. α=0.6) to avoid tangent over-smoothing, since tangents are more sensitive than positions. The parameter β may be set to some suitable constant to avoid averaging outlier tangents, and the weighting function in the kernel is a Gaussian function, similar to the position Gaussian of the MLS projection.
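The tangent averaging can be illustrated with a hedged sketch. The kernel equation itself is not reproduced in this text, so the sketch assumes a normalized, Gaussian-weighted sum of the tangents in N(p0) with weight exp(−d²/h²). That Gaussian form, the default value of `beta` and the function name are assumptions made for illustration only.

```python
import math


def average_tangent(p0, t0, points, tangents, h, alpha=0.6, beta=0.5):
    """Gaussian-weighted average of the tangents in the neighborhood N(p0).

    A sample (p, t) is included when ||p - p0|| < alpha*h and t·t0 > beta.
    `beta` is a placeholder value; the Gaussian weight is an assumed form.
    """
    radius = alpha * h
    sx = sy = 0.0
    for (px, py), (tx, ty) in zip(points, tangents):
        d = math.hypot(px - p0[0], py - p0[1])
        if d >= radius or tx * t0[0] + ty * t0[1] <= beta:
            continue  # outside the neighborhood or an outlier tangent
        w = math.exp(-(d * d) / (h * h))
        sx += w * tx
        sy += w * ty
    norm = math.hypot(sx, sy)
    # Fall back to the input tangent when the neighborhood is empty.
    return (sx / norm, sy / norm) if norm > 0 else t0
```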
After determining the positions and tangents for points on SP, an ordered sequence of such points that will form the base for the output polyline may be extracted as part of block 220. This sequence may be determined as a path in a directed graph as follows. A Euclidean proximity graph may be constructed where each point is connected to all neighbors within a distance of h. Each edge in this graph may be assigned a direction that aligns with the averaged tangent of its two endpoints. When the dot product of the two tangents is negative, it suggests that one of them is an outlier and the edge is thus deleted. The minimum spanning tree of this directed graph may then be determined using, for example, Edmonds' algorithm (as disclosed in Chu. 1965. On the shortest arborescence of a directed graph. Science Sinica 14 (1965), 1396-1400 and Edmonds. 1967. Optimum branchings. J. Res. Nat. Bur. Standards 71B, 4 (1967), 233-240, both of which are hereby incorporated herein by reference). The tree may be trimmed down to its longest path. It may be ascertained if the path is closed by searching for a path from its end to its beginning. If such a path exists, and its length is below a suitable threshold value (5Ws in one particular implementation), SP may be labelled as closed. An artist may not precisely line up the start and end of a closed loop, and may accidentally extend the end of a closed loop past its starting point. To address this case, in addition to the start-to-end path, paths between all vertices within 10% away from the start and end points may be tested.
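The construction of the directed proximity graph can be sketched as follows. The sketch covers only edge creation, edge direction and outlier edge deletion; the spanning tree extraction (e.g. by Edmonds' algorithm), the trimming to a path and the closure test are not included. Weighting each edge by its Euclidean length is an assumption, as is the function name.

```python
import math


def directed_proximity_edges(points, tangents, h):
    """Return directed edges (source, target, length) of the proximity graph."""
    edges = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            length = math.dist(points[i], points[j])
            if length > h:
                continue  # not neighbors
            ti, tj = tangents[i], tangents[j]
            if ti[0] * tj[0] + ti[1] * tj[1] < 0:
                continue  # opposing tangents: one endpoint is likely an outlier
            avg = (ti[0] + tj[0], ti[1] + tj[1])
            step = (points[j][0] - points[i][0], points[j][1] - points[i][1])
            # Direct the edge so that it agrees with the averaged tangent.
            forward = avg[0] * step[0] + avg[1] * step[1] >= 0
            edges.append((i, j, length) if forward else (j, i, length))
    return edges
```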
Block 220 may then seek to optimize the polyline S={pi} (i=1, . . . , n) by aligning its edges (pi, pi+1) with the corresponding neighborhood tangents. In some embodiments, the objective function may be defined as:
where p_i^0 is the initial position of point p_i on the aggregate polyline curve. Here, the first term enforces tangent alignment and the second term reflects the expectation that the polyline stays close to its original position. The parameter λ may be set to some suitable value (e.g. λ=10^−3) to prioritize tangent alignment.
Equation (15) may be minimized using any suitable fitting technique (e.g. iterated least squares and/or the like). The kth round objective may be defined as
Here, the varying polyline edge length term in the denominator may be replaced with the known corresponding length in S^(k−1); K(p_i^(k−1), T) is the averaging kernel centered at position p_i^(k−1) and T is the input tangent set. The aggregate tangent update helps center the curve and diminish the impact of outlier stroke tangents.
This least-squares problem may be solved using standard Cholesky decomposition. For smooth input data a single tangent update step is typically sufficient. However, solving the problem for multiple rounds may provide better results for highly noisy cases. The inventors have found three iterations to be sufficient for all experiments.
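The iterated least-squares alignment can be illustrated with a sketch under stated assumptions. Because the objective is not reproduced in this text, the sketch assumes a per-round objective of the form Σ‖(p_{i+1} − p_i)/l_i − t_i‖² + λΣ‖p_i − p_i^0‖², where l_i is the edge length from the previous round and t_i is a target tangent. The callable `target_tangent` is a hypothetical stand-in for the averaging kernel. Since the resulting normal equations are tridiagonal, the sketch solves them with the Thomas algorithm rather than a general Cholesky factorization. Consecutive input points are assumed to be distinct.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal linear system."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for i in range(1, n):
        m = lower[i - 1] / d[i - 1]
        d[i] -= m * upper[i - 1]
        r[i] -= m * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x


def align_polyline(initial, target_tangent, lam=1e-3, rounds=3):
    """Align polyline edges with target tangents while staying near `initial`."""
    current = [tuple(p) for p in initial]
    n = len(current)
    for _ in range(rounds):
        # Edge lengths and target tangents are frozen from the previous round.
        lengths = [math.dist(a, b) for a, b in zip(current, current[1:])]
        weights = [1.0 / (l * l) for l in lengths]
        targets = [target_tangent(p) for p in current[:-1]]
        coords = []
        for axis in (0, 1):  # x and y decouple once the lengths are fixed
            diag = [lam] * n
            rhs = [lam * p[axis] for p in initial]
            for i, w in enumerate(weights):
                b = lengths[i] * targets[i][axis]
                diag[i] += w
                diag[i + 1] += w
                rhs[i] -= w * b
                rhs[i + 1] += w * b
            off = [-w for w in weights]
            coords.append(solve_tridiagonal(off, diag, off, rhs))
        current = list(zip(coords[0], coords[1]))
    return current
```

For example, calling `align_polyline(points, lambda p: (1.0, 0.0))` on a slightly zigzagging, roughly horizontal polyline flattens it toward a horizontal line while keeping its mean position.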
Once an aggregate curve is determined for each cluster in final clustering map 212, method 200 (
As discussed above, the block 220 method for determining an aggregate curve for a plurality of strokes may be used for determining any of the aggregate curves described herein.
Input (e.g. raw drawings 204 (
Suitable systems are not limited to the particular type shown in the schematic depiction of
Unless the context clearly requires otherwise, throughout the description and the
Words that indicate directions such as “vertical”, “transverse”, “horizontal”, “upward”, “downward”, “forward”, “backward”, “inward”, “outward”, “left”, “right”, “front”, “back”, “top”, “bottom”, “below”, “above”, “under”, and the like, used in this description and any accompanying claims (where present), depend on the specific orientation of the apparatus described and illustrated. The subject matter described herein may assume various alternative orientations. Accordingly, these directional terms are not strictly defined and should not be interpreted narrowly.
Embodiments of the invention may be implemented using specifically designed hardware, configurable hardware, programmable data processors configured by the provision of software (which may optionally comprise “firmware”) capable of executing on the data processors, special purpose computers or data processors that are specifically programmed, configured, or constructed to perform one or more steps in a method as explained in detail herein and/or combinations of two or more of these. Examples of specifically designed hardware are: logic circuits, application-specific integrated circuits (“ASICs”), large scale integrated circuits (“LSIs”), very large scale integrated circuits (“VLSIs”), and the like. Examples of configurable hardware are: one or more programmable logic devices such as programmable array logic (“PALs”), programmable logic arrays (“PLAs”), and field programmable gate arrays (“FPGAs”). Examples of programmable data processors are: microprocessors, digital signal processors (“DSPs”), embedded processors, graphics processors, math co-processors, general purpose computers, server computers, cloud computers, mainframe computers, computer workstations, and the like. For example, one or more data processors in a control circuit for a device may implement methods as described herein by executing software instructions in a program memory accessible to the processors.
Processing may be centralized or distributed. Where processing is distributed, information including software and/or data may be kept centrally or distributed. Such information may be exchanged between different functional units by way of a communications network, such as a Local Area Network (LAN), Wide Area Network (WAN), or the Internet, wired or wireless data links, electromagnetic signals, or other data communication channel.
For example, while processes or blocks are presented in a given order, alternative examples may perform routines having steps, or employ systems having blocks, in a different order, and some processes or blocks may be deleted, moved, added, subdivided, combined, and/or modified to provide alternatives or subcombinations. Each of these processes or blocks may be implemented in a variety of different ways. Also, while processes or blocks are at times shown as being performed in series, these processes or blocks may instead be performed in parallel, or may be performed at different times.
In addition, while elements are at times shown as being performed sequentially, they may instead be performed simultaneously or in different sequences. It is therefore intended that the following claims are interpreted to include all such variations as are within their intended scope.
Software and other modules may reside on servers, workstations, personal computers, tablet computers, image data encoders, image data decoders, PDAs, color-grading tools, video projectors, audio-visual receivers, displays (such as televisions), digital cinema projectors, media players, and other devices suitable for the purposes described herein. Those skilled in the relevant art will appreciate that aspects of the system can be practised with other communications, data processing, or computer system configurations, including: Internet appliances, hand-held devices (including personal digital assistants (PDAs)), wearable computers, all manner of cellular or mobile phones, multi-processor systems, microprocessor-based or programmable consumer electronics (e.g., video projectors, audio-visual receivers, displays, such as televisions, and the like), set-top boxes, color-grading tools, network PCs, mini-computers, mainframe computers, and the like.
The invention may also be provided in the form of a program product. The program product may comprise any non-transitory medium which carries a set of computer-readable instructions which, when executed by a data processor, cause the data processor to execute a method of the invention. Program products according to the invention may be in any of a wide variety of forms. The program product may comprise, for example, non-transitory media such as magnetic data storage media including floppy diskettes, hard disk drives, optical data storage media including CD ROMs, DVDs, electronic data storage media including ROMs, flash RAM, EPROMs, hardwired or preprogrammed chips (e.g., EEPROM semiconductor chips), nanotechnology memory, or the like. The computer-readable signals on the program product may optionally be compressed or encrypted.
In some embodiments, the invention may be implemented in software. For greater clarity, “software” includes any instructions executed on a processor, and may include (but is not limited to) firmware, resident software, microcode, and the like. Both processing hardware and software may be centralized or distributed (or a combination thereof), in whole or in part, as known to those skilled in the art. For example, software and other modules may be accessible via local memory, via a network, via a browser or other application in a distributed computing context, or via other means suitable for the purposes described above.
Where a component (e.g. a software module, processor, assembly, device, circuit, etc.) is referred to above, unless otherwise indicated, reference to that component (including a reference to a “means”) should be interpreted as including as equivalents of that component any component which performs the function of the described component (i.e., that is functionally equivalent), including components which are not structurally equivalent to the disclosed structure which performs the function in the illustrated exemplary embodiments of the invention.
Where a record, field, entry, and/or other element of a database is referred to above, unless otherwise indicated, such reference should be interpreted as including a plurality of records, fields, entries, and/or other elements, as appropriate. Such reference should also be interpreted as including a portion of one or more records, fields, entries, and/or other elements, as appropriate. For example, a plurality of “physical” records in a database (i.e. records encoded in the database's structure) may be regarded as one “logical” record for the purpose of the description above and the claims below, even if the plurality of physical records includes information which is excluded from the logical record.
Specific examples of systems, methods and apparatus have been described herein for purposes of illustration. These are only examples. The technology provided herein can be applied to systems other than the example systems described above. Many alterations, modifications, additions, omissions, and permutations are possible within the practice of this invention. This invention includes variations on described embodiments that would be apparent to the skilled addressee, including variations obtained by: replacing features, elements and/or acts with equivalent features, elements and/or acts; mixing and matching of features, elements and/or acts from different embodiments; combining features, elements and/or acts from embodiments as described herein with features, elements and/or acts of other technology; and/or omitting combining features, elements and/or acts from described embodiments.
Various features are described herein as being present in “some embodiments”. Such features are not mandatory and may not be present in all embodiments. Embodiments of the invention may include zero, any one or any combination of two or more of such features. This is limited only to the extent that certain ones of such features are incompatible with other ones of such features in the sense that it would be impossible for a person of ordinary skill in the art to construct a practical embodiment that combines such incompatible features. Consequently, the description that “some embodiments” possess feature A and “some embodiments” possess feature B should be interpreted as an express indication that the inventors also contemplate embodiments which combine features A and B (unless the description states otherwise or features A and B are fundamentally incompatible).
It is therefore intended that the following appended claims and claims hereafter introduced are interpreted to include all such modifications, permutations, additions, omissions, and sub-combinations as may reasonably be inferred. The scope of the claims should not be limited by the preferred embodiments set forth in the examples, but should be given the broadest interpretation consistent with the description as a whole.
This application claims the benefit under 35 U.S.C. § 119 of application No. 62/691,736, filed 29 Jun. 2018, and entitled StrokeAggregator: Consolidating Raw Sketches into Artist-Intended Curve Drawings which is hereby incorporated herein by reference for all purposes.
Number | Date | Country
---|---|---
62691736 | Jun 2018 | US