The present application relates to ASCII art and, in particular, to methods and devices for producing structure-based ASCII pictures.
Generally, ASCII pictures are divided into two major categories, i.e., tone-based ASCII pictures and structure-based ASCII pictures. The tone-based ASCII pictures maintain the intensity distribution of a reference image, while the structure-based ASCII pictures capture the major structure of the image content. For example, FIGS. 1(a) to 1(c) illustrate a reference image together with a tone-based ASCII picture and a structure-based ASCII picture generated therefrom.
For the structure-based ASCII pictures, a major challenge is the difficulty of depicting unlimited image content with a limited set of character shapes under the restrictive placement of characters over character grids. Two matching strategies usually employed by ASCII artists for producing structure-based ASCII pictures are shown in FIGS. 2(a) to 2(c).
In the present application, a method for generating a structure-based ASCII picture that captures the major structure of a reference image is proposed, in which an alignment-insensitive shape similarity metric is used. The proposed similarity metric tolerates misalignment while accounting for differences in transformation. In addition, a constrained deformation of the reference image may be introduced to increase the chance of character matching. Given an input and a target text resolution, the ASCII picture generation may be formulated as an optimization that minimizes shape dissimilarity and deformation.
The proposed techniques aim to provide a creative tool for producing structure-based ASCII art, which currently can only be created by hand. The tool needs few user interactions and can generate an ASCII work in a few minutes, with a quality comparable to work produced manually. It is a premier system for providing ASCII fans with an efficient and facile way of generating structure-based ASCII art.
According to an aspect of the present application, a method for producing an ASCII picture from a vector outline image, comprises: rasterizing and dividing the vector outline image into a plurality of grid cells, at least one of which has a reference image; matching each reference image with an ASCII character based on log-polar histograms thereof; and gathering all matched ASCII characters to form the ASCII picture.
According to another aspect of the present application, a device for producing an ASCII picture from a vector outline image, comprises: a rasterizing module configured to rasterize and divide the vector outline image into a plurality of grid cells, at least one of which has a reference image; a matching module configured to match each reference image with an ASCII character based on log-polar histograms thereof; and a generating module configured to gather all matched ASCII characters to generate the ASCII picture.
FIGS. 1(a), 1(b), and 1(c) illustrate a tone-based ASCII picture and a structure-based ASCII picture generated based on a reference image;
FIGS. 2(a), 2(b), and 2(c) illustrate two matching strategies usually employed by ASCII artists;
FIGS. 5(a), 5(b), and 5(c) illustrate an alignment-insensitive shape similarity metric used in the present application;
FIGS. 7(a) and 7(b) illustrate deformation of a line in the method according to the present application;
FIGS. 8(a), 8(b), 8(c), and 8(d) illustrate constrained deformation of an image in the method according to the present application;
Hereinafter, illustrative embodiments according to the present application will be described with reference to the accompanying drawings.
Referring to FIG. 3, a method for producing an ASCII picture from a vector outline image according to an embodiment of the present application comprises the following steps.
Rasterizing at Step 301
According to an embodiment of the present application, the vector outline image may be rasterized and divided into a plurality of grid cells based on a target resolution, as well as a width and a height of the ASCII characters to be used. For example, the target resolution is assumed to be Rw×Rh, where Rw and Rh are the maximum numbers of ASCII characters in the output ASCII picture along its horizontal and vertical directions, respectively. In general, ASCII character fonts may have varying thickness and varying character width. After a font and a size of the characters to be used are selected by a user, a width Tw and a height Th of the characters are determined, and so is the aspect ratio ∂=Th/Tw of the characters. In addition, the font thickness of the characters may be ignored via a centerline extraction, so that only fixed-width character fonts with a fixed aspect ratio ∂=Th/Tw need to be handled. Both the input polylines and the characters are rendered with the same line thickness so as to focus only on the shapes during matching. With the above parameters, the text resolution in the vertical direction can be calculated; the resolution is solely determined by a single variable Rw as Rh=⌈H/(∂⌈W/Rw⌉)⌉, where W and H are the width and height of the input image in pixels, which are available when the input image is given, and the variable Rw is specified by the user. Hence, the vectorized picture is first scaled and rasterized to a domain of TwRw×ThRh pixels. Each of the Rw×Rh grid cells is then matched with a corresponding character respectively.
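For illustration, the above resolution computation may be sketched as follows; the function name and the example values are illustrative only and do not appear in the present application.

```python
from math import ceil

def text_resolution(W, H, Rw, Tw, Th):
    """Derive the vertical text resolution Rh from the horizontal one.

    W, H   -- width and height of the input image in pixels
    Rw     -- user-specified number of characters per row
    Tw, Th -- character width and height in pixels (fixed-width font)
    """
    aspect = Th / Tw                      # character aspect ratio
    cell_w = ceil(W / Rw)                 # pixel width of one character cell
    Rh = ceil(H / (aspect * cell_w))      # Rh = ceil(H / (aspect * ceil(W/Rw)))
    return Rh

# e.g. a 400x300 input, 40 characters per row, 8x16 character cells
print(text_resolution(400, 300, 40, 8, 16))  # -> 15
```

The vectorized picture would then be rasterized to a TwRw×ThRh pixel domain before matching.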
Matching at Step 302
For ease of description, the image content in each of the grid cells to be matched with an ASCII character may also be referred to as a reference image. According to an embodiment, a log-polar histogram is used. In particular, after the vector outline image is rasterized and divided into a plurality of grid cells, each of the grid cells is sampled and the log-polar histogram of each sample point in the grid cell is determined. The log-polar histogram may characterize a shape feature in a local neighborhood of the sample point, covered by a log-polar window. As shown in FIG. 5(a), the log-polar window is divided into a plurality of bins, and the grayness of the pixels falling into each bin is summed to form the histogram.
During the pixel grayness summation, a black pixel has a grayness of 1 while a white pixel has a grayness of 0. A bin value h_{i,k} of the k-th bin according to an i-th sample point p may be calculated as h_{i,k}=Σ_{(q−p)∈bin(k)} I(q), where p is the i-th sample point, which is at the center of the log-polar window; q is the position of the currently considered pixel; (q−p) is the relative position of the point q to the center p of the log-polar window; and I(q) returns the grayness at the point q. Thus, a feature vector h_i including all the bin values h_{i,k} according to the i-th sample point p may be formed. For example, if there are 60 bins for each sample point, the feature vector h_i contains 60 bin values h_{i,k} (k=1, . . . , 60).
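For illustration, the bin accumulation described above may be sketched as follows. A 5×12 bin layout is assumed, consistent with the construction described later; the exact bin boundaries (logarithmic radial spacing, uniform angular spacing, center pixel skipped) are assumptions of this sketch.

```python
import numpy as np

def log_polar_histogram(img, p, radius, n_r=5, n_theta=12):
    """Sum pixel grayness into n_r x n_theta log-polar bins centred at p.

    img    -- 2-D array of grayness values (1.0 = black, 0.0 = white)
    p      -- (x, y) sample point at the centre of the log-polar window
    radius -- radius of the window coverage
    """
    h = np.zeros(n_r * n_theta)
    px, py = p
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            dx, dy = x - px, y - py
            r = np.hypot(dx, dy)
            if r == 0 or r > radius:
                continue                  # centre pixel or outside the window
            # radial bin index in log space, angular bin index uniform
            k_r = min(int(np.log1p(r) / np.log1p(radius) * n_r), n_r - 1)
            k_t = int((np.arctan2(dy, dx) + np.pi) / (2 * np.pi) * n_theta) % n_theta
            h[k_r * n_theta + k_t] += img[y, x]   # accumulate grayness I(q)
    return h
```

Concatenating the 60 bin values of one sample point yields the feature vector h_i used for matching.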
Similarly, each ASCII character is sampled and the log-polar histogram of each sample point in the ASCII character is determined. Both the reference image and the ASCII character are sampled with the same sample pattern, and a log-polar histogram is measured for each sample point. In order to account for a transformation (or position) difference, in an embodiment, the reference image and the ASCII character are sampled with the same grid layout pattern, so that the i-th sample point on the reference image and the i-th sample point on the ASCII character occupy corresponding positions.
Thus, the shape dissimilarity between the reference image and an ASCII character may be obtained by comparing their feature vectors on a point-by-point basis. The ASCII character whose feature vector is closest to that of a certain grid cell may be selected to match the grid cell.
According to an embodiment, the shape dissimilarity may be measured by comparing the feature vectors on a point-by-point basis, given by

D_AISS(S, S′) = (1/M)·Σ_{i=1}^{N} ∥h_i − h_i′∥, (1)
where h_i (h_i′) is the feature vector of the i-th sample point on the reference image S (the ASCII character S′); M=(n+n′) is the normalization factor; and n (n′) is the total grayness of the shape S (S′). The shape S′ minimizing the shape dissimilarity with the shape S may be selected to match the shape S. Hereinafter, the shape dissimilarity between the j-th cell in the vector outline image and its corresponding ASCII character is also denoted as D_AISS^j.
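For illustration, the matching based on the above point-by-point dissimilarity may be sketched as follows; the candidate-triple layout is an assumption of this sketch, not part of the present application.

```python
import numpy as np

def shape_dissimilarity(h_ref, h_char, n_ref, n_char):
    """Point-by-point dissimilarity between a cell's content S and a character S'.

    h_ref, h_char -- per-sample-point feature vectors (N x 60 arrays),
                     taken with the same grid layout pattern
    n_ref, n_char -- total grayness of S and S' (their sum is the factor M)
    """
    M = n_ref + n_char
    return sum(np.linalg.norm(a - b) for a, b in zip(h_ref, h_char)) / M

def best_match(h_ref, n_ref, candidates):
    """candidates: (character, histograms, grayness) triples; return the
    character minimizing the dissimilarity with the reference cell."""
    return min(candidates,
               key=lambda c: shape_dissimilarity(h_ref, c[1], n_ref, c[2]))[0]
```

Identical shapes score 0, and the normalization by M makes scores of different cells comparable.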
With such matching, the normalization factor counteracts the influence of absolute grayness, and the global transformation of the shape is accounted for. Besides, the log-polar histogram natively accounts for orientation, and since all log-polar histograms share the same scale, scale is accounted for as well. Thus, the method according to the present application is alignment-insensitive yet transformation-variant.
In the above, the histograms are empirically constructed with 5 bins along the radial axis in log space and 12 bins along the angular axis, so that a histogram includes 60 bins in total. The radius of the coverage window is selected to be about half of the shorter side of a character. The number of sample points N equals (Sw/2)×(Sh/2), where Sw and Sh are the width and height of the image in pixels, denoting the total numbers of pixels in the horizontal and vertical directions of the image, respectively. To suppress the aliasing due to the discrete nature of the bins, the image may be filtered by a Gaussian kernel, for example, of a size of 7×7, before the shape feature is measured.
The metric according to the present application has been evaluated by comparison with three commonly used metrics: the classical shape context (a translation- and scale-invariant metric), SSIM (an alignment-sensitive, structure similarity metric), and the RMSE (root mean squared error) after blurring. For the last metric, the RMSE is measured after the compared images are blurred by a Gaussian kernel of a size of 7×7, for better performance.
The comparison shows that the metric according to the present application distinguishes shapes in a manner more consistent with human perception than the compared metrics.
According to a second embodiment of the present application, the reference image is deformed slightly so as to raise the chance of character matching.
Referring to FIG. 9, the method 900 according to the second embodiment is illustratively shown.
The steps 903-905 are performed iteratively until it is determined in step 906 that the iterative deformation is to be terminated. The above-described process may be referred to as a discrete optimization. In each iterative deformation during the optimization, one vertex in the vector outline image is selected randomly, and its position is displaced randomly by a distance of at most the length of the longer side of the character image. Then, all grid cells affected by this displacement are identified and best-matched with ASCII characters again. For each deformation, an average shape dissimilarity value is obtained. In this case, it may be determined in the step 906 whether the obtained average shape dissimilarity value is lower than the previous ones. If a predetermined number of consecutively obtained average shape dissimilarity values fail to fall below the best value obtained so far, it may be determined in the step 906 that the optimization should be terminated.
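For illustration, the discrete optimization described above may be sketched as follows. The evaluate callback stands for re-matching the affected grid cells and averaging the shape dissimilarity, and the per-axis uniform displacement is a simplification of the at-most-one-character-side displacement.

```python
import random

def optimize(vertices, evaluate, max_step, stall_limit=100, seed=0):
    """Deform the outline by random vertex displacements, keeping the best.

    vertices    -- list of [x, y] control points of the vector outline
    evaluate    -- returns the average shape dissimilarity of the ASCII
                   picture best-matched to the current (deformed) outline
    max_step    -- at most the length of the longer character side
    stall_limit -- terminate after this many consecutive non-improvements
    """
    rng = random.Random(seed)
    best = evaluate(vertices)
    best_vertices = [v[:] for v in vertices]
    stall = 0
    while stall < stall_limit:
        i = rng.randrange(len(vertices))             # pick one random vertex
        old = vertices[i][:]
        vertices[i][0] += rng.uniform(-max_step, max_step)
        vertices[i][1] += rng.uniform(-max_step, max_step)
        score = evaluate(vertices)                   # re-match affected cells
        if score < best:
            best, stall = score, 0
            best_vertices = [v[:] for v in vertices]
        else:
            vertices[i] = old                        # revert the displacement
            stall += 1
    return best_vertices, best
```

The result with the minimum average shape dissimilarity over all tried deformations is retained as the output.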
Alternatively, it may be determined at step 906 that the vector outline image has been deformed a predetermined number of times. If that is the case, the optimization is terminated.
After the termination is determined in the step 906, an ASCII picture having the minimum average shape dissimilarity with its corresponding vector outline image is selected at step 907 from all the produced ASCII pictures so as to approximate the original image. For example, after the image has been deformed a predetermined number of times, the optimal image 405 is selected for output.
In the above second embodiment, however, an unconstrained deformation may destroy the global structure of the input image. In view of this, a third embodiment is proposed to quantify the deformation, so that the deformation value is taken into account when selecting the ASCII picture.
In this embodiment, both the deformation of the vectorized image and the dissimilarity between the characters and the deformed picture are considered. A flow chart of the method 1000 according to the third embodiment is illustratively shown in FIG. 10.
Local Constraint
According to the embodiment, the deformation of the vector outline image is determined based on each line segment in the image. The local deformation of each line segment, in terms of orientation and scale, is determined as below. Consider an original line segment AB (hereinafter referred to as line segment Li) deformed to A′B′, as shown in FIGS. 7(a) and 7(b).
Here, Vθ and Vr are the deformation values with respect to orientation and scale, respectively, and the larger of the two finally determines the local deformation of a line segment. θ∈[0, π] is the angle between the original line segment and the deformed line segment. r and r′ denote the lengths of the original line segment and the deformed line segment, respectively. Parameters λ1, λ2, and λ3 are weights used to control the deformation due to the variation of angle, absolute length, and relative length, respectively; they may be empirically set to values of 8/π, 2/min{Tw, Th}, and 0.5. The function exp( ) ensures that the deformation value is not less than 1.0. The variation of absolute length is evaluated by using the distance |r′−r|, and the variation of relative length by using the ratio of the lengths.
Thus the local deformation value for each line segment is obtained.
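For illustration only: the exact form of Equation (2) is not reproduced in the text above, so the sketch below assumes one plausible form consistent with the description, namely Vθ = exp(λ1·θ) and Vr = exp(λ2·|r′−r| + λ3·(max{r,r′}/min{r,r′} − 1)), so that an undeformed segment scores exactly 1.0.

```python
from math import exp, pi, hypot, acos

def local_deformation(a, b, a2, b2, Tw=8, Th=16):
    """Local deformation of segment AB deformed to A'B' (assumed Eq. (2) form).

    lam1, lam2, lam3 weight the angle, absolute-length and relative-length
    variation, set empirically to 8/pi, 2/min{Tw, Th}, and 0.5 as in the text.
    """
    lam1, lam2, lam3 = 8 / pi, 2 / min(Tw, Th), 0.5
    vx, vy = b[0] - a[0], b[1] - a[1]
    wx, wy = b2[0] - a2[0], b2[1] - a2[1]
    r, r2 = hypot(vx, vy), hypot(wx, wy)
    # angle between the original and the deformed segment, in [0, pi]
    cos_t = max(-1.0, min(1.0, (vx * wx + vy * wy) / (r * r2)))
    theta = acos(cos_t)
    V_theta = exp(lam1 * theta)                            # orientation term
    V_r = exp(lam2 * abs(r2 - r)                           # absolute length
              + lam3 * (max(r, r2) / min(r, r2) - 1.0))    # relative length
    return max(V_theta, V_r)                               # always >= 1.0

# an undeformed segment has deformation value exactly 1.0
print(local_deformation((0, 0), (1, 0), (0, 0), (1, 0)))  # -> 1.0
```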
Accordingly, a local deformation value of the image content in a grid cell j, D_local^j, may be determined as follows. All line segments Li intersecting the current cell j are identified at first and denoted as a set {L}, and li is the length of the part of the line segment Li occupied by the grid cell j. Then, the local deformation value of the cell j may be calculated as the weighted average of the deformation values of the involved line segments:

D_local^j = (Σ_{Li∈{L}} li·D_local(Li)) / (Σ_{Li∈{L}} li). (3)
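The weighted average over the segments intersecting a cell may be sketched as follows; the pair layout is illustrative only.

```python
def cell_deformation(segments):
    """Weighted average of segment deformation values inside one grid cell.

    segments -- list of (l_i, D_i) pairs: length of the part of segment L_i
                inside the cell, and its deformation value D_local(L_i)
    """
    total = sum(l for l, _ in segments)
    if total == 0:
        return 1.0            # empty cell: treated as undeformed
    return sum(l * d for l, d in segments) / total

print(cell_deformation([(2.0, 1.0), (2.0, 3.0)]))  # -> 2.0
```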
In this case, in step 1007, an energy E may be defined as an overall objective function to evaluate the ASCII picture produced for each deformed image:

E = (1/K)·Σ_{j=1}^{m} D_AISS^j · D_local^j, (4)
where m is the total number of character cells, and K is the number of non-empty cells, used as the normalization factor. D_AISS^j is the dissimilarity between the j-th cell content and its best-matched character, as defined in Equation (1) and determined in the step 1002 or 1005. The term D_local^j is the local deformation value of the j-th cell determined in the step 1006. It is understood that, for the original vector outline image, which is not deformed, the term D_local^j may be deemed to be equal to 1.
Similarly, the steps 1003-1007 are performed iteratively until it is determined in step 1008 that the iterative deformation is to be terminated. The above-described process may be referred to as a discrete optimization. In each iterative deformation during the optimization, one vertex in the vector outline image is selected randomly, and its position is displaced randomly by a distance of at most the length of the longer side of the character image. Then, all grid cells affected by this displacement are identified and best-matched with ASCII characters again. For each deformation, an energy value E is obtained. In this case, a simulated annealing scheme may also be used. In particular, in the step 1008, it is determined whether the recomputed E is reduced. If so, the displacement is accepted. Otherwise, a transition probability Pr=exp(−Δ/t) may be used to make the decision, where Δ is the energy difference between two iterations; t=0.2·t_a·0.997^c is the temperature; c is the iteration index; and t_a is the initial average matching error of all grid cells. If Pr is larger than a random number drawn uniformly from [0, 1], the displacement is accepted; otherwise, it is rejected. The optimization is terminated whenever E has not been reduced for co consecutive iterations; for example, the parameter co may be selected to be 5000 in one implementation. Thus, if the energy E is not reduced for co consecutive iterations, it may be determined in the step 1008 that the optimization should be terminated.
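For illustration, the acceptance test of the simulated annealing scheme may be sketched as follows; the acceptance direction follows the standard simulated-annealing rule of keeping a worsening move with probability Pr.

```python
import math
import random

def accept(delta, c, t_a, rng=None):
    """Decide whether to keep a displacement under simulated annealing.

    delta -- energy difference between the two iterations (new minus old)
    c     -- iteration index, driving the cooling schedule
    t_a   -- initial average matching error of all grid cells
    """
    if delta < 0:
        return True                       # E reduced: always accept
    rng = rng or random.Random(0)
    t = 0.2 * t_a * 0.997 ** c            # temperature t = 0.2 * t_a * 0.997^c
    pr = math.exp(-delta / t)             # transition probability Pr
    return rng.random() < pr              # accept with probability Pr
```

Accepting occasional worsening moves early on lets the optimization escape poor local minima before the temperature cools.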
Alternatively, it may be determined at step 1008 that the vector outline image has been deformed a predetermined number of times. If that is the case, the optimization is terminated.
After the termination is determined in the step 1008, the ASCII picture corresponding to the minimum energy value E may be selected at step 1009 to approximate the original image.
The local deformation constraint as discussed above prevents over-deformation in a local scale. However, it cannot avoid over-deformation in a global scale; for example, line segments that deform only slightly on their own may nevertheless drastically change their relative positions and orientations with respect to one another.
To constrain the deformation in a global scale, a fourth embodiment according to the present application, including a 2D accessibility constraint, is further proposed so that the relative orientation and position between the current line segment and its surrounding line segments are maintained.
Referring to FIG. 11, the method 1100 according to the fourth embodiment is illustratively shown.
Global Constraint
To compute the global deformation of a line segment in the deformed image, say the line segment AB, the 2D accessibility is evaluated at a sample point P on the segment. The surrounding line segments intersected from the point P are located, with the k-th intersection point denoted as Pk. The global deformation value of the line segment Li may then be determined as

D_access(Li) = Σ_{k=1}^{nl} wk·D_local(PPk), (5)
where nl is the total number of intersected line segments for the point P; D_local(PPk) is the local deformation of the line segment PPk, as defined in Equation (2); and wk is a weight, which may be computed as the normalized distance wk=|PPk|/(Σ_{k=1}^{nl}|PPk|), so that the contributions of the surrounding points Pk are normalized.
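For illustration, assuming the accessibility value is the weighted sum of the local deformations of the virtual segments PPk (one plausible reading of Equation (5)), it may be sketched as follows:

```python
def access_deformation(dists, local_defs):
    """Global (accessibility) deformation at a sample point P.

    dists      -- |PP_k| for each surrounding intersection point P_k
    local_defs -- D_local(PP_k) for the corresponding virtual segments
    """
    total = sum(dists)
    weights = [d / total for d in dists]      # w_k = |PP_k| / sum |PP_k|
    return sum(w, for_ := None) if False else sum(
        w * d for w, d in zip(weights, local_defs))

print(access_deformation([1.0, 3.0], [1.0, 1.0]))  # -> 1.0
```

When every virtual segment is undeformed, the weights sum to one and the accessibility deformation is exactly 1.0.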
Hence, the overall deformation value may be determined by
D_deform(Li) = max{D_local(Li), D_access(Li)}. (6)
With the above metric, the deformation value for each line segment of the deformed polyline may be evaluated. Thus, a deformation value of the image content in a grid cell j, D_deform^j, may be determined. In particular, all line segments intersecting the current cell j are identified at first and denoted as a set {L}, and li is the length of the part of the line segment Li occupied by the grid cell j. Then, the deformation value of the cell j may be calculated as the weighted average of the deformation values of the involved line segments:

D_deform^j = (Σ_{Li∈{L}} li·D_deform(Li)) / (Σ_{Li∈{L}} li). (7)
In this case, in the step 1107, the energy E, also referred to as the objective function, may be defined as

E = (1/K)·Σ_{j=1}^{m} D_AISS^j · D_deform^j, (8)
where m is the total number of character cells, and K is the number of non-empty cells, used as the normalization factor. D_AISS^j is the dissimilarity between the j-th cell content and its best-matched character, as defined in Equation (1), and D_deform^j is the deformation value of the j-th cell.
After an optimization as stated with reference to the second embodiment, upon the termination determined at the step 1108, the ASCII picture corresponding to the minimum energy value E may be selected at step 1109 to approximate the original image.
Hereinabove, illustrative embodiments of the method for producing structure-based ASCII pictures according to the present application are described. The device for producing structure-based ASCII pictures according to the present application will be described as below.
A device 1200 for producing structure-based ASCII pictures according to an embodiment of the present application is illustratively shown in FIG. 12. The device 1200 comprises a rasterizing module 1210, a matching module 1220, and a generating module 1230, as described above.
The matching module 1220 may comprise: a sampling unit 1221 configured to sample the reference image and all ASCII characters by using a same layout pattern; a calculating unit 1222 configured to determine a shape dissimilarity between the reference image and each of the ASCII characters; a comparing unit 1223 configured to compare the determined shape dissimilarities with one another; and a selecting unit 1224 configured to select the ASCII character having the minimum shape dissimilarity with the reference image to match the reference image.
The calculating unit 1222 may be further configured to calculate a feature vector of each sample point in both the reference image and each of the ASCII characters based on the log-polar histograms thereof; and to determine the shape dissimilarity between the reference image and each of the ASCII characters based on the determined feature vector.
The rasterizing module 1210 may be further configured to rasterize the vector outline image into the plurality of grid cells based on a target resolution as well as a height and a width of the ASCII characters selected to be used; and to render the vector outline image with lines having the same thickness as that of the ASCII characters by extracting a centerline of the vector outline image.
The feature vector of an i-th sample point p may comprise a plurality of components, each of which represents a summed grayness value over a set of pixels in a bin of the log-polar window associated with the sample point p.
The shape dissimilarity between a grid cell and an ASCII character may be determined based on feature vectors corresponding to the sample points in the grid cell and the ASCII character.
The calculating unit 1222 may be further configured to determine a plurality of first feature vectors corresponding to the sample points in the grid cell; determine a plurality of second feature vectors corresponding to the sample points in the ASCII character; calculate a norm of the difference between the first and second feature vectors; and normalize the norm to form the shape dissimilarity.
A device 1300 for producing structure-based ASCII pictures according to another embodiment of the present application is illustratively shown in FIG. 13.
In this embodiment, the calculating unit 1322 may be further configured to determine an average shape dissimilarity of the vector outline image with its matched ASCII picture; determine an average shape dissimilarity of each deformed image with its matched ASCII picture; and select an ASCII picture corresponding to a minimum average shape dissimilarity value. Alternatively, the calculating unit 1322 may be further configured to determine a local deformation value of each deformed image; and select an ASCII picture based on both the shape dissimilarity and the local deformation value. Alternatively, the calculating unit 1322 may be further configured to determine a final deformation value of each deformed image according to the larger of a local deformation value and a global deformation value of each line segment in the deformed image; and select an ASCII picture based on the shape dissimilarity and the final deformation value.
It should be noted that the energy values of different text resolutions are directly comparable, as the objective function is normalized. Hence, the results generated under multiple text resolutions may be compared directly to select an optimal one.
In a current implementation, simulated annealing is employed to solve this discrete optimization problem, although other tools for solving discrete optimization may also be applicable.
Users may opt to allow the system to determine the optimal text resolution by choosing the minimum objective value among the results of multiple resolutions, as the objective function is normalized to the text resolution.
Number | Name | Date | Kind |
---|---|---|---|
20090115797 | Poupyrev et al. | May 2009 | A1 |
Number | Date | Country |
---|---|---|
2010074324 | Apr 2010 | JP |
Entry |
---|
Schlosser et al. “Fast Shape Retrieval Based on Shape Contexts.” Proceedings of 6th International Symposium on Image and Signal Processing and Analysis, Sep. 16, 2009, pp. 293-298. |
English Translation of JP2010074324. |
Arkin et al. “An Efficiently Computable Metric for Comparing Polygonal Shapes.” IEEE Transactions on Pattern Analysis and Machine Intelligence. 13(3):209-216 (1991). |
Bayer. “An Optimum Method for Two-Level Rendition of Continuous-Tone Pictures.” IEEE International Conference on Communications. 1:11-15 (1973). |
Belongie et al. “Shape Matching and Object Recognition Using Shape Contexts.” IEEE Transactions on Pattern Analysis and Machine Intelligence. 24(24):509-522 (2002). |
Cohen et al. “Tracking Points on Deformable Objects Using Curvature Information.” Proceedings of the Second European Conference on Computer Vision. 458-466 (1992). |
Crawford. “ASCII Graphical Techniques v1.0.” 1994. |
Goh. “Strategies for Shape-Matching Using Skeletons.” Computer Vision and Image Understanding. 110:326-345 (2008). |
Kang et al. “Coherent Line Drawing.” Proc. ACM Symposium on Non-photorealistic Animation and Rendering. pp. 43-50, San Diego, CA (2007). |
Milios. “Shape Matching using Curvature Processes.” Computer Vision, Graphics, and Image Processing. 47(2):960-963 (1989). |
Mori et al. “Efficient Shape Matching Using Shape Contexts.” IEEE Transactions on Pattern Analysis and Machine Intelligence. 27(11):1832-1837 (2005). |
O'Grady et al. “Automatic ASCII Art Conversion of Binary Images Using Non-Negative Constraints.” Signals and Systems Conference (2008). |
Randall. “FAQ: New to ASCII Art? Read Me First!” Archive ascii-art/faq. 2003. |
Sundar et al. “Skeleton Based Shape Matching and Retrieval.” Shape Modeling International. 2003. |
Torsello et al. “A Skeletal Measure of 2D Shape Similarity.” Computer Vision and Image Understanding. 95(1). pp. 1-12. (2004). |
Wang et al. “Image Quality Assessment: From Error Visibility to Structural Similarity.” IEEE Transactions on Image Processing. 13(4):600-612 (2004). |
Zahn et al. “Fourier Descriptors for Plane Closed Curves.” IEEE Transactions on Computers. C21(3):269-281 (1972). |
Number | Date | Country | |
---|---|---|---|
20110317919 A1 | Dec 2011 | US |