The present invention relates to an image inspection method and a sound inspection method capable of detecting an error in an image and sound included in a digital image and sound signal.
In recent years, communication lines and other infrastructure have improved, digital image and sound signals have come to be transmitted from overseas, and overseas content can now be viewed domestically with ease. However, domestic and overseas communication facilities sometimes use different communication systems, so it is difficult to completely prevent noise from being mixed into the signals when the digital image and sound signals are converted. When such noise is mixed into an image signal, an error such as an image disorder or block noise sometimes occurs. Likewise, when noise is mixed into a sound signal, it is sometimes perceived as an error such as a “puff” sound (audio pop noise). Because such errors can make the audience uncomfortable, a content inspection is carried out in which an examiner actually views the content in advance. However, the content inspection requires prolonged viewing with human eyes and ears, so the inspection result varies greatly with the examiner's physical condition and with individual differences. The inspection facility is also a large burden. Accordingly, there is a demand for machine inspection in place of human inspection.
Concerning this, Patent Document 1 discloses a technique in which pixels are differentiated for each predetermined rectangular block in order to mechanically detect block noise.
PTL 1: Japanese Unexamined Patent Application Publication No. 2001-119695
PTL 2: Japanese Unexamined Patent Application Publication No. 2013-81078
However, the techniques of Patent Documents 1 and 2 apply only to image signals that have been subjected to compression and decompression processing, and a method for detecting errors caused by all kinds of noise, such as a communication line problem, a VTR failure, or other failures, has not yet been achieved. In addition, techniques for inspecting sound signals with high precision for a “puff” sound or the like caused by noise have not been realized.
It is an object of the present invention to provide an image inspection method for detecting an image disorder caused by noise that occurs due to various causes in the digital image signal. Also, it is another object of the present invention to provide a sound inspection method for detecting a sound error caused by noise that occurs due to various causes in the digital sound signal.
According to a first embodiment of the present disclosure, there is provided an image inspection method including: sampling a continuous digital image signal by dividing the signal at intervals of less than or equal to 20 msec; extracting a high-frequency component from the sampled signal; and detecting an error that has occurred in an image on the basis of the extracted high-frequency component.
With the present invention, it is possible to sample a continuous digital image signal by dividing the signal at intervals of less than or equal to 20 msec, which is a very short time period, to extract a high-frequency component from the sampled signal, and to detect, on the basis of the extracted high-frequency component, an error that has occurred in an image with high precision and in distinction from the actual content.
It is preferable to divide one frame of the digital image signal into a plurality of areas, and to detect the error for each of the areas.
It is preferable that the error is an image disorder, and the extracted high-frequency component is an activity, which is the average of the variances of the digital image signal for each block.
It is preferable that, when the activity Vn(t) is second-order differentiated with respect to time t to obtain d²Vn(t)/dt², a determination is made that an image disorder has occurred if the acceleration (d²Vn(t)/dt²)/Vn(t−1) is arranged in order of “positive, negative, and positive” or “negative, positive, and negative” along the time axis.
It is preferable that, when the error is block noise, pixel values in an inspection block of the image signal are subjected to orthogonal transformation, and a determination is made that block noise has occurred if the transformation coefficients satisfy a predetermined condition.
It is preferable that when the transformation coefficient satisfies the predetermined condition, a determination is made that a corner has occurred in content displayed by the image signal.
It is preferable that a corner due to block noise is distinguished from a corner due to the content on the basis of the number of corners and the deviation thereof.
According to a second embodiment of the present disclosure, there is provided a sound inspection method including: sampling a continuous digital sound signal by dividing the signal at intervals of less than or equal to 5 msec; extracting a high-frequency component from the sampled signal; and detecting an error that has occurred in a sound on the basis of the extracted high-frequency component.
With the present invention, it is possible to sample a continuous digital sound signal by dividing the signal at intervals of less than or equal to 5 msec, which is a very short time period; to extract a high-frequency component from the sampled signal; and to detect, on the basis of the extracted high-frequency component, noise that has occurred in a sound with high precision and in distinction from the actual content.
It is preferable that when the digital sound signal is recorded on a plurality of channels, the error is detected for each of the channels.
It is preferable that when sampling is performed at time t along the time axis, frequency conversion is performed on the sampled signal, and n power values Pn(t) and a total power value P(t) in a predetermined bandwidth are obtained, respectively,
[1] if the total power value P(t) is higher than a first threshold value, and
[2] if a value (P(t)/P(t−T)) produced by dividing the total power value P(t) by total power value P(t−T) at time (t−T) before that time, and a value (P(t)/P(t+T)) produced by dividing the total power value P(t) by total power value P(t+T) at time (t+T) after that time are individually higher than a second threshold value, and
[3] if values (Pn(t)/P(t)) produced by dividing the individual power values Pn(t) by the total power value P(t) are higher than a third threshold value, a determination is made that an error has occurred.
It is preferable that, when three power values Pn(t) along the time axis are compared, a determination is made that sound skipping has occurred if a first power value Pn(t−T5) and a third power value Pn(t+T+T5) are higher than a fourth threshold value and a string of second power values Pn(t), . . . , Pn(t+T) is lower than a fifth threshold value.
It is preferable that, when three power values Pn(t) along the time axis are compared, a determination is made that noise has occurred if a first power value Pn(t−T5) and a third power value Pn(t+T+T5) are lower than a sixth threshold value and a string of second power values Pn(t), . . . , Pn(t+T) is higher than a seventh threshold value.
With the present invention, it is possible to provide an image inspection method for detecting an image disorder caused by noise generated in a digital image signal due to various causes. Also, it is possible to provide a sound inspection method for detecting a sound error caused by noise generated in a digital sound signal due to various causes.
A description will be given of an image and sound inspection apparatus capable of achieving an image inspection method and a sound inspection method according to the present embodiment with reference to the drawings.
(Detection of Image Disorder)
An “image disorder” means a phenomenon in which a content image instantaneously disappears and then returns to normal between frames, or in which the content image is shifted. Here, a description will be given by taking, as an example, an image and sound signal conforming to the BTAS-001B standard for 1125/60 system HDTV (high-definition television) broadcasting, which is standardized by the Association of Radio Industries and Businesses (ARIB), a general incorporated association. Such an image signal includes a luminance signal Y, and color-difference signals Pb and Pr.
When an image and sound signal is input from the input unit 11 to the extraction unit 12, the extraction unit 12 divides the range of lines V1 to V2 and pixels H1 to H2 in one frame into four fields (areas) A, B, C, and D as illustrated in
More specifically, if it is assumed that there are 8 pixels from the frame ends to H1 and H2, respectively, and there are 8 pixels from the frame ends to V1 and V2, it is possible to set an inspection target frame to have H2=1864 pixels in the horizontal direction, and to have V2=536 lines in the vertical direction, and thus one field produced by dividing this by four has 928 pixels and 264 lines. Here, as illustrated in
Further, the average of the signals as a DC component and the variance as an AC component are obtained for each small block. In other words, obtaining the variance as the video activity corresponds to extracting a high-frequency component. Expression (1) gives the average A(k) of the luminance signal Y in a small block #k, and expression (2) gives the variance V(k) of the luminance signal Y in the small block #k. Thereby, the average A(k) and the variance V(k) are obtained in accordance with the number of blocks in each of the fields A to D (k=1 to 1914).
Further, the average A(k) and the variance V(k) obtained in accordance with the expressions (1) and (2) are averaged for each one field. An expression (3) is an expression for obtaining video averages FkA=L11, L21, L12, and L22 of each field, and an expression (4) is an expression for obtaining activity averages VkA=S11, S21, S12, and S22 of each field.
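The per-block statistics and per-field averages described above can be sketched as follows. This is a minimal illustration, not the apparatus's actual implementation: the function names are hypothetical, the block size of 16 pixels by 8 lines is an assumption chosen only because it yields the stated 1914 blocks for a field of 928 pixels by 264 lines, and the variance uses the population normalization because expressions (1) to (4) are not reproduced in the text.

```python
import numpy as np

def block_stats(field, block_h=8, block_w=16):
    """Mean A(k) and variance V(k) of luminance for every small block of one field.

    block_h and block_w are assumed values (8 lines x 16 pixels).
    """
    lines, pixels = field.shape
    rows, cols = lines // block_h, pixels // block_w
    data = field[:rows * block_h, :cols * block_w].astype(np.float64)
    # After the reshape, axes 1 and 3 run over the samples inside each block.
    blocks = data.reshape(rows, block_h, cols, block_w)
    means = blocks.mean(axis=(1, 3))      # DC component per block
    variances = blocks.var(axis=(1, 3))   # AC component (video activity) per block
    return means, variances

def field_averages(field):
    """Video average and activity average of one field (mean over all blocks)."""
    means, variances = block_stats(field)
    return means.mean(), variances.mean()

# Synthetic luminance data standing in for one field of 264 lines x 928 pixels.
rng = np.random.default_rng(0)
field = rng.integers(16, 236, size=(264, 928))
means, variances = block_stats(field)
print(means.shape, means.size)
```

With these assumed dimensions each field produces a 33 by 58 grid of blocks, which matches the block count k=1 to 1914 given in the description.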
Here, if it is assumed that the video activity in the n-th block #n in one field at time t is Vn(t), attention is given to its change over time. On the basis of the time t, the video activities are calculated before that time, time (t−2) and (t−1), and after that time, time (t+1) and (t+2) as Vn(t−2), Vn(t−1), Vn(t+1), and Vn(t+2), respectively. Note that a time interval between (t−2), (t−1), t, (t+1), and (t+2) is less than or equal to 20 msec, and is assumed to be a unit time.
Here, when a first-order differential value is obtained at each time, the result becomes as follows.
dVn(t−1)/dt=Vn(t−1)−Vn(t−2) (5)
dVn(t)/dt=Vn(t)−Vn(t−1) (6)
dVn(t+1)/dt=Vn(t+1)−Vn(t) (7)
dVn(t+2)/dt=Vn(t+2)−Vn(t+1) (8)
Further, when a second-order differential value is obtained at each time, the result becomes as follows.
d²Vn(t)/dt²=dVn(t)/dt−dVn(t−1)/dt (9)
d²Vn(t+1)/dt²=dVn(t+1)/dt−dVn(t)/dt (10)
d²Vn(t+2)/dt²=dVn(t+2)/dt−dVn(t+1)/dt (11)
Here, (d²Vn(t)/dt²)/Vn(t−1) is defined as the acceleration AC of the content at time t, and it can take a positive or negative value. The acceleration AC is input from the extraction unit 12 to the comparison and determination unit 13.
Specifically, the comparison and determination unit 13 compares three accelerations AC that are consecutive along the time axis. First, in
Next, at time (t+1), the acceleration AC returns to a positive value and is higher than the threshold value Th1. Accordingly, across times (t−1), t, and (t+1), the acceleration AC exceeds the threshold values and is arranged in order of positive, negative, and positive. When the acceleration AC changes this greatly, it is possible to determine that an image disorder has occurred in a block in the area #n at time t. In the same manner, if the acceleration AC exceeds the threshold values and is arranged in order of negative, positive, and negative, it is possible to determine that an image disorder has occurred.
Further, the direction of the acceleration AC has returned to a negative value again at time (t+2), but is not lower than the threshold value Th2. Accordingly, between time t, (t+1), and (t+2), the acceleration AC is arranged in order of negative, positive, and negative along the time axis, but is not greater than the threshold value. Accordingly, the image of the content is always within a normal range, and a determination is made that an image disorder has not occurred at time (t+1). In this regard, it is possible to change the values of the threshold values Th1 and Th2 to any values by the input from the device control unit 14. The above calculation and comparison are performed for all the small blocks.
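The acceleration test can be sketched as below. This is a hedged illustration under stated assumptions: the function names are hypothetical, Th1 is taken to be the positive threshold and Th2 the negative threshold (as the comparison text suggests), and an acceleration is set to zero when Vn(t−1) is zero, a case the text does not cover.

```python
def accelerations(v):
    """AC(t) = ((V(t) - V(t-1)) - (V(t-1) - V(t-2))) / V(t-1) for t = 2 .. len(v)-1."""
    out = []
    for t in range(2, len(v)):
        second_diff = (v[t] - v[t - 1]) - (v[t - 1] - v[t - 2])
        # Assumption: treat a zero denominator as "no acceleration".
        out.append(second_diff / v[t - 1] if v[t - 1] != 0 else 0.0)
    return out

def disorder_times(v, th1, th2):
    """Indices t into v where three consecutive accelerations alternate in sign
    and each one exceeds its threshold (th1 > 0 for positive, th2 < 0 for negative)."""
    ac = accelerations(v)
    hits = []
    for i in range(1, len(ac) - 1):
        a, b, c = ac[i - 1], ac[i], ac[i + 1]
        pos_neg_pos = a > th1 and b < th2 and c > th1
        neg_pos_neg = a < th2 and b > th1 and c < th2
        if pos_neg_pos or neg_pos_neg:
            hits.append(i + 2)  # ac[i] belongs to time index i + 2 of v
    return hits

# Activity of one block over seven sampling times; the image vanishes at index 3.
activity = [100, 100, 100, 5, 100, 100, 100]
print(disorder_times(activity, th1=0.5, th2=-0.5))
```

Because the second-order difference in expressions (9) to (11) is a backward difference, the reported time is that of the middle acceleration, which trails the actual dip by one sample in this example.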
If the comparison and determination unit 13 determines that an image disorder has occurred, the comparison and determination unit 13 inputs, to the alarm output unit 15, information indicating in which small block and in which field the image disorder has occurred. On the basis of the input information, the alarm output unit 15 displays an alarm on the monitor (not illustrated in the figure) on which the image and sound to be inspected are displayed. At this time, it is preferable to display the alarm superimposed on the image displayed on the monitor, for example. It is then possible to make the edges of the field in which the image disorder has been detected shine in red.
(Detection of Image Block Noise)
“Image block noise” means a phenomenon in which an image of content is converted into another image in a block state. Here, a description will be given by taking an HDTV image and sound signal as an example. As illustrated in
At this time, when 64 pixel values in an inspection block are represented by Y(0, 0) . . . , and Y(7, 7), and the Fourier transform coefficients are represented by F(u, v)=F(0, 0) . . . , and F(7, 7), a relationship of an expression (12) holds. By this Fourier transform, a high-frequency component is extracted.
As a result of the Fourier transform performed by the extraction unit 12, if the Fourier transform coefficients satisfy any one of the following conditions 1 to 4, the comparison and determination unit 13 determines that the inspection block DB exists at any one of the four corners of the block noise BN illustrated in
[1] If the condition 1 holds, this indicates that the pixels Y(6, 6), Y(7, 6), Y(6, 7), and Y(7, 7) of the inspection block DB are located in the block noise, and the other pixels are located outside the block noise. Accordingly, this means that the inspection block DB(1) illustrated in
[2] If the condition 2 holds, this indicates that the pixels Y(0, 6), Y(1, 6), Y(0, 7), and Y(1, 7) of the inspection block DB are located in the block noise, and the other pixels are located outside the block noise. Accordingly, this means that the inspection block DB(2) illustrated in
[3] If the condition 3 holds, this indicates that the pixels Y(6, 0), Y(7, 0), Y(6, 1), and Y(7, 1) of the inspection block DB are located in the block noise, and the other pixels are located outside the block noise. Accordingly, this means that the inspection block DB(3) illustrated in
[4] If the condition 4 holds, this indicates that the pixels Y(0, 0), Y(1, 0), Y(0, 1), and Y(1, 1) of the inspection block DB are located in the block noise, and the other pixels are located outside the block noise. Accordingly, this means that the inspection block DB(4) illustrated in
Accordingly, as illustrated by an arrow in
Condition 1: |W30−W33|/8≧Th3 and |W03−W33|/8≧Th3 and P1/P2≧(Th4)², provided that P1=(1/3){(W33)²+(W30)²+(W03)²} (the P1/P2 condition holds unconditionally when P2=0)
Condition 2: … and P1/P2≧(Th4)² (holds unconditionally when P2=0)
Condition 3: |W30+W33|/8≧Th3 and |W03−W33|/8≧Th3 and P1/P2≧(Th4)² (holds unconditionally when P2=0)
Condition 4: |W30+W33|/8≧Th3 and |W03+W33|/8≧Th3 and P1/P2≧(Th4)² (holds unconditionally when P2=0)
Note that Wuv is the magnitude of F(u, v), that is, the square root of the sum of squares, √(A²+B²), of its real part A and its imaginary part B.
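The magnitudes used in the conditions can be sketched as follows. This is only an illustration under assumptions: expression (12) is not reproduced in the text, so NumPy's unnormalized 2-D FFT stands in for it, the array is indexed as W[u, v], and the function names are hypothetical. The P1/P2 ratio test is left out because the definition of P2 does not appear in the surviving text; only the magnitude-difference part of Condition 1 is shown.

```python
import numpy as np

def corner_terms(block):
    """Return W30, W03, W33 and P1 for one 8x8 inspection block."""
    block = np.asarray(block, dtype=np.float64)
    if block.shape != (8, 8):
        raise ValueError("inspection block must be 8x8")
    # Assumption: unnormalized 2-D DFT; W[u, v] = sqrt(A^2 + B^2) of F(u, v).
    W = np.abs(np.fft.fft2(block))
    w30, w03, w33 = W[3, 0], W[0, 3], W[3, 3]
    p1 = (w33 ** 2 + w30 ** 2 + w03 ** 2) / 3.0
    return w30, w03, w33, p1

def condition1_differences(block, th3):
    """Magnitude-difference part of Condition 1 (the P1/P2 test is omitted)."""
    w30, w03, w33, _ = corner_terms(block)
    return bool(abs(w30 - w33) / 8 >= th3 and abs(w03 - w33) / 8 >= th3)

# A block whose bottom-right 2x2 pixels differ sharply from the rest.
corner_block = np.zeros((8, 8))
corner_block[6:, 6:] = 255
flat_block = np.full((8, 8), 100.0)
print(condition1_differences(corner_block, th3=10.0))
print(condition1_differences(flat_block, th3=10.0))
```

A uniform block has no energy outside F(0, 0), so all three magnitudes vanish and the difference test fails, whereas a block that straddles a corner concentrates energy in the compared coefficients.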
Incidentally, with only the above-described conditions, a window of a building as content, characters inserted into an image, or the like might be detected as block noise. Thus, it is necessary to distinguish block noise from a window and characters. This is performed by the comparison and determination unit 13 as follows.
To give a more specific description, as illustrated in
First, the total number of corners Nc in the inspection target area is equal to the total number of pixels where a corner has occurred, and is also equal to the total number of lines on which a corner has occurred, and thus is expressed by an expression (13). Further, it is assumed that the square (Dh)² of the standard deviation Dh of the corners that have occurred in the horizontal direction in the inspection target area is expressed by an expression (14), and the square (Dv)² of the standard deviation Dv of the corners that have occurred in the vertical direction is expressed by an expression (15).
Here, if the standard deviation of the corners is small, there is a strong tendency for the corners to be on the same vertical line or on the same horizontal line. Accordingly, when α=Nc×Dh×Dv is obtained in the inspection target area, if the value of α is relatively small, it is possible to estimate that there are many corners due to the content. Thus, if the comparison and determination unit 13 determines that a corner has occurred in the inspection target area, the comparison and determination unit 13 determines whether α is equal to or higher than a threshold value Th5. If α≧Th5, the comparison and determination unit 13 determines that block noise has occurred in the inspection target area. In this regard, it is possible to freely change the values of the threshold values Th3 to Th5 by the input from the device control unit 14.
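The discrimination between content corners and block-noise corners can be sketched as below. Expressions (13) to (15) are not reproduced in the text, so this sketch assumes that Dh and Dv are the population standard deviations of the corner positions in the horizontal and vertical directions; the function names and the sample coordinates are hypothetical.

```python
import math

def corner_dispersion(corners):
    """alpha = Nc * Dh * Dv for a list of (pixel, line) corner positions."""
    nc = len(corners)
    if nc == 0:
        return 0.0
    mean_h = sum(h for h, _ in corners) / nc
    mean_v = sum(v for _, v in corners) / nc
    # Assumed form of (14) and (15): population standard deviation per axis.
    dh = math.sqrt(sum((h - mean_h) ** 2 for h, _ in corners) / nc)
    dv = math.sqrt(sum((v - mean_v) ** 2 for _, v in corners) / nc)
    return nc * dh * dv

def is_block_noise(corners, th5):
    """Block noise is reported when alpha is at least the threshold Th5."""
    return corner_dispersion(corners) >= th5

# Corners of window frames tend to line up on one vertical line.
aligned = [(100, 10), (100, 50), (100, 90)]
# Corners of block noise tend to be scattered over the area.
scattered = [(10, 10), (200, 40), (90, 300), (400, 120)]
print(corner_dispersion(aligned), is_block_noise(scattered, th5=1000.0))
```

Corners that share a vertical or horizontal line drive one of the two deviations to zero, which is why a small α points to content such as windows or inserted characters.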
If the comparison and determination unit 13 determines that image block noise has occurred, the comparison and determination unit 13 inputs information, including position information indicating a corner or the like, into the alarm output unit 15. On the basis of the input information, the alarm output unit 15 displays an alarm on the monitor (not illustrated in the figure) on which the image and sound to be inspected are displayed. At this time, it is desirable to display the positions of the corners of the block noise superimposed on the image displayed on the monitor.
(Detection of Sound Error)
One of the sound errors detected by the present embodiment is a so-called “puff” sound that instantaneously occurs and disappears. The digital sound is input on four channels, for example, and thus an error is detected for each of the channels.
First, the extraction unit 12 divides the digital sound by 1 msec along the time axis as illustrated in
(f0 direct current, and f1 to f23 alternating current)
(Detection of Puff Sound)
The comparison and determination unit 13 calculates the sum of squares of the real part and the imaginary part from the high-frequency component fj(t) at time t so as to obtain power. Accordingly, the power is calculated for all the samples, and this is assumed to be Pn(t) (Note that n=1 to 23).
It is understood that the power of a puff sound is uniform among the sample data. Assuming that the total power of the sample data m1 to m2 at time t is P(t), P(t) is expressed by an expression (17).
The comparison and determination unit 13 determines that a puff sound has occurred when the following expressions (18) to (20) are satisfied. The condition of the expression (18) indicates that the sound signal is not zero, the expression (19) indicates that there is a relatively large change before and after a puff sound, and the expression (20) indicates that the power is relatively constant in the sampling time. In this regard, it is possible to change the values of the threshold values Th6 to Th8, T, m1, m2, n1, and n2 in any way by the input from the device control unit 14.
P(t)≧Th6 (18)
P(t)/P(t−T)≧Th7 and P(t)/P(t+T)≧Th7 (19)
Pn(t)/P(t)≧Th8 (Note that n is the sample data of any serial number n1 to n2 among the sample data #1 to #23) (20)
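Expressions (18) to (20) can be sketched as follows for one channel. This is an illustration under assumptions rather than the apparatus's implementation: a 48 kHz sampling rate is assumed so that a 1 msec window holds 48 samples and yields the components f0 to f23, expression (17) is assumed to be the sum of Pn(t) over n=m1 to m2, and the function names are hypothetical. Ratios are tested by cross-multiplication so that a silent neighboring window does not cause a division by zero.

```python
import numpy as np

def band_powers(samples, rate=48000, window_ms=1.0):
    """Power Pn(t) for every window t and component n (rows: t, columns: f0..f23)."""
    n = int(rate * window_ms / 1000)                  # 48 samples per window (assumed rate)
    count = len(samples) // n
    frames = np.asarray(samples[:count * n], dtype=np.float64).reshape(count, n)
    spectrum = np.fft.rfft(frames, axis=1)            # components f0 .. f24
    power = spectrum.real ** 2 + spectrum.imag ** 2   # sum of squares of real and imaginary parts
    return power[:, :n // 2]                          # keep f0 .. f23

def is_puff(power, t, T, th6, th7, th8, m1=1, m2=23, n1=1, n2=23):
    """Check expressions (18) to (20) at window index t."""
    total = power[:, m1:m2 + 1].sum(axis=1)           # P(t), assumed form of (17)
    if t - T < 0 or t + T >= len(total):
        return False
    p = total[t]
    if p <= 0 or p < th6:                             # (18) signal is not zero
        return False
    if p < th7 * total[t - T] or p < th7 * total[t + T]:   # (19) sharp rise and fall
        return False
    return bool(np.all(power[t, n1:n2 + 1] >= th8 * p))    # (20) power spread evenly

# Ten windows of silence with a single-sample click inside window 5.
samples = np.zeros(48 * 10)
samples[48 * 5 + 10] = 1.0
power = band_powers(samples)
print(is_puff(power, t=5, T=1, th6=1.0, th7=10.0, th8=0.03))
```

A single-sample click has a flat spectrum, so every Pn(t) carries about 1/23 of P(t) in this example, which is the uniformity that expression (20) looks for.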
(Detection of Sound Skipping)
Pn(t−T5)≧Th9 (21)
Pn(t), Pn(t+1), . . . , Pn(t+T)≦Th10 (22)
Pn(t+T+T5)≧Th9 (23)
(Detection of Noise Insertion)
Pn(t−T5)≦Th11 (24)
Pn(t), Pn(t+1), . . . , Pn(t+T)≧Th12 (25)
Pn(t+T+T5)≦Th11 (26)
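The sound skipping test and the noise insertion test share one shape: a value before the interval, a string of values inside it, and a value after it. The sketch below is a hedged illustration for a single component n; the function names are hypothetical, and the third value is taken at t+T+T5, following the summary of the method given earlier in this description.

```python
def _triple(p, t, T, T5):
    """Return (first, middle string, third), or None if the indices fall outside p."""
    if t - T5 < 0 or t + T + T5 >= len(p):
        return None
    return p[t - T5], p[t:t + T + 1], p[t + T + T5]

def is_sound_skip(p, t, T, T5, th9, th10):
    """Loud before and after, quiet in between."""
    triple = _triple(p, t, T, T5)
    if triple is None:
        return False
    first, middle, third = triple
    return first >= th9 and third >= th9 and all(x <= th10 for x in middle)

def is_noise_insertion(p, t, T, T5, th11, th12):
    """Quiet before and after, loud in between."""
    triple = _triple(p, t, T, T5)
    if triple is None:
        return False
    first, middle, third = triple
    return first <= th11 and third <= th11 and all(x >= th12 for x in middle)

# Power of one component over nine windows.
dropout = [5, 5, 5, 0, 0, 0, 5, 5, 5]
burst = [0, 0, 0, 5, 5, 5, 0, 0, 0]
print(is_sound_skip(dropout, t=3, T=2, T5=1, th9=4, th10=1))
print(is_noise_insertion(burst, t=3, T=2, T5=1, th11=1, th12=4))
```

Keeping the two tests as mirror images of each other makes it easy to adjust Th9 to Th12, T, and T5 independently, as the device control unit 14 is described as doing for the other thresholds.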
If the comparison and determination unit 13 determines that a sound error has occurred, the comparison and determination unit 13 inputs an audio alarm signal to the alarm output unit 15. The alarm output unit 15 displays an alarm on the monitor (not illustrated in the figure) on which an image and sound to be inspected is displayed.
With the present invention, it is possible to detect an image error and a sound error with high precision without relying on an examiner whose inspection precision is dependent on the examiner's physical condition and individual difference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2013/078660 | 10/23/2013 | WO | 00 |