The present invention relates to an apparatus that facilitates acquisition of a light map sensitive to the shape of a specular surface of an object, viewed by a camera, and surrounded by an artificial scene designed to facilitate the process of producing the light map.
As is known in the field of deflectometry, a light map is useful in digital recovery of the three-dimensional shape of an object's specular surfaces [Balzer et al. 2014]. A light map has also been referred to as a Simple Geometric Mapping Function in [Rapp et al. 2012].
An existing approach to producing a light map is diagrammatically illustrated in
For the computing device 18 to produce a light map, it may consider the data relating to each camera pixel, which data (called the observed signal) describes what that camera pixel observed over the frame sequence, and therefrom attempt to determine to which scene point that camera pixel is responsive.
To facilitate the aforementioned process, a technique called temporal encoding may be employed. To employ temporal encoding, the displayed images are designed such that each point on the display is identifiable (to some required degree of accuracy) from the light signal emanating from that point on the display over the succession of displayed images which are separated in time. This means that any two points on the display that are (for practical purposes) physically separate from one another emanate light signals (over the succession of displayed images) that are (for the same practical purposes) distinguishable from one another.
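By way of a non-authoritative illustration (a minimal Python sketch with hypothetical parameters; practical systems commonly use Gray codes or phase-shifted sinusoids rather than plain binary bit planes), temporal encoding can be pictured as follows:

```python
import numpy as np

# Sketch: temporal encoding of a 1-D strip of display points using binary
# bit-plane images. Over n_frames displayed images, each display point x
# emits the bits of its own index, so any two distinct points emit
# distinguishable light signals over time. (Hypothetical illustration only.)

n_points = 256                      # display points along one dimension
n_frames = 8                        # ceil(log2(n_points)) frames suffice
x = np.arange(n_points)

# frames[k, x] is the intensity (0 or 1) of point x in displayed image k
frames = np.array([(x >> k) & 1 for k in range(n_frames)])

# The observed signal of a camera pixel responsive to point x0 is the
# column frames[:, x0]; decoding recovers x0 from the observed bits.
x0 = 137
observed = frames[:, x0]
decoded = int(sum(bit << k for k, bit in enumerate(observed)))
assert decoded == x0
```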
The computing device 18 attempts temporal decoding, i.e. recognizing the scene point corresponding to a camera pixel's observed signal. By repeating this for each camera pixel, the computing device can construct the light map. For some camera pixels this process may fail (and indeed should fail when the camera pixel is not responsive to any scene point on the display's active surface) in which case that camera pixel remains unmapped.
One technique in the field of deflectometry that employs temporal encoding has been referred to as “phase-measuring deflectometry with multi-frequency temporal phase unwrapping” [Huang et al. 2018].
Some other techniques in the field of deflectometry employ partial temporal encoding (meaning that ambiguity remains after temporal decoding) in conjunction with assumptions about spatial constraints and local smoothness of the object's surface (to resolve said ambiguity) to arrive at the same result. One such technique has been referred to as “phase measuring deflectometry with marker-assisted spatial phase unwrapping” [Huang et al. 2018].
The arrangement shown in
For some objects even an infinitely large flat display would provide insufficient coverage (of camera pixels mapped to display points in the light map), as a flat display cannot subtend more than 180° at the object, and an arcuate or cylindrical display may be called for. Substantially arcuate or cylindrical displays are not as readily available as flat displays, and are liable to significantly impact the cost of producing such a system.
While an arcuate or cylindrical display can potentially subtend an angle greater than 180° (at the object) along its curved dimension, it cannot accomplish this along the transverse (uncurved) dimension. A display that is curved along two perpendicular dimensions would be even less readily available and more difficult to manufacture from readily available rectangular display modules.
One approach to addressing the aforementioned problems is to use multiple displays, arranged around the object under inspection [Balzer et al. 2014]. However, this approach introduces seams/artefacts into the acquired light map at the edges and corners where the flat displays/screens meet one another. Moreover the approach remains mechanically cumbersome, requiring either multiple large displays held in a specific relationship to one another, or careful positioning and calibration in the case of projectors and screens.
Another problem encountered in the aforementioned approaches is that the cameras observing the object under inspection may themselves occlude parts of the scene from parts of that object. In any given camera's perspective, the parts of the object's specular surface covered in reflected images of the cameras (including the viewing camera itself) will remain unmapped in the light map eventually produced (and therefore that part of the object's specular surface cannot be digitally recovered or analyzed by means of that light map).
Another limitation of all the aforementioned approaches is that displays and projectors emitting light of invisible wavelengths are not as readily available as those emitting visible light [Höfer et al. 2016, Huang et al. 2018], since commercially available displays and projectors are normally intended for human viewing. For example, surfaces such as those of unpainted steel may be insufficiently specular at visible wavelengths of light yet sufficiently specular at longer wavelengths such as infra-red. Infra-red cameras are readily available, and it may be desirable to use an infrared-emitting scene surface to produce a light map for such surfaces.
To use infrared light for scene point encoding, one approach is to apply or print a static sinusoidal pattern (varying along one dimension) on a thermally emitting metallic sheet (potentially heated), and then to physically shift and rotate the sheet [Höfer et al. 2016], so as to emulate the display in phase measuring deflectometry. Accomplishing this movement, involving two degrees of freedom, may be complex and costly, and the encoded scene surface is presumably confined to a plane.
An object of the present invention is to address to a significant extent the aforementioned problems and limitations by encoding scene points with an apparatus that involves an illuminant physically movable with one degree of freedom.
In the following description use is made of a number of technical terms which have specific meanings. To eliminate ambiguity as to the meanings of these terms the definitions in Annex A should be applied.
The definitions in some instances may extend to inventive features and, in that respect, are to be interpreted and applied as potentially embodying aspects of the disclosure of the inventive concept and its manner of application.
According to the present invention there is provided an apparatus to facilitate generation of a light map sensitive to a specular surface of an object under inspection, which apparatus includes:
The screen may be shaped as a self-sliding surface.
The screen may be movable relative to the object with primarily one degree of freedom. The camera may be held in a fixed spatial relationship to the object.
The rendering of the graphic on the screen may be such that the rendered graphic illuminates the object from all positions traversed by the sliding screen.
The sliding motion may be controlled, time-predictable, or otherwise measured, or alternatively a control mechanism may be used, to associate each captured frame with the slide parameter at the time that the frame was captured.
The slide parameter may be referred to as the “scan state” as it identifies the position of the screen within its set of positions traversable by sliding motion.
The camera and the object are stationary or held in fixed relation to one another. During a procedure hereinafter referred to as “a scan procedure”, the screen undergoes sliding motion from an initial position corresponding to scan state s0 to a final position corresponding to scan state s1, while the camera captures frames of the object as illuminated by the latitude-encoding graphic carried by the moving screen.
The screen may be embodied in any suitable way and the graphic may be rendered on the screen in any suitable way. These are not here considered significant problems to solve.
The apparatus may include a synchronising mechanism for associating each frame captured by the camera with a scan state (i.e. screen position) corresponding to the frame's time of capture.
If the motion of the screen is controlled in such a way that the scan state is a predictable function of time, then the synchronizing mechanism may simply associate each camera frame with the time-predicted scan state corresponding to the camera frame's time of capture. In an alternative embodiment, the synchronizing mechanism may employ a distance sensor or position encoder, the measurements of which may be used to estimate the scan state at the time of each captured camera frame.
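A minimal sketch of the time-predicted variant, assuming constant screen speed (all names and values below are hypothetical):

```python
# Sketch: associating each captured frame with a scan state when the screen
# moves at a known constant speed.

def scan_state(frame_time, start_time=0.0, s0=0.0, speed=0.05):
    """Time-predicted scan state (slide parameter, e.g. in metres) at the
    moment a frame was captured, for motion at constant `speed` (m/s)."""
    return s0 + speed * (frame_time - start_time)

frame_times = [0.0, 0.05, 0.10, 0.15]          # capture timestamps (s)
states = [scan_state(t) for t in frame_times]  # [0.0, 0.0025, 0.005, 0.0075]
```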
The data relating to the captured camera frames are input to the computing device, together with the scan state associated with each camera frame (or sufficient information to determine it).
If the frames were captured during sliding motion with sufficient frequency (with respect to scan state) then the computing device is in a position, for each camera pixel, to reconstruct an approximation of the signal describing light incident on said pixel as a function of scan state.
That region of the screen's (stationary and notional) reference surface that is traversed both by the screen's leading edge and by the screen's trailing edge during the scan procedure is referred to hereinafter as the “encoded scene surface”.
Due to the way in which the latitude-encoding graphic is embedded on the screen in relation to its sliding motion, any given scene point (fixed in the encoded scene surface) is visited by all points of the corresponding latitude on the screen (and therefore by an entire latitude codeword of the graphic) during the course of the scan procedure. The light emanating from any point in the encoded scene surface, as a function of the slide parameter (i.e. scan state) can be described by the scene point's latitude codeword shifted by the scene point's longitude.
More formally: if the screen is described parametrically by ((u,v)∈R) ↦ (P(u+s,v)∈ℝ³), where u is longitude on the screen, v is latitude on the screen, s is the scan state, and P(u+s,v)∈ℝ³ is the physical location of the point with screen longitude u and screen latitude v, and if the latitude-encoding graphic is (u,v) ↦ G(u,v), then the light emanating from any moving point P(u+s,v) in the encoded scene surface is described by G(u,v), and therefore the light emanating from any stationary point P(U,V) in the encoded scene surface is described by G(U−s,V) = GV(s−U).
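Purely as a numerical illustration of this identity (a Python sketch; the triangle-like graphic, the scene point, and the discretisation are all assumptions), the signal emanating from a stationary scene point (U, V) is the latitude-V codeword delayed by the point's longitude U:

```python
import numpy as np

# Hypothetical graphic: white (1.0) inside a triangle whose longitudinal
# width max(1 - v, 0) shrinks with latitude v; black (0.0) elsewhere.
def G(u, v):
    return 1.0 if -max(1.0 - v, 0.0) <= u <= 0.0 else 0.0

def codeword(v):
    """Latitude codeword G_v: u -> G(-u, v)."""
    return lambda u: G(-u, v)

U, V = 2.0, 0.25                      # a stationary scene point
s = np.linspace(0.0, 5.0, 501)        # sampled scan states

emitted = np.array([G(U - si, V) for si in s])   # light from point (U, V)
Gv = codeword(V)
delayed = np.array([Gv(si - U) for si in s])     # G_V(s - U)

assert np.allclose(emitted, delayed)  # G(U - s, V) = G_V(s - U)
```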
To facilitate prospective identification of the latitudes of unknown but fixed scene points from optical signals describing light observed to be emanating from those scene points (as a function of scan state), the latitude-encoding graphic may be selected (by design) such that each respective latitude corresponds to a unique readable codeword, or such that readable codewords at consequentially different latitudes (for the practical purposes of the invention) are recognizably different from one another (for the same practical purposes).
Moreover, since the optical signal emanating from any given scene point is in general approximated by a latitude codeword shifted by that scene point's longitude, the latitude-encoding graphic may be selected (by design) such that readable codewords at consequentially different latitudes are separated by a sufficient shift-invariant distance to robustly discriminate between them in the presence of noise or distortion.
Each camera pixel that is responsive to a (typically small) neighbourhood of scene points in the encoded scene surface, observes light (varying over scan state) that is an aggregation of the light emanating from the scene points in that neighbourhood, combined with other light such as diffuse reflection on the object's surface.
Diffusely reflected light observed by any given camera pixel at some point on the object's surface originates from a diffuse variety of scene points, and therefore changes gradually, if at all, during the motion of the screen, and can in principle be suppressed by a high-pass filter known in the art of signal processing.
The latitude-encoding graphic may be selected (by design) such that its latitude codewords comprise relatively high-frequency components, so as to facilitate discrimination of specular reflection (observed by a camera pixel and originating from a particular scene point's neighbourhood) from diffuse reflection consisting predominantly of low-frequency components in the combined optical signal (describing observed light as a function of scan state).
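As a hedged sketch of this kind of signal conditioning (a simple moving-average high-pass; the window length and test signals are assumptions, and any standard high-pass design known in the art could serve instead):

```python
import numpy as np

# Sketch: suppressing the slowly-varying diffuse-reflection component of a
# pixel's observed signal with a moving-average high-pass filter.

def highpass(signal, window=51):
    """Subtract a running mean (low-frequency content) from the signal."""
    kernel = np.ones(window) / window
    lowpass = np.convolve(signal, kernel, mode="same")
    return signal - lowpass

s = np.linspace(0, 1, 500)                 # scan states
diffuse = 0.4 + 0.2 * s                    # slow drift from diffuse light
specular = (np.abs(s - 0.6) < 0.02) * 1.0  # brief pulse from a scene point
conditioned = highpass(diffuse + specular) # pulse survives, drift removed
```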
Due to spatial variation in curvature of the specular surface of the object under inspection, as well as due to other factors that vary spatially, some camera pixels are (individually) responsive to substantially larger regions of the encoded scene surface than approximated by the reading resolution.
The latitude-encoding graphic may be selected (by design) such that the pointwise average of neighbouring readable codewords is similar to the readable codeword corresponding to the average of those neighbouring readable codewords' latitude. More particularly, the latitudinal code nonlinearity of the latitude-encoding graphic may be bounded above by a predetermined maximum value suitable for the expected degree of maximum curvature in the specular surface of the object under inspection.
The camera may be situated inside the encoded scene surface (meaning on the same side of the encoded scene surface as the object under inspection) or outside (meaning on the other side of) the encoded scene surface, in which case it cannot occlude parts of the encoded scene surface from the object, although the object may be occluded from the camera by the moving screen at some point during its motion.
To facilitate use of a screen that is substantially narrower than it is long, the latitude-encoding graphic may be selected (by design) to have a latitudinal code packing coefficient that is substantially greater than 1.0 (which corresponds to the latitude-encoding graphic used in
The optical signal (varying over scan state) observed by any given camera pixel (during the scan procedure) may have a component arising from diffuse reflection, which component may be suppressed by a signal-conditioning process such as high-pass filtering.
The optical signal observed by any given camera pixel responsive to some encoded scene point may be decoded by means of a lookup table populated by (Gv, v) entries separated by a sufficiently short latitude interval. The latitude decoding may in principle then proceed by selecting the latitude v of the entry whose codeword Gv is most similar (in some appropriate sense, such as having the lowest shift-invariant distance) to the observed signal (a latitude codeword, or a shifted version of one). If the observed signal is conditioned (such as by high-pass filtering) then the Gv entries populating the lookup table may be similarly pre-conditioned by the same process.
In principle the longitude of the scene point from which an observed optical signal emanated can be determined from the (potentially conditioned) observed signal and its decoded latitude's (potentially pre-conditioned) codeword as being equal to the relative shift that minimizes the optical distance between them, or between similarly conditioned versions of them.
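As a hedged illustration of the two decoding steps just described (a Python sketch; all names are hypothetical, a brute-force integer-shift search stands in for whatever optimised matching a real implementation would use, and the circular shift is only acceptable for zero-padded signals):

```python
import numpy as np

# Sketch: decode the latitude and longitude of a scene point from a pixel's
# observed signal (sampled over scan state) using a lookup table of
# (codeword, latitude) entries compared by shift-invariant distance.

def shift_invariant_distance(f, g):
    """Return (distance, shift): the L1 distance between f and g minimized
    over relative integer shifts of g, and the minimizing shift."""
    best = (np.inf, 0)
    for shift in range(-len(f) + 1, len(f)):
        dist = np.abs(f - np.roll(g, shift)).sum()  # np.roll wraps around
        best = min(best, (dist, shift))
    return best

def decode(observed, lookup):
    """lookup is a list of (codeword, latitude) pairs, each codeword being
    pre-conditioned in the same way as the observed signal."""
    scored = [(shift_invariant_distance(observed, Gv), v) for Gv, v in lookup]
    (_, shift), latitude = min(scored)
    return latitude, shift   # the minimising shift gives the longitude
```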
To facilitate prospective identification of longitudes of scene points from which observed signals emanated, the latitude-encoding graphic may be selected (by design) such that readable codewords each have sufficiently concentrated auto-correlation functions, which may be attained if the codewords have sufficient spatial bandwidth (along longitude), for example by including at least one high-contrast feature, such as a longitudinal black-white edge (or an edge between other contrasting colours or symbols) extending across latitudes.
If the latitude-encoding graphic is such that the latitude and longitude of a scene point can be identified from the light emanated by that scene point (or the signal describing said light over scan state) then the scene points within the encoded scene surface have been temporally encoded, and the frames captured by the camera during the scan procedure (together with their associated scan states) constitute an encoded light map.
For purposes of inspecting objects with surfaces that are more specular to infrared light than to visible light, the screen may be embodied, for example, by a metallic sheet shaped appropriately and potentially heated, with the graphic applied to it by means of a thermally-emissive substance, e.g. black paint. This is also known as an emissivity-coded screen in [Höfer et al. 2016].
The metallic sheet may be cut away in regions, or may be otherwise treated to contrast with a background.
Additional cameras may be employed in different positions (all stationary relative to the object) to generate additional light maps (for example to measure different regions of the object's specular surface, or for purposes of enriching the acquired data). These cameras can be considered to instantiate multiple instances of the present invention sharing all components except the cameras.
To extend the encoded scene surface to a greater range of latitudes at the cost of introducing ambiguity in the prospective latitude decoding process, multiple latitude-encoding graphics (which may employ some or all of the same latitude codewords) may be rendered on the screen to occupy different (preferably disjoint) intervals of latitude. An introduced ambiguity may be resolved by means outside of temporal encoding/decoding, which may involve techniques of spatial phase-unwrapping known in the art.
The latitude-encoding graphic may be carried on any suitably shaped slidable surface. Such surface may be arcuate and be curved sufficiently to ensure that the at least one camera can detect reflections from all points on the object. It may therefore be necessary to make use of a plurality of cameras with each camera being in a fixed position relative to the object.
The graphic may be elongate with a length which is greater, possibly by orders of magnitude, than its width. The movement of the graphic may be transverse to its length i.e. in the direction of its width.
It is preferable to move the graphic with respect to the object and the cameras (which are kept stationary), but this is not essential, for conceivably the graphic could be kept stationary and the object and the cameras, which are maintained in a fixed spatial relationship to one another during the scan procedure, could be moved with respect to the graphic.
The distance by which the screen (and hence the graphic) is movable may be at least five times greater than its width (or median width if applicable). The physical constraints and comparative factors mentioned herein assist in the implementation of a cost-effective apparatus with a satisfactory resolution.
Thus in contrast to the prior art approach described hereinbefore with reference to
In one embodiment any slice on the graphic has a longitudinal coordinate that varies on a line which is parallel to the direction of movement. The value of the longitudinal coordinate may be ascertained by noting the extent of movement of the slice, e.g. the longitudinal coordinate of a point, from which light (captured by a pixel) emanated, may be directly related to the degree of movement of the slice from a reference location which embodies that point. Such movement is in turn synchronised with the capturing of the successive frames by the camera in that each time a frame is captured the position of the graphic is known and hence the longitudinal coordinate of that slice is known.
A latitudinal coordinate of that slice is ascertainable from the data captured by the relevant pixel, for the succession of frames, for this is determined by the design of the graphic.
The detail of the graphic in each slice may be unique so that the slice is distinguishable from any other slice in the graphic. Alternatively the detail of the graphic in each slice in a defined set of consecutive slices may be unique in that set of slices.
The size and shape of the reference surface are determined by the shape and size of the screen, and by the direction and extent of movement thereof.
The reference surface, also referred to as a slidable surface, can take on a variety of shapes as depicted, by way of example only, in
The cylindrical surface in
The slices of the graphic, also referred to herein as segments of the graphic, are preferably contiguous—i.e. adjacent one another over the extent of the graphic.
The invention also provides a method of generating data relating to a specular surface of an object, the method including the steps of:
The frames may be captured at a rate which is dependent on the speed of movement of the graphic. As the data in each pixel is dependent on the position of the graphic relative to the object, a longitudinal coordinate of the scene point in the graphic from which the light in the pixel originated can be determined.
In one embodiment, the graphic in an elongate direction may at least notionally be divided into a plurality of contiguous elements or segments referred to as slices (as per the definition thereof). Within each slice the characteristics of the graphic are related to the position of the slice, in the graphic, and hence related to the latitudinal coordinate of the graphic in the reference surface from which the light in the respective pixel originated.
The graphic may be a latitude-encoding graphic.
It is apparent e.g. from
The invention also extends to an apparatus to generate data relating to a specular surface of an object, the apparatus including:
The graphic may be elongate with a length which is greater at least by an order of magnitude than its width.
Features of the invention, listed hereinbefore, may be combined with one another, according to requirement.
The invention also provides an apparatus to facilitate generation of data relating to the shape of a partially or totally specular surface of a stationary object, the apparatus including:
The camera may be positioned on an opposing side of the notional surface from the object, such that the screen occludes the object from the camera at some point during its motion.
The screen may be longer by an order of magnitude or more than it is wide. The screen may be substantially arcuate along its length.
The screen may be movable by rotating about an axis. The axis may be notional or physical.
The static graphic may be notionally divisible by notional dividing lines into at least twenty notional slices of approximately equal thickness wherein:
The longitudinal movement of the notional dividing lines is thus such that the movement is parallel to the direction in which the notional dividing lines extend.
Preferably each notional slice has a thickness approximately equal to a predetermined speed at which the movable screen is capable of moving divided by a predetermined rate at which the camera is capable of capturing frames.
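For example (with hypothetical values consistent with the figures given later in this specification), a screen moving at 100 mm per second while the camera captures twenty frames per second yields a notional slice thickness of approximately 100/20 = 5 mm.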
Preferably the notional slices are each unique or are distinguishable from one another and are non-repeating for at least twenty consecutive notional slices.
Preferably there are substantially more than twenty slices.
As the number of slices (for a graphic) is increased, the size of each slice is reduced and the resolution of the apparatus is increased.
The screen may be substantially arcuate along a dimension that is substantially perpendicular to its width.
The screen may be curved along two perpendicular dimensions.
The invention is further described by way of examples with reference to the accompanying drawings wherein:
(There is no
Conceptual aspects of the invention are described by way of a simplified example with reference to
A latitude-encoding graphic 20, which in this instance comprises a white triangle on a black background, is carried on a screen (not shown) moving at a constant speed in a direction U, i.e. from left to right. The graphic has a height H and a width W. The graphic's initial displacement 205 is zero. A fixed point 204 is located in the path of the moving latitude-encoding graphic, in a stationary reference surface 209, each point of which can be identified by latitude V and longitude U coordinates.
If regard is had to the intensity of light emanating from a given stationary point 204 which lies in the path of the moving latitude-encoding graphic then, in general, the point turns from black to white and then from white to black as the triangle in the graphic traverses it. The period of time for which the point remains white (between these transitions) depends on the latitude of the point. The greater is the latitude the shorter is the duration of the period of time for which the point remains white. The greater is the longitude, the later are the times at which the point changes colour. If the screen's displacement 206 (
Any given point on the encoded scene surface emanates (as a light signal over the screen's displacement) a rectangular pulse whose width encodes the point's latitude, and whose delay (or shift) encodes the point's longitude, and in this way temporal encoding of all the scene points traversed by the graphic is achieved.
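A minimal numerical sketch of this encoding (in Python, with hypothetical dimensions W, H and sampling): the width of the rectangular pulse yields the point's latitude and its onset the point's longitude:

```python
import numpy as np

W, H = 1.0, 1.0   # triangle width and height (hypothetical units)

def pulse(s, U, V):
    """Intensity at scene point (U, V) for screen displacement s: white
    (1.0) while the triangle, of width W*(1 - V/H) at latitude V, covers
    the point; black (0.0) otherwise."""
    width = W * (1.0 - V / H)
    return 1.0 if U <= s <= U + width else 0.0

s = np.linspace(0.0, 3.0, 3001)           # sampled displacements
U_true, V_true = 0.8, 0.4
signal = np.array([pulse(si, U_true, V_true) for si in s])

ds = s[1] - s[0]
on = np.nonzero(signal)[0]
U_decoded = s[on[0]]                      # pulse onset -> longitude
V_decoded = H * (1.0 - len(on) * ds / W)  # pulse width -> latitude
# U_decoded ~= 0.8, V_decoded ~= 0.4 (up to discretisation error)
```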
A pixel of a camera 90 responsive to a particular scene point 204 in the encoded scene surface (via an optical path 201 which may involve specular reflection on the object 92) observes a light signal that encodes the scene point's latitude V and longitude U, and may be decoded in principle by means of the aforementioned equations.
The scene point encoder 76 in
Some variations of the present invention have already been accommodated within the definitions of terms used to describe the invention. Some of those and further variations of some parts of the present invention are described briefly below by way of example only.
The means of associating each frame captured by the camera (or cameras) with a scan state corresponding to the frame's time of capture can take many different forms, and is not here considered a significant problem to solve. In the case of the screen undergoing controlled motion with a time-predictable position (e.g. a known speed), it may simply be a matter of deducing the scan state (implying screen position) corresponding to each frame from the frame's timestamp. Alternatively, the position of the screen can be tracked by means of a distance sensor and a state estimation algorithm such as Kalman filtering. A third possibility is to signal motion-increments optically into the successive frames, and to deduce the screen position for each frame by extracting this information.
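By way of illustration of the second option, a minimal constant-velocity Kalman filter over noisy position readings might look as follows (a sketch only; all tuning values are hypothetical):

```python
import numpy as np

def track(measurements, dt=0.05, q=1e-4, r=1e-2):
    """Estimate screen positions from noisy distance-sensor readings using
    a 1-D constant-velocity Kalman filter."""
    x = np.array([measurements[0], 0.0])       # state: [position, velocity]
    P = np.eye(2)                              # state covariance
    F = np.array([[1.0, dt], [0.0, 1.0]])      # constant-velocity model
    Hm = np.array([[1.0, 0.0]])                # we observe position only
    Q = q * np.eye(2)
    R = np.array([[r]])
    out = []
    for z in measurements:
        x = F @ x                                          # predict
        P = F @ P @ F.T + Q
        K = P @ Hm.T @ np.linalg.inv(Hm @ P @ Hm.T + R)    # Kalman gain
        x = x + K @ (np.array([z]) - Hm @ x)               # update
        P = (np.eye(2) - K @ Hm) @ P
        out.append(x[0])
    return out
```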
In some embodiments the screen may be planar (as depicted for example in
In alternative embodiments the screen may be an arcuate (cylindrical) surface as depicted for examples in
In alternative embodiments the screen may be spherical in shape as depicted for example in
Further embodiments with different screen shapes are accommodated within the definition of the term self-sliding surface to which the screen may conform if it is a slidable surface.
The latitude-encoding graphic may take the form of the examples depicted in
Further embodiments with different latitude-encoding graphics are accommodated within the definition of the term latitude-encoding graphic.
The latitude-encoding graphic, and the self-sliding surface to which the screen conforms, may be selected independently and combined to yield a variety of different embodiments of the scene point encoder, and therefore of the present invention.
In one embodiment the screen may be the surface of an optical diffuser sheet installed into a structure that holds it fixed in the appropriate shape (of a slidable surface), and onto which the latitude-encoding graphic is printed, and to which back lighting may be applied.
In an alternative embodiment the latitude-encoding graphic may be printed on a passive white sheet, held fixed in an appropriate shape (of a slidable surface) and illuminated by ambient or artificial external illumination.
For purposes of inspecting objects with surfaces that are more specular to infrared light than to visible light, the screen may be embodied, for example, by a metallic sheet shaped appropriately and potentially heated, with the graphic applied to it by means of an emissive substance such as black paint. This is also known as an emissivity-coded screen in [Höfer et al. 2016].
A latitude-encoding graphic can be repeated at successive intervals of latitude. The repetition naturally introduces ambiguity as to the latitudinal coordinate from which any particular signal of light emanated. To resolve these ambiguities, marker-assisted spatial phase-unwrapping algorithms can be used if certain assumptions can be made about the surface of the object under inspection. One way to introduce absolute markers for such algorithms would be to make modifications to particular latitude codes in the latitude-encoding graphic, for example by introducing dots. It should be noted, however, that this technique is in effect a repeated application of the concept, essentially employing multiple graphics concurrently. Each slice of the latitude-encoding graphic, or each slice in a defined set of slices, should be unique and distinguishable from all other slices, or from all other slices in the set (as the case may be). In this way data detected by a pixel can be uniquely associated with its scene point, i.e. the point of origin of the light which conveyed that data to the pixel.
The inventive concept described herein can be used in a variety of applications and, typically, in the field of deflectometry and surface metrology.
A carrier 48 is movable in a controlled manner, in a direction 50, relative to the vehicle 42. In this example the carrier 48 is in the form of an arch which extends from a surface of the platform 44 on one side of the vehicle to a surface of the platform on an opposing side of the vehicle.
The carrier 48 on an inner surface 54 carries a unique image of predetermined characteristics e.g. a latitude-encoding graphic of the kind shown in
The graphic can be backlit using suitable light sources which are carried by the carrier or, if required, use can be made of ambient light. Irrespective of the illumination mechanism adopted, each camera captures a succession of frames of views of the surface of the vehicle as the carrier is moved relative to the vehicle. The time at which each frame is captured by a camera is known, and the position of the carrier, and hence of the graphic, at that time is also known. During the carrier's motion (in the direction 50) any given scene point (e.g. observed by a camera pixel via specular reflection) may be visited by all points of a given latitude codeword in the latitude-encoding graphic. To the extent that the latitude codeword has a sufficient shift-invariant distance to all other latitude codewords in the graphic, it can be identified uniquely to ascertain the latitude of the observed scene point in the graphic.
By way of example only, the speed of the moving graphic depends on the number of frames captured per second and is such that the distance moved by the graphic between successive frame captures of the camera may be between 0.2 mm (for inspecting small objects) and 20 mm (for inspecting larger objects).
The length of the graphic is dependent on the particular application and may vary from 100 mm to 10 meters. The width of the graphic is materially less than the length of the graphic, typically from about 3% to 30% of the length.
The width of a slice is ideally zero. Practically though it can be increased up to approximately the same value as the distance moved by the graphic between successive frame captures i.e. from 0.2 mm to 20 mm.
Typically at least twenty frames per second are captured. However, the greater the rate of frame capture, the better the resolution of the scene point encoder.
These numerical and comparative values, and other numerical and comparative values given in the specification, which have been established through design and test work to yield satisfactory results, relate to inventive features of various embodiments of the invention.
The encoder 76 includes a carrier 78 with a screen 78 which carries a latitude-encoding graphic 80. The screen 78 undergoes sliding motion (in this example it is movable in a linear manner, in a fixed direction, and in a controlled manner) by means of a drive mechanism 84 using energy from a power source 86. At least one stationary camera 90 is in a fixed position relative to a stationary object 92, a surface of which is to be monitored, in the manner described hereinbefore. Movement of the screen 78 is synchronised (94) by means of a computing device 96 with the capturing by the camera 90 of a succession of frames of the object, in such a way that the position of the graphic 80 (in the direction of movement) is ascertainable for each frame, i.e. the longitudinal coordinate is determined. The data in each pixel, for the succession of frames captured by the camera, enables the identity (position) of the slice from which the light captured by that pixel originated to be determined, i.e. the latitudinal coordinate is known. Thus the coordinates of the scene point in the surface are determined.
The plurality of scene points determined in this manner constitutes the light map of the surface of the object 92.
In some of the definitions use is made of a number of terms which have specific meanings in mathematics. The terms continuous, tuple, curve (when used as a noun), vector space, pointwise, support, embedding, one-parameter group, surface of revolution, rigid motion, parametric surface, (generalized) cylindrical surface, continuous alphabet code, region, metric, functional, L1 distance, L1(X, Y) distance, Euclidean norm, Euclidean distance, function [of] and self-evident variations of the aforegoing terms in other grammatical cases should be interpreted in the senses in which they are commonly understood in mathematics and geometry. The term [signal] conditioning should be interpreted in the sense in which it is commonly understood in the field of signal processing.
A camera means an instrument for recording or capturing a sequence of images, called frames, which are subsequently stored or transmitted digitally. The camera sensor typically includes a two-dimensional grid of sensing elements, called camera pixels, each of which transduces light incoming to the camera from a particular direction (or small neighborhood of directions represented by a particular nominal direction), to produce data, called the frame pixel, describing one or more characteristics, such as intensity, spectral components, or some other characteristic, of the light incident on that camera pixel.
A graphic is considered to be carried by a surface if it is rendered on the surface. This implies that the rendered graphic moves with the surface if the surface undergoes motion. More particularly for slidable surfaces, a graphic G: ℝ²→O is considered to be carried by a slidable surface S: ((u,v)∈R) ↦ (P(u+s,v)∈ℝ³), where u is longitude within the slidable surface and v is latitude within the slidable surface, if it is rendered on the slidable surface, which implies that the point S(u,v) = P(u+s,v), whose position is dependent on the slide parameter s, emanates light described by G(u,v).
A display means a controllable illuminant with a surface, divided into a two-dimensional grid of neighborhoods, called display pixels, each of which emanates light that can be modulated independently to display a sequence of images on the surface. A display may be embodied by a computer screen, a passive white screen with a video projector, or by some other means.
Light may be said to emanate from a point (or equivalently: the point can be said to emanate light) if the light diverges from that point in a variety of directions to illuminate an object under inspection. This may be accomplished by radiant emission (such as by points on a backlit display screen), by scattering reflection (such as by points on a passive screen responding to external illumination), or even by simple transmission (such as by a transparent material) from a relatively uniform background, or by other means.
An encoded scene surface means a region of a (potentially notional) reference surface, which region is traversed both by the leading edge and by the trailing edge of a slidable surface carrying a latitude-encoding graphic.
The encoded light map for a given camera observing a given object in a given scene means an association mapping camera pixels to signals that potentially correspond to scene points to which those camera pixels are responsive, which signals may potentially be decoded into the scene points by a predetermined algorithm.
A graphic means a mathematical function G: (R⊂ℝ²)→O of two coordinates, where O is some set of optical quantities, and R⊂ℝ² is a bounded region of ℝ². Equivalently a graphic may be represented by G: ℝ²→O with support on a bounded region R⊂ℝ², outside of which the graphic's value is assigned to optical zero. A graphic G: (R⊂ℝ²)→O is considered to be rendered on a parametrically described surface P: ℝ²→ℝ³ if each point P(u,v) of the surface emanates light as described by the corresponding point in the graphic G(u,v). Therefore, depending on the geometry of the slidable surface and the coordinate system by which it is parametrized, the graphic may appear on the surface in a stretched or distorted form, as is unavoidable if the surface is geometrically not a developable surface.
The inter-fold distance of a latitude-encoding graphic means the minimal shift-invariant distance between any two readable codewords that are spaced apart in latitude by at least [formula], where N is the latitudinal code nonlinearity of the latitude-encoding graphic. More formally, the inter-fold distance is [formula], where G: ℝ²→O is the latitude-encoding graphic and d is the reading resolution.
latitude: see longitude and latitude.
A latitude-encoding graphic means a graphic G: (R⊂ℝ²)→O (intended to be rendered on a slidable surface) such that at each latitude v there appears a spatially-varying (along longitude u) optical signal Gv: u ↦ G(−u,v), called a latitude codeword. Ideally the encoding function v ↦ Gv mapping each latitude to its codeword is a continuous alphabet code and the latitude codewords form a topologically continuous one-dimensional metric space of optical signals. Practically, however, the latitude-encoding graphic may be notionally divided into a substantial but finite number of readable slices, the readable codewords of which are distinguishable from one another for practical purposes.
latitude codeword: see latitude-encoding graphic.
The latitudinal code gradient of a latitude-encoding graphic means the rate at which the readable codeword changes with respect to latitude. More particularly, the latitudinal code gradient at a given latitude v is the shift-invariant distance between the two adjacent readable codewords that are both bordered by the line of latitude v, divided by the reading resolution. More formally, the latitudinal gradient of a latitude-encoding graphic G: ℝ²→O at a given latitude v can be defined as the shift-invariant distance between the readable codewords centered at latitudes v−½d and v+½d, divided by d, where d is the reading resolution.
The latitudinal code packing coefficient of a latitude-encoding graphic G: (R⊂ℝ²)→O means the median (over all latitudes) latitudinal code gradient divided by Omax·W/H, where Omax is the optical maximum of the graphic and W and H are its longitudinal and latitudinal extents respectively. This definition is designed such that the latitudinal code packing coefficient is equal to 1.0 for a latitude-encoding graphic comprising only a triangle (as illustrated in
The latitudinal code nonlinearity of a latitude-encoding graphic at a particular latitude v is the L1(ℝ,O) distance between a readable codeword and the average of its neighbors, divided by the L1(ℝ,O) distance between the neighbors. More formally, the latitudinal code nonlinearity of a latitude-encoding graphic G: ℝ²→O at latitude v is the L1(ℝ,O) distance between G′v and the average ½(G′(v−d) + G′(v+d)) of its neighboring readable codewords, divided by the L1(ℝ,O) distance between G′(v−d) and G′(v+d), where G′v denotes the readable codeword centered at latitude v and d is the reading resolution. If a readable codeword is equal to the average of its neighbors, the latitudinal code nonlinearity is zero at that latitude.

The leading edge of a slidable surface means the curve consisting of, for each latitude occurring in the surface, the point of maximal longitude at that latitude. More formally, the leading edge is {S(u,v) | (u,v)∈R ∧ ∄(u′,v′)∈R [v′=v ∧ u′>u]} if the slidable surface is described parametrically by S: ((u,v)∈R) ↦ (P(u+s,v)∈ℝ³).
The term light map for a given camera observing a given object in a given scene means an association mapping (some) camera pixels to the scene points to which those camera pixels are responsive (potentially via specular reflection on the surface of the object). Each camera pixel is either mapped to a scene point, in which case it is said to be a mapped pixel, or is an unmapped pixel and associated with no scene point. (Since pixels typically have non-zero extent, each camera pixel is typically responsive to a neighborhood of scene points. The neighborhood corresponding to a given mapped pixel is represented in the light map by a single scene point that is near to all scene points in the neighborhood.)
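By way of a hypothetical illustration (array shape and conventions are assumptions, not prescribed by the definition), a light map might be stored as follows:

```python
import numpy as np

# Sketch: per camera pixel, the (longitude, latitude) of the scene point it
# is responsive to, or NaN for unmapped pixels.
height, width = 480, 640                          # camera resolution (assumed)
light_map = np.full((height, width, 2), np.nan)   # [..., 0] = U, [..., 1] = V

light_map[200, 320] = (1.42, 0.37)                # example: one mapped pixel
mapped = ~np.isnan(light_map[..., 0])             # boolean mask of mapped pixels
```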
The noun light means electromagnetic radiation of wavelengths shorter than 1 millimeter. This would include, for example, infrared, visible light, ultraviolet, etc.
A line of latitude means a line or curve, all points of which have the same latitude.
A line of longitude means a line or curve, all points of which have the same longitude.
The terms longitude and latitude, when used in reference to a self-sliding surface (or some region thereof), shall be understood as referring to two coordinates that jointly identify any point of the self-sliding surface, defined such that if the surface undergoes a sliding motion along with every point within it, every point remains at its original latitude (in reference to the original unslid surface) while every point undergoes the same change in longitude (in reference to the original unslid surface), which shall be referred to as the slide parameter (of the sliding motion). (This is akin to the motion and coordinate system of the Earth's surface as the Earth revolves about its own axis.) The slide parameter may represent a linear displacement, an angle of rotation, or a combination thereof. A self-sliding surface with this coordinate system can be described parametrically by (u,v) ↦ (P(u+s,v)∈ℝ³) for some P: ℝ²→ℝ³, where s is the slide parameter, and u and v are longitude and latitude respectively.
When longitude and latitude are used in reference to a graphic, they mean the first and second (respectively) coordinates that jointly identify any point in the graphic. See also the defined meaning of rendering a graphic on a slidable surface.
mapped pixel: see light map.
The optical distance between two optical values O1 and O2 (which may be scalar or vector) means the Euclidean norm of the difference between those two optical values, ∥O1−O2∥₂. The optical distance between two optical signals f: ℝ→O and g: ℝ→O means the integral of the pointwise optical distance between corresponding values in those signals, ∫₋∞^∞ ∥f(u)−g(u)∥₂ du, which is also known mathematically as the L1(ℝ,O) norm of the pointwise difference between the signals. If O is a bounded region of ℝ (e.g. grayscale value varying between black and a predetermined white) then the optical distance between f: ℝ→O and g: ℝ→O can be visualized as the area enclosed between their plots.
The optical maximum Omax of a graphic G: (R⊂ℝ²)→O means max{∥G(u,v)∥₂ | (u,v)∈ℝ²}, which is, of all optical values appearing in the graphic, the optical value that is most optically distant from optical zero. For a graphic of visible light, black typically corresponds to optical zero and white typically corresponds to optical maximum.
An optical signal means a signal whose codomain consists of optical quantities. An optical signal may, for example, describe light varying temporally or spatially.
An optical quantity means a quantitative description of the nature of local light (e.g. emanating from some local point or neighborhood, or entering a camera pixel). An optical quantity describing, for example, grayscale luminous intensity would be a scalar, whereas an optical quantity describing multiple spectral components (such as, for example, the intensities of red, green, and blue light) would be a tuple (vector). In general an optical quantity may describe any aspect of light that can be quantified by a camera pixel in principle.
The term optical zero means an optical quantity serving as a background or default quantity in the absence of modulation. For example, the absence of light (i.e. darkness) would typically be considered optical zero, if the background in the absence of modulation is dark. Alternatively some other substantially uniform background can in principle influence the meaning of optical zero. Optical zero is represented numerically as 0.
A readable codeword of a latitude-encoding graphic means a (spatial) signal over longitude that is the latitude-averaged readable slice of a latitude-encoding graphic. More particularly, for a readable slice of a latitude-encoding graphic G: ℝ²→O between latitudes v−½d and v+½d (where d is the reading resolution), the readable codeword is G′v: u ↦ (1/d)∫ G(−u,w) dw, the integral taken over w from v−½d to v+½d.
A readable slice (of a graphic) means a region (contiguous segment) of a rendered graphic between two lines of latitude separated (wherever they are most distant from each other) by a distance equal to the reading resolution.
The reading resolution of a given camera positioned in relation to an object and a scene surface means a predetermined dimension approximating the minimum size of the region of scene points to which a single camera pixel is responsive, via specular reflection on a flat surface of the object under inspection. More particularly, for a camera with Nh pixels across a width dimension corresponding to the camera's lateral angle of view of αh, and where the camera is separated from the object under inspection by a distance of dc and where the scene surface is at its closest point a distance ds from the object, the reading resolution is d = (dc + ds)·αh/Nh.
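As a hedged numerical illustration, assuming the small-angle relation d ≈ (dc + ds)·αh/Nh implied by this geometry (the figures below are hypothetical): a camera with Nh = 2000 pixels across a lateral angle of view αh = 0.7 rad (about 40°), at dc = 1 m from the object, with the scene surface ds = 0.5 m from the object at its closest point, gives a reading resolution of approximately 1.5 m × 0.7/2000 ≈ 0.5 mm.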
The reference surface of a slidable surface means the self-sliding surface of which the slidable surface is a region, along with its coordinate system, fixed in position such that the slide parameter remains zero. The reference surface with its coordinate system can be described parametrically by (U,V) ↦ (P(U,V)∈ℝ³) if the self-sliding surface is described by (u,v) ↦ (P(u+s,v)∈ℝ³), where s is the slide parameter.
render: see graphic.
A camera pixel may be said to be responsive to a scene point if there exists an optical path connecting that scene point to the camera pixel, either directly or via specular reflection, along which light emanating from the scene point can stimulate the camera pixel.
The scanning resolution means a predetermined distance approximating the maximum distance by which any point of a screen carrying a graphic moves between two consecutive frame captures by the camera, as determined by a predetermined speed at which the screen moves and by a predetermined rate at which the camera captures frames.
The scene means the surroundings of an object under inspection. Typically a reflected (and possibly distorted) image of some part of the scene is visible as a reflection in the specular surface of the object under inspection, if the surface is specular to visible light.
A scene point means a point of the scene surface, with latitude and longitude determined in reference to the (stationary) scene surface.
The term scene point encoding means modulation of light (over time or over some time-dependent variable) emanating from scene points such that the modulated light emanating from each scene point carries information about the location of that scene point.
The term scene point decoding means an attempt (e.g. by a computer) to recover the location of a scene point from observation of the light emanating from it, as modulated by scene point encoding.
A scene surface means a (physical or notional) surface consisting of points in the scene, each of which can potentially emanate light. The scene surface is stationary relative to (i.e. held in fixed relationship with) the object under inspection.
A self-sliding surface means a smooth surface S⊂ℝ³ that is invariant under some continuous one-parameter group of transformations, which shall be referred to as sliding motions. If the sliding motions are rigid motions (as would be required if the surface is rigid), then the self-sliding surface may be any of the following examples:
A screen means a surface on which a predetermined graphic can be rendered. If the screen moves the predetermined graphic is carried along with the screen.
To shift a signal x ↦ f(x) by Δ, or to apply a shift Δ to the signal, means transforming the signal into x ↦ f(x−Δ).
The shift-invariant distance between two optical signals f: ℝ→O and g: ℝ→O means the optical distance between them, minimized over all relative shifts between the signals, or more formally: min{∫₋∞^∞ ∥f(u)−g(u+s)∥₂ du | s∈ℝ}. Informally, the shift-invariant distance between two optical signals is the optical distance between them after optimally aligning them to each other. The shift-invariant distance is deliberately defined such that its value does not change when one of the signals under consideration is shifted relative to the other.
A signal means a mathematical function of one scalar variable, such as scan state (temporally varying), or longitude (spatially varying).
A slidable surface means a bounded region of a self-sliding surface. When a slidable surface undergoes sliding motion, it remains confined to the self-sliding surface of which it is a region. And if it undergoes substantially sliding motion, it remains substantially confined to the surface of which it is a region.
A sliding motion: see self-sliding surface.
A specular surface means a glossy surface of a physical body on which (at the scale of neighborhoods observed by individual camera pixels, and at the wavelengths of light observed by the relevant camera) at least a non-negligible fraction of incoming light undergoes specular reflection rather than diffuse reflection.
A surface means a surface in the geometric (mathematical) sense of the word, in physical space, without necessarily involving any physical body unless it is a surface of a body. A surface is a generalization of a plane or subplane that need not be flat, meaning that its curvature is not necessarily zero. This is analogous to a curve generalizing a straight line, or a curve segment generalizing a line segment.
The trailing edge of a slidable surface means the curve consisting of, for each latitude occurring in the surface, the point of minimal longitude at that latitude. More formally, the trailing edge is {S(u,v) | (u,v)∈R ∧ ∄(u′,v′)∈R [v′=v ∧ u′<u]} if the slidable surface is described parametrically by S: ((u,v)∈R) ↦ (P(u+s,v)∈ℝ³).
unmapped pixel: see light map.
Number | Date | Country | Kind
--- | --- | --- | ---
2018/06583 | Oct 2018 | ZA | national

Filing Document | Filing Date | Country | Kind
--- | --- | --- | ---
PCT/ZA2019/050064 | 10/4/2019 | WO | 00