The present application generally relates to the field of computer technologies, and more particularly to a method and apparatus for identifying and displaying augmented reality (AR) content.
Nowadays, some known AR applications can be used to detect an AR target image and display AR content together with the AR target in an AR scene. Such known AR applications typically adopt a marker-based method or a markerless method for image processing. Specifically, a marker-based AR application can detect marker(s) in the AR target image and then retrieve AR content based on the detected marker(s). The design of such markers, however, typically provides a non-esthetic visual look when the markers are applied to commercial products. On the other hand, a markerless AR application can detect visually distinctive features distributed in the AR target image as key points, and then compare the acquired key points with reference key point data stored at an AR database. AR content corresponding to the stored key points that match the acquired key points can then be retrieved and displayed. Such a markerless AR application, however, typically imposes a high CPU burden due to complex computing and processing of data, particularly for a continuous visual search of the AR database. Furthermore, markerless AR applications typically show unreliable detection performance when the target image contains repetitive patterns or other less-distinctive features.
Therefore, a need exists for a method and apparatus that can provide a fast and reliable AR content search, as well as an esthetic visual look for an AR target image.
The above deficiencies associated with the known AR applications may be addressed by the techniques described herein.
In some embodiments, a method for detecting an AR target image and retrieving AR content for the detected AR target image is disclosed. The method is performed at a user device, which has one or more processors and memory for storing one or more programs to be executed by the one or more processors. The method includes receiving data of the AR target image. The method includes detecting, based on the data of the AR target image, a group of markers on the AR target image. In some instances, the group of markers can include at least five dots. In some instances, the group of markers can be located within a first region of the AR target image, and an image can be displayed within a second region of the AR target image that is mutually exclusive from the first region of the AR target image.
The method includes calculating a set of cross ratios based on locations of the group of markers. In some instances, the set of cross ratios can include at least two cross ratios. In some instances, the method includes calculating the set of cross ratios based on a set of projected coordinates of the group of markers and a unique order of the group of markers. The set of projected coordinates and the unique order of the group of markers can be determined based on the data of the AR target image. In some instances, the unique order of the group of markers can be determined based on at least a shape of a marker from the group of markers, a design of a marker from the group of markers, a color of a marker from the group of markers, a predefined rotational direction, and/or the like.
The method also includes retrieving, based on the set of cross ratios, AR content associated with the AR target image. In some instances, the AR content can include a three-dimensional (3D) object. In some instances, retrieving the AR content includes sending the calculated set of cross ratios to a database such that the calculated set of cross ratios is compared with a group of predefined sets of cross ratios stored in the database, and a predefined set of cross ratios that matches the calculated set of cross ratios is determined from the group of predefined sets of cross ratios. AR content associated with the determined predefined set of cross ratios is then retrieved from the database and provided to the user device. The method further includes displaying the retrieved AR content and the AR target image in a single AR scene.
In some embodiments, a user device includes one or more processors and memory storing one or more programs for execution by the one or more processors. The one or more programs include instructions that cause the user device to perform the method for detecting an AR target image and retrieving AR content for the detected AR target image as described above. In some embodiments, a non-transitory computer readable storage medium of a user device stores one or more programs including instructions for execution by one or more processors. The instructions, when executed by the one or more processors, cause the processors to perform the method of detecting an AR target image and retrieving AR content for the detected AR target image as described above.
In some embodiments, a method for searching and retrieving AR content for an AR target image is disclosed. The method is performed at a server device, which has one or more processors and memory for storing programs to be executed by the one or more processors. The method includes receiving a set of cross ratios associated with the AR target image. The method includes comparing the received set of cross ratios with a group of predefined sets of cross ratios. Each predefined set of cross ratios from the group of predefined sets of cross ratios is associated with an AR content file from a group of AR content files. The method also includes determining, based on the comparison result, an AR content file from the group of AR content files. The method further includes sending AR content associated with the AR content file to a user device such that the user device displays the AR content and the AR target image in a single AR scene.
In some instances, the method includes determining, from the group of predefined sets of cross ratios, a predefined set of cross ratios that matches the received set of cross ratios. The method includes identifying, from the group of AR content files, the AR content file associated with the determined predefined set of cross ratios.
In some instances, each predefined set of cross ratios from the group of predefined sets of cross ratios can be associated with data of a keypoint descriptor and an AR content file from a group of AR content files, where the data of the keypoint descriptor is associated with the AR content file. In such instances, the method further includes receiving data of a keypoint descriptor of the AR target image.
Moreover, to determine the AR content file, the method includes identifying, based on the comparison result and from the group of predefined sets of cross ratios, a subset of the group of predefined sets of cross ratios. Each predefined set of cross ratios from the subset of the group of predefined sets of cross ratios is closer to the received set of cross ratios than each predefined set of cross ratios excluded from the subset of the group of predefined sets of cross ratios. The method includes comparing the data of the keypoint descriptor of the AR target image with data of keypoint descriptors associated with the subset of the group of predefined sets of cross ratios. The method also includes determining, based on the comparison of data of keypoint descriptors, data of a keypoint descriptor that matches the data of the keypoint descriptor of the AR target image. The method further includes identifying the AR content file associated with the determined data of keypoint descriptor.
Various advantages of the present application are apparent in light of the descriptions below.
The aforementioned implementation of the present application as well as additional implementations will be more clearly understood as a result of the following detailed description of the various aspects of the application when taken in conjunction with the drawings.
Like reference numerals refer to corresponding parts throughout the several views of the drawings.
Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the subject matter presented herein. But it will be apparent to one skilled in the art that the subject matter may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.
To promote an understanding of the objectives, technical solutions, and advantages of the present application, embodiments of the present application are further described in detail below with reference to the accompanying drawings.
Although shown in
The network 150 can be any type of network configured to operatively couple one or more server devices (e.g., the server device 140) to one or more user devices (e.g., the user device 120), and to enable communications between the server device(s) and the user device(s). In some embodiments, the network 150 can include one or more networks such as, for example, a cellular network, a satellite network, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), etc. In some embodiments, the network 150 can include the Internet. Furthermore, the network 150 can optionally be implemented using any known network protocol including various wired and/or wireless protocols such as, for example, Ethernet, universal serial bus (USB), global system for mobile communications (GSM), enhanced data GSM environment (EDGE), general packet radio service (GPRS), long term evolution (LTE), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time division multiple access (TDMA), Bluetooth, Wi-Fi, voice over internet protocol (VoIP), Wi-MAX, etc.
The server device 140 can be any type of device configured to function as a server-side device of the system 100. Specifically, the server device 140 is configured to communicate with one or more user devices (e.g., the user device 120) via the network 150, and provide AR content to the user device(s). In some embodiments, the server device 140 can be, for example, a background server, a back end server, a database server, a workstation, a desktop computer, a cloud computing server, a data processing server, and/or the like. In some embodiments, the server device 140 can be a server cluster or server center consisting of two or more servers (e.g., a data processing server and a database server). In some embodiments, the server device 140 can be referred to as, for example, an AR server.
In some embodiments, the server device 140 can include a database that is configured to store AR content and other data and/or information associated with AR target images. In some embodiments, a server device (or an AR server, e.g., the server device 140) can be any type of device configured to store AR content and accessible to one or more user devices (or AR devices, e.g., the user device 120). In such embodiments, the server device can be accessed by a user device via one or more wired and/or wireless networks (e.g., the network 150) or locally (i.e., not via a network). In some embodiments, the server device can be accessed by a user device in an ad-hoc manner such as, for example, home Wi-Fi, NFC (near field communication), Bluetooth, infrared radio frequency, in-car connectivity, and/or the like.
The user device 120 can be any type of electronic device configured to function as a client-side device of the system 100. Specifically, the user device 120 is configured to communicate with one or more server device(s) (e.g., the server device 140) via the network 150 to display AR content with AR target images for user(s) that operate the user device 120. In some embodiments, the user device 120 can be, for example, a cellular phone, a smart phone, a mobile Internet device (MID), a personal digital assistant (PDA), a tablet computer, an e-book reader, a laptop computer, a handheld computer, a desktop computer, a wearable device, and/or any other personal electronic device. In some embodiments, a user device can also be, for example, a mobile device, a client device, a terminal, a portable device, an AR device, and/or the like.
Additionally, a user operating the user device 120 can be any person (potentially) interested in viewing AR content and an AR target image in a single AR scene. Such a person can be, for example, an instructor, a photographer, a designer, a painter, an artist, a student, a computer graphics designer, etc. As shown and described herein, the system 100 (including the server device 140 and the user device 120) is configured to enable the user(s) operating the user device 120 to view AR content and an AR target image in the same AR scene.
Although shown as two separate devices in
The marker-based AR application identifies embedded AR index data based on the detected AR marker. The marker-based AR application then searches for appropriate AR content using the AR index data. AR content can include one or more visible objects of any type. In some embodiments, AR content can include a virtual object such as, for example, a 3D object (e.g., a 3D cartoon of a bee as shown in
The markerless AR application then searches for appropriate AR content using the acquired key points. In some embodiments, for example, the markerless AR application compares the acquired key points with reference key point data stored in a database, where the reference key point data are associated with various AR content. If the markerless AR application can retrieve appropriate AR content based on the acquired key points, then the AR device displays the AR content (e.g., a 3D object such as a 3D cartoon of a bee in
In some embodiments, a marker-based AR application typically imposes a relatively lower CPU burden than a markerless AR application because the detection algorithm for a fiducial marker is simpler than a markerless algorithm. However, the design of a marker usually provides a non-esthetic visual look when it is applied to commercial products. On the other hand, the markerless AR application typically imposes a relatively high CPU burden, particularly for a continuous visual search of a database. A markerless algorithm may also show unreliable detection performance when the target image contains repetitive patterns and/or less-distinctive features, which are sometimes preferred in graphic design.
In some embodiments, a user device (e.g., an AR device, the user device 120 in
After computing the set of cross ratios, the user device sends the computed set of cross ratios to a server device (e.g., the server device 140 in
In some embodiments, occlusion and/or illumination on the set of markers (e.g., dots) may cause the user device to produce an incorrect calculation of the cross ratios. As a result, execution of the primary method may not retrieve the appropriate AR content for the target image. In such embodiments, the markerless algorithm (i.e., the back-up method) can be activated to identify the appropriate AR content for the target image by detecting distinctive key points, similar to the process described above with respect to
In some embodiments, the distinctive key points (or distinctive features) of the target image can be located within a second region (e.g., a specified inner region) of the target image. In some embodiments, the second region and the first region can be mutually exclusive. In such embodiments, for example, a (complete) image is displayed within the second region; and the set of markers but no portion of the image is displayed within the first region.
In some embodiments, the hybrid method has one or more of the following features and advantages: 1) fast algorithm to retrieve AR index data using the cross ratio based AR content search; 2) reliable AR content search by the hybrid method of image processing; 3) low impact on the target image design in terms of esthetic visual look; and 4) low computational burden for both the AR device and the server device for identification of AR content.
In some embodiments, the software method associated with the hybrid method disclosed herein consists of a simplified fiducial-marker-based AR algorithm as the primary method of identifying the AR index. The software method also executes a markerless AR algorithm if the primary detection method fails to detect the specific set of markers. When the hybrid method computes the cross ratios of the set of markers (e.g., five dots) serving as a simplified marker set and/or computes distinctive key point descriptors, the AR device sends the computed cross ratios and/or the key point descriptors to the AR server to download the appropriate AR content. After the AR device successfully downloads the appropriate AR content from the AR server, the hybrid method computes a camera-pose matrix between the camera of the AR device in 3D space and the surface of the target image, such that the AR device displays the AR content at an appropriate location with the target image in the same AR scene.
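The overall client-side flow can be summarized as follows. This is a minimal sketch only: the helper names and the server interface are hypothetical stand-ins rather than the application's actual implementation, and compute_cross_ratios is defined in a later sketch.

```python
# Minimal sketch of the hybrid flow: try the cross-ratio index first, then
# fall back to the markerless (key point descriptor) search. All helpers and
# the `server` interface are hypothetical.
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]

def identify_ar_content(dots: Optional[Sequence[Point]],
                        descriptors: Optional[bytes],
                        server) -> Optional[bytes]:
    """Return AR content, preferring the cross-ratio lookup over key points."""
    if dots is not None and len(dots) == 5:
        ratios = compute_cross_ratios(dots)            # see the later sketch
        content = server.lookup_by_cross_ratios(ratios)
        if content is not None:
            return content
    if descriptors is not None:                        # back-up markerless path
        return server.lookup_by_descriptors(descriptors)
    return None
```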
In some embodiments, the system (including the user device and the server device) performing the method 300 can include one or more processors and memory. In such embodiments, the method 300 can be implemented using instructions or code of an application that are stored in one or more non-transitory computer readable storage mediums of the system and executed by the one or more processors of the system. The application is associated with identifying an AR target image, determining appropriate AR content for the AR target image, retrieving and displaying the AR content, etc. In some embodiments, such an application can include a server-side portion that is stored in and/or executed at the server device, and a client-side portion that is stored in and/or executed at the user device. As a result of the application being executed, the method 300 is performed at the system. As shown in
At 301, the user device captures a target image and detects five dots located in an outer region of the target image. The user device then computes a set of cross ratios. In some embodiments, the user device can receive data of a target image from, for example, an image-capturing device (e.g., a camera) of the user device. The user device can then detect, based on the data of the target image, a set of markers. In some embodiments, the set of markers can include more or fewer than five dots, and/or one or more other types of markers (e.g., identifiers, symbols, signs, etc. with various shapes, colors, sizes, patterns, etc.). In some embodiments, the target image can include two mutually-exclusive regions, where the set of markers (e.g., five dots) is located within one of the two regions, and an image is located within the other of the two regions. For example, the set of markers can be located within an outer region of the target image that surrounds an inner region of the target image. As another example, the set of markers can be located within a left region of the target image that is to the left of a right region of the target image.
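One plausible way to detect the five dots is a circular-blob search restricted to the outer band of the captured frame. The sketch below assumes the opencv-python package and a simple margin-based definition of the outer region; the application does not prescribe a specific detector, so this is illustrative only.

```python
# Hedged sketch: detect five circular dot markers in the outer band of the
# frame using OpenCV's blob detector (an assumed, not prescribed, choice).
import cv2

def detect_dots(image_bgr, outer_margin=0.2):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    params = cv2.SimpleBlobDetector_Params()
    params.filterByCircularity = True        # the markers are (near-)circular dots
    params.minCircularity = 0.7
    detector = cv2.SimpleBlobDetector_create(params)
    keypoints = detector.detect(gray)

    h, w = gray.shape
    def in_outer_region(pt):                 # within the margin band at the edges
        x, y = pt
        return (x < w * outer_margin or x > w * (1 - outer_margin) or
                y < h * outer_margin or y > h * (1 - outer_margin))

    dots = [kp.pt for kp in keypoints if in_outer_region(kp.pt)]
    return dots if len(dots) == 5 else None  # exactly five dots expected
```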
In some embodiments, the cross ratio is defined as a ratio of ratios in the projective geometry field of applied mathematics. For example, as shown in
Cross Ratio = [(X1 − X3)/(X2 − X3)] / [(X1 − X4)/(X2 − X4)] = [(X1′ − X3′)/(X2′ − X3′)] / [(X1′ − X4′)/(X2′ − X4′)]   (Equation 1)
In the case of a two-dimensional geometric transform, the projective transform (also known as the 2D homography transform) preserves the cross ratio (the ratio of the ratio of lengths), the collinearity of points, and the order of points across views. Since these projective invariants remain unchanged under the image transformation, the cross ratio can be used as an index to retrieve the appropriate AR content that is associated with the identical cross ratio corresponding to the specific target image. In other words, the cross ratio obtained from the captured target image in pixel coordinates and the cross ratio computed from a reference image (e.g., a hard-copy image as a "true image") are identical. Thus, the cross ratio preserves a unique value of the target image from any viewing direction of the AR device that captures the target image.
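Equation 1 can be transcribed directly, and its invariance checked numerically; the particular projective map used below is an arbitrary example chosen for illustration.

```python
# Equation 1: cross ratio of four collinear points, written as a ratio of ratios.
def cross_ratio_1d(x1, x2, x3, x4):
    return ((x1 - x3) / (x2 - x3)) / ((x1 - x4) / (x2 - x4))

# A 1D projective map x -> (a*x + b) / (c*x + d) models how collinear points
# re-project under a change of viewpoint; the cross ratio should not change.
def proj(x, a=2.0, b=1.0, c=0.5, d=3.0):
    return (a * x + b) / (c * x + d)

pts = [0.0, 1.0, 2.0, 4.0]
assert abs(cross_ratio_1d(*pts) -
           cross_ratio_1d(*[proj(x) for x in pts])) < 1e-9
```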
In some embodiments, the design of dots for the cross ratio calculation can be important in order to provide reliable recognition of the dots as markers for the target image. The layout of dots can have strong features in terms of, for example, shape, color, gray scale, size, etc. In some embodiments, for example, the shape of a small black dot with a white circle surrounding the small black dot (as shown in
In some embodiments, multiple cross ratios can be calculated for a set of multiple markers (e.g., five markers) with a given order. For example, as illustrated in Equations 2 and 3 below, at least two cross ratios can be calculated for a set of five markers (e.g., dots) with a given order.
Cross Ratio 1 = (|M431| × |M521|) / (|M421| × |M531|)   (Equation 2)

Cross Ratio 2 = (|M421| × |M532|) / (|M432| × |M521|)   (Equation 3)
Where each Mijk is the 3×3 matrix of homogeneous marker coordinates

Mijk = | Xi Xj Xk |
       | Yi Yj Yk |
       |  1  1  1 |

with suffixes i, j, and k being indexes of the markers in the given order (i.e., i, j, and k can be any of 1 to 5); (Xi, Yi) being the 2D coordinates of the marker with index i; and the scalar value |Mijk| being the determinant of the matrix Mijk. In some embodiments, a cross ratio for a set of markers can be calculated using any other suitable method.
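Equations 2 and 3 translate directly into code. The sketch below assumes the homogeneous-coordinate form of Mijk given above and uses numpy for the determinants; marker indexes are 1-based to match the equations.

```python
import numpy as np

def det_m(p, i, j, k):
    """|Mijk|: determinant of the homogeneous-coordinate matrix built from
    markers i, j, k (1-based indexes into the ordered marker list p)."""
    (xi, yi), (xj, yj), (xk, yk) = p[i - 1], p[j - 1], p[k - 1]
    return np.linalg.det(np.array([[xi,  xj,  xk],
                                   [yi,  yj,  yk],
                                   [1.0, 1.0, 1.0]]))

def compute_cross_ratios(p):
    """Equations 2 and 3 for five ordered, coplanar markers."""
    cr1 = (det_m(p, 4, 3, 1) * det_m(p, 5, 2, 1)) / \
          (det_m(p, 4, 2, 1) * det_m(p, 5, 3, 1))   # Equation 2
    cr2 = (det_m(p, 4, 2, 1) * det_m(p, 5, 3, 2)) / \
          (det_m(p, 4, 3, 2) * det_m(p, 5, 2, 1))   # Equation 3
    return cr1, cr2
```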
Next, projected coordinates (e.g., 2D coordinates) of each marker on the target image can be determined based on the captured images, and then cross ratios can be calculated, respectively, for the two captured images using the corresponding projected coordinates. The resulting cross ratios for the two captured images, however, are identical because the cross ratio is a projective invariant. Specifically, the cross ratio calculated using Equation 2 for the captured image from the top view is equal to the cross ratio calculated using Equation 2 for the captured image from the tilted view; and the cross ratio calculated using Equation 3 for the captured image from the top view is equal to the cross ratio calculated using Equation 3 for the captured image from the tilted view.
In some embodiments, for example, one or more cross ratios for the captured image from the top view can be pre-calculated and stored in a database at an AR server as reference data. In such embodiments, an AR device can capture an image of the target image (e.g., a captured image from the tilted view), calculate one or more cross ratios based on projected coordinates obtained from the captured image, and then send the calculated cross ratio(s) to the AR server. In response to receiving the calculated cross ratio(s), the AR server can compare the calculated cross ratio(s) with the pre-calculated cross ratios stored as reference data in the database to determine a match between the calculated cross ratio(s) and stored cross ratio(s). The AR server can then retrieve appropriate AR content associated with the matched cross ratio(s) and send the appropriate AR content to the AR device.
Similarly stated, in some embodiments, a user device can capture an AR target image from a first viewing direction, and calculate a first set of cross ratios based on a first set of projected coordinates of a set of markers of the target image that are associated with the first viewing direction. The user device can also capture the same AR target image from a second viewing direction different than the first viewing direction, and calculate a second set of cross ratios based on a second set of projected coordinates of the set of markers of the target image that are associated with the second viewing direction. The second set of projected coordinates is different from the first set of projected coordinates. Due to the invariant feature of cross ratios for the same target image, the first set of cross ratios is identical to the second set of cross ratios.
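This invariance can be verified numerically by warping a set of marker coordinates with an arbitrary homography and recomputing both cross ratios, reusing compute_cross_ratios from the sketch above. The specific matrix and marker layout below are illustrative assumptions; any non-degenerate choice behaves the same way.

```python
import numpy as np

def apply_homography(H, pts):
    """Project 2D points through a 3x3 homography (two viewing directions)."""
    out = []
    for x, y in pts:
        xh, yh, w = H @ np.array([x, y, 1.0])
        out.append((xh / w, yh / w))
    return out

# Arbitrary non-degenerate homography standing in for a tilted camera view.
H = np.array([[1.2,  0.1,  5.0],
              [0.05, 0.9, -3.0],
              [1e-3, 2e-3, 1.0]])

# Five ordered markers, no three collinear (see the collinearity caveat below).
markers = [(0.0, 0.0), (10.0, 1.0), (12.0, 8.0), (5.0, 12.0), (-2.0, 7.0)]
cr_top = compute_cross_ratios(markers)                    # "top view"
cr_tilted = compute_cross_ratios(apply_homography(H, markers))
assert np.allclose(cr_top, cr_tilted)                     # projective invariant
```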
A known calculation difficulty of the cross ratio arises in the projective geometry of applied mathematics. In some embodiments, when three markers from a set of five markers are located on a same line (i.e., are collinear), the cross ratios calculated for the set of five markers using Equation 2 or 3 will be zero or infinity. Therefore, in such embodiments, the distribution of the five markers should avoid such a collinear condition to obtain mathematically meaningful values for the cross ratios. FIG. 5E depicts an example of the collinear condition described above, where dots #1, #4 and #5 are located (substantially) on the same line.
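Because |Mijk| is proportional to the signed area of the triangle formed by markers i, j, and k, a near-zero determinant flags a (nearly) collinear triple. A simple layout check, reusing det_m from the sketch above, might look like this; the tolerance value is an assumption and in practice would be scaled to the marker coordinates.

```python
from itertools import combinations

def has_collinear_triple(p, eps=1e-6):
    """True if any three of the markers are (nearly) collinear, i.e. some
    |Mijk| is (nearly) zero, which drives Equations 2 and 3 to 0 or infinity."""
    return any(abs(det_m(p, i, j, k)) < eps
               for i, j, k in combinations(range(1, len(p) + 1), 3))
```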
As described above, a cross ratio for a set of markers can be calculated based on an order of markers from the set of markers. In some embodiments, a cross ratio can have different values for different orderings of the same set of markers. In other words, a change in the order of the markers can cause a change in the resulted value of the cross ratio. In some embodiments, an order of a set of markers can be defined using any suitable method. For example, an order of a set of markers can be defined based on shapes of the markers, designs of the markers, colors of the markers, sizes of the markers, a predefined rotational direction, and/or the like.
Furthermore, for example, an order of the five dots can be defined based on colors of the dots such that a black dot is marker number 1, a red dot is marker number 2, a green dot is marker number 3, a yellow dot is marker number 4, and a blue dot is marker number 5. As a result, if two dots at the same location in two of the three target images have different colors, then the orders of the five dots for the two target images are different. For example, based on the different pattern of colors between
In some embodiments, as described above with respect to Equations 2 and 3, 2D coordinates (e.g., camera pixel coordinates) of each marker from a set of markers and the order of the set of markers are determined before cross ratios for the set of markers can be calculated. Various methods can be used to determine the order of the set of markers. For example, a marker having a different size, color, shape, etc. than the other markers from the set of markers can be determined as the first marker, and the order of the remaining markers can be determined based on a predefined rotational direction (e.g., clockwise or counter clockwise). For example, as shown in the target image on the right side of
A pair of X-Y coordinates of a marker Pi can be converted to cylindrical (polar) coordinates as Pi(Xi, Yi) = Pi(ri·cos θi, ri·sin θi), where ri = √(Xi² + Yi²) is the radius, θi is the angle, and tan θi = Yi/Xi (i.e., θi = arctan(Yi/Xi)).
After the first marker is determined, the remaining markers can be ordered using their values of θ. For example, as shown in
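An ordering routine along these lines is sketched below. It assumes the caller has already identified the visually distinct first marker (by size, color, shape, etc.) and sorts the remaining markers counter-clockwise by polar angle; measuring the angles about the marker centroid, rather than the raw image origin, is a design choice of this sketch that makes the ordering independent of where the markers sit in the frame.

```python
import math

def order_markers(points, first_index):
    """Return the markers with the distinct one first, the rest sorted by
    polar angle (a predefined rotational direction) starting from marker #1."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)

    def angle(p):
        # atan2 is the robust form of theta = arctan(Y/X): it handles X = 0
        # and places the angle in the correct quadrant.
        return math.atan2(p[1] - cy, p[0] - cx)

    first = points[first_index]
    rest = [p for i, p in enumerate(points) if i != first_index]
    rest.sort(key=lambda p: (angle(p) - angle(first)) % (2 * math.pi))
    return [first] + rest
```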
Returning to
Otherwise, if at 302 the user device determines that the correct set of cross ratios is not obtained, at 306, the user device detects key points within the inner region of the target image and computes descriptor vectors based on the detected key points. At 307, the server device compares the descriptor vectors of the target image and those of reference images stored in the server device. In some embodiments, the server device storing the descriptor vectors of the reference images can be the same as or different from the server device that stores the data of cross ratios at 303.
Subsequently, at 308, the server device determines whether the descriptor vectors of the target image match the descriptor vectors of any reference image. If the server device determines that the descriptor vectors of the target image do not match the descriptor vectors of any reference image, then the process is terminated for the target image and the process returns to and restarts from 301 for another target image.
Otherwise, if at 308 the server device determines that the descriptor vectors of the target image match the descriptor vectors of a reference image, at 309, the user device downloads appropriate AR content from the server device storing the descriptor vectors of the reference images. The user device then computes a homography matrix. Next, at 310, the user device computes a camera-pose estimation using the homography matrix. Finally, the user device proceeds to 305 to display the AR content on the surface of the target image.
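A standard way to recover the camera pose at 310 from the homography H, given the camera intrinsic matrix K, is to scale the first two columns of K⁻¹H into rotation columns and take their cross product for the third. The sketch below shows this common formulation; it is not necessarily the exact computation used by the application.

```python
import numpy as np

def camera_pose_from_homography(H, K):
    """Recover [R | t] from a homography that maps target-plane coordinates
    (Z = 0) to image pixels, given camera intrinsics K."""
    A = np.linalg.inv(K) @ H
    lam = 1.0 / np.linalg.norm(A[:, 0])    # scale fixed by unit rotation column
    r1, r2, t = lam * A[:, 0], lam * A[:, 1], lam * A[:, 2]
    R = np.column_stack([r1, r2, np.cross(r1, r2)])
    U, _, Vt = np.linalg.svd(R)            # snap to the nearest true rotation
    return U @ Vt, t                       # camera-pose matrix [R | t]
```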
In some other embodiments, the AR device and the AR server can be operatively interconnected and reside within a single device. In yet some other embodiments, the AR device and the AR server can be operatively interconnected via a wireless network such as the IEEE 802.11 network standards, Bluetooth technologies, infrared radio frequency, NFC, or the like. In yet some other embodiments, the AR server can be a memory device configured to store the AR database, and the AR device can be connected to the AR server to access and retrieve AR content from the AR server. For example, the AR server can be a memory card configured to store the AR database, which can be inserted into the AR device's memory slot such that the AR device can retrieve AR content from the AR server. In some instances, the AR device can download the AR content from the AR server and store the downloaded AR content in the AR device's internal memory (e.g., AR database 738 in
As shown in
In operation, the AR device sends to the AR server cross ratio data of a set of markers in a first region (e.g., an outer region) of a target image and/or key point descriptors of visually distinctive features in a second region (e.g., an inner region) of the target image. In response to receiving the cross ratio data and/or the key point descriptors, the AR server searches the database to determine matched cross ratio data and/or matched key point descriptors by comparing the received cross ratio data and/or key point descriptors with the cross ratios and feature vectors stored in the database. If the AR server determines a match, the AR server identifies an associated AR content file and retrieves AR content from the associated AR content file. The AR server then sends the retrieved AR content to the AR device, such that the AR device displays the AR content with the target image in the same AR scene.
In some embodiments, the AR server can include one or more processors and memory. In such embodiments, the method 600 can be implemented using instructions or code of an application that are stored in one or more non-transitory computer readable storage mediums of the AR server and executed by the one or more processors of the AR server. The application is associated with determining and retrieving appropriate AR content for the AR target image. As a result of the application being executed, the method 600 is performed at the AR server. As shown in
At 601, the AR server receives a set of cross ratios and key point descriptor data from the AR device. As described above, the set of cross ratios can be calculated at the AR device based on a set of markers (e.g., five dots) located within a first region (e.g., an outer region) of the AR target image. The key point descriptor data can be computed at the AR device based on visually distinctive features located within a second region (e.g., an inner region) of the AR target image that is mutually exclusive from the first region.
At 602, the AR server compares the set of cross ratios with cross ratios stored in the database using a first threshold. At 603, the AR server determines whether the received set of cross ratios matches any stored set of cross ratios using the first threshold. Specifically, the AR server determines whether the received set of cross ratios is equal to any cross ratio data set stored in the database of the AR server. For example, such a comparison can be illustrated in the following Equation 4:
|Cross_Ratio(1)_dev − Cross_Ratio(1, j)_svr| < Threshold_1

|Cross_Ratio(2)_dev − Cross_Ratio(2, j)_svr| < Threshold_1   (Equation 4)

Where Threshold_1 is the first threshold, which is a predefined threshold value for the first stage of evaluation; Cross_Ratio(1)_dev and Cross_Ratio(2)_dev are a pair of cross ratios computed by the AR device from the received data of the AR target image; and Cross_Ratio(1, j)_svr and Cross_Ratio(2, j)_svr are a pair of cross ratios with ID number j stored in the database of the AR server.
Thus, the AR server compares the received set of cross ratios (i.e., Cross_Ratio(1)_dev and Cross_Ratio(2)_dev) with each set of cross ratios (i.e., Cross_Ratio (1, j)_svr and Cross_Ratio (2, j)_svr) stored in the database to determine whether the received set of cross ratios and any stored set of cross ratios satisfy Equation 4.
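The first-stage comparison can be sketched as a linear scan over the stored pairs. The database layout (a sequence of (ID, cross-ratio pair) records) and the threshold value are assumptions for illustration, since the application only states that Threshold_1 is predefined.

```python
THRESHOLD_1 = 1e-3   # hypothetical value; the application leaves it predefined

def match_stage_one(dev_ratios, database):
    """Equation 4: both cross ratios must agree within Threshold_1.
    database: iterable of (entry_id, (cr1_svr, cr2_svr)) records."""
    cr1_dev, cr2_dev = dev_ratios
    for entry_id, (cr1_svr, cr2_svr) in database:
        if (abs(cr1_dev - cr1_svr) < THRESHOLD_1 and
                abs(cr2_dev - cr2_svr) < THRESHOLD_1):
            return entry_id          # ID of the matched cross ratio set
    return None                      # no match at the first stage
```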
If at 603 the AR server determines that the received set of cross ratios matches a stored set of cross ratios using the first threshold, then at 604, the AR server determines an AR content file that is associated with the matched cross ratio set. Subsequently, at 605, the AR server sends AR content of the determined AR content file to the AR device.
For example, if Equation 4 is satisfied by the stored set of cross ratios identified by ID number j, then the AR server determines that AR content associated with ID number j is appropriate for the AR target image. As a result, the AR server determines an AR content file associated with the ID number j in the database, then retrieves AR content from that AR content file and sends the AR content to the AR device. Consequently, the AR device displays the AR content with the AR target image in the same AR scene.
Otherwise, if at 603 the AR server determines that the received set of cross ratios does not match any stored set of cross ratios using the first threshold, at 606, the AR server performs a next matching procedure to determine appropriate AR content for the AR target image. Specifically, the AR server adopts another threshold (e.g., Threshold_2) to determine candidate cross ratio sets that are close to the received set of cross ratios. Note that because the first stage of evaluation fails at 603, none of the candidate cross ratio sets is equal to the received set of cross ratios according to the first threshold. In other words, none of the candidate cross ratio sets satisfies Equation 4.
At 6061, the AR server applies the second threshold to identify candidates of cross ratio sets stored in the database. Specifically, the AR server computes absolute values of difference between candidates of cross ratio sets and the set of cross ratios provided by the AR device. For example, a deviation (Diff) of each cross ratio set stored in the database from the received set of cross ratios can be calculated using the following Equation 5:
Diff(k) = |Cross_Ratio(1)_dev − Cross_Ratio(1, k)_svr| + |Cross_Ratio(2)_dev − Cross_Ratio(2, k)_svr|   (Equation 5)
Where Diff(k) is the absolute value of difference between a stored cross ratio set with ID number k and the set of cross ratios received from the AR device.
The AR server then compares the calculated deviation of each stored cross ratio set with the second threshold (e.g., Threshold_2), and identifies each stored cross ratio set whose deviation is less than the second threshold as a candidate cross ratio set. As a result, each identified candidate cross ratio set is not equal to, but is close to, the set of cross ratios received from the AR device.
At 6062, the AR server determines the priority of the candidate cross ratio sets based on the absolute value of difference (i.e., the deviation) of each candidate. Specifically, the AR server compares the deviation values of the candidate cross ratio sets to place them in order from the smallest to the largest. The first candidate has the smallest deviation and is the closest, among all the candidates, to the set of cross ratios received from the AR device. Similarly, the last candidate has the largest deviation and is the farthest, among all the candidates, from the set of cross ratios received from the AR device.
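Steps 6061 and 6062 amount to scoring every stored pair with Equation 5, filtering by Threshold_2, and sorting by deviation. As before, the record layout and the threshold value are illustrative assumptions.

```python
THRESHOLD_2 = 1e-2   # hypothetical value for the second-stage threshold

def candidates_by_priority(dev_ratios, database):
    """Equation 5: rank stored cross ratio sets whose deviation from the
    received pair is under Threshold_2, smallest deviation first."""
    cr1_dev, cr2_dev = dev_ratios
    scored = []
    for entry_id, (cr1_svr, cr2_svr) in database:
        diff = abs(cr1_dev - cr1_svr) + abs(cr2_dev - cr2_svr)
        if diff < THRESHOLD_2:
            scored.append((diff, entry_id))
    scored.sort()                        # smallest Diff(k) = highest priority
    return [entry_id for _, entry_id in scored]
```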
At 6063, the AR server performs a matching procedure to determine appropriate AR content using the key point descriptor data computed by the AR device and key point descriptors associated with the candidates of cross ratio sets, which are stored in the database of the AR server (as shown in
Specifically, according to the priority order determined at 6062, the AR server first compares the key point descriptor associated with the first candidate (i.e., the one having the smallest deviation) with the key point descriptor received from the AR device. If the key point descriptor associated with the first candidate matches (e.g., is equal to) the received key point descriptor, then the AR server determines that the first candidate is a matched candidate, and AR content of the AR content file associated with the first candidate is the appropriate AR content for the AR target image. Otherwise, if the key point descriptor associated with the first candidate does not match (e.g., is not equal to) the received key point descriptor, then the AR server subsequently compares the key point descriptor associated with the second candidate (i.e., the one having the second smallest deviation) with the key point descriptor received from the AR device.
Similarly, if the key point descriptor associated with the second candidate matches (e.g., is equal to) the received key point descriptor, then the AR server determines that the second candidate is a matched candidate, and AR content of the AR content file associated with the second candidate is the appropriate AR content for the AR target image. Otherwise, if the key point descriptor associated with the second candidate does not match (e.g., is not equal to) the received key point descriptor, then the AR server moves on to the third candidate. The AR server repeats such operations until a matched candidate is determined or all the candidates are compared to the data received from the AR device.
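The walk through the prioritized candidates at 6063 then reduces to a loop that stops at the first descriptor match. The descriptor_matches predicate is a placeholder; a real system would apply a feature-specific distance test over the descriptor vectors.

```python
def match_stage_three(device_descriptors, candidate_ids, descriptor_store,
                      descriptor_matches):
    """Compare the device's key point descriptors against each candidate's
    stored descriptors in priority order; return the first matched ID."""
    for entry_id in candidate_ids:
        if descriptor_matches(device_descriptors, descriptor_store[entry_id]):
            return entry_id          # matched candidate -> its AR content file
    return None                      # fall through to the extensive search
```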
Returning to
Otherwise, if the AR server determines that the key point descriptor data received from the AR device does not match any candidate key point descriptor, at 608, the AR server performs an extensive search over the remaining non-candidate key point data that is not included in the search performed at 606-607. In other words, the AR server compares the received key point descriptor with the key point descriptors associated with each cross ratio set that is stored in the database and was not identified as a candidate at 6061.
At 609, the AR server determines whether the key point descriptor data received from the AR device matches any remaining non-candidate key point descriptor. If the AR server determines that the key point descriptor data received from the AR device matches a remaining non-candidate key point descriptor, the AR server proceeds to the steps 604-605 to retrieve and send AR content as described above. Otherwise, if the AR server determines that the key point descriptor data received from the AR device does not match any remaining non-candidate key point descriptor, at 610, the process is terminated and no AR content is sent from the AR server to the AR device.
As shown in
The network interface 760 is configured to enable communications between the user device 700 and other devices (e.g., a server device) and/or networks. The network interface 760 is configured to send data to and receive data from another device and/or a network (e.g., the network 150 in
The user input interface 740 is configured to receive input data and signals and also to generate signals caused by operations and manipulations of user input devices such as, for example, the touch sensor 790, the keyboard 720, and other user input means (e.g., a user's finger, a touch pen, a mouse, etc.). The screen 710 may be a touch screen (e.g., a liquid-crystal display (LCD), a light-emitting diode (LED) display, etc.) or a representation of a projection (e.g., providing projection signals). The screen 710 is driven by the screen interface 715, which is controlled by the processor 780. The camera interface 775 is coupled to and controls the camera 770. The camera 770 can be any type of camera configured to capture an image such as, for example, a video camera, an image camera, a complementary metal-oxide semiconductor (CMOS) camera, etc.
The memory 730 is configured to store software programs and/or modules. The processor 780 can execute various applications and data processing functions included in the software programs and/or modules stored in the memory 730. The memory 730 includes, for example, a program storage area and a data storage area. The program storage area is configured to store, for example, an operating system and application programs such as the application module 735. The data storage area is configured to store data received and/or generated during the use of the user device 700 (e.g., AR content, data of target images, calculated cross ratios, etc.).
In some embodiments, as shown in
The memory 730 can include a high-speed random-access memory (RAM), a non-volatile memory such as a disk storage device or a flash memory device, and/or other volatile solid-state memory devices. In some embodiments, the memory 730 also includes a memory controller configured to provide the processor 780 and other components with access to the memory 730. In some embodiments, the memory 730 may be loaded with one or more application modules that can be executed by the processor 780 with or without a user input via the user input interface 740.
In some embodiments, each application module included in the memory 730 can be a hardware-based module (e.g., a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), etc.), a software-based module (e.g., a module of computer code executed at a processor, a set of processor-readable instructions executed at a processor, etc.), or a combination of hardware and software modules. Instructions or code of each application module can be stored in the memory 730 and executed at the processor 780.
Specifically, for example, the application module 735 can be an AR application or module configured to perform a set of functions such as image processing for detecting AR target images, calculating cross ratios, displaying AR content in a camera view area of the screen 710, etc., as described herein. Particularly, when such an AR application or module is executed, the user device 700 receives an image or video from the camera interface 775 and then processes the image or video to determine whether a target image is captured. When such a target image is detected, the user device 700 further processes the image or video to overlay one or more AR objects on a real-scene image or video. Thus, AR content and the target image are displayed in the same AR scene.
The processor 780 functions as a control center of the user device 700. The processor 780 is configured to operatively connect each component of the user device 700 using various interfaces and circuits. The processor 780 is configured to execute the various functions of the user device 700 and to perform data processing by operating and/or executing the software programs and/or modules stored in the memory 730 and using the data stored in the memory 730. In some embodiments, the processor 780 can include one or more processing cores.
Although shown and described above with respect to
The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the present application to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the present application and its practical applications, to thereby enable others skilled in the art to best utilize the present application and various embodiments with various modifications as are suited to the particular use contemplated.
While particular embodiments are described above, it will be understood it is not intended to limit the present application to these particular embodiments. On the contrary, the present application includes alternatives, modifications and equivalents that are within the spirit and scope of the appended claims. Numerous specific details are set forth in order to provide a thorough understanding of the subject matter presented herein. But it will be apparent to one of ordinary skill in the art that the subject matter may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.
The terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the present application. As used in the description of the present application and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “includes,” “including,” “comprises,” and/or “comprising,” when used in this specification, specify the presence of stated features, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, operations, elements, components, and/or groups thereof.
As used herein, the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in accordance with a determination” or “in response to detecting,” that a stated condition precedent is true, depending on the context. Similarly, the phrase “if it is determined [that a stated condition precedent is true]” or “if [a stated condition precedent is true]” or “when [a stated condition precedent is true]” may be construed to mean “upon determining” or “in response to determining” or “in accordance with a determination” or “upon detecting” or “in response to detecting” that the stated condition precedent is true, depending on the context.
Although some of the various drawings illustrate a number of logical stages in a particular order, stages that are not order dependent may be reordered and other stages may be combined or broken out. While some reordering or other groupings are specifically mentioned, others will be obvious to those of ordinary skill in the art and so do not present an exhaustive list of alternatives. Moreover, it should be recognized that the stages could be implemented in hardware, firmware, software or any combination thereof.
This application claims priority to U.S. Provisional Application No. 61/965,753, entitled “A Hybrid Method to Identify AR Target Images in Augmented Reality Applications,” filed Feb. 7, 2014.