The present invention provides a system and method for visual guidance using real-time visual anchor point detection and, in particular, relates to a system that automatically updates real and accurate landmark images for a user.
A vehicle navigation system enables drivers to reach a destination by following navigation instructions, mainly by prompting the estimated distance and displaying a route map. The navigation system usually provides step-by-step instructions and notifies the driver to turn left or right at an intersection tens or hundreds of meters in advance.
However, owing to GPS positioning error, the instructions are sometimes delayed or inaccurate, so the driver may not be able to act at the right moment. Furthermore, the driver is expected to recognize the street name sign by matching the instruction given by the navigation system against the image template. This takes considerable effort and distracts the driver from road conditions.
Current navigation systems provide not only street names and distances but also a simulated map or schematic diagram of the real scene. However, most schematic diagrams or schematic buildings require the driver to spend extra time finding the correct sign. Although they appear to give more information, they easily distract the user from road conditions and actually put the user at greater risk.
Drivers have only a few seconds to complete the series of actions from obtaining the information and judging it to deciding to turn. Therefore, the information provided by the navigation system should be simple and easy to understand. It is preferable to show a picture or photograph identical to the real scene, so that the user can decide whether to turn from the clearest information without any mental conversion.
Some manufacturers have proposed enhancing navigation information with landmark images. However, the landmark images for all routes are defined before the navigation system leaves the factory, so they cannot truly match the actual environment, and misjudgments often occur in use. Consequently, how to effectively update the navigation system with real-time landmark images has become an urgent issue and challenge.
The present invention can automatically plan a route for the user and prompt the user mainly through distance, street name, and building number, and it generates navigation directions based on the route. For example, the system can provide the user with instructions such as “go forward a quarter of a mile and then turn right into Maple Street”. However, it is difficult for the user to accurately estimate the distance indicated by the navigation prompt, and it is not always easy to find the street sign prompted by the navigation system. In addition, some areas have fuzzy street and road signs, which makes it more difficult for users to drive a vehicle while looking for landmarks.
In order to provide the user with navigation similar to guidance by a real person, it is better to refer to prominent landmarks along the travel route to enhance navigation and guidance quality. Such landmarks may be visually prominent buildings or billboards, and each is called a “Visual Anchor” in the present invention. The navigation directions of the system of the present invention can therefore be “in a quarter of a mile, you will see the McDonald's restaurant on your right, then turn right onto Maple Street”. The user can provide the position of the destination (e.g., a street address or coordinates) so that the system can automatically select the appropriate visual landmark when generating navigational directions.
In view of this, the present invention provides a more intuitive and accurate navigation system. In the system, in addition to providing voice instructions to users, landmark images of visual anchor points are captured by human eyes. When the user's vehicle approaches the landmark, the present invention can provide the user with an image of the real landmark. At the same time, the present invention recognizes a visual anchor point (for example, a signboard) and displays the signboard on the user interface. The visual anchor point guides the user by the detected signboard rather than by distance. In this way, the user can focus more on driving and navigate by landmark images without having to rely on personal experience to estimate the distance described by the navigation system, which greatly improves the user's concentration and efficiency in driving the vehicle.
The present invention provides an automatic system for visual guidance navigation using real-time visual anchor point detection, which includes an edge device, a cloud device, and a landmark database, wherein the edge device includes: a camera, which is installed at a location preset by the user and captures a real-time image while the user is driving a vehicle; a user interface, through which the user operates the system, views information provided by an application program, and enters user data and visual anchors; a location module for determining the current geographic location of the vehicle; a wireless network module for transmitting the current geographic location of the vehicle and a destination set by the user; a processor, which performs edge computing and processes the real-time image and the current geographic location of the vehicle to provide the user with a driving instruction through the user interface, the driving instruction including a candidate visual landmark image; a memory device for caching a reference landmark image received through the wireless network module; and a navigation application module, through which the user sets the destination, and which transmits the vehicle position and destination to the wireless network module, obtains a route instruction and landmark image information, and displays the processing result and driving instruction to the user on the user interface; wherein the cloud device includes: a navigation instruction generator, which generates a navigation instruction and an action intersection; a route module, which queries the route from the landmark database according to the current geographic location of the vehicle and the destination; a navigation instruction generation module, which generates the navigation instruction according to the route of the route module and defines the action intersection according to the navigation instruction; a landmark query module, which queries the visual landmark image from the landmark database according to the action intersection; and a landmark update module, which automatically updates the visual landmark image of the landmark database.
Wherein, the landmark database includes landmark records, visual landmark images, intersections where landmarks are located, and latitude and longitude of landmarks.
Preferably, the present invention includes a map database, and the map database includes map information such as intersections, latitude and longitude of intersections, and road travel directions.
The present invention further provides a method for visual guidance navigation using real-time visual anchor point detection, which includes: obtaining a route for guiding a vehicle user to a destination through a processing module; retrieving a visual landmark image set along the route from a database through the processing module; capturing a real-time landmark image from a present location of the user during navigation along the route through a camera; performing an edge calculation by using the retrieved visual landmark image and the collected real-time landmark image through the processing module, wherein the real-time image and the geographic location of the vehicle can be processed; and the user interface provides the user with a driving instruction including a candidate visual landmark image.
The present invention further provides a method for providing driving directions, comprising: receiving a request for driving directions to a destination from a user of the vehicle through a user interface operating in the vehicle; capturing real-time landmark images from a present location of the user during navigation along the route through a camera; performing an edge calculation through the processing module by using the retrieved visual landmark images and the collected real-time landmark images, wherein the real-time images and the geographic location of the vehicle can be processed; and providing the user with a driving instruction via the user interface, the driving instruction including a candidate visual landmark image.
Preferably, the processing module of the present invention further comprises: receiving a candidate visual landmark image at the current geographic location of the vehicle, and comparing the captured real-time image with the received candidate visual landmark image to determine whether the candidate visual landmark image is visible in the real-time image. When the candidate visual landmark image is not visible in the real-time image, the candidate visual landmark image is deleted from the instruction.
Preferably, the present invention determines whether the captured real-time image depicts an object of a predetermined category, and determines whether the object is visible within the real-time image based on at least one of the size or color of the object; if it is determined that the object is visible, the object is selected as the visual landmark image.
Preferably, the predetermined category of the present invention includes storefront signs, buildings, installation art, bridges, texts, vehicles, billboards, traffic lights, or portraits.
Preferably, the processing module of the present invention further comprises: determining whether the captured real-time image depicts an object of a predetermined category, and determining, based on at least one of the size or color of the object, whether the object is visible in the real-time image; if it is determined that the object is not visible, the captured real-time image is stored in the memory device and transmitted to the user interface, where the user can select the best visual landmark image by subjective judgment, perform a voting action, and send the voting action back to the processing module; the processing module can perform calculations according to the voting results to obtain a best visual landmark image, and transmit the best visual landmark image to the landmark database as the subsequent visual landmark image.
Preferably, in the processing module of the present invention, the user is a plurality of users, who can each select the best visual landmark image according to their subjective judgment and perform a voting action, and the voting actions are sent back to the processing module; the processing module can perform calculations according to the voting results of the plurality of users to obtain the best visual landmark image, and transmit the best visual landmark image to the landmark database for use as the subsequent visual landmark image.
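The voting step described above can be illustrated with a simple plurality count. This is only a sketch: the calculation the processing module performs on the voting results is not specified here, so the plurality rule, the function name `best_landmark_by_votes`, and the image identifiers are all assumptions made for illustration.

```python
def best_landmark_by_votes(votes):
    """Return the image id with the most votes, or None if no votes were cast.

    `votes` is a list of image ids (hypothetical identifiers), one entry
    per voting action received from a user.
    """
    counts = {}
    for image_id in votes:
        counts[image_id] = counts.get(image_id, 0) + 1
    if not counts:
        return None
    # max() returns the first key with the highest count, so a tie goes to
    # the image that received a vote earliest (dicts keep insertion order).
    return max(counts, key=counts.get)


# Example: six users vote among three candidate images.
votes = ["img_b", "img_a", "img_b", "img_c", "img_a", "img_b"]
winner = best_landmark_by_votes(votes)
```

A weighted scheme (for example, weighting votes by how recently the image was captured) could replace the plain count without changing the interface.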
The present invention further includes a method for automatically updating the visual landmark images in the landmark database, comprising:
In the method for automatically updating the visual landmark images in the landmark database of the present invention, the filtering rule is a frame area filtering rule or an aspect ratio parameter filtering rule.
The area parameter (frame area) filtering rule is based on the characteristics of the detected candidate landmark pictures themselves and filters out candidate landmark pictures with an unreasonable picture area:
The aspect ratio parameter filtering rule is used to filter out landmark pictures with an unreasonable aspect ratio. A reasonable aspect ratio should be greater than ⅕, preferably greater than ¼, and most preferably greater than ⅓. In addition, a reasonable aspect ratio should be less than 5, preferably less than 4, and most preferably less than 3. In another preferred embodiment, the best aspect ratio is between ⅓ and 3.
In the present invention, the features of the landmark pictures are further extracted through a convolutional neural network model. The input of the model is the original frame of the landmark image (raw frame), and the output is the feature of the image. This feature is used to calculate the similarity between landmark images.
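The aspect ratio filtering rule can be sketched as follows. The function name `passes_aspect_ratio` and the choice of width divided by height as the ratio are assumptions; because the stated bounds are reciprocal pairs (⅓ and 3, ¼ and 4, ⅕ and 5), using height divided by width would accept exactly the same frames. The default bounds are the most preferred values given above.

```python
def passes_aspect_ratio(width, height, low=1 / 3, high=3.0):
    """Return True if the frame's aspect ratio lies strictly between low and high.

    width and height are the pixel dimensions of a detected candidate
    landmark frame. Degenerate frames (non-positive size) are rejected.
    """
    if width <= 0 or height <= 0:
        return False
    ratio = width / height
    return low < ratio < high


# Example frames as (width, height) pairs.
frames = [(200, 100), (100, 400), (500, 100), (120, 90)]
kept = [f for f in frames if passes_aspect_ratio(*f)]
```

Passing `low=1 / 5, high=5.0` gives the loosest of the three stated ranges.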
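As an illustration of how a similarity between two landmark images might be computed from their extracted features, the sketch below uses cosine similarity. The similarity measure is not named in this description, so cosine similarity is an assumption, and the short vectors stand in for the much longer feature vectors a convolutional neural network would output.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Returns a value in [-1, 1]; returns 0.0 if either vector is all zeros.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Placeholder feature vectors (hypothetical values, not real model output).
feature_1 = [1.0, 0.0, 2.0]
feature_2 = [2.0, 0.0, 4.0]   # same direction as feature_1
feature_3 = [0.0, 3.0, 0.0]   # orthogonal to feature_1

s12 = cosine_similarity(feature_1, feature_2)
s13 = cosine_similarity(feature_1, feature_3)
```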
The system of the present invention provides the user with navigational directions using visual landmarks that may be visible when the user arrives at the corresponding geographic location. In a preferred embodiment, the system selects a candidate visual landmark image from an extensive visual landmark database. The system determines the time of day, current weather conditions, current season, and more. In addition, the system can collect real-time images through a camera on the vehicle's dashboard, a camera in a smartphone, or another user's camera. The system may also provide feedback on the visibility or prominence of the landmark to improve the visual landmark imagery for subsequent users of the system.
To help the reviewer further understand the present invention, the preferred embodiment is described in detail below:
The present invention provides an automatic system for visual guidance and navigation using real-time visual anchor point detection, which is shown in
Wherein, the cloud device 14 includes: a navigation instruction generator that generates a navigation instruction, and an action intersection; a route module, which queries the route from the landmark database according to the current geographic location of the vehicle and the destination; a navigation instruction generator 21, which generates the navigation instruction according to the route of the route module 22, and defines an action intersection according to the navigation instruction; a landmark query module 24 that queries visual landmark images from the landmark database 30 according to the action intersection; and a landmark update module 25, which automatically updates the visual landmark images of the landmark database 30. Wherein the landmark database 30 includes a landmark record, a visual landmark image, the intersection where the landmark is located, or the longitude and latitude of the landmark.
The processing module 15 further includes: receiving a candidate visual landmark image at the current geographic location of the vehicle, and comparing the captured real-time image with the received candidate visual landmark image to determine whether the candidate visual landmark image is visible in the real-time image; when the candidate visual landmark image is not visible in the real-time image, the candidate visual landmark image is deleted from the instruction. Also, the processing module 15 of the present invention determines whether the captured real-time image depicts an object of a predetermined category, and determines whether the object is visible within the real-time image based on at least one of the size or color of the object; if it is determined that the object is visible, the object is selected as the visual landmark image.
The present invention provides an automatic visual landmark image acquisition and landmark database update function as shown in
For example, as shown in
The similarity of these pairs (S12) is used to estimate the weight score (Confidence) of each landmark, and the five candidate landmarks are then sorted according to these weight scores. Taking L1 as an example, C1 = f(S1n) (n = 2 to 5). The lower the score, the less similar a landmark is to the other candidate landmarks, and the more representative it is. Therefore, it is used as the candidate landmark image of this intersection. In
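The scoring and sorting of candidate landmarks described above can be sketched as follows. The function f is not defined in this description, so the arithmetic mean of the pairwise similarities is used purely as a hypothetical stand-in, and the function names are assumptions. The ordering follows the rule stated above: candidates are sorted by ascending score and the lowest-scoring candidate is taken first.

```python
def confidence_scores(similarity):
    """Compute C_i = f(S_in) for every candidate i, with f assumed to be the mean.

    similarity is a square matrix (list of lists) where similarity[i][j]
    is the pairwise similarity between candidates i and j.
    """
    n = len(similarity)
    if n < 2:
        raise ValueError("at least two candidate landmarks are required")
    scores = []
    for i in range(n):
        # Use the similarities to every other candidate (skip the diagonal).
        others = [similarity[i][j] for j in range(n) if j != i]
        scores.append(sum(others) / len(others))
    return scores


def rank_candidates(similarity):
    """Return candidate indices sorted by ascending confidence score."""
    scores = confidence_scores(similarity)
    return sorted(range(len(scores)), key=lambda i: scores[i])


# Example with three candidates (hypothetical similarity values).
S = [
    [1.0, 0.2, 0.4],
    [0.2, 1.0, 0.9],
    [0.4, 0.9, 1.0],
]
scores = confidence_scores(S)
order = rank_candidates(S)
```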
When a user loads the automatic system 100 of the present invention for visual guidance and navigation using real-time visual anchor point detection in a vehicle, the vehicle becomes a data collector and can function as one regardless of whether the vehicle is navigating. The present invention designs an automated system that can collect data from these vehicles, scale up with low labor costs, and quickly adapt to dynamically changing environments. The present invention uses camera 11 in the moving vehicle. Camera 11 can be installed at a preset location, which can be chosen according to the size and type of the vehicle; any location from which it is convenient to collect video and retrieve the features of the related visual anchors is suitable. The visual anchors include, but are not limited to, signs, specific buildings, installation art, bridges, text, vehicles, billboards, traffic lights, portraits, and visual icon images. Each vehicle can be regarded as a visual landmark image collector. Each landmark image collector is equipped with a camera and a GPS sensor, so the GPS location of each video can be recorded. When the original video is collected, the landmark image detector detects visual anchors and crops visual landmark images, which can be signs, specific buildings, installations, bridges, text, vehicles, billboards, traffic lights, or people. Thus, the system can collect multiple images of visual landmarks and their attributes, such as GPS locations.
The system of the present invention executes an automatic update program and uses the collected visual landmark images to improve the landmark database, and the process is shown in
Using the automatic system and method of visual guidance navigation with real-time visual anchor point detection, the present invention can automatically update the visual landmark images in the landmark database, as shown in
The present invention simulates the user scenario, and its process flow is shown in
Taking
In summary, the present invention is formed by the wireless network module 14, the map database 40, the landmark database 30, and the application program running on the edge device. The map database 40 contains map information such as intersections, latitude and longitude of intersections, and road travel directions. The landmark database 30 includes a landmark record, a picture corresponding to the landmark, the intersection where the landmark is located, and the latitude and longitude of the landmark. On the wireless network module 14 side, the database stores multiple landmark records as shown in
In edge device 10, a processing module is used for connecting with the server and collecting images to provide a visual guidance function for the user. When a route is planned and all action points are obtained by the navigation instruction generator, the visual anchors and their features for each action point are retrieved from the landmark database. When the user approaches the action point notified by the navigation engine, the processing module will find the corresponding visual anchor by comparing the features of the visual anchor with the features of the sign/landmark image in the video, and the visual anchor will be displayed on the user interface.
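The comparison performed when the user approaches an action point can be sketched as a best-match search over the signs and landmarks detected in the video. This is a minimal illustration under stated assumptions: cosine similarity, the acceptance threshold of 0.8, the function names, and the `(label, feature_vector)` layout of a detection are all hypothetical and not taken from this description.

```python
import math


def _cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_visual_anchor(anchor_features, detections, threshold=0.8):
    """Return the label of the detection that best matches the anchor.

    anchor_features: feature vector retrieved from the landmark database.
    detections: list of (label, feature_vector) pairs from the current video.
    Returns None when no detection reaches the similarity threshold.
    """
    best_label = None
    best_score = threshold
    for label, features in detections:
        score = _cosine(anchor_features, features)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical detections in the current frame.
detections = [("sign_a", [1.0, 0.0]), ("sign_b", [0.6, 0.8])]
match = find_visual_anchor([0.6, 0.8], detections)
no_match = find_visual_anchor([0.0, -1.0], detections)
```

When a match is found, its label would identify the visual anchor to display on the user interface; when none is found, the instruction can fall back to the distance-based prompt.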
Although the present invention has been described in terms of specific exemplary embodiments and examples, it will be appreciated that the embodiments disclosed herein are for illustrative purposes only and various modifications and alterations might be made by those skilled in the art without departing from the spirit and scope of the invention as set forth in the following claims.