The present disclosure relates to the field of data processing technologies and in particular to a method and system for interactively searching for a target object and a storage medium.
Along with the rapid development of security protection technologies, security protection systems are deployed in many key regions. The cameras in a security protection system may perform all-weather monitoring of security protection regions by collecting and recording videos. Furthermore, the existing security protection systems also allow users to delimit a region in a video picture as a forbidden region at a webpage end (e.g. web end) and thus carry out key monitoring of the forbidden region.
The present disclosure provides a method and system for interactively searching for a target object and a storage medium, so as to overcome the shortcomings in the related art.
According to one aspect, there is provided a method of interactively searching for a target object, which is applied to a server side and includes:
In some embodiments, the first interactive instruction includes: coordinate data of a recognition region, user data obtained by code scanning, or both.
In some embodiments, determining the target object corresponding to the first interactive instruction includes:
In some embodiments, determining the target object corresponding to the first interactive instruction includes:
In some embodiments, obtaining the target information of the target object includes:
In some embodiments, the method further includes:
In some embodiments, obtaining the target information of the target object includes:
In some embodiments, the scenario information includes: an image collected by the second terminal, a scenario element, or both; where the scenario element includes: a second terminal device serial number, a peripheral device serial number of a peripheral device around the second terminal, or both.
In some embodiments, when the scenario information includes picture information, obtaining the target information of the target object based on the scenario information includes:
In some embodiments, the scenario information includes picture information and obtaining the target information of the target object based on the scenario information includes:
In some embodiments, the target information includes the navigation route, and obtaining the target information of the target object based on the scenario information further includes:
In some embodiments, the target information further includes a navigation identifier, and obtaining the target information of the target object based on the scenario information further includes:
In some embodiments, the target information includes the current position of the target object, and obtaining the target information of the target object includes:
In some embodiments, obtaining the position data of at least three reference signal sources close to the first terminal includes:
In some embodiments, based on the position data of at least three reference signal sources, determining the position data of the first terminal includes:
In some embodiments, calculating the position data of the first terminal is based on a predetermined weighted centroid algorithm.
In some embodiments, the formula of a weighted centroid algorithm is as follows:
In some embodiments, the scenario information includes picture information and obtaining the target information of the target object based on the scenario information includes:
In some embodiments, the scenario information includes picture information and obtaining the target information of the target object based on the scenario information includes:
In some embodiments, based on the picture information, obtaining the target sub-map data of the place where the second terminal is located includes:
In some embodiments, the target sub-map data includes: map data, map recognition code, or both.
In some embodiments, matching the feature vector of the picture information with the feature vectors in the predetermined visual sub-map database to obtain the feature vector with maximum similarity includes:
In some embodiments, obtaining the target information of the target object includes:
In some embodiments, obtaining the second coordinate data of the recognition region includes:
In some embodiments, the scenario information includes picture information and based on the position data of the second terminals distributed at different positions and the second mapping relationship, obtaining the scenario information of the scenario of the second terminal includes:
In some embodiments, obtaining the candidate feature vectors matching the feature vector of the current image in the object database to obtain the target feature vector includes:
In some embodiments, the object database includes a first database and updating the object database includes:
In some embodiments, the object database includes a second database and updating the object database further includes:
In some embodiments, sending the target information to the first terminal includes:
In some embodiments, the method further includes:
In some embodiments, the method further includes:
In some embodiments, stopping obtaining the target information of the target object includes:
According to another aspect, there is provided a method of interactively searching for a target object, which is applied to a first terminal side. The method includes:
In some embodiments, the first interactive instruction includes: coordinate data of a recognition region, user data obtained by code scanning, or both.
In some embodiments, the target information includes at least one of: a current position of the target object, picture information of the current position, a historical trajectory of the target object, a navigation route for searching for the target object, or any combination thereof.
In some embodiments, the first terminal further includes an offline predetermined visual sub-map database, and when the target information includes a sub-map recognition code of a target sub-map, displaying the target information of the target object includes:
In some embodiments, the target information includes the navigation route and displaying the target information of the target object includes:
In some embodiments, the target information includes a navigation identifier and displaying the target information of the target object includes:
In some embodiments, the method further includes:
According to another aspect, there is provided a system for interactively searching for a target object, including a server, a first terminal and a second terminal; where,
In some embodiments, the second terminal includes at least one of: a mobile terminal, a vertical screen, a spliced screen, a camera, or any combination thereof.
In some embodiments, the server includes a feature vector obtaining module, a feature vector search engine and an object database;
According to another aspect, there is provided a security protection system, which includes a first terminal, a second terminal and a server. The server includes:
According to another aspect, there is provided a first terminal, including:
According to another aspect, there is provided a computer readable storage medium, wherein executable computer programs in the storage medium are executed by a processor to perform the above methods.
The technical solutions of the present disclosure have the following beneficial effects.
In the solutions provided by the present disclosure, after the first interactive instruction is obtained from the first terminal, the target object corresponding to the first interactive instruction is determined; then, the target information of the target object is obtained, where the target information includes at least one of: a current position of the target object, picture information of the current position, a historical trajectory of the target object, a navigation route, or any combination thereof; finally, the target information is sent to the first terminal, such that the first terminal displays the target information. Thus, the target information of the target object can be obtained and displayed to help the user search for the target object, thereby increasing the searching efficiency and improving the use experience.
It should be understood that the above general descriptions and subsequent detailed descriptions are merely illustrative and explanatory rather than limiting of the present disclosure.
In order to more clearly describe the technical solutions of the embodiments of the present disclosure, the drawings required for describing the embodiments will be briefly introduced. Apparently, the drawings described hereunder are only some embodiments of the present disclosure. Those skilled in the art can obtain other drawings based on these drawings without creative effort.
The technical solutions of the embodiments of the present disclosure will be fully and clearly described below in combination with the drawings in the embodiments of the present disclosure. Apparently, the embodiments described herein are only some embodiments of the present disclosure rather than all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments in the present disclosure without creative effort shall fall within the scope of protection of the present disclosure.
In order to solve the above technical problems, the present disclosure provides a method of interactively searching for a target object, which may be applied to a system for interactively searching for a target object. In some embodiments, the system for interactively searching for a target object may include a first terminal, a second terminal and a server. The first terminal may be a device for displaying a picture for a user to search for an object and for inputting an interactive instruction, for example, a smart phone, a tablet computer, a personal computer, a vertical screen, a spliced screen, a joint terminal or the like on which the user performs a trigger operation. The second terminal may include a plurality of devices having an image collection function, for example, a camera, a vertical screen or the like, which can upload multimedia files, such as a collected or stored image or video stream, to the server. The server may be implemented by at least one server or a server cluster, and can, upon receiving an interactive instruction from the first terminal, process the images uploaded by the second terminal so as to obtain target information of the target object and send it to the first terminal for displaying. In subsequent embodiments, the solutions will be described with the server as the entity performing the method of interactively searching for a target object, which does not constitute any limitation.
At step 11, a first interactive instruction is obtained from the first terminal.
In some embodiments, when displaying a picture, the first terminal may detect whether a user triggers a touch screen of the first terminal. When an interactive instruction is to be input, the user may perform a trigger operation on an interactive interface of the first terminal to input the interactive instruction. The above interactive instruction may include but is not limited to coordinate data of a recognition region, an image of a recognition object, a stored image, user data obtained by code scanning, user feature data obtained by recognizing a collected user image, and the like, which can be selected based on specific scenarios and is not limited herein. For ease of description, in the present disclosure, an instruction for establishing a task is referred to as the first interactive instruction and an instruction for cancelling the previous instruction for establishing the task is referred to as a second interactive instruction.
With a person searching scenario as an example, the first interactive instruction may be coordinate data of a recognition region. For example, in a case of a person-searching requirement, the user may click on several positions on the interactive interface to form a polygon, i.e. a recognition box. The region in the recognition box is the recognition region. The recognition region includes a to-be-recognized object. At this time, the first terminal may obtain coordinate data of the recognition region (representing the coordinate data of the recognition region by using the coordinate data of vertices of the recognition region) and send the coordinate data as the first interactive instruction to the server.
With another person-searching scenario as an example, the first interactive instruction may be a position image of a recognition object. For example, in a case of a person-searching requirement, the user may select an upload button in the interactive interface and select one picture containing a to-be-searched object. The first terminal may recognize each object in the image based on a predetermined recognition algorithm and generate a minimum bounding box of each object, and then based on the minimum bounding box, crop the above image to obtain a position image corresponding to each object. The first terminal may upload the position image of the recognition object as the first interactive instruction to the server. Thus, the server can obtain the above first interactive instruction.
With a vehicle-searching scenario in a parking lot as an example, the first interactive instruction may be user data obtained by code scanning. For example, in a case of a vehicle-searching requirement in the parking lot, the first terminal may display a code scanning interface or a two-dimensional code in the interactive interface in response to a trigger operation of the user. For example, if a code-scanning interface is displayed, the first terminal may scan the two-dimensional code displayed on the vertical screen of the second terminal to establish communication with the vertical screen, upload user data, and establish a mapping relationship, where the user data may include at least one piece of information entered by the user during registration, for example, a name, a user name, a vehicle plate number and a first terminal recognition code and the like. The first terminal may send the user data as the first interactive instruction to the server through the vertical screen. Thus, the server can obtain the above first interactive instruction.
With another vehicle-searching scenario in a parking lot as an example, the first interactive instruction may be user feature data obtained by recognizing a user image collected by a camera with the user's knowledge. For example, in a case of a vehicle-searching requirement in the parking lot, the first terminal may switch from the interactive interface to a shooting interface in response to a trigger operation of the user, shoot an image when a face appears in the shooting interface, and recognize the image based on a predetermined image recognition model to obtain user feature data. The first terminal may be a terminal device held by the user or a device such as a vertical screen in the parking lot, which may be set based on specific scenarios. Finally, the vertical screen in the parking lot may send the user feature data as the first interactive instruction to the server. Thus, the server can obtain the above first interactive instruction.
It should be noted that in some embodiments, the first terminal may directly send the coordinate data of the recognition region as the first interactive instruction to the server, or send the initial coordinate data as the first interactive instruction to the server, where the value of the coordinate data of each vertex in the initial coordinate data is within a predetermined range which is [0,1]. Thus, as shown in
At step 21, the first terminal determines a recognition region based on a target operation or a position of a target object in a current picture (referred to as the second picture below). In practical applications, if a recognition region needs to be set, the user usually clicks on a plurality of points continuously in the second picture on a display (the current picture, which may be presented by a browser), where these points are taken as vertices enclosing a recognition region.
In other words, the recognition region usually has a plurality of vertices which are sequentially connected to form one polygon. A region inside the polygon forms a recognition region, for example, a triangular region, a quadrilateral region or the like. In some embodiments, the first coordinate data of each vertex is within a predetermined range which is [0, 1]; or, a reference picture is set, where the upper left corner of the reference picture is set as the origin (0, 0) and the lower right corner of the reference picture is set to (1, 1); and thus, the initial position data of the recognition region is expressed in the coordinate system of the above reference picture.
In some embodiments, functional components are usually disposed in the interactive interface. When detecting an operation for selecting a function component in the second picture, the first terminal may display an interactive interface corresponding to the above functional component. The interactive interface may include a brush component and a store component. As shown in
When detecting an operation for selecting the brush component, the first terminal may obtain a trigger position of the brush component in the second picture and take the trigger position as one vertex of a recognition region. At this time, third coordinate data of the trigger position can be obtained.
With obtaining one vertex P of the recognition region as an example, as shown in
abscissa x=ev.clientX+document.body.scrollLeft-document.body.clientLeft-document.querySelector('.frame').getBoundingClientRect().left;
x=distance of the vertex P from the left side of the window (i.e. ev.clientX)+width of the page hidden at the left side when the page is scrolled to the right by the scroll bar (i.e. document.body.scrollLeft)-border width at the left side of the region in which contents are seen through the browser (i.e. document.body.clientLeft)-distance of the second picture from the left side of the window (i.e. document.querySelector('.frame').getBoundingClientRect().left).
ordinate y=ev.clientY+document.body.scrollTop-document.body.clientTop-document.querySelector('.frame').getBoundingClientRect().top;
y=distance of the vertex P from the top of the window (i.e. ev.clientY)+height of the page hidden at the top when the page is scrolled down by the scroll bar (i.e. document.body.scrollTop)-border width at the top of the region in which contents are seen through the browser (i.e. document.body.clientTop)-distance of the second picture from the top of the window (i.e. document.querySelector('.frame').getBoundingClientRect().top).
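The above computation can be sketched as a click handler, assuming the second picture is rendered in an element with the class name frame (as in the expressions above) and that the handler and the set array pointlist are set up as follows (this binding is illustrative and not mandated by the present disclosure):
// Illustrative sketch: obtain the third coordinate data of a clicked vertex P
// relative to the upper left corner of the second picture.
const pointlist = [];
const frame = document.querySelector('.frame');
frame.addEventListener('click', function (ev) {
  const rect = frame.getBoundingClientRect();
  const x = ev.clientX + document.body.scrollLeft - document.body.clientLeft - rect.left;
  const y = ev.clientY + document.body.scrollTop - document.body.clientTop - rect.top;
  pointlist.push(x, y);   // third coordinate data of the new vertex
});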
At step 22, the first terminal obtains the third coordinate data of each vertex in the recognition region, and stores the third coordinate data of each vertex into a set array pointlist at a specified position. In this step, after obtaining the third coordinate data of the trigger position, the first terminal may store the third coordinate data into the set array pointlist. The user may perform repeated operations (several click operations) by using the brush component in the second picture, and the first terminal may detect a plurality of trigger positions and the third coordinate data of each trigger position. In practical applications, when three or more trigger positions are detected, the first terminal may connect these trigger positions sequentially to form one closed candidate region and display it in the second picture for viewing by the user.
At step 23, the first terminal adjusts the third coordinate data of each vertex based on a size of the second picture to enable the adjusted first coordinate data to be within the predetermined range. With continuous reference to
At step 24, the first terminal records the first coordinate data of each vertex in a predetermined sequence to obtain the coordinate data of the recognition region, i.e. the initial position data, and stores the initial position data into the set array areapoints. The above predetermined sequence may be clockwise or counterclockwise. For example, the first terminal may, with the center of the recognition region as a reference, record the first coordinate data of each vertex of the recognition region in a clockwise sequence, where the first coordinate data of all vertices of the recognition region forms the initial position data. Furthermore, the first terminal may store the above initial position data into the set array areapoints, and the initial position data in the set array areapoints may be uploaded as the first interactive instruction to the server.
It should be noted that the set array pointlist is a variable name with which the first terminal stores the third coordinate data and the set array areapoints is a variable name with which the first terminal stores the first coordinate data. In other words, the set array pointlist and the set array areapoints are used to respectively store the coordinate data before and after processing. In some embodiments, the above set array pointlist and the set array areapoints may be implemented by using a first-in-first-out queue such that the vertices of the recognition region can be easily stored in a predetermined sequence and easily read in a predetermined sequence, so as to achieve the effect of increasing the access efficiency.
In some embodiments, when detecting an operation for selecting the store component, the first terminal may display predetermined prompt information for prompting completion of configuration of the recognition region in the second picture. In some embodiments, the first terminal may display the predetermined prompt information of “region configuration completion” at the upper left corner of the second picture, and remind the user by a 3-second fade-out animation effect such that the user can determine that the matching process for the recognition region is completed, thereby improving the use experiences.
In some embodiments, the data structure of the initial position data includes:
In combination with the coordinate structure of the set array areapoints in the above data structure, the third coordinate data is 52,133.25,581,81.25,754,298.25,42,324.25, namely,
For example, the width and the height of the second picture are 800 and 450 respectively, which are measured in the unit of pixels. In this way, the converted first coordinate data is as follows:
0.065,0.2961111111111111,0.72625,0.18055555555555555,0.9425,0.6627777777777778,0.0525,0.7205555555555555
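The conversion shown above can be sketched as follows, assuming the third coordinate data has been collected into the set array pointlist in the flattened form used throughout this example (a sketch only; names other than pointlist and areapoints are illustrative):
// Illustrative sketch: convert third coordinate data (in pixels) into first
// coordinate data within the predetermined range [0, 1].
const pointlist = [52, 133.25, 581, 81.25, 754, 298.25, 42, 324.25];
const width = 800, height = 450;   // size of the second picture in pixels
const areapoints = [];
for (let i = 0; i < pointlist.length; i += 2) {
  areapoints.push(pointlist[i] / width);        // abscissa within [0, 1]
  areapoints.push(pointlist[i + 1] / height);   // ordinate within [0, 1]
}
// areapoints now holds the first coordinate data listed above.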
In some embodiments, with continuous reference to
In some embodiments, in combination with the data structure of the initial position data, the first terminal may also carry out determination on the recognition function, which, as shown in
At step 51, when detecting an operation for selecting the store component, the first terminal detects whether a recognition function type of the recognition region is already configured, where the recognition function type includes a full-graph configuration and a region configuration. The full-graph configuration refers to that the recognition region occupies the whole second picture and the region configuration refers to that the recognition region occupies a part of the second picture.
At step 52, when detecting that the function recognition type is not configured, the first terminal displays a type interactive interface containing a full-graph configuration component and/or region configuration component.
At step 53, when detecting an operation for selecting the full-graph configuration component or the region configuration component, the first terminal stores the recognition function type. Thus, the setting of the recognition function type can guarantee the integrity of the data structure and help displaying the recognition region subsequently.
When a second terminal needs to display the above recognition region and an object, the second terminal may read the above initial position data from the server and obtain a size of its current picture, i.e. a first picture. It is considered that when the display contents of one terminal are transferred to another terminal for displaying, the display contents may be affected by the sizes of the two terminals or picture sizes. Thus, the second picture of the first terminal may be changed when transferred to the first picture of the second terminal.
Thus, the second terminal may obtain a first size of the first picture and a second size of the second picture, obtain a ratio of the first size to the second size, and then obtain the display size in the first picture by multiplying the corresponding size in the second picture by the above ratio. In other words, when the size of the second terminal changes relative to the size of the first terminal where the user sets the recognition region, the first picture and the second picture both change pro rata and further the recognition regions also change pro rata, thereby avoiding the problem of failure to display the recognition region or deformation of the recognition region, and hence improving the use experience. Therefore, the recognition region can be set on any display device, and the above recognition region can be displayed on any display device of the system for interactively searching for a target object, eliminating the need to configure the above recognition region on all display devices and improving the configuration efficiency.
For ease of calculation, if the size of the second picture is identical to the size of the first picture, the ratio of the size of the first picture to the size of the second picture is 1, namely, the width and the height of the first picture are 800 and 450 respectively which are measured in the unit of pixels.
It should be noted that, considering different configuration data for different terminals to display pictures, the second terminal may obtain the coordinate data, i.e. the initial position data, of the recognition region uploaded by the first terminal and based on a predetermined conversion rule, generate a corresponding shape. For example, the recognition region of the first terminal may be a rectangle whereas the recognition region of the second terminal may be a triangle, and thus the predetermined conversion rule may be that one minimum circumscribed triangle is generated based on the shape of the recognition region in the initial position data, and then taken as a recognition box of the second terminal. For another example, the recognition region of the first terminal may be a triangle whereas the recognition region of the second terminal may be a rectangle, and thus the predetermined conversion rule may be that one minimum circumscribed rectangle is generated based on the shape of the recognition region in the initial position data and then taken as a recognition box of the second terminal. In other words, after the initial position data is obtained, the initial position data may be firstly converted into target position data corresponding to the present terminal and the shape of the recognition region is obtained, and then based on the above shape, one minimum circumscribed shape is generated as the shape of the recognition region of the present terminal. It can be understood that the above predetermined conversion rule is only an example of conversion between different recognition region shapes, and those skilled in the arts can select a corresponding conversion rule based on specific scenarios, for example, select a minimum circumscribed circle or a minimum inscribed circle or the like, and hence, the corresponding solutions shall fall within the scope of protection of the present disclosure. It can be understood that in some embodiments, the recognition boxes of different terminals are allowed to have different shapes to meet the use requirements of different users, helping improve the use experiences. Alternatively, when a target object appears in different terminals, the recognition boxes of different terminals may mark the same target object to ensure the target object will not be lost, achieving the effect of tracking the object.
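As one possible instance of the predetermined conversion rule described above, a minimum circumscribed rectangle of the recognition region can be derived directly from its vertices, sketched as follows (the rectangle is only one of the conversion options; the function name and coordinate layout are illustrative):
// Illustrative sketch: generate a minimum circumscribed rectangle for a
// recognition region whose vertices are given as [x0, y0, x1, y1, ...].
function circumscribedRectangle(areapoints) {
  const xs = areapoints.filter((v, i) => i % 2 === 0);
  const ys = areapoints.filter((v, i) => i % 2 === 1);
  const left = Math.min(...xs), right = Math.max(...xs);
  const top = Math.min(...ys), bottom = Math.max(...ys);
  // recognition box of the present terminal, recorded clockwise from the upper left vertex
  return [left, top, right, top, right, bottom, left, bottom];
}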
At step 12, a target object corresponding to the first interactive instruction is determined.
In some embodiments, after obtaining the first interactive instruction, the server may determine the target object corresponding to the first interactive instruction. The target object may include at least one of: a person, a vehicle, a recognition box of a recognition region, or any combination thereof. When the target object is a person, the target object may be represented by a feature vector of a position image containing the person; when the target object is a vehicle, the target object may be represented by a vehicle plate number of the vehicle or by a feature vector of a position image of a user searching for the vehicle; when the target object is a recognition box of a recognition region, the recognition box may be the above rectangular, triangular, circular or other recognition box.
In some embodiments, for example, with the target object as a person, when the first interactive instruction is the coordinate data of the recognition region, as shown in
In some embodiments, for example, with the target object being a person, when the first interactive instruction is the coordinate data of the recognition region, as shown in
In some embodiments, for example, with the target object as recognition box, when the first interactive instruction is the coordinate data of the recognition region, the server generates a recognition box based on the coordinate data of the recognition region and represents the target object by using the recognition box.
Considering that the coordinate data of the recognition region in the first interactive instruction may be stored in the same manner as the above initial position data, the server may, at this time, obtain the coordinate data of each vertex in the initial position data, for example,
0.065,0.2961111111111111,0.72625,0.18055555555555555,0.9425,0.6627777777777778,0.0525,0.7205555555555555,
Then, the server may obtain a size of the first picture, where the first picture may be a picture displayed by the first terminal or a picture displayed by the second terminal. Furthermore, the server may obtain target coordinate data of each vertex in the first picture by multiplying the abscissa of each vertex by the width of the first picture and multiplying the ordinate of each vertex by the height of the first picture.
For example, with the first picture and the second picture being the same in size, it is assumed that the width and the height of the second picture are 800 and 450 respectively, which are measured in the unit of pixels. Thus, after obtaining the second coordinate data of each vertex in the first picture, the server may obtain the target coordinate data; namely, the second coordinate data of each vertex of the recognition region in the first picture is the same as the third coordinate data in the second picture, and thus the second coordinate data in the target coordinate data is obtained as follows: 52,133.25,581,81.25,754,298.25,42,324.25;
It is assumed that the width and the height of the second picture are 800 and 450 respectively, which are measured in the unit of pixels. After the second coordinate data of each vertex in the first picture is obtained, the target coordinate data can be obtained.
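This restoration can be sketched as follows, assuming the initial position data and the size of the first picture are available (the function name is illustrative):
// Illustrative sketch: restore target coordinate data in the first picture
// from initial position data within the predetermined range [0, 1].
function toTargetCoordinates(areapoints, pictureWidth, pictureHeight) {
  const target = [];
  for (let i = 0; i < areapoints.length; i += 2) {
    target.push(areapoints[i] * pictureWidth);        // abscissa in pixels
    target.push(areapoints[i + 1] * pictureHeight);   // ordinate in pixels
  }
  return target;
}
// e.g. toTargetCoordinates([0.065, 0.2961111111111111, ...], 800, 450) restores
// the coordinate data 52, 133.25, ... listed above.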
In some embodiments, after the recognition region is displayed in the first picture, a background color of the recognition region may be further adjusted to a target color, for example, green or red or the like. With the recognition region as a forbidden region, the background color may be adjusted to red; with the recognition region as an open region, the background color may be adjusted to green. By adjusting the background color of the recognition region, the recognizability of the recognition region can be increased.
In some embodiments, when the first interactive instruction is user data obtained by code scanning, the server may determine a target object based on the user data. For example, based on a vehicle plate number in the user data and the parking position corresponding to that vehicle plate number in the historical data, a parking position of the vehicle can be determined, and a recognition box is generated at the parking position of the vehicle. Alternatively, the server may obtain a current position of the first terminal, generate a recognition box corresponding to the current position on a map, and represent the target object by using the recognition box.
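One possible form of this lookup is sketched below; the field names of the user data and the historical parking records, as well as the square recognition box, are assumptions made for illustration:
// Illustrative sketch: determine a parking position from the vehicle plate
// number in the user data and generate a recognition box around it.
function locateVehicle(userData, parkingRecords, halfSize) {
  // parkingRecords: [{ plateNumber, position: { x, y } }] taken from the historical data
  const record = parkingRecords.find(r => r.plateNumber === userData.vehiclePlateNumber);
  if (!record) return null;   // no parking record for this vehicle plate number
  const { x, y } = record.position;
  // recognition box centred on the parking position, vertices recorded clockwise
  return [x - halfSize, y - halfSize, x + halfSize, y - halfSize,
          x + halfSize, y + halfSize, x - halfSize, y + halfSize];
}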
It is noted that in a person-searching scenario, the user usually searches for a target object based on a picture of the first terminal; at this time, the target object may appear within a coverage scope of the second terminal, and thus a picture collected by the second terminal may be synchronized to the first terminal for displaying. For the target object and the user (monitoring person), the user may be considered to be in stationary state while the target object may be considered to be in a movement state, and the movement direction of the target object to the user is random. In a vehicle-searching scenario, in the present disclosure, the vehicle may be considered as a user in a person-searching scenario and the driver may be considered as a target object in the person-searching scenario; the movement direction of the target object is moving toward the vehicle. Based on the essence of the above idea, the movement of the driver and the stationariness of the vehicle are relative to each other and thus, the movement of the driver may be converted into relative movement of the vehicle, that is, the driver itself moves but is considered as stationary whereas the vehicle itself is stationary but is considered as moving. In other words, in the present disclosure, the person-searching scenario and the vehicle-searching scenario belong to a same idea, and hence, the target object (to-be-searched person or driver) can be searched for based on the person-searching idea.
At step 13, target information of the target object is obtained, where the target information includes at least one of: a feature vector of the position image, a current position of the target object, picture information of the current position, a historical trajectory of the target object, a navigation route, or any combination thereof.
In some embodiments, the server may obtain the target information of the target object.
In some embodiments, the target information may be a feature vector of the position image. As shown in
It should be noted that in a person-searching scenario, the above first terminal is used to display a picture of the second terminal containing the target object, and thus, the first terminal actually is the second terminal containing the above target object. In a vehicle-searching scenario, the above first terminal is also used to display a picture of the second terminal (e.g. a vertical screen or a camera) containing the target object and the position of the first terminal is represented by using the position of the second terminal, and thus the first terminal actually is the second terminal containing the above target object.
In other words, the above first mapping relationship may be understood as a second mapping relationship between first terminal and second terminal. Thus, the server may store, in advance, the second mapping relationship between first terminal and second terminal, where the second mapping relationship may include a first terminal device serial number, a second terminal device serial number, a physical position of each terminal, and a serial number of a terminal around each terminal and the like. Those skilled in the art may set the contents of the second mapping relationship based on specific scenarios and the corresponding solutions fall within the scope of protection of the present disclosure. By obtaining the peripheral device serial number of a peripheral device around at least one second terminal around the first terminal, a source for obtaining other data subsequently can be determined, helping increase the data processing efficiency.
In some embodiments, as shown in
In some embodiments, the scenario information includes picture information, and based on the position data of the second terminals distributed at different positions and the second mapping relationship, obtaining, by the server, the scenario information of the scenario of the second terminal includes: obtaining, by the server, a feature vector of the current image corresponding to the first interactive instruction; then, obtaining, by the server, candidate feature vectors matching the feature vector of the current image in the object database to obtain a target feature vector, where the object database includes candidate feature vectors corresponding to the picture information uploaded by the second terminal in the second mapping relationship with the first terminal; then, obtaining, by the server, picture information corresponding to the target feature vector and taking the picture information as the scenario information. In this case, it is only required to determine the target feature vector in the candidate feature vectors in the second mapping relationship with the first terminal, thus improving the efficiency of obtaining the scenario information.
In some embodiments, in a process of obtaining the target feature vector, the server may update the object database, where the object database includes several candidate feature vectors. For example, when the object database includes a first database, the server may obtain a current image where the target object is located, where the current image is a first image of a current stream address corresponding to the first interactive instruction; recognize a second number of objects in the first image to obtain a second number of position images; respectively extract a feature of each position image to obtain a second number of feature vectors; update the second number of feature vectors to the first database and obtain feature vector identifiers (ID). For another example, when the object database includes a second database, the server may store the second number of position images into the second database and obtain an accessible Uniform Resource Locator (URL) address of each position image returned by the second database and a feature vector ID generated when the feature vector of each position image is updated to the first database; based on the feature vector ID and the URL address, generate a second number of pieces of data and store the second number of pieces of data into the second database. Therefore, the server can obtain a position image in the second database once obtaining a feature vector ID to achieve one-to-one matching effect of feature vector and position image. Then, the server may match the feature vector of the target object with the candidate feature vectors in the object database to obtain a first number of candidate feature vectors. Then, the server may obtain a candidate feature vector with maximum similarity as the target feature vector from the first number of candidate feature vectors.
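The matching of the feature vector of the target object against the candidate feature vectors can be sketched as follows, assuming the candidates are held in memory and cosine similarity is used as the similarity measure (the similarity measure and field names are assumptions; the present disclosure does not mandate a particular one):
// Illustrative sketch: obtain the first number of most similar candidates and
// return the candidate feature vector with maximum similarity as the target.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
function matchTargetFeature(queryVector, candidates, firstNumber) {
  // candidates: [{ id, vector, url }], where url is the accessible URL address
  // of the position image stored in the second database.
  const scored = candidates
    .map(c => ({ id: c.id, url: c.url, similarity: cosineSimilarity(queryVector, c.vector) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, firstNumber);   // first number of candidate feature vectors
  return scored[0];           // candidate feature vector with maximum similarity
}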
In some embodiments, for example, with the target information as the historical trajectory of the target object, as shown in
It can be understood that at step 102, when sorting the times at which the target object appears in each piece of scenario information, the server obtains the piece of scenario information closest to the current time; the server then obtains the position of the second terminal corresponding to this scenario information, takes the position of the second terminal as the current position of the target object, and hence takes the current position of the target object as the target information of the target object.
In some embodiments, for example, with the target information as picture information of the target object, as shown in
At step 111, the server obtains at least one object of the picture information in the scenario information and obtains a position image corresponding to each object. At step 112, the server determines a target position image matching the current image. The second terminal may upload the collected image or video frame to the server. The server may adjust the image or video frame uploaded by the second terminal, obtain at least one object of each image or video frame and obtain the position image corresponding to each object. Then, the server may extract a feature vector of each position image and store it into the first database of the object database as candidate feature vector. The server may obtain the feature vector of the current image and match the feature vector of the current image with the candidate feature vectors in the first database of the object database, for example, calculate a similarity of two feature vectors; then, sort the similarities and take a candidate feature vector corresponding to maximum similarity as the target feature vector and take the position image corresponding to the target feature vector as the target position image. At step 113, the server, based on a correspondence between scenario information and second terminal, determines position data of the second terminal corresponding to the target position image and takes the position data of the second terminal as the target information of the target object, so as to obtain the real-time position of the target object.
In some embodiments, for example, with the target information as the navigation route, as shown in
Thus, the server may, based on the position data of the at least three reference signal sources, determine the position data of the first terminal itself. It should be noted that the reference signal sources may be signal sources individually disposed for positioning (e.g. Bluetooth signal transmitters), or the position of the first terminal may be determined by using second terminals with known position information, which is based on a similar positioning principle and will not be repeated herein. It is further noted that when the navigation route is obtained, the position information of the first terminal is the actual position of the first terminal itself; when the first terminal displays a picture coming from the second terminal, it is required to switch to a picture of a particular second terminal, and the position information of the first terminal is an equivalent position, which is the position of the second terminal corresponding to the currently-displayed picture. The specific meaning of the position information of the first terminal may be analyzed based on specific scenarios.
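The weighted centroid algorithm mentioned in the foregoing embodiments for determining the position data of the first terminal can be sketched as follows, with the weight of each reference signal source taken as the reciprocal of the measured distance to it (the weighting scheme and field names are assumptions; the present disclosure only requires a predetermined weighted centroid algorithm):
// Illustrative sketch: estimate the position of the first terminal from at
// least three reference signal sources with known position data.
function weightedCentroid(sources) {
  // sources: [{ x, y, distance }], distance estimated from the received signal strength
  let sumW = 0, sumX = 0, sumY = 0;
  for (const s of sources) {
    const w = 1 / s.distance;   // closer reference signal sources are weighted more heavily
    sumW += w;
    sumX += w * s.x;
    sumY += w * s.y;
  }
  return { x: sumX / sumW, y: sumY / sumW };   // position data of the first terminal
}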
At step 122, the server, based on the position data of the first terminal and the position data of the second terminal corresponding to the target position image, determines a navigation route between the first terminal and the second terminal, and takes the navigation route as the target information of the target object. Thus, by obtaining the navigation route, the user can accurately find the target object while holding the first terminal, improving the searching efficiency.
In some embodiments, the target information may further include a navigation identifier used to assist the user in determining a movement route and a movement direction. The server may, based on an orientation of the first terminal and the navigation route, generate a navigation identifier and take the navigation identifier as the target information of the target object. As shown in
In some embodiments, when the scenario information includes picture information, the step 93 of obtaining, by the server, the target information of the target object based on the scenario information includes: obtaining, by the server, the picture information of the second terminal closest to the first terminal and taking the picture information as the target information of the target object. For example, during a movement process, the first terminal may establish communication with the second terminal through Bluetooth or wireless network communication technology (WiFi) and take the closest second terminal as the current position of the first terminal and take the picture information uploaded by the second terminal as the target information of the target object, namely, take the picture information of the second terminal as the target information of the target object. Thus, the first terminal can persistently display the picture information of the closest second terminal, helping the user to find the target object in time.
In some embodiments, the server may send the picture information of two or more second terminals which are close to each other to the first terminal such that the first terminal can display multiple pieces of picture information by split screen. When displaying multiple pieces of picture information, the first terminal may detect an operation that the user clicks on a picture and display this picture separately. Of course, the first terminal may also detect a user return operation to restore the displaying of the multiple pieces of picture information. By displaying a particular picture or multiple pictures, the user can conveniently view each of the pictures so as to quickly find the target object.
In some embodiments, the server may send recognition code information (e.g. a MAC address, and a serial number or name of the second terminal) of the second terminal close to the first terminal to the first terminal, such that the first terminal can establish a Bluetooth connection, a WiFi connection or another wireless connection with the second terminal based on the above recognition code information, and receive picture information of the second terminal through the wireless connection, so as to reduce the data transmission pressure of the server.
In some embodiments, the target information may further include target sub-map data which may include: map data, map recognition code, or both. The map data refers to a map image that can be directly used, and the map recognition code refers to a recognition code of a map image. When no map data is stored in the first terminal, the target sub-map data may be the map data such that less storage resources in the first terminal can be occupied. When the map data is downloaded in advance in the first terminal, the target sub-map data may be a map recognition code and thus, the first terminal can obtain local map data when obtaining the map recognition code and further obtain a target sub-map, such that the data transmission amount between the first terminal and the server can be reduced, thereby improving the communication efficiency. The server can obtain the picture information of the second terminal closest to the first terminal, and then based on the picture information, obtain target sub-map data of a place where the second terminal is located, and take the target sub-map data as the target information of the target object. In some embodiments, the server may obtain a feature vector of the picture information. For example, the server may, based on a predetermined recognition algorithm, recognize the picture information of the second terminal to obtain a feature vector of the picture information or, determine multiple feature points (e.g. texture, corner and identifier information and the like) in the above picture information and based on the feature point data, generate a feature vector. Then, the server may match the feature vector of the picture information with the feature vectors in a predetermined visual sub-map database, for example, calculate a similarity between two feature vectors, so as to obtain a feature vector with maximum similarity. Then, the server may obtain the sub-map data corresponding to the feature vector with maximum similarity and thus obtain the target sub-map data. Therefore, by providing the target sub-map data, the accurate target information can be provided such that the user can conveniently determine the position and the search route of the target object, reducing the difficulty of searching for the target object and further improving the searching efficiency.
In practical applications, much sub-map data is stored in the visual sub-map database, which may increase the search time. For this reason, in some embodiments, a classification option may be added to the sub-maps in the visual sub-map database, where the classification option includes but is not limited to: roof, ground, column, corner, parking space serial number, and identifier shape (e.g. triangle, rhombus, square, trapezoid and the like) and the like. Based on the above classification option, the sub-maps in the visual sub-map database can be divided into multiple classes. As shown in
In some embodiments, the target information may be the coordinate data of the recognition region. Obtaining, by the server, the target information of the target object may include: determining, by the server, a recognition region corresponding to the first interactive instruction; then, obtaining, by the server, the coordinate data of the recognition region, and taking the coordinate data of the recognition region as the target information of the target object. It should be understood that the above recognition region may be a region formed when the user delimits a recognition box in a picture or a region of the target object recognized in the image uploaded by the second terminal, and hence, it is defaulted that the target object is within the recognition region. Based on the coordinate data of the recognition region, the recognition region or recognition box is generated to indicate the target object, helping the user to find the target object in time.
It should be noted that, considering that it is possible that the recognition box is displayed in the picture of the first terminal or the second terminal based on the coordinate data of the recognition region subsequently, the server may process the coordinate data of the above recognition region as initial position data. For example, the server may obtain the third coordinate data of each vertex of the recognition region; then, based on the size of the picture in the first terminal, adjust the third coordinate data of each vertex to obtain the first coordinate data, where the first coordinate data is within a predetermined range; and afterwards, based on a predetermined sequence, record the first coordinate data of each vertex to obtain the position data of the recognition region. It can be understood that the specific contents of the initial position data can be referred to the contents shown in
At step 14, the target information is sent to the first terminal such that the first terminal displays the target information.
In some embodiments, the server may send the target information directly to the first terminal, such that the first terminal displays the above target information. Alternatively, the server may, based on the target information, process a multimedia file of the second terminal, for example, superimpose at least one of the recognition box, the historical trajectory, the navigation route, the current position, or the position image in the target information onto the video frame of the second terminal, so as to obtain a target multimedia file containing the target information. The server may push the target multimedia file to the first terminal. The display effect of the above target multimedia file played by the first terminal is as shown in
In some embodiments, the target information includes a recognition box. Namely, regardless of which second terminal's picture the target object appears in during a movement process, the target object can be marked in the picture of the first terminal, such that the user can instantaneously find the target object in the picture without losing the tracked object, thus improving the use efficiency.
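The superimposition of the target information onto a video frame described above can be sketched on the display side as follows, assuming the frame is available as a canvas-drawable image and the recognition box and navigation route are given in pixel coordinates of the picture (the drawing colors and data layout are assumptions):
// Illustrative sketch: superimpose a recognition box and a navigation route
// onto one video frame to obtain a frame of the target multimedia file.
function superimposeTargetInfo(canvas, frameImage, recognitionBox, navigationRoute) {
  const ctx = canvas.getContext('2d');
  ctx.drawImage(frameImage, 0, 0, canvas.width, canvas.height);
  // recognition box: [x0, y0, x1, y1, ...] vertices recorded in order
  ctx.strokeStyle = 'red';
  ctx.beginPath();
  ctx.moveTo(recognitionBox[0], recognitionBox[1]);
  for (let i = 2; i < recognitionBox.length; i += 2) {
    ctx.lineTo(recognitionBox[i], recognitionBox[i + 1]);
  }
  ctx.closePath();
  ctx.stroke();
  // navigation route: [{ x, y }, ...] way points towards the target object
  if (navigationRoute && navigationRoute.length > 1) {
    ctx.strokeStyle = 'green';
    ctx.beginPath();
    ctx.moveTo(navigationRoute[0].x, navigationRoute[0].y);
    navigationRoute.slice(1).forEach(p => ctx.lineTo(p.x, p.y));
    ctx.stroke();
  }
  return canvas;
}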
Since the coverage scope of the camera in the first terminal and the second terminal is limited, the target object may move out of the coverage scope of one terminal. For this reason, in some embodiments, the server may also detect whether the target object is out of the picture of the first terminal, that is, detect whether the target object leaves the coverage scope of one terminal. When detecting that the target object is out of the picture of the first terminal, the server may obtain a peripheral device serial number of a peripheral device around at least one second terminal around the first terminal and a feature vector of the target object, and then, based on the peripheral device serial number and the feature vector of the target object, re-obtain the information of the target object. Namely, based on the peripheral device serial number of the peripheral device around at least one second terminal around the first terminal and the feature vector of the target object, steps 12 to 14 are repeated until a second terminal covering the target object is found; then, the picture information of that second terminal is fed back to the first terminal for displaying and, at the same time, the target information of the target object is superimposed to generate the above target multimedia file. Thus, by switching the picture information of different second terminals to the first terminal, the effect of searching for the object across cameras is achieved, improving the searching efficiency.
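A minimal sketch of this re-acquisition over the peripheral second terminals is given below; the device list and the matching predicate are assumptions standing in for the lookup through the second mapping relationship and for steps 12 to 14:
// Illustrative sketch: find the peripheral second terminal whose picture
// currently covers the target object.
function findCoveringTerminal(peripherals, containsTarget) {
  // peripherals: [{ serialNumber, latestFrame }] around the first terminal
  // containsTarget: predicate returning the recognition box of the target
  //                 object in a frame, or null if the target object is absent
  for (const device of peripherals) {
    const box = containsTarget(device.latestFrame);
    if (box) {
      return { serialNumber: device.serialNumber, recognitionBox: box };
    }
  }
  return null;   // the target object has not yet entered any peripheral picture
}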
In some scenarios such as a person-searching scenario or a vehicle-searching scenario in an urban commanding system, one first terminal is not sufficient to satisfy the requirements of the actual scenarios, and thus a plurality of first terminals may be disposed, where one of the plurality of first terminals serves as a primary terminal and the others serve as secondary terminals. In this case, the server may send the target information to each of the first terminals, for example, firstly send the target information to the primary terminal of the first terminals and then, based on a mapping relationship between primary terminal and secondary terminals, send the target information to each of the secondary terminals. In this way, the same target information can be displayed synchronously on the plurality of first terminals, and the display requirements of a plurality of first terminals, especially first terminals in different spaces, are satisfied. In some embodiments, in a case of a plurality of first terminals, the terminals may be grouped and each group may display a part of the picture displayed by the primary terminal. For example, the primary terminal may display a combination of 16 pictures, whereas four secondary terminals may display 4 of the 16 pictures respectively. Thus, it is guaranteed that a plurality of first terminals can collectively display some of the pictures, and the primary terminal can globally display all pictures. For example, in the urban commanding system, the first terminals in a provincial capital center and the first terminals in a prefecture-level city center can synchronously display at least some same contents, so as to improve the monitoring efficiency or use experience. For another example, in a vehicle-following scenario, a vehicle ahead searches for a parking space in a parking lot while a vehicle behind can synchronize the picture in the camera of the vehicle ahead, increasing the field of view for observation.
In some embodiments, the position information of the first terminal may be selected based on scenarios, for example, it is defaulted to use the position information of the primary terminal to represent the position of all first terminals. For another example, based on an operation on the primary terminal, the position information of the primary terminal or one secondary terminal may be selected to represent the position of all first terminals, such that the primary terminal operates one secondary terminal for task assignment, and the navigation route is generated based on the position of one secondary terminal and the like, helping expand the use scenarios and improve the use experiences.
It is considered that the user may cancel searching for a target object after finding the target object. Therefore, in some embodiments, the user may input a second interactive instruction on the interactive interface of the first terminal, for example, click a button for cancelling searching for person or cancelling searching for vehicle and send the second interactive instruction to the server. When detecting the second interactive instruction from the first terminal, the server stops obtaining the target information of the target object. For example, the server may obtain a peripheral device serial number of a peripheral device around a second terminal bound to the first terminal and restore the multimedia file of the second terminal corresponding to the peripheral device serial number to an original multimedia file and clear the corresponding stored data, for example, the feature vector or target information of the target object or the like. Thus, search for the target object can be stopped and the search experiences can be improved.
In the solutions provided by the present disclosure, a first interactive instruction is obtained from the first terminal; then, a target object corresponding to the first interactive instruction is determined; then, target information of the target object is obtained, where the target information includes at least one of: a current position of the target object, picture information of the current position, a historical trajectory of the target object, a navigation route, or any combination thereof; finally, the target information is sent to the first terminal such that the first terminal displays the target information. By obtaining and displaying the target information of the target object, the user can be helped to search for the target object, which helps increase the searching efficiency and improve the user experience.
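The overall server-side flow summarised above can be sketched as follows; the three helper functions are hypothetical stand-ins for the named steps and are not the disclosed implementation.

```python
# Compact sketch of the server-side flow: instruction -> target object ->
# target information -> display on the first terminal. All helpers are stubs.
def determine_target_object(instruction: dict) -> str:
    # e.g. resolve the object from recognition-region coordinates or scanned user data
    return instruction.get("target_id", "unknown")


def obtain_target_info(target_object: str) -> dict:
    # e.g. current position, picture information, historical trajectory, navigation route
    return {"target": target_object, "position": None, "route": []}


def send_to_terminal(terminal_id: str, payload: dict) -> None:
    print(f"display on {terminal_id}: {payload}")


def handle_first_interactive_instruction(instruction: dict, first_terminal_id: str) -> None:
    target_object = determine_target_object(instruction)
    target_info = obtain_target_info(target_object)
    send_to_terminal(first_terminal_id, target_info)
```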
A system for interactively searching for a target object is described below in combination with a person-searching scenario and the above method of interactively searching for the target object. As shown in
In some embodiments, the real-time video display module A is used to acquire a streaming media video stream, and the streaming media service module is used to collect and upload a real-time video stream. The video service processing module C is deployed with multiple video stream services to acquire the real-time video stream uploaded by the streaming media service module and send a processing result to a streaming media server, i.e., the streaming media service module. The video service processing module C is further deployed with a service used to monitor the message queue module E. The AI algorithm data receiving service module D is used to provide an HTTP interface.
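As a rough illustration of the kind of HTTP interface that module D is said to provide, the following Python sketch exposes a single receiving endpoint built on the standard library; the path, port, and JSON payload format are assumptions for illustration, since the disclosure only states that an HTTP interface is exposed.

```python
# Minimal sketch of an HTTP receiving endpoint for algorithm results.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlgorithmDataHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        # In a real deployment the result would be forwarded to the message
        # queue / video processing service; here it is simply printed.
        print("received algorithm result:", body)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), AlgorithmDataHandler).serve_forever()
```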
As shown in
When the first terminal displays a picture of one second terminal, if the target object moves out of the picture, namely, leaves the coverage scope of the second terminal, the video processing service corresponding to the second terminal completes the following operations:
When there is no need to search for a target object, as shown in
In some embodiments, as shown in
A system for interactively searching for a target object is described below in combination with a vehicle-searching scenario and the above method of interactively searching for a target object. As shown in
In vehicle-searching scenarios, the vehicle is located in an underground parking lot. The scenarios may include: scenario 1, where no GPS signal is present in the parking lot; scenario 2, where there is no GPS signal but a WiFi signal (or a Bluetooth signal) is present in the parking lot; and scenario 3, where there is no WiFi signal either, and the first terminal uses a camera and an application program to perform offline navigation.
With the scenario 3 as an example, the first terminal may construct an offline map. For example, a visual positioning system, a Bluetooth positioning system (if any) and a parking lot system are constructed into the same world coordinate system and thus, based on the world coordinate system, functions such as positioning, notification and switching are achieved without error. For another example, the parking lot can be divided and numbered (for example, an area of 10 m × 10 m is called a sub-region) and then, within each sub-region, an offline sparse three-dimensional feature point map is constructed based on a VSLAM algorithm with sparse feature matching; then, the key frames (the first images corresponding to the URL addresses), the three-dimensional map points and the relevant parameters are stored as offline map data to obtain a visual sub-map database, where the visual sub-map database may be stored in the server.
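The numbering of sub-regions and the keying of offline sub-map data can be sketched as follows; the grid layout, column count, and payload fields are assumptions for illustration, and the VSLAM feature extraction itself is out of scope and represented by placeholder data.

```python
# Sketch: partition a parking lot into numbered 10 m x 10 m sub-regions and
# key offline sub-map data (key frames, 3-D map points, parameters) by region.
import math
from typing import Dict

SUB_REGION_SIZE_M = 10.0


def sub_region_id(x_m: float, y_m: float, columns: int = 100) -> int:
    """Map world coordinates (in metres) to a sub-region number, row-major."""
    col = math.floor(x_m / SUB_REGION_SIZE_M)
    row = math.floor(y_m / SUB_REGION_SIZE_M)
    return row * columns + col


# Visual sub-map database: sub-region id -> offline map data for that region.
visual_sub_map_db: Dict[int, dict] = {}
visual_sub_map_db[sub_region_id(23.5, 47.1)] = {
    "key_frames": ["frame_url_placeholder"],   # first images corresponding to URL addresses
    "map_points": [(1.2, 0.4, 0.0)],           # sparse 3-D feature points
    "params": {},                              # relevant parameters
}
```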
As shown in
It should be noted that in step 5, the first terminal may be pre-installed with an application program (APP) to pre-download some sub-maps of the visual sub-map database to a local memory, where the pre-downloaded sub-maps are data obtained by extracting sparse features based on the VSLAM algorithm, which is a lightweight data packet and can reduce the occupation of the local memory. The first terminal may obtain a recognition code of a target sub-map returned by the server so as to select the target sub-map. Of course, the first terminal may also obtain a plurality of feature points of the current image and then, based on the feature points, perform matching in the local memory to obtain the target sub-map and load it for display. Thus, the first terminal can achieve data processing without relying on the WiFi network in the parking lot. Furthermore, since the pre-stored sub-maps use only the feature points, the matching time can be reduced and the matching efficiency can be increased.
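A simplified sketch of the local matching step follows: the current image's feature descriptors are compared against the descriptors of each pre-downloaded sub-map and the sub-map with the most rough matches is selected. This is an illustrative assumption; real VSLAM matching (e.g. binary descriptors plus geometric verification) is considerably more involved.

```python
# Sketch: select a target sub-map from locally pre-downloaded sub-maps by
# counting nearest-descriptor matches for the current image.
import numpy as np
from typing import Dict


def count_matches(query_desc: np.ndarray, map_desc: np.ndarray,
                  max_dist: float = 0.3) -> int:
    """Count query descriptors whose nearest map descriptor is close enough."""
    if len(map_desc) == 0:
        return 0
    dists = np.linalg.norm(query_desc[:, None, :] - map_desc[None, :, :], axis=-1)
    return int((dists.min(axis=1) < max_dist).sum())


def select_target_sub_map(query_desc: np.ndarray,
                          local_sub_maps: Dict[int, np.ndarray]) -> int:
    """Return the id of the pre-downloaded sub-map with the most matches."""
    return max(local_sub_maps,
               key=lambda sid: count_matches(query_desc, local_sub_maps[sid]))
```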
In some embodiments, model training may be performed in advance for the querying and matching process, such that even for different frames (for example, frames differing in viewing angle or lighting), a well-matching target sub-map can still be obtained in the visual sub-map database, so as to improve the matching efficiency.
In some embodiments, the images of different regions in the parking lot can be classified, for example, into a first-class region based on triangular feature points, into a second-class region based on rhombus feature points, and the like. When the camera of the first terminal detects the triangular feature points of the first-class region, matching can be performed in the sub-maps of the first class in the visual sub-map database, so as to greatly increase the matching speed.
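The class-filtered matching just described can be sketched as follows: sub-maps are labelled with a region class, and only sub-maps of the detected class are searched. The class labels and the match function are illustrative assumptions.

```python
# Sketch: restrict sub-map matching to the region class detected by the camera.
from typing import Callable, Dict, Optional, Tuple


def match_within_class(detected_class: str,
                       sub_map_classes: Dict[int, Tuple[str, dict]],
                       match_fn: Callable[[object, dict], int],
                       query: object) -> Optional[int]:
    """Search only sub-maps whose region class equals detected_class and
    return the id of the best match, or None if no candidate exists."""
    candidates = {sid: data for sid, (cls, data) in sub_map_classes.items()
                  if cls == detected_class}
    if not candidates:
        return None
    return max(candidates, key=lambda sid: match_fn(query, candidates[sid]))
```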
On the basis of the above method of interactively searching for a target object, the present disclosure further provides a method of interactively searching for a target object, which is applied to a first terminal side. The method includes:
In some embodiments, the first interactive instruction includes: coordinate data of a recognition region, user data obtained by code scanning, or both.
In some embodiments, the target information includes at least one of: a current position of the target object, picture information of the current position, a historical trajectory of the target object, a navigation route for searching for the target object, or any combination thereof.
In some embodiments, the first terminal further includes an offline predetermined visual sub-map database, and when the target information includes a sub-map recognition code of a target sub-map, displaying the target information of the target object includes:
In some embodiments, the target information includes a navigation route, and displaying the target information of the target object includes:
In some embodiments, the target information includes a navigation identifier and displaying the target information of the target object includes:
In some embodiments, the method further includes:
It should be noted that the method shown herein matches the contents of the method shown in
On the basis of the above method of interactively searching for a target object, the present disclosure further provides a system for interactively searching for a target object, which includes a server, a first terminal and a second terminal; where,
In some embodiments, the second terminal includes at least one of: a mobile terminal, a vertical screen, a spliced screen, a camera, or any combination thereof.
In some embodiments, the server includes a feature vector obtaining module, a feature vector search engine and an object database;
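A minimal sketch of how such a feature vector search engine might query the object database is given below, using cosine similarity over precomputed vectors; the feature vector obtaining module (e.g. a neural network extractor) is abstracted away, and the threshold value is an assumption.

```python
# Sketch: cosine-similarity lookup of a query feature vector in an object
# database of precomputed feature vectors.
import numpy as np
from typing import Dict, Optional


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def search_object(query_vec: np.ndarray,
                  object_db: Dict[str, np.ndarray],
                  threshold: float = 0.8) -> Optional[str]:
    """Return the id of the best-matching object, or None if below threshold."""
    best_id, best_score = None, threshold
    for obj_id, vec in object_db.items():
        score = cosine_similarity(query_vec, vec)
        if score > best_score:
            best_id, best_score = obj_id, score
    return best_id
```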
It should be noted that the system shown herein matches the contents of the above method; reference may be made to the contents of the above method, which will not be repeated herein.
In some embodiments, there is further provided a security protection system, including a first terminal, a second terminal and a server. As shown in
In some embodiments, there is further provided a first terminal, including:
In some embodiments, there is further provided a non-volatile computer readable storage medium, for example, a memory containing executable computer programs. The executable computer programs may be executed by a processor to perform the method as shown in
Since the apparatus embodiments substantially correspond to the method embodiments, reference may be made to the relevant descriptions of the method embodiments for the related parts. The apparatus embodiments described above are merely illustrative, where the units described as separate members may or may not be physically separated, and the members displayed as units may or may not be physical units, i.e., they may be located in one place or may be distributed over a plurality of network units. Part or all of the modules may be selected according to actual requirements to implement the objectives of the solutions in the embodiments. Those of ordinary skill in the art may understand and implement them without creative work.
It shall be noted that the relational terms such as “first” and “second” used herein are merely intended to distinguish one entity or operation from another entity or operation rather than to require or imply any actual relation or order existing between these entities or operations. Also, the terms “including”, “containing” or any variation thereof are intended to encompass non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements not listed explicitly, or elements inherent to such a process, method, article or device. Without further limitations, an element defined by the statement “including a . . . ” does not preclude the presence of additional identical elements in a process, method, article or device including the element.
The above are detailed descriptions of a method and an apparatus provided according to the embodiments of the present disclosure. Specific examples are used herein to set forth the principles and the implementations of the present disclosure, and the descriptions of the above embodiments are only meant to help understanding of the method and the core idea of the present disclosure. Meanwhile, those of ordinary skill in the art may make alterations to the specific embodiments and the scope of application in accordance with the idea of the present disclosure. In conclusion, the contents of the present specification shall not be interpreted as limiting the present disclosure.
Priority: Application No. 202210186752.6, filed February 2022, CN (national).
PCT filing: PCT/CN2023/078791, filed February 28, 2023 (WO).