The present application relates to the field of computer processing technology, in particular, to an information pushing method and apparatus.
Location-based service (geomarketing) is a new type of discipline that has emerged along with rapid development and wide application of geographic information systems (GIS), and is a powerful tool for industry-assisted decision-making and geographic market analysis. The location-based service is closely related to marketing, but the location-based service focuses on analyzing an impact of space and distance on marketing and economic activities.
The location-based service field is broad and varied, and cooperation with traditional businesses leaves ample room for its future development.
A premise of location-based service is that users in different geographic locations need to be aggregated, and then accurate information is pushed according to specific preferences and portraits (user's attribute information) of users in similar locations.
According to coordinate data reported by a tracking point (buried point) in a mobile app browsing log, a location point where a user often stays can be obtained. However, because of its specificity, spatial data places certain requirements on a clustering algorithm: the algorithm should be able to find clusters of any shape; the number of clusters cannot be determined a priori, so partition-based clustering (k-means, etc.) is basically not feasible; and the algorithm should be insensitive to noise data.
A density-based clustering algorithm such as the density-based spatial clustering of applications with noise (DBSCAN) algorithm is a relatively classic spatial clustering algorithm that can find arbitrarily shaped clusters in spatial data containing noise.
The present application provides an information pushing method and apparatus.
In an embodiment, an information pushing method is provided, including:
acquiring and recording a coordinate point of a location where a user terminal is located reported by the user terminal, and time when the coordinate point is reported;
determining a coordinate point which satisfies a preset condition of each user within a first preset time;
dividing a minimum rectangular area containing all coordinate points which satisfy the preset condition into a grid with a preset step length as a unit, and establishing a mapping relationship between a coordinate point in each grid and a corresponding grid;
searching for a corresponding neighborhood through an eight-direction neighborhood mode with a grid having a corresponding relationship with a coordinate point as a target, in a clustering process using a density-based spatial clustering of applications with noise algorithm based on grid search;
mapping a grid in each cluster after clustering to a coordinate point in a corresponding grid according to the mapping relationship; and
pushing information at a corresponding location according to a portrait of a user corresponding to a coordinate point within a cluster for any cluster.
In another embodiment, an information pushing apparatus is provided, including: an acquiring unit, a determining unit, an establishing unit, a clustering unit, a mapping unit and a pushing unit;
the acquiring unit is configured to acquire and record a coordinate point of a location where a user terminal is located reported by the user terminal, and time when the coordinate point is reported;
the determining unit is configured to determine a coordinate point recorded by the acquiring unit which satisfies a preset condition of each user within a first preset time;
the establishing unit is configured to divide a minimum rectangular area containing all coordinate points which satisfy the preset condition determined by the determining unit into a grid with a preset step length as a unit, and establish a mapping relationship between a coordinate point in each grid and a corresponding grid;
the clustering unit is configured to search for a corresponding neighborhood through an eight-direction neighborhood mode with a grid divided by the establishing unit having a corresponding relationship with a coordinate point as a target in a clustering process using a density-based spatial clustering of applications with noise algorithm based on grid search;
the mapping unit is configured to map a grid in each cluster clustered by the clustering unit to a coordinate point in a corresponding grid according to the mapping relationship established by the establishing unit; and
the pushing unit is configured to push information at a corresponding location according to a portrait of a user corresponding to a coordinate point within a cluster mapped by the mapping unit for any cluster.
In another embodiment, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and run on the processor, where the processor, when executing the program, implements the steps of the information pushing method.
In another embodiment, a computer-readable storage medium is provided with a computer program stored thereon, where the program, when executed by a processor, implements the steps of the information pushing method.
In another embodiment, a chip for executing instructions is provided, where the chip includes a memory and a processor, the memory stores code and data, the memory is coupled with the processor, and the processor runs the code in the memory, to enable the chip to be used to execute the steps of the above-mentioned information pushing method.
In another embodiment, a program product containing instructions is provided, where the program product, when run on a computer, enables the computer to execute the steps of the above-mentioned information pushing method.
In another embodiment, a computer program is provided, where the computer program, when executed by a processor, is used to execute the steps of the above-mentioned information pushing method.
The following accompanying drawings are only a schematic description and explanation of the present application, but not to limit the scope of the present application:
In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the present application will be described in detail with reference to the accompanying drawings and embodiments.
The existing DBSCAN algorithm, when applied to coordinate data reported by mobile tracking points (buried points), has the following problem:
when the spatial area or the data volume is large, the efficiency of the DBSCAN algorithm is very poor.
An embodiment of the present application provides an information pushing method, which is applied to a system including a user terminal, a server and a pushing terminal.
The user terminal is configured to report a coordinate point of a location where the terminal is located, and provide some network access information, shopping information, etc., to determine a user portrait.
The user portrait is a labeled user model abstracted from information such as a social attribute, a living habit and a consumption behavior of a user.
The server is configured to perform clustering according to location information (a coordinate point) reported by the user terminal, determine which coordinate points correspond to users belonging to a cluster, and then push information according to a user portrait of the user corresponding to the coordinate points in the cluster. The pushed information can be an electronic advertisement, and relevant information can be pushed in places such as an elevator, a public restroom, a community gate, etc.
The pushing terminal is configured to display information pushed by the server.
Among the above-mentioned three devices, the present application mainly improves information acquisition on the user terminal and the clustering process on the server. The present application does not limit which kind of information pushing is performed based on the user portrait.
An information pushing process by the server in the present application embodiment is described in detail in combination with the accompanying drawings.
Referring to
step 101: a server acquires and records a coordinate point of a location where a user terminal is located reported by the user terminal, and time when the coordinate point is reported.
When a user browses an app, every time a page is updated, a coordinate point of a location where the user is located at that time will be reported.
The server receives the coordinate point of the location reported by the user terminal every time the page is updated, and records the reported coordinate point and the time when the coordinate point is acquired.
Referring to Table 1. Table 1 shows a corresponding relationship between each coordinate point of user 1 and time.
The coordinate points in Table 1 are identified by longitude and latitude information.
Step 102: determine a coordinate point which satisfies a preset condition of each user within a first preset time.
In the embodiment of the present application, for validity of subsequently processed data, it is also necessary to determine validity of data recorded for each user within the first preset time first, and the specific processing is as follows:
determine whether the number of recorded coordinate points is greater than a preset number threshold within the first preset time, if yes, consider the user as an active user, and perform the operation of step 102 for the user; otherwise, consider the user as an inactive user, and delete the coordinate point and time corresponding to the user.
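The activity check above can be sketched as follows. This is a minimal illustration, not the application's implementation: the record layout `(user_id, lon, lat, timestamp)` and the names `keep_active_users` and `min_points` are assumptions introduced here.

```python
from collections import defaultdict

def keep_active_users(records, min_points):
    """Keep only users with more than `min_points` reports.

    records: iterable of (user_id, lon, lat, timestamp) tuples that all fall
    within the first preset time (hypothetical layout).
    """
    by_user = defaultdict(list)
    for user_id, lon, lat, timestamp in records:
        by_user[user_id].append((lon, lat, timestamp))
    # A user is active only if the count is strictly greater than the threshold;
    # records of inactive users are dropped.
    return {user: points for user, points in by_user.items() if len(points) > min_points}

records = [
    ("user1", 116.30, 39.90, 0),
    ("user1", 116.30, 39.90, 10),
    ("user1", 116.40, 39.95, 20),
    ("user2", 121.47, 31.23, 0),
]
active = keep_active_users(records, min_points=2)
```

Here `user2` has a single report and is discarded, while `user1` is kept for the following steps.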
In the present step, the determining a coordinate point which satisfies a preset condition of each user within a first preset time includes:
step 1: count a duration that each user stays at each coordinate point within the first preset time.
Taking the contents shown in Table 1 as a premise, the process of determining the stay duration at each coordinate point is implemented by a specific example as follows.
Continuous time points recorded in Table 1 for a coordinate point (lon1, lat1) are 00:00, 00:10, and 00:30, then it is determined that the time that user 1 stays at the coordinate point (lon1, lat1) is 30 minutes.
Based on the above-mentioned method of determining the stay time of the user 1 at each coordinate point, the duration that the user 1 stays at each coordinate point is determined. See Table 2 for details, Table 2 is content corresponding to the duration that the user 1 stays at each coordinate point obtained according to Table 1.
Step 2: select two coordinate points with longest stay time for each user.
As shown in Table 2, the selected coordinate points are a coordinate point (lon1, lat1) and a coordinate point (lon2, lat2).
Step 3: determine whether a difference ratio between stay times of the two coordinate points is less than a preset ratio value, if yes, determine both coordinate points with longest stay time as the coordinate point which satisfies the preset condition; otherwise, determine a coordinate point with longer stay time between the two coordinate points with longest stay time as the coordinate point which satisfies the preset condition.
In the embodiment of the present application, the difference ratio between the stay times of the two coordinate points is: a ratio of an absolute value of a difference between the stay times of the two coordinate points to a longest stay time.
As shown in Table 2, the stay time of the coordinate point (lon1, lat1) is 30 minutes, and the stay time of the coordinate point (lon2, lat2) is 5 minutes, so the difference ratio of the stay times is:
$$\frac{|30 - 5|}{30} \approx 83.3\%$$
Assuming that the preset ratio value is 30%, the difference ratio between the stay times of the two coordinate points is greater than the preset ratio value, which indicates that the difference of the stay times of the two coordinate points is relatively large, so only the coordinate point with the longest stay time, (lon1, lat1), is retained.
If the difference ratio between the stay times of the two coordinate points is less than the preset ratio value, it indicates that the difference of the stay times of the two coordinate points is not large and that both coordinate points are relatively important, so both coordinate points are retained.
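Steps 1 to 3 can be sketched in a few lines. The sketch assumes that times are expressed in minutes, that the stay at a coordinate point is the span between the first and last report of a consecutive run at that point (matching the 00:00, 00:10, 00:30 example), and that several runs at the same point are summed; the last assumption and the names `stay_durations` and `select_points` are introduced here for illustration only.

```python
from collections import defaultdict
from itertools import groupby

def stay_durations(reports):
    """reports: time-ordered list of (coordinate, minute) pairs for one user."""
    totals = defaultdict(int)
    # groupby yields one group per run of consecutive reports at the same coordinate.
    for coord, run in groupby(reports, key=lambda report: report[0]):
        times = [minute for _, minute in run]
        totals[coord] += times[-1] - times[0]
    return dict(totals)

def select_points(durations, ratio_value):
    """Return the one or two coordinates that satisfy the preset condition."""
    ranked = sorted(durations.items(), key=lambda kv: kv[1], reverse=True)[:2]
    # With fewer than two candidates, or a zero longest stay, keep only the top one
    # (an assumption; the text does not cover these edge cases).
    if len(ranked) < 2 or ranked[0][1] == 0:
        return [coord for coord, _ in ranked[:1]]
    (first, longest), (second, runner_up) = ranked
    diff_ratio = abs(longest - runner_up) / longest
    return [first, second] if diff_ratio < ratio_value else [first]

reports = [
    (("lon1", "lat1"), 0), (("lon1", "lat1"), 10), (("lon1", "lat1"), 30),
    (("lon2", "lat2"), 40), (("lon2", "lat2"), 45),
]
durations = stay_durations(reports)
selected = select_points(durations, ratio_value=0.30)
```

With the 30-minute and 5-minute stays of the example, the difference ratio is about 83.3%, so only (lon1, lat1) is selected.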
Step 103: divide a minimum rectangular area containing all coordinate points which satisfy the preset condition into a grid with a preset step length as a unit, and establish a mapping relationship between a coordinate point in each grid and a corresponding grid.
After the determining a coordinate point which satisfies a preset condition of each user within a first preset time in step 102, and before the dividing a minimum rectangular area containing all coordinate points which satisfy the preset condition into a grid with a preset step length as a unit in step 103, the method further includes: filtering out a coordinate point in a low-density area, which is specifically implemented as follows:
step 1: divide the minimum rectangular area containing all coordinate points which satisfy the preset condition into a grid with N times the preset step length as a unit; where N is an integer greater than 2;
where the minimum rectangular area of all coordinate points to be processed can be determined by the range between the minimum and the maximum longitude and latitude of the coordinate points; and
step 2: if it is determined that a number of coordinate points in any grid is less than a preset number threshold, then delete the coordinate points in the grid from all coordinate points which satisfy the preset condition.
The preset step length here is an actual step length when a grid is divided, and here the grid is firstly divided with more than twice the step length, so that each grid corresponds to a relatively large area, and if the number of coordinate points is still small in such a large area, the grid is considered to be a low-density area, then the coordinate points in the grid are filtered out.
The retained coordinate points will continue to be processed with step 103.
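The coarse-grid filter described above might look like the sketch below. The function name `filter_low_density` and its parameters are hypothetical, and coarse cell indices are computed with `floor` purely for illustration; the text does not state how the coarse cells are indexed.

```python
import math
from collections import Counter

def filter_low_density(points, w, n, min_count):
    """Drop points that fall in sparsely populated coarse cells.

    points: list of (lon, lat); w: preset step length; n: integer multiple (> 2);
    min_count: preset number threshold for a coarse cell.
    """
    if not points:
        return []
    min_lon = min(lon for lon, _ in points)
    min_lat = min(lat for _, lat in points)
    size = n * w  # a coarse cell is n times the preset step length

    def cell(point):
        lon, lat = point
        return (math.floor((lon - min_lon) / size), math.floor((lat - min_lat) / size))

    counts = Counter(cell(point) for point in points)
    # Keep a point only if its coarse cell holds at least min_count points.
    return [point for point in points if counts[cell(point)] >= min_count]

points = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.2), (5.0, 5.0)]
kept = filter_low_density(points, w=0.5, n=3, min_count=2)
```

The isolated point (5.0, 5.0) sits alone in its coarse cell and is removed before the fine grid is built.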
In the embodiment of the present application, a grid is identified by a pair of reference numbers, one in terms of longitude and one in terms of latitude; for example, 2-8 indicates that the grid is in the second row in terms of longitude and the eighth row in terms of latitude.
In the embodiment of the present application, the establishing a mapping relationship between a coordinate point in each grid and a corresponding grid includes:
take an i-th coordinate point as an example;
establish a mapping relationship between a coordinate (loni, lati) of the i-th coordinate point and a grid identifier (lon_idi_lat_idi) as:
calculate a difference between the loni and minlon, and take a value obtained by rounding up a quotient of the difference with w as the lon_idi;
which is
$$\mathrm{lon\_id}_i = \left\lceil \frac{\mathrm{lon}_i - \mathrm{minlon}}{w} \right\rceil$$
and
calculate a difference between the lati and minlat, and take a value obtained by rounding up a quotient of the difference with w as the lat_idi;
which is
$$\mathrm{lat\_id}_i = \left\lceil \frac{\mathrm{lat}_i - \mathrm{minlat}}{w} \right\rceil$$
where the minlon and the minlat are minimum longitude and latitude coordinates of the minimum rectangular area; and the w is the preset step length.
Through the above-mentioned algorithm, a mapping relationship between each coordinate point and a grid identifier can be determined. As shown in Table 3, Table 3 shows a mapping relationship between a coordinate point and a grid identifier.
In Table 3, the user_id is a user identifier, and the grid_id is a grid identifier. There can be one or more coordinate points in a grid.
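The mapping between coordinate points and grid identifiers can be illustrated with a short sketch. The record layout `(user_id, lon, lat)`, the function names, and the `"lon_id_lat_id"` string format of the identifier are assumptions made for this example; only the rounding-up rule comes from the text.

```python
import math
from collections import defaultdict

def grid_id(lon, lat, min_lon, min_lat, w):
    """Round up (lon - minlon) / w and (lat - minlat) / w to get the grid identifier."""
    lon_id = math.ceil((lon - min_lon) / w)
    lat_id = math.ceil((lat - min_lat) / w)
    return f"{lon_id}_{lat_id}"

def build_grid_index(points, w):
    """points: list of (user_id, lon, lat). Returns {grid_id: [points in that grid]}."""
    min_lon = min(lon for _, lon, _ in points)
    min_lat = min(lat for _, _, lat in points)
    index = defaultdict(list)
    for user_id, lon, lat in points:
        index[grid_id(lon, lat, min_lon, min_lat, w)].append((user_id, lon, lat))
    return dict(index)

points = [("u1", 10.0, 20.0), ("u2", 10.25, 20.75), ("u3", 10.3, 20.6)]
index = build_grid_index(points, w=0.5)
```

Only grids that contain at least one coordinate point appear as keys, which is what lets the later clustering step skip empty grids.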
Step 104: search for a corresponding neighborhood through an eight-direction neighborhood mode with a grid having a corresponding relationship with a coordinate point as a target in a clustering process using a DBSCAN algorithm based on grid search.
Through step 103, the target of clustering is converted from a coordinate point to a grid. Next, in the clustering process using the DBSCAN algorithm, the neighborhood search is performed with a grid, rather than a coordinate point, as the target. Only a grid having a mapping relationship with a coordinate point, that is, a grid with a corresponding coordinate point, is processed; a grid without a corresponding coordinate point is not processed.
Referring to
step 201: select a grid from a set of current un-clustered grids.
Before performing clustering on grids, all the grids are un-clustered grids forming a set of un-clustered grids.
Here, the grids forming the set of un-clustered grids are grids having a mapping relationship with the coordinate points.
The grid can be selected at random, or according to a selection rule given for the actual application in order to reduce the number of searches; this is not limited in the embodiment of the present application.
Step 202: acquire a set of neighborhood grids of the grid.
The set includes the grid.
When acquiring a neighborhood set of the selected grid, a corresponding neighborhood is searched for through an eight-direction neighborhood mode with the grid as a target. The specific acquisition process is as follows:
step 1: for any grid P, take the grid P as a center grid, and search for an eight-direction neighborhood grid of the grid P.
When searching for the eight-direction neighborhood grid of the grid, the searched grid is a grid in the set of un-clustered grids; for a grid that is an eight-direction neighborhood grid of a grid, but not a grid in the set of current un-clustered grids, it is not treated as an eight-direction neighborhood grid of the grid.
Referring to
Step 2: search for an eight-direction neighborhood grid of each neighborhood grid with all found neighborhood grids as a center grid again.
For a first-order neighborhood grid found in the step 1, the searched first-order neighborhood grid is taken as a center grid for performing an eight-direction neighboring grid search again.
Referring to
The first-order neighborhood grids are the grid 22, the grid 23, the grid 34, and the grid 42. The eight-direction neighborhood grids for the grid 22 are a grid 11, a grid 13, the grid 23 and the grid 33; since the grid 23 and the grid 33 have been searched before, only the grid 11 and the grid 13 are taken as second-order neighborhood grids of the grid P. Similarly, for the grid 23, a grid 14 is taken as a second-order neighborhood grid of the grid P; for the grid 34 and the grid 42, there is no new second-order neighborhood grid for the grid P; and
based on the above-mentioned search, all current neighborhood grids of the grid P are: the grid 33, the grid 22, the grid 23, the grid 34, the grid 42, the grid 11, the grid 13, and the grid 14.
Step 3: repeat the search until a total distance of all the searched neighborhood grids is greater than a preset clustering diameter or no new neighborhood grid is found, and then end the search of neighborhood grids.
The total distance of all the neighborhood grids is a diagonal length of a minimum rectangular area corresponding to all the neighborhood grids.
If it is determined that the diagonal length of the minimum rectangular area corresponding to all the current neighborhood grids (after the second-order search) is greater than the clustering diameter, or that no new neighborhood grid is found with the second-order neighborhood grids as centers, the search for the grid P is ended; otherwise, the search is continued in a similar manner.
All the searched neighboring grids and the grid P are taken as a set of neighborhood grids of the grid P.
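The layered eight-direction search of steps 1 to 3 can be sketched as a breadth-first expansion. The sketch assumes grids are represented as `(row, column)` integer pairs, that the bounding rectangle includes the grid P itself, and that its diagonal is measured as a number of grids multiplied by the step length `w`; these choices and the function names are illustrative assumptions.

```python
import math

# The eight directions around a center grid.
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

def bounding_diagonal(grids, w):
    """Diagonal length of the minimum rectangle covering all the given grids."""
    rows = [row for row, _ in grids]
    cols = [col for _, col in grids]
    height = (max(rows) - min(rows) + 1) * w
    width = (max(cols) - min(cols) + 1) * w
    return math.hypot(width, height)

def neighborhood(p, unclustered, w, diameter):
    """Return grid p plus all neighborhood grids reached before a stop condition."""
    found = {p}
    frontier = {p}
    while frontier:
        new = set()
        for row, col in frontier:
            for dr, dc in OFFSETS:
                grid = (row + dr, col + dc)
                # Only grids still in the un-clustered set count as neighbors.
                if grid in unclustered and grid not in found:
                    new.add(grid)
        if not new:  # no new neighborhood grid
            break
        found |= new
        if bounding_diagonal(found, w) > diameter:  # clustering diameter exceeded
            break
        frontier = new  # next order: search around the newly found grids
    return found

# Grids of the example above, written as (row, column) pairs; P is grid 33.
unclustered = {(3, 3), (2, 2), (2, 3), (3, 4), (4, 2), (1, 1), (1, 3), (1, 4)}
full = neighborhood((3, 3), unclustered, w=1.0, diameter=100.0)
bounded = neighborhood((3, 3), unclustered, w=1.0, diameter=2.0)
```

With a generous diameter, the search returns all eight grids of the example; with a small diameter, it stops after the first-order neighbors.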
Step 203: determine whether a number of coordinate points in all grids in the set of neighborhood grids is less than a preset noise threshold, if yes, perform step 204; otherwise, perform step 205.
Step 204: mark all the grids in the set of neighborhood grids as noise; perform step 209.
Step 205: determine whether the number of the coordinate points in all the grids in the set of neighborhood grids is less than a preset small cluster threshold, if yes, perform step 206; otherwise, perform step 207.
The preset small cluster threshold is greater than the preset noise threshold.
Step 206: determine whether, among the clusters already formed, there is a cluster whose center is at a distance less than a preset distance threshold from the center of all the grids in the set of neighborhood grids; if yes, perform step 208; otherwise, perform step 207.
A coordinate of a center point of a cluster is a mean value of coordinate points of all grids in the cluster, where the mean value of the coordinate points is calculated by longitude and latitude. If there are N coordinate points in a cluster, then a specific determination method of a center coordinate (Core_lonN, Core_latN) of the cluster is as follows:
$$\mathrm{Core\_lon}_N = \frac{1}{N}\sum_{i=1}^{N} \mathrm{lon}_i, \qquad \mathrm{Core\_lat}_N = \frac{1}{N}\sum_{i=1}^{N} \mathrm{lat}_i$$
where the loni and lati are longitude and latitude coordinates of an i-th coordinate point among the N coordinate points.
A center of all the grids in the set of neighboring grids is a mean value of coordinate points corresponding to all the grids, which is similar to the calculation method of the center of the cluster, and will not be described in detail here.
Step 207: mark all the grids in the set of neighborhood grids as belonging to a new cluster; perform step 209.
Step 208: add all the grids in the set of neighborhood grids to the cluster whose center is at a distance less than the preset distance threshold.
Step 209: delete all the grids in the set of neighborhood grids from a set of un-clustered grids.
Step 210: determine whether the set of un-clustered grids is empty, if yes, end the process; otherwise, perform step 201.
So far, the clustering of all the grids is completed.
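Steps 201 to 210 can be summarized in a compact sketch. It is an illustration under stated assumptions rather than the application's implementation: the neighborhood search is passed in as a hypothetical callable `neighbors_of(grid, unclustered)`, distances between centers are plain Euclidean distances on longitude and latitude, and all names are introduced here.

```python
import math

def centroid(points):
    """Mean longitude and latitude of a list of (lon, lat) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def cluster_grids(grid_points, neighbors_of, noise_t, small_t, dist_t):
    """grid_points: {grid: [(lon, lat), ...]}. Returns (clusters, noise)."""
    unclustered = set(grid_points)
    clusters = []  # each cluster is a set of grids
    noise = set()
    while unclustered:                                  # step 210
        p = next(iter(unclustered))                     # step 201 (arbitrary pick)
        hood = neighbors_of(p, unclustered) | {p}       # step 202
        pts = [pt for grid in hood for pt in grid_points[grid]]
        if len(pts) < noise_t:                          # step 203
            noise |= hood                               # step 204
        else:
            target = None
            if len(pts) < small_t:                      # step 205
                center = centroid(pts)
                for cluster in clusters:                # step 206
                    c_pts = [pt for grid in cluster for pt in grid_points[grid]]
                    if math.dist(center, centroid(c_pts)) < dist_t:
                        target = cluster
                        break
            if target is None:
                clusters.append(set(hood))              # step 207
            else:
                target |= hood                          # step 208
        unclustered -= hood                             # step 209
    return clusters, noise

def eight_neighbors(p, pool):
    """One-step eight-direction neighbors, used here only to drive the example."""
    return {(p[0] + a, p[1] + b) for a in (-1, 0, 1) for b in (-1, 0, 1)} & pool

grid_points = {
    (0, 0): [(0.1, 0.1)] * 5,
    (0, 1): [(0.1, 1.1)] * 5,
    (5, 5): [(5.1, 5.1)] * 3,
    (10, 10): [(10.1, 10.1)],
}
clusters, noise = cluster_grids(grid_points, eight_neighbors,
                                noise_t=2, small_t=4, dist_t=0.5)
```

In this example the two adjacent dense grids form one cluster, the three-point grid is a small group with no cluster nearby and so becomes its own cluster, and the single-point grid is marked as noise.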
A clustering using the DBSCAN algorithm with a grid as a target can greatly improve computing efficiency, thereby greatly improving information pushing efficiency of a device.
Step 105: map a grid in each cluster after clustering to a coordinate point in a corresponding grid according to the mapping relationship.
The mapping relationship between coordinate points and grid identifiers is given in Table 3, and clustering of the coordinate points can be realized by mapping a clustered grid to a corresponding coordinate point.
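Mapping clustered grids back to coordinate points is a lookup through the grid index. The sketch below assumes the index has the hypothetical form `{grid_id: [(user_id, lon, lat), ...]}` and that each cluster is a set of grid identifiers.

```python
def clusters_to_points(clusters, grid_index):
    """Replace every grid in every cluster by the coordinate points it contains."""
    return [
        [point for grid in sorted(cluster) for point in grid_index[grid]]
        for cluster in clusters
    ]

grid_index = {
    "0_0": [("u1", 10.0, 20.0)],
    "1_2": [("u2", 10.25, 20.75), ("u3", 10.3, 20.6)],
}
point_clusters = clusters_to_points([{"0_0", "1_2"}], grid_index)
```

Because each record keeps its user identifier, the users belonging to a cluster can be read directly from the result.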
Step 106: push information at a corresponding location according to a portrait of a user corresponding to a coordinate point within a cluster for any cluster.
When specifically implementing the embodiment of the present application, a coordinate point and a user also have a corresponding relationship, and information can be pushed at a location corresponding to a corresponding cluster according to a portrait of the user, and specific implementation of step 106 is not limited in the embodiment of the present application.
Based on the same inventive concept, the present application further provides an information pushing apparatus. Referring to
the acquiring unit 501 is configured to acquire and record a coordinate point of a location where a user terminal is located reported by the user terminal, and time when the coordinate point is reported;
the determining unit 502 is configured to determine a coordinate point recorded by the acquiring unit 501 which satisfies a preset condition of each user within a first preset time;
the establishing unit 503 is configured to divide a minimum rectangular area containing all coordinate points which satisfy the preset condition determined by the determining unit 502 into a grid with a preset step length as a unit, and establish a mapping relationship between a coordinate point in each grid and a corresponding grid;
the clustering unit 504 is configured to search for a corresponding neighborhood through an eight-direction neighborhood mode with a grid divided by the establishing unit 503 having a corresponding relationship with a coordinate point as a target, in a clustering process using a density-based spatial clustering of applications with noise algorithm based on grid search;
the mapping unit 505 is configured to map a grid in each cluster clustered by the clustering unit 504 to a coordinate point in a corresponding grid according to the mapping relationship established by the establishing unit 503; and
the pushing unit 506 is configured to push information at a corresponding location according to a portrait of a user corresponding to a coordinate point within a cluster mapped by the mapping unit 505 for any cluster.
In an embodiment, the determining unit 502 is specifically configured to, when determining the coordinate point which satisfies the preset condition of each user within the first preset time, count a duration that each user stays at each coordinate point within the first preset time; select two coordinate points with longest stay time for each user; and determine whether a difference ratio between stay times of the two coordinate points is less than a preset ratio value, if yes, determine both coordinate points with longest stay time as the coordinate point which satisfies the preset condition; otherwise, determine a coordinate point with longer stay time between the two coordinate points with longest stay time as the coordinate point which satisfies the preset condition.
In an embodiment, the establishing unit 503 is further configured to, before dividing the minimum rectangular area containing all the coordinate points which satisfy the preset condition into the grid with the preset step length as a unit, divide the minimum rectangular area containing all coordinate points which satisfy the preset condition into a grid with N times the preset step length as a unit; wherein N is an integer greater than 2; and if it is determined that a number of coordinate points in any grid is less than a preset number threshold, then delete the coordinate points in the grid from all coordinate points which satisfy the preset condition.
In an embodiment, the establishing unit 503 is specifically configured to establish a mapping relationship between a coordinate (loni, lati) of an i-th coordinate point and a grid identifier (lon_idi_lat_idi) as: calculate a difference between the loni and minlon, and take a value obtained by rounding up a quotient of the difference with w as the lon_idi; and calculate a difference between the lati and minlat, and take a value obtained by rounding up a quotient of the difference with w as the lat_idi; wherein the minlon and the minlat are minimum longitude and latitude coordinates of the minimum rectangular area; and the w is the preset step length.
In an embodiment, the clustering unit 504 is specifically configured to, for any grid P, take the grid P as a center grid, and search for an eight-direction neighborhood grid of the grid P; search for an eight-direction neighborhood grid of each neighborhood grid with all searched neighborhood grids as a center grid again; end search of neighborhood grids until a total distance of all the searched neighborhood grids is greater than a preset clustering diameter or there is no new neighborhood grid; and take all the searched neighborhood grids and the grid P as a set of neighborhood grids of the grid P; wherein the total distance of all the neighborhood grids is a diagonal length of a minimum rectangular area corresponding to all the neighborhood grids.
In an embodiment, the clustering unit 504 is specifically configured to, in the clustering process using the density-based spatial clustering of applications with noise algorithm based on the grid search: select a grid from a set of current un-clustered grids; acquire a set of neighborhood grids of the grid; mark all grids in the set of neighborhood grids as noise when it is determined that a number of coordinate points in all the grids in the set of neighborhood grids is less than a preset noise threshold; add all the grids in the set of neighborhood grids to an already formed cluster when it is determined that the number of the coordinate points in all the grids in the set of neighborhood grids is not less than the preset noise threshold and is less than a preset small cluster threshold, and that the center of the already formed cluster is at a distance less than a preset distance threshold from the center of all the grids in the set of neighborhood grids; mark all the grids in the set of neighborhood grids as belonging to a new cluster when it is determined that the number of the coordinate points in all the grids in the set of neighborhood grids is not less than the preset noise threshold and is less than the preset small cluster threshold and that no already formed cluster has a center at a distance less than the preset distance threshold from the center of all the grids in the set of neighborhood grids, or that the number of the coordinate points in all the grids in the set of neighborhood grids is not less than the preset small cluster threshold; delete all the grids in the set of neighborhood grids from the set of un-clustered grids; and determine whether the set of un-clustered grids is empty, if yes, end the process; otherwise, select a grid from the set of current un-clustered grids again; wherein the preset noise threshold is less than the preset small cluster threshold.
The units of the above-mentioned embodiment can be integrated or deployed separately; they can be combined into one unit or be further split into multiple sub-units.
In another embodiment, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and run on the processor, where the processor, when executing the program, implements the steps of the information pushing method.
In another embodiment, a computer-readable storage medium is provided with a computer program stored thereon, where the program, when executed by a processor, implements the steps of the information pushing method.
In another embodiment, a chip for executing instructions is provided, where the chip includes a memory and a processor, the memory stores code and data, the memory is coupled with the processor, and the processor runs the code in the memory, to enable the chip to be used to execute the steps of the above-mentioned information pushing method.
In another embodiment, a program product containing instructions is provided, where the program product, when run on a computer, enables the computer to execute the steps of the above-mentioned information pushing method.
In another embodiment, a computer program is provided, where the computer program, when executed by a processor, is used to execute the steps of the above-mentioned information pushing method.
In summary, by converting the neighborhood search target from a coordinate point to a grid in the DBSCAN clustering, the present application greatly accelerates clustering, thereby improving information pushing efficiency.
The above-mentioned is only a preferred embodiment of the present application and is not intended to limit the present application. Any modification, equivalent replacement, improvement, etc., made within the spirit and principle of the present application shall be included in the scope of protection of the present application.
Number | Date | Country | Kind |
---|---|---|---|
201910559744.X | Jun 2019 | CN | national |
This application is a continuation of International Application No. PCT/CN2020/076995, filed on Feb. 27, 2020, which claims priority to the Chinese Patent Application No. 201910559744.X, filed to the China National Intellectual Property Administration on Jun. 26, 2019 and entitled “INFORMATION PUSHING METHOD AND APPARATUS”. The contents of the above applications are hereby incorporated by reference in their entireties in this application.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/CN2020/076995 | Feb 2020 | US |
Child | 17550499 | | US |