The present disclosure relates to the field of computer application technologies, and particularly to big data and deep learning technologies in artificial intelligence technologies.
With the rapid development of mobile Internet technology, mobile Internet applications affect many aspects of our lives. Unlike in the personal computer (PC) era, information of a brand-new dimension, namely a geographic location, is added in almost all mobile applications.
In order to facilitate recording and application of the geographic location, different geographic location regions are required to be encoded reasonably.
In view of this, the present disclosure provides a method for encoding a geographic location region, a device and a computer storage medium, so as to reasonably encode the geographic location region.
According to a first aspect of the present disclosure, there is provided a method for establishing an encoding model, including:
According to a second aspect of the present disclosure, there is provided a method for encoding a geographic location region, including:
According to a third aspect of the present disclosure, there is provided an electronic device, including:
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium with computer instructions stored thereon, wherein the computer instructions are used for causing a computer to perform the method as mentioned above.
It should be understood that the statements in this section are not intended to identify key or critical features of the embodiments of the present disclosure, nor limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
The drawings are used for better understanding the present solution and do not constitute a limitation of the present disclosure. In the drawings,
The following part will illustrate exemplary embodiments of the present disclosure with reference to the drawings, including various details of the embodiments of the present disclosure for a better understanding. The embodiments should be regarded only as exemplary ones. Therefore, those skilled in the art should appreciate that various changes or modifications can be made with respect to the embodiments described herein without departing from the scope and spirit of the present disclosure. Similarly, for clarity and conciseness, the descriptions of the known functions and structures are omitted in the descriptions below.
Encoding of a geographic location region is representation of the geographic location region using one code, and is used to distinguish the geographic location region from other geographic location regions in a limited geographic location region set.
The geographic location region may be divided according to administrative divisions, such as a city, a district, a block, or the like. The geographic location region may also be divided according to a preset accuracy and shape, for example, into a 1 km × 1 km grid, in which each grid cell is used as one geographic location region.
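As an illustration of division by preset accuracy, the following minimal sketch maps a latitude/longitude coordinate to the index of a roughly 1 km × 1 km grid cell. The function name `grid_cell`, the equirectangular approximation and the constant of about 111.32 km per degree of latitude are assumptions made for this example and are not prescribed by the present disclosure.

```python
import math

# Approximate length of one degree of latitude in kilometres (assumed constant).
KM_PER_DEG_LAT = 111.32


def grid_cell(lat, lon, cell_km=1.0):
    """Return the (row, col) index of the grid cell containing (lat, lon).

    Hypothetical helper: rows are bands of latitude about cell_km tall, and
    columns are sized using the latitude at the centre of the row, so every
    cell in a row has the same width. Not suitable near the poles.
    """
    row = math.floor(lat * KM_PER_DEG_LAT / cell_km)
    center_lat = (row + 0.5) * cell_km / KM_PER_DEG_LAT
    km_per_deg_lon = KM_PER_DEG_LAT * math.cos(math.radians(center_lat))
    col = math.floor(lon * km_per_deg_lon / cell_km)
    return row, col
```

Each distinct `(row, col)` pair can then serve as the identifier of one geographic location region.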
At present, conventional geographic-region encoding methods comply with the principle that “geographic location regions close in physical space have more similar codes”; that is, conventionally, the geographic location region is encoded based on location information, as in the commonly used GeoHash encoding. However, in practical applications, these encoding methods are often unreasonable, because regions that are physically close may have quite different geographic functions, while regions that are far apart may be functionally alike. The core idea of the present disclosure is that encoding is based on the principle that “geographic location regions with similar geographic functions and surface-feature distributions have more similar codes”. The method according to the present disclosure will be described below in detail in conjunction with an embodiment.
In the present disclosure, encoding of the geographic location region is mainly implemented based on an encoding model, and thus mainly includes two stages: a stage of establishing an encoding model and a stage of encoding the geographic location region using the encoding model. The two stages are described below.
101: acquiring training data, the training data including at least one triplet, and the triplet including an anchor sample, a positive sample and a negative sample of geographic location regions.
102: training the encoding model using the training data, the encoding model performing the following operations on each sample: performing embedding on at least one kind of geographic function information and at least one kind of surface-feature distribution information of the sample, and fusing vector representations obtained by the embedding to obtain an encoding result of the sample; wherein the encoding model has training targets of minimizing a distance between the encoding result of the anchor sample and the encoding result of the positive sample in the triplet, and maximizing a distance between the encoding result of the anchor sample and the encoding result of the negative sample in the triplet.
From the above technical solution in the embodiment, the established encoding model performs encoding based on the geographic function information and the surface-feature distribution information of the geographic location region, such that the geographic location regions with similar geographic functions and surface-feature distributions have more similar encoding results, and the encoding method is more reasonable compared with a traditional encoding method. The above steps will be described below in detail in conjunction with an embodiment.
First, the above step 101 of acquiring training data is described in detail in conjunction with the embodiment.
The training data used in the present disclosure is triplets, and each triplet includes an anchor sample, a positive sample and a negative sample. Each sample is a geographic location region. The positive sample is a geographic location region having a geographic function and a surface-feature distribution quite similar to those of the anchor sample. The negative sample is a geographic location region having a geographic function and a surface-feature distribution dissimilar to those of the anchor sample.
Each triplet in the training data may be selected manually. Manual selection has high accuracy, but incurs a high labor cost and has low efficiency. Therefore, several methods of automatically acquiring the training data are provided in the embodiment of the present disclosure, such as but not limited to:
In a first method, the anchor sample may be selected from the geographic location regions divided in advance. For the positive sample, since two adjacent geographic location regions are more likely to have similar geographic functions and surface-feature distributions, one geographic location region may be selected from the neighbor geographic location regions of the anchor sample. For the negative sample, one geographic location region may be selected from the non-neighbor geographic location regions of the anchor sample. In both cases, the selection may be random or may follow a certain rule.
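The neighbor-based selection described above can be sketched as follows. The sketch assumes that the regions form a rectangular grid indexed by `(row, col)`, that "neighbor" means the up to eight adjacent cells, and that selection is random; the names `neighbors` and `sample_triplet` are hypothetical.

```python
import random


def neighbors(cell, rows, cols):
    """Return the cells adjacent to `cell` (8-neighborhood) inside the grid."""
    r, c = cell
    return [
        (r + dr, c + dc)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0) and 0 <= r + dr < rows and 0 <= c + dc < cols
    ]


def sample_triplet(rows, cols, rng=None):
    """Sample (anchor, positive, negative) from a rows x cols grid.

    Assumes the grid is large enough (at least 4 cells along one side) that
    every anchor has at least one neighbor and one non-neighbor.
    """
    rng = rng or random.Random()
    anchor = (rng.randrange(rows), rng.randrange(cols))
    near = neighbors(anchor, rows, cols)
    positive = rng.choice(near)          # adjacent region
    excluded = set(near) | {anchor}
    far = [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if (r, c) not in excluded
    ]
    negative = rng.choice(far)           # non-adjacent region
    return anchor, positive, negative
```

A rule-based selection, for example always choosing the neighbor to the east, could replace either `rng.choice` call without changing the rest of the sketch.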
A second method: from a navigation log, acquiring a geographic location region where a navigation starting point is located and a geographic location region where a navigation ending point is located as the anchor sample and the positive sample of the geographic location regions respectively, and selecting another geographic location region as the negative sample.
Based on customary preferences of users, a departure place and a destination are likely to have similar geographic functions and surface-feature distributions. Therefore, navigation information of a large number of users may be acquired from the navigation log, geographic location region pairs formed by the navigation starting points and the navigation ending points are counted, and a geographic location region pair with an occurrence frequency or an occurrence number meeting a certain condition is used as the anchor sample and the positive sample. The negative sample of the anchor sample may be randomly selected from the geographic location regions other than the positive sample and the anchor sample.
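The counting step can be illustrated with a short sketch. It assumes that the navigation log has already been reduced to `(start_region, end_region)` pairs and that the condition is a minimum occurrence number; the function name, the `min_count` parameter and the choice to skip trips that start and end in the same region are assumptions for illustration only.

```python
import random
from collections import Counter


def triplets_from_navigation_log(trips, all_regions, min_count, rng=None):
    """Build (anchor, positive, negative) triplets from (start, end) pairs.

    A pair becomes (anchor, positive) when it occurs at least `min_count`
    times; the negative is drawn at random from the remaining regions.
    """
    rng = rng or random.Random()
    # Assumption: trips within a single region carry no pairing signal.
    pair_counts = Counter((s, e) for s, e in trips if s != e)
    triplets = []
    for (start, end), count in pair_counts.items():
        if count < min_count:
            continue
        candidates = [r for r in all_regions if r not in (start, end)]
        if candidates:
            triplets.append((start, end, rng.choice(candidates)))
    return triplets
```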
A third method: from a retrieval log, acquiring a geographic location region where an initiating location of retrieval is located and a geographic location region where a target location of the retrieval is located as the anchor sample and the positive sample of the geographic location regions respectively, and randomly selecting another geographic location region as the negative sample.
Similarly, based on customary preferences of users, a location where a user initiates retrieval and a target location of the retrieval are likely to have similar geographic functions and surface-feature distributions. Therefore, retrieval information of a large number of users may be acquired from the retrieval log. Geographic location region pairs formed by the initiating locations and the target locations of the retrieval are counted, and a geographic location region pair with an occurrence frequency or an occurrence number meeting a certain condition is used as the anchor sample and the positive sample. The negative sample of the anchor sample may be randomly selected from the geographic location regions other than the positive sample and the anchor sample.
The above step 102 of “training the encoding model using the training data” is described in detail below in conjunction with an embodiment.
At least one kind of geographic function information and at least one kind of surface-feature distribution information are extracted from the sample in each triplet in the training data.
The geographic function information may include at least one of: point of interest (POI) information, user information, or place query terms initiated at the geographic location region.
The POI information may include names, types, a number, addresses, or the like, of POIs contained in the geographic location region. The POI information may largely reflect the geographic function of the geographic location region. For example, geographic location regions where Disney and a fairground are located have similar geographic functions.
The user information may include an age distribution, a sex ratio, an occupation type distribution, an education level status, a salary status, or the like, of the users in the geographic location region. For example, users in a geographic location region, such as a science and technology park, show characteristics that the users are mostly male, 25-35 years old, programmers, and have college degrees or above and higher salaries.
The place query terms initiated in the geographic location region largely reflect user preferences of the geographic location region, and also reflect the geographic function of the geographic location region to a certain extent. The data may also be acquired from the retrieval log, and the place query terms initiated in the geographic location region in the retrieval log are counted to acquire the place query term with an occurrence frequency or an occurrence number meeting a certain condition.
The surface-feature distribution information may include at least one of: a map image or a real-scene image of the geographic location region. These images may be acquired from a server or database of a map-type application.
The map image of the geographic location region may be an image of the geographic location region displayed on a map. The map image may be a satellite image or a base map image. The map image includes map elements for various region types, such as land, river systems, green space, or the like, roads, such as high speed roads, city main roads, railways, or the like, and various types of POIs, such as scenic spots, hotels, schools, shopping malls, shops, office buildings, stadiums, or the like. The map image well reflects the surface-feature distribution of the geographic location region.
The real-scene image refers to an image drawn or shot based on a real scene, such as a street scene image. The real-scene image also well reflects the surface-feature distribution of the geographic location region.
In order to facilitate the description and understanding of the following embodiments, in the subsequent embodiments, for example, five features of the POI information, the user information, the place query term initiated in the geographic location region, the map image and the real-scene image are extracted from the geographic location region (each sample), and the five features are represented as: X1, X2, X3, X4 and X5 respectively.
The five features X1, X2, X3, X4 and X5 extracted from the geographic location region are input into the encoding model. The encoding model performs embedding on the five features to obtain vector representations of the features, i.e., the vector representation of the POI information, the vector representation of the user information, the vector representation of the place query term initiated in the geographic location region, the vector representation of the map image and the vector representation of the real-scene image. Then, the vector representations obtained by the embedding are fused to obtain an encoding result v of the geographic location region.
Since the POI information and the query term input into the embedding networks M1 and M3 are usually text-type data, these embedding networks may be neural networks, such as RNNs, or the like. The embedding operations performed by the embedding networks M1 and M3 may be represented as M1(X1, θ1) and M3(X3, θ3) respectively, wherein θ1 and θ3 are model parameters of the embedding networks M1 and M3 respectively.
Since the user information input into the embedding network M2 is usually attribute-distribution-type data, this embedding network may be a neural network, such as a DNN, or the like. The embedding operation performed by the embedding network M2 may be represented as M2(X2, θ2), where θ2 is a model parameter of the embedding network M2.
Since image-type data is input into the embedding networks M4 and M5, these embedding networks may be neural networks, such as CNNs, or the like. The embedding operations performed by the embedding networks M4 and M5 may be represented as M4(X4, θ4) and M5(X5, θ5) respectively, wherein θ4 and θ5 are model parameters of the embedding networks M4 and M5 respectively.
The vector representations output by the embedding networks are sent into a fusion network for fusion, so as to obtain the encoding result v of the geographic location region. The fusion may be a process of splicing the vector representations, and then obtaining the encoding result through fully connected mapping. Or, the fusion may also be a process of taking an outer product of the vector representations to obtain the encoding result. Other processing methods are also possible, which are not listed here. A model parameter of the fusion network is represented as θ.
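The first fusion option, splicing followed by a fully connected mapping, may be sketched as below. The function name `fuse`, the plain-list representation of vectors and the absence of an activation function are assumptions for illustration; in a real model the weights and bias would be the learned parameter θ of the fusion network.

```python
def fuse(vectors, weights, bias):
    """Splice the embedding vectors and apply one fully connected mapping.

    vectors: list of per-feature embeddings (lists of floats).
    weights: one row per output dimension, each as long as the spliced vector.
    bias:    one value per output dimension.
    """
    spliced = [x for vec in vectors for x in vec]  # concatenation
    if any(len(row) != len(spliced) for row in weights):
        raise ValueError("each weight row must match the spliced length")
    return [
        sum(w * x for w, x in zip(row, spliced)) + b
        for row, b in zip(weights, bias)
    ]
```

For example, `fuse([[1.0, 2.0], [3.0]], [[1, 0, 0], [0, 1, 1]], [0.0, 0.5])` splices the inputs into a three-dimensional vector and maps it to a two-dimensional encoding result.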
During training, for a triplet (va, v+, v−), va is the anchor sample, v+ is the positive sample, and v− is the negative sample. The encoding model has the training targets of minimizing the distance between the encoding result of the anchor sample and the encoding result of the positive sample in the triplet, and maximizing the distance between the encoding result of the anchor sample and the encoding result of the negative sample in the triplet.
Assuming that the processing of each sample by the encoding model is represented as f(·), a loss function may be defined as, for example:
$$L = \max\left(\lVert f(v_a) - f(v_+)\rVert_2^2 - \lVert f(v_a) - f(v_-)\rVert_2^2 + r,\ 0\right)$$
wherein r is a preset minimum interval, which ensures that the distance between the encoding result of the anchor sample and the encoding result of the negative sample exceeds the distance between the encoding result of the anchor sample and the encoding result of the positive sample by at least r, and ∥·∥₂² represents the squared Euclidean distance.
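A loss of this kind can be sketched in a few lines, assuming the common margin-based triplet form with squared Euclidean distances. The function names are hypothetical, and the inputs are the already computed encoding results of the three samples.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(f_anchor, f_positive, f_negative, r):
    """Hinge-style triplet loss with minimum interval (margin) r.

    The loss is zero once the negative is at least r farther from the
    anchor than the positive is, measured in squared distance.
    """
    gap = squared_distance(f_anchor, f_positive) - squared_distance(f_anchor, f_negative)
    return max(gap + r, 0.0)
```

With an anchor at the origin, a positive at distance 1 and a negative at distance 3, the loss for r = 1.0 is already zero, whereas swapping the roles of the two samples yields a positive loss that training would reduce.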
In the training process, the model parameters of the encoding model, i.e., the above-mentioned θ1, θ2, θ3, θ4, θ5 and θ are updated using the value of the loss function during each iteration until training ending conditions, such as convergence of the value of the loss function, reaching of a preset number of iterations, or the like, are met.
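The iterate-until-done structure of training can be illustrated with a deliberately tiny stand-in. The sketch below replaces the embedding and fusion networks with a single 2 × 2 linear map W, uses one triplet, assumes a margin-based triplet loss, and updates W by plain gradient descent until the loss reaches a tolerance or a preset number of iterations is reached. All names, the learning rate and the stopping tolerance are assumptions; it is not the model of the present disclosure.

```python
def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def sq_norm(v):
    return sum(c * c for c in v)


def train_toy_encoder(anchor, positive, negative,
                      margin=1.0, lr=0.5, max_iters=200, tol=1e-9):
    """Gradient descent on a linear encoder f(x) = W x for one triplet."""
    w = [[1.0, 0.0], [0.0, 1.0]]          # start from the identity map
    dp = [a - p for a, p in zip(anchor, positive)]
    dn = [a - n for a, n in zip(anchor, negative)]
    history = []
    for _ in range(max_iters):             # ending condition: iteration cap
        wdp, wdn = matvec(w, dp), matvec(w, dn)
        loss = max(0.0, sq_norm(wdp) - sq_norm(wdn) + margin)
        history.append(loss)
        if loss <= tol:                     # ending condition: loss at tolerance
            break
        # Gradient of the active hinge w.r.t. W:
        # 2 (W dp) dp^T - 2 (W dn) dn^T
        for i in range(2):
            for j in range(2):
                grad = 2.0 * (wdp[i] * dp[j] - wdn[i] * dn[j])
                w[i][j] -= lr * grad
    return w, history
```

In a real implementation the same loop would update θ1 to θ5 and θ jointly, typically over mini-batches of triplets with an automatic-differentiation framework.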
At this point, the encoding model is obtained based on training and may be configured to encode the geographic location region.
The above steps will be described below in detail in conjunction with an embodiment.
First, the step 301 of determining a to-be-encoded geographic location region will be described in detail.
Geographic location regions pre-divided according to preset precision may be used as the to-be-encoded geographic location regions, and the encoding results are determined one by one. Or, one of the geographic location regions may be used as the to-be-encoded geographic location region to determine the encoding result.
In an actual usage scenario, there may exist the following situation. A geographic location coordinate of a user is acquired, the geographic location coordinate is used as input, and a geographic location region where the geographic location coordinate is located is determined as the to-be-encoded geographic location region.
In such a usage scenario, the encoding model may be used in real time to determine the encoding result of the geographic location region where the input geographic location coordinate is located. Or, the encoding results of the geographic location regions may be acquired in advance to be stored, and after the input geographic location coordinate is acquired, the encoding result of the geographic location region where the geographic location coordinate is located is determined by querying the stored encoding results of the geographic location regions.
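The two options, encoding in real time and querying stored results, can be combined in one small lookup structure, sketched below. The class name `RegionCodeStore` and the two callables it receives (`region_of`, which maps a coordinate to a region identifier, and `encode`, which stands in for the trained encoding model) are hypothetical.

```python
class RegionCodeStore:
    """Serve encoding results by coordinate, from storage when available."""

    def __init__(self, region_of, encode):
        self._region_of = region_of   # coordinate -> region identifier
        self._encode = encode         # region identifier -> encoding result
        self._codes = {}              # stored encoding results

    def precompute(self, regions):
        """Encode the given regions in advance and store the results."""
        for region in regions:
            self._codes[region] = self._encode(region)

    def code_for(self, coordinate):
        """Return the encoding result of the region containing `coordinate`.

        Falls back to real-time encoding when no stored result exists,
        and stores the result for later queries.
        """
        region = self._region_of(coordinate)
        if region not in self._codes:
            self._codes[region] = self._encode(region)
        return self._codes[region]
```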
For the step 302 of acquiring at least one kind of geographic function information and at least one kind of surface-feature distribution information of the geographic location region, reference may be made to the description in step 102 in the embodiment shown in
After the encoding model shown in
For the specific processing operations of each embedding network and the fusion network, reference may be made to the related description of the embodiment shown in
With the encoding method, the geographic location regions with similar geographic functions and surface-feature distributions have more similar encoding results.
The geographic location region encoded using the method in the above embodiment may be applied to various application scenarios. Just a few application scenarios are listed below:
For example, in a map-type application, all overpasses are required to be found, so as to perform an information check. Then, after a plot of the overpass is found, all other plots meeting a requirement of similarity to the encoding result of the plot may be found using similarity of encoding results. The found plots should also be the plots where the overpass is located in theory, and these plots are screened and verified; for example, if a shop-type POI appears on an overpass, the plot is obviously wrong.
For another example, a site of a fast food chain store is required to be selected, plots where branch stores with good business conditions are located may be determined, and then, plots meeting a certain requirement of similarity to these plots are found using similarity of encoding results, and the site is selected from these plots to set up a new branch store of the fast food chain store. Due to the similarity of the geographic functions and surface-feature distributions, the new branch store of the fast food chain store set up on the newly selected plot may also have a good business condition.
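Finding plots that meet a similarity requirement can be sketched as a ranked search over stored encoding results. The sketch assumes Euclidean distance as the similarity measure and a dictionary from region identifier to encoding result; the function name and the `max_distance` and `top_n` parameters are hypothetical.

```python
import math


def similar_regions(query_code, codes, max_distance=None, top_n=None):
    """Rank regions by distance between their encoding result and the query.

    codes: dict mapping region identifier -> encoding result (list of floats).
    Returns (region, distance) pairs, nearest first, optionally limited to
    a distance threshold and/or the first top_n entries. If the query
    region itself is in `codes`, it appears first with distance 0.
    """
    ranked = sorted(
        ((region, math.dist(query_code, code)) for region, code in codes.items()),
        key=lambda item: item[1],
    )
    if max_distance is not None:
        ranked = [item for item in ranked if item[1] <= max_distance]
    return ranked if top_n is None else ranked[:top_n]
```

For a large number of regions, the linear scan would normally be replaced by an approximate nearest-neighbor index; the ranking logic stays the same.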
For another example, when a user needs to determine a region with a specific geographic feature, the user may first select a region with the feature and take the region as a query region. An encoding result of the query region and encoding results of other geographic location regions are subjected to similarity calculation, so as to screen top N geographic location regions, wherein N is a preset positive integer. These screened geographic location regions also have the specific geographic feature. For example, a user wants to find a region with a residential region and a river, or a region where a residential region is close to a river. As shown in
A second application scenario: search recommendation is performed on a user based on the encoding result of the geographic location region where the user is located.
For example, when the user initiates a search, the geographic location region where the user is located during the search is acquired, and the encoding result of the geographic location region is used as one of input features to perform search recommendation. For example, when “ba” is input in an input box, a search term is recommended to the user in a form, such as a drop-down box, with the input of the user. If the user is located in a hotel in Beijing, scenic spots, such as “Badaling Great Wall”, or the like, may be recommended to the user preferably. If the user is located in the science and technology park, office buildings of technology companies, such as “Baidu Building”, or the like, may be recommended to the user preferably.
A third application scenario: a search result sort is performed on a user based on the encoding result of the geographic location region where the user is located.
For example, when the user initiates a search, the geographic location region where the user is located during the search is acquired, and the encoding result of the geographic location region is used as one of the input features to perform the search result sort. With this search result sorting method, the sort may be based on the geographic functions and surface-feature distributions of the geographic location regions. For example, for restaurant searches initiated in a software park in Beijing and in a software park in Chengdu, although the two parks are far apart geographically, their encoding results are highly similar due to the similarity of the geographic functions and surface-feature distributions. Restaurant search results based on these encoding results also have a certain similarity: for example, fast food is preferred.
For another example, a software park in Beijing is close to a finance street, but encoding results thereof are quite different. Therefore, the sort of the restaurant search results is also quite different. For example, users of the software park prefer fast food restaurants, and users of the finance street prefer western food restaurants.
The method according to the present disclosure is described above in detail, and an apparatus according to the present disclosure will be described below in detail in conjunction with an embodiment.
The acquiring unit 501 is configured to acquire training data, the training data including at least one triplet, and the triplet including an anchor sample, a positive sample and a negative sample of geographic location regions.
The training unit 502 is configured to train the encoding model using the training data, the encoding model performing the following operations on each sample: performing embedding on at least one kind of geographic function information and at least one kind of surface-feature distribution information of the sample, and fusing vector representations obtained by the embedding to obtain an encoding result of the sample.
The encoding model has training targets of minimizing a distance between the encoding result of the anchor sample and the encoding result of the positive sample in the triplet, and maximizing a distance between the encoding result of the anchor sample and the encoding result of the negative sample in the triplet.
The geographic function information may include at least one of: point of interest information, user information, or place query terms initiated at the geographic location region.
The surface-feature distribution information may include at least one of: a map image or a real-scene image.
Specifically, the acquiring unit 501 may acquire the training data using but not limited to the following methods:
A second method: from a navigation log, acquiring a geographic location region where a navigation starting point is located and a geographic location region where a navigation ending point is located as the anchor sample and the positive sample of the geographic location regions respectively, and selecting another geographic location region as the negative sample.
A third method: from a retrieval log, acquiring a geographic location region where an initiating location of retrieval is located and a geographic location region where a target location of the retrieval is located as the anchor sample and the positive sample of the geographic location regions respectively, and selecting another geographic location region as the negative sample.
The dividing unit 503 is configured to pre-divide the geographic location region according to preset precision.
As an implementable way, the encoding model may include at least two embedding networks and a fusion network.
The training unit 502 may input the at least one kind of geographic function information and the at least one kind of surface-feature distribution information extracted from the sample into the embedding networks respectively.
The embedding network is configured to perform embedding on the input information to obtain the corresponding vector representations.
The fusion network is configured to fuse the vector representations output by the embedding networks, so as to obtain the encoding result of the sample.
When training the encoding model, the training unit 502 iteratively updates model parameters of the embedding network and the fusion network according to values of a loss function, and the loss function is pre-constructed according to the training target.
The determining unit 601 is configured to determine a to-be-encoded geographic location region.
The acquiring unit 602 is configured to acquire at least one kind of geographic function information and at least one kind of surface-feature distribution information of the geographic location region.
The encoding unit 603 is configured to input the acquired geographic function information and the acquired surface-feature distribution information into an encoding model, the encoding model performing embedding on the geographic function information and the surface-feature distribution information, and fusing vector representations obtained by the embedding to obtain an encoding result of the geographic location region.
The geographic function information includes at least one of: point of interest information, user information, or place query terms initiated at the geographic location region.
The surface-feature distribution information includes at least one of: a base map image or a street scene image.
The dividing unit 604 is configured to pre-divide the geographic location region according to preset precision.
As an implementable way, the determining unit 601 acquires an input geographic location coordinate, and determines the geographic location region where the geographic location coordinate is located as the to-be-encoded geographic location region.
As another implementable way, the determining unit 601 may use each divided geographic location region as the to-be-encoded geographic location region.
The applying unit 605 is configured to determine similar geographic location regions using a distance between the encoding results of the geographic location regions; or, based on the encoding result of the geographic location region where a user is located, to perform search recommendation or a search result sort for the user.
The embodiments in the specification are described progressively, and mutual reference may be made to same and similar parts among the embodiments, and each embodiment focuses on differences from other embodiments. In particular, since the apparatus embodiment is substantially similar to the method embodiment, the description is relatively simple, and reference may be made to the corresponding description of the method embodiment for relevant points.
According to the embodiment of the present disclosure, there are also provided an electronic device, a readable storage medium and a computer program product.
As shown in
The plural components in the device 700 are connected to the I/O interface 705, and include: an input unit 706, such as a keyboard, a mouse, or the like; an output unit 707, such as various types of displays, speakers, or the like; the storage unit 708, such as a magnetic disk, an optical disk, or the like; and a communication unit 709, such as a network card, a modem, a wireless communication transceiver, or the like. The communication unit 709 allows the device 700 to exchange information/data with other devices through a computer network, such as the Internet, and/or various telecommunication networks.
The computing unit 701 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a central processing unit (CPU), a graphic processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, or the like. The computing unit 701 performs the methods and processing operations described above, such as the method for encoding a geographic location region and the method for establishing an encoding model. For example, in some embodiments, the method for encoding a geographic location region and the method for establishing an encoding model may be implemented as a computer software program tangibly contained in a machine readable medium, such as the storage unit 708.
In some embodiments, part or all of the computer program may be loaded and/or installed into the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the method for encoding a geographic location region and the method for establishing an encoding model described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the method for encoding a geographic location region and the method for establishing an encoding model by any other suitable means (for example, by means of firmware).
Various implementations of the systems and technologies described herein may be implemented in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGA), application specific integrated circuits (ASIC), application specific standard products (ASSP), systems on chips (SOC), complex programmable logic devices (CPLD), computer hardware, firmware, software, and/or combinations thereof. The systems and technologies may be implemented in one or more computer programs which are executable and/or interpretable on a programmable system including at least one programmable processor, and the programmable processor may be special or general, and may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input apparatus, and at least one output apparatus.
Program codes for implementing the method according to the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or a controller of a general purpose computer, a special purpose computer, or other programmable data processing apparatuses, such that the program code, when executed by the processor or the controller, causes functions/operations specified in the flowchart and/or the block diagram to be implemented. The program code may be executed entirely on a machine, partly on a machine, partly on a machine as a stand-alone software package and partly on a remote machine, or entirely on a remote machine or a server.
In the context of the present disclosure, the machine readable medium may be a tangible medium which may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine readable medium may be a machine readable signal medium or a machine readable storage medium. The machine readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of the machine readable storage medium may include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read only memory (ROM), an erasable programmable read only memory (EPROM or flash memory), an optical fiber, a portable compact disc read only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide interaction with a user, the systems and technologies described here may be implemented on a computer having: a display apparatus (for example, a cathode ray tube (CRT) or liquid crystal display (LCD) monitor) for displaying information to a user; and a keyboard and a pointing apparatus (for example, a mouse or a trackball) by which a user may provide input for the computer. Other kinds of apparatuses may also be used to provide interaction with a user; for example, feedback provided for a user may be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback); and input from a user may be received in any form (including acoustic, speech or tactile input).
The systems and technologies described here may be implemented in a computing system (for example, as a data server) which includes a back-end component, or a computing system (for example, an application server) which includes a middleware component, or a computing system (for example, a user computer having a graphical user interface or a web browser through which a user may interact with an implementation of the systems and technologies described here) which includes a front-end component, or a computing system which includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected through any form or medium of digital data communication (for example, a communication network). Examples of the communication network include: a local area network (LAN), a wide area network (WAN) and the Internet.
A computer system may include a client and a server. Generally, the client and the server are remote from each other and interact through the communication network. The relationship between the client and the server is generated by virtue of computer programs which run on respective computers and have a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system that overcomes the high management difficulty and weak service scalability of conventional physical hosts and virtual private server (VPS) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used and reordered, and steps may be added or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, which is not limited herein as long as the desired results of the technical solution disclosed in the present disclosure may be achieved.
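As a purely illustrative sketch of the point above, and not part of the disclosed method, the following Python fragment runs two independent steps sequentially in both orders and in parallel, and yields the same result each time. The step functions `normalize_name` and `bounding_box`, the `region` dictionary, and its field names are hypothetical placeholders assumed only for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical, mutually independent steps; the names are illustrative only.
def normalize_name(region):
    """Return the region name trimmed and lower-cased."""
    return region["name"].strip().lower()

def bounding_box(region):
    """Return (min_lon, min_lat, max_lon, max_lat) of the boundary points."""
    lons = [point[0] for point in region["boundary"]]
    lats = [point[1] for point in region["boundary"]]
    return (min(lons), min(lats), max(lons), max(lats))

STEPS = {"name": normalize_name, "bbox": bounding_box}

def run_sequentially(region, order):
    """Execute the steps one after another in the given order."""
    return {key: STEPS[key](region) for key in order}

def run_in_parallel(region):
    """Submit every step to a thread pool and collect the results."""
    with ThreadPoolExecutor() as pool:
        futures = {key: pool.submit(step, region) for key, step in STEPS.items()}
        return {key: future.result() for key, future in futures.items()}

# Assumed sample input: a name and a list of (longitude, latitude) points.
region = {
    "name": "  Example Block ",
    "boundary": [(116.30, 39.98), (116.32, 39.98), (116.32, 40.00)],
}

forward = run_sequentially(region, ["name", "bbox"])
reverse = run_sequentially(region, ["bbox", "name"])
parallel = run_in_parallel(region)
```

Because neither step depends on the output of the other, the three executions produce equal results; steps with data dependencies would need to keep their relative order.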
The above-mentioned implementations are not intended to limit the scope of the present disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made, depending on design requirements and other factors. Any modification, equivalent substitution or improvement made within the spirit and principle of the present disclosure should be included within the scope of protection of the present disclosure.
Number | Date | Country | Kind |
---|---|---|---|
202110565434.6 | May 2021 | CN | national |
The present application is a U.S. national phase of International Application No. PCT/CN2021/131177, filed on Nov. 17, 2021, which claims priority to Chinese Patent Application No. 202110565434.6, filed on May 24, 2021, entitled “Method and Apparatus for Encoding Geographic Location Region As Well As Method and Apparatus for Establishing Encoding Model”, which are hereby incorporated by reference in their entireties.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2021/131177 | 11/17/2021 | WO |