Existing geospatial data solutions require an extensive amount of time and computing power to clean and normalize data. Additionally, existing geospatial data solutions generally store data in silos across different organizations and applications. The siloed data used by existing systems cannot be efficiently merged and used across different applications.
It is with respect to these and other general considerations that embodiments have been described. Also, although relatively specific problems have been discussed, it should be understood that the embodiments should not be limited to solving the specific problems identified in the background.
Aspects of the present disclosure relate to a geospatial knowledge graph system that efficiently organizes, queries, and analyzes location-based information at various granularities, ranging from individual places to neighborhoods, cities, and beyond. The geospatial knowledge graph represents cells in a hierarchical grid system as nodes in a graph. These nodes are connected to other real-world entities including points of interest, people, and the movement of people across different locations. By including these additional entities as nodes, the geospatial knowledge graph provides a comprehensive and versatile framework for representing complex spatial interactions and dependencies.
This Summary is provided to introduce a selection of concepts in a simplified form, which are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Additional aspects, features, and/or advantages of examples will be set forth in part in the following description and, in part, will be apparent from the description, or may be learned by practice of the disclosure.
Non-limiting and non-exhaustive examples are described with reference to the following figures.
Various aspects of the disclosure are described more fully below with reference to the accompanying drawings, which form a part hereof, and which show specific example aspects. However, different aspects of the disclosure may be implemented in many ways and should not be construed as limited to the aspects set forth herein; rather, these aspects are provided so that this disclosure will be thorough and complete and will fully convey the scope of the aspects to those skilled in the art. Aspects may be practiced as methods, systems, or devices. Accordingly, aspects may take the form of a hardware implementation, an entirely software implementation or an implementation combining software and hardware aspects. The following detailed description is, therefore, not to be taken in a limiting sense.
Aspects of the present disclosure relate to a geospatial knowledge graph system that efficiently organizes, queries, and analyzes location-based information at various granularities, ranging from individual places to neighborhoods, cities, and beyond. The geospatial knowledge graph represents cells in a hierarchical grid system as nodes in a graph. These nodes are connected to other real-world entities (also represented as nodes in the graph) including points of interest (POIs), people, and the movement of people across different locations. By including these additional entities as nodes, the geospatial knowledge graph provides a comprehensive and versatile framework for representing complex spatial interactions and dependencies.
For example, a distributed network of mobile devices may be used to collect data across areas of different granularities (e.g., venues, neighborhoods, cities, states, countries, etc.). The mobile devices may be any type of device comprising network capabilities, location services, movement sensors, etc. Exemplary devices include, but are not limited to, smartphones, GPS devices, smartwatches, automobiles, airplanes, trains, etc. In examples, users may opt to include their devices in said distributed network, for example, by downloading an application which connects the device to the network. Upon installing said application and receiving permission to collect data, an individual device in the network may utilize onboard sensors to collect information that can be used by stop detection technology to determine whether the device stopped at a particular location, venue, POI, geographic region, etc. For example, sensor data in combination with location data, such as GPS, wireless network scan data, cellular data, etc., can be used to determine whether a device visited a particular location.
Aspects of the present disclosure provide an improved system for the acquisition, aggregation, storage, and analysis of geospatial data. In examples, geospatial data, or spatial data, represents features or objects on the Earth's surface (or other planetary body), including their locations, shapes, and attributes. In examples, the geospatial knowledge graph disclosed herein is operable to store any type of attributes associated with a feature or object, including, but not limited to, physical feature data, temporal data, temporary attributes, etc. Furthermore, the disclosed geospatial knowledge graph can be leveraged across a number of different technical areas. For example, the geospatial knowledge graph can be used to precisely track weather attributes across regions to show accurate weather forecasts at different levels of granularity, provide venue recommendations for different geographic areas, improve navigation applications, improve tracking of deliveries, optimize transportation routes across supply chains, etc.
Existing geospatial data solutions require an extensive amount of time and computing power to clean and normalize data. Additionally, existing geospatial data solutions generally store data in silos across different organizations and applications. The siloed data used by existing systems cannot be efficiently merged and used across different applications. As location data analysis requires highly specialized knowledge and skill sets, it is difficult to use the siloed data with different applications and across different technologies. Because of this, it is often difficult to consume and visualize geospatial data (e.g., generate user interfaces conveying geospatial data and associated attributes) across different applications, devices, and technologies. The geospatial knowledge graph disclosed herein addresses these shortcomings, among other limitations of existing technology, to provide an efficient and effective system for aggregating, storing, and processing geospatial data and various disparate types of data associated with geospatial regions.
As shown in
As depicted in
Different entities and different information can be encompassed in a cell at a particular resolution. For example, at the lowest resolution level, e.g., resolution level 1 cell 102 shown in hierarchical grid system 100, various entities, such as venues, points of interest, and people currently located in the cell, can be associated with a particular resolution level 1 cell 102. As the resolution level is increased, all or a subset of the data, depending on the type of data or type of query employed along with the hierarchical grid system 100, can be aggregated into the larger resolution cell. For example, all of the data or information (e.g., people, POIs, venues) of level 1 cells 102 can be aggregated into a level 2 cell 104. In examples, the subset may be determined based upon a type of application requesting data from the hierarchical grid system. For example, when aggregating resolution cells, a navigation system may select only a subset of data comprising traffic data, a venue system may select a subset of data comprising venue data, a weather application may select only a subset of data comprising weather information, etc.
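The aggregation described above can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the `APP_SUBSETS` mapping, the `aggregate_children` function, and the dictionary layout of a cell are all hypothetical names and structures chosen for the example.

```python
from collections import defaultdict

# Hypothetical mapping from application type to the data categories it needs.
APP_SUBSETS = {
    "navigation": {"traffic"},
    "venue": {"venues"},
    "weather": {"weather"},
}

def aggregate_children(child_cells, app_type=None):
    """Merge the records of level 1 cells into one level 2 record.

    child_cells: list of dicts mapping a category name to a list of items.
    app_type: optional key into APP_SUBSETS; None (or an unknown type)
              aggregates every category.
    """
    wanted = APP_SUBSETS.get(app_type)
    merged = defaultdict(list)
    for cell in child_cells:
        for category, items in cell.items():
            if wanted is None or category in wanted:
                merged[category].extend(items)
    return dict(merged)

level1_cells = [
    {"venues": ["cafe"], "traffic": ["slow on Main St"]},
    {"venues": ["museum", "park"], "weather": ["rain"]},
]
level2_all = aggregate_children(level1_cells)                # every category
level2_nav = aggregate_children(level1_cells, "navigation")  # traffic only
```

In this sketch the requesting application only changes which categories are carried up to the larger cell; the underlying level 1 data is left untouched.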
While aspects of the present disclosure have thus far been described as creating and using a hierarchical grid system, such as hierarchical grid system 100, the data or information associated with the different cells in a hierarchical grid system can be stored and operated upon using a geospatial knowledge graph. A geospatial knowledge graph may be employed not only to represent locations (e.g., cells in a hierarchical grid) but also to incorporate other real-world entities such as people, points of interest (POIs), events, and transactions. By including these additional entities as nodes and explicitly modeling their relationships as edges in a graph, the geospatial knowledge graph provides a comprehensive and versatile framework for representing complex spatial interactions and dependencies. Use of a geospatial knowledge graph provides a richer expression of the spatial dependencies and interactions between different entities, allowing for advanced querying and analysis that would be difficult to achieve in more traditional geospatial systems.
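One possible in-memory shape for such a graph is sketched below. The `KnowledgeGraph` class, the node identifiers, and the relation labels (`child_of`, `located_in`, `visited`) are assumptions made for illustration; the disclosure does not prescribe a particular data structure or labeling scheme.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraph:
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: list = field(default_factory=list)  # (source, relation, target)

    def add_node(self, node_id, kind, **attrs):
        self.nodes[node_id] = {"kind": kind, **attrs}

    def add_edge(self, source, relation, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking")
        self.edges.append((source, relation, target))

    def targets(self, node_id, relation):
        """Nodes reached from node_id over edges with the given label."""
        return [t for s, r, t in self.edges if s == node_id and r == relation]

    def sources(self, node_id, relation):
        """Nodes that point at node_id over edges with the given label."""
        return [s for s, r, t in self.edges if t == node_id and r == relation]

graph = KnowledgeGraph()
graph.add_node("cell:L1:a", "cell", resolution=1)
graph.add_node("cell:L2:A", "cell", resolution=2)
graph.add_node("poi:cafe", "poi", name="Corner Cafe")
graph.add_node("person:1", "person")
graph.add_edge("cell:L1:a", "child_of", "cell:L2:A")
graph.add_edge("poi:cafe", "located_in", "cell:L1:a")
graph.add_edge("person:1", "visited", "poi:cafe")

pois_in_cell = graph.sources("cell:L1:a", "located_in")
places_visited = graph.targets("person:1", "visited")
```

Because cells, POIs, and people are all ordinary nodes, a query such as "which POIs are in this cell" is a single edge lookup rather than a spatial join.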
The use of a geospatial knowledge graph 200 makes it possible to efficiently organize, aggregate, and perform operations on data associated with a geospatial application, as will be described further herein. Additionally, the use of the geospatial knowledge graph 200 allows for the efficient loading of data when users traverse a map, for example, when panning through adjacent cells at a similar level of resolution or zooming in or out on a map to load data from cells at different resolutions.
In further aspects, the different nodes in the geospatial knowledge graph 200 can include different types of information, based upon the type of entity being represented by a particular node. As an example,
The geospatial knowledge graph approach is inherently more flexible when it comes to integrating data from multiple sources, as it can easily accommodate different types of spatial entities and relationships. This is made possible by two factors: 1) any spatial dataset can be indexed based on the identities of the cells, and 2) information associated with a POI or entity, such as transactions or events, can be efficiently incorporated as nodes in the knowledge graph. This is an advantage when working with heterogeneous datasets or collaborating across different domains.
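The first factor, indexing arbitrary datasets by cell identity, can be illustrated with a short sketch. Note the assumptions: the `cell_id` function below uses a simple square latitude/longitude grid purely as a stand-in for the cell identifiers of the hierarchical grid system, and the dataset names and record fields are hypothetical.

```python
import math

def cell_id(lat, lon, size_deg=0.01):
    """Stand-in cell identifier: a square grid keyed by floored coordinates."""
    return (math.floor(lat / size_deg), math.floor(lon / size_deg))

def merge_by_cell(datasets):
    """Index every record of every dataset under the cell that contains it."""
    merged = {}
    for name, records in datasets.items():
        for rec in records:
            key = cell_id(rec["lat"], rec["lon"])
            merged.setdefault(key, {}).setdefault(name, []).append(rec)
    return merged

datasets = {
    "pois": [{"lat": 40.7128, "lon": -74.0060, "name": "cafe"}],
    "transactions": [
        {"lat": 40.7135, "lon": -74.0052, "amount": 12.5},
        {"lat": 40.7306, "lon": -73.9866, "amount": 30.0},
    ],
}
merged = merge_by_cell(datasets)
shared_cell = cell_id(40.7128, -74.0060)
```

Once both datasets share the cell key, records from unrelated sources that fall in the same cell sit side by side without any pairwise schema mapping.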
As previously discussed, among other benefits, the graph-based structure of the geospatial knowledge graph lends itself to advanced graph analytics techniques that can uncover hidden patterns, clusters, and relationships in the data. These techniques can be more efficient and powerful than traditional geospatial analysis methods. Because heterogeneous data and data types are stored in the graph-based structure, the data can be consumed and processed using different applications, algorithms, and machine learning models to identify patterns and hidden relationships across disparate data. As such, the disclosed geospatial system addresses the deficiencies of existing siloed systems and provides a unified geospatial system that can be utilized across different types of applications and technology areas.
While aspects of the disclosure have thus far related to using cell-based hierarchical grid systems in association with geospatial knowledge graphs, aspects of the present disclosure can further accommodate temporal data, thereby tracking changes in geographic characteristics, people, actions, entities, etc., over time. For example, a temporal dimension can be introduced to build upon the foundation of the geospatial knowledge graph. Among other benefits, the introduction of a temporal dimension allows for improved analysis and understanding of location-based data over time. By taking snapshots of specific entities at regular time intervals, this approach facilitates as-of analysis and longitudinal analysis, allowing users to comprehend changes over a period of time and analyze temporal patterns in spatial data. The time-aware geospatial knowledge graph may continue to employ a hierarchical grid-based framework, as depicted in
The graph-based representation along with a temporal dimension enables an understanding not only of how a specific entity has changed over time, but also of how other entities have influenced change in an analyzed entity. The graph-based representation coupled with the temporal dimension can show, for example, that changes in the demographics of the neighborhoods served by a store were the driving factor in an increase or decrease of foot traffic at that store, or can show changes in traffic patterns, changes in weather patterns over different periods of time, etc.
At operation 404, a snapshot is generated for the retrieved subset of hierarchical data and associated knowledge graph. For example, a copy of the retrieved data may be created and stored along with a temporal indicator indicating a time for the captured data. Flow continues to operation 406, wherein the copy of information is stored in the knowledge graph as a node in the graph. For example, the snapshot may be stored as a temporal node which includes the copied information. At operation 408, the stored snapshot may be related to other temporal snapshots for the same subset of data, for example, through the generation of relationship edges between a series of temporal snapshots generated for the same subset of data. In doing so, the temporal data can be efficiently tracked and operated on via the knowledge graph.
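Operations 404 through 408 can be sketched as follows. This is an illustrative sketch under stated assumptions: the `SnapshotStore` class, the `next_snapshot` edge label, and the `as_of` helper are hypothetical and are not named in the disclosure.

```python
import copy
from datetime import datetime, timezone

class SnapshotStore:
    def __init__(self):
        self.nodes = {}    # snapshot id -> {"subset", "taken_at", "data"}
        self.edges = []    # (earlier id, "next_snapshot", later id)
        self._latest = {}  # subset key -> most recent snapshot id

    def take_snapshot(self, subset_key, data, taken_at):
        snap_id = f"{subset_key}@{taken_at.isoformat()}"
        # Operations 404 and 406: copy the data and store it as a temporal
        # node together with its time indicator.
        self.nodes[snap_id] = {
            "subset": subset_key,
            "taken_at": taken_at,
            "data": copy.deepcopy(data),
        }
        # Operation 408: relate the new snapshot to the previous snapshot
        # of the same subset with an edge.
        previous = self._latest.get(subset_key)
        if previous is not None:
            self.edges.append((previous, "next_snapshot", snap_id))
        self._latest[subset_key] = snap_id
        return snap_id

    def as_of(self, subset_key, when):
        """Data of the latest snapshot taken at or before `when`, else None."""
        candidates = [
            n for n in self.nodes.values()
            if n["subset"] == subset_key and n["taken_at"] <= when
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda n: n["taken_at"])["data"]

store = SnapshotStore()
live = {"foot_traffic": 120}
first = store.take_snapshot("store:42", live, datetime(2023, 1, 1, tzinfo=timezone.utc))
live["foot_traffic"] = 180
second = store.take_snapshot("store:42", live, datetime(2023, 2, 1, tzinfo=timezone.utc))
january_view = store.as_of("store:42", datetime(2023, 1, 15, tzinfo=timezone.utc))
```

The deep copy matters in this sketch: later changes to the live data do not alter an earlier snapshot, which is what makes an as-of query meaningful.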
One of skill in the art will appreciate that any number of subsets for any level of granularity of data may be generated and stored, thereby allowing for the tracking of changes across time for different entities of data stored in the knowledge graph. For example, the level of granularity tracked temporally may be as small as a single entity (e.g., a venue or person). Further, the type of changes tracked temporally can be specified, stored, and later recalled or analyzed using aspects of the present disclosure, such as the change in a road layout over time, the changes of demographics of a neighborhood, gentrification changes, foot traffic for a store, person movement through a region, etc. In doing so, aspects of the present disclosure provide a highly efficient solution (e.g., in terms of storage of data and loading of data) to track various levels of granularity over a temporal dimension.
The use of a hierarchical grid system, such as the exemplary systems disclosed herein, as the foundation for the geospatial knowledge graph provides a consistent and scalable method for representing and managing large amounts of spatial data, which can be challenging in traditional geospatial systems. This enables efficient storage and querying of location-based information at different granularities. Further, aspects disclosed herein provide for dynamically computing information that can be attached to location nodes in the geospatial knowledge graph based on information attached to other nodes. Exemplary types of dynamic computations include summarization functions, rollup functions, and breakdown functions.
At operation 502, a focus of the summarization is determined. For example, the focus may be determined based upon the type of query or analysis received that initiated the summarization function. Continuing with the example noted above, the focus may be computing the number of POIs in the identified cells. The determined focus can be used to aggregate the information from the underlying geospatial knowledge graph at operation 506. For example, all of the POI information of the associated cells can be aggregated, based upon the determined focus, and provided to a summarization function. The summarization function is executed on the aggregated data to provide a summarization. For example, continuing with the example above, the number of POIs may be determined by the summarization function. Summarization functions, however, are not limited to mathematical functions. For example, the summarization function may relate to the generation of specific types of information about an entity or area, such as the number of visitors associated with the entity or area, a description of various entities, etc. One of skill in the art will appreciate that any number of different summarization functions may be practiced with the aspects disclosed herein. At operation 508, the summarized information is provided, for example, for display on a user interface in response to a query, provided to another function for further analysis, or stored for later use.
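The summarization flow above can be sketched in a few lines. The `summarize` function, its `summarizer` parameter, and the cell dictionaries are hypothetical names used only for this example; the default summarizer simply counts items, matching the POI-count example.

```python
def summarize(cells, focus, summarizer=len):
    """Aggregate the `focus` attribute across cells, then summarize it.

    cells: list of dicts mapping an attribute name to a list of values.
    focus: the attribute selected by the query (e.g., "pois").
    summarizer: any callable applied to the aggregated list.
    """
    aggregated = []
    for cell in cells:
        aggregated.extend(cell.get(focus, []))
    return summarizer(aggregated)

cells = [
    {"pois": ["cafe", "gym"], "visitors": [12, 30]},
    {"pois": ["museum"], "visitors": [45]},
    {"visitors": [5]},
]
poi_count = summarize(cells, "pois")
total_visitors = summarize(cells, "visitors", summarizer=sum)
# A non-mathematical summarization: a textual description of the POIs.
poi_description = summarize(
    cells, "pois", summarizer=lambda items: ", ".join(sorted(items))
)
```

Passing the summarizer as a callable keeps the aggregation step independent of whether the result is a count, a total, or a generated description.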
Another example of a dynamic function may be a rollup function. Rollup functions accept information tied to child nodes of a cell to compute a value that can be tied to the parent node of a cell. For instance, the number of POIs tied to a cell node at resolution 2 can be computed from the count of POIs tied to all the child cells (e.g., cells at resolution 1).
At operation 604, data of interest for the selected cell is determined. In one example, the data of interest may be determined based upon a specific query, such as the number of venues located within the cell, the number of points of interest, demographic information for the cell, traffic information for the cell, weather information, etc. Alternatively, or additionally, the data of interest may be inferred, for example, based upon the type of application selecting the cell, the user selecting the cell, etc.
Flow continues to operation 606, where the parent node associated with the selected cell is identified in an associated knowledge graph. Upon identifying the parent node, child nodes associated with the parent node are identified at operation 608. In one example, all child nodes associated with the parent node may be identified. Alternatively, in another example, only child nodes having the data of interest may be identified at operation 608. Upon identifying the child nodes, flow continues to operation 610 where the data of interest from, or associated with, the identified child nodes are aggregated and associated with the parent node, thereby rolling up the data of interest into the parent node. The aggregated data of interest may be stored with the parent node for future recall.
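Operations 606 through 610 can be sketched as follows. The `rollup` function, the `children_of` adjacency mapping, and the `node_data` store are assumptions made for illustration, and summation is used as the aggregation only because it fits the POI-count example.

```python
def rollup(parent_id, children_of, node_data, key):
    """Sum `key` over the children of parent_id and store it on the parent.

    Only child nodes that actually carry the data of interest contribute,
    mirroring the alternative described for operation 608.
    """
    children = [
        c for c in children_of.get(parent_id, [])
        if key in node_data.get(c, {})
    ]
    total = sum(node_data[c][key] for c in children)
    # Operation 610: associate the aggregate with the parent for later recall.
    node_data.setdefault(parent_id, {})[key] = total
    return total

children_of = {"L2:A": ["L1:a", "L1:b", "L1:c"]}
node_data = {
    "L1:a": {"poi_count": 4},
    "L1:b": {"poi_count": 7},
    "L1:c": {"population": 900},  # no POI data, so it is skipped
}
rolled = rollup("L2:A", children_of, node_data, "poi_count")
```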
Yet another exemplary dynamic function that can be employed using the geospatial knowledge graph may be a breakdown function. Breakdown functions use data attached to a specific cell and its child cells to distribute a value tied to the parent cell to each of its child cells. A breakdown could be as simple as disaggregating the population information captured at a selected resolution cell to lower resolution cells by using a uniform distribution, or by using other information tied to cells at the selected resolution, such as whether there is a road, building, or railway track contained within that cell, to perform a heuristic-based distribution.
At operation 708, a distribution is determined for the one or more child nodes. The distribution may be a weighting factor which identifies how much of the data from the parent node should be attributed to a particular child node. In many instances, it will not make sense to evenly divide the data between child nodes. For example, when dividing up the total population, a child node that includes a number of apartment buildings should be distributed a higher number of people than a node that covers a field. As such, at operation 708, a weighting factor may be determined for a particular child node based upon the type of data being distributed and the characteristics of the particular child node.
At operation 710, an amount of the data being broken down is assigned to the various child nodes based upon the weights associated with the child nodes. The breakdown data may be associated with the child node for future recall.
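The weighted distribution of operations 708 and 710 can be sketched as a proportional split. The `breakdown` function and the example weights are hypothetical; how weights are derived from cell characteristics is left open here, and the uniform fallback for all-zero weights is an assumption of this sketch.

```python
def breakdown(total, child_weights):
    """Split `total` across children in proportion to non-negative weights.

    Falls back to a uniform split when every weight is zero.
    """
    if not child_weights:
        return {}
    weight_sum = sum(child_weights.values())
    if weight_sum == 0:
        share = total / len(child_weights)
        return {child: share for child in child_weights}
    return {
        child: total * weight / weight_sum
        for child, weight in child_weights.items()
    }

# Hypothetical weights: more residential buildings means a larger share.
weights = {"L1:apartments": 8, "L1:houses": 2, "L1:field": 0}
population = breakdown(1000, weights)
uniform = breakdown(900, {"L1:a": 0, "L1:b": 0, "L1:c": 0})
```

With these weights the cell containing apartment buildings receives most of the parent's population and the field receives none, while the assigned amounts still sum to the parent total.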
Additional aspects disclosed herein enable the derivation of insights at different location granularities, including at the Place, Neighborhood, mall/megaplex, and airport levels. At the Place level, the system allows for the understanding of the demographics of the visitors of the Place. At the Neighborhood level, the system allows for the understanding of the popular places visited by the people living in that Neighborhood. At the mall/megaplex level, the system allows for the understanding of the busy hours, the composition of the stores within that mall, and the locations within the mall where people are spending most of their time. At the airport level, the system allows for the understanding of the ingress/egress points of the airport, the busy hours, and the congested points within the airport at different times of the day. This system provides a powerful tool for analyzing and understanding spatial data, enabling users to gain insights into the movements and behaviors of people in the real world.
Further aspects allow for the derivation of insights about locations on different time scales. This system represents a hexagonal grid system as a hierarchical spatial graph and represents the real-world movements of users across places as a data graph. By overlaying the data graph on the hexagonal spatial graph, this system allows for efficient querying and analysis of spatial data at different levels of granularity and on different time scales. For example, the system can provide insights into the busy hours of a Place within a day, the busy days within a week, and the seasonal patterns within a year. Additionally, the system can provide insights into the travel patterns of people living within a specific neighborhood on a weekly basis, as well as the migration patterns of people across cities or states.
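The time-scale insights mentioned above, such as the busy hours of a Place, reduce to grouping visit timestamps by a chosen bucket. The sketch below assumes visits are available as a list of timestamps for one Place; the `busiest` function and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime

def busiest(visits, bucket):
    """Return the most frequent bucket value among the visit timestamps."""
    counts = Counter(bucket(ts) for ts in visits)
    return counts.most_common(1)[0][0] if counts else None

visits = [
    datetime(2023, 5, 1, 9),
    datetime(2023, 5, 1, 12),
    datetime(2023, 5, 2, 12),
    datetime(2023, 5, 6, 12),
    datetime(2023, 5, 8, 18),
]
busiest_hour = busiest(visits, lambda ts: ts.hour)          # hour of day
busiest_weekday = busiest(visits, lambda ts: ts.weekday())  # 0 is Monday
busiest_month = busiest(visits, lambda ts: ts.month)        # seasonal scale
```

Changing only the bucket function moves the same query between the daily, weekly, and yearly time scales.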
In its most basic configuration, the operating environment 800 typically includes at least one processing unit 802 and memory 804. Depending on the exact configuration and type of computing device, memory 804 (storing instructions to perform the aspects disclosed herein) may be volatile (such as RAM), non-volatile (such as ROM, flash memory, etc.), or some combination of the two. This most basic configuration is illustrated in
Operating environment 800 typically includes at least some form of computer readable media. Computer readable media can be any available media that can be accessed by the at least one processing unit 802 or other devices comprising the operating environment. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, or other data. Computer storage media includes RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other tangible, non-transitory medium which can be used to store the desired information. Computer storage media does not include communication media. Computer storage media does not include a carrier wave or other propagated or modulated data signal.
Communication media embodies computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media.
The operating environment 800 may be a single computer operating in a networked environment using logical connections to one or more remote computers. The remote computer may be a personal computer, a server, a router, a network PC, a peer device, or other common network node, and typically includes many or all of the elements described above as well as others not so mentioned. The logical connections may include any method supported by available communications media. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.
The description and illustration of one or more aspects provided in this application are not intended to limit or restrict the scope of the disclosure as claimed in any way. The aspects, examples, and details provided in this application are considered sufficient to convey possession and enable others to make and use the best mode of claimed disclosure. The claimed disclosure should not be construed as being limited to any aspect, example, or detail provided in this application. Regardless of whether shown and described in combination or separately, the various features (both structural and methodological) are intended to be selectively included or omitted to produce an embodiment with a particular set of features. Having been provided with the description and illustration of the present application, one skilled in the art may envision variations, modifications, and alternate aspects falling within the spirit of the broader aspects of the general inventive concept embodied in this application that do not depart from the broader scope of the claimed disclosure.
Additional aspects of the present disclosure are provided in the accompanying Appendix.
Number | Date | Country
---|---|---
63465622 | May 2023 | US