The advent of massive networked computing resources has enabled virtually unlimited data collection, storage and analysis for projects such as low-cost genome sequencing, high-precision molecular dynamics simulations and high-definition imaging data for radiology, to name just a few examples. The resulting large, complex datasets, known as “big data”, can be difficult or impossible to process using database management software running on a single computer. Big data are becoming increasingly present in many aspects of society and technology, including health care, science, industry and government. Many of these large, complex data sets are best understood when analyzed in a structured manner.
One such structured manner is to use an ontology for a data set, which is a structured representation of the data in that data set. Although not new per se, the use of ontologies is growing with modern computer technologies. For example, the semantic web is a compelling, yet nascent and underdeveloped, example of the use of ontologies for data sets. The paradigms of big data and ontologies are likely to become more important. These paradigms have worked well together, such as in the field of visual analytics, which uses interactive visual techniques to explore big data.
Ontologies also enable formal analysis, which helps ensure semantic correctness and interoperability and can bring much-needed insight. Ontologies can be applied to complex, multi-dimensional, and/or large data sets. However, the development of data-specific, formal ontologies can be very difficult.
In one aspect, a method is provided. A computing device receives data from one or more data sources. The computing device generates a data frame based on the received data. The data frame includes a plurality of data items. The computing device determines a data ontology, where the data ontology includes a plurality of datanodes. The computing device determines a plurality of data pins. A first data pin of the plurality of data pins includes a first reference and a second reference. The first reference for the first data pin refers to a first data item in the data frame and the second reference for the first data pin refers to a first datanode of the plurality of datanodes. The first datanode is related to the first data item. The computing device obtains data for the first data item at the first datanode of the data ontology via the first data pin. The computing device provides a representation of the data ontology.
In another aspect, a computing device is provided. The computing device includes a processor and a tangible computer readable medium. The tangible computer readable medium is configured to store at least executable instructions. The executable instructions, when executed by the processor, cause the computing device to perform functions including: receiving data from one or more data sources; generating a data frame based on the received data, the data frame including a plurality of data items; determining a data ontology, where the data ontology includes a plurality of datanodes; determining a plurality of data pins, where a first data pin of the plurality of data pins includes a first reference and a second reference, where the first reference for the first data pin refers to a first data item in the data frame, where the second reference for the first data pin refers to a first datanode of the plurality of datanodes, and where the first datanode is related to the first data item; obtaining data for the first data item at the first datanode of the data ontology via the first data pin; and providing a representation of the data ontology.
In another aspect, a tangible computer readable medium is provided. The tangible computer readable medium is configured to store at least executable instructions. The executable instructions, when executed by a processor of a computing device, cause the computing device to perform functions including: receiving data from one or more data sources; generating a data frame based on the received data, the data frame including a plurality of data items; determining a data ontology, where the data ontology includes a plurality of datanodes; determining a plurality of data pins, where a first data pin of the plurality of data pins includes a first reference and a second reference, where the first reference for the first data pin refers to a first data item in the data frame, where the second reference for the first data pin refers to a first datanode of the plurality of datanodes, and where the first datanode is related to the first data item; obtaining data for the first data item at the first datanode of the data ontology via the first data pin; and providing a representation of the data ontology.
In another aspect, a device is provided. The device includes means for receiving data from one or more data sources; means for generating a data frame based on the received data, the data frame including a plurality of data items; means for determining a data ontology, where the data ontology includes a plurality of datanodes; means for determining a plurality of data pins, where a first data pin of the plurality of data pins includes a first reference and a second reference, where the first reference for the first data pin refers to a first data item in the data frame, where the second reference for the first data pin refers to a first datanode of the plurality of datanodes, and where the first datanode is related to the first data item; means for obtaining data for the first data item at the first datanode of the data ontology via the first data pin; and means for providing a representation of the data ontology.
Many modern large-scale projects, such as scientific investigations for bioinformatics research, are generating big data. The explosion of big data is changing traditional scientific methods; instead of relying on experiments to output relatively small targeted datasets, data mining techniques are being used to analyze data stores with the intent of learning from the data patterns themselves. Data analysis and integration in large data storage environments can challenge even experienced scientists.
Many of these large datasets are complex, heterogeneous, and/or incomplete. Most existing domain-specific tools designed for complex heterogeneous datasets are not equipped to visually analyze big data. For example, while powerful scientific toolsets are available, including software libraries such as SciPy, specialized visualization tools such as Chimera, and scientific workflow tools such as Taverna, Galaxy, and the Visualization Toolkit (VTK), some toolsets cannot handle large datasets. Other toolkits have not been updated to handle recent advances in data generation and acquisition.
DIVE (Data Intensive Visualization Engine) was designed and developed to help fill this technological gap. DIVE includes a software framework intended to facilitate analysis of big data and reduce the time to derive insights from the big data. DIVE employs an interactive, extensible, and adaptable data pipeline to apply visual analytics approaches to heterogeneous, high-dimensional datasets. Visual analytics is a big data exploration methodology emphasizing the iterative process between human intuition, computational analyses and visualization. DIVE's visual analytics approach integrates with traditional methods, creating an environment that supports data exploration and discovery.
DIVE provides a rich ontologically expressive data representation and a flexible modular streaming-data architecture or pipeline. The DIVE pipeline is accessible to users and software applications through an application programming interface, command line interface or graphical user interface. Applications built on the DIVE framework inherit features such as a serialization infrastructure, ubiquitous scripting, integrated multithreading and parallelization, object-oriented data manipulation and multiple modules for data analysis and visualization. DIVE can also interoperate with existing analysis tools to supplement its capabilities by either exporting data into known formats or by integrating with published software libraries. Furthermore, DIVE can import compiled software libraries and automatically build native ontological data representations, reducing the need to write DIVE-specific software. From a data perspective, DIVE supports the joining of multiple heterogeneous data sources, creating an object-oriented database capable of showing inter-domain relationships.
A core feature of DIVE's framework is the flexible graph-based data representation. DIVE data are stored as datanodes in a strongly typed ontological network defined by the data. These data can range from a set of unordered numbers to a complex object hierarchy with inheritance and well-defined relationships. Datanodes are software objects that can update both their values and structures at runtime. Furthermore, the datanodes' ontological context can change during runtime. So, DIVE can explore dynamic data sources and handle the impromptu user interactions commonly required for visual analysis.
Data flow through the system explicitly as a set of datanodes passed down the DIVE pipeline or implicitly as information transferred and transformed through the data relationships. Data from any domain may enter the DIVE pipeline, allowing DIVE to operate on a wide variety of datasets, such as, but not limited to, protein simulations, gene ontology, professional baseball statistics, and streaming sensor data.
Besides simply representing the conceptual structure of the user's dataset, DIVE's graph-based data representation can effectively organize data. For example, using DIVE's object model, ontologies from disparate sources can be merged. Each ontology can be represented as DIVE datanodes and dataedges. Then, the ontologies can be merged through property inheritance. This allows ontologies to inherit definitions from each other, resulting in a new, merged ontology compatible with multiple data sources and amenable to new analytical approaches.
DIVE includes a DIVE object parser with the ability to parse a .NET object or assembly distinct from the DIVE framework. Use of the DIVE object parser can circumvent addition of DIVE-specific code to existing programs. Further, the DIVE object parser can augment those programs with DIVE capabilities such as graphical interaction and manipulation. In one example (the Dynameomics API), the underlying data structures and the streaming functionality were integrated into a Protein Dashboard tool using the DIVE object parser without modifying the existing API code base, enabling reuse of the same code base in the DIVE framework and in Structured Query Language (SQL) Common Language Runtime implementations and other non-DIVE utilities.
DIVE supports two general techniques for data streaming: interactive SQL and pass-through SQL. Interactive SQL can effectively provide a flexible visualization frontend for an SQL database or data warehouse. However, for datasets not immediately described by the underlying database schema or other data source, the pass-through SQL approach can be used to stream complex data structures. The pass-through SQL approach can enable work with very large-scale datasets. For example, the pass-through SQL approach allowed DIVE to make hundreds of terabytes of structured data immediately accessible to users in a Dynameomics case study. These data can be streamed into datanodes and can be accessed either directly or indirectly through the associated ontology (for example, through property inheritance). Furthermore, these data are preemptively loaded via background threads into backing stores; these backing stores are populated using efficient bulk transfer techniques and predictively cache data for user consumption.
Finally, when the object parser is used with pass-through SQL, methods as well as data are parsed. So, the datanodes can access native .NET functionality in addition to the streaming data. Preexisting programs can also benefit from DIVE's streaming capabilities. For example, Chimera can open a network socket to DIVE's streaming module, which lets Chimera stream molecular dynamics (MD) data directly from the Dynameomics data warehouse.
Overall, DIVE provides an interactive data-exploration framework that expands on conventional analysis paradigms and self-contained tools. DIVE can adapt to existing data representations, consume non-DIVE software libraries and import data from an array of sources. As research becomes more data-driven, fast, flexible big data visual analytics solutions, such as the herein-described DIVE, can provide a new perspective for projects using large, complex data sets.
Interaction can be provided by DIVE system 100 providing visual analytics and/or other tools for exploration of data from data sources 110. Interoperability can be provided by DIVE system 100 providing data obtained from data sources 110 in a variety of formats to DIVE plug-ins, associated applications, and DIVE tools.
These plug-ins, applications, and tools can be organized via the data pipeline. As one example, a DIVE tool can start a DIVE pipeline to convert data in a data frame into an ontological representation using a first DIVE plug-in, an application can generate renderable data from the ontological representation, and then a second DIVE plug-in can enable interaction with the renderable data.
The DIVE pipeline can be used to arrange components in a sequence of pipeline stages. An example three-stage DIVE pipeline using the above-mentioned components can include:
Stage 1—the first DIVE plug-in receives data from data sources 110, generates corresponding ontological representations, and outputs the ontological representations onto the pipeline.
Stage 2—the application receives the ontological representations as inputs via the pipeline, generates renderable data, and outputs the renderable data onto the pipeline.
Stage 3—the second DIVE plug-in can receive the renderable data via the pipeline and present the renderable data for interaction.
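The following sketch illustrates how such a staged pipeline could be composed in C#. It is a minimal illustration only; the interface and class names (IPipelineStage, Pipeline, and the plug-in classes named in the usage comment) are assumptions for this example and not the published DIVE API.

    using System.Collections.Generic;

    interface IPipelineStage
    {
        // Each stage consumes the upstream stage's output and produces new output.
        IEnumerable<object> Process(IEnumerable<object> input);
    }

    class Pipeline
    {
        private readonly List<IPipelineStage> stages = new List<IPipelineStage>();

        public Pipeline Add(IPipelineStage stage) { stages.Add(stage); return this; }

        public IEnumerable<object> Run(IEnumerable<object> sourceData)
        {
            IEnumerable<object> current = sourceData;
            foreach (var stage in stages)
                current = stage.Process(current);   // pass data down the pipeline
            return current;
        }
    }

    // Illustrative usage mirroring the three stages above:
    // var pipeline = new Pipeline()
    //     .Add(new OntologyBuilderPlugin())       // Stage 1: data -> ontological representation
    //     .Add(new RenderableDataApplication())   // Stage 2: ontology -> renderable data
    //     .Add(new InteractionPlugin());          // Stage 3: present renderable data for interaction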
Additional DIVE pipeline examples are possible as well—some of these additional examples are discussed herein.
DIVE is domain independent and data agnostic. The DIVE pipeline accepts data from any domain, provided an appropriate input parser is implemented. Some example data formats supported by DIVE include, but are not limited to, SQL, XML, comma- and tab-delimited files, and several other standard file formats. In some embodiments, DIVE can utilize functionality from an underlying software infrastructure, such as a UNIX™-based system or the .NET environment.
Ontologies are gaining popularity as a powerful way to organize data. DIVE system 100's core data representation using datanodes and dataedges was developed with ontologies in mind. The fundamental data unit in DIVE is the datanode, where datanodes can be linked using dataedges.
Datanodes somewhat resemble traditional object instances from object-oriented (OO) languages such as C++, Java, or C#. For example, datanodes are typed, contain strongly typed properties and methods, and can exist in an inheritance hierarchy. Datanodes extend the traditional model of object instances, as datanodes can exist outside of an OO environment; e.g., in an ontological network or graph, and can have multiple relationships beyond simple type inheritance. DIVE system 100 implements these relationships between datanodes using dataedges to link related datanodes. Dataedges can be implemented by datanode objects and consequently might contain properties, methods, and inheritance hierarchies. Because of this basic flexibility, DIVE system 100 can represent arbitrary, typed relationships between objects, objects and relationships, and relationships and relationships.
Datanodes are also dynamic; every method and property can be altered at runtime, adding flexibility to DIVE system 100. The DIVE pipeline contains various data integrity mechanisms to prevent unwanted side effects. The inheritance model is also dynamic; as a result, objects can gain and lose type qualification and other inheritance aspects at runtime. This allows runtime classification schemes such as clustering to be integrated into the object model.
Datanodes of DIVE system 100 provide virtual properties. These properties are accessed identically to fixed properties but store and recover their values through arbitrary code instead of storing data on the datanode object. Virtual properties can extend the original software architecture's functionality, e.g., to allow data manipulation.
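A minimal C# sketch of these ideas follows, assuming hypothetical Datanode and Dataedge classes; it shows dynamic properties, a virtual property backed by arbitrary code, and dataedges that are themselves propertied objects. It is illustrative only and does not reproduce DIVE's actual object model.

    using System;
    using System.Collections.Generic;

    class Datanode
    {
        public string Type { get; set; }
        public List<Dataedge> Edges { get; } = new List<Dataedge>();

        private readonly Dictionary<string, object> fixedProps = new Dictionary<string, object>();
        private readonly Dictionary<string, Func<object>> virtualProps = new Dictionary<string, Func<object>>();

        public void SetProperty(string name, object value) => fixedProps[name] = value;

        // A virtual property stores no value on the datanode; it runs arbitrary code on access.
        public void SetVirtualProperty(string name, Func<object> getter) => virtualProps[name] = getter;

        // Fixed and virtual properties are accessed identically.
        public object GetProperty(string name) =>
            virtualProps.TryGetValue(name, out var getter) ? getter() : fixedProps[name];
    }

    // A dataedge is implemented as a datanode, so relationships can carry their own
    // properties, methods, and inheritance hierarchies.
    class Dataedge : Datanode
    {
        public Datanode Source { get; set; }
        public Datanode Target { get; set; }
        public string Relationship { get; set; }   // e.g., "is-a", "contains", "part-of"
    }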
Dataedges can be used to implement multiple inheritance models. Besides the traditional is-a relationship in object-oriented (OO) languages, ontological relationships such as contains, part-of, and bounded-by can be expressed. Each of these relationships can support varying levels of inheritance:
Like OO language objects, property-inheritance subclasses can override superclass methods and properties with arbitrary transformations. Similarly, type-inheritance subclasses can be cast to superclass types. Because DIVE system 100 supports not only multiple inheritance but also multiple kinds of inheritance, casting can involve traversing the dataedge ontology. Owing to the coupling of the underlying data structure and ontological representation, every datanode and dataedge is implicitly part of a system-wide graph. Then, graph-theoretical methods can be applied to analyze both the data structures and ontologies represented in DIVE system 100. This graph-theoretical approach has already proved useful in some examples; e.g., application of DIVE system 100 to structural biology.
DIVE system 100 supports code and tool reuse. Because all data are represented by datanodes and dataedges, DIVE analysis modules are presented with a syntactically homogeneous dataset. Owing to this data-type independence, modules can be connected so long as the analyzed datanodes have the expected properties, methods, or types.
Data-type handling is a challenge in modular architectures. For example, Taverna uses typing in the style of MIME (Multipurpose Internet Mail Extensions), the VTK uses strongly typed classes, and Python-based tools, such as Biopython and SciPy, often use Python's dynamic typing.
For DIVE system 100, the datanode and dataedge ontological network is a useful blend of these approaches. The dynamic typing of individual datanodes and dataedges allows arbitrary type networks to be built from raw data sources. The underlying strong typing of the actual data (doubles, strings, objects, and so on) facilitates parallel processing, optimized script compilation, and fast, non-interpreted handling for operations such as filtering and plotting. Datanodes and dataedges themselves can be strongly typed objects to facilitate programmatic manipulation of the dataflow itself. Although each typing approach has its strengths, the typing used by DIVE system 100 lends itself to fast, agile data exploration and updating of DIVE tools. The datanode objects' homogeneity also simplifies the basic pipeline and module development. Fast tool updating is a particularly useful feature in an academic laboratory, where multiple research foci, a varied spectrum of technical expertise, and high turnover are all common.
Data can be imported into DIVE system 100 to make the data accessible to the DIVE pipeline. In some cases, DIVE system 100 includes built-in functionality for importing data. For tabular data or SQL data tables, DIVE system 100 can construct one datanode per row, where each datanode has one property per column. DIVE also supports obtaining data from Web services such as the Protein Data Bank. Once DIVE obtains the data for datanodes, DIVE can establish relationships between datanodes using dataedges.
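A short sketch of this row-to-datanode convention, assuming the illustrative Datanode class above and .NET's System.Data types, might look as follows; it is not DIVE's actual importer.

    using System.Collections.Generic;
    using System.Data;

    static class TableImportSketch
    {
        public static List<Datanode> ImportTable(DataTable table)
        {
            var nodes = new List<Datanode>();
            foreach (DataRow row in table.Rows)
            {
                // One datanode per row...
                var node = new Datanode { Type = table.TableName };
                foreach (DataColumn column in table.Columns)
                    node.SetProperty(column.ColumnName, row[column]);   // ...one property per column
                nodes.Add(node);
            }
            return nodes;
        }
    }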
The DIVE pipeline can utilize plug-ins to create, consume, or transform data. A plug-in can include a compiled software library whose objects inherit from a published interface to the DIVE pipeline. Plug-ins can move data through “pins” much like an integrated circuit: data originate at an upstream source pin and are consumed by one or more downstream sink pins. Plug-ins can also move data by broadcasting and receiving events. Users can save pipeline topologies and states as saved pipelines and also share saved pipelines. DIVE system 100 can provide subsequent plug-in connectivity, pipeline instantiation, scripting, user interfaces, and other aspects of plug-in functionality.
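The pin model could be sketched as follows, where data are pushed from a source pin to any connected sink pins; the names SourcePin, SinkPin, and IDivePlugin are assumptions for illustration rather than the published DIVE plug-in interface.

    using System;
    using System.Collections.Generic;

    class SourcePin<T>
    {
        private readonly List<Action<T>> connectedSinks = new List<Action<T>>();

        public void ConnectTo(SinkPin<T> sink) => connectedSinks.Add(sink.Receive);

        // Data originate here and are pushed to every connected downstream sink.
        public void Send(T data) { foreach (var receive in connectedSinks) receive(data); }
    }

    class SinkPin<T>
    {
        public event Action<T> DataArrived;
        public void Receive(T data) => DataArrived?.Invoke(data);
    }

    interface IDivePlugin
    {
        IEnumerable<object> SourcePins { get; }   // outputs offered to downstream plug-ins
        IEnumerable<object> SinkPins { get; }     // inputs consumed from upstream plug-ins
    }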
When DIVE system 100 sends a datanode object through a branching, multilevel transform pipeline, the correctness of the datanode's property value(s) should be maintained at each pipeline stage. For example, without safeguards, a simple plug-in that scales its incoming values could end up scaling data values everywhere in the pipeline. One option to ensure datanode correctness is to copy all datanodes at every pipeline stage. This option can be computationally resource-intensive and can delay a user from interacting with the datanodes.
Another option to address the correctness problem is to create a version history of each transformed value of a datanode. For example, DIVE system 100 can use read and write contexts to maintain the version history; e.g., values of a datanode can be saved before and after writing by the pipeline. The version history can be keyed on each pipeline stage. Then, each plug-in reads only the appropriate values for its pipeline stage and does not read values from another pipeline stage or branch. The use of version histories can be fast and efficient because upstream graph traversal is linear and each value lookup in a read or write context is a constant-time operation. The use of version histories maintains data integrity in a branching transform pipeline and is also parallelizable. Further, the use of read and write contexts can accurately track a property value at every stage in the pipeline with a minimum of memory use.
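One way to sketch such a per-stage version history in C# is shown below; the structure is an assumption for illustration, and DIVE's actual read and write contexts may be implemented differently.

    using System.Collections.Generic;

    class VersionedProperty
    {
        private readonly object originalValue;

        // Version history keyed on pipeline stage; each lookup is a constant-time operation.
        private readonly Dictionary<string, object> valuesByStage = new Dictionary<string, object>();

        public VersionedProperty(object originalValue) { this.originalValue = originalValue; }

        public void Write(string stageId, object value) => valuesByStage[stageId] = value;

        // A plug-in reads only the value written for its upstream stage, falling back to the
        // original value, so a transform in one branch never leaks into another branch.
        public object Read(string upstreamStageId) =>
            valuesByStage.TryGetValue(upstreamStageId, out var value) ? value : originalValue;
    }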
In some embodiments, DIVE system 100 can utilize the Microsoft Windows platform including the .NET framework, as this platform includes dynamic-language runtime, expression trees, and Language-Integrated Query (LINQ) support. The .NET framework can provide coding features such as reflection, serialization, threading, and parallelism for DIVE. These capabilities can affect DIVE's functionality and user experience. Support for dynamic languages allows flexible scripting and customization. LINQ can be useful in a scripted data-exploration environment. Expression trees and reflection can provide object linkages for the DIVE object parser. DIVE streaming can use the .NET framework's threading libraries. DIVE system 100 can use 64-bit computations and parallelism supported by .NET to scale as processor capabilities scale. In other embodiments, DIVE can utilize one or more other platforms that provide similar functionality as described as being part of the Windows platform and .NET framework.
The platform can support several software languages; e.g., C#, Visual Basic, F#, Python, and C++. Such platform support enables authoring DIVE plug-ins in the supported languages. In addition, the supported languages can be used for writing command-line, GUI, and programmatic tools for DIVE system 100. DIVE can use external libraries that are compatible with the platform, including molecular visualizers, clustering and analysis packages, charting tools, and mapping software; e.g., the VTK library wrapped by the ActiViz .NET API. In some embodiments, DIVE can draw on database support provided by the platform; e.g., storing data in a Microsoft SQL Server data warehouse.
Software clients of DIVE system 100 can include DIVE plug-ins and DIVE tools, as shown in
To speed big data operations, pre-loader 310 can predict user needs, perform on-demand and/or pre-emptive loading of corresponding data frames 320; e.g., subsets of data from one or more of data sources 110, and subsequent caching of loaded data frames 320. Each data frame of data frames 320 can include one or more data items, where each data item can include data in the subset(s) of data from one or more of data sources 110. For example, if pre-loader 310 is loading data from data sources 110 related to purchases at a department store into data frame DF1 of data frames 320, each data frame, including DF1, can have data items (values) for data having data types such as “Purchased Item”, “Quantity”, “Item Price”, “Taxes”, “Total Price”, “Discounts”, and “Payment Type”.
Preemptive loading can reduce to on-demand loading of a specified frame, if necessary. Caching can take place locally or remotely and can be single- or multi-tiered. For example, caching can include remote caching on a cloud database, which feeds local caching in local computer memory. In some embodiments, the local computer memory can include random access memory (RAM) chips, processor or other cache memory, flash memory, magnetic media, and/or other memory resident on a computing device executing software of DIVE system 100.
Loaded and cached data from data sources 110 can be stored by pre-loader 310 as data frames 320. Data frames 320 can be stored in the local computer memory, where they can be quickly accessed.
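The following sketch shows one way a pre-loader could combine on-demand loading, a local cache, and pre-emptive background loading of upcoming frames; the class, the lookahead policy, and the loadFrame callback are assumptions for illustration only.

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    class FramePreloaderSketch
    {
        private readonly ConcurrentDictionary<int, Dictionary<string, object>> cache =
            new ConcurrentDictionary<int, Dictionary<string, object>>();
        private readonly Func<int, Dictionary<string, object>> loadFrame;

        public FramePreloaderSketch(Func<int, Dictionary<string, object>> loadFrame)
        {
            this.loadFrame = loadFrame;   // e.g., a bulk fetch of one frame's data items
        }

        // Return the requested frame (loading on demand if it is not cached) and
        // pre-emptively fetch the next few frames on a background task.
        public Dictionary<string, object> GetFrame(int frameId, int lookahead = 4)
        {
            var frame = cache.GetOrAdd(frameId, loadFrame);
            Task.Run(() =>
            {
                for (int i = 1; i <= lookahead; i++)
                    cache.GetOrAdd(frameId + i, loadFrame);
            });
            return frame;
        }
    }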
Data frame selection logic 330 can include logic for switching relationships between data frames 320 and data pins 332. For example, data frame selection logic 330 can switch some or all of data pins 332 to reference data from a selected frame of data frames 320. Data frame selection logic 330 can be driven by user input, programmatic logic, etc. In some embodiments, a pin-switching process for switching data pins 332 between frames of data frames 320 is O(1).
Once switched to a frame, data pins 332 can pull data, such as data items, from one or more selected data frames. In some embodiments, all pins reference one data frame, while in other embodiments, pins can reference two or more data frames; e.g., a first bank, or subset, of data pins 332 can reference the selected data frame F1 and a second bank of data pins 332 can reference a previously selected frame. Then, when a new data frame F2 is selected, the first bank of data pins 332 can reference the new frame F2 and the second bank of data pins 332 can reference the previously selected frame F1, or perhaps some other previously selected frame.
In some examples, one or more data pins of data pins 332 can be designated as a control pin. The control pin can indicate a control, or one or more data items of interest of the plurality of data items. For example, if data frames are each associated with a time, a control pin can indicate a time of interest as a control, two control pins can respectively indicate a beginning time of interest and an ending time of interest for a time-range control, and multiple control pins can indicate multiple times/ranges of interest. As another example, if data frames are each associated with unique identifiers (IDs) such as serial numbers, VINs, credit card numbers, etc., a control pin can specify an ID of interest as a control. As another example, if data frames are each associated with a location, the location for the data frame can be used as a control. Many other examples of controls and control pins are possible as well.
In some examples, the control pin can be writeable so that a user can set the control pin data; e.g., specify the control associated with the control pin (e.g., specify a time or ID). Then, once a control has been specified, DIVE system 100 can search or otherwise scan the data from data sources 110 for data related to the control. In other examples, the control pin can be read-only; that is, the control pin indicates a value of the control in a data frame without allowing the control to be changed.
Data in data frames 320 can be organized according to data ontology 340, which can include arbitrary node types and arbitrary edges. Data ontology 340, in turn, can map node and edge properties; e.g., datanodes and dataedges, to data pins 332. When data pins 332 are switched between frames, data throughout data ontology 340 that refers to data pins 332 can be simultaneously updated. For example, suppose data pin #1 referred to a data item having a data type of “name” in a data frame of data frames 320, and suppose that the data item accessible via data pin #1 is “Name11”. Then, if data pins 332 are all switched to refer to a new data frame with a name of “Name22”, the reference in data ontology 340 to data pin #1 would refer to the switched data item “Name22”. Many other examples are possible as well.
If data ontology 340 changes, references from data pins 332 into data ontology 340 can be changed as well. That is, each of data pins 332 can include at least two references: one or more references into data frames 320 for data item(s) and one or more references into data ontology 340 for node/edge data/logic. Then, changes in the structure, format, and/or layout of data frames 320 can be isolated by data pins 332 (and perhaps data frame selection logic 330) from data ontology 340 and vice versa.
In some embodiments, all pins switch together. Then, when data pins 332 indicate a data frame of data frames 320 has been switched, all references to data within data ontology 340 made using data pins 332 are updated simultaneously, or substantially simultaneously. If data ontology 340 changes, references from data pins 332 into data ontology 340 can be changed as well, thereby changing references to data made available by data pins 332. For example, if data ontology 340 referred to data pin #1 to access a data type of “name” but changed to refer to a “first name” and a “last name”, the reference to data pin #1 may change; e.g., to refer to data pins #1 and #2 or some other data pin(s) of data pins 332.
In other embodiments, upon arrival of a new frame, some data pins 332 may not switch; e.g., a bank of data pins 332 referring to a first-received frame may not switch after the first data frame is received.
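A data pin with its two references could be sketched as follows, reusing the illustrative Datanode class from above; switching a pin to a new frame is a single reference assignment. The names are hypothetical, and the sketch omits pin banks, control pins, and other variations described herein.

    using System.Collections.Generic;

    class DataFrame
    {
        // Data items keyed by name; e.g., "Purchased Item", "Quantity".
        public Dictionary<string, object> Items { get; } = new Dictionary<string, object>();
    }

    class DataPin
    {
        public string ItemKey { get; set; }    // first reference: a data item in the data frame
        public Datanode Node { get; set; }     // second reference: a datanode in the data ontology

        private DataFrame currentFrame;

        // Switching frames re-points a single reference, so the switch is O(1) per pin.
        public void SwitchTo(DataFrame frame) => currentFrame = frame;

        // The ontology reads the current frame's value through the pin, so a reference to
        // this pin always reflects whichever frame is currently selected.
        public object Value => currentFrame.Items[ItemKey];
    }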
Ontological data from data ontology 340 can be arbitrarily transformed via transform 350 before providing data interactions 360. Because of the pin-linked ontology, fed by a fast-switched data set, in turn fed by preemptive data caching, pipeline 300 can use DIVE system 100 to provide quick interaction, analysis, and visualization of complex and multi-dimensional data.
Modern computational problems increasingly require formal ontological analysis. However, for some software hierarchies, formal ontologies do not exist. The generation of formal ontologies can be time consuming and difficult and can require expert attention. Ontologies are often implicitly defined in code by software engineers, and so code, such as object hierarchies, can be parsed for conversion into a formal ontology.
For example, an object parser can traverse object-oriented data structures within a provided assembly using code reflection. Using generalized rules to leverage the existing ontological structure, a formal ontology can be generated from the existing relationships of the data structures within the code. The ontology can be a static ontology defining an ontological structure or can be a dynamic ontology; that is, a dynamic ontology can include links between the ontological structure (of a static ontology) and object instances of the provided code assembly. The dynamic ontology can allow the underlying object instances to be modified through the context of the ontology without changes to the code assembly. In other examples, metadata tags can be added to the assembly to provide definitions for (generated) ontologies, and so provide a richer ontology definition.
DIVE system 100 can include a DIVE object parser, which can automatically generate datanodes and dataedges of a DIVE data structure from a software object hierarchy, such as a .NET object or assembly. Using reflection and expression trees, the DIVE object parser can consume object instances of the software object hierarchy and translate the object instances into propertied datanodes and dataedges of a DIVE data structure. For example, standard objects can be created by library-aware code. Then, those standard objects can be parsed by the DIVE object parser into a DIVE data structure, which can be injected into a DIVE pipeline as a data ontology.
The DIVE object parser can make software object hierarchies available for ontological data representation and subsequent use as DIVE plug-ins written without prior knowledge of DIVE. The DIVE object parser can include generic rules to translate between a software object hierarchy and a DIVE data structure. These generic rules can include:
Additional rules beyond the generic rules can handle other program constructs:
Throughout a parse, no data values are copied to datanodes or dataedges. Instead, dynamically created virtual properties link all datanode properties to their respective software object hierarchy members. So, any changes to runtime object instances are reflected in their corresponding representations in DIVE data structures. Similarly, any changes to datanode or dataedge properties in DIVE data structures propagate back to their software object instance counterparts.
Using this approach, the generic rules, and additional rules, the DIVE object parser can recursively produce an ontological representation of the entire software object hierarchy. With object parsing, users can import and use software object hierarchies within DIVE without special handling, so that software applications can be parsed and readily exploit DIVE capabilities. For example, assume L1 is a nonvisual code library that dynamically simulates moving bodies in space. A DIVE plug-in, acting as a thin wrapper, can automatically import library L1 and add runtime visualizations and interactive analyses. As the simulation progresses, the datanodes will automatically reflect the changing property values of the underlying software object instances. Through a DIVE interface, the user of the DIVE pipeline that imported L1 could change a body's mass. This change would propagate back to the runtime instance of L1 and appear in the visualization. Many other examples are possible as well.
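The reflection-based traversal could be sketched as below, reusing the illustrative Datanode and Dataedge classes; it links each public property to the live object instance through a virtual property and records the is-a relationship to the base class as a dataedge. This is a simplified, assumption-laden illustration, not the DIVE object parser itself.

    using System;
    using System.Reflection;

    static class ObjectParserSketch
    {
        public static Datanode Parse(object instance)
        {
            Type type = instance.GetType();
            var node = new Datanode { Type = type.Name };

            // Link each public, non-indexed property back to the runtime instance via a
            // virtual property, so changes on either side stay in sync without copying data.
            foreach (PropertyInfo prop in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
            {
                if (prop.GetIndexParameters().Length > 0)
                    continue;
                PropertyInfo captured = prop;
                node.SetVirtualProperty(captured.Name, () => captured.GetValue(instance));
            }

            // Record the is-a relationship to the base class as a dataedge.
            if (type.BaseType != null && type.BaseType != typeof(object))
            {
                node.Edges.Add(new Dataedge
                {
                    Source = node,
                    Target = new Datanode { Type = type.BaseType.Name },
                    Relationship = "is-a"
                });
            }
            return node;
        }
    }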
Similarly, DIVE data structure 420 has datanodes for interfaces and classes IClassA, IClassB, Abstract Class, OClass, SuperClass, SubClassA and SubClassB; methods OClassM( ), SuperM( ), SubAM( ), SubBM1( ) and SubBM2( ); fields StaticSuperF, SubAF1, and SubAF2; and property SubBProp. Relationships between datanodes in DIVE data structure 420 are shown using both solid and dashed lines representing dataedges.
DIVE object parser 400 can parse software object hierarchy 410 for translation into a data ontology and/or DIVE data structure. In other examples, software hierarchies other than .NET assemblies can be input to DIVE object parser 400 for parsing. In the example shown in
In the example shown in
Instance-specific data of software object hierarchy 410 are maintained on the subclass datanodes in DIVE data structure 420; that is, data for superclasses are not stored with superclass datanodes. The original fields, properties, and methods of software object hierarchy 410 are accessible through the datanodes of DIVE data structure 420 via virtual properties.
In DIVE data structure 420, each instance of a class can be represented. For example,
In scenario 500, parameters to DIVE object parser 400 can specify which semantic components are to be parsed into one or more ontologies. For example, the parameters can reflect user intent regarding whether or not private members, static objects, interfaces, and other software entities of code assembly 510 are parsed.
DIVE object parser 400 can recursively traverse object hierarchies of code assembly 510 using code reflection and expression trees. Using generalized, pre-defined rules, such as the generic and additional rules discussed above in the context of
In scenario 500, DIVE object parser 400 outputs the ontological components in two formats: static ontology 520, corresponding to semantic components and relationships of code assembly 510, and dynamic ontology 530. Both static ontology 520 and dynamic ontology 530 can include an ontological definition that uses a standardized ontology language. Dynamic ontology 530 can further include links into the object instance(s) of code assembly 510; for example, links between ontological components and object instances can be implemented using delegate methods and lambda functions.
DIVE supports the use of scripts to let users rapidly interact with the DIVE pipeline, plug-ins, data structures, and data. DIVE supports at least two basic types of scripting: plug-in scripting and μscripting (microscripting). DIVE can host components, including scripts, written in a number of computer languages. For example, in some embodiments, C# can be used as a scripting language.
Plug-in scripting is similar to existing analysis tools' scripting capabilities. Through the plug-in script interface, the user script can access the runtime environment, the DIVE system, and the specific plug-in. μscripting can provide direct programmatic control to experienced users and simple, intuitive controls to relatively new users of DIVE.
μscripting is an extension of plug-in scripting in which DIVE writes most of the code. The user needs to write only the right-hand side of a lambda function. Here's a schematic of a lambda function F1( ):
The right-hand side RHS written by the user is inserted into the lambda function. The lambda function, including the user's right-hand-side code, is compiled at runtime. The client can provide any expression that evaluates to an appropriate return value. In general, plug-in scripting can be more powerful than μscripting, while μscripting can be simpler at first.
User scripts, such as plug-in scripts and μscripting-originated scripts, can be incorporated into the DIVE system. For example, a user script can be merged into a larger, complete piece of code that can be compiled; e.g., during runtime using full optimization. Finally, through reflection, the compiled code is loaded back into memory as a part of the runtime environment. Although this approach requires time to compile each script, the small initial penalty is typically outweighed by the resulting optimized, compiled code. Both scripting types, particularly μscripting, can work on a per-datanode basis; optimized compilation helps create a fast, efficient user experience.
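The μscripting idea can be sketched as follows using Roslyn's C# scripting API (the Microsoft.CodeAnalysis.CSharp.Scripting package) as one possible compilation mechanism; the wrapper, delegate signature, and API choice are assumptions for illustration and may differ from DIVE's actual implementation.

    using System;
    using System.Threading.Tasks;
    using Microsoft.CodeAnalysis.CSharp.Scripting;

    static class MicroScriptSketch
    {
        // The user supplies only the right-hand side of the lambda; the host wraps it
        // into a complete, typed function and compiles it at runtime.
        public static async Task<Func<double, double>> CompileAsync(string userRightHandSide)
        {
            string wrapped = "(System.Func<double, double>)(x => " + userRightHandSide + ")";
            return await CSharpScript.EvaluateAsync<Func<double, double>>(wrapped);
        }
    }

    // Illustrative usage: compile "x * x + 1.0" once, then evaluate it per datanode value.
    // Func<double, double> f = await MicroScriptSketch.CompileAsync("x * x + 1.0");
    // double y = f(3.0);   // 10.0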
Table 1 below provides some μscripting examples.
DIVE system 100 can support data streaming using an interactive SQL approach and a pass-through SQL approach. In some embodiments, database languages other than SQL can be utilized by either approach. Interactive SQL can be used for the immediate analysis of large, nonlocal datasets via impromptu, user-defined dynamic database queries; that is, user input is taken to build an SQL query.
The SQL query can include one or more data queries, as well as one or more functions for analysis of data received via the data queries. DIVE system 100 can send the SQL query to the SQL database and parse the resulting dataset. Depending on the query's size and complexity, this approach can result in user-controlled SQL analysis through the GUI at interactive rates. DIVE system 100 can facilitate interactive SQL by use of events generated at runtime; for example, DIVE events can be generated in response to mouse clicks or slider bar movements. Upon receiving these DIVE events, a DIVE component can construct the appropriate SQL query.
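An illustrative sketch of building and executing such a query in response to user input follows, using .NET's System.Data.SqlClient; the table, column, and parameter names are hypothetical and do not correspond to any particular template described herein.

    using System.Data;
    using System.Data.SqlClient;

    static class InteractiveSqlSketch
    {
        // minFrame and maxFrame would come from a UI event such as a slider movement.
        public static DataTable RunQuery(string connectionString, int minFrame, int maxFrame)
        {
            const string query =
                "SELECT frame_id, AVG(value) AS avg_value " +
                "FROM analysis_data WHERE frame_id BETWEEN @min AND @max " +
                "GROUP BY frame_id";

            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand(query, connection))
            {
                command.Parameters.AddWithValue("@min", minFrame);
                command.Parameters.AddWithValue("@max", maxFrame);

                var results = new DataTable();
                new SqlDataAdapter(command).Fill(results);   // opens the connection as needed
                return results;
            }
        }
    }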
An SQL query can use SQL template 610 to obtain and analyze data. In the example shown in
The pass-through SQL approach can be used for interactive analysis of datasets larger than the client's local memory; e.g., pass-through SQL can be used for streaming complex object models across a preset dimension. Pass-through SQL accelerates the translation of SQL data into OO structures by shifting the location of values from the objects themselves to an in-memory data structure called a backing store.
A backing store can include a collection of one or more tables of instance data, where each table can contain one or more instance values for a single object type. Internally, object fields and properties have pointers to locations in backing-store tables instead of local, fixed values. A backing-store collection then includes all the tables for the object instances occurring at the same point, or frame, in the streaming dimension.
Once a backing store has been created by DIVE system 100, copies of the backing-store structure can be generated with a unique identifier for each new frame. DIVE system 100 then inserts instance values for new frames into the corresponding backing-store copy. This reduces the loading of instance data to a table-to-table copy, bypassing the parsing normally required to insert data into an OO structure. The use of backing stores also removes the overhead of allocating and de-allocating expensive objects by reusing the same object structures for each frame in the streaming dimension.
Pass-through SQL enables streaming through a buffered backing-store collection of backing stores representing frames over the streaming dimension. A backing-store collection is initially populated client-side for frames on either side of the frame of interest, where buffer regions are defined for each end of the backing-store collection. Frames whose data are stored in the backing-store collection are immediately accessible to the client. When the buffer regions' thresholds are traversed during streaming, a background thread is spawned to load a new set of backing stores around the current frame; e.g., by the pre-loader. If the client requests a frame outside the loaded set, a new backing-store collection can be loaded around the requested frame. Loaded backing stores no longer in the streaming collection can be deleted from memory to conserve the client's memory.
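A simplified sketch of the backing-store idea follows: instance values live in per-type tables grouped by frame, and object fields hold locations into those tables rather than local values, so advancing to a new frame swaps tables instead of re-parsing objects. The structure and names are assumptions for illustration.

    using System.Collections.Generic;

    class BackingStoreTable
    {
        // Rows of instance values for a single object type within one frame.
        public List<object[]> Rows { get; } = new List<object[]>();
    }

    class BackingStoreCollection
    {
        public int FrameId { get; set; }

        // One table per object type for this frame in the streaming dimension.
        public Dictionary<string, BackingStoreTable> TablesByType { get; } =
            new Dictionary<string, BackingStoreTable>();
    }

    class BackedField
    {
        public string ObjectType { get; set; }
        public int Row { get; set; }
        public int Column { get; set; }

        // Reading resolves the stored location against whichever frame's collection is current.
        public object Read(BackingStoreCollection currentFrame) =>
            currentFrame.TablesByType[ObjectType].Rows[Row][Column];
    }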
On each subsequent data frame request 630b, DIVE system 100 can buffer data retrieved from database(s) 632 into backing stores 640 directly. In some embodiments, DIVE system 100 can use multiple threads to buffer data into backing stores 640. DIVE system 100 can use pass-through SQL streaming to propagate large amounts of data through a DIVE pipeline using database(s) 632, object hierarchy 634, and backing stores 640 at interactive speeds; i.e., by bypassing object-oriented parsing.
In a case study, DIVE is used by the Dynameomics project, which performs molecular dynamics simulations to study protein structure and dynamics. The Dynameomics project involves characterization of the dynamic behaviors and folding pathways of topological classes of all known protein structures.
An interesting facet of protein biology is that structure equals function; that is, what a protein does and how it does it are intrinsically tied to its 3D structure. During a molecular dynamics simulation, scientists simulate interatomic forces to predict motion among atoms of a molecule, such as a protein, and its environment to better understand the 3D structure of the molecule.
The physical simulation is calculated using Newtonian physics; at specified time intervals, the simulation state is saved. This produces a trajectory, or series of structural snapshots, reflecting the protein's natural behavior in an aqueous environment. Image 730 shows three structures selected from a trajectory containing more than 51,000 frames.
Molecular dynamics is useful for three primary reasons. First, like many in silico techniques, it allows virtual experimentation; scientists can simulate protein structures and interactions without the cost or risk of laboratory experiments. Second, modern computing techniques allow molecular dynamics simulations to run in parallel, enabling virtual high-throughput experimentation. Third, molecular dynamics simulation is the only protein analysis method that produces sequential time-series structures at both high spatial and high temporal resolution. These high-resolution trajectories can reveal how proteins move, a critical aspect of their functionality.
However, molecular dynamics simulations can produce datasets considerably larger than what most structural-biology tools can handle. So far, the Dynameomics project has generated hundreds of terabytes of data consisting of thousands of simulations and millions of structures, as well as their associated analyses, stored in a distributed SQL data warehouse. The data warehouse can hold at least four orders of magnitude more protein structures than the Protein Data Bank, which is the world's primary repository for experimentally characterized protein structures.
In particular, the Dynameomics project contains much more simulation data than many domain-specific tools are engineered to handle. One of the first Dynameomics tools built on the DIVE platform was the Protein Dashboard, which provides interactive 2D and 3D visualizations of the Dynameomics dataset. These visualizations include interactive explorations of bulk data, molecular visualization tools, and integration with external tools such as Chimera.
The top of
The generated datanodes and dataedges, along with DIVE plug-ins, μscripts, plug-in scripts, DIVE tools, and/or other software entities, can be used together as a DIVE pipeline, as indicated in a lower portion of
At lower left of
Chart region 930 shows one of many possible linked interactive charts for a “SASA1 Plot” related to “Residue SASA”. The interactive charts can be generated using data streamed from the data sources mentioned in the context of
A tool implemented independently of DIVE and the Protein Dashboard is the Dynameomics API. The API can be used to establish an object hierarchy and provide high-throughput streaming of simulations from the Dynameomics data warehouse. The Dynameomics API includes domain-specific semantics and data structures and provides multiple domain-specific analyses. In some embodiments, the Dynameomics API can be user interface agnostic; then, the Dynameomics API can provide data handling and streaming support independently of how the user views and otherwise interacts with the data; e.g., using the Protein Dashboard. In some embodiments, the API can be written in a particular computer language; e.g., C#.
With the Dynameomics data and semantics available to the DIVE pipeline, a visual analytics approach can be applied to the Dynameomics data. Protein Dashboard 800 can be used to interact with and visualize the data. However, because the data flows through the Dynameomics API, wrapped by DIVE datanodes and dataedges, multiple protein structures from different sources can be loaded, including structures from the Protein Data Bank. Once loaded, the protein structures can be aligned and analyzed in different ways.
Furthermore, because Protein Dashboard 800 has access to additional data from the Dynameomics API via DIVE system 100, the utility of Protein Dashboard 800 increases. For instance, scientists can find utility in coloring protein structures on the basis of biophysical properties; e.g., solvent-accessible surface area, deviation from a baseline structure. By streaming the data through the pipeline, these biophysical properties can be observed as they change over time. In some instances, some or all of the biophysical properties can be accessed through the data's inheritance hierarchy.
Applications built on DIVE system 100 have been used to accelerate biophysical analysis of Dynameomics and other data related to two specific proteins. The first protein is the transcription factor p53, mutations in which are implicated in cancer. The second protein is human Cu—Zn superoxide dismutase 1 (SOD1), mutations in which are associated with amyotrophic lateral sclerosis.
The Y220C mutation of the p53 DNA binding domain is responsible for destabilizing the core, leading to about 75,000 new cancer cases annually, according to Boeckler et al. The DIVE framework can analyze the structural and functional effects of the Y220C mutation using a DIVE module called ContactWalker. The ContactWalker module can identify amino acids' interatomic contacts that are disrupted significantly as a result of mutation. The contact pathways between disrupted residues can be identified using DIVE's underlying graph-based data representation.
In particular,
In another example, DIVE has been used in about 400 simulations of 106 disease-associated mutants of SOD1. Through extensive studies of A4V mutant SOD1, Schmidlin et al. previously noted the instability of two β-strands in the SOD1 Greek key β-barrel structure. That analysis took several years to complete, and such manual interrogation of simulations does not scale to searching for general features linked to disease across hundreds of simulations.
DIVE system 100 was used to further explore the formation and persistence of the contacts and packing interactions in this region across multiple simulations of mutant proteins. DIVE system 100 facilitates isolation of specific contacts, rapid plotting of selected data, and easy visualization of the relevant structures and geographic locations of specific mutations, while providing intuitive navigation from one view to another.
The top panel of
In particular,
Example DIVE application pipelines are shown in
The lower portion of
In another example, the user could request a continuous data stream based on location-related sensor data; e.g., request data from “all deep-ocean current sensors within 100 miles of the up-to-the-minute GPS position of any Navy ship over 1000 tons and under the eventual command of Admiral Jones.” In this case, the ontology graph would have to cover naval vessels, command hierarchies, and ocean sensor data, and the subset of the ontology can change in real time as the ships move (and perhaps as commands change). Then, queries can be made against the larger ontological graph of naval vessels and undersea sensors using live data streams as part of the query to provide the requested continuous data stream. Many other example DIVE pipelines and uses of DIVE system 100 are possible as well.
The network 1206 can correspond to a local area network, a wide area network, a corporate intranet, the public Internet, combinations thereof, or any other type of network(s) configured to provide communication between networked computing devices. In some embodiments, part or all of the communication between networked computing devices can be secured.
Servers 1208 and 1210 can share content and/or provide content to client devices 1204a-1204c. As shown in
In particular, computing device 1300 shown in
Computing device 1300 can be a desktop computer, laptop or notebook computer, personal data assistant (PDA), mobile phone, embedded processor, touch-enabled device, or any similar device that is equipped with at least one processing unit capable of executing machine-language instructions that implement at least part of the herein-described techniques and methods, including but not limited to method 1400 described with respect to
User interface 1301 can receive input and/or provide output, perhaps to a user. User interface 1301 can be configured to receive user input from input device(s), such as a keyboard, a keypad, a touch screen, a computer mouse, a track ball, a joystick, and/or other similar devices configured to receive input from a user of the computing device 1300.
User interface 1301 can be configured to provide output to output display devices, such as one or more cathode ray tubes (CRTs), liquid crystal displays (LCDs), light emitting diodes (LEDs), displays using digital light processing (DLP) technology, printers, light bulbs, and/or other similar devices capable of displaying graphical, textual, and/or numerical information to a user of computing device 1300. User interface 1301 can also be configured to generate audible output(s) via devices such as a speaker, speaker jack, audio output port, audio output device, earphones, and/or other similar devices configured to convey sound and/or audible information to a user of computing device 1300.
Network communication interface module 1302 can be configured to send and receive data over wireless interface 1307 and/or wired interface 1308 via a network, such as network 1206. Wireless interface 1307, if present, can utilize an air interface, such as a Bluetooth®, Wi-Fi®, ZigBee®, and/or WiMAX™ interface to a data network, such as a wide area network (WAN), a local area network (LAN), one or more public data networks (e.g., the Internet), one or more private data networks, or any combination of public and private data networks. Wired interface(s) 1308, if present, can comprise a wire, cable, fiber-optic link and/or similar physical connection(s) to a data network, such as a WAN, LAN, one or more public data networks, one or more private data networks, or any combination of such networks.
In some embodiments, network communication interface module 1302 can be configured to provide reliable, secured, and/or authenticated communications. For each communication described herein, information for ensuring reliable communications (i.e., guaranteed message delivery) can be provided, perhaps as part of a message header and/or footer (e.g., packet/message sequencing information, encapsulation header(s) and/or footer(s), size/time information, and transmission verification information such as CRC and/or parity check values). Communications can be made secure (e.g., be encoded or encrypted) and/or decrypted/decoded using one or more cryptographic protocols and/or algorithms, such as, but not limited to, DES, AES, RSA, Diffie-Hellman, and/or DSA. Other cryptographic protocols and/or algorithms can be used as well as or in addition to those listed herein to secure (and then decrypt/decode) communications.
Processor(s) 1303 can include one or more central processing units, computer processors, mobile processors, digital signal processors (DSPs), graphics processing units (GPUs), microprocessors, computer chips, and/or other processing units configured to execute machine-language instructions and process data. Processor(s) 1303 can be configured to execute computer-readable program instructions 1306 that are contained in data storage 1304 and/or other instructions as described herein.
Data storage 1304 can include one or more physical and/or non-transitory storage devices, such as read-only memory (ROM), random access memory (RAM), removable-disk-drive memory, hard-disk memory, magnetic-tape memory, flash memory, and/or other storage devices. Data storage 1304 can include one or more physical and/or non-transitory storage devices with at least enough combined storage capacity to contain computer-readable program instructions 1306 and any associated/related data and data structures, including, but not limited to, data frames, data pins, ontologies, DIVE data structures, software objects, software object hierarchies, code assemblies, data interactions, and scripts (including μscripts).
Computer-readable program instructions 1306 and any data structures contained in data storage 1304 include computer-readable program instructions executable by processor(s) 1303 and any storage required, respectively, to perform at least part of the herein-described methods, including, but not limited to, method 1400 described with respect to
In some embodiments, data and/or software for DIVE system 100 can be encoded as computer readable information stored in tangible computer readable media (or computer readable storage media) and accessible by client devices 1204a, 1204b, and 1204c, and/or other computing devices. In some embodiments, data and/or software for DIVE system 100 can be stored on a single disk drive or other tangible storage media, or can be implemented on multiple disk drives or other tangible storage media located at one or more diverse geographic locations.
In some embodiments, each of the computing clusters 1309a, 1309b, and 1309c can have an equal number of computing devices, an equal number of cluster storage arrays, and an equal number of cluster routers. In other embodiments, however, each computing cluster can have different numbers of computing devices, different numbers of cluster storage arrays, and different numbers of cluster routers. The number of computing devices, cluster storage arrays, and cluster routers in each computing cluster can depend on the computing task or tasks assigned to each computing cluster.
In computing cluster 1309a, for example, computing devices 1300a can be configured to perform various computing tasks of DIVE system 100. In one embodiment, the various functionalities of DIVE system 100 can be distributed among one or more of computing devices 1300a, 1300b, and 1300c. Computing devices 1300b and 1300c in computing clusters 1309b and 1309c can be configured similarly to computing devices 1300a in computing cluster 1309a. On the other hand, in some embodiments, computing devices 1300a, 1300b, and 1300c can be configured to perform different functions.
In some embodiments, computing tasks and stored data associated with DIVE system 100 can be distributed across computing devices 1300a, 1300b, and 1300c based at least in part on the processing requirements of DIVE system 100, the processing capabilities of computing devices 1300a, 1300b, and 1300c, the latency of the network links between the computing devices in each computing cluster and between the computing clusters themselves, and/or other factors that can contribute to the cost, speed, fault-tolerance, resiliency, efficiency, and/or other design goals of the overall system architecture.
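By way of a non-limiting illustration, the following Python sketch shows one possible way a computing task could be assigned to a computing cluster by weighing available capacity against link latency. The `Cluster` fields, the cost weighting, and the example values are assumptions made solely for this example and do not describe an actual allocation policy of DIVE system 100.

```python
# Hypothetical sketch: choose a cluster for a task by combining utilization
# with a latency penalty. All fields and weights are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    capacity: float          # e.g., available cores or normalized throughput
    link_latency_ms: float   # network latency to reach the cluster
    assigned_load: float = 0.0

def assign_task(clusters: list[Cluster], task_load: float,
                latency_weight: float = 0.1) -> Cluster:
    """Return the cluster with the lowest combined utilization/latency cost."""
    def cost(c: Cluster) -> float:
        utilization = (c.assigned_load + task_load) / c.capacity
        return utilization + latency_weight * c.link_latency_ms
    best = min(clusters, key=cost)
    best.assigned_load += task_load
    return best

clusters = [Cluster("1309a", capacity=64, link_latency_ms=2),
            Cluster("1309b", capacity=32, link_latency_ms=8),
            Cluster("1309c", capacity=64, link_latency_ms=20)]
print(assign_task(clusters, task_load=16).name)   # likely "1309a"
```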
The cluster storage arrays 1310a, 1310b, and 1310c of the computing clusters 1309a, 1309b, and 1309c can be data storage arrays that include disk array controllers configured to manage read and write access to groups of hard disk drives. The disk array controllers, alone or in conjunction with their respective computing devices, can also be configured to manage backup or redundant copies of the data stored in the cluster storage arrays to protect against disk drive or other cluster storage array failures and/or network failures that prevent one or more computing devices from accessing one or more cluster storage arrays.
Similar to the manner in which the functions of DIVE system 100 can be distributed across computing devices 1300a, 1300b, and 1300c of computing clusters 1309a, 1309b, and 1309c, various active portions and/or backup portions of these components can be distributed across cluster storage arrays 1310a, 1310b, and 1310c. For example, some cluster storage arrays can be configured to store one portion of the data and/or software of DIVE system 100, while other cluster storage arrays can store a separate portion of the data and/or software of DIVE system 100. Additionally, some cluster storage arrays can be configured to store backup versions of data stored in other cluster storage arrays.
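As one hypothetical illustration of such a layout, the Python sketch below places each portion of the system's data on a primary cluster storage array and places its backup copy on a different array. The round-robin placement and the portion names are assumptions for the example only.

```python
# Illustrative sketch: lay out active portions and backup portions across
# cluster storage arrays, keeping each backup on a different array than its
# primary copy. The placement policy is an assumption for this example.
def place_portions(portions: list[str], arrays: list[str]) -> dict[str, tuple[str, str]]:
    """Return {portion: (primary_array, backup_array)} with distinct arrays."""
    layout = {}
    n = len(arrays)
    for i, portion in enumerate(portions):
        primary = arrays[i % n]
        backup = arrays[(i + 1) % n]   # backup stored on a different array
        layout[portion] = (primary, backup)
    return layout

print(place_portions(["ontology store", "data frames", "μscripts"],
                     ["1310a", "1310b", "1310c"]))
```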
The cluster routers 1311a, 1311b, and 1311c in computing clusters 1309a, 1309b, and 1309c can include networking equipment configured to provide internal and external communications for the computing clusters. For example, the cluster routers 1311a in computing cluster 1309a can include one or more internet switching and routing devices configured to provide (i) local area network communications between the computing devices 1300a and the cluster storage arrays 1310a via the local cluster network 1312a, and (ii) wide area network communications between the computing cluster 1309a and the computing clusters 1309b and 1309c via the wide area network connection 1313a to network 1206. Cluster routers 1311b and 1311c can include network equipment similar to the cluster routers 1311a, and cluster routers 1311b and 1311c can perform similar networking functions for computing clusters 1309b and 1309c that cluster routers 1311a perform for computing cluster 1309a.
In some embodiments, the configuration of the cluster routers 1311a, 1311b, and 1311c can be based at least in part on the data communication requirements of the computing devices and cluster storage arrays; the data communications capabilities of the network equipment in the cluster routers 1311a, 1311b, and 1311c; the latency and throughput of local networks 1312a, 1312b, and 1312c; the latency, throughput, and cost of wide area network links 1313a, 1313b, and 1313c; and/or other factors that can contribute to the cost, speed, fault-tolerance, resiliency, efficiency, and/or other design goals of the overall system architecture.
Method 1400 can begin at block 1410, where a computing device can receive data from one or more data sources, as discussed above in the context of at least
At block 1420, the computing device can generate a data frame based on the received data. The data frame can include a plurality of data items, as discussed above in the context of at least
At block 1430, the computing device can determine a data ontology. The data ontology can include a plurality of datanodes, as discussed above in the context of at least
At block 1440, the computing device can determine a plurality of data pins, as discussed above in the context of at least
At block 1450, the computing device can obtain data for the first data item at the first datanode of the data ontology via the first data pin, as discussed above in the context of at least
At block 1460, the computing device can provide a representation of the data ontology, such as discussed above in the context of at least
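For illustration only, the following Python sketch models the structures named in blocks 1410-1460: a data frame holding data items, an ontology of datanodes, and a data pin whose first and second references tie a data item to a related datanode, along with a simple textual representation of the ontology. The class names, fields, and example values are assumptions for this sketch and do not describe the DIVE system's actual implementation.

```python
# Hypothetical sketch of the structures used by method 1400. Names and fields
# are illustrative assumptions, not the actual DIVE data structures.
from dataclasses import dataclass, field

@dataclass
class DataFrame:
    items: dict[str, object]   # data items keyed by name

@dataclass
class Datanode:
    name: str
    children: list["Datanode"] = field(default_factory=list)

@dataclass
class DataPin:
    item_key: str        # first reference: a data item in the data frame
    datanode: Datanode   # second reference: a related datanode in the ontology
    frame: DataFrame

    def value(self):
        """Obtain the data for the referenced data item via this pin."""
        return self.frame.items[self.item_key]

def represent(node: Datanode, pins: list[DataPin], depth: int = 0) -> str:
    """Build an indented textual representation of the ontology, showing
    pinned data values next to their datanodes."""
    pinned = [str(p.value()) for p in pins if p.datanode is node]
    line = "  " * depth + node.name + (f" = {', '.join(pinned)}" if pinned else "")
    return "\n".join([line] + [represent(c, pins, depth + 1) for c in node.children])

# Receive data, generate a data frame, determine an ontology and a data pin,
# obtain the pinned data, and provide a representation of the ontology.
frame = DataFrame(items={"temperature": 310.2})
root = Datanode("Protein")
site = Datanode("ActiveSite")
root.children.append(site)
pins = [DataPin(item_key="temperature", datanode=site, frame=frame)]
print(represent(root, pins))
```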
In some embodiments, method 1400 can also include: receiving additional data from the one or more data sources; storing a subset of the additional data in a second data frame, where the second data frame includes the plurality of data items, and where the data in the second data frame differs from data in the first data frame; and changing the first reference of the first data pin to refer to the first data item in the second data frame, as discussed above in the context of at least
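Reusing the definitions from the sketch above, the following lines illustrate one possible way the first reference of a data pin could be redirected to the first data item in a second data frame when additional data arrive; the `repin` helper is a hypothetical name introduced only for this example.

```python
# Continuation of the previous sketch (DataFrame, DataPin, represent, pins,
# root are defined above). The re-pinning step shown is an illustrative
# assumption about how the first reference could be updated.
def repin(pin: DataPin, new_frame: DataFrame) -> None:
    """Redirect the pin's first reference to the same data item in a new frame."""
    if pin.item_key not in new_frame.items:
        raise KeyError(f"data item {pin.item_key!r} is missing from the new frame")
    pin.frame = new_frame

frame2 = DataFrame(items={"temperature": 309.8})   # data differs from the first frame
repin(pins[0], frame2)
print(represent(root, pins))    # the representation now reflects the updated data
```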
In other embodiments, method 1400 can also include: specifying a designated control for the control data item of the control pin, and after specifying the designated control, generating a data frame associated with the designated control, such as discussed above in the context of at least
Unless the context clearly requires otherwise, throughout the description and the claims, the words ‘comprise’, ‘comprising’, and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”. Words using the singular or plural number also include the plural or singular number, respectively. Additionally, the words “herein,” “above” and “below” and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of this application.
The above description provides specific details for a thorough understanding of, and enabling description for, embodiments of the disclosure. However, one skilled in the art will understand that the disclosure may be practiced without these details. In other instances, well-known structures and functions have not been shown or described in detail to avoid unnecessarily obscuring the description of the embodiments of the disclosure. The description of embodiments of the disclosure is not intended to be exhaustive or to limit the disclosure to the precise form disclosed. While specific embodiments of, and examples for, the disclosure are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the disclosure, as those skilled in the relevant art will recognize.
All of the references cited herein are incorporated by reference. Aspects of the disclosure can be modified, if necessary, to employ the systems, functions and concepts of the above references and application to provide yet further embodiments of the disclosure. These and other changes can be made to the disclosure in light of the detailed description.
Specific elements of any of the foregoing embodiments can be combined or substituted for elements in other embodiments. Furthermore, while advantages associated with certain embodiments of the disclosure have been described in the context of these embodiments, other embodiments may also exhibit such advantages, and not all embodiments need necessarily exhibit such advantages to fall within the scope of the disclosure.
The above detailed description describes various features and functions of the disclosed systems, devices, and methods with reference to the accompanying figures. In the figures, similar symbols typically identify similar components, unless context dictates otherwise. The illustrative embodiments described in the detailed description, figures, and claims are not meant to be limiting. Other embodiments can be utilized, and other changes can be made, without departing from the spirit or scope of the subject matter presented herein. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the figures, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations, all of which are explicitly contemplated herein.
With respect to any or all of the ladder diagrams, scenarios, and flow charts in the figures and as discussed herein, each block and/or communication may represent a processing of information and/or a transmission of information in accordance with example embodiments. Alternative embodiments are included within the scope of these example embodiments. In these alternative embodiments, for example, functions described as blocks, transmissions, communications, requests, responses, and/or messages may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved. Further, more or fewer blocks and/or functions may be used with any of the ladder diagrams, scenarios, and flow charts discussed herein, and these ladder diagrams, scenarios, and flow charts may be combined with one another, in part or in whole.
A block that represents a processing of information may correspond to circuitry that can be configured to perform the specific logical functions of a herein-described method or technique. Alternatively or additionally, a block that represents a processing of information may correspond to a module, a segment, or a portion of program code (including related data). The program code may include one or more instructions executable by a processor for implementing specific logical functions or actions in the method or technique. The program code and/or related data may be stored on any type of computer readable medium such as a storage device including a disk or hard drive or other storage medium.
The computer readable medium may also include non-transitory computer readable media such as computer-readable media that store data for short periods of time, like register memory, processor cache, and random access memory (RAM). The computer readable media may also include non-transitory computer readable media that store program code and/or data for longer periods of time, such as secondary or persistent long-term storage, like read only memory (ROM), optical or magnetic disks, or compact-disc read only memory (CD-ROM). The computer readable media may also be any other volatile or non-volatile storage systems. A computer readable medium may be considered a computer readable storage medium, for example, or a tangible storage device.
Moreover, a block that represents one or more information transmissions may correspond to information transmissions between software and/or hardware modules in the same physical device. However, other information transmissions may be between software modules and/or hardware modules in different physical devices.
Numerous modifications and variations of the present disclosure are possible in light of the above teachings.
The present application claims priority to U.S. Provisional Patent Application No. 61/840,617, entitled “Methods for Efficient Streaming of Structured Information”, filed Jun. 28, 2013, which is entirely incorporated by reference herein for all purposes.
This invention was made with government support under Grant Nos. 5T15LM007442 and GM50789, awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/US2014/044683 | Jun. 27, 2014 | WO | 00