The present technology pertains to analyzing drilling reports, and more specifically to automatic classification of drilling reports with deep natural language processing.
Drilling activities in oil and gas are a shared concern among energy companies, government agencies, and the general public, as they can impact both the profits of the various parties and the natural environment. Accordingly, it is important to obtain accurate and thorough data related to drilling activities, which can be used to study the drilling activities in order to learn from previous drilling activities and optimize future drilling activities. To this end, oil and gas companies often generate drilling reports for respective drilling activities.
Drilling reports contain rich information such as well state information, including symptoms and events reported in situ by the drillers in free-form text. This information can provide new insights into the drilling process and support future drilling strategies. However, the size and volume of drilling reports generated by oil and gas companies render any meaningful analysis of these reports unfeasible. Furthermore, the complexity and free-form nature of drilling reports make the task of analyzing these types of reports even more challenging.
In order to describe the manner in which the above-recited and other advantages and features of the disclosure can be obtained, a more particular description of the principles briefly described above will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only exemplary embodiments of the disclosure and are not therefore to be considered to be limiting of its scope, the principles herein are described and explained with additional specificity and detail through the use of the accompanying drawings in which:
Various embodiments of the disclosure are discussed in detail below. While specific implementations are discussed, it should be understood that this is done for illustration purposes only. A person skilled in the relevant art will recognize that other components and configurations may be used without departing from the spirit and scope of the disclosure.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be obvious from the description, or can be learned by practice of the herein disclosed principles. The features and advantages of the disclosure can be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. These and other features of the disclosure will become more fully apparent from the following description and appended claims, or can be learned by the practice of the principles set forth herein.
It will be appreciated that for simplicity and clarity of illustration, where appropriate, reference numerals have been repeated among the different figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein can be practiced without these specific details. In other instances, methods, procedures and components have not been described in detail so as not to obscure the related relevant feature being described. The drawings are not necessarily to scale and the proportions of certain parts may be exaggerated to better illustrate details and features. The description is not to be considered as limiting the scope of the embodiments described herein.
Disclosed are systems, methods, and computer-readable storage media for automatic classification and presentation of drilling reports using deep natural language processing. In some examples, a system can obtain drilling reports associated with respective well drilling or operation activities. Further, the system can generate, based on the drilling reports, word vectors, where each word vector represents a respective word in the drilling reports. The system can also partition sentences in the drilling reports into respective words and, for each sentence, identify respective word vectors corresponding to the respective words associated with the sentence.
Based on the respective word vectors, the system can classify the sentences into respective events, respective symptoms, respective actions, respective results, and so forth. The system can classify the sentences using a neural network into concepts, categories, etc. For example, the system can classify sentences into symptom, action, result, event, etc. The system can generate a tool that allows users to search sentences or drilling reports based on classifications and/or sequences of classifications.
To illustrate, the system can generate a tool that allows a user to search for symptoms matching search string X, and limit the search to the sequence symptom→action→result, where the symptom is based on the search string as previously explained. The system can also generate visually interactive tools which can depict concepts or classifications on a screen based on a predetermined pattern or configuration, such as a word cloud or a graph, for example. This can allow the user to view or select one or more drilling reports or clusters of reports based on a specific concept or classification. Moreover, the system can present the classification of drilling reports in a Geographic Information System (GIS) that would help sales teams identify what regions in a map need their products based on the types of problems and symptoms that occur, for example. These example tools can be used in real time to troubleshoot problems during drilling activities or manage the process and progress of the drilling activities.
As previously explained, drilling reports contain valuable intelligence on well state and operations. Unfortunately, the drilling reports and the intelligence they contain have been significantly underutilized and unexploited. This is largely because the size and volume of drilling reports generated by oil and gas companies render any meaningful analysis of these reports unfeasible, and the complexity and free-form nature of drilling reports restrict or limit intelligent, automated, or computerized analysis of these types of reports.
The disclosed technology addresses the need in the art for tools capable of performing intelligent, automated, and effective analysis, classification, and presentation of drilling reports and intelligence. The approaches herein can provide accurate and computerized tools for automatic classification of drilling reports. Such tools can be robust and capable of correcting or understanding typing errors, abbreviations, language shortcuts or terms of art, symbols, acronyms, etc. The technologies and approaches herein can provide interactive visualizations of the hidden semantic relationships of concepts between drilling reports, and graphical tools for retrieving or filtering data based on report sequencing and classifications. The graphical tools and visualizations can allow users and engineers to quickly identify and sift through relevant sequences within large volumes of drilling reports, and efficiently interact and understand concepts and intelligence provided in the drilling reports.
Disclosed are systems, methods, and computer-readable storage media for automatic classification and presentation of drilling reports using deep natural language processing. A brief introductory description of exemplary systems and environments, as illustrated in
Logging tools 126 can be integrated into the bottom-hole assembly 125 near the drill bit 114. As the drill bit 114 extends the wellbore 116 through the formations 118, logging tools 126 collect measurements relating to various formation properties as well as the orientation of the tool and various other drilling conditions. The bottom-hole assembly 125 may also include a telemetry sub 128 to transfer measurement data to a surface receiver 130 and to receive commands from the surface. In at least some cases, the telemetry sub 128 communicates with a surface receiver 130 using mud pulse telemetry. In some instances, the telemetry sub 128 does not communicate with the surface, but rather stores logging data for later retrieval at the surface when the logging assembly is recovered.
Each of the logging tools 126 may include a plurality of tool components, spaced apart from each other, and communicatively coupled with one or more wires. The logging tools 126 may also include one or more computing devices 150 communicatively coupled with one or more of the plurality of tool components by one or more wires. The computing device 150 may be configured to control or monitor the performance of the tool, process logging data, and/or carry out the methods of the present disclosure.
In at least some instances, one or more of the logging tools 126 may communicate with a surface receiver 130 by a wire, such as wired drillpipe. In other cases, the one or more of the logging tools 126 may communicate with a surface receiver 130 by wireless signal transmission. In at least some cases, one or more of the logging tools 126 may receive electrical power from a wire that extends to the surface, including wires extending through a wired drillpipe.
Referring to
The illustrated wireline conveyance 134 provides support for the tool, enables communication between the tool and processors on the surface, and provides a power supply. The wireline conveyance 134 can include fiber optic cabling for carrying out communications. The wireline conveyance 134 is sufficiently strong and flexible to tether the tool body 132 through the wellbore 48, while also permitting communication through the wireline conveyance 134 to local processor 138 and/or remote processors 136, 140. Additionally, power can be supplied via the wireline conveyance 134 to meet power requirements of the tool. For slickline or coiled tubing configurations, power can be supplied downhole with a battery or via a downhole generator.
Having disclosed example drilling environments and tools, the disclosure now turns to a discussion of classification and presentation of drilling reports and related concepts.
Operators and/or drillers can generate drilling reports for specific well operations, such as drilling operations and activities. As previously indicated, drilling reports can contain rich information and statistics about well state and well operations such as drilling activities. Indeed, drilling reports can contain a large amount of intelligence, data, statistics, etc., which can provide valuable insight into well state and operations. Non-limiting examples of data which can be contained in drilling reports include events, actions, symptoms, results, logging details, etc. Some or all of the information in drilling reports can be reported in situ by drillers and/or operators. Drilling reports can also include various types and/or formats of data, such as free-form text, symbols, formulas, acronyms, expressions, terms of art, etc.
For example, drilling tools can sometimes get stuck due to miscalculations or limited knowledge about the ground or surface. In this example, the NPT can include the time spent fishing or rescuing the tool and/or performing any adjustments before drilling resumes.
The reports 200, 202 can include a log of events during the PT and NPT, respectively, as well as other related information such as observations, analysis, notes, etc. The reports 200 and/or 202 can then be analyzed, classified, processed, etc., as further described below.
The text extraction and cleaning process 352 can include extracting text from a database 308 containing drilling reports to obtain a corpus 310 of text from the drilling reports. When generating the corpus, the text can be concatenated into one or more files. The corpus 310 can then be cleaned to yield cleaned text 312. The cleaning of text can involve removing certain symbols (e.g., &, #, −, etc.) and replacing acronyms with their corresponding short descriptions or full names (e.g., POOH replaced with pull out of hole, etc.). In some cases, symbols and/or short-text can be replaced with regular expressions. Below is a table of regular expression substitutions.
Further lemmatization can be performed in the cleaning step. For example, plurals can be removed or converted to singular form (e.g., wells can be converted to well).
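As a non-limiting illustration, the cleaning operations described above (symbol removal, acronym expansion, and conversion of plurals to singular form) could be sketched as follows. The acronym table, the symbol set, and the naive plural rule are assumptions made for illustration only; only the POOH substitution and the example symbols come from the description, and a practical implementation would likely rely on a fuller substitution table and a proper lemmatizer.

```python
import re

# Hypothetical acronym table; only POOH -> "pull out of hole" is taken from the description.
ACRONYMS = {"pooh": "pull out of hole"}

# Symbols to strip: the description names &, # and the minus sign (U+2212) as examples.
SYMBOLS = re.compile(r"[&#\u2212]")


def singularize(word: str) -> str:
    """Naive plural removal (assumption: stands in for a real lemmatizer)."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def clean_text(text: str) -> str:
    """Lowercase, strip symbols, expand acronyms, and singularize each word."""
    text = SYMBOLS.sub(" ", text.lower())
    words = []
    for token in text.split():
        expanded = ACRONYMS.get(token, token)
        words.extend(singularize(w) for w in expanded.split())
    return " ".join(words)


print(clean_text("POOH & check wells #2"))
```

In this sketch the cleaning acts as the denoising layer: every downstream step sees the same normalized vocabulary regardless of how a driller abbreviated or punctuated the original note.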
In the vectorization and plotting process 354, the cleaned text 312 can be split/divided/partitioned into words 358, and the resulting corpus can be passed to a word-to-vector transformation function 360. The word-to-vector transformation function 360 can perform one or more functions, algorithms, and/or operations to transform the words 358 into a set of vectors. In some cases, the set of vectors can be of high dimension, such as 300, for example.
The output of the word-to-vector transformation function (i.e., the set of vectors) can be projected in a plane to yield a projected plane. The plane can be, for example, a 2D Cartesian plane, a graph or scatter plot, etc. In some examples, the set of vectors can be projected into the plane by means of dimensionality reduction techniques, such as t-distributed stochastic neighbor embedding (t-SNE).
In the visualization process, the projected plane can be used to generate one or more visualizations. To illustrate, the projected plane can be used to generate a word cloud in
In the text extraction and cleaning process 352, drilling reports and/or operational notes can be extracted from database 308. In some cases, the text can be concatenated in chronological order, for example. The reports 310 extracted from the database 308 can then be cleaned to yield cleaned reports 312. The reports 310 can be cleaned as previously explained, by removing or replacing specific types of items, such as symbols, acronyms, etc. The cleaning operations can serve as a denoising layer.
In the word encoding and plotting process 354, a corpus 358 can be generated from the cleaned reports 312. A noise-contrastive estimation 360 or similar technique can be performed on the corpus 358, and words can be plotted and encoded.
Labeled sentences 362 can then be used to train a neural network 364 for performing classification of unseen sentences 366. The neural network 364 can vary. For example, the neural network 364 can be a simple network with arithmetic averaging, a convolutional neural network (CNN), a long short-term memory network (LSTM), etc. Moreover, the classification 366 can classify the sentences into categories or concepts, such as events, actions, symptoms, results, etc.
In one example, the neural network 364 can be a simple network with arithmetic averaging. In this example, fixed-length features can be assigned to sentences by averaging (i.e., reduction operation) their constituent word vectors. This feature can then be passed to a fully connected hidden layer with 20 tanh neurons followed by a softmax classification.
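As a non-limiting illustration, the forward pass of such an averaging network could be sketched as follows: the word vectors of a sentence are averaged into one fixed-length feature, passed through 20 tanh neurons, and then through a softmax layer. The four class names, the weight initialization range, and the random input vectors are assumptions for illustration; the weights here are untrained, so the output probabilities carry no meaning until the network is trained on labeled sentences.

```python
import math
import random

CLASSES = ["event", "symptom", "action", "result"]  # assumed label set
EMBED_DIM = 300  # embedding dimension mentioned in the description
HIDDEN = 20      # number of tanh neurons in the hidden layer


def average_vectors(vectors):
    """Reduce a variable-length list of word vectors to one fixed-length feature."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(word_vectors, params):
    """Average -> tanh hidden layer -> softmax over the classes."""
    (w1, b1), (w2, b2) = params
    feature = average_vectors(word_vectors)
    hidden = [math.tanh(h) for h in dense(feature, w1, b1)]
    return softmax(dense(hidden, w2, b2))


rng = random.Random(0)
params = (init_layer(rng, HIDDEN, EMBED_DIM), init_layer(rng, len(CLASSES), HIDDEN))
sentence = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(5)]
probs = classify(sentence, params)
print(dict(zip(CLASSES, probs)))
```

Because averaging discards word order, this architecture treats a sentence as a bag of word vectors, which is one reason order-aware networks such as a CNN or LSTM may also be considered.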
An example implementation of workflow 350 can be as follows. After the cleaning process, the total number of tokens and the vocabulary size in the corpus can be reduced to $T = 810375$ and $V = 17623$, respectively, for example. To illustrate, the Mikolov et al. methodology, known in the art, can be implemented. The corpus can be scanned with a fixed window of size $m = 3$, and each word $w_i$, $i = 1, 2, \ldots, V$, in the vocabulary can be assigned two random vectors $u_i, v_i \in [-1, 1]^d$, with $d = 300$ the embedding dimension. The word $w_i$ can be in the center of a window, in which case $v_i$ is the associated vector representation, or an outer (or target) word, for which $u_i$ is looked up likewise. Within a context window centered at a word $w_c$, a correct outer word $w_o$ can be sampled. Furthermore, $k = 64$ words $w_1, w_2, \ldots, w_k$ are sampled from the vocabulary at random from the unigram distribution $P(w)$. The probability of the pair $(w_c, w_o)$ can be maximized and the probability of the pairs $(w_c, w_i)$, $i = 1, 2, \ldots, k$, can be minimized with the objective:
$$J_t = \log \sigma(u_o^T v_c) + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P(w)}\left[\log \sigma(-u_i^T v_c)\right] \qquad \text{(Eq. 1)}$$
where $\sigma(x) = 1/(1 + e^{-x})$ is the sigmoid function. This process can be repeated for all context windows throughout the corpus, leading to the total objective $J = \sum_{t=1}^{T} J_t$.
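As a non-limiting illustration, the per-window objective of Eq. 1 could be evaluated as sketched below, with the expectation approximated by the k sampled negative words. The helper names, the toy word counts, and the small embedding dimension are assumptions for illustration; the log-sigmoid is written in a numerically stable form rather than as a literal logarithm of the sigmoid.

```python
import math
import random


def log_sigmoid(x):
    """Stable log(sigma(x)) with sigma(x) = 1 / (1 + exp(-x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sample_negatives(rng, counts, k):
    """Draw k word indices from the unigram distribution given raw word counts."""
    return rng.choices(range(len(counts)), weights=counts, k=k)


def window_objective(v_c, u_o, negatives):
    """J_t for one (center, outer) pair and a list of sampled negative outer vectors."""
    positive = log_sigmoid(dot(u_o, v_c))
    negative = sum(log_sigmoid(-dot(u_i, v_c)) for u_i in negatives)
    return positive + negative


# Toy setup (hypothetical): 5 words, dimension 4 instead of 300, k = 3 instead of 64.
rng = random.Random(0)
counts = [10, 5, 3, 1, 1]
U = [[rng.uniform(-1, 1) for _ in range(4)] for _ in counts]  # outer vectors u_i
V = [[rng.uniform(-1, 1) for _ in range(4)] for _ in counts]  # center vectors v_i
neg_ids = sample_negatives(rng, counts, k=3)
j_t = window_objective(V[0], U[1], [U[i] for i in neg_ids])
print(j_t)
```

Each term is the logarithm of a probability, so J_t is never positive; training pushes it toward zero, which is equivalent to minimizing the loss −J_t.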
A batch of b=128 word pairs can be processed at a time during minimization. Word vectors are updated iteratively with stochastic gradient descent and a learning rate of lr=1.0. At the end of the minimization the average loss may be approximately 2.02 and the resulting embedding can be illustrated in
To illustrate, clusters 422 representing words such as “incident”, “accident”, “issues”, and “environmental” may appear together in a cluster of issues 418. The issues 418 in this example can be a concept shared by such words (e.g., semantic relationship between the word vectors). Clusters 422 can also be formed according to specific patterns. For example, a cluster of integers 402 can include integers grouped in ascending order.
The word embedding can be robust to noise. For example, the words “remarks” and “remark” in the remarks 410 cluster can be arranged or depicted in close proximity based on their similarity or relevance to each other. Abbreviations can also be captured, such as in “circulate” and “circ”. These properties can be used for overcoming nuances of technical languages which may be common in drilling reports (e.g., 200 and 202).
In order to classify sentences in drilling reports, three different neural network architectures may be tested: simple network with arithmetic averaging, convolutional neural network (CNN) and long short-term memory network (LSTM).
For example, a cluster of words associated with the concept integers 454 can be selected 452 by a user. The selection 452 can trigger a search and/or presentation of reports 456 associated with the concept integers 454 selected. This may allow a user to identify and select a specific cluster or concept of interest and obtain drilling reports related to the selected cluster or concept. Users can thus quickly navigate and filter through reports in a visual and interactive manner.
The selection 452 can also result in the word cloud 450 maximizing or focusing on the selected cluster or concept. The maximized or focused cluster or concept can depict sub-clusters based on sub-concepts, which can also be selectable. This can allow a user to navigate and drill down into more granular concepts or clusters.
The dendrogram 500 can be interactive. Thus, a user can select a specific level to obtain reports associated with the selected level. The user can navigate the various levels 504-558 for a specific concept depending on the desired granularity.
Next, a convolution layer, which can include filters 628 and features 630, and a max pooling layer 632 can be interleaved twice in a total of four additional layers. The two convolutional layers 626 can include 128 filters of length 3 and the two max pooling layers 632 can halve their inputs. The architecture can further include a fully connected layer 634 composed of 128 ReLU neurons and a fully connected softmax layer 636.
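As a non-limiting illustration, the layer sizes of such an architecture could be traced as sketched below. The input sentence length of 50 words, the unpadded stride-1 convolution, the pooling window of 2, and the four output classes are assumptions for illustration; the filter count, filter length, and number of ReLU neurons follow the description.

```python
def conv1d_length(n, kernel=3):
    """Output length of an unpadded, stride-1 convolution (assumed padding mode)."""
    return n - kernel + 1


def pool_length(n, size=2):
    """Output length of max pooling that halves its input."""
    return n // size


def trace_shapes(seq_len, embed_dim=300, filters=128, kernel=3,
                 dense_units=128, n_classes=4):
    """List (layer name, output shape) pairs for conv/pool interleaved twice."""
    shapes = [("input", (seq_len, embed_dim))]
    length = seq_len
    for block in (1, 2):
        length = conv1d_length(length, kernel)
        shapes.append((f"conv{block}", (length, filters)))
        length = pool_length(length)
        shapes.append((f"pool{block}", (length, filters)))
        if length < 1:
            raise ValueError("sentence too short for two conv/pool blocks")
    shapes.append(("flatten", (length * filters,)))
    shapes.append(("dense_relu", (dense_units,)))
    shapes.append(("softmax", (n_classes,)))
    return shapes


for name, shape in trace_shapes(50):
    print(name, shape)
```

Under these assumptions a 50-word sentence shrinks to 48, 24, 22, and then 11 positions across the four layers, so the fully connected layer receives 11 × 128 = 1408 inputs.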
Referring to the neural networks illustrated in
However, drilling reports may contain high repetition of n-grams. For example,
The classification of each sentence in the reports can enable an analysis of sequencing behavior. For example, a drilling engineer may be interested in analyzing cases where the symptoms were followed by failure events without any action, or looking up all the actions taken for a specific symptom.
Referring to
The tool 1120 can include a search portion 1122 where a user can type a string, value, or query to be searched within the drilling reports. In the search portion 1122, the user can search for a particular classification, such as symptom 1124. For example, the user can type “Ream down to 3856 m and observe erratic torque” in the search portion 1122 to search drilling reports containing sentences classified as symptom that include the specified search parameter, namely, “Ream down to 3856 m and observe erratic torque”.
The tool 1120 can identify a particular sequence associated with the search, and identify any instances of the particular sequence found in the drilling reports. For example, the tool 1120 can identify a particular sequence of Symptom→Action→Result, and identify instances of that particular sequence within the drilling reports where the symptom 1124 in the sequence is similar to “Ream down to 3856 m and observe erratic torque”, as defined in the search portion 1122.
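As a non-limiting illustration, such a sequence search over classified sentences could be sketched as follows. The report contents are hypothetical, and the use of a character-level similarity ratio with a threshold of 0.6 is an assumption standing in for whatever similarity measure an implementation adopts; the description does not specify one.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1] (assumed stand-in for the tool's matching)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_sequences(labeled, pattern, query, threshold=0.6):
    """Return runs of consecutive sentences whose labels equal `pattern`
    and whose first sentence is similar to `query`."""
    hits = []
    n = len(pattern)
    for start in range(len(labeled) - n + 1):
        window = labeled[start:start + n]
        labels = [label for label, _ in window]
        if labels == list(pattern) and similarity(window[0][1], query) >= threshold:
            hits.append([text for _, text in window])
    return hits


# Hypothetical classified sentences from one drilling report, in chronological order.
report = [
    ("symptom", "Ream down to 3850 m and observe erratic torque"),
    ("action", "Pull out of hole to inspect bit"),
    ("result", "Torque back to normal"),
    ("symptom", "Observe mud losses"),
    ("action", "Pump lost circulation material"),
    ("result", "Losses cured"),
]

hits = find_sequences(
    report,
    pattern=("symptom", "action", "result"),
    query="Ream down to 3856 m and observe erratic torque",
)
for symptom, action, result in hits:
    print(symptom, "->", action, "->", result)
```

Changing `pattern` to `("symptom", "event")` would adapt the same search to a different sequence of classifications.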
Based on an example search for the sequence of Symptom→Action→Result, where the symptom is “Ream down to 3856 m and observe erratic torque”, the tool 1120 can present the actions 1128 and results 1130, 1132 for that particular search. This way, the user can view or access the actions and results in the drilling reports resulting from the symptom 1124 from the search.
The tool 1120 can depict the actions 1128 and results 1130, 1132 in different ways. For example, actions 1128 can be displayed as text, an image, a code, a summary, etc. Similarly, the results 1130, 1132 can be displayed as text, charts, graphics, percentages or other values, etc. Moreover, the tool 1120 can depict a description 1126 of the search results, as well as other information, such as links, summaries, documents, etc.
The tool 1120 can also allow the user to interact with the actions 1128 and/or results 1130, 1132. For example, the user can select a specific action to retrieve additional information about that action and/or corresponding report(s). As another example, the user can select a result to modify the searched sequence to include results involving the selected result. Moreover, the user can select one or more specific reports from the tool 1120, which can be identified based on the search results. The user can also modify the search string or value in the search portion 1122 and/or the sequence for the search. For example, the user can modify the sequence to include Symptom→Event or Event←Symptom in order to find the events following a particular symptom or the symptom preceding a particular event.
The tool 1120 can be used by a user to support decisions in real-time or during operations. For example, whenever a symptom is observed during a drilling operation, the tool 1120 can display actions that were taken in the past and the results from those actions.
Having disclosed some basic system components and concepts, the disclosure now turns to the example method embodiment shown in
At step 1300, the method can involve obtaining drilling reports associated with respective well drilling or operation activities. Based on the drilling reports, at step 1302, the method can involve generating word vectors. Each word vector can represent a respective word in the drilling reports, as previously explained.
In some cases, the drilling reports can be processed through a denoising layer to yield a corpus. The denoising layer can be configured to replace acronyms with corresponding descriptions, remove symbols, replace symbols with regular expressions, change plurals to singular form, and clean up the text in other ways.
At step 1304, the method can involve partitioning sentences in the drilling reports into respective words. At step 1306, the method can involve, for each sentence, identifying respective word vectors from the word vectors. The respective word vectors can correspond to the respective words associated with the sentence. For example, the word vectors can correspond to the words obtained when the sentence was partitioned into words.
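As a non-limiting illustration, steps 1304 and 1306 could be sketched as follows: a sentence is partitioned into words, and each word is looked up in a table of previously generated word vectors. The tokenization rule, the three-dimensional toy vectors, and the choice to skip words missing from the table are assumptions for illustration.

```python
import re


def split_words(sentence):
    """Partition a sentence into lowercase alphanumeric words (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def lookup_vectors(sentence, embeddings):
    """Return the word vectors for the words of a sentence, skipping unknown words."""
    vectors = []
    for word in split_words(sentence):
        vector = embeddings.get(word)
        if vector is not None:
            vectors.append(vector)
    return vectors


# Hypothetical 3-dimensional embeddings; real vectors would have dimension 300.
embeddings = {
    "observe": [0.1, -0.2, 0.3],
    "erratic": [0.4, 0.0, -0.1],
    "torque": [-0.3, 0.2, 0.5],
}

vectors = lookup_vectors("Observe erratic torque at 3856 m.", embeddings)
print(len(vectors))
```

The resulting list of vectors is the per-sentence input that a classifier can then combine.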
In some cases, step 1304 can involve splitting a corpus processed through a denoising layer into words, and step 1306 can include processing the words through a word-to-vector transformation operation to generate word vectors for the words from the corpus cleaned via the denoising layer.
The word vectors can be used to generate an interactive visualization, such as a word cloud or a dendrogram. For example, the word vectors can be projected into a Cartesian plane and visually presented in an interface, such as a word cloud. Each word or word vector can be labeled. For example, word vectors can be labeled based on a respective word corresponding to the word vector in the Cartesian plane. The word vectors can be clustered in the Cartesian plane based on semantic relationships between the respective words represented by the word vectors.
Based on the respective word vectors, at step 1308, the method can involve feeding the vectors for each word in a sentence into a neural network that takes care of combining the word vectors in a non-trivial way. This combination can represent the associated sentences, which are classified into respective events, respective symptoms, respective actions, and/or respective results.
The method can also involve identifying respective sequences based on the classifications. For example, the method can involve identifying sequences of events, symptoms, actions, and/or results. The sequences can be presented via a graphical tool to allow a user to view or access sequences of information based on filtering criteria, such as a defined sequence and/or one or more search strings for the search.
For example, a user can provide filtering criteria which define a particular string or value for an event, symptom, action, and/or result, and a graphical tool can filter respective content and sequences from drilling reports to identify one or more sequences matching the filtering criteria. The filtering criteria can also specify the particular sequence to be searched or identified. Thus, the filtering criteria can not only define a string or value to search for a particular concept, such as an event or symptom, but also a sequence of concepts, such as a symptom followed by an action taken in response to the symptom, where one or more of the concepts in the sequence of concepts matches the search string or value.
The graphical tool can present the identified sequence(s), including any associated information. For example, based on a search of the symptom A for a sequence of Symptom→Action, the graphical tool can present all of the instances of symptom A found in the drilling reports as well as their corresponding action. The graphical tool can also enable the user to interact with search results, refine search results, access data or reports from the search results, or even reconfigure the display of search results (e.g., graph, list, table, etc.).
Having disclosed example systems and concepts for classification and visualization of drilling reports, the disclosure now turns to
Example system and/or computing device 1400 includes a processing unit (CPU or processor) 1410 and a system bus 1405 that couples various system components including the system memory 1415 such as read only memory (ROM) 1420 and random access memory (RAM) 1425 to the processor 1410. The processors of
Other system memory 1415 may be available for use as well. The memory 1415 can include multiple different types of memory with different performance characteristics. It can be appreciated that the disclosure may operate on a computing device 1400 with more than one processor 1410 or on a group or cluster of computing devices networked together to provide greater processing capability. The processor 1410 can include any general purpose processor and a hardware module or software module, such as module 1 1432, module 2 1434, and module 3 1436 stored in storage device 1430, configured to control the processor 1410 as well as a special-purpose processor where software instructions are incorporated into the processor. The processor 1410 may be a self-contained computing system, containing multiple cores or processors, a bus, memory controller, cache, etc. A multi-core processor may be symmetric or asymmetric. The processor 1410 can include multiple processors, such as a system having multiple, physically separate processors in different sockets, or a system having multiple processor cores on a single physical chip. Similarly, the processor 1410 can include multiple distributed processors located in multiple separate computing devices, but working together such as via a communications network. Multiple processors or processor cores can share resources such as memory 1415 or the cache 1412, or can operate using independent resources. The processor 1410 can include one or more of a state machine, an application specific integrated circuit (ASIC), or a programmable gate array (PGA) including a field PGA.
The system bus 1405 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. A basic input/output system (BIOS) stored in ROM 1420 or the like, may provide the basic routine that helps to transfer information between elements within the computing device 1400, such as during start-up. The computing device 1400 further includes storage devices 1430 or computer-readable storage media such as a hard disk drive, a magnetic disk drive, an optical disk drive, tape drive, solid-state drive, RAM drive, removable storage devices, a redundant array of inexpensive disks (RAID), hybrid storage device, or the like. The storage device 1430 can include software modules 1432, 1434, 1436 for controlling the processor 1410. The system 1400 can include other hardware or software modules. The storage device 1430 is connected to the system bus 1405 by a drive interface. The drives and the associated computer-readable storage devices provide nonvolatile storage of computer-readable instructions, data structures, program modules and other data for the computing device 1400.
In one aspect, a hardware module that performs a particular function includes the software component stored in a tangible computer-readable storage device in connection with the necessary hardware components, such as the processor 1410, bus 1405, display or output device 1435, and so forth, to carry out a particular function. In another aspect, the system can use a processor and computer-readable storage device to store instructions which, when executed by the processor, cause the processor to perform operations, a method or other specific actions. The basic components and appropriate variations can be modified depending on the type of device, such as whether the device 1400 is a small, handheld computing device, a desktop computer, or a computer server. When the processor 1410 executes instructions to perform “operations”, the processor 1410 can perform the operations directly and/or facilitate, direct, or cooperate with another device or component to perform the operations.
Although the exemplary embodiment(s) described herein employs the hard disk 1430, other types of computer-readable storage devices which can store data that are accessible by a computer, such as magnetic cassettes, flash memory cards, digital versatile disks (DVDs), cartridges, random access memories (RAMs) 1425, read only memory (ROM) 1420, a cable containing a bit stream and the like, may also be used in the exemplary operating environment. Tangible computer-readable storage media, computer-readable storage devices, or computer-readable memory devices, expressly exclude media such as transitory waves, energy, carrier signals, electromagnetic waves, and signals per se.
To enable user interaction with the computing device 1400, an input device 1445 represents any number of input mechanisms, such as a microphone for speech, a touch-sensitive screen for gesture or graphical input, keyboard, mouse, motion input, speech and so forth. An output device 1435 can also be one or more of a number of output mechanisms known to those of skill in the art. In some instances, multimodal systems enable a user to provide multiple types of input to communicate with the computing device 1400. The communications interface 1440 generally governs and manages the user input and system output. There is no restriction on operating on any particular hardware arrangement and therefore the basic hardware depicted may easily be substituted for improved hardware or firmware arrangements as they are developed.
For clarity of explanation, the illustrative system embodiment is presented as including individual functional blocks including functional blocks labeled as a “processor” or processor 1410. The functions these blocks represent may be provided through the use of either shared or dedicated hardware, including, but not limited to, hardware capable of executing software and hardware, such as a processor 1410, that is purpose-built to operate as an equivalent to software executing on a general purpose processor. For example the functions of one or more processors presented in
The logical operations of the various embodiments are implemented as: (1) a sequence of computer implemented steps, operations, or procedures running on a programmable circuit within a general use computer, (2) a sequence of computer implemented steps, operations, or procedures running on a specific-use programmable circuit; and/or (3) interconnected machine modules or program engines within the programmable circuits. The system 1400 shown in
One or more parts of the example computing device 1400, up to and including the entire computing device 1400, can be virtualized. For example, a virtual processor can be a software object that executes according to a particular instruction set, even when a physical processor of the same type as the virtual processor is unavailable. A virtualization layer or a virtual “host” can enable virtualized components of one or more different computing devices or device types by translating virtualized operations to actual operations. Ultimately however, virtualized hardware of every type is implemented or executed by some underlying physical hardware. Thus, a virtualization compute layer can operate on top of a physical compute layer. The virtualization compute layer can include one or more of a virtual machine, an overlay network, a hypervisor, virtual switching, and any other virtualization application.
The processor 1410 can include all types of processors disclosed herein, including a virtual processor. However, when referring to a virtual processor, the processor 1410 includes the software components associated with executing the virtual processor in a virtualization layer and underlying hardware necessary to execute the virtualization layer. The system 1400 can include a physical or virtual processor 1410 that receives instructions stored in a computer-readable storage device, which cause the processor 1410 to perform certain operations. When referring to a virtual processor 1410, the system also includes the underlying physical hardware executing the virtual processor 1410.
Embodiments within the scope of the present disclosure may also include tangible and/or non-transitory computer-readable storage devices for carrying or having computer-executable instructions or data structures stored thereon. Such tangible computer-readable storage devices can be any available device that can be accessed by a general purpose or special purpose computer, including the functional design of any special purpose processor as described above. By way of example, and not limitation, such tangible computer-readable devices can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other device which can be used to carry or store desired program code in the form of computer-executable instructions, data structures, or processor chip design. When information or instructions are provided via a network or another communications connection (either hardwired, wireless, or combination thereof) to a computer, the computer properly views the connection as a computer-readable medium. Thus, any such connection is properly termed a computer-readable medium. Combinations of the above should also be included within the scope of the computer-readable storage devices.
Computer-executable instructions include, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. Computer-executable instructions also include program modules that are executed by computers in stand-alone or network environments. Generally, program modules include routines, programs, components, data structures, objects, and the functions inherent in the design of special-purpose processors, etc. that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of the program code means for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.
Other embodiments of the disclosure may be practiced in network computing environments with many types of computer system configurations, including personal computers, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. Embodiments may also be practiced in distributed computing environments where tasks are performed by local and remote processing devices that are linked (either by hardwired links, wireless links, or by a combination thereof) through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
It will be appreciated that for simplicity and clarity of illustration, where appropriate, reference numerals have been repeated among the different figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein can be practiced without these specific details. In other instances, methods, procedures and components have not been described in detail so as not to obscure the relevant feature being described. Also, the description is not to be considered as limiting the scope of the embodiments described herein. The drawings are not necessarily to scale and the proportions of certain parts have been exaggerated to better illustrate details and features of the present disclosure.
In the above description, terms such as “upper,” “upward,” “lower,” “downward,” “above,” “below,” “downhole,” “uphole,” “longitudinal,” “lateral,” and the like, as used herein, shall mean in relation to the bottom or furthest extent of the surrounding wellbore, even though the wellbore or portions of it may be deviated or horizontal.
Correspondingly, the transverse, axial, lateral, longitudinal, radial, etc., orientations shall mean orientations relative to the orientation of the wellbore or tool. Additionally, the illustrated embodiments are depicted such that the right-hand side is downhole compared to the left-hand side.
The term “coupled” is defined as connected, whether directly or indirectly through intervening components, and is not necessarily limited to physical connections. The connection can be such that the objects are permanently connected or releasably connected. The term “outside” refers to a region that is beyond the outermost confines of a physical object. The term “inside” indicates that at least a portion of a region is partially contained within a boundary formed by the object. The term “substantially” is defined to be essentially conforming to the particular dimension, shape, or other word that “substantially” modifies, such that the component need not be exact. For example, substantially cylindrical means that the object resembles a cylinder, but can have one or more deviations from a true cylinder.
The term “radially” means substantially in a direction along a radius of the object, or having a directional component in a direction along a radius of the object, even if the object is not exactly circular or cylindrical. The term “axially” means substantially along a direction of the axis of the object. If not specified, the term axially is such that it refers to the longer axis of the object.
Although a variety of information was used to explain aspects within the scope of the appended claims, no limitation of the claims should be implied based on particular features or arrangements, as one of ordinary skill would be able to derive a wide variety of implementations. Further, although some subject matter may have been described in language specific to structural features and/or method steps, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to these described features or acts. For example, such functionality can be distributed differently or performed in components other than those identified herein. Rather, the described features and steps are disclosed as possible components of systems and methods within the scope of the appended claims.
Moreover, claim language reciting “at least one of” a set indicates that one member of the set or multiple members of the set satisfy the claim. For example, claim language reciting “at least one of A and B” means A, B, or A and B.
Statements of the disclosure include:
Statement 1: A method comprising obtaining drilling reports associated with respective well drilling or operation activities; based on the drilling reports, generating a plurality of word vectors, wherein each word vector from the plurality of word vectors represents a respective word in the drilling reports; partitioning sentences in the drilling reports into respective words; for each sentence, identifying respective word vectors from the plurality of word vectors, the respective word vectors corresponding to the respective words associated with the sentence; based on the respective word vectors, classifying, via a neural network, the sentences into at least one of respective events, respective symptoms, respective actions, respective results, and a different category of labels.
Statement 2: A method according to Statement 1, wherein the drilling reports comprise reports generated during at least one of non-productive time periods and productive time periods in the respective well drilling or operation activities, the non-productive time periods being associated with a troubleshooting event corresponding to at least one of a failure, an error, a problem, and a disruption, and the productive time periods comprising periods of time when drilling operations are being performed.
Statement 3: A method according to any of Statements 1 and 2, further comprising projecting the plurality of word vectors into a cartesian plane; and labeling each word vector in the cartesian plane based on a respective word corresponding to the word vector in the cartesian plane.
Statement 4: A method according to any of Statements 1 through 3, further comprising: clustering the plurality of word vectors in the cartesian plane based on semantic relationships between the respective word represented by each word vector, to yield semantic clusters.
Statement 5: A method according to any of Statements 1 through 4, further comprising: generating, based on the semantic clusters, a graphical word cloud with semantic relationships that is navigable with different levels of granularity selected in a corresponding dendrogram.
Statement 6: A method according to any of Statements 1 through 5, further comprising: presenting the graphical word cloud on a display, wherein at least one of semantic clusters of words in the graphical word cloud and the words associated with the semantic clusters of words are user-selectable via a computing device associated with the display, wherein user selection of the at least one of semantic clusters of words in the graphical word cloud and the words associated with the semantic clusters of words triggers a presentation of one or more respective reports associated with the at least one of semantic clusters of words in the graphical word cloud and the words associated with the semantic clusters of words.
Statement 7: A method according to any of Statements 1 through 6, further comprising: based on the classifying of the sentences and filtering criteria defining a particular event, a particular symptom, a particular action, or a particular result, filtering the respective sequences to identify one or more sequences in drilling reports matching the filtering criteria; and presenting the identified one or more sequences matching the filtering criteria, the one or more sequences comprising at least one of the particular event, the particular symptom, the particular action, and the particular result.
Statement 8: A method according to any of Statements 1 through 7, further comprising: based on extracted spatial location information, plotting on a Geographic Information System (GIS) the classification of sentences into respective events, respective symptoms, respective actions, and respective results.
Statement 9: A system comprising: one or more processors; and at least one computer-readable storage medium having stored therein instructions which, when executed by the one or more processors, cause the one or more processors to: obtain a plurality of drilling reports associated with respective well drilling or operation activities; based on a word-to-vector transformation operation on the plurality of drilling reports, generate a plurality of word vectors, wherein each word vector from the plurality of word vectors represents a respective word in the plurality of drilling reports; partition sentences in the plurality of drilling reports into respective words; for each sentence, identify respective word vectors from the plurality of word vectors, the respective word vectors corresponding to the respective words associated with the sentence; and based on the respective word vectors, classify, via a neural network, the sentences into at least one of respective events, respective symptoms, respective actions, respective results, and a different category of labels.
Statement 10: A system according to Statement 9, the at least one computer-readable storage medium storing additional instructions which, when executed by the one or more processors, cause the one or more processors to: project the plurality of word vectors into a cartesian plane; and label each word vector in the cartesian plane based on a respective word corresponding to the word vector in the cartesian plane.
Statement 11: A system according to any of Statements 9 and 10, the at least one computer-readable storage medium storing additional instructions which, when executed by the one or more processors, cause the one or more processors to: cluster the plurality of word vectors in the cartesian plane based on semantic relationships between the respective word represented by each word vector, to yield semantic clusters.
Statement 12: A system according to any of Statements 9 through 11, the at least one computer-readable storage medium storing additional instructions which, when executed by the one or more processors, cause the one or more processors to: generate, based on the semantic clusters, a graphical word cloud with semantic relationships.
Statement 13: A system according to any of Statements 9 through 12, the at least one computer-readable storage medium storing additional instructions which, when executed by the one or more processors, cause the one or more processors to: present the graphical word cloud on a display, wherein at least one of semantic clusters of words in the graphical word cloud and the words associated with the semantic clusters of words are user-selectable via a computing device associated with the display, wherein user selection of the at least one of semantic clusters of words in the graphical word cloud and the words associated with the semantic clusters of words triggers a presentation of one or more respective reports associated with the at least one of semantic clusters of words in the graphical word cloud and the words associated with the semantic clusters of words.
Statement 14: A system according to any of Statements 9 through 13, the at least one computer-readable storage medium storing additional instructions which, when executed by the one or more processors, cause the one or more processors to: based on the classifying of the sentences and on filtering criteria defining a particular event, a particular symptom, a particular action, or a particular result, filter the respective sequences to identify one or more sequences matching the filtering criteria; and present the identified one or more sequences matching the filtering criteria, the one or more sequences comprising at least one of the particular event, the particular symptom, the particular action, and the particular result.
Statement 15: A system according to any of Statements 9 through 14, the at least one computer-readable storage medium storing additional instructions which, when executed by the one or more processors, cause the one or more processors to: based on the classifying of the sentences into at least one of respective events, respective symptoms, respective actions, and respective results, as well as extraction of spatial location information, plot on a Geographic Information System (GIS) the classification of sentences into respective events, respective symptoms, respective actions, and respective results.
Statement 16: A non-transitory computer-readable storage medium comprising: instructions stored on the non-transitory computer-readable storage medium, the instructions, when executed by at least one processor, cause the at least one processor to: obtain a plurality of drilling reports associated with respective well drilling or operation activities; based on a word-to-vector transformation operation on the plurality of drilling reports, generate a plurality of word vectors, wherein each word vector from the plurality of word vectors represents a respective word in the plurality of drilling reports; partition sentences in the plurality of drilling reports into respective words; for each sentence, identify respective word vectors from the plurality of word vectors, the respective word vectors corresponding to the respective words associated with the sentence; and based on the respective word vectors, classify, via a neural network, the sentences into at least one of respective events, respective symptoms, respective actions, respective results, and a different category of labels.
Statement 17: A non-transitory computer-readable storage medium according to Statement 16, storing additional instructions which, when executed by the at least one processor, cause the at least one processor to: based on the classifying of the sentences into at least one of respective events, respective symptoms, respective actions, and respective results, determine respective sequences of at least two of specific events, specific symptoms, specific actions, and specific results; based on filtering criteria defining a particular event, a particular symptom, a particular action, or a particular result, filter the respective sequences to identify one or more sequences matching the filtering criteria; and present the identified one or more sequences matching the filtering criteria, the one or more sequences comprising at least one of the particular event, the particular symptom, the particular action, and the particular result.
Statement 18: A non-transitory computer-readable storage medium according to any of Statements 16 and 17, storing additional instructions which, when executed by the at least one processor, cause the at least one processor to: based on the classifying of the sentences into at least one of respective events, respective symptoms, respective actions, and respective results, as well as extraction of spatial location information, plot on a Geographic Information System (GIS) the classification of sentences into respective events, respective symptoms, respective actions, and respective results.
Statement 19: A non-transitory computer-readable storage medium according to any of Statements 16 through 18, storing additional instructions which, when executed by the at least one processor, cause the at least one processor to: process the drilling reports through a denoising layer to yield a corpus, the denoising layer being configured to perform at least one of: replacing acronyms with corresponding descriptions, removing symbols, replacing symbols with regular expressions, and changing plurals to singular form.
Statement 20: A non-transitory computer-readable storage medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform a method according to any of Statements 1 through 18.
Statement 21: A method comprising: generating interactive plots for exploration of concepts in drilling reports; receiving one or more queries based on classified sentences from the drilling reports, the one or more queries including queries of wells that present a particular sequence of symptoms, actions, and results; and generating a Geographic Information System (GIS) plot which presents classifications with spatial coordinates.
Statement 22: A system comprising means for performing a method according to any of Statements 1 through 8.
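By way of illustration only, and not limitation, the denoising, sentence partitioning, and word vector identification operations recited in the Statements above can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function names `singularize`, `denoise`, and `lookup_vectors`, the acronym table, the naive plural rule, and the hand-written word vectors are all hypothetical and are introduced solely for the example. In practice, the word vectors would be produced by a word-to-vector transformation over the corpus and the resulting vectors would be provided to a neural network classifier, neither of which is shown here.

```python
import re

# Hypothetical acronym table (assumption); a real system would rely on a
# curated glossary of drilling terminology.
ACRONYMS = {
    "pooh": "pull out of hole",
    "rih": "run in hole",
    "bha": "bottom hole assembly",
}


def singularize(word):
    """Naive plural-to-singular rule (assumption): drop one trailing 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def denoise(sentence):
    """Lower-case a sentence, remove symbols, expand acronyms, and
    change plurals to singular form, returning the resulting words."""
    # Replace every character that is not a letter, digit, or whitespace.
    text = re.sub(r"[^a-z0-9\s]", " ", sentence.lower())
    words = []
    for token in text.split():
        expanded = ACRONYMS.get(token, token)
        words.extend(singularize(w) for w in expanded.split())
    return words


def lookup_vectors(words, word_vectors):
    """Return the vector of each word that has one; unknown words are skipped."""
    return [word_vectors[w] for w in words if w in word_vectors]


if __name__ == "__main__":
    # Toy two-dimensional vectors, hand-written for the example only.
    toy_vectors = {"pump": [0.1, 0.2], "failed": [0.3, 0.4]}
    words = denoise("POOH w/ BHA; pumps failed!")
    print(words)
    print(lookup_vectors(words, toy_vectors))
```

In this sketch, words without a known vector are simply skipped. Other arrangements, such as mapping unknown words to a shared placeholder vector, are equally possible.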
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2016/066621 | 12/14/2016 | WO | 00 |