Using machine learning to detect drilling, reservoir and production problems in unstructured text

The GeoClassifier® algorithm can detect operational problems in reports, documents, logs and other forms of unstructured text. Machine learning (neural networks) is used for prediction, complementing the existing machine learning model in GeoClassifier® which detects well / borehole names without using a prior list of names. These can be used to support oil & gas, mining, renewables, carbon capture & storage (CCS), geothermal and hydrogeology sectors.

Discover Subsurface and Geoscience Knowledge not Documents.

Find and discover geoscience knowledge, not documents. Here is an example of how organisations are exploiting the output from OpportunityFinder®, generated by applying the algorithm to their unstructured text (PDF, PPT, Word, Excel, XML/JSON, image files and so on) held on file shares and document management systems.

This company has used Microsoft Power BI on top of the CSV created by OpportunityFinder. In just a couple of weeks, it configured an application that allows geoscientists to discover petroleum, chemical element, mineral system and geo-resource associations in time and space, as well as casting a wide net for analogue discovery.
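
As a rough illustration of this kind of association discovery, the sketch below groups extracted concepts by location from a CSV export, pooling evidence from every document that mentions the location. The column names (`document`, `location`, `concept`) and the sample rows are assumptions made for this example; they are not the actual OpportunityFinder output schema.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV export: column names and rows are invented for illustration.
SAMPLE_CSV = """document,location,concept
report_01.pdf,Basin A,source rock
report_01.pdf,Basin A,seal
report_02.pdf,Basin A,source rock
report_02.pdf,Basin A,reservoir
report_03.pdf,Basin B,copper
report_03.pdf,Basin B,porphyry
report_04.pdf,Basin B,copper
"""


def associations_by_location(csv_text):
    """Group extracted concepts by location, pooling evidence from all documents."""
    grouped = defaultdict(lambda: {"concepts": set(), "documents": set()})
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = grouped[row["location"]]
        entry["concepts"].add(row["concept"])
        entry["documents"].add(row["document"])
    # Return (sorted concepts, number of supporting documents) per location.
    return {
        location: (sorted(entry["concepts"]), len(entry["documents"]))
        for location, entry in grouped.items()
    }


associations = associations_by_location(SAMPLE_CSV)
for location, (concepts, n_documents) in sorted(associations.items()):
    print(f"{location}: {', '.join(concepts)} (from {n_documents} documents)")
```

In this invented sample, no single report mentions source rock, seal and reservoir together for Basin A; the association only appears once the rows from both reports are combined. A dashboard tool would typically do the same grouping visually.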

This allows geoscientists to uncover knowledge that is typically buried so far down traditional document search result lists that they would never normally read it. The algorithm takes this a step further, using a lexicon of 75,000 terms, machine learning and patented methods to ‘join the dots’ and suggest patterns for potential missed leads, ideas and exploration. OpportunityFinder® is used internationally by organisations in the oil and gas, metals and mining, geothermal and hydrogen sectors to support the energy transition.
#geoscience #digitaltransformation #naturallanguageprocessing #Python #datascience #oilandgas #mining

Text mining for Geo-resources

Discover new insights in geoscience documents, using patterns in unstructured text to detect petroleum, mineral, hydro, geothermal and hydrogen exploration opportunities.

First-of-a-kind OpportunityFinder® and GeoClassifier® algorithms are now integrated. Teaching machines about geoscience.

Apply to deep archives, documents on your shared drive, or in Microsoft Sharepoint or Document Management Systems. Apply to external geoscience subscription reports and literature. 
Visualise outputs quickly (knowledge graph like no other) in existing applications or build new ones. Combine with structured data. A digital assistant for geoscientists.
Used to identify hidden geo-resources by some of the world’s largest companies.
Supercharge your existing digital transformation initiatives with patented geoscience technology.

Text Mining: OpportunityFinder® algorithm extends into Porphyry Copper

The OpportunityFinder Python based Natural Language Processing (NLP) algorithm has been extended to detect clues for porphyry copper in text.

Launched in early 2020 and used by organisations for petroleum and native hydrogen exploration, the algorithm uses hundreds of thousands of lexicon terms, taxonomy entries and labelled data items for machine learning models. The novel patented method combines these, placing a geological lens over unstructured information and turning it into structured information which can be visualised.

This can help the geoscientist ‘read’ hundreds of thousands, or even millions, of notes, papers, reports, presentations, logs, maps and sketches for clues to potentially new hidden opportunities. Some ideas and opportunities may only become apparent by combining clues from many documents.

Why now? New approaches may be needed to contribute to the change that is likely required. Limiting global warming to 2 degrees means more electronics, more renewable energy such as wind turbines and solar panels, and more electric vehicles. Wood Mackenzie estimate this means an 85% increase in copper is needed by 2030.


GeoClassifier® machine learning prediction: now trained with a quarter of a million labelled geoscience sentences.

GeoClassifier® can automatically classify sentences, paragraphs and documents into geoscience categories and detect well/borehole names in text. GeoClassifier® uses over 250,000 labelled public geoscience sentences to train deep learning models to achieve this. When an organisation licenses the algorithm, it also receives the actual training data, so it can build and train its own ML models for prediction if it wishes.
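
To give a feel for how labelled sentences can train a model that predicts a category, here is a minimal sketch. GeoClassifier® uses deep learning models; this example deliberately swaps in a much simpler bag-of-words naive Bayes classifier so that it runs with the Python standard library alone. The four training sentences and the two category labels are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Invented training examples; real training data would hold far more sentences.
TRAINING = [
    ("The shale interval is a mature source rock with high TOC.", "source rock"),
    ("Organic-rich mudstones generated oil in the late Cretaceous.", "source rock"),
    ("Porous sandstones form the main reservoir with good permeability.", "reservoir"),
    ("The reservoir sandstone shows porosity of twenty percent.", "reservoir"),
]


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter()            # label -> number of sentences
        self.vocabulary = set()

    def train(self, examples):
        for sentence, label in examples:
            tokens = tokenize(sentence)
            self.word_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocabulary.update(tokens)

    def predict(self, sentence):
        tokens = tokenize(sentence)
        total_sentences = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(count / total_sentences)
            words_in_label = sum(self.word_counts[label].values())
            denominator = words_in_label + len(self.vocabulary)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


classifier = NaiveBayesClassifier()
classifier.train(TRAINING)
print(classifier.predict("Sandstone reservoir with high porosity and permeability"))
```

An organisation holding its own labelled sentences could replace `TRAINING` with that data and substitute whichever model family it prefers; the train and predict steps keep the same shape.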

OpportunityFinder® for geoscience text processing: 1 Million Documents in 26 hours

Because of the way the patented algorithm has been designed, it can check millions of permutations in every sentence of a document extremely quickly. Across a large corpus of text, this equates to trillions of permutations.

Run on nothing more sophisticated than a standard i7 high street laptop, the algorithm processed 1 Million Documents for extractions in 26 hours. If an organisation had more resources than one laptop, this could be substantially quicker again.
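
The figures above work out to roughly 10.7 documents per second, or about 94 milliseconds per document. The short calculation below shows the arithmetic, together with an idealised estimate for several machines. The `estimated_hours` helper and its assumption of perfectly linear scaling are illustrative only; real speed-ups depend on storage, network and how the corpus is split.

```python
DOCUMENTS = 1_000_000
HOURS = 26

seconds = HOURS * 3600
docs_per_second = DOCUMENTS / seconds
ms_per_document = 1000 * seconds / DOCUMENTS


def estimated_hours(documents, machines, docs_per_second_per_machine):
    """Idealised run time, assuming throughput scales linearly with machines."""
    return documents / (machines * docs_per_second_per_machine) / 3600


print(f"{docs_per_second:.1f} documents per second")
print(f"{ms_per_document:.1f} ms per document")
print(f"{estimated_hours(DOCUMENTS, 4, docs_per_second):.1f} hours on 4 machines (idealised)")
```

Under that linear-scaling assumption, four comparable laptops would take about six and a half hours for the same corpus.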

These permutations may include clues for petroleum plays, conditions for ore bodies/minerals, evidence for hydrogen as well as empirical evidence, generic Geobody, Lithostratigraphic, Chronostratigraphic, Lithological, Environmental and Structural evidence.

Subsurface Insights and Natural Language Processing in the Geosciences

Infoscience Technologies was delighted to guest author an article on Natural Language Processing (NLP) in the Geosciences for Halliburton’s September issue of Subsurface Insights magazine. This month’s issue is a Minerals special.

Sign up free to Halliburton’s magazine for a copy.

GeoClassifier® – Machine Learning detection of Well Names in Unstructured Text

Detecting entities such as well names in unstructured text can be useful for many aspects of information discovery.

Lookup lists from corporate databases and regular expression pattern rules can be useful. They do have limitations though: it can be difficult to predict what may lie within thousands of old reports and documents.
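
The sketch below shows the lookup-list and pattern-rule approach and where it falls short. The well names, the lookup list and the regular expression are all invented for this example; the pattern is a simplified assumption covering names shaped like `9/13a-22`, and real naming conventions vary by country and era.

```python
import re

# Hypothetical lookup list, standing in for an export from a corporate database.
KNOWN_WELLS = {"Alpha-1", "Bravo North 2"}

# Simplified, assumed pattern for names shaped like "9/13a-22" or "211/29-A7".
WELL_PATTERN = re.compile(r"\b\d{1,3}/\d{1,2}[a-z]?-[A-Z]?\d{1,3}[A-Z]?\b")


def find_well_names(text):
    """Return well names found by the pattern rule or the lookup list."""
    found = set(WELL_PATTERN.findall(text))
    found.update(name for name in KNOWN_WELLS if name in text)
    return sorted(found)


text = (
    "Well 9/13a-22 was sidetracked after Alpha-1 found gas. "
    "The old Smith Farm No 3 borehole was never logged."
)
print(find_well_names(text))  # ['9/13a-22', 'Alpha-1']
```

The invented name "Smith Farm No 3" is missed because it is neither in the list nor shaped like the pattern, which is the kind of gap a trained model is intended to help with.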

A machine learning model tuned and trained on thousands of public domain examples may help here, and can support existing digitalisation activities. This new capability was added to the GeoClassifier® algorithm recently.

The screenshot shows some examples in oil & gas, geothermal, hydrogeology, mining and carbon capture sectors.

A test on several hundred UK License Reports gave 96% accuracy in detecting 718 well names.
