OpportunityFinder® Customer survey

A recent survey was undertaken of a large organisation that has been using the Natural Language Processing / Machine Learning Python algorithm OpportunityFinder for the past 12 months.

They have been applying the algorithm to millions of documents to extract knowledge and ideas.

Compared with their existing traditional search engines, they estimated that:

1. The algorithm has reduced the time they spend searching for technical information by over 50%.

2. The algorithm has increased the chance they will discover new knowledge they would not have found otherwise (using traditional search engines) by 75%.


File system migration to a document management system, support for acquisitions and mergers: GeoClassifier® – A new way of automatically organising geoscience, subsurface and wells documentation

The GeoClassifier(R) Python algorithm, launched in December 2020 (for petroleum, mining and renewables), can automatically read the ‘body text’ of geoscience, subsurface & wells documentation (PDF, PPT, Word, Excel etc.) and:

  • Classify by document type
  • Classify by document category
  • Classify by chronostratigraphy
  • Classify by lithostratigraphy
  • Classify by well / borehole name
  • Classify by prospect / lead name
  • Classify by survey name
  • Classify by deposit / orebody name
  • Classify by reservoir / aquifer name
  • Classify by field name
  • Classify by block / license name
  • Classify by play name
  • Classify by basin / geobody name
  • Classify by area / region name
  • Classify by country and region
  • Categorise by discipline/topics
  • Extract dates, people’s names & company names (out-of-the-box models)
  • Classify to machine learnt topics (custom geoscience model)
  • Also extract all of the names above that occur in the document if required
  • Extract drilling & operation problems
  • Many more features

The resultant tags can be used to help organise records & document management and improve search & discovery of geoscience, subsurface and wells documents.

The GeoClassifier algorithm achieves this in a unique and novel way using several techniques.

– Knowledge Engineering (a taxonomy with thousands of clues for document types and categories)

– Machine Learning (an ensemble model trained on 250,000 labelled topic examples, plus custom spaCy NER models)

– Natural Language Processing (NLP) state-of-the-art geoscience name extraction
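The knowledge engineering idea above (a taxonomy of clues for document types) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the three document types, their clue phrases and the function name `classify_document_type` are all hypothetical, and simple clue counting is a stand-in rather than the GeoClassifier method itself.

```python
import re
from collections import Counter

# Hypothetical mini-taxonomy: document type -> clue phrases.
# The text above describes thousands of clues; these few are invented.
TAXONOMY = {
    "final well report": ["final well report", "end of well report", "total depth"],
    "core analysis": ["core analysis", "porosity", "permeability"],
    "seismic interpretation": ["seismic interpretation", "two-way time", "horizon"],
}

def classify_document_type(text):
    """Return (best_type, scores) by counting clue hits in the body text."""
    lowered = text.lower()
    scores = Counter()
    for doc_type, clues in TAXONOMY.items():
        for clue in clues:
            # Word boundaries stop 'horizon' from firing inside 'horizontal'.
            hits = re.findall(r"\b" + re.escape(clue) + r"\b", lowered)
            scores[doc_type] += len(hits)
    if not scores or max(scores.values()) == 0:
        return None, dict(scores)  # no clue matched at all
    best_type = max(scores, key=scores.get)
    return best_type, dict(scores)

# Invented sample text for illustration.
sample = ("End of Well Report. The well reached total depth at 3200 m. "
          "Core porosity was measured on two plugs.")
doc_type, scores = classify_document_type(sample)
print(doc_type, scores)
```

A clue count like this is easy to inspect and extend, which is one reason a taxonomy is often combined with machine learning rather than replaced by it.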

There are many limitations and problems when a taxonomy or thesaurus built for manual tagging of documents is applied automatically to text. Unlike traditional methods (and taxonomies), GeoClassifier(R) was built from the outset for automated, not manual, document tagging, supporting digital transformation.

The Python algorithm can be applied immediately to diverse documentation, from any geographical location without using prior lists of names. Lists of names (e.g. well names) can be added to improve detection.

The algorithm can run standalone against files on a filesystem, and/or a company can take parts of it and embed them in existing tooling that may be more closely integrated with SharePoint / EDMS and search systems.

The algorithm also uses an automatic document scoring system, based on a number of criteria, to identify the documents that tend to be ‘most important’ from a search and document & records retention point of view. This can aid file system migration projects, as well as acquisitions, divestments and mergers.
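The scoring criteria are not listed here, so the following is only a sketch of how a criteria-based document score could be assembled. Every criterion, weight and name (`DocumentFeatures`, `WEIGHTS`, `importance_score`) is a hypothetical assumption chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class DocumentFeatures:
    # All fields are hypothetical criteria, not the product's actual ones.
    is_final_version: bool
    tag_count: int       # number of geoscience tags found in the body text
    has_well_name: bool
    page_count: int

# Hypothetical weights; in practice these would need tuning and review.
WEIGHTS = {"final": 3.0, "tags": 0.5, "well": 2.0, "length": 1.0}

def importance_score(doc):
    """Combine the criteria into one weighted score (higher = more important)."""
    score = 0.0
    if doc.is_final_version:
        score += WEIGHTS["final"]
    # Cap the tag contribution so very long documents do not dominate.
    score += WEIGHTS["tags"] * min(doc.tag_count, 10)
    if doc.has_well_name:
        score += WEIGHTS["well"]
    if doc.page_count >= 20:
        score += WEIGHTS["length"]
    return score

# Invented example documents.
documents = {
    "final_well_report.pdf": DocumentFeatures(True, 12, True, 85),
    "meeting_room_booking.docx": DocumentFeatures(False, 0, False, 1),
}
ranked = sorted(documents, key=lambda name: importance_score(documents[name]),
                reverse=True)
print(ranked)
```

In a migration project, a ranking like this could be used to decide which files are moved first and which are candidates for archiving.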

More: contact@infosciencetechnologies.com

Patented next generation algorithms: The GeoClassifier(R) algorithm disrupts traditional document classification and extraction whilst OpportunityFinder(R) disrupts traditional business ideation processes, targeting associative extraction of petroleum, mining and renewables concepts and opportunities.

OpportunityFinder® v4.2: State-of-the-art geotagging for subsurface, geoscience and Earth science documents

NEW: OpportunityFinder® v4.2 has options to detect 30% more geographical/geobody entities within the body text of documents. These can support spatial and map-based search & discovery. Coverage ranges from wells/boreholes, leads, prospects & plays, through fields, deposits, localities, tracts, blocks & licenses, to mountains, foldbelts, seamounts and basins. Using state-of-the-art natural language processing and machine learning, documents (and domain evidence in documents) can be precisely geo-located on a map automatically. The algorithm works anywhere in the world without using prior lists of names. Recent findings suggest it can detect significantly more geotags than traditional approaches, inductively uncovering new data points.


#oilandgas #mining #renewables #geothermal #hydrogeology

Automatically detecting geo-resource evidence in reports

Looking to extract evidence for petroleum systems, metals & minerals, heat flow, fluid flow or aquifers & seals in reports or semi-structured databases? Or chronostratigraphy, lithostratigraphy, tectonics, depositional environment and lithology? The patented algorithms from Infoscience Technologies may give your organisation a fast start.


Need help searching for petroleum system elements for exploration?

Our algorithms combine over 75,000 different ways in which clues to potential hydrocarbon occurrence, source rock, maturation, migration, reservoir, trap and seal may be mentioned in documents, reports and logs.

Using traditional keyword search, explorers may miss up to 40% – 60% [1] of the relevant geoscience evidence buried in report collections.
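Why a single keyword misses evidence can be shown with a toy comparison. The sketch below assumes a handful of invented variant phrases and invented sentences; it is not the 75,000-entry lexicon, only an illustration of matching many phrasings of one concept.

```python
import re

# Hypothetical variant phrases for one petroleum system element.
SOURCE_ROCK_CLUES = [
    "source rock",
    "organic-rich shale",
    "oil-prone kerogen",
    "high toc",
]

# Invented sentences standing in for report text.
SENTENCES = [
    "The Jurassic source rock is mature in the basin centre.",
    "An organic-rich shale with high TOC was penetrated at 2450 m.",
    "Oil-prone kerogen dominates the lower interval.",
    "The sandstone reservoir shows good porosity.",
]

def keyword_hits(sentences, keyword):
    """Traditional search: keep sentences containing the exact keyword."""
    return [s for s in sentences if keyword.lower() in s.lower()]

def lexicon_hits(sentences, clues):
    """Lexicon search: keep sentences matching any variant phrase."""
    pattern = re.compile("|".join(re.escape(c) for c in clues), re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

keyword_only = keyword_hits(SENTENCES, "source rock")
with_lexicon = lexicon_hits(SENTENCES, SOURCE_ROCK_CLUES)
print(len(keyword_only), "of", len(with_lexicon),
      "relevant sentences found by keyword alone")
```

In this toy set, the keyword finds one of the three sentences that discuss source rock evidence; the other two use different wording for the same concept.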

Based on years of research, our algorithms combine the best from Knowledge Engineering, Natural Language Processing (NLP) and Machine Learning.

Our algorithms can plug into any existing search engine, fast-tracking digital transformation initiatives and bringing state-of-the-art intelligence to assist geoscientists with oil & gas data mining and search.

#digital #geoscience #subsurface #search #documents


[1] https://asistdl.onlinelibrary.wiley.com/doi/abs/10.1002/asi.23595

Using machine learning to detect mentions of drilling and operational problems in text.

Over 5,000 public domain sentences have been labelled to train a predictive machine learning model to detect wellbore drilling and operational ‘problems’ (including reservoir and production) in documents, reports and logs.

This can support alerts & monitoring, health & safety, search & discovery as well as analogues & learning by quickly extracting where problems have been or are being encountered in volumes of unstructured text that are too vast for a person to realistically read through.

The model generalises: it can surface types of problems that were not present in the original training set. For example, in the image shown, ‘swelling formation’ is detected in the first sentence as a potential problem. Whilst ‘swelling clays’ was labelled in the training set, the phrase ‘swelling formation’ was not. The model has inferred this based on statistical word context.
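Generalising from word context can be illustrated with a very small text classifier. The sketch below uses a bag-of-words Naive Bayes model written from scratch as a stand-in; it is not the neural network described in this post, and the six training sentences are invented rather than taken from the 5,000 labelled examples.

```python
import math
from collections import Counter

# Invented training sentences, labelled 1 (problem) or 0 (no problem).
TRAIN = [
    ("stuck pipe encountered while pulling out of hole", 1),
    ("swelling clays caused tight hole conditions", 1),
    ("lost circulation occurred in the fractured limestone", 1),
    ("the well was drilled to total depth as planned", 0),
    ("casing was run and cemented successfully", 0),
    ("logs were acquired over the reservoir interval", 0),
]

def train_naive_bayes(examples):
    """Count words per class; returns (word_counts, class_counts, vocab)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for sentence, label in examples:
        class_counts[label] += 1
        word_counts[label].update(sentence.lower().split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab

def predict(sentence, model):
    """Return the most probable label using add-one smoothing."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label in class_counts:
        logp = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in sentence.lower().split():
            if word in vocab:  # words never seen in training are ignored
                logp += math.log((word_counts[label][word] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

model = train_naive_bayes(TRAIN)
# 'swelling formation' never appears in TRAIN, but 'swelling' does.
print(predict("swelling formation slowed drilling progress", model))
```

Here the unseen phrase is flagged as a problem because the word ‘swelling’ only occurred in problem sentences during training. A neural model learns richer context than this, but the principle of inferring from word statistics is similar.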

The techniques are also useful for scanned content with OCR errors. Note that ‘stuck pipe’ and ‘swelling clays’ are detected even though they contain spelling errors introduced by the OCR process, which is not uncommon.
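Tolerance to OCR misspellings can be approximated, in a much simpler way than a trained model, with fuzzy string matching. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in technique; the phrase list, the 0.8 threshold, the function name `fuzzy_find` and the sample OCR text are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical list of problem phrases to look for.
PROBLEM_PHRASES = ["stuck pipe", "swelling clays", "lost circulation"]

def fuzzy_find(text, phrases, threshold=0.8):
    """Return (phrase, matched_text) pairs whose similarity >= threshold."""
    words = text.lower().split()
    found = []
    for phrase in phrases:
        n = len(phrase.split())
        # Slide a window of n words across the text and compare each window.
        for i in range(len(words) - n + 1):
            window = " ".join(words[i:i + n])
            ratio = SequenceMatcher(None, phrase, window).ratio()
            if ratio >= threshold:
                found.append((phrase, window))
                break  # first match per phrase is enough for this sketch
    return found

# Invented OCR output: 'c' read as 'e', and 'l' read as capital 'I'.
ocr_text = "Drilling halted due to stuek pipe and sweIling cIays at 1820 m"
matches = fuzzy_find(ocr_text, PROBLEM_PHRASES)
print(matches)
```

Fuzzy matching only works for phrases already on a list, whereas a trained model can also flag phrases it has not seen before.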

This model adds to the existing CNN-derived ML models in GeoClassifier(R) for detecting well names and subsurface topics. These models are fully integrated with the OpportunityFinder(R) algorithm for the energy transition, mining and geohealth sectors.

More at: http://www.infosciencetechnologies.com

#digitaltransformation #machinelearning #naturallanguageprocessing #georesources

Using machine learning to detect drilling, reservoir and production problems in unstructured text

The GeoClassifier® algorithm can detect operational problems in reports, documents, logs and other forms of unstructured text. Machine learning (neural networks) is used for prediction, complementing the existing machine learning model in GeoClassifier® which detects well / borehole names without using a prior list of names. These can be used to support oil & gas, mining, renewables, carbon capture & storage (CCS), geothermal and hydrogeology sectors.

Discover Subsurface and Geoscience Knowledge not Documents.

Find and discover geoscience knowledge, not documents. This is an example of how organisations are exploiting the output from OpportunityFinder(R), generated by applying the algorithm to their unstructured text (PDF, PPT, Word, Excel, XML/JSON, image files etc.) on file shares and document management systems.

This company has used Microsoft Power BI on top of the CSV created by OpportunityFinder. In just a couple of weeks, it configured an application that allows geoscientists to discover petroleum, chemical element, mineral system and geo-resource associations in time and space, as well as to cast a wide net for analogue discovery.
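The kind of association counting that a dashboard performs over such a CSV can be sketched with the Python standard library. The column names (`document`, `basin`, `concept`) and the rows below are hypothetical, since the actual CSV layout is not described here.

```python
import csv
import io
from collections import Counter

# Hypothetical CSV layout and invented rows, for illustration only.
CSV_TEXT = """document,basin,concept
report_01.pdf,Basin A,source rock
report_02.pdf,Basin A,source rock
report_02.pdf,Basin A,seal
report_03.pdf,Basin B,reservoir
"""

def count_associations(csv_text):
    """Count how often each (basin, concept) pair occurs across rows."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[(row["basin"], row["concept"])] += 1
    return counts

associations = count_associations(CSV_TEXT)
# List the strongest associations first.
for (basin, concept), n in associations.most_common():
    print(basin, concept, n)
```

A table of counts like this is what a BI tool would typically chart or map, letting geoscientists filter by place, time or concept.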

This allows the geoscientists to uncover knowledge that is typically buried so far down traditional document search result lists that they never normally read it. The algorithm takes this a step further, using a lexicon of 75,000 terms, machine learning and patented methods to ‘join the dots’ and suggest patterns for potential missed leads, ideas and exploration. OpportunityFinder(R) is used internationally by organisations in the oil and gas, metals and mining, geothermal and hydrogen sectors to support the energy transition.

#geoscience #digitaltransformation #naturallanguageprocessing #Python #datascience #oilandgas #mining