Anglo American signs global license for Infoscience Technologies’ Natural Language Processing (NLP) Algorithms

We are pleased to announce that Anglo American, one of the world’s largest mining companies, has signed a global software license agreement to deploy Infoscience Technologies’ Natural Language Processing (NLP) algorithms.

Supporting its FutureSmart Mining™ Digitalisation initiative in Discovery (Exploration), Anglo American will apply these unique disruptive algorithms to its vast internal collection of documents to assist in the detection of new geological mineral deposits.

Dr Paul Cleverley, Infoscience Technologies Founder and Director, commented, “Finding new deposits of critical minerals is essential for the transition to a lower carbon world. This agreement confirms Infoscience is a world leader in exploiting unstructured data in the subsurface and geo-resource sector”.

About Anglo American

Anglo American is a global mining company with a commodity portfolio that includes platinum group metals, diamonds, copper, iron ore, polyhalite, nickel, manganese and metallurgical coal for steel making. Anglo American employs over 100,000 people worldwide in 15 countries, with annual revenues in excess of $40 billion.

About Infoscience Technologies Ltd
Infoscience is a UK tech start-up founded in 2018, pioneering Artificial Intelligence (AI) algorithms for the geo-resource sector (oil & gas, mining and renewables). These patented algorithms detect hidden opportunities from geoscience and commercial clues in unstructured text. Customers include some of the world’s largest companies.

#miningindustry #geology #artificialintelligence

Disruptive algorithms for geo-resources – oil and gas, metals and mining unstructured text

Unlock the value in your oil & gas, mining, subsurface and geoscience documents. Disrupt existing business workflows. Automatically classify documents, extract data and names, and find problems and opportunities, assisting the subsurface professional and geoscientist.

Save time searching for information, reduce the risk of missing key information and increase the chances of ideation & discovering new knowledge.

Patented state-of-the-art Python algorithms using the industry’s largest subsurface clue taxonomy/lexicons.



From the pioneer in subsurface & geoscience unstructured text analytics.

OpportunityFinder® Customer survey

A recent survey was undertaken of a large organisation that has been using the Natural Language Processing / Machine Learning Python algorithm OpportunityFinder for the past 12 months.

They have been applying the algorithm to millions of documents to extract knowledge and ideas.

Compared to their existing traditional search engines they estimated:

1. The algorithm has reduced the time they spend searching for technical information by over 50%.

2. The algorithm has increased the chance they will discover new knowledge they would not have found otherwise (using traditional search engines) by 75%.

File system migration to a document management system, support for acquisitions and mergers: GeoClassifier® – A new way of automatically organising geoscience, subsurface and wells documentation

The GeoClassifier(R) Python algorithm, launched in December 2020 (for petroleum, mining and renewables), can automatically read the ‘body text’ of geoscience, subsurface & wells documentation (PDF, PPT, Word, Excel etc.) and:

  • Classify by document type
  • Classify by document category
  • Classify by chronostratigraphy
  • Classify by lithostratigraphy
  • Classify by well / borehole name
  • Classify by prospect / lead name
  • Classify by survey name
  • Classify by deposit / orebody name
  • Classify by reservoir / aquifer name
  • Classify by field name
  • Classify by block / license name
  • Classify by play name
  • Classify by basin / geobody name
  • Classify by area / region name
  • Classify by country and region
  • Categorise by discipline/topics
  • Extract dates, people’s names & company names (ootb models)
  • Classify to machine learnt topics (custom geoscience model)
  • Also extract all of the names above that occur in the document if required
  • Extract drilling & operation problems
  • Many more features

The resultant tags can be used to help organise records & document management and improve search & discovery of geoscience, subsurface and wells documents.

The GeoClassifier algorithm achieves this in a unique and novel way using several techniques.

– Knowledge Engineering (a taxonomy with thousands of clues for document types and categories)

– Machine Learning (250,000 labelled topic examples in an ensemble model, plus custom spaCy NER models)

– Natural Language Processing (NLP) state-of-the-art geoscience name extraction
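The Knowledge Engineering technique listed above can be illustrated with a minimal sketch: count how often clue phrases from a taxonomy occur in a document's body text and pick the document type with the most hits. This is not the GeoClassifier implementation; the taxonomy entries, the `classify_document_type` function and the simple counting rule are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical mini-taxonomy: document type -> clue phrases (illustrative only;
# the source describes a taxonomy with thousands of clues).
TAXONOMY = {
    "final well report": ["final well report", "end of well report", "well completion report"],
    "core analysis": ["core analysis", "porosity and permeability", "core plug"],
    "seismic interpretation": ["seismic interpretation", "two-way time", "horizon picks"],
}


def classify_document_type(text, taxonomy=TAXONOMY):
    """Return (best_label, clue_counts) by counting clue phrase hits in the text."""
    lowered = text.lower()
    counts = Counter()
    for label, clues in taxonomy.items():
        for clue in clues:
            # Word boundaries stop a clue matching inside a longer word.
            pattern = r"\b" + re.escape(clue) + r"\b"
            counts[label] += len(re.findall(pattern, lowered))
    if max(counts.values()) == 0:
        return None, dict(counts)  # no clue found: leave the document unclassified
    return max(counts, key=counts.get), dict(counts)
```

A real system would presumably weight clues and combine them with the machine learning and NLP components; this sketch only shows the clue-counting idea.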

There are many limitations and problems when taking a taxonomy or thesaurus built for manual tagging of documents and then trying to apply it automatically to text. Unlike traditional methods (and taxonomies), GeoClassifier(R) was built from the ‘get-go’ for automated, not manual, document tagging, supporting digital transformation.

The Python algorithm can be applied immediately to diverse documentation, from any geographical location without using prior lists of names. Lists of names (e.g. well names) can be added to improve detection.

The algorithm can run standalone against files on a filesystem, and/or a company can take parts of it and embed them in existing tooling that may be more integrated with SharePoint / EDMS and search systems.

The algorithm also uses an automatic document scoring system based on a number of criteria to identify those documents that will have tendencies to be ‘most important’ from a search and document & records retention point of view. This can aid file system migration projects, as well as acquisition, divestment and mergers.
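The scoring criteria are not listed here, so the following is only a sketch of how a criteria-based importance score might be combined and used to rank documents during a migration. The `DocFeatures` fields, the weights in `TYPE_WEIGHTS` and the duplicate penalty are hypothetical, not the criteria GeoClassifier uses.

```python
from dataclasses import dataclass


@dataclass
class DocFeatures:
    path: str
    doc_type: str        # e.g. the document type tag assigned by a classifier
    n_entity_tags: int   # count of well, field, basin names etc. found in the text
    is_duplicate: bool   # True if an identical copy exists elsewhere


# Hypothetical weights per document type; unknown types default to 1.0.
TYPE_WEIGHTS = {"final well report": 3.0, "presentation": 1.0, "email": 0.5}


def importance_score(doc, type_weights=TYPE_WEIGHTS):
    """Combine several criteria into one score (higher = more important)."""
    score = type_weights.get(doc.doc_type, 1.0)
    # Reward tag-rich documents, capped so very long files do not dominate.
    score += min(doc.n_entity_tags, 10) * 0.2
    if doc.is_duplicate:
        score *= 0.5  # penalise redundant copies
    return round(score, 2)


def rank_documents(docs):
    """Sort documents from most to least important."""
    return sorted(docs, key=importance_score, reverse=True)
```

In a migration project, a ranking like this could be used to decide which files to move or review first.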


Patented next generation algorithms: The GeoClassifier(R) algorithm disrupts traditional document classification and extraction whilst OpportunityFinder(R) disrupts traditional business ideation processes, targeting associative extraction of petroleum, mining and renewables concepts and opportunities.

OpportunityFinder® v4.2: State-of-the-art geotagging for subsurface, geoscience and Earth science documents

NEW: OpportunityFinder® v4.2 has options to detect 30% more geographical/geobody entities within the body text of documents. These can support spatial and map-based search & discovery. Coverage ranges from wells/boreholes, leads, prospects & plays, through fields, deposits, localities, tracts, blocks & licenses, to mountains, foldbelts, seamounts and basins. Using state-of-the-art natural language processing and machine learning, documents (and domain evidence in documents) can be precisely geo-located on a map automatically. The algorithm works anywhere in the world without using prior lists of names. Recent findings suggest it can detect significantly more geotags than traditional approaches, inductively uncovering new data points.
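One simple way to find geographical names without a prior list of names is to look for capitalised words followed by a feature-type cue such as "Basin" or "Foldbelt". The sketch below assumes that approach; it is a plain regular expression, not the NLP and machine learning used in OpportunityFinder, and `FEATURE_TYPES`, `LEADING_STOPWORDS` and `extract_geotags` are illustrative names. It extracts names only and does not assign map coordinates.

```python
import re

# Hypothetical feature-type cues; a real list would be far longer.
FEATURE_TYPES = ("Basin", "Field", "Foldbelt", "Seamount", "Block", "Prospect")
# Capitalised sentence openers that should not be treated as part of a name.
LEADING_STOPWORDS = {"The", "In", "At"}

# One to three capitalised words followed by a feature-type cue.
_PATTERN = re.compile(
    r"\b((?:[A-Z][a-z]+\s){1,3})(" + "|".join(FEATURE_TYPES) + r")\b"
)


def extract_geotags(text):
    """Return a list of (name, feature_type) tuples found in the text."""
    tags = []
    for match in _PATTERN.finditer(text):
        words = match.group(1).split()
        while words and words[0] in LEADING_STOPWORDS:
            words.pop(0)
        if not words:
            continue  # only a stopword preceded the cue, so there is no name
        feature = match.group(2)
        tags.append((" ".join(words) + " " + feature, feature.lower()))
    return tags
```

For example, `extract_geotags("The Santos Basin is offshore.")` returns `[("Santos Basin", "basin")]`, with the sentence-initial "The" dropped.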

#oilandgas #mining #renewables #geothermal #hydrogeology

Automatically detecting geo-resource evidence in reports

Looking to extract evidence for petroleum systems, metals & minerals, heat flow, fluid flow or aquifers & seals in reports or semi-structured databases? Or chronostratigraphy, lithostratigraphy, tectonics, depositional environment and lithology? The patented algorithms from Infoscience Technologies may give your organisation a fast start.

Need help searching for petroleum system elements for exploration?

Our algorithms cover over 75,000 different ways in which potential hydrocarbon occurrence, source rock, maturation, migration, reservoir, trap and seal clues may be mentioned in documents, reports and logs.

Using traditional keyword search, explorers may miss 40% to 60% [1] of the relevant geoscience evidence buried in report collections.
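To illustrate why a single keyword can miss evidence that is phrased differently, here is a minimal sketch comparing a literal keyword search with a small clue lexicon. The lexicon entries and the function names `find_clues` and `keyword_hits` are hypothetical; the real lexicons described above are far larger.

```python
import re

# Hypothetical, tiny clue lexicon: petroleum system element -> phrasings.
CLUE_LEXICON = {
    "source rock": ["source rock", "organic-rich shale", "oil-prone kerogen", "high toc"],
    "seal": ["seal", "cap rock", "caprock", "top seal"],
    "hydrocarbon occurrence": ["oil show", "gas show", "oil stain", "fluorescence"],
}


def keyword_hits(sentences, keyword):
    """Indices of sentences containing the literal keyword (case-insensitive)."""
    return [i for i, s in enumerate(sentences) if keyword.lower() in s.lower()]


def find_clues(sentences, lexicon=CLUE_LEXICON):
    """Map each element to the indices of sentences mentioning any of its phrasings."""
    hits = {element: [] for element in lexicon}
    for i, sentence in enumerate(sentences):
        lowered = sentence.lower()
        for element, phrasings in lexicon.items():
            if any(re.search(r"\b" + re.escape(p) + r"\b", lowered) for p in phrasings):
                hits[element].append(i)
    return hits
```

Given the sentences "Organic-rich shale with high TOC was logged at 2,450 m." and "Source rock maturity is discussed in section 4.", `keyword_hits` for "source rock" finds only the second, while `find_clues` attributes both to the source rock element.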

Based on years of research, our algorithms combine the best from Knowledge Engineering, Natural Language Processing (NLP) and Machine Learning.

Our algorithms can plug into any existing search engine, fast-tracking digital transformation initiatives and bringing state-of-the-art intelligence to assist geoscientists with oil & gas data mining and search.

#digital #geoscience #subsurface #search #documents


Using machine learning to detect mentions of drilling and operational problems in text.

Over 5,000 public domain sentences have been labelled to train a predictive machine learning model to detect wellbore drilling and operational ‘problems’ (including reservoir and production) in documents, reports and logs.

This can support alerts & monitoring, health & safety, search & discovery as well as analogues & learning by quickly extracting where problems have been or are being encountered, in volumes of unstructured text which are too vast for a person to ever realistically read through.

The model generalises and is capable of surfacing types of problems that were not even present in the original training set. For example, in the image shown, ‘swelling formation’ is detected in the first sentence as a potential problem. Whilst ‘swelling clays’ was labelled in the training set, the phrase ‘swelling formation’ was not. The model has inferred this based on statistical word context.
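The kind of generalisation described above can be sketched with a toy bag-of-words naive Bayes classifier: because ‘swelling’ only appears in sentences labelled as problems, an unseen phrase such as ‘swelling formation’ still scores as a problem. This swaps in a much simpler technique than the production model, and the six training sentences and the `train`/`predict` functions are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical training sentences (the source describes over 5,000 labelled ones).
TRAINING = [
    ("swelling clays caused tight hole", "problem"),
    ("stuck pipe while pulling out of hole", "problem"),
    ("lost circulation in fractured limestone", "problem"),
    ("drilled ahead to section total depth", "no_problem"),
    ("ran casing and cemented as planned", "no_problem"),
    ("stratigraphic tops were picked from logs", "no_problem"),
]


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Collect per-label word counts, label counts and the vocabulary."""
    word_counts, label_counts, vocab = {}, Counter(), set()
    for text, label in examples:
        tokens = tokenize(text)
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(text, model):
    """Return the most probable label using add-one (Laplace) smoothing."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for token in tokenize(text):
            # Unseen words get a small non-zero probability via smoothing.
            score += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)
```

With this toy data, `predict("swelling formation", train(TRAINING))` returns `"problem"` even though that exact phrase never appears in the training sentences.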

The techniques are also useful for scanned content with OCR errors. Note that ‘stuck pipe’ and ‘swelling clays’ are detected even though they have spelling errors introduced by the OCR process, which is not uncommon!
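Tolerance to OCR spelling errors can be approximated with plain string similarity. The sketch below uses `difflib.SequenceMatcher` from the Python standard library to match known problem phrases against sliding word windows. It is a swapped-in technique rather than the machine learning model described above, and the phrase list, the 0.8 threshold and the `fuzzy_find` function are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical list of known problem phrases.
PROBLEM_PHRASES = ["stuck pipe", "swelling clays", "lost circulation"]


def fuzzy_find(text, phrases=PROBLEM_PHRASES, threshold=0.8):
    """Return (phrase, matched_text) pairs whose similarity meets the threshold."""
    tokens = text.lower().split()
    found = []
    for phrase in phrases:
        n = len(phrase.split())
        # Slide a window with the same number of words as the phrase.
        for i in range(len(tokens) - n + 1):
            window = " ".join(tokens[i:i + n])
            if SequenceMatcher(None, window, phrase).ratio() >= threshold:
                found.append((phrase, window))
                break  # record the first match per phrase only
    return found
```

A lower threshold catches more heavily damaged text at the cost of more false matches, so the value would need tuning on real scanned documents.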

This model adds to the existing CNN-derived ML models in GeoClassifier(R) for detecting well names and subsurface topics. These models are fully integrated with the OpportunityFinder(R) algorithm for the energy transition, mining and geohealth sectors.


#digitaltransformation #machinelearning #naturallanguageprocessing #georesources