4 Reasons Not to Fear Machine Learning

By Eleanor Oldham, Senior Geophysicist

This time last year, if I’d read the words “machine learning” in an article I would have skipped on past, assuming this was a topic for somebody else to think about. However, a year is a long time (particularly if that year is 2020) and I now eagerly hoover up as many machine learning, artificial intelligence and neural network articles as I can find on a geoscience theme. Why? Because a couple of online courses taught me that machine learning isn’t something which needs to be locked away, only to be accessed by a chosen few. Machine learning is something we can all get involved in to the benefit of our geoscience projects. Here are 4 reasons why you should take the plunge:

1. It’s actually very easy

A mistaken belief that machine learning was fantastically complicated is what scared me away from the subject for so long. However, I now believe that almost any scientist already has the skills to take the leap into the world of machine learning – even you!

As the name suggests, machine learning uses machines to learn patterns from data. In the “good old days”, if you wanted to automate a process you would have to manually write a set of rules to map every combination of inputs to the right output. These rules would basically look like a long list of “if this then that” statements. With machine learning, you simply feed the computer a large set of training data (inputs with known outputs) and allow the algorithm to create its own set of rules for how to predict the outputs from the inputs.
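To make the contrast concrete, here is a minimal sketch of the two approaches. It assumes a toy problem that is not from any real project: separating sand from shale using a single gamma-ray reading. The data values and the function names (`hand_written_rule`, `learn_threshold`, `learned_rule`) are all made up for illustration. The first function is a rule a person typed in; the second works out its own cut-off from labelled examples.

```python
# Hypothetical training data: gamma-ray readings with known labels.
training = [
    (30, "sand"), (45, "sand"), (55, "sand"), (60, "sand"),
    (90, "shale"), (105, "shale"), (120, "shale"), (140, "shale"),
]

def hand_written_rule(gr):
    """The 'good old days': a person chooses the cut-off by hand."""
    return "sand" if gr < 75 else "shale"

def learn_threshold(samples):
    """Try the midpoint between each pair of neighbouring readings and keep
    the cut-off that misclassifies the fewest training samples."""
    values = sorted(gr for gr, _ in samples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cut):
        return sum(("sand" if gr < cut else "shale") != label for gr, label in samples)

    return min(candidates, key=errors)

threshold = learn_threshold(training)

def learned_rule(gr):
    """Same form as the hand-written rule, but the cut-off came from the data."""
    return "sand" if gr < threshold else "shale"

print(threshold)                            # 75.0
print(learned_rule(50), learned_rule(110))  # sand shale
```

Real algorithms search over far richer sets of rules than a single cut-off, but the principle is the same: the rule comes from the examples rather than from the programmer.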

2. It will save you time and money

What’s 2765 ÷ 35?
Did it take you longer than a second to work that out?
Was it fun?

I don’t think you’ll find it hard to believe that machine learning algorithms are faster than asking a human to manually write out a set of rules. This means it’s quick and easy to test out lots of different machine learning models to find what works best for your situation. Bonus – we can spend our valuable time doing much more interesting things, like QC-ing the models.

In case it was annoying you, the answer was 79!

3. It’s really accurate

As a rule, the more representative training data you use, the better the algorithms can learn the required patterns and the more accurate the results become. Machines now rival, and on some benchmarks beat, people at labelling animal photos, so you can bet they’re pretty good at solving geoscience problems too.

4. You might be doing it already without even realising

There are three basic tasks which you can achieve through machine learning and I wouldn’t be surprised if you do all of them already (although perhaps in a more manual fashion or by getting some software to do it for you).

Representation Learning (Clustering)

  • You have an unlabelled set of data and want to look for structure.
  • Any structures identified could be used as labels for a future classification task.
  • Example: Grouping different types of reservoir facies ahead of geological modelling. Note that the clusters generated will not look like traditional, petrophysical facies but you might uncover some unexpected patterns to consider.
Clustering: Input data are assigned to three classes (blue, green and orange)
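As a rough illustration of clustering, the sketch below runs a small k-means routine on nine made-up points with two features each. K-means is just one common clustering algorithm, chosen here as an example. The data, the choice of three clusters and the simple deterministic starting centroids are all assumptions made so the example is repeatable, and only the Python standard library is used.

```python
import math

# Hypothetical, unlabelled samples described by two features (x1, x2).
points = [
    (1.0, 1.2), (0.8, 1.0), (1.2, 0.9),
    (5.0, 5.1), (5.2, 4.8), (4.9, 5.3),
    (9.0, 1.0), (9.2, 1.3), (8.8, 0.9),
]

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iterations=10):
    # Deterministic start (an assumption made for repeatability): take the first
    # point, then keep adding the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iterations):
        # Assign every point to its nearest centroid...
        labels = [nearest(p, centroids) for p in points]
        # ...then move each centroid to the mean of the points assigned to it.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # leave an empty cluster's centroid where it is
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return [nearest(p, centroids) for p in points], centroids

labels, centroids = kmeans(points, k=3)
colours = ["blue", "green", "orange"]
for point, label in zip(points, labels):
    print(point, colours[label])
```

Note that nobody told the algorithm what the three groups mean. It only found that the points fall naturally into three clumps, and interpreting those clumps is still a job for the geoscientist.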

Classification

  • You have a labelled set of training data.
  • When new data arrives you want to be able to predict the correct label.
  • Example: Creating a facies classification volume from seismic inversion cubes (like lambda-rho & mu-rho). Be aware that machine learning processes involving seismic are usually undersampled with respect to training data (i.e. wells) and this will impact the accuracy of the results.
Classification: Two features of the input data (x1 and x2) are used to assign each point to one of the pre-defined classes
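A classifier can be sketched in a few lines too. The example below uses a k-nearest-neighbours vote, which is one simple classification method among many. The training points, the "sand" and "shale" labels and the `predict` function are hypothetical, with two generic features standing in for whatever attributes you have available.

```python
import math
from collections import Counter

# Hypothetical labelled training data: (x1, x2) features with a known class.
training = [
    ((1.0, 1.0), "sand"), ((1.5, 0.8), "sand"), ((0.9, 1.4), "sand"),
    ((4.0, 4.2), "shale"), ((4.5, 3.8), "shale"), ((3.9, 4.6), "shale"),
]

def predict(point, k=3):
    """Label a new point by majority vote among its k closest training points."""
    closest = sorted(training, key=lambda item: math.dist(point, item[0]))[:k]
    votes = Counter(label for _, label in closest)
    return votes.most_common(1)[0][0]

print(predict((1.2, 1.1)))  # sand
print(predict((4.2, 4.0)))  # shale
```

With only six training points the predictions are only as good as those six examples, which is the undersampling issue noted in the seismic example above.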

Regression

  • Your training data is a set of input data paired with output values.
  • When new data comes in you want to be able to predict the correct value from the feature(s) of the input data.
  • Example: Predicting density log values using GR, neutron and sonic logs with depth as an additional constraint.
Regression: One feature of the input data (x1) is used to predict an output value (y)
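For regression, the simplest possible sketch is a straight-line fit through one feature using ordinary least squares. The numbers below are invented, and a real log-prediction workflow would use several input logs and probably a more flexible model, so treat this as an illustration of the idea only.

```python
# Hypothetical training data: one input feature (x1) paired with a known output (y).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

def fit_line(xs, ys):
    """Ordinary least squares for a single feature: y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(x1, y)

def predict_value(x):
    """Predict an output value for a new input using the fitted line."""
    return slope * x + intercept

print(f"slope={slope:.2f} intercept={intercept:.2f}")  # slope=1.97 intercept=0.09
print(f"prediction at x1=6: {predict_value(6.0):.2f}")  # prediction at x1=6: 11.91
```

The fitted slope and intercept play the same role as the learned rules in the classification case: they are worked out from the training pairs, then applied to inputs the model has not seen before.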

I hope I’ve been able to allay some of the fear surrounding machine learning. I passionately believe it’s something all geoscientists can reap the benefits of. At Merlin, our Keywise service uses machine learning to extract keywords from files in any data structure, turning unstructured data archives into searchable databases. Please get in touch to find out how you can use Keywise to solve your data frustrations, or for more information on geoscience machine learning.