Blue Sky Science: What is machine learning?

Austin Sandoval

20

What is machine learning?

Machine learning is an area of computer science that focuses on building computer programs that help machines learn by example. This is similar to the way young children learn about the world around them.

Children can learn to distinguish between apples and oranges when they are shown many examples. Soon they’ll recognize certain features, like the color and texture of apples versus oranges.

A machine can learn to tell apples and oranges apart in much the same way. It is given a set of images of different apples and oranges, and along with each image, it’s told whether the image shows an apple or an orange.

The machine automatically extracts features from these images, like the color, shape and texture. After looking at how these features are associated with being an apple or an orange, it can predict on its own whether a new image shows an apple or an orange, with no assistance from a person.
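To make the apple-and-orange example concrete, here is a minimal Python sketch of learning from labeled examples. It rests on several assumptions: the features (a "redness" score and a "roughness" score) are taken as already extracted from each image, all the numbers are invented for illustration, and the learning method is a simple nearest-centroid rule chosen for brevity. Real systems typically use more sophisticated methods.

```python
# Toy labeled examples: (redness, roughness) -> label.
# All numbers are invented for illustration, not measured from real images.
training = [
    ((0.9, 0.2), "apple"),
    ((0.8, 0.3), "apple"),
    ((0.7, 0.1), "apple"),
    ((0.3, 0.8), "orange"),
    ((0.2, 0.9), "orange"),
    ((0.4, 0.7), "orange"),
]


def fit_centroids(examples):
    """Average the feature vectors of each label (a nearest-centroid model)."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, features):
    """Return the label whose average example is closest to the new features."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


centroids = fit_centroids(training)
print(predict(centroids, (0.85, 0.25)))  # a red, smooth fruit
print(predict(centroids, (0.25, 0.85)))  # a less red, rough fruit
```

The program is never told a rule such as "red means apple." It only sees labeled examples, summarizes them, and compares a new fruit against those summaries.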

At the NIH Center for Predictive Computational Phenotyping, we’re using machine learning to understand diseases like cancer. Many diseases are related to abnormal levels of gene activity. The gene activity levels are features that the machine can use to distinguish between healthy and diseased cases.

Because there are thousands of genes, it’s difficult for a person, even an expert scientist, to figure out which genes are really most important for a disease.

Machines, on the other hand, can look at thousands of genes at a time with no trouble. We can use machine-learning techniques to help scientists identify the most important genes in disease processes.
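The idea of sorting through many genes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names are hypothetical, the activity levels are invented, and the scoring rule (the difference in average activity between healthy and diseased samples) is a simple stand-in, not the method used at the Center.

```python
from statistics import mean

# Invented activity levels for hypothetical genes; one value per sample.
healthy = {
    "GENE_A": [1.0, 1.1, 0.9],
    "GENE_B": [2.0, 2.1, 1.9],
    "GENE_C": [0.5, 0.6, 0.4],
}
diseased = {
    "GENE_A": [1.0, 0.9, 1.1],
    "GENE_B": [4.0, 4.2, 3.8],
    "GENE_C": [0.9, 1.0, 0.8],
}


def rank_genes(healthy, diseased):
    """Order genes by how much their average activity differs between groups."""
    scores = {
        gene: abs(mean(diseased[gene]) - mean(healthy[gene]))
        for gene in healthy
    }
    return sorted(scores, key=scores.get, reverse=True)


ranking = rank_genes(healthy, diseased)
print(ranking)  # genes with the largest healthy-vs-diseased gap come first
```

With three genes a person could do this by eye. The same loop works unchanged on thousands of genes, which is where a machine has the advantage.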