
Why Machines Learn

The Elegant Math Behind Modern AI

Read by Rene Ruiz
Audiobook Download
On sale Jul 16, 2024 | 13 Hours and 31 Minutes | 9780593786956
A rich, narrative explanation of the mathematics that has brought us machine learning and the ongoing explosion of artificial intelligence

Machine learning systems are making life-altering decisions for us: approving mortgage loans, determining whether a tumor is cancerous, or deciding whether someone gets bail. They now influence developments and discoveries in chemistry, biology, and physics—the study of genomes, extra-solar planets, even the intricacies of quantum systems. And all this before large language models such as ChatGPT came on the scene.

We are living through a revolution in machine learning-powered AI that shows no signs of slowing down. This technology is based on relatively simple mathematical ideas, some of which go back centuries, including linear algebra and calculus, the stuff of seventeenth- and eighteenth-century mathematics. It took the birth and advancement of computer science and the kindling of 1990s computer chips designed for video games to ignite the explosion of AI that we see today. In this enlightening book, Anil Ananthaswamy explains the fundamental math behind machine learning, while suggesting intriguing links between artificial and natural intelligence. Might the same math underpin them both?

As Ananthaswamy resonantly concludes, to make safe and effective use of artificial intelligence, we need to understand its profound capabilities and limitations, the clues to which lie in the math that makes machine learning possible.

*This audiobook contains a PDF of equations, graphs, and illustrations.
Chapter 1
 
Desperately Seeking Patterns
 
When he was a child, the Austrian scientist Konrad Lorenz, enamored by tales from a book called The Wonderful Adventures of Nils (the story of a boy's adventures with wild geese, written by the Swedish novelist and winner of the Nobel Prize for Literature, Selma Lagerlöf), "yearned to become a wild goose." Unable to indulge his fantasy, the young Lorenz settled for taking care of a day-old duckling his neighbor gave him. To the boy's delight, the duckling began following him around: It had imprinted on him. "Imprinting" refers to the ability of many animals, including baby ducks and geese (goslings), to form bonds with the first moving thing they see upon hatching. Lorenz would go on to become an ethologist and would pioneer studies in the field of animal behavior, particularly imprinting. (He got ducklings to imprint on him; they followed him around as he walked, ran, swam, and even paddled away in a canoe.) He won the Nobel Prize for Physiology or Medicine in 1973, jointly with fellow ethologists Karl von Frisch and Nikolaas Tinbergen. The three were celebrated "for their discoveries concerning organization and elicitation of individual and social behavior patterns."
 
Patterns. While the ethologists were discerning them in the behavior of animals, the animals were detecting patterns of their own. Newly hatched ducklings must have the ability to make out or tell apart the properties of things they see moving around them. It turns out that ducklings can imprint not just on the first living creature they see moving, but on inanimate things as well. Mallard ducklings, for example, can imprint on a pair of moving objects that are similar in shape or color. Specifically, they imprint on the relational concept embodied by the objects. So, if upon birth the ducklings see two moving red objects, they will later follow two objects of the same color (even if those latter objects are blue, not red), but not two objects of different colors. In this case, the ducklings imprint on the idea of similarity. They also show the ability to discern dissimilarity. If the first moving objects the ducklings see are, for example, a cube and a rectangular prism, they will recognize that the objects have different shapes and will later follow two objects that are different in shape (a pyramid and a cone, for example), but they will ignore two objects that have the same shape.
 
Ponder this for a moment. Newborn ducklings, with the briefest of exposure to sensory stimuli, detect patterns in what they see, form abstract notions of similarity/dissimilarity, and then will recognize those abstractions in stimuli they see later and act upon them. Artificial intelligence researchers would give an arm and a leg to know just how the ducklings pull this off.
 
While today's AI is far from being able to perform such tasks with the ease and efficiency of ducklings, it does have something in common with the ducklings, and that's the ability to pick out and learn about patterns in data. When Frank Rosenblatt invented the perceptron in the late 1950s, one reason it made such a splash was that it was the first formidable "brain-inspired" algorithm that could learn about patterns in data simply by examining the data. Most important, given certain assumptions about the data, researchers proved that Rosenblatt's perceptron will always find the pattern hidden in the data in a finite amount of time; or, put differently, the perceptron will converge upon a solution without fail. Such certainties in computing are like gold dust. No wonder the perceptron learning algorithm created such a fuss.
 
But what do these terms mean? What are “patterns” in data? What does “learning about these patterns” imply?
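
For readers who want a concrete glimpse of the idea the excerpt alludes to, the perceptron learning rule can be sketched in a few lines of Python. The data and names below are illustrative only, not from the book:

```python
def train_perceptron(data, epochs=100):
    """Rosenblatt-style perceptron rule: nudge the weights toward each
    misclassified point until every point is classified correctly."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), label in data:              # label is +1 or -1
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
            if pred != label:                     # wrong: move toward the point
                w[0] += label * x1
                w[1] += label * x2
                b += label
                errors += 1
        if errors == 0:                           # converged: pattern found
            break
    return w, b

# Toy, linearly separable data: points above the line x2 = x1 are labeled +1.
data = [((0, 1), 1), ((1, 2), 1), ((1, 0), -1), ((2, 1), -1)]
w, b = train_perceptron(data)
```

The finite-time convergence guarantee the excerpt mentions holds only under the "certain assumptions" it names, chiefly that the two classes can be separated by a straight line (or hyperplane), as in this toy data.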
A Next Big Idea Club Must-Read Title for July

“Some books about the development of neural networks describe the underlying mathematics while others describe the social history. This book presents the mathematics in the context of the social history. It is a masterpiece. The author is very good at explaining the mathematics in a way that makes it available to people with only a rudimentary knowledge of the field, but he is also a very good writer who brings the social history to life.”
Geoffrey Hinton, deep learning pioneer, Turing Award winner, former VP at Google, and Professor Emeritus at University of Toronto

“After just a few minutes of reading Why Machines Learn, you’ll feel your own synaptic weights getting updated. By the end you will have achieved your own version of deep learning—with deep pleasure and insight along the way.”
Steven Strogatz, New York Times bestselling author of Infinite Powers and professor of mathematics at Cornell University

“If you were looking for a way to make sense of the AI revolution that is well underway, look no further. With this comprehensive yet engaging book, Anil Ananthaswamy puts it all into context, from the origin of the idea and its governing equations to its potential to transform medicine, quantum physics—and virtually every aspect of our life. An essential read for understanding both the possibilities and limitations of artificial intelligence.”
Sabine Hossenfelder, physicist and New York Times bestselling author of Existential Physics: A Scientist's Guide to Life's Biggest Questions

“Why Machines Learn is a masterful work that explains—in clear, accessible, and entertaining fashion—the mathematics underlying modern machine learning, along with the colorful history of the field and its pioneering researchers. As AI has increasingly profound impacts in our world, this book will be an invaluable companion for anyone who wants a deep understanding of what’s under the hood of these often inscrutable machines.”
Melanie Mitchell, author of Artificial Intelligence and Professor at the Santa Fe Institute

“Generative AI, with its foundations in machine learning, is as fundamental an advance as the creation of the microprocessor, the Internet, and the mobile phone. But almost no one, outside of a handful of specialists, understands how it works. Anil Ananthaswamy has removed the mystery by giving us a gentle, intuitive, and human-oriented introduction to the math that underpins this revolutionary development.”
Peter E. Hart, AI pioneer, entrepreneur, and co-author of Pattern Classification

“Anil Ananthaswamy’s Why Machines Learn embarks on an exhilarating journey through the origins of contemporary machine learning. With a captivating narrative, the book delves into the lives of influential figures driving the AI revolution while simultaneously exploring the intricate mathematical formalism that underpins it. As Anil traces the roots and unravels the mysteries of modern AI, he gently introduces the underlying mathematics, rendering the complex subject matter accessible and exciting for readers of all backgrounds.”
Björn Ommer, Professor at the Ludwig Maximilian University of Munich and leader of the original team behind Stable Diffusion

“An inspiring introduction to the mathematics of AI.”
Arthur I. Miller, author of The Artist in the Machine: The World of AI-Powered Creativity

“[An] illuminating overview of how machine learning works.”
Kirkus Reviews
Author photo © Rajesh Krishnan
Anil Ananthaswamy is an award-winning science writer and former staff writer and deputy news editor for New Scientist. He is the author of several popular science books including The Man Who Wasn’t There, which was long-listed for the PEN/E. O. Wilson Literary Science Writing Award. He was a 2019-20 MIT Knight Science Journalism Fellow and the recipient of the Distinguished Alum Award, the highest award given by IIT-Madras to its graduates, for his contributions to science writing.
