I Thought Logarithms Were Useless… Then I Discovered They're Behind Machine Learning!

I still remember sitting in class, staring at a board full of logarithms, completely zoned out. It all felt pointless. Like, why am I learning this? When is this ever going to matter in real life?


The teacher kept insisting it was important—that someday it would make sense. But honestly, it sounded like one of those things teachers say just to make a topic feel less useless. So I did what most of us do: learned it for the exam, wrote it down, and then forgot about it.


Then a few years later, something unexpected happens—you start seeing patterns everywhere. Apps recommend exactly what you want to watch. Banks detect fraud instantly. Models predict outcomes before they even happen. And when you look a little deeper, you realize this isn’t magic. It’s math. The same math.


Logarithms—the thing you thought you’d never use—are quietly sitting at the core, helping machines make sense of messy probabilities. And that’s when it hits you—not dramatically, but slowly: that “useless” chapter was never useless at all.

So let’s go back to that same question—but this time for real: what exactly is a logarithm, and why does it matter so much?



Before jumping into machine learning, let’s understand logarithms in the simplest way possible.


Imagine I ask you: “10 raised to what power gives 1,000,000?” The answer is 6.


That’s all a logarithm really does: it answers the question, how many times do we multiply the base to reach a given number?


Instead of writing the exponential form: y = 10⁶


We can write the logarithmic form: log₁₀(y) = 6


Both mean the same thing. But the logarithmic form pulls the exponent out into the open, which makes it far easier to work with, especially once the numbers get enormous.
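As a quick sanity check, this question-and-answer view translates directly into code (a minimal sketch using Python's standard math module):

```python
import math

# "10 raised to what power gives 1,000,000?"  The logarithm answers: 6.
power = math.log10(1_000_000)

# And exponentiation undoes it: 10 to that power recovers the original number.
original = 10 ** power
```

The two functions are inverses of each other, which is the whole trick we will lean on later: move into log space to do the work, move back out when we need the answer.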


And this simple idea becomes incredibly powerful when we deal with probabilities—especially very small ones.


Part 1: The Probability Problem That Logarithms Solve


I was building a simple spam detector that looked at signals like unusual words (0.7), an unknown sender (0.6), and suspicious intent (0.8). To combine them, I did what math suggests—multiply the probabilities.


0.7 × 0.6 × 0.8 = 0.336

For a few features, this works fine. But real machine learning models deal with hundreds or even thousands of features. Now imagine multiplying a number like 0.9 again and again 100 times:


0.9^100 ≈ 0.0000266

The result becomes extremely small—almost zero. At this point, computers start losing precision, rounding errors creep in, and the model becomes unstable.


This is where logarithms step in. Instead of multiplying probabilities directly, we take logarithms first:

log(0.7 × 0.6 × 0.8)

Using log properties, this becomes:

log(0.7) + log(0.6) + log(0.8)

Now multiplication has turned into addition, and the values remain stable instead of collapsing toward zero. When we need the original probability back, we simply exponentiate:

e^(log(0.7) + log(0.6) + log(0.8)) = 0.7 × 0.6 × 0.8 = 0.336


"We don’t lose information—we just store it in a safer form."


This simple transformation—turning multiplication into addition—is one of the key reasons machine learning works at scale. And this idea directly leads us to log loss in logistic regression. This is the first reason logarithms are fundamental to machine learning: they allow us to multiply many small probabilities without our computers melting down.


Part 2: The Logistic Regression Problem—Predicting Yes or No


When I first tried to predict a simple yes/no outcome (like pass or fail), a straight line felt natural. But it quickly broke—giving probabilities greater than 1 or less than 0. A linear model just doesn’t respect the rules of probability. We need something that always stays between 0 and 1, no matter the input.


So instead of predicting probability directly, we take a smarter route: we predict odds, and then take their logarithm. This transforms the problem into something a straight line can handle:


log(p / (1 − p)) = β₀ + β₁x₁ + β₂x₂


Now the output can be any real number (−∞ to +∞), making it perfect for linear modelling. And when we convert it back, something beautiful happens—the straight line turns into an S-shaped curve (sigmoid) that always stays between 0 and 1.


This is the real magic: logarithms act as a bridge. They take us from messy probabilities to clean linear equations—and back to valid probabilities again.
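The round trip described above, from probability to log-odds to a straight line and back, can be sketched in a few lines (the coefficient values b0 and b1 are made up purely for illustration):

```python
import math

def logit(p: float) -> float:
    """Probability -> log-odds: maps (0, 1) onto the whole real line."""
    return math.log(p / (1.0 - p))

def sigmoid(z: float) -> float:
    """Log-odds -> probability: the S-shaped curve, always in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, chosen only to show the shape of the model.
b0, b1 = -4.0, 0.08

def predict(x: float) -> float:
    """A linear model in log-odds space, converted back to a probability."""
    return sigmoid(b0 + b1 * x)

# sigmoid undoes logit (up to floating-point rounding)...
roundtrip = sigmoid(logit(0.75))   # ≈ 0.75

# ...and predictions stay inside (0, 1), unlike a raw straight line.
low, high = predict(0.0), predict(100.0)
```

The linear part b0 + b1 * x is free to output any real number; the sigmoid is what guarantees the final answer is always a valid probability.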


Without logs, logistic regression simply wouldn’t exist.


Part 3: How the Model Learns – Log Loss & Confident Mistakes


Once our model starts predicting probabilities, the next question is: how does it improve? The answer is simple: it learns from its mistakes using something called log loss, built on the natural logarithm of the predicted probability. Think of it like a penalty score that punishes bad predictions.


Before jumping into formulas, let’s understand this in a very natural way.


Imagine a teacher trying to predict whether a student will pass an exam. Instead of giving a strict “pass” or “fail,” the teacher gives a confidence score—like “I’m 90% sure this student will pass” or “I’m only 50% sure.”


This is exactly how machine learning models work. They don’t just give answers—they give probabilities, which tell us how confident the model is about its prediction.


Now, once the model starts making these predictions, we need a way to measure how good or bad those predictions are. And not just whether they are right or wrong—but how confident the model was while being right or wrong.



Case 1: Model is confident and correct

Prediction = 0.9 (90% chance of passing)


Log loss = −log(0.9) ≈ 0.10 → Very small penalty 


The model was right and confident: perfect.


Case 2: Model is unsure


Prediction = 0.5 (50% chance)


Log loss = −log(0.5) ≈ 0.69 → Moderate penalty 


Not wrong, but not very helpful either.


Case 3: Model is confident but wrong


Prediction = 0.1 (10% chance of passing)


Log loss = −log(0.1) ≈ 2.30 → Huge penalty 


The model was very confident… but completely wrong.



Now look at the pattern carefully. When the model is correct and confident, the penalty is very small. When it is unsure, the penalty is moderate. But when it is confident and wrong, the penalty becomes very large. This is the key idea behind log loss.


It doesn’t just check whether the prediction is right or wrong—it also checks how confident the model was. In real life, being slightly unsure and wrong is acceptable, but being highly confident and wrong can lead to serious mistakes.


That’s why log loss is designed this way—it pushes the model to not only be accurate but also honest about its confidence.
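The three cases above can be reproduced in a few lines. This is a sketch of binary log loss for a single prediction, with log meaning the natural logarithm:

```python
import math

def log_loss(y_true: int, p: float) -> float:
    """Penalty for predicting probability p of the positive class
    when the true label is y_true (1 = pass, 0 = fail)."""
    return -math.log(p) if y_true == 1 else -math.log(1.0 - p)

# The student actually passes (y_true = 1):
confident_right = log_loss(1, 0.9)   # ≈ 0.105, tiny penalty
unsure          = log_loss(1, 0.5)   # ≈ 0.693, moderate penalty
confident_wrong = log_loss(1, 0.1)   # ≈ 2.303, huge penalty
```

Note that the penalty is unbounded: as a wrong prediction approaches 100% confidence, −log(p) grows toward infinity, which is exactly the pressure that keeps the model honest about its uncertainty.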


Part 4: Understanding Coefficients—Turning Log-Odds Back into Probability


Here's where the college-level math comes full circle. Remember when you learned about exponentiating to "undo" logarithms? In logistic regression, this matters deeply for interpreting your model.


When we exponentiate our logit equation, we get back the probability:


p = e^(β₀ + β₁x₁ + β₂x₂) / (1 + e^(β₀ + β₁x₁ + β₂x₂))

But here's the practical magic for business stakeholders: when we exponentiate a single coefficient—say e^β₁—we get the odds ratio.



Example: In a bank's loan approval model, credit_score has a coefficient β₁ = 0.03.


Interpretation: Each additional point in credit score multiplies the odds of loan approval by 1.0305—or increases the odds by 3.05%.


This is how data scientists communicate with business teams. The exponential function transforms abstract log-odds back into intuitive language: "For every point higher, your odds improve by this percentage."
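That interpretation is two lines of arithmetic, shown here with the loan example's hypothetical coefficient of 0.03. One thing worth flagging: multi-point changes compound multiplicatively rather than adding up.

```python
import math

beta_credit = 0.03                       # hypothetical coefficient

odds_ratio = math.exp(beta_credit)       # ≈ 1.0305 per extra point
pct_increase = (odds_ratio - 1) * 100    # ≈ 3.05% higher odds per point

# Odds ratios compound: a 10-point jump multiplies the odds by
# exp(10 * 0.03) ≈ 1.35, slightly more than the naive 1 + 10 * 3.05% = 1.305.
ten_points = math.exp(10 * beta_credit)
```

This is why odds ratios are reported as multipliers: stacking them means multiplying, which in log-odds space is just adding coefficients.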



Without understanding how to exponentiate coefficients, you'd be stuck with formulas. With it, you have actionable insights.


Part 5: Why Logarithms Were Never "Useless"


Back in school, logarithms often felt abstract. But in reality, they quietly solve some of the biggest problems in machine learning:


Handle extremely small probabilities by turning multiplication into addition


Convert nonlinear problems into simple linear ones through the logit transformation


Create smart penalty systems (log loss) that punish confident mistakes more than uncertain ones


Interpret model coefficients by exponentiating to find odds ratios


In short, logs make models stable, learnable, and reliable.


They're already everywhere around you—when an AI helps detect diseases, when your bank flags fraud, when spam emails are filtered, or when apps recommend what you should watch next. In all these cases, logarithms are working behind the scenes to turn messy real-world data into meaningful predictions.


And this is just the beginning. As you go deeper into AI—neural networks, language models, or data science—you'll keep seeing logarithms again and again. They're not just a topic from math class—they're one of the core ideas powering modern technology.


The truth is simpler than we thought: logarithms aren't useless. They're essential. And now you know why.


Written by Ashish Sharma, Guest Writer for MathsFlex Tutoring and Machine Learning expert (India).



