
The Same Distance Formula You Learned in School… Powers AI Today!

You open Netflix… and it suggests a movie you actually like. Your phone unlocks just by seeing your face. Your bank flags a fraudulent transaction instantly. But have you ever wondered how machines decide what is similar and what is different? Surprisingly, they use something you already learned in school: the distance between two points.

Distance between two points: A-level Maths curriculum

In AI, everything is converted into numbers — and distance tells the machine what is similar and what is different.


Whether it’s comparing users, detecting unusual behaviour, or grouping similar data, these distance measures help machines make sense of the world. What we once learned as “distance between two points” becomes a powerful tool for pattern recognition and decision-making in real-world systems.


Let’s understand this with a simple idea:


Imagine going from your home to a shop.


• The shortest path is a straight line → this is how L2 (Euclidean distance) works.


• But in real life, you follow roads (left, right, up, down) → this is L1 (Manhattan distance).


So, L2 = straight-line distance and L1 = step-by-step distance!
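The home-to-shop analogy can be sketched in a few lines of code (a NumPy illustration with made-up coordinates):

```python
import numpy as np

home = np.array([1.0, 2.0])
shop = np.array([4.0, 6.0])

# L2 (Euclidean): straight-line distance, square root of the summed squared differences
l2 = np.linalg.norm(shop - home)          # ord=2 is the default

# L1 (Manhattan): step-by-step distance, sum of the absolute differences
l1 = np.linalg.norm(shop - home, ord=1)

print(l2)  # 5.0 (a 3-4-5 right triangle)
print(l1)  # 7.0 (3 steps across + 4 steps up)
```

The L1 distance is never smaller than the L2 distance, just as the road route is never shorter than the straight line.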


In school, we mostly use these formulas to find the distance between points. But in AI, they are used far beyond that — to compare data, detect patterns, and drive intelligent decisions.


How we measure ‘how big’ a vector is — with the L² norm giving the familiar Euclidean distance.

What you studied as a simple formula is actually used by machines to compare things and make decisions.


Now, let’s explore how L1 (Manhattan) and L2 (Euclidean) distances are used in AI and industry.

“You might be surprised how many real-world problems use this simple idea.”

Controlling Model Weights Using L1 and L2 Regularization


Imagine you are preparing for an exam by memorizing answers to just a few questions. If the same questions appear, you score well, but when new questions come, you struggle. This happens because you didn’t truly understand the concepts—you only memorized them. The same thing can happen in Machine Learning, where a model memorizes the training data instead of learning patterns. As a result, it performs well on known data but fails on new, unseen data. To prevent this, we use regularization, which acts like a rule that stops the model from depending too much on specific information and helps it learn more generally.


Loss functions with L1 and L2 regularisation: two ways to control model complexity and improve generalisation.

There are two common types of regularization. L1 regularization (Lasso) removes less important features by making some values exactly zero, similar to ignoring unimportant chapters while studying. L2 regularization (Ridge), on the other hand, reduces the importance of all features slightly without removing them, like studying everything but not focusing too much on any one topic. Both methods add a small penalty to keep the model simple and balanced, helping it avoid memorization and perform better on new data. In short, L1 removes unnecessary parts, while L2 keeps everything smoothly controlled.
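A minimal sketch, assuming scikit-learn is available, makes the difference visible: on synthetic data where only two of five features actually matter, the L1 penalty (Lasso) zeroes out the noise features, while the L2 penalty (Ridge) keeps all of them but shrinks their weights.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first two features matter; the other three are pure noise.
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty: "ignore unimportant chapters"
ridge = Ridge(alpha=0.1).fit(X, y)   # L2 penalty: "study everything a little"

print(lasso.coef_.round(2))  # noise features driven to exactly 0
print(ridge.coef_.round(2))  # every feature kept, all coefficients shrunk
```

The exact coefficients depend on the data and the `alpha` penalty strength, but the pattern — exact zeros under L1, small-but-nonzero weights under L2 — is the defining behaviour of the two methods.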



Using L1 and L2 Distances for Measuring Similarity and Clustering


Imagine a teacher grouping students based on similar marks — students with close marks sit together, while others form different groups. AI does the same using distance. It checks how “close” or “far” data points are and groups similar ones together. If two points are very close, they belong to the same cluster; if they are far apart, they go into different clusters. Now here’s something interesting — when AI uses L2 (Euclidean distance), the clusters look more round like circles, while using L1 (Manhattan distance) creates diamond-shaped groups. In simple words, distance helps AI find patterns and organize data automatically.


In clustering, AI groups similar things together using algorithms like K-means.


K-means picks group centres and assigns each data point to the nearest one, measured by distance. If two points are close, they end up in the same group; if they are far apart, they go into different groups. L2 distance measures the direct straight-line distance, so it is the standard choice for natural, rounded groupings. L1 distance measures step-by-step distance (like moving along roads), so variants based on it cope better with irregular or extreme values. In simple words, K-means uses distance to decide which data points belong together.
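The marks example can be sketched with scikit-learn, whose `KMeans` uses L2 (Euclidean) distance; L1-based variants such as k-medians need other implementations. The marks below are made up for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

# Six students' marks: three low scorers and three high scorers
marks = np.array([[35.0], [40.0], [38.0], [85.0], [90.0], [88.0]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(marks)
print(km.labels_)  # the three low marks share one label, the three high marks the other
```

The algorithm never needs to be told what "low" and "high" mean — the distances between the marks are enough to separate the two groups.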


Anomaly Detection in AI Systems


In AI, anomaly detection means finding something unusual or different from normal behavior. For example, if your bank suddenly detects a very large transaction from your account, it may mark it as suspicious. The system first learns what your normal behavior looks like, and then compares new data with it. If something is very different, it is called an anomaly. L2 distance reacts strongly to one big change, like a sudden huge transaction, while L1 distance captures many small changes happening over time. In simple words, AI uses distance to check how “normal” or “abnormal” something is.


“L2 reacts strongly to one big deviation, while L1 captures the total accumulated deviation.”


AI measures how different something is by calculating distance. L2 distance gives more importance to big changes, so even one large difference can make the distance very high — which is useful for detecting sudden unusual events. On the other hand, L1 distance adds all small differences together, so it captures gradual changes over time. In simple words, L2 highlights big shocks, while L1 looks at overall change.
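The contrast shows up clearly in a toy sketch (made-up numbers, NumPy only): two observations deviate from the "normal" profile by the same total amount, yet L2 scores the single big shock far higher than the gradual drift.

```python
import numpy as np

normal_profile = np.zeros(10)  # toy stand-in for learned "normal" behaviour

one_big_shock = np.zeros(10)
one_big_shock[0] = 9.0                    # one large deviation
many_small_drifts = np.full(10, 0.9)      # small deviations everywhere

def l1_score(x):
    return np.abs(x - normal_profile).sum()

def l2_score(x):
    return np.linalg.norm(x - normal_profile)

# Both observations have the same total (L1) deviation of 9.0,
# but L2 ranks the single big shock far above the gradual drift.
print(l1_score(one_big_shock), l2_score(one_big_shock))          # 9.0  9.0
print(l1_score(many_small_drifts), l2_score(many_small_drifts))  # ≈9.0  ≈2.85
```

Which score to alert on depends on the threat model: L2 for sudden spikes, L1 for slow accumulated drift.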



Fairness in AI using L1 and L2 Norms


In AI, fairness means treating different groups equally. For example, if two groups are applying for a job, the system should not favor one group unfairly. AI checks this by comparing the results given to each group. If the difference is too large, the system is considered biased. To reduce this difference, AI uses L1 and L2. L1 tries to make the results almost exactly equal, while L2 allows small differences but avoids large unfair gaps. In simple words, L1 enforces strict fairness, while L2 keeps things balanced.


In practice, reducing unfairness means minimizing the gap between the results of different groups. An L1 penalty treats every unit of gap the same, so it keeps pushing until both groups are almost exactly equal. An L2 penalty squares the gap, so it tolerates small differences but punishes large unfair gaps heavily. In simple words, L1 tries to make everything equal, while L2 keeps things balanced without big gaps.


So, L1 is used when we want strict fairness and almost equal outcomes, while L2 is used when some flexibility is okay but large unfair differences should be avoided. Together, they help AI systems make more balanced and fair decisions.
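A toy calculation (with hypothetical gap values between two groups' outcome rates, chosen purely for illustration) shows how the two penalties scale differently:

```python
# Hypothetical gaps between two groups' positive-outcome rates
small_gap, large_gap = 0.05, 0.30

l1_small, l1_large = abs(small_gap), abs(large_gap)   # L1 fairness penalty: |gap|
l2_small, l2_large = small_gap ** 2, large_gap ** 2   # L2 fairness penalty: gap squared

# Widening the gap 6x raises the L1 penalty 6x, but the L2 penalty 36x.
print(l1_large / l1_small)   # ~6  -> L1 pressure grows linearly with the gap
print(l2_large / l2_small)   # ~36 -> L2 punishes large gaps much harder
```

This is why an L1 term keeps pressing until the gap is essentially zero, while an L2 term concentrates its pressure on the largest disparities.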


What you learned in school as simple distance formulas is actually used by machines to understand the world. From grouping data to detecting fraud and ensuring fairness, everything comes down to one idea — measuring how similar or different things are.



Written by Ashish Sharma, Guest Writer for MathsFlex Tutoring and Machine Learning expert (India).



