Most subscribers will by now understand that when we talk about Artificial Intelligence these days, the back end is just machine learning models, predominantly neural networks. However, I would like to take this opportunity to reiterate that this does not mean simpler models such as linear regression and decision trees do not have their place.
What is machine learning? If you study most of the algorithms associated with machine learning, particularly supervised learning, you will start to understand that machine learning is just a probabilities machine. This means that, at the end of the day, machine learning models generate probabilities. In an estimation problem: what is the probability that the final figure falls within a specific range? In a classification problem: what are the associated probabilities of the different outcomes? In unsupervised learning, while the model discovers structures in the data, it can only say those structures are highly probable, because there are instances where the structures are not followed. Take cluster analysis: by determining which cluster an entity is 'closer to', we can estimate the related probabilities.
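To make this concrete, here is a minimal sketch of the cluster-analysis example, assuming scikit-learn and NumPy are available (the data and cluster count are made up for illustration):

```python
# A toy illustration: convert each point's distances to the cluster
# centroids into rough membership probabilities via a softmax.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))             # made-up 2-D data

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
distances = kmeans.transform(X)           # distance of each point to each centroid

# Closer cluster -> higher probability; a softmax over negative
# distances yields a probability distribution over clusters per point.
logits = -distances
logits -= logits.max(axis=1, keepdims=True)   # for numerical stability
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

print(probs[0], probs[0].sum())           # each row sums to 1.0
```

The exact numbers are not the point; the point is that even a 'hard' clustering can be read probabilistically, and a point sitting between two clusters carries genuine uncertainty.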
What does that mean, you might ask? Ahem…it means some situations can go wrong, or worse, very wrong. The practical takeaway is that, as a business, you need to be prepared for situations where it goes WRONG!
The Law of Large Numbers states that, over a large enough number of occurrences, observed frequencies converge to the "true" probabilities. The flip side of this is that unless a model is correct with 100% probability, machine learning going wrong at some point is effectively guaranteed.
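A quick back-of-the-envelope illustration of that flip side: if each prediction is wrong with probability p > 0, the chance of at least one wrong prediction over n predictions is 1 - (1 - p)^n, which races towards certainty. The 1% error rate below is an assumed figure, not a benchmark:

```python
# How fast does "at least one error" become near-certain?
p = 0.01                                  # assumed per-prediction error rate
for n in (10, 100, 1_000, 10_000):
    at_least_one = 1 - (1 - p) ** n
    print(f"n={n:>6}: P(at least one error) = {at_least_one:.4f}")
# Roughly: 0.10 at n=10, 0.63 at n=100, and ~1.00 by n=1,000.
```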
Businesses can do the following to prepare for it.
Prepare for the foreseeable. Scenario planning helps: state how something can go wrong and plan what to do when it does.
Design a recovery process for the unforeseeable. A good recovery process will go a long way in rectifying errors and building up loyalty.
To conclude: machine learning produces probabilities, and that means it can go wrong. Determine whether your company can bear the consequences of a wrong prediction, and best be prepared for it! It is part of the price of using AI these days. :)
What are your thoughts on this? Do share them in the comments below. :)
Starting in the New Year, I am looking for folks to discuss Cybersecurity X Artificial Intelligence and Agentic Workflows. If you are interested in bouncing ideas around with me, please PM me on LinkedIn. Thank you!
Consider supporting my work by subscribing, sharing, and liking the post. Additional support can be given at my buymeacoffee. :)
NOTE: Firstly, thank you very much for following each of my posts despite your busy schedule. Greatly appreciated, especially to the folks who constantly hit the “Like” and “Share” buttons.
Circumstances have changed in recent weeks, and I foresee needing more time on both the career and family fronts for a while. On the career front, I will be setting up a boutique AI consulting and venture studio called Project Asia Data. For AI events in the Southeast Asia region, I will be working with AIMX. Do give us a follow.
If you are looking to improve your career prospects by sharpening your visual analysis skills, I am launching a new Visual Analytics program together with Eric Sandosham! Do spread the word, please! :)
As such, I have decided to reduce the newsletter frequency to once every fortnight, and I hope to still get your support through "Likes", comments, and subscriptions. Thank you!
Past Issues You Might Be Interested In
Keeping Data Projects Library
So your company has decided to start building up its data and AI capabilities; how should you prepare the ground so that it can continuously gain value from data and AI?
Think Vs Compute (Part 2)
In a previous issue (here), I wrote about the differences between think vs compute. After writing the post and gathering some feedback from subscribers, I felt that I still had not captured its essence, and so I continued to ponder it during my long walks.
Just a small clarification: ML generates estimated probabilities. Folks unfamiliar with ML assume those probabilities can be taken at face value, e.g. that 10% means it will work exactly 1 out of 10 times. The goodness of the estimates comes down to the quality of the input data, the appropriateness of the algorithm, and finally, the background / operating context (which is not directly captured but implied through the input data).
So ultimately, it comes down to the probability that a probability works! :)
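To close the loop, here is a minimal sketch of a calibration check, i.e. asking whether a predicted "10%" actually comes true about 1 time in 10. Everything here is synthetic, assuming scikit-learn is available:

```python
# Compare predicted probabilities against observed frequencies
# using scikit-learn's calibration_curve on synthetic data.
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_prob = model.predict_proba(X_te)[:, 1]   # estimated P(class = 1)

# For each bin of predicted probability, what fraction of cases were
# actually positive? Well-calibrated bins sit close to the diagonal.
prob_true, prob_pred = calibration_curve(y_te, y_prob, n_bins=10)
for pred, true in zip(prob_pred, prob_true):
    print(f"predicted ~{pred:.2f} -> observed {true:.2f}")
```

If the observed frequencies track the predicted ones, the probabilities are usable as-is; if not, that gap is exactly "the probability that a probability works".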