Binary Prediction Archives - United InfoLytics

Machine Learning & Finding What You Seek

Peter VanWylen — Thu, 18 Aug 2022 21:17:08 +0000

This is one of a series of posts that covers machine learning and its applications. The goal is to discuss the similarities between human and machine learning processes, and to use this understanding to think productively about wins and losses in any endeavor or sphere of life. We will start by talking about how machine “learning” actually learns. I have removed most of the jargon and replaced it with everyday language, because explained this way these concepts are truly accessible to anyone. If you find this post interesting or helpful, do check out the rest of the series.

One note: I use the words algorithm and model a fair bit below. You can think of the algorithm as the procedure used to approach a learning task. The model is the trained state of the algorithm after it works through some training data. A model potentially gets updated in some way each time the algorithm gets new data to learn from. The model is the sum total of the algorithm’s current learning towards the task at hand.

Similar to a human mind without experience, a machine-learning algorithm without training data to learn from is not very useful.

You Be The Machine

To start, I’m going to give you a very quick learning task that is something we might ask a computer to work on. I will have you work on classification of images. Are you ready? You’ve got this! First your training dataset:

A simple training dataset from which an ML algorithm can learn

Pretend you are the computer. You are given a set of 8 training images each with a label. Look at each of the images as if you’ve never seen these things before; try to learn what a cat is and what a dog is. Note that one good thing about this dataset is that we are teaching the computer that cats and dogs come in a variety of colors/patterns and that these aren’t the defining aspects of the thing. In training a real model, we might give the computer 50 to 50,000 examples of each instead of 4 each, but you get the idea. Now that you have trained yourself on these 8 images, I have a question for you: what are the things in each of the following two images? Before you answer based on your years of lived experience, I want you to attempt to answer as if the only training you have on dogs and cats is the above 8 images.

If you answered “dog” for the first one, great! You have earned your keep as a machine learning algorithm. If you answered “dog” for the second one, I think there’s a chance you cheated and used your life experience instead of just the training data! Look again, and you’ll clearly see that in some ways, the second image looks more like the 4 cats in the training images above. In the training data, all the cats have ears that stand up, they are all sitting upright, and each of the pictures shows their whole body including legs. But each of our training images for the dogs shows just a dog face—no legs, no body.

If you answered “cat” for the second image, I would say that you did your job just fine and there’s nothing wrong with you as a machine learning algorithm. What’s wrong is the fact that our training dataset was very small and lopsided, and thus it biased you, the machine, to think that any furry creature sitting down on its hind legs with pointy ears is a cat. Given that the face in the second image is definitely more doglike than catlike, the model could also be inconclusive, saying something like “30% chance of dog, 60% chance of cat, 10% neither.” The facial structure has it thinking it’s somewhat doglike, but the ears and the pose have it thinking more catlike.
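To make the “30% dog, 60% cat, 10% neither” idea concrete, here is a minimal sketch of softmax, one common way a classifier turns raw scores into percentages that add up to 100. The raw scores below are made up for illustration; a real model would compute them from the image.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtracting the max keeps exp() numerically stable
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical raw scores for the second test image: the pose and ears
# pull toward "cat", while the face pulls toward "dog".
scores = {"dog": 1.1, "cat": 1.8, "neither": 0.0}
probs = softmax(scores)

for label, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {p:.0%}")
```

With these invented scores the model leans toward “cat” without being certain, which is the inconclusive answer described above.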

Before reading on, I have a question for you: what can you do without writing a line of code to get the algorithm working better to identify cats and dogs in a variety of positions, angles, etc.?

More Data is Good, Diverse Data is Better

You have now learned one of the first pitfalls of machine learning: an algorithm’s ability to label things or predict things is only as good as the training data it is fed. This is honestly true of human learning as well. If you are only exposed to a small slice of the world’s people or the world’s geography, your ability to understand and transfer what you’ve learned to people and places outside of your slice is limited.

Whenever I’m working with a machine learning algorithm, I tend to be exceptionally interested in the nature of the mistakes that it makes. I don’t spend a lot of time celebrating the 95% of things that it handled correctly; I spend my time looking closely at the 5% of things it got wrong, and I think through the main ways I might improve the machine’s performance so that it makes fewer mistakes:

Change the algorithm(s) used.
Work on ways to code some hints to give the algorithm a leg up on understanding the data—essentially formulas that give it some of the human understanding we have. These hints are called features, and this creative coding work is called “feature engineering.”
Work on gathering a larger and more diverse training dataset.

The first two of these are beyond the scope of this article and get a bit technical, but the third one is easy to understand. If we were to give the machine 100 images labeled as cats and 100 images labeled as dogs, it would almost certainly be more skilled at classification than it is with 4 of each. That said, if I still only provide training examples of cats sitting on hind legs and only have dogs’ faces instead of their whole bodies, my algorithm remains limited in its experience and is still limited in its ability to generalize what it has “learned” to a wider variety of situations. It’ll only be truly good at differentiating between sitting cats and the faces of dogs. It would be a more accurate classifier if trained on a reasonably wide set of animal positions, breeds, and backgrounds. Even if I only have time to give it a handful more dog images and cat images, it’ll learn the most and improve the most if those new images added to the training data are diverse. To learn more about AI that struggles in situations outside of its training, see this article on tricks that can beat the best Go-playing AI algorithms.

Admittedly most readers are more likely to be interested in using AI to rank potential customers or classify data in databases rather than classifying images of pets. The principles are the same, however, and building a diverse training set will be important for any business use of machine learning.

Getting philosophical on finding what you seek

In machine learning, it turns out that the most valuable training examples are those that are confusing or are outright surprises to the algorithm when it first sees them. To use human language, most algorithms seek to minimize the “cognitive dissonance” the model “feels” with regard to new, confusing cases. The technical term for this is “minimizing the loss,” but essentially you can think of it like this: the model changes most when it encounters a training example that is most surprising or most contrary to its current model of the task at hand. It changes significantly in order to ensure it would have a higher chance of success next time if presented with a similar image and asked to classify it.

If you train the algorithm on an additional 100 cat pictures and for all 100 it would have already known they were cats, it might improve just a bit in its abilities but not remarkably. If instead you give it just 20 additional cat pictures and 5 of them were surprises where the algorithm would have initially said, “not a cat,” this additional training data or experience is very impactful and will improve the accuracy of the model significantly—or at least it’ll decrease the model’s overconfidence.
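Here is a small sketch of why surprises matter more, assuming a model trained with logistic (cross-entropy) loss, which is one common choice and not necessarily what any particular product uses. For that loss, the size of the correction applied to the model’s raw score is proportional to the gap between the predicted probability and the true label. The probabilities and the learning rate are invented for illustration.

```python
import math

def log_loss(p, y):
    """Cross-entropy loss for one example: predicted probability p, true label y (1 or 0)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def update_size(p, y, learning_rate=0.1):
    """Size of one gradient step on the model's raw score.
    For logistic loss, the gradient with respect to the score is (p - y)."""
    return learning_rate * abs(p - y)

# Hypothetical predicted probability of "cat" for two pictures that really are cats (y = 1).
expected_cat = 0.95    # the model already thought "cat"
surprising_cat = 0.10  # the model initially said "not a cat"

for name, p in [("expected", expected_cat), ("surprising", surprising_cat)]:
    print(f"{name}: loss={log_loss(p, 1):.2f}, update={update_size(p, 1):.3f}")
```

The surprising cat produces a far larger loss and a far larger update than the expected one, so a handful of surprises can move the model more than a pile of confirmations.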

The same is true in life. Your understanding of the world is most improved by new information that you didn’t expect, and it only improves slowly when given additional information that already squares with your understanding of the world. This is why scientists often say that they long for surprises in their experimental data. Data that goes exactly opposite your working model forces you to think and learn a lot more than data that confirms what you already think.

Binary Classification—and when “no” leads to “yes”

Whether it comes to finding the right person to hire, the right next job, or the winning needles in a haystack of data, closed doors, failures, and mistakes are exceptionally valuable. The “nos” in life are the key to the next “yes.” It’s not just a cliche: closed doors are key to finding the open ones, and failure truly is only failure if you fail to learn from it.

I was recently working on a machine learning model that does binary classification, which is a fancy way of saying that it looks at a spreadsheet and for each row it tries to figure out if it’s a “yes” or a “no” based on what it has learned from other rows that are known to be “yeses” or “nos”. The model was working better than I had initially hoped, and it was identifying needles in a very large haystack of data that experts were missing with traditional non-ML search and labeling algorithms. I was thrilled that it was working so well, and I then decided to see how it would do labeling on a slightly wider set of real-world data. Since I was also working on creating a training dataset from scratch, I would manually review its predictions on new examples and then add each reviewed example to the training dataset with its confirmed label.

Right after I started throwing a wider set of data at it, it was performing absolutely horribly and was mislabeling things constantly. I had asked it to extrapolate only slightly outside of the training dataset, so I thought it would be no big deal, but it was. I manually reviewed perhaps 40 examples in a row that it thought were in the “yes” category but all of them were actually “no.” I started looking at my code to see if I had somehow broken it. After all, it had been working quite well the day before.

It turns out that nothing was broken. I stuck with it and kept saying, “actually that’s a no” to heaps of things that it was labeling as a “yes.” The model started to learn. The next day it was back to finding needles in a haystack with great success. It was only then that I realized that the key to getting back to winning was a whole bunch of mistakes that I fed back into the system so it would learn from them.
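The review-and-feed-back loop can be sketched with a deliberately tiny stand-in model. This is not the model from my project: it is a hypothetical one-number classifier that says “yes” when a value is above a learned threshold, and all the data is invented.

```python
def fit_threshold(examples):
    """'Train' a one-number model: the midpoint between the
    average 'yes' value and the average 'no' value."""
    yes = [x for x, label in examples if label]
    no = [x for x, label in examples if not label]
    return (sum(yes) / len(yes) + sum(no) / len(no)) / 2

def predict(threshold, x):
    """Label a row 'yes' (True) when its value is above the threshold."""
    return x > threshold

# Hypothetical narrow training set: (feature value, confirmed label).
training = [(9.0, True), (10.0, True), (9.5, True),
            (1.0, False), (2.0, False), (1.5, False)]
model = fit_threshold(training)

# Slightly wider real-world rows, with the answer a human reviewer confirmed.
review_queue = [(6.0, False), (6.5, False), (7.0, False), (9.8, True), (5.5, False)]
mistakes_before = sum(predict(model, x) != truth for x, truth in review_queue)

# Human-in-the-loop step: every reviewed row joins the training set
# with its confirmed label, and the model is retrained.
training.extend(review_queue)
model = fit_threshold(training)
mistakes_after = sum(predict(model, x) != truth for x, truth in review_queue)

print(mistakes_before, mistakes_after)
```

The corrected “nos” pull the threshold upward, so fewer of the wider rows are mislabeled after retraining. A fair evaluation would score the retrained model on fresh rows it has not seen; the same rows are reused here only to keep the sketch short.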

Getting started

Whether you are trying to see if machine learning can transform your work, or just reading for the purpose of growing your data skills, be encouraged that each “no” in life and in machine learning will help you clarify where to be looking for the next “yes.” If you want some consulting as you get started with machine learning, you want a complete outsourced solution for it, or even if you just want to ask questions to see if there is potential return on investment from machine learning in your line of work, then the answer is certainly “yes.” I’d love to talk. Set up a time today.

Admittedly, image classification or distinguishing between cat pictures and dog pictures is probably not a critical goal for your work. There are so many other things that machine learning can do well, and in many business applications the examples being learned and later predicted are actually rows from a spreadsheet or a database. See this earlier post for a wider explanation of the types of tasks ML can do well.

The post Machine Learning & Finding What You Seek appeared first on United InfoLytics.

A review of Salesforce Einstein Prediction Builder

Peter VanWylen — Tue, 14 Dec 2021 00:55:03 +0000

Salesforce isn’t exactly known for AI or Machine Learning (see Gartner Magic Quadrant), but they are working at it. I was pleased when in 2020 I discovered their Einstein Prediction Builder, which is (of course) an additional charge, but is available for basic usage for free. Any current Salesforce customer can use their “Try Einstein” program to see what is possible and actually do some real work with it too. Even if you don’t want to pay, you can start using Prediction Builder to model up to 10 different prediction tasks. As of the date of writing (early 2023), you can only have one turned on and active at any given time, meaning the free level of this service only allows one prediction to be populating predictions onto objects in real time as they are created or modified in the system. If you currently use Salesforce, read on to learn more. If not, United InfoLytics can still help you with your Machine Learning and Predictive Analytics needs across other platforms as well.

TL;DR Summary: there’s a bit of a learning curve to ML, but you owe it to yourself to try it using a simple and affordable AutoML tool like this, either by setting it up yourself (read carefully below) or by getting our team to set up a prediction for you, ensuring best practices in independent variable selection, training dataset filtering, etc.

Types of Predictions

What sorts of predictions is Einstein Prediction Builder capable of? Basically there are two types of predictions: a binary (yes/no or true/false) prediction or predicting a number through regression (this option is currently in beta). For the binary predictions, it involves making a prediction of the relative probability of an event occurring. For those using Salesforce in sales, this would often be the chance of a given lead making a purchase, or the chance of a quote becoming a deal. Basically it involves looking at a Salesforce object (Lead, Contact, Account, Opportunity, etc.) and making a prediction of the probability of an event occurring, usually a desired outcome like a student enrolling or a customer making another purchase. For the second type of prediction, it involves making a prediction of a number like total sales over the next 12 months across all current customers. This is a bit fancier because each prediction is a number (can be negative or positive, large or small), whereas a binary prediction is a relative probability between 0 and 100. Regression can be used to predict sales of $100 or $100,000 all using the same prediction builder.

Review of Einstein Prediction Builder

There are some things that Salesforce clearly got right with this product. Even if you don’t have any specific AI or machine learning training, you can get started with this, and it mostly works even if there are a few bugs you may encounter. In a sense, they might have one of the more user-friendly user interfaces for getting your feet wet in predictive analytics. And if you’re already on Salesforce, there’s no reason not to give it a try. Your data is already in the system, and there are no flat files to export and re-import into another machine learning system, no APIs or data exchange to set up. Give yourself 2 hours to play with it and you might create something really useful! Or if you don’t have the two hours, let United InfoLytics do it for you! With minimal investment, you may end up with a really useful system that helps you focus your efforts on the accounts and contacts that are most likely to purchase, enroll, participate, give, etc.

Basically the process for setting up a binary prediction is:

Pick an object you want to make predictions on. All the fields you want to reference must be on this object, and currently you cannot reference related objects. If you need to reference a related object, create a formula field that pulls in this data first.
Define the training set. This is the data that we now know the answer to: which ones were a “yes” and which ones were ultimately a “no” or a “not yet.” Tell it which accounts, contacts, leads, or opportunities are “water under the bridge” such that we have a “yes” or a “no” or some reasonable estimate on this. You would want to, for example, exclude a lead from the training dataset if it came in over the last 60 days. We shouldn’t call a new lead that came in yesterday a “no” just because they haven’t purchased yet!
Define a “yes” and a “no”. For many, this will be whether there was a sale or not. Help Salesforce know what criteria define a “yes” and they’ll assume all the others in the training set are “no”.
Tell Salesforce which fields to use to make predictions. This is possibly the most tricky step, and it’s also the most important. You should include anything that you think might be reasonably likely to predict the outcome in some small way. You should be careful not to include things like race, ethnicity, gender, etc. if the prediction needs to be equitable and not learn biases from the training data.
Think carefully about excluding fields that allow in some hindsight bias. Most important in the prior step is that you must avoid hindsight bias by excluding any fields that are in some way linked to a Salesforce object reaching the desired outcome. Let’s say that phone number is not generally gathered on most of your inquiry forms whereby new leads and contacts come into the system, but the phone number is always gathered as part of the sale process. If you allow Prediction Builder to use the phone number field to make predictions, it’ll do a very funny thing: it’ll say that the most powerful predictor of someone making a purchase is whether they provided their phone number vs. leaving it blank! But think carefully here: it’s not actually true. The fact that they made a purchase is what caused them to enter their phone number and not the other way around. My rule of thumb here is simple: include fields that are usually gathered prior to the desired outcome occurring and exclude any fields that are rarely or only sometimes collected but later fully populated after the desired outcome.
Hit go and grab some coffee. Once you are done defining your prediction, it’s time to let the machine do some “learning.” This is basically where the machine learning algorithms start working, looking at all the training data where you supplied a “yes” or a “no” and developing a model for which other fields predict the desired outcome being more or less likely. Right now, the algorithms it uses are (I believe) just Logistic Regression and Random Forest. It tries them both and figures out which one does a better job on the data! It may take 30 minutes to several hours for it to finish the learning process. Just go and start another task and occasionally come back to refresh and check whether it has completed.
Check the prediction scorecard. Salesforce has made a pretty easy to understand prediction scorecard that’ll help you avoid the two main problems you may encounter: a prediction that’s too good and a prediction that’s no good at all. If your prediction comes back too good, chances are high that you’ve allowed Salesforce to use a field that is tainted with the hindsight bias discussed above. If your prediction comes back pretty bad, you’ve either defined the machine learning problem poorly, or you’ve not given it the features that are most likely to be helpful and predictive, or your prediction problem just isn’t amenable to the data you currently have in your system and living on that object. Sometimes feature engineering here is helpful, like pulling in summary or calculated fields off of other objects. By adding a few additional “features” to your prediction, you may get better performance that starts to make your prediction worth using in real life to prioritize leads, contacts, and accounts for focused follow-up, calls, emails and other outreach.
When satisfied, turn on the prediction! If your scorecard comes back good, check to make sure that the features it says are good predictors make sense to you as meaningful predictors in your experience in the field you are working in. Anything that seems bogus might just need to be removed from the prediction. Once you’re happy, turn on the prediction and go get some more coffee. Slowly, it’ll start filling in predictions on all records that you’ve asked it to predict upon. Once this is complete, you can begin to act based on these predictions. One great way to start is with a list view or report that sorts records from highest to lowest probability of taking the desired outcome. Then start working the list (make those calls, send some text messages and emails) to see if the prediction plays out in real life. One interesting experiment here is to call the top 20 people on the list and the bottom 20 on the list. Hopefully you’ll see a significant difference in interest and likelihood of taking the desired actions between the top 20 and the bottom 20. If so, you’ve proven it works and you should start using this prediction across all your business processes where you are unable to evenly reach out to all contacts and you need to spend more time, effort and money on some over others. While mass email is cheap, phone calls are not. Send mass email to all your contacts—but save your phone calls for the people most likely to convert!
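Two of the steps above, defining the training set and screening for hindsight bias, can be sketched outside of Salesforce on exported records. This is an illustrative sketch only: the field names, the sample leads, and the fixed “today” date are all invented, and it does not use any Salesforce API. The hindsight check simply compares how often a field is filled in on “yes” records versus “no” records.

```python
from datetime import date, timedelta

# Hypothetical lead records; field names are illustrative, not Salesforce API names.
TODAY = date(2023, 1, 15)
leads = [
    {"created": date(2022, 3, 1),  "purchased": True,  "phone": "555-0101", "source": "web"},
    {"created": date(2022, 4, 12), "purchased": True,  "phone": "555-0102", "source": "event"},
    {"created": date(2022, 5, 3),  "purchased": False, "phone": None,       "source": "web"},
    {"created": date(2022, 6, 20), "purchased": False, "phone": None,       "source": "referral"},
    {"created": date(2022, 7, 8),  "purchased": False, "phone": "555-0103", "source": "web"},
    {"created": date(2023, 1, 10), "purchased": False, "phone": None,       "source": "web"},
]

def training_set(records, today, min_age_days=60):
    """Keep only records old enough that a missing purchase is a believable 'no'."""
    cutoff = today - timedelta(days=min_age_days)
    return [r for r in records if r["created"] <= cutoff]

def fill_rate(records, field):
    """Fraction of records where the field is populated."""
    if not records:
        return 0.0
    return sum(r[field] is not None for r in records) / len(records)

def leakage_gap(records, field, label="purchased"):
    """Fill rate on 'yes' records minus fill rate on 'no' records.
    A large gap hints that the field is filled in because of the outcome."""
    yes = [r for r in records if r[label]]
    no = [r for r in records if not r[label]]
    return fill_rate(yes, field) - fill_rate(no, field)

train = training_set(leads, TODAY)
for field in ("phone", "source"):
    print(field, round(leakage_gap(train, field), 2))
```

In this made-up data the lead from five days ago is left out of training, and the phone field shows a large gap between purchasers and non-purchasers while the source field shows none. How large a gap should count as suspicious is a judgment call; treat any flagged field as a prompt to ask when in your process it actually gets filled in.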

Caveats and Bugs

In a short time of working with the system, I encountered two bugs / shortcomings:

There is an undocumented issue where Prediction Builder is unable to “see” any records that haven’t been modified in the last two years. There is literally no reference to this in their documentation, and I had to go through a very long series of back and forth messages over a month for the problem to get escalated to someone who finally figured out that this was “intended behavior.” They said they would update the documentation on this, but honestly this is indicative of a very immature product if they think that all the training data should necessarily be from the last two years, and they don’t even document this “feature.”
Sometimes you can have a working prediction in the system and go in to tweak some small thing (add or remove a feature column or restrict the training dataset in some way) and at the end it refuses to save your changes giving you an error message. The error message doesn’t make sense and there’s also no way to resolve it. All you can do is leave the page. Every time this has happened to me (2-3 times) I’ve lost all my work on a given prediction. They are unable to recover your work, and basically you have to start over on defining the nature of the prediction problem, the positive case, the negative case, the features to use in the prediction, etc. This to me is again a marker of this product being immature. I have high hopes that someday everyone using Salesforce intelligently will be using this tool, but maybe right now it’s only a chance to get your feet wet and start imagining what is possible with AI/ML across your Salesforce database. I certainly wouldn’t pay for it yet.

Technical Details for ML Professionals

If you’re a machine learning professional, you will likely be a bit disappointed by some simplifications of the product. For example, you cannot pick a scoring function for the AutoML process. I honestly don’t even know what it is as currently implemented. You cannot get any sort of area under curve for model evaluation or comparison. You cannot even see what the second best performing model was or what the hyperparameters were. That having been said, you can use your ML knowledge to quickly and efficiently get something built and into production in an hour of active work plus the necessary coffee breaks while the model is trained and scored. This is better than can be said for many other machine learning pipelines! If your company or your client is using Salesforce, it’s worth a try as the investment is very low and the return on investment possibly very high in terms of increased sales with decreased effort.
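If you want an area-under-the-curve number anyway, you can compute one yourself. The sketch below assumes you can get a set of records out of the system with both the predicted score and the outcome that was later observed, ideally records that were not part of the training set. The outcomes and scores below are invented. It uses the pairwise definition of ROC AUC: the chance that a randomly chosen “yes” is scored higher than a randomly chosen “no.”

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison:
    the share of (yes, no) pairs where the 'yes' scores higher (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one 'yes' and one 'no'")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical holdout: actual outcomes and the 0-100 scores on each record.
actual = [1, 0, 1, 0, 0, 1, 0, 0]
score = [91, 15, 64, 40, 72, 80, 8, 33]
print(f"AUC = {roc_auc(actual, score):.2f}")
```

A value of 0.5 means the scores rank records no better than chance, and 1.0 means every “yes” outranks every “no.” The pairwise loop is fine for a few thousand records; for larger exports a rank-based implementation from a standard library would be the better choice.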

The post A review of Salesforce Einstein Prediction Builder appeared first on United InfoLytics.