Azure Machine Learning – The Responsible Road

Artificial Intelligence can change people’s lives. It can help people see again, and it can help us get diseases diagnosed faster. It is helping me right now to write a blog post that reads more easily and has fewer spelling mistakes. Companies use AI to predict their sales and to optimize their production. AI assists us, wholly or partially, in driving a car. These are just some examples of what it can do.

But … Yes, there is always a but to every story. Data can be dangerous if used or interpreted in the wrong way. A famous example is Simpson’s paradox: a trend that shows up in every subgroup of the data can disappear, or even reverse, when the subgroups are combined. The data tells the truth, but not the full truth.
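Simpson’s paradox is easier to see with numbers. The sketch below is only an illustration: the two bank branches, the approval counts, and the helper names `approvals`, `rate`, and `overall` are all made up for this example and do not come from any real dataset.

```python
# Hypothetical approval counts: (approved, applications) per branch and loan size.
# These numbers are invented purely to illustrate Simpson's paradox.
approvals = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(approved, total):
    """Approval rate as a fraction between 0 and 1."""
    return approved / total

def overall(branch):
    """Pool all loan sizes of one branch and return (approved, applications)."""
    approved = sum(a for a, _ in approvals[branch].values())
    total = sum(t for _, t in approvals[branch].values())
    return approved, total

# Compare the two branches within each loan size ...
for size in ("small", "large"):
    a = rate(*approvals["A"][size])
    b = rate(*approvals["B"][size])
    print(f"{size:>9} loans: A = {a:.0%}, B = {b:.0%}")

# ... and then on the pooled data.
print(f"      all loans: A = {rate(*overall('A')):.0%}, B = {rate(*overall('B')):.0%}")
```

In this made-up data, branch A has the higher approval rate for small loans and for large loans, yet branch B has the higher rate overall. The reason is the mix: branch A mostly handles large loans, which are approved less often at both branches.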

Now imagine you are working for the Department of Corrections, and you need to build a model that predicts the risk of recidivism. Based on that prediction, the department could decide whether to grant people parole or not.

Or you work for a bank, and you build an algorithm that predicts whether a customer will be able to pay back a loan. Based on the result, the bank decides whether to give credit or not.

These projects can work great, but what if you cannot explain why Ms. Emma Smith cannot get a loan? Or why Mr. Kalifa Abara is a higher risk for recidivism? And should you, as a data scientist, have access to all this highly confidential data? (*)

Last year’s Build (2019) was all about AI for all skill levels, with many new features in Cognitive Services and automated ML.
Then we had Ignite (2019), where the focus was on bringing AI into production with MLOps, which applies the practices of the software development lifecycle to machine learning.

Today Build 2020 started, and Microsoft is now putting its focus on “Responsible ML.” This topic rests on three pillars.

Understand

“Why did Mr. Carl Jones not get a loan?”
It has happened many times that a customer leaves a bank office devastated because they did not get a loan and the banker could not give a reason.

For this, Microsoft is working on three projects, namely:

Interpret-Text: a library that incorporates explainers for text-based machine learning models and visualizes the results with a built-in dashboard

Fairlearn: a Python package that implements a variety of algorithms that mitigate unfairness in supervised machine learning

DiCE (Diverse Counterfactual Explanations): a library that generates “what-if” explanations for a model output
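To give a feel for what a “what-if” explanation looks like, here is a minimal sketch. It is not the DiCE API and not how DiCE searches for counterfactuals; it simply brute-forces a few small changes against a toy model. The scoring rule, the threshold, the applicant, and the names `approves` and `counterfactuals` are all hypothetical.

```python
from itertools import product

# Hypothetical toy credit model: NOT a real scoring rule and NOT the DiCE API.
def approves(income, debt, years_employed):
    """Return True when the made-up integer score reaches the threshold."""
    score = 5 * income - 8 * debt + 20 * years_employed
    return score >= 300

def counterfactuals(applicant, steps, max_results=3):
    """Brute-force search for small feature changes that flip a rejection.

    `steps` maps each feature name to the candidate changes to try.
    Results are sorted by the number of changed features, then by the
    total size of the changes, so the simplest suggestions come first.
    """
    names = list(steps)
    found = []
    for deltas in product(*(steps[n] for n in names)):
        candidate = {n: applicant[n] + d for n, d in zip(names, deltas)}
        if approves(**candidate):
            changed = {n: d for n, d in zip(names, deltas) if d != 0}
            found.append(changed)
    found.sort(key=lambda c: (len(c), sum(abs(d) for d in c.values())))
    return found[:max_results]

# Made-up applicant (income and debt in thousands, employment in years).
applicant = {"income": 50, "debt": 10, "years_employed": 1}
steps = {
    "income": [0, 5, 10],
    "debt": [0, -5, -10],
    "years_employed": [0, 1, 2],
}

print("approved today:", approves(**applicant))
for change in counterfactuals(applicant, steps):
    print("would be approved with:", change)
```

Each printed dictionary reads as an answer the banker could give: “you would have been approved if your debt were this much lower and you had been employed this much longer.” A real tool also has to make sure the suggestions are diverse and actually feasible for the person, which this sketch ignores.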

Protect

How do we distinguish between general information and private information in our datasets, and how can we protect the private part of the data?

Or what if you don’t want your data scientists to see the data at all? And how can we protect data while it is being processed?

To address these problems, Microsoft revealed some other great tools:

Differential Privacy WhiteNoise (Core & System): a differential privacy validator and runtime, plus tools and services for differentially private processing of tabular and relational data

SEAL SDK: an easy-to-use and powerful homomorphic encryption library; in other words, a way to do mathematical calculations on encrypted data

Open Enclave: new Azure VMs that use specialized hardware so that data can stay encrypted while it is being processed
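The core idea behind differential privacy, the first tool in the list above, can be sketched in a few lines: add carefully scaled random noise to a query result so that no single person’s record can be inferred from it. The sketch below uses the classic Laplace mechanism on a counting query. It is not the WhiteNoise API; the records and the function names `laplace_noise` and `private_count` are made up, and a plain floating-point implementation like this is for illustration only, not for production use.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(records, predicate, epsilon, rng):
    """Noisy count of matching records using the Laplace mechanism.

    A counting query changes by at most 1 when one person is added or
    removed (sensitivity 1), so the noise scale is 1 / epsilon.
    Smaller epsilon means more noise and stronger privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1 / epsilon, rng)

# Hypothetical, made-up records: (age, has_defaulted)
records = [(34, False), (51, True), (29, False), (45, True), (62, False)]

rng = random.Random(42)  # fixed seed only so the demo is repeatable
noisy = private_count(records, lambda r: r[1], epsilon=0.5, rng=rng)
print(f"noisy number of defaults: {noisy:.2f}")
```

The data scientist only ever sees the noisy answer. Averaged over many queries the noise cancels out, which is exactly why a real system also has to track how much of the privacy budget (epsilon) has been spent.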

Control

How do we explain why a new training iteration of a machine learning model suddenly gives different results?

Azure Machine Learning Audit Trail enables you to automatically track the experiments and datasets that correspond to each registered ML model.

Stay tuned for more details.

The items above are excellent new features that will be built into Azure Machine Learning (AML); some soon, others may still take some time.

Sammy Deprez and three other Microsoft AI MVPs were asked by Microsoft to have a look at these features. On Thursday, the 21st of May 2020, they kicked off with a live stream on Twitch (hosted by Henk Boelman and Tess Ferrandez).

Each of the four went more in depth into one of the features.

Eve Pardi (@EvePardi) talks about Interpret-Text. You can read her intro blog [here]
Willem Meints (@Willem_Meints) discusses Fairlearn. You can find his intro blog [here]
Alicia Moniz (@AliciaMoniz) looks into Differential Privacy. Her intro blog can be found [here]
Sammy takes you into the magical world of Confidential ML with Microsoft SEAL and Open Enclave.

If you want to hear more details about these topics, the recording of the session will be available soon.

Stay tuned!

*: The names used in this article are random, not based on any real person.

This blog was originally posted on https://www.datafish.eu/article/azure-machine-learning-the-responsible-road/.
