How To Create Jupyter Notebook In IBM Watson Studio

Deploy and Explain Neural Networks using IBM Watson and OpenScale

Before jumping into the step-by-step tutorial on how to deploy a machine learning model using Watson Studio and explain the results using OpenScale, I would like to tell a short story about why we actually need AI and why AI is the next step in the evolution of mankind. If you are interested only in the technical part, feel free to jump to the second section, called IBM Watson.

Why do we need tools like AI

When most people hear terms like Artificial Intelligence or Machine Learning, they think about robots. The reason is that big-budget movies present AI as a human-like machine with superpowers and the potential to destroy humanity. Actually, AI doesn't really need arms and legs to destroy humanity, but this is another story.

Artificial Intelligence is based on the idea that human intelligence, behavior and the human mind can be modelled in such a way that a machine (or an algorithm) can mimic humans and execute tasks that until now required human intervention and human-like reasoning. In my opinion, we should not mimic human-like behavior, because humans are very error-prone. We should create better machines and algorithms that carry out repetitive tasks for us, so we can focus on more important things, such as tasks that require more creativity, imagination, emotion and other human qualities.

Progress, innovation and evolution require better and better tools. Without tools, our physical and mental capabilities are very limited. If you think about the history of mankind, every epoch is defined by tools. Even prehistory is delimited by tools: by definition, it is the period between the use of the first stone tools and the invention of writing systems. There were multiple "ages" between these two events, like the Bronze Age or the Iron Age, which are all named after their tools.

If you think about it, our modern history is also defined by tools: the most important events of the last few centuries are defined by the invention of new tools, machines and devices. Yes, I'm talking about the four industrial revolutions. What were the reasons for those revolutions? The answer is tools!

During the first industrial revolution, the biggest changes came in the form of mechanization. Mechanization was the reason why industry started to replace agriculture as the backbone of the economy.

Almost a century after the first industrial revolution, the world went through the second. It started at the end of the 19th century, with massive technological advancements in industry that helped the emergence of new sources of energy: electricity, gas, and oil. The highlights of the second industrial revolution were the growing demand for steel, chemical synthesis and new methods of communication such as the telegraph and the telephone.

The third revolution brought forth the rise of electronics, telecommunications and of course computers. Through these new technologies, the third industrial revolution opened the doors to space expeditions, research, and biotechnology.

The fourth industrial revolution, the so-called Industry 4.0, is happening right now and it is defined by one thing: the internet. We can see the transition from the first industrial revolution, which was rooted in technological change, all the way to Industry 4.0, which develops virtual reality worlds that allow us to bend the laws of physics.

As you can see, all these major events in our history were defined by tools, and Artificial Intelligence is another powerful tool which can and will reshape our world. In my opinion, AI is an absolutely necessary step in the evolution of mankind (and maybe it will be the last revolution in the history of mankind, but this is another story, for another article). If we want to conduct more advanced research (for example, to travel at the speed of light or to invent a way to teleport), we need much more powerful tools, and one of them is AI. Another big event will be the invention of quantum computers and then the combination of quantum computing and artificial intelligence. From that point on there is no limit to invention, and we can imagine a future where the researchers are AI algorithms and not humans, but this is a very distant and very fragile future. Let's stick to the present and talk about what kind of tools we have now for AI/ML.

Nowadays we have lots of devices connected to the internet (the fourth industrial revolution), and each of these devices generates data. Because we want intelligent and personalized devices that can anticipate our needs, we need to process this data. Processing such a tremendous amount of data is a very hard task that cannot be done manually, so automatic reasoning about the data is absolutely necessary. This is where AI comes to the rescue. But after automatic reasoning, we also want to understand the reasons behind the actions of the AI algorithm. This is how the hype around explainability arose. To read more about current trends in Explainable AI, see my article called Explaining Explainable AI.

IBM Watson

The platform developed by IBM, Watson Studio, combined with Machine Learning Services and Watson OpenScale, is a comprehensive toolset designed to make it easy to develop, train and manage models and to deploy AI-powered applications. It is a SaaS solution delivered on the IBM Cloud. It gives your data scientists, engineers, developers and domain experts the tools they need to collaborate and drive innovation in their business.

Watson Studio accelerates the machine and deep learning workflows required to infuse AI into your business to drive innovation. It provides a suite of tools for data scientists, application developers and subject matter experts, allowing them to collaboratively connect to data, wrangle that data and use it to build, train and deploy models at scale. Successful AI projects require a combination of algorithms + data + team, and a very powerful compute infrastructure.

The only problem with this platform is that the documentation can be hard to understand and community support is limited: there are very few articles about how to use Watson and almost none about creating applications with it. So I've decided that we should close this gap. Watson is a great tool, backed by a very strong infrastructure, and besides creating and deploying models, it also offers out-of-the-box explainability for AI models. To explain the results of a model, it uses the LIME algorithm, which was presented in Explaining Explainable AI.

In the next section, you will see an example of how to build a Neural Network to predict diabetes, and then how to use the tools mentioned above to deploy the model, set up continuous learning and drift detection, and explain the results of the Neural Network.

Diabetes prediction in Watson

In this example, we will use a Neural Network to predict diabetes using the Pima Diabetes Dataset. We will use the Keras package to quickly build and train a neural network. We will use Watson Studio to build, train and deploy the network, and OpenScale to set up drift detection and continuous learning and to explain the results of the model.

The dataset used in this example can be downloaded from Diabetes.csv. The dataset consists of several medical predictor variables and one target variable, Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on. The columns of the dataset can be described as:

Number of times pregnant
Plasma glucose concentration at 2 hours in an oral glucose tolerance test
Diastolic blood pressure (mm Hg)
Triceps skin fold thickness (mm)
2-Hour serum insulin (mu U/ml)
Body mass index (weight in kg/(height in m)²)
Diabetes pedigree function
Age (years)
Class variable (0 or 1)
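To make the column list easier to reuse in a notebook, it can be written down as a small Python mapping. This is only a sketch: apart from Outcome, which the text above names as the target, the column names are assumptions based on the commonly distributed version of the dataset, so check them against the header of your own Diabetes.csv.

```python
# Column names (assumed, except "Outcome") mapped to the descriptions listed above.
COLUMN_DESCRIPTIONS = {
    "Pregnancies": "Number of times pregnant",
    "Glucose": "Plasma glucose concentration at 2 hours in an oral glucose tolerance test",
    "BloodPressure": "Diastolic blood pressure (mm Hg)",
    "SkinThickness": "Triceps skin fold thickness (mm)",
    "Insulin": "2-Hour serum insulin (mu U/ml)",
    "BMI": "Body mass index (weight in kg/(height in m)^2)",
    "DiabetesPedigreeFunction": "Diabetes pedigree function",
    "Age": "Age (years)",
    "Outcome": "Class variable (0 or 1)",
}

# The last column is the label; everything else is a predictor.
TARGET_COLUMN = "Outcome"
FEATURE_COLUMNS = [name for name in COLUMN_DESCRIPTIONS if name != TARGET_COLUMN]

for name, description in COLUMN_DESCRIPTIONS.items():
    print(f"{name}: {description}")
```

Keeping the feature and target names in one place avoids retyping them later, for example when selecting the feature columns during the OpenScale configuration.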

To solve this problem, let's first set up our IBM account and prepare the services needed for diabetes prediction.

IBM Cloud — Watson step-by-step deployment

The first thing to do is to create an account on IBM Cloud and log in to your account.

IMPORTANT NOTE: for best performance and compatibility, all resources should be in the same region. In this tutorial, I will use Frankfurt as the region for every resource. You can choose another region, based on your account type and your current location.

From here, please follow the steps below:

1. In the top-right corner of the screen click on your account picture and then click on My IBM.

This will take you to your IBM Dashboard

2. On your IBM Dashboard you will see your products. From the Products section launch the IBM Cloud

This will take you to your IBM Cloud Dashboard where we can start creating the services needed.

Alternatively, you can use https://cloud.ibm.com/ to access the IBM Cloud directly

3. The next step is to add a new resource to your cloud account. On your IBM Cloud account, click on the Create Resource button.

4. Scroll down and select the Watson Studio resource.

5. Select the region (I will select Frankfurt), make sure you've selected the Lite plan (if you don't want to pay) and click Create.

6. After the Watson Studio resource is ready to use, click Get Started.

7. In Watson Studio, we will create our first project, called Diabetes. For this click on the Create Project button.

8. And then click on the Create Empty Project button

10. Choose a name for the project (here Diabetes). The next step is to link this project to an Object Storage (where we will store our project-related files, like the .csv file for the training data, the notebook and the resources created by the notebook, etc). To create a new Object Storage, click on the Add button.

11. Select the Lite plan (if you don't want to pay) and click Create

12. In the Pop-up window give a name to the Object Storage Account and click Create

13. Refresh the project creation page and you should see the Object Storage name that you've created linked to the project. Click on Create to create the new Watson Studio project.

14. The project will be opened automatically (if not, click on the name of the project on the Watson Studio home page). Now we will add a Jupyter notebook to the project, which will be used to create our Neural Network, train it, save it and deploy it to the IBM Cloud. For this, click on the Add to project button.

15. Select the Notebook button.

16. Give it a name (like Diabetes prediction) and for the Runtime, select the Default Python 3.6 Free runtime. Click Create notebook and wait for the notebook to be initialized.

17. The next step is to upload the training data which will be used by the notebook. For this click on the little matrix icon in the action bar, click on Browse and select the training data .csv file.

18. The next step is to read the data from the uploaded .csv file. For this click on the Insert to code link, which will automatically add a block of code to the notebook. If you run this cell (select the cell and hit Shift+Enter) it will connect to the Object Storage, read the csv file and show the first 5 rows of the file.
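The generated cell depends on your own Object Storage credentials, so it is not reproduced here. As a rough sketch of what its final lines do, the snippet below loads a CSV into a pandas DataFrame and shows the first 5 rows. It reads from an in-memory string instead of Object Storage, and both the column names and the handful of rows are illustrative assumptions about the file layout.

```python
import io

import pandas as pd

# Illustrative stand-in for the uploaded file (assumed header and sample rows).
SAMPLE_CSV = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
1,89,66,23,94,28.1,0.167,21,0
0,137,40,35,168,43.1,2.288,33,1
"""

# In Watson Studio the generated code passes a stream from Object Storage
# to read_csv; here an in-memory buffer plays that role.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# head() returns the first 5 rows by default.
print(df.head())
```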

19. Now we will create the Neural Network, train it and save it to the Object Storage

a. Define the global variables

b. Import the necessary packages

c. Functions to split the data into feature variables and target or label variable, to create and train the model and to load the model in-memory

d. Because we will use zip deploy, we will have to compress the NN model and create a .tgz file
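The sub-steps above can be sketched roughly as follows. This is not the original notebook code: the layer sizes, epoch count, batch size and file names are assumptions chosen for illustration, and Keras is imported through TensorFlow, which may differ from the runtime used in this tutorial. The Keras import is kept inside the training function so the data and archive helpers can be used without it.

```python
import tarfile
from pathlib import Path

import numpy as np


def split_features_target(data):
    """Split a 2-D array into features (all but last column) and labels (last column)."""
    features = data[:, :-1]
    target = data[:, -1]
    return features, target


def build_and_train(features, target, model_path, epochs=150):
    """Train a small binary classifier and save it to model_path (assumed architecture)."""
    from tensorflow import keras  # imported lazily; assumes TensorFlow is installed

    model = keras.Sequential([
        keras.Input(shape=(features.shape[1],)),
        keras.layers.Dense(12, activation="relu"),
        keras.layers.Dense(8, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),  # probability of diabetes
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    model.fit(features, target, epochs=epochs, batch_size=10, verbose=0)
    model.save(model_path)
    return model


def compress_model(model_path, archive_path):
    """Pack the saved model file into a gzip-compressed tar archive (.tgz)."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps only the file name, not the full local path.
        archive.add(model_path, arcname=Path(model_path).name)
    return archive_path
```

A typical call sequence would be split_features_target(df.values), then build_and_train(X, y, "diabetes_model.h5"), then compress_model("diabetes_model.h5", "diabetes_model.tgz"), where the file names are placeholders.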

20. Now that we have the trained model in a compressed file, the next step is to deploy it to the IBM Cloud. For this deployment, we will need to associate a Watson Machine Learning Service with the Diabetes project. For this, go back to Watson Studio, click on the Hamburger menu, select Services and then Watson Services

21. Click on Add Service

22. Select Machine Learning Service

23. In the Pop-up menu select the region (I will select Frankfurt), give it a name and click Confirm

24. A new Machine Learning service will be created. Click on the name of the service.

25. In order to use this service to deploy our model, we need Credentials to access this service. On the Service Credentials tab click on New credentials button.

26. In the Pop-up window give it a name, select Auto Generate for the Service ID and click Add

27. Copy the service credentials because we will use them in our notebook for deployment

28. Now go back to the notebook because we will write the code to deploy the model using the newly created Machine Learning Service

a. Import and create the Watson Machine Learning service client

https://gist.github.com/cdc56c5f4ee03c9201b9e6d6ff842f8c

b. Setup the properties of the model and publish it to the repository

c. We will read the GUID of the new model just to be sure that it was generated successfully

d. Create the deployment

e. List all the deployments, just to see that everything went fine

f. An API for accessing our model was generated automatically. We will read the url of this, to test our model.

g. Test the model from code.
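The flow in sub-steps b to g can be outlined as below. Treat it as a hedged sketch: the method names (repository.store_model, repository.get_model_uid, deployments.create, deployments.get_scoring_url, deployments.score) are recalled from the older watson-machine-learning-client package and may differ in the version you have installed, and the payload shape is an assumption. The client object and the metadata dictionary are passed in, so nothing here depends on a specific release.

```python
def publish_and_deploy(client, archive_path, meta_props, deployment_name):
    """Store the compressed model, deploy it, and return (model_uid, scoring_url).

    `client` is assumed to be an already authenticated Watson Machine Learning
    client; `meta_props` is the model metadata required by your client version.
    """
    # b. Publish the archive to the repository.
    model_details = client.repository.store_model(model=archive_path, meta_props=meta_props)
    # c. Read the model identifier to confirm the model was stored.
    model_uid = client.repository.get_model_uid(model_details)
    # d. Create the deployment for the stored model.
    deployment = client.deployments.create(model_uid, name=deployment_name)
    # f. Read the URL of the automatically generated scoring API.
    scoring_url = client.deployments.get_scoring_url(deployment)
    return model_uid, scoring_url


def score_rows(client, scoring_url, rows):
    """g. Send feature rows to the deployed model (payload layout is an assumption)."""
    payload = {"values": rows}
    return client.deployments.score(scoring_url, payload)
```

Each row in rows holds the eight feature values of one patient, in the same column order used for training.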

29. If we want to test our model using Postman, we will need an access token. This can be read using our notebook:

a. Install the ibm-watson python package

b. Set up the token manager. You will have to create a new API key in IBM Cloud IAM (on the IBM Cloud home page, click on the hamburger menu, select Security, then Manage, then Identity and Access, which opens a new tab for IAM). Click API Keys, click the Create an IBM Cloud API Key button, copy the key and insert it into the token manager.

d. Add the token in the Postman header section

e. Add test data to the body of the request and click Send
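For readers who prefer code to Postman, the same two requests can be sketched with the Python standard library. This is an alternative to the ibm-watson token manager used in the steps above, not a reproduction of it: it exchanges the API key for a token directly at the IBM Cloud IAM token endpoint. The scoring URL, the function names and the payload layout are assumptions, and the requests are only built here; call fetch_token with a real API key to actually send one.

```python
import json
import urllib.parse
import urllib.request

IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_token_request(api_key):
    """Build the form-encoded POST that exchanges an API key for an access token."""
    form = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",
    }
    return urllib.request.Request(IAM_TOKEN_URL, data=form, headers=headers, method="POST")


def fetch_token(api_key):
    """Send the token request and return the access token string."""
    with urllib.request.urlopen(build_token_request(api_key)) as response:
        return json.load(response)["access_token"]


def build_scoring_request(scoring_url, token, rows):
    """Build the scoring POST: token in the header, test data in the body."""
    body = json.dumps({"values": rows}).encode("utf-8")  # assumed payload layout
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }
    return urllib.request.Request(scoring_url, data=body, headers=headers, method="POST")
```

The Authorization header built here is the same value you would paste into the Postman header section, and the JSON body corresponds to the test data in the request body.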

30. Now in your Watson Project called Diabetes you should have the assets shown in the picture below

Now that we have a working model and an API endpoint to access it securely (token-based authentication) we can continue to set up the continuous learning, drift detection and explainability pipeline using OpenScale.

In the next steps, we will explain the results using Watson OpenScale, which uses the LIME algorithm to explain the models (for LIME see Explaining Explainable AI).

IBM Cloud — OpenScale step-by-step configuration

1. In the IBM Cloud Resource List click on Create resource and select Watson OpenScale, give it a name, select the region, select the Lite plan and click Create

2. After launching the OpenScale platform, select Use the free lite plan Database

3. The next step is to add a Machine Learning Provider

4. You can select from multiple providers, such as Watson Machine Learning, Azure or Amazon services. Here we choose Custom, because our model was built by hand and deployed manually. Then select Enter deployment manually.

5. Go to the Insights Dashboard and click Add

6. In the pop-up window select your deployed model name.

7. The next step is to specify the input and output data schema. For this we will have to create a new Jupyter Notebook and run the code shown below:

8. The next step is to configure the model details:

a. Select the bucket name and the name of the dataset

b. Configure the Object Storage Connection. For this you should go on the IBM Cloud home page, hamburger button, All Resources, Storage, select your storage, select the storage object and in the menu, you can find the Configuration tab. Here you will find everything to set up the connection.

c. Select the column name of the target class

d. Select the feature column names

e. Select the columns containing text or categorical data (in this example we don't have any)

f. Select the output variable name (this was created automatically)

9. Configuring continuous learning and drift detection is very simple: you just have to set a threshold for the evaluation metric (like accuracy).
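To illustrate what a threshold on an evaluation metric means, here is a minimal sketch in plain Python. It is not how OpenScale works internally; it only shows the idea of computing accuracy on labelled feedback data and flagging the model when the value falls below a chosen threshold. The function names and the 0.8 threshold are hypothetical.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("at least one labelled example is required")
    correct = sum(1 for label, predicted in zip(labels, predictions) if label == predicted)
    return correct / len(labels)


def below_threshold(metric_value, threshold):
    """Return True when the metric has dropped under the configured threshold."""
    return metric_value < threshold


# Example: 3 of 4 predictions are correct, so accuracy is 0.75,
# which is under a hypothetical threshold of 0.8.
current = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
if below_threshold(current, 0.8):
    print("Accuracy is below the threshold; the model needs attention.")
```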

10. After all the configuration steps (Payload logging, Model details, Quality, Fairness, Explainability (this will be automatically configured), Drift) you can select your model.

11. In the actions drop-down you can upload the testing data and evaluate the model. As you can see in the picture above, 10 explanation reports were generated. If you click on the number a pop-up window will show up, where you can find the id of each explanation report.

12. You can use the explanation report id to see the explanations.

13. This will give you a report like in the picture below. In this report, for example, you can see the feature importance, the confidence, etc.

Conclusions

In this article we learnt:

How to use Watson Studio and Watson OpenScale
How to work with Jupyter Notebooks in Watson Studio
How to create, train and test a Neural Network in Watson Studio
How to deploy a model to Watson Machine Learning Service and set up a testing endpoint with a secure connection (token-based security)
How to call our model using Postman
How to work with Cloud Object Storage and how to connect it to a Watson Studio project
How to set up a Watson OpenScale deployment model and use it to explain the results of the deployed Neural Network

In the next article, we will continue our journey in the world of Explainable AI with part 2 of Explaining Explainable AI. In the future, we will talk about other cloud providers and show you how you can deploy a model and set up an API in Google Cloud, Amazon and Azure. If you are interested, then please follow me on Medium.

Thanks for reading this long article!