Video: What are Generative AI models?

In this video, IBM expert Kate Soule explains how large language models, a popular form of generative AI, work and what they can do for the enterprise.

What struck me is how powerful foundation models are: not only can they generate text or images, but they can also be repurposed for distinct types of machine learning tasks. Because of this flexibility, foundation models represent a paradigm shift in how we create AI solutions.

Good video!

Building a Machine Learning Model: Collecting Data

Data is the main ingredient of Machine Learning models. A real Machine Learning project starts with the team assessing whether the necessary data is available.

If not (which is often the case), the team needs to develop a strategy to collect and store that data before any modeling begins.

Common sources of data include:

  • Databases
  • Computer files
  • Websites
  • REST APIs
  • Sensor data
  • Physical files
  • Satellite images
  • User interaction in apps
  • Videos

Most Machine Learning algorithms are agnostic to the type of data ingested: they receive matrices and numeric arrays as input. From the learning algorithm's perspective, it does not matter whether a feature came from ebooks or from images.
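
To make this concrete, here is a minimal sketch showing how two very different sources, text and images, end up as the same kind of numeric array. The library choices (scikit-learn and NumPy) and the toy data are mine, just for illustration.

```python
# Different raw sources end up as the same kind of input: numeric arrays.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Text documents become a numeric matrix (one row per document).
docs = ["machine learning needs data", "data comes from many sources"]
X_text = TfidfVectorizer().fit_transform(docs).toarray()

# An image is already a numeric array (here, a fake 28x28 grayscale image).
image = np.random.rand(28, 28)
X_image = image.reshape(1, -1)   # flatten to one row of 784 features

print(X_text.shape)   # (2, number_of_vocabulary_terms)
print(X_image.shape)  # (1, 784)
```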

Thus, in a Machine Learning project, a good part of the effort goes into defining the right strategy to collect, store, clean, and preprocess the data so that it is ready to train Machine Learning models.

Types of Machine Learning: Deep Learning

Deep Learning is a branch of Machine Learning specialized in Artificial Neural Networks with multiple intermediate (hidden) layers.

Neural Networks with multiple layers can approximate very complex non-linear functions. That’s why Deep Learning has been very successful in CV (Computer Vision), NLP (Natural Language Processing), and Reinforcement Learning.

Deep Learning architectures include:

  • Convolutional Neural Networks
  • Recurrent Neural Networks 
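
To make the "multiple layers" idea concrete, here is a minimal sketch of a small feed-forward network. The framework choice (PyTorch) and the layer sizes are mine; the text does not prescribe any of them.

```python
# A tiny network with multiple hidden layers, stacked as a sequence.
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(10, 64),   # input layer: 10 features in
    nn.ReLU(),
    nn.Linear(64, 64),   # hidden layer
    nn.ReLU(),
    nn.Linear(64, 1),    # output layer (e.g., a single regression target)
)

x = torch.randn(32, 10)   # a batch of 32 examples with 10 features each
y_hat = model(x)          # forward pass through the stacked layers
print(y_hat.shape)        # torch.Size([32, 1])
```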

Due to their ability to learn highly non-linear functions, Deep Learning algorithms can easily overfit small datasets. They may also be computationally expensive, even for small amounts of data.

Thus, Deep Learning is not a silver bullet. For simple problems with small datasets, it tends to overfit without any gain in generalization, making it a poor choice in that setting.

However, Deep Learning has achieved stunning results for complex problems that benefit from large amounts of data, longer training time, and intense computational power. 

Types of Machine Learning: Time Series Analysis

The typical regression task predicts the value of a target variable based on the values of one or more feature variables. For example, predicting the price of a house based on characteristics such as its size, number of rooms, and so on.

But in some cases, we want to predict the value of a variable based on its past values. In our example, we would predict the price of a house based on its previous prices instead of its characteristics.

We call this type of modeling Time Series Analysis. A Time Series is a collection of observations – values of a variable – ordered in time. The time elapsed between observations can range from years, months, and days down to seconds or even milliseconds.

A simple model for Time Series Analysis is the moving average (MA). This model predicts the next observation as the average of the last few observations (a sliding window over the series).
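
Here is a minimal sketch of that idea. The series values and the window size q are made up for illustration.

```python
# Moving-average forecast: predict the next value as the mean of the last q observations.
import numpy as np

series = np.array([112, 118, 132, 129, 121, 135, 148, 148, 136, 119])
q = 3
forecast = series[-q:].mean()   # average of the last q observations
print(forecast)                 # 134.33...
```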

Another model for Time Series Analysis is the autoregressive model (AR). This model assumes that the next observation is linearly dependent on the last p values of the time series. In other words, it’s a linear regression model applied to the p previous observations.
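
The sketch below fits an AR(p) model exactly that way: as a linear regression on lagged values. The series, p = 2, and the use of scikit-learn are my own choices for illustration.

```python
# AR(p) as a linear regression on the p previous values of the series.
import numpy as np
from sklearn.linear_model import LinearRegression

series = np.array([112, 118, 132, 129, 121, 135, 148, 148, 136, 119], dtype=float)
p = 2

# Each row of X holds the p observations that precede the corresponding y value.
X = np.array([series[i - p:i] for i in range(p, len(series))])
y = series[p:]

ar_model = LinearRegression().fit(X, y)
next_value = ar_model.predict(series[-p:].reshape(1, -1))   # forecast the next point
print(next_value)
```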

Finally, by combining these models, we obtain the more general autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models.
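
In practice we rarely implement ARIMA by hand. A minimal sketch using the statsmodels library (my choice, with an arbitrary order and toy data) looks like this:

```python
# ARIMA with order=(p, d, q): AR terms, differencing, and MA terms combined.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

series = np.array([112, 118, 132, 129, 121, 135, 148, 148, 136, 119], dtype=float)

model = ARIMA(series, order=(2, 1, 1)).fit()   # AR(2), one difference, MA(1)
print(model.forecast(steps=3))                 # predict the next 3 observations
```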

Types of Machine Learning: Recommender Systems

Recommender systems are a category of Machine Learning algorithms that predict a user’s ratings or preferences over a collection of items.

Recommender systems algorithms include:

  • Collaborative filtering
  • Content-based filtering
  • Session-based recommender systems
  • Reinforcement learning for recommender systems

Other specialized approaches include:

  • Multi-criteria recommender systems
  • Risk-aware recommender systems
  • Mobile recommender systems
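
As a concrete example of the first approach, collaborative filtering, here is a minimal sketch that predicts a missing rating from the ratings of similar users. The tiny rating matrix and the cosine-similarity choice are mine, just for illustration.

```python
# User-based collaborative filtering on a tiny, made-up user-item matrix (0 = not rated).
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

ratings = np.array([
    [5, 4, 0, 1],   # user 0
    [4, 5, 0, 1],   # user 1 (tastes similar to user 0)
    [1, 0, 5, 4],   # user 2
])

sim = cosine_similarity(ratings)   # user-to-user similarity
target_user, item = 0, 2           # predict user 0's rating for item 2

# Weighted average of the other users' ratings for that item.
others = [u for u in range(len(ratings)) if u != target_user and ratings[u, item] > 0]
weights = sim[target_user, others]
pred = np.dot(weights, ratings[others, item]) / weights.sum()
print(pred)   # 5.0 here, since user 2 is the only neighbor who rated the item
```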

Types of Machine Learning: Reinforcement Learning

In the reinforcement learning paradigm, the learning process is a loop in which the agent reads the state of the environment and then executes an action.

Then the environment returns its new state and a reward signal, indicating how good the action was. The process continues until the environment reaches a terminal state or a maximum number of iterations is reached.

These are some of the main concepts in Reinforcement Learning:

The Environment

The environment is a representation of the context that our agent will interact with. It can represent an aspect of the real world, like the stock market or a street, or a completely virtual environment, like a game.

State

States are observations that the agent receives from the environment. It’s the way the agent gets all available information about the environment.

Actions

Actions are performed by the agent and may change the state of the environment. All the rules of how an action changes the state of the environment are internal to the environment. For a given state, the agent can choose its next action, but it does not control how this action will affect the environment.

Rewards

Rewards signal to the agent how good or bad an action was.
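
Putting these concepts together, here is a minimal sketch of the loop described above. The toy environment and the random "agent" are made up for illustration; no RL library is used.

```python
# The reinforcement learning loop: observe the state, act, receive a reward, repeat.
import random

class ToyEnvironment:
    """The agent starts at position 0 and tries to reach position 5."""
    def __init__(self):
        self.state = 0

    def step(self, action):                      # action: +1 (right) or -1 (left)
        self.state = max(0, self.state + action)
        reward = 1 if self.state == 5 else 0     # reward signals success
        done = self.state == 5                   # terminal condition
        return self.state, reward, done

env = ToyEnvironment()
state, done, steps = 0, False, 0
while not done and steps < 100:          # stop at a terminal state or max iterations
    action = random.choice([-1, 1])      # a (very naive) agent picks an action
    state, reward, done = env.step(action)
    steps += 1
print(f"Finished after {steps} steps, final state = {state}")
```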

If you want to learn more about Reinforcement Learning through a practical example, take a look at the following blog post:

reinforcement-learning4.fun/2019/06/09/int…

Machine Learning Applications: Topic Discovery with Clustering

Given a set of documents, a common task is to group them according to topics or subjects. A human agent can create a hierarchy of subjects and assign each document to its related topic.

However, a clustering algorithm can create this structure automatically and consistently. We can apply hierarchical clustering algorithms to group documents while building the hierarchy between them.

We don’t need to define the subjects or the topic hierarchy in advance. Instead, the clustering process discovers them.

However, human agents need to provide some parameters to the algorithms:

  • a criterion to measure the similarity between documents (a distance function)
  • a level to cut the hierarchy or minimum size of a cluster
  • labels for the discovered clusters
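
The sketch below shows the first two parameters in action: cosine distance as the similarity criterion and a distance threshold as the level to cut the hierarchy. The library choices (scikit-learn, TF-IDF features) and the toy documents are mine.

```python
# Topic discovery with hierarchical (agglomerative) clustering of documents.
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "the team won the football match",
    "the striker scored two goals",
    "the central bank raised interest rates",
    "inflation and interest rates keep rising",
]

X = TfidfVectorizer().fit_transform(docs).toarray()

clustering = AgglomerativeClustering(
    n_clusters=None,
    distance_threshold=1.0,   # the level where we cut the hierarchy
    metric="cosine",          # the similarity criterion between documents
    linkage="average",
)
labels = clustering.fit_predict(X)
print(labels)   # documents with the same label were grouped under one discovered topic
```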

Exploratory Data Analysis for Machine Learning

Exploratory data analysis is the most challenging task when building a machine learning model, especially for beginners.

A result of the No Free Lunch theorem is that there is no single model that will perform well on every dataset. In other words, there’s no silver bullet Machine Learning Algorithm.

The practical consequence is that we need to make a LOT of human decisions when building a model: which algorithm to use, which features to keep or discard, whether to apply normalization or regularization, which hyperparameters to tune, and so on.

And because the space of decisions is so vast, relying on simple trial and error is a shot in the dark. We need to base our decisions on actions that are likely to benefit our model.

So, the only way to make better decisions when building a model is to understand our dataset. And that’s why an excellent Exploratory Data Analysis is an essential step in Machine Learning model building.
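
A first exploratory pass can be as simple as the sketch below, using pandas. The file name and columns are placeholders, not a specific dataset from the text.

```python
# A quick first look at a tabular dataset before making any modeling decisions.
import pandas as pd

df = pd.read_csv("dataset.csv")       # hypothetical dataset

print(df.shape)                       # how many rows and features we have
print(df.dtypes)                      # which features are numeric vs. categorical
print(df.isna().mean())               # fraction of missing values per column
print(df.describe())                  # ranges and basic statistics
print(df.corr(numeric_only=True))     # linear relationships between numeric features
```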

Machine Learning Applications: Stroke Prediction

The Stroke Prediction Dataset at Kaggle is an example of how to use Machine Learning for disease prediction.

The dataset comprises more than 5,000 observations of 12 attributes representing patients’ clinical conditions like heart disease, hypertension, glucose, smoking, etc. For each instance, there’s also a binary target variable indicating if a patient had a stroke.

We can build a model to predict the occurrence of a stroke by training typical classification algorithms, for example, Logistic Regression, K-Nearest Neighbors, Support Vector Machines classifiers, Decision Trees, or others.

Of course, the actual applicability of such a model depends on how representative the patient dataset is. However, this is a good exercise for those who wish to understand how to apply machine learning in healthcare.
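
Here is a minimal sketch of such a classifier using Logistic Regression. The CSV file name and column names below are my assumptions based on the dataset description, so adjust them to the file you actually download from Kaggle.

```python
# Binary stroke classifier trained on a few numeric clinical attributes.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

df = pd.read_csv("healthcare-dataset-stroke-data.csv")   # assumed file name

# Assumed column names; drop rows with missing values for simplicity.
features = ["age", "hypertension", "heart_disease", "avg_glucose_level", "bmi"]
df = df.dropna(subset=features + ["stroke"])

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["stroke"], test_size=0.2, stratify=df["stroke"], random_state=42
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```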

Reference:

https://www.kaggle.com/fedesoriano/stroke-prediction-dataset

Machine Learning Applications: Medical Diagnosis

A growing application of machine learning classification algorithms is medical diagnosis. Diagnosing whether a patient has a specific disease is, in essence, a binary classification problem.

For example, we may build a model to identify if a patient has Hepatitis C. In this case, there are two possible outputs: yes and no, which is a typical binary classification problem.

Thus, the big challenge for medical diagnosis is collecting the correct data to train a model for a specific disease: blood tests, x-ray images, ultrasound, etc.

Each of these data types can contribute to the diagnosis, and each needs specific preprocessing before it can be used to feed ML models.

Finally, we’re far from a general artificial intelligence system capable of replacing doctors. However, the excellent results obtained by applying ML models to clinical data show that ML will be a powerful tool for healthcare.