AML 3304 Lecture and Lab Notebook: F23 October 17

Learning Outcomes:

Learn how to make Embeddings ← → Learn how the architecture of the AI Application works (which is what we need to do the Project)

How it all starts with the concept of the PYTHON NEURON building up to ANNs and GANs.

⁠

Lecture on Generative Adversarial Networks and Artificial Neural Networks

⁠

Outline:

Epoch 1: Foundation and Basics

⁠

Introduction to Neural Networks

⁠

Definition and Inspiration

Neural networks, in the context of machine learning, are algorithms inspired by the structure of the human brain. They are designed to recognize patterns.

Based on Bayesian Math:

https://coda.io/@peter-sigurdson/baysian-models⁠

⁠

The brain consists of billions of cells called neurons, which process and transmit information. Similarly, in an artificial neural network (ANN), basic units called artificial neurons process information using mathematical functions. Python AI programs start with the Python class “neuron”. Then we build our AI Application up in layers, which yields the Layered Architecture of the AI Application.

#todo: add references to UMEP book and LI Blog Articles

Why Neural Networks?

Capability to learn and make decisions from data, based on mathematical statistical algorithms called Bayesian algorithms applied to your input Training Corpus.

⁠

Flexibility: Can adjust themselves to the data and can generalize to unseen data. We need to study the operation of JSON “Big Data” data structures to provide conversational memory.

Versatility: Used in a wide range of applications such as image and voice recognition, medical diagnosis, financial forecasting, and science and engineering (Wolfram Mathematica for Science and Engineering). We can represent business processes in XML-formatted languages called BPEL and BPMN. Wolfram’s Mathematica can do AI processing on scientific and engineering formulas.

⁠

The Starting Point: Basic Neuron: The Building Block of ANNs

⁠

Anatomy of a Neuron

Inputs: Analogous to dendrites in a biological neuron. They receive the data.

Weights: These set the strength of each connection. If a weight is large in magnitude, the corresponding input has a strong influence on the neuron's output.

Summation Function: All the weighted inputs are summed up in this function.

Activation Function: It determines the neuron's output based on the sum of the weighted inputs. It's used to introduce non-linearity into the output of a neuron. This is where we get probabilistic programming.

Output: Analogous to axon in a biological neuron. This is the data that the neuron sends out to the next layer after processing.

The Perceptron Model

The perceptron is the simplest type of artificial neuron. It's a binary classification algorithm that makes its decisions based on a linear predictor function.

A perceptron takes several binary inputs, multiplies them by their weights, sums them up, and then produces a single binary output using a step function. (In future classes, we will experiment with simple R programming to see visualizations and to develop an intuition for this).
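As a minimal sketch of the “neuron” class idea, the following perceptron combines the pieces listed above: inputs, weights, a summation step, and a step activation. The class name `Neuron`, the bias term, and the hand-picked AND-gate weights are illustrative assumptions, not code from the course labs.

```python
class Neuron:
    """A single perceptron: weighted sum followed by a step activation."""

    def __init__(self, weights, bias=0.0):
        self.weights = list(weights)
        self.bias = bias

    def weighted_sum(self, inputs):
        # Summation function: multiply each input by its weight, then add the bias
        return sum(x * w for x, w in zip(inputs, self.weights)) + self.bias

    def activate(self, inputs):
        # Step activation: output 1 when the weighted sum reaches zero, else 0
        return 1 if self.weighted_sum(inputs) >= 0 else 0


# Hand-picked (not learned) weights that make the neuron behave like a logical AND
and_neuron = Neuron(weights=[1.0, 1.0], bias=-1.5)
and_outputs = [and_neuron.activate(pair) for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(and_outputs)  # [0, 0, 0, 1]
```

Only the input pair (1, 1) pushes the weighted sum above zero, so the neuron fires for that pair alone. Training would replace the hand-picked weights with learned ones.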

⁠

Structure of Artificial Neural Networks (ANNs)

⁠

Layers in an ANN

Input Layer: The layer that receives input from the dataset. Typically, nodes in this layer represent features or attributes of the dataset.

⁠

Hidden Layer(s): Layers that are neither input nor output. They perform computations and transfer information from the input nodes to the output nodes. Deep Neural Networks have multiple hidden layers: these do the intermediate processing steps.

Output Layer: This layer produces the result for the given inputs. In a language model, this is the “next token generation” stream that is sent back to the user who is chatting with our AI Language Model.

Nodes or Neurons

Imagine NODES in a Weighted Graph.

#todo insert photo of whiteboard

Each node in a layer represents a specific output. For instance, in a neural network designed for multiclass classification, each node in the output layer represents a specific class.

Connections and Weights

In a fully connected network, every node in a layer is connected to every node in the previous and next layers. Each connection carries a weight, which gets adjusted during learning.
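To make the node-to-node connections concrete, here is a small sketch, assuming NumPy, in which the connections between two layers are stored as one weight matrix. The layer sizes (3 inputs, 4 hidden nodes), the feature values, and the random initial weights are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_inputs, n_hidden = 3, 4

# One weight per connection: row i, column j links input node i to hidden node j
weights = rng.normal(size=(n_inputs, n_hidden))

# A single example with three feature values
features = np.array([0.5, -1.0, 2.0])

# Each hidden node receives the weighted sum of all three inputs
hidden_sums = features @ weights

print(weights.shape)      # (3, 4): 3 x 4 = 12 connections
print(hidden_sums.shape)  # (4,): one weighted sum per hidden node
```

Learning then amounts to nudging the twelve numbers in `weights` so that the network's outputs move closer to the targets.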

⁠

Training an ANN: Feedforward and Backpropagation

⁠

Feedforward Process: Layered Architecture

The process where the input is passed through the neural network, layer by layer, until it reaches the output layer.

Each neuron processes the input, and this processed data is passed as input to the next layer.

Loss Function

Once we have the predicted output, we compare it to the actual output. This comparison is quantified using a loss function (e.g., Mean Squared Error for regression tasks or Cross-Entropy for classification tasks). [We will use R programming to get a visualization of what these mean.]

The result from the loss function, termed the 'loss', indicates how well the neural network's prediction matched the actual output. If your training went well, your loss will be very small.
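The two loss functions named above can be sketched in a few lines, assuming NumPy. The function names and the sample targets and predictions are made up for illustration. The point is only that predictions closer to the targets produce a smaller loss.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    # Average of the squared differences between targets and predictions
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Clip predictions away from 0 and 1 so the logarithm stays finite
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)))

# Regression: predictions close to the targets give a small MSE
mse_good = mean_squared_error([3.0, 5.0], [2.9, 5.2])
mse_bad = mean_squared_error([3.0, 5.0], [0.0, 9.0])

# Classification: confident, correct predictions give a small cross-entropy
bce_good = binary_cross_entropy([1, 0], [0.9, 0.1])
bce_bad = binary_cross_entropy([1, 0], [0.2, 0.7])

print(mse_good, mse_bad)
print(bce_good, bce_bad)
```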

Backpropagation Algorithm

The heart of the neural network training process.

It's the algorithm that works out how much each weight contributed to the error, so that the error in the predictions can be minimized.

The main principle behind backpropagation is the chain rule from calculus. It calculates the gradient of the loss function with respect to each weight by propagating the gradient backward in the network.

Once the gradient is calculated, weights are adjusted using optimization techniques like Gradient Descent.
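As a rough illustration of the chain rule at work, the sketch below computes the gradient for a single sigmoid neuron with a squared-error loss and checks it against a finite-difference estimate. The one-weight network, the starting values, and the function names are assumptions chosen to keep the arithmetic visible. Real networks apply the same rule layer by layer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, target):
    # Forward pass: weighted sum -> sigmoid -> squared error
    return (sigmoid(w * x + b) - target) ** 2

def gradient_wrt_w(w, b, x, target):
    # Chain rule: dL/dw = dL/dy * dy/dz * dz/dw
    y = sigmoid(w * x + b)
    dL_dy = 2.0 * (y - target)   # derivative of the squared error
    dy_dz = y * (1.0 - y)        # derivative of the sigmoid
    dz_dw = x                    # derivative of the weighted sum
    return dL_dy * dy_dz * dz_dw

w, b, x, target = 0.5, 0.1, 2.0, 1.0

analytic = gradient_wrt_w(w, b, x, target)

# Finite-difference estimate of the same gradient, used only as a sanity check
h = 1e-6
numeric = (loss(w + h, b, x, target) - loss(w - h, b, x, target)) / (2 * h)

print(analytic, numeric)
```

The gradient is negative here because the prediction is below the target, so increasing `w` would reduce the loss.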

Gradient Descent and Learning Rate

Gradient Descent is an optimization algorithm used to minimize the loss function by adjusting weights in the direction of the steepest decrease in the loss.

The size of the steps taken to reach the minimum is determined by the learning rate. If the learning rate is too high, the algorithm might overshoot the optimal solution.

If it's too low, the algorithm might get stuck and take too long.
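The effect of the learning rate can be seen on a toy problem. This sketch minimizes the made-up loss L(w) = (w - 3)^2 with three different learning rates. The loss, the step count, and the specific rates are assumptions picked to show the three behaviours described above.

```python
def gradient_descent(learning_rate, steps=50, start=0.0):
    # Minimize the toy loss L(w) = (w - 3)^2, whose minimum is at w = 3
    w = start
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)        # dL/dw
        w = w - learning_rate * grad  # step against the gradient
    return w

w_good = gradient_descent(learning_rate=0.1)       # converges close to 3
w_too_high = gradient_descent(learning_rate=1.1)   # overshoots further on every step
w_too_low = gradient_descent(learning_rate=0.001)  # barely moves in 50 steps

print(w_good, w_too_high, w_too_low)
```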

⁠

At the end of this epoch, the students should have a clear understanding of the fundamental concepts and workings of Artificial Neural Networks, at least to the point of getting an intuition on our code labs.

They should be able to visualize the structure of ANNs, understand the significance of each component, and appreciate the intricacies of the training process.

Epoch 2: Diving Deeper with GANs

Introduction to Generative Adversarial Networks (GANs)

Components of GANs: Generator and Discriminator: The Generator develops solutions. The Discriminator looks for problems with those solutions.

Training Process of GANs

Types of GANs:

Vanilla GAN

DCGAN

CGAN

StyleGAN

Introduction to Generative Adversarial Networks (GANs)

Generative Adversarial Networks, abbreviated as GANs, have made a significant splash in the world of deep learning.

As their name suggests, GANs are generative models that can produce or "generate" new, previously unseen data.

How AIs generate new information: this is closely analogous to how running an SQL query on a Database can generate new information:

⁠

https://www.linkedin.com/pulse/computational-philosophy-building-implicate-order-new-sigurdson/⁠

⁠

They're particularly known for creating realistic images, but their capabilities stretch far beyond that.

Relatable Example: Think of GANs like an artist (generator) trying to create counterfeit money and a detective (discriminator) trying to distinguish between real and counterfeit money. Over multiple epochs of interaction, the artist becomes so skilled that the detective can't tell the difference between the counterfeit and the real thing. (At that point the detective is reduced to guessing, rating every note as roughly 50% likely to be real.)

⁠

Components of GANs: Generator and Discriminator

The Generator: It takes random noise as an input and produces data (like images). Its primary aim is to make its generated data indistinguishable from real data.

Relatable Example: Consider the generator as a student trying to cheat on an exam by rewriting an answer in their own words. Initially, the answers might not make sense, but with practice, they get better.

The Discriminator: It's a binary classifier that tries to distinguish between real data and fake data produced by the generator. If it gets fooled by the generator, it provides feedback, helping the generator improve.

Relatable Example: Now think of the discriminator as the teacher checking the student's answer. The teacher (discriminator) knows the correct answer and can easily spot when a student is trying to cheat. However, as the student becomes better at paraphrasing, it gets tougher for the teacher to catch the cheating.

⁠

Training Process of GANs

The training process of GANs is like a two-player game:

Generator's Turn: The generator creates a batch of fake data.

Discriminator's Turn: It evaluates both real data and the fake data produced by the generator.

Feedback Loop: The discriminator then provides feedback to the generator about how convincing its fake data was.

Adjustments: Based on this feedback, the generator tries to produce even more convincing data.

This process is repeated through multiple EPOCHS until the generator produces data that the discriminator can't distinguish from real data or until a set number of iterations is reached. (At that point the discriminator's output is close to 0.5 for real and fake samples alike.)

Relatable Example: It's like training a dog. When the dog (generator) does something right, it gets a treat. If not, it gets corrected and tries again. Over time, with repetition, the dog learns to perform tricks correctly. AIs get trained (Human Feedback Training) by being told whether their performance is good enough or not, and they modify their performance to optimize their reward. We let the AI algorithm self-discover the path that minimizes the loss through gradient descent.
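To put a number on the stopping point described above, here is a small sketch assuming the binary cross-entropy loss that the lab script uses. The helper name and the sample probabilities (0.95, 0.05, 0.5) are illustrative, not measured values.

```python
import math

def binary_cross_entropy(label, prediction):
    # Loss for a single sample with a 0/1 label and a predicted probability
    return -(label * math.log(prediction) + (1 - label) * math.log(1 - prediction))

# Early in training: the discriminator confidently separates real from fake
d_loss_early = 0.5 * (binary_cross_entropy(1, 0.95) + binary_cross_entropy(0, 0.05))

# Late in training: the discriminator is reduced to guessing (0.5 for everything)
d_loss_guessing = 0.5 * (binary_cross_entropy(1, 0.5) + binary_cross_entropy(0, 0.5))

print(d_loss_early)     # small: the discriminator is winning
print(d_loss_guessing)  # about 0.693, i.e. ln(2)
```

A discriminator loss hovering near ln(2) is therefore one rough sign that the generator's samples have become hard to tell apart from real data.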

⁠

Types of GANs

Vanilla GAN: The simplest form of GAN, and often the starting point for those new to the topic. It consists of a basic generator and discriminator architecture.

Relatable Example: Think of Vanilla GAN as the basic model car in a series. It has essential features and gets the job done, but there might be more advanced models with additional features.

DCGAN (Deep Convolutional GAN): Uses deep convolutional networks, making it especially effective for image generation tasks. DCGANs can capture complex image patterns and structures.

Relatable Example: Imagine trying to replicate a famous painting. While a basic approach (Vanilla GAN) might capture the colors and basic shapes, DCGAN would capture the intricate brush strokes and details.

CGAN (Conditional GAN): Adds a twist to GANs by introducing conditional parameters, allowing users to guide the data generation process based on certain conditions or labels.

Relatable Example: Suppose you're cooking a meal (data generation). With CGAN, it's like following a specific recipe or adding ingredients based on dietary restrictions.

StyleGAN: Particularly renowned for producing high-quality, photorealistic images. It can adjust the style of the generated image at different levels of granularity.

Relatable Example: Think of StyleGAN as a fashion designer creating clothes. While the basic structure of a shirt remains the same, the designer can change its style - from the type of collar, the print, the fabric texture, and so on.
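The conditional idea behind CGAN can be sketched without any deep learning library. One common approach, assumed here, is to append a one-hot label to the noise vector before it enters the generator. The function name, the latent size of 100, and the ten digit classes are illustrative choices that mirror the MNIST lab.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

latent_dim, num_classes = 100, 10

def conditional_generator_input(digit):
    # Random noise supplies variety; the one-hot label supplies the condition
    noise = rng.normal(size=latent_dim)
    one_hot = np.zeros(num_classes)
    one_hot[digit] = 1.0
    return np.concatenate([noise, one_hot])

gen_input = conditional_generator_input(digit=7)
print(gen_input.shape)  # (110,): 100 noise values followed by 10 label values
```

A generator fed this vector can learn to produce the requested digit, because the label travels with the noise through every training step.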

⁠

Conclusion

GANs, with their adversarial approach, have revolutionized the way we think about data generation.

By pitting two AI Agents against each other, they harness a competitive spirit to produce astonishingly realistic results. As we've seen through relatable examples, the dynamics between the generator and discriminator are both competitive and collaborative, leading to continuous improvement and refinement.

⁠

Lab Activity: Implementing a Basic Vanilla GAN

Note: We will be using Jupyter Notebook within Anaconda Python for our Activities, Assignment and Project hand-in. It is great for combining Code Execution and Documentation with Markdown.

Objective: To give students a hands-on experience in setting up and training a basic GAN to understand the interaction between the generator and the discriminator.

Steps:

Setup: Install required libraries and tools.

Data Preparation: Use a simple dataset (e.g., MNIST) for ease of understanding.

Building the Generator and Discriminator: Using TensorFlow/Keras.

Training the GAN: Observe how the generator improves over time.

Evaluating Results: Visualizing generated data after various epochs.

Discussion: Analyzing the challenges faced during training and ways to overcome them.

⁠

By the end of the lecture and lab activity, students will have a comprehensive understanding of GANs' intricacies, from their foundational principles to their diverse variations.

⁠

⁠

Generate a python program to illustrate this operation

Let's implement a basic Vanilla GAN using the TensorFlow/Keras library. This will provide a hands-on illustration of how GANs operate. For simplicity, we'll use the MNIST dataset of handwritten digits.

import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Dense, Flatten, Reshape
from tensorflow.keras.models import Sequential

# Load and preprocess the MNIST dataset
(x_train, _), (_, _) = mnist.load_data()
x_train = x_train / 255.0  # normalize to [0, 1]
x_train = np.expand_dims(x_train, axis=-1)  # add a channel axis: (N, 28, 28, 1)
x_train = x_train.astype(np.float32)

# Define the generator model
def build_generator(latent_dim):
    model = Sequential([
        Dense(128, activation='relu', input_dim=latent_dim),
        Dense(784, activation='sigmoid'),
        Reshape((28, 28, 1))  # match the (28, 28, 1) shape of the real images
    ])
    return model

# Define the discriminator model
def build_discriminator():
    model = Sequential([
        Flatten(input_shape=(28, 28, 1)),
        Dense(128, activation='relu'),
        Dense(1, activation='sigmoid')
    ])
    return model

# Compile the discriminator model
discriminator = build_discriminator()
discriminator.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Build and compile the GAN model with discriminator set as non-trainable
discriminator.trainable = False
generator = build_generator(latent_dim=100)
gan_input = tf.keras.Input(shape=(100,))
gan_output = discriminator(generator(gan_input))
gan = tf.keras.Model(gan_input, gan_output)
gan.compile(optimizer='adam', loss='binary_crossentropy')

# Training the GAN
epochs = 10000
batch_size = 128

for epoch in range(epochs):
    # Train discriminator on one batch of real images and one batch of fakes
    idx = np.random.randint(0, x_train.shape[0], batch_size)
    real_imgs = x_train[idx]
    noise = np.random.normal(0, 1, (batch_size, 100))
    generated_imgs = generator.predict(noise, verbose=0)
    real_labels = np.ones((batch_size, 1))
    fake_labels = np.zeros((batch_size, 1))
    d_loss_real = discriminator.train_on_batch(real_imgs, real_labels)
    d_loss_fake = discriminator.train_on_batch(generated_imgs, fake_labels)
    d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

    # Train generator: label the fakes as real so the generator learns to fool the discriminator
    noise = np.random.normal(0, 1, (batch_size, 100))
    valid_labels = np.ones((batch_size, 1))
    g_loss = gan.train_on_batch(noise, valid_labels)

    # Print progress
    print(f"{epoch}/{epochs} [D loss: {d_loss[0]} | D accuracy: {d_loss[1]}] [G loss: {g_loss}]")

    # Save generated images at specific intervals
    if epoch % 1000 == 0:
        # save_img expects a 3-D (height, width, channels) array
        generated_img = generated_imgs[0] * 255
        tf.keras.preprocessing.image.save_img(f'generated_img_{epoch}.png', generated_img)

print("Training Complete!")

This script defines a basic GAN using the MNIST dataset. The generator takes random noise as input and produces a 28x28 image. The discriminator then tries to differentiate between real MNIST images and the ones produced by the generator. The training loop repeatedly updates both models in an adversarial manner.

For simplicity, this code runs for only 10,000 training iterations (the loop variable is named epoch, but each pass trains on a single batch of 128 images) and saves a generated image every 1,000 iterations.

In a real-world scenario, you might train for more epochs and implement various refinements like different layers, learning rate schedules, etc. to improve the model's performance.

Lab: Building an Embedding:

⁠

How to start Jupyter Notebook

Starting a Jupyter Notebook is simple. Here's how you can do it:

Installation (if not installed):

You can install Jupyter Notebook via pip if you haven't already:

pip install jupyter

If you're using the Conda distribution (like Anaconda or Miniconda), you can install using:

conda install jupyter

Starting Jupyter Notebook:

Open your terminal or command prompt. Navigate to the directory where you want your notebooks to be saved or where they are located. Then type:

jupyter notebook

This command will start the Jupyter Notebook server and open a browser window/tab showing the Jupyter Notebook interface.

Using Jupyter Notebook:

Once Jupyter Notebook is open in your browser, you'll see a dashboard showing your files and directories.

To create a new notebook, click on the "New" button (usually on the top right side) and select the Python version you want to use.

This will open a new tab with an empty notebook where you can start typing in cells. You can write and execute Python code, make notes with Markdown, etc.

Stopping Jupyter Notebook:

Close the browser tabs/windows where Jupyter Notebook is running.

Go back to your terminal or command prompt. Press CTRL+C to stop the Jupyter Notebook server.

(Optional) Jupyter Lab:

Jupyter Lab is the next-generation interface for Jupyter. If you prefer a more modern and flexible interface, you might want to give it a try:

pip install jupyterlab

Or, using Conda:

conda install -c conda-forge jupyterlab

To start Jupyter Lab, use:

jupyter lab

Remember, Jupyter Notebook (and Jupyter Lab) is a browser-based application, so make sure you have a working browser installed on your machine. If you're running Jupyter on a remote server, you might need to set up additional configurations or use a specific command to access it from your local machine's browser.

Epoch 3: Practical Applications and Hands-on Session

Applications of ANNs and GANs

Hands-on Session: Building a Simple GAN

Potential Challenges and Limitations

Future Outlook and Conclusion

⁠

Epoch 1: Foundation and Basics

Introduction to Neural Networks

Inspiration from Biological Neurons

History and Evolution of ANNs

Role of Neural Networks in Machine Learning

Basic Neuron: The Building Block of ANNs

Anatomy of a Neuron: Inputs, Weights, Bias, and Activation Function

The Perceptron: The First Neural Model

Activation Functions: Sigmoid, ReLU, Tanh, etc.

Structure of Artificial Neural Networks (ANNs)

Layers in ANNs: Input, Hidden, and Output

Deep Neural Networks: Advantages of Depth

Loss Functions and Objective

Training an ANN: Feedforward and Backpropagation

Feedforward Process: Computation Graph and Outputs

The Backpropagation Algorithm: Chain Rule and Weight Adjustment

Gradient Descent and Learning Rate

⁠

Epoch 2: Diving Deeper with GANs

Introduction to Generative Adversarial Networks (GANs)

The Concept of Generative Models

GANs vs. Traditional Generative Models

Unsupervised Learning with GANs

Components of GANs: Generator and Discriminator

The Generator: Creating Synthetic Data

The Discriminator: Distinguishing Real from Fake

Balance and the Adversarial Game

Training Process of GANs

Loss Functions in GANs: Binary Cross Entropy

Challenges: Mode Collapse, Training Instability

Strategies for Stable GAN Training

Types of GANs: Vanilla GAN, DCGAN, CGAN, StyleGAN

Vanilla GAN: The Original GAN Model

DCGAN: Use of Convolutional Layers

CGAN: Conditional Data Generation

StyleGAN: High-Quality Image Generation with Style Control

⁠

Epoch 3: Practical Applications and Hands-on Session

Applications of ANNs and GANs

Image Classification, Regression, and Clustering with ANNs

Image Generation, Image-to-Image Translation, and Style Transfer with GANs

Real-world Use Cases: Art Creation, Game Design, Medical Imaging

Hands-on Session: Building a Simple GAN

Setting Up the Environment

Implementing a Vanilla GAN for Image Generation

Training and Evaluating the Model

Potential Challenges and Limitations

Overfitting in ANNs and Solutions

GAN Challenges: Mode Collapse, Gradient Vanishing/Exploding

Ethical Considerations: Deepfakes and Misinformation

Future Outlook and Conclusion

Advancements in Neural Network Architectures

Expanding Applications in Various Domains

Emphasizing Ethical Use and Development

⁠

This lecture offers a comprehensive understanding of ANNs and GANs, transitioning from foundational knowledge to hands-on application. The idea is to give learners not only the theoretical knowledge but also practical skills to implement and understand the potential and challenges of these powerful neural network architectures.

Let's delve deep into the concept of a neuron, both in biological and artificial contexts.

⁠

Biological Neuron

⁠

Neurons are the fundamental units of the brain and nervous system, responsible for receiving sensory input from the external world, processing and transmitting this information, and sending out instructions to the rest of the body.

Components of a Biological Neuron:

Dendrites: These are tree-like branches that receive input from other neurons. They channel this information to the cell body or soma. Each neuron can have multiple dendrites.

Soma (Cell Body): It's the main part of the neuron. It contains the cell's nucleus (which houses its DNA) and other organelles. It processes the information received from the dendrites.

Axon: A long, slender projection of the neuron that conducts electrical impulses away from the neuron's cell body. Each neuron has only one axon, but this axon may branch out to numerous other cells.

Synapses: These are junctions or gaps where the axon tip of one neuron can send signals to another neuron's dendrites. This is where the "communication" happens, often via chemical neurotransmitters.

How a Biological Neuron Works:

When a neuron receives signals at its dendrites, these signals are aggregated in the soma. If the received signal strength surpasses a certain threshold, the neuron fires, sending an electrical signal down its axon. This signal can then be transmitted to other neurons via synapses.

⁠

Artificial Neuron (Perceptron)

⁠

Inspired by the biological neuron, an artificial neuron or perceptron is a fundamental building block of artificial neural networks in the field of machine learning.

Components of an Artificial Neuron:

Inputs: Analogous to dendrites in a biological neuron, they represent the data fed into the neuron. Each input is associated with a weight.

Weights: These are the adjustable parameters within a neuron and determine the importance or influence of a given input on the neuron's output.

Bias: It's an additional parameter that allows the activation function to be shifted left or right. It's akin to a y-intercept in linear equations.

Summation Function: Aggregates the weighted inputs and adds the bias. Mathematically, if we have inputs $x_1, x_2, \ldots, x_n$ and weights $w_1, w_2, \ldots, w_n$, the summation is $\sum_{i=1}^{n} (x_i \cdot w_i) + \text{bias}$.

Activation Function: After summing the weighted inputs and bias, this function determines the neuron's output. It introduces non-linearity, which is what lets stacked layers of neurons represent complex patterns that a single linear function could not.

How an Artificial Neuron Works:

1. Weighted Sum: Multiply each input by its corresponding weight and sum them up.

2. Add Bias: To the weighted sum, add a bias.

3. Apply Activation Function: Pass the result from step 2 through an activation function to produce the neuron's output. Common activation functions include the sigmoid, hyperbolic tangent (tanh), and ReLU (Rectified Linear Unit).
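The three activation functions named in the last step can each be written in one line, assuming NumPy. The function names and sample inputs here are illustrative.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Squashes any real number into the range (-1, 1)
    return np.tanh(z)

def relu(z):
    # Passes positive values through unchanged and clips negatives to 0
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))
print(tanh(z))
print(relu(z))
```

Sigmoid is a natural fit for an output that should read as a probability, while ReLU is a frequent default for hidden layers because it is cheap to compute.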

⁠

Comparison and Takeaway:

While both biological and artificial neurons involve summing up inputs and producing outputs, they operate on fundamentally different principles. Biological neurons deal with electrochemical signals and intricate biological processes. In contrast, artificial neurons operate based on mathematical functions and algorithms.

The concept of the artificial neuron was inspired by our understanding of the brain, but it's a vast simplification. Yet, even with this simplification, artificial neural networks, composed of interconnected artificial neurons, have achieved impressive feats in various domains, including image and speech recognition, natural language processing, and more.

Python code to illustrate these concepts

1. Basic Neuron

Let's start by defining a simple perceptron.

import numpy as np

def activation_function(x):
    # Using a simple step function as activation for illustrative purposes
    return 1 if x >= 0 else 0

def perceptron(inputs, weights):
    # Summing up the weighted inputs
    sum_ = np.dot(inputs, weights)
    # Applying the activation function
    return activation_function(sum_)

2. Feedforward in ANN

Let's extend the basic perceptron concept to a simple feedforward neural network with one hidden layer.

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def feedforward(inputs, weights_input_hidden, weights_hidden_output):
    # Input to Hidden Layer
    hidden_input = np.dot(inputs, weights_input_hidden)
    hidden_output = sigmoid(hidden_input)

    # Hidden Layer to Output
    final_input = np.dot(hidden_output, weights_hidden_output)
    final_output = sigmoid(final_input)

    return final_output

3. Training using Backpropagation (simplified)

The below code demonstrates the training process using backpropagation for a simple network.

def sigmoid_derivative(x):
