Introduction to Generative Adversarial Networks (GANs)
Generative Adversarial Networks, abbreviated as GANs, have made a significant splash in the world of deep learning.
As their name suggests, GANs are generative models that can produce or "generate" new, previously unseen data.
How AIs generate new information: this is loosely analogous to how running a SQL query against a database can produce new information from the data already stored there.
GANs are particularly known for creating realistic images, but their capabilities stretch far beyond that.
Relatable Example: Think of GANs like an artist (generator) trying to create counterfeit money and a detective (discriminator) trying to distinguish between real and counterfeit money. Over many rounds of interaction, the artist becomes so skilled that the detective can't tell the difference between the counterfeit and the real thing and is reduced to guessing. (At that point the discriminator outputs a probability near 0.5 for every sample. Its loss does not go to zero; with binary cross-entropy it settles near ln 2, about 0.69.)
Components of GANs: Generator and Discriminator
The Generator: It takes random noise as input and produces data (like images). Its primary aim is to make its generated data indistinguishable from real data.
Relatable Example: Consider the generator as a student trying to cheat on an exam by rewriting an answer in their own words. Initially, the answers might not make sense, but with practice, they get better.
The Discriminator: It's a binary classifier that tries to distinguish between real data and fake data produced by the generator. Its judgments are the generator's feedback signal: the gradient of the discriminator's output tells the generator how to change its samples so that they look more real.
Relatable Example: Now think of the discriminator as the teacher checking the student's answer. The teacher (discriminator) has seen the genuine answers and can easily spot at first when a student is trying to cheat. However, as the student becomes better at paraphrasing, it gets tougher for the teacher to catch the cheating.
Training Process of GANs
The training process of GANs is like a two-player game:
Generator's Turn: The generator creates a batch of fake data.
Discriminator's Turn: It evaluates both real data and the fake data produced by the generator.
Feedback Loop: The discriminator then provides feedback to the generator about how convincing its fake data was.
Adjustments: Based on this feedback, the generator tries to produce even more convincing data.
This process is repeated over multiple epochs until the generator produces data that the discriminator can't distinguish from real data, or until a set number of iterations is reached. (At that balance point the discriminator's output is close to 0.5 for every sample. Its loss does not approach zero; with binary cross-entropy it settles near ln 2, about 0.69.)
Relatable Example: It's like training a dog. When the dog (generator) does something right, it gets a treat. If not, it gets corrected and tries again. Over time, with repetition, the dog learns to perform tricks correctly.
AI models are trained in a similar way (compare training with human feedback): they are told whether their performance is good enough, and they modify their behavior to optimize their reward. In a GAN the feedback comes from the discriminator rather than from a human, and gradient descent is left to discover the weight updates that reduce the loss.
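To make the loss behavior in this two-player game concrete, here is a minimal sketch in plain Python. It assumes the binary cross-entropy loss that is commonly used for GANs, and the discriminator outputs (0.95, 0.05, 0.5) are hypothetical values chosen for illustration. The helper names bce, discriminator_loss and generator_loss are illustrative and do not come from any library.

```python
import math

def bce(label: float, prediction: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one prediction, clipped away from 0 and 1."""
    p = min(max(prediction, eps), 1.0 - eps)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Average loss on one real sample (label 1) and one fake sample (label 0)."""
    return 0.5 * (bce(1.0, d_real) + bce(0.0, d_fake))

def generator_loss(d_fake: float) -> float:
    """The generator wants the discriminator to label its fake sample as real (1)."""
    return bce(1.0, d_fake)

# Hypothetical early stage: the discriminator is confident and correct.
early_d = discriminator_loss(d_real=0.95, d_fake=0.05)
early_g = generator_loss(d_fake=0.05)

# Hypothetical balance point: the discriminator can only guess (output 0.5).
balanced_d = discriminator_loss(d_real=0.5, d_fake=0.5)
balanced_g = generator_loss(d_fake=0.5)

print(f"early:    D loss {early_d:.3f}, G loss {early_g:.3f}")
print(f"balanced: D loss {balanced_d:.3f}, G loss {balanced_g:.3f}")  # both equal ln 2
```

Under these assumptions, a well-trained GAN shows both losses hovering around ln 2 rather than shrinking toward zero. A discriminator loss near zero usually means the discriminator is winning easily, not that training has succeeded.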
Types of GANs
Vanilla GAN: The simplest form of GAN, and often the starting point for those new to the topic. It consists of a basic generator and discriminator architecture.
Relatable Example: Think of Vanilla GAN as the basic model car in a series. It has essential features and gets the job done, but there might be more advanced models with additional features.
DCGAN (Deep Convolutional GAN): Uses deep convolutional networks, making it especially effective for image generation tasks. DCGANs can capture complex image patterns and structures.
Relatable Example: Imagine trying to replicate a famous painting. While a basic approach (Vanilla GAN) might capture the colors and basic shapes, DCGAN would capture the intricate brush strokes and details.
CGAN (Conditional GAN): Adds a twist to GANs by introducing conditional parameters, allowing users to guide the data generation process based on certain conditions or labels.
Relatable Example: Suppose you're cooking a meal (data generation). With CGAN, it's like following a specific recipe or adding ingredients based on dietary restrictions.
StyleGAN: Particularly renowned for producing high-quality, photorealistic images. It can adjust the style of the generated image at different levels of granularity.
Relatable Example: Think of StyleGAN as a fashion designer creating clothes. While the basic structure of a shirt remains the same, the designer can change its style: the type of collar, the print, the fabric texture, and so on.
Conclusion
GANs, with their adversarial approach, have revolutionized the way we think about data generation.
By pitting two neural networks against each other, they harness a competitive spirit to produce astonishingly realistic results. As we've seen through relatable examples, the dynamics between the generator and discriminator are both competitive and collaborative, leading to continuous improvement and refinement.
Lab Activity: Implementing a Basic Vanilla GAN
Note: We will be using Jupyter Notebook within Anaconda Python for our activities, assignments, and project hand-in. It is well suited to combining code execution with Markdown documentation.
Objective: To give students a hands-on experience in setting up and training a basic GAN to understand the interaction between the generator and the discriminator.
Steps:
Setup: Install required libraries and tools.
Data Preparation: Use a simple dataset (e.g., MNIST) for ease of understanding.
Building the Generator and Discriminator: Using TensorFlow/Keras.
Training the GAN: Observe how the generator improves over time.
Evaluating Results: Visualizing generated data after various epochs.
Discussion: Analyzing the challenges faced during training and ways to overcome them.
By the end of the lecture and lab activity, students will have a comprehensive understanding of GANs' intricacies, from their foundational principles to their diverse variations.
Generate a python program to illustrate this operation
Let's implement a basic Vanilla GAN using the TensorFlow/Keras library. This will provide a hands-on illustration of how GANs operate. For simplicity, we'll use the MNIST dataset of handwritten digits.
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Dense, Flatten, Reshape
from tensorflow.keras.models import Sequential

# Load and preprocess the MNIST dataset (the labels are not needed)
(x_train, _), (_, _) = mnist.load_data()
x_train = x_train / 255.0  # normalize to [0, 1] to match the generator's sigmoid output
x_train = np.expand_dims(x_train, axis=-1)  # add a channel axis: (60000, 28, 28, 1)
x_train = x_train.astype(np.float32)

# Define the generator model: noise vector -> 28x28x1 image
def build_generator(latent_dim):
    model = Sequential([
        Dense(128, activation='relu', input_dim=latent_dim),
        Dense(784, activation='sigmoid'),
        Reshape((28, 28, 1))  # same shape as the preprocessed real images
    ])
    return model

# Define the discriminator model: 28x28x1 image -> probability that it is real
def build_discriminator():
    model = Sequential([
        Flatten(input_shape=(28, 28, 1)),
        Dense(128, activation='relu'),
        Dense(1, activation='sigmoid')
    ])
    return model

# Compile the discriminator model
discriminator = build_discriminator()
discriminator.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Build and compile the GAN model with discriminator set as non-trainable,
# so that gan.train_on_batch updates only the generator.
# Note: this classic pattern relies on Keras applying the `trainable` flag at
# compile time; check that it behaves this way on your installed version.
discriminator.trainable = False
generator = build_generator(latent_dim=100)
gan_input = tf.keras.Input(shape=(100,))
gan_output = discriminator(generator(gan_input))
gan = tf.keras.Model(gan_input, gan_output)
gan.compile(optimizer='adam', loss='binary_crossentropy')

# Training the GAN
epochs = 10000  # number of training iterations (one batch each), not passes over the dataset
batch_size = 128

for epoch in range(epochs):
    # Train discriminator on one batch of real images and one batch of generated images
    idx = np.random.randint(0, x_train.shape[0], batch_size)
    real_imgs = x_train[idx]
    noise = np.random.normal(0, 1, (batch_size, 100))
    generated_imgs = generator.predict(noise, verbose=0)  # verbose=0 avoids a progress bar per step
    real_labels = np.ones((batch_size, 1))
    fake_labels = np.zeros((batch_size, 1))
    d_loss_real = discriminator.train_on_batch(real_imgs, real_labels)
    d_loss_fake = discriminator.train_on_batch(generated_imgs, fake_labels)
    d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)  # [loss, accuracy] averaged over both batches

    # Train generator: the labels are "real" because the generator wants to fool the discriminator
    noise = np.random.normal(0, 1, (batch_size, 100))
    valid_labels = np.ones((batch_size, 1))
    g_loss = gan.train_on_batch(noise, valid_labels)

    # Print progress
    print(f"{epoch}/{epochs} [D loss: {d_loss[0]} | D accuracy: {d_loss[1]}] [G loss: {g_loss}]")

    # Save generated images at specific intervals
    if epoch % 1000 == 0:
        generated_img = generated_imgs[0] * 255  # shape (28, 28, 1), the rank save_img expects
        tf.keras.preprocessing.image.save_img(f'generated_img_{epoch}.png', generated_img)

print("Training Complete!")
This script defines a basic GAN using the MNIST dataset. The generator takes random noise as input and produces a 28x28 image. The discriminator then tries to differentiate between real MNIST images and the ones produced by the generator. The training loop repeatedly updates both models in an adversarial manner.
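For simplicity, this code only runs for 10,000 training iterations and saves a generated image every 1,000 iterations. The loop variable is named epoch, but each iteration processes a single batch of 128 images rather than a full pass over the dataset.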
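To put the length of this run in perspective, the short sketch below converts the script's loop count into full passes over the data. It assumes the standard MNIST training split of 60,000 images and the script's batch size of 128. Because the script samples batches at random with replacement, the result is only an equivalent count of images seen, not a true epoch count. The variable names are illustrative.

```python
import math

num_train_images = 60_000  # size of the MNIST training split
batch_size = 128           # batch size used in the script
iterations = 10_000        # the script's `epochs` value (one batch per iteration)

# Batches needed for one full pass over the training images
steps_per_epoch = math.ceil(num_train_images / batch_size)

# How many full passes the 10,000 iterations amount to, by images seen
equivalent_epochs = iterations * batch_size / num_train_images

print(steps_per_epoch)              # 469
print(round(equivalent_epochs, 1))  # 21.3
```

So the 10,000 iterations correspond to roughly 21 passes over MNIST.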
In a real-world scenario, you might train for more epochs and implement various refinements like different layers, learning rate schedules, etc. to improve the model's performance.
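As one illustration of the learning rate schedules mentioned above, the following sketch computes an exponentially decaying learning rate in plain Python. The starting rate, decay rate, and decay interval are hypothetical values picked for the example, not tuned settings, and the function name exponential_decay is illustrative. Keras offers built-in schedule classes (for example tf.keras.optimizers.schedules.ExponentialDecay) that can be passed to an optimizer instead of a fixed rate.

```python
def exponential_decay(step: int,
                      initial_lr: float = 2e-4,   # hypothetical starting rate
                      decay_rate: float = 0.96,   # hypothetical multiplier per interval
                      decay_steps: int = 1000) -> float:
    """Learning rate after `step` updates: initial_lr * decay_rate ** (step / decay_steps)."""
    return initial_lr * decay_rate ** (step / decay_steps)

# Learning rate at a few points during a 10,000-iteration run
for step in (0, 1000, 5000, 10000):
    print(step, exponential_decay(step))
```

A decaying rate lets both networks take large steps early and smaller, more stable steps later, which may help with the oscillation that adversarial training is prone to.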