PyTorch Level 1 Lab: Building a Simple AI Model from Scratch
Objective:
By the end of this lab, students will learn how to:
Set up a Google Colab environment for PyTorch.
Import necessary libraries.
Prepare and preprocess data.
Define a simple neural network model using PyTorch.
Train and evaluate the model.
Step 1: Set Up Your Environment
First, ensure you have PyTorch installed in your Google Colab environment. PyTorch is already included in Colab, but we will use the following command to ensure we have the latest version.
Create your notebook at:
https://colab.research.google.com/
!pip install torch torchvision
Step 2: Import Required Libraries
Next, import the necessary libraries.
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
The MNIST dataset is a widely recognized and commonly used dataset in machine learning, particularly for image classification tasks [1][2][3][5]. It consists of 70,000 grayscale images of handwritten digits, each 28x28 pixels [4]. The dataset is split into 60,000 training images and 10,000 test images, with 7,000 images per digit (0-9): 6,000 for training and 1,000 for testing [3]. MNIST serves as a standard benchmark, allowing researchers and practitioners to compare the performance of different models on the same image classification task [6][7]. Its simplicity and well-structured format also make it an ideal resource for learning how to build, train, and evaluate neural networks, including convolutional networks, for image classification [1][5].
References:
[1] How to Develop a CNN for MNIST Handwritten Digit Classification (https://towardsdatascience.com/how-to-develop-a-cnn-for-mnist-handwritten-digit-classification-3a7371d84208)
[2] Image Classification in 10 Minutes with MNIST Dataset (https://towardsdatascience.com/image-classification-in-10-minutes-with-mnist-dataset-a811da868139)
[3] mnist · Datasets at Hugging Face (https://huggingface.co/datasets/mnist)
[4] MNIST - Ultralytics YOLOv8 Docs (https://docs.ultralytics.com/datasets/classify/mnist/)
[5] GitHub - pengfeinie/handwritten-digit-classification: The MNIST ... (https://github.com/pengfeinie/handwritten-digit-classification)
[6] MNIST - Ultralytics YOLOv8 Docs (https://docs.ultralytics.com/datasets/classify/mnist/)
[7] Build Your First Image Classification Model with The MNIST Dataset ... (https://towardsdatascience.com/build-your-first-image-classification-model-with-the-mnist-dataset-b4e43b03a718)
Step 3: Prepare and Preprocess Data (the download and extraction may take a moment to complete)
We will use the MNIST dataset, a standard dataset for image classification tasks.
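The data-loading cell itself is not shown in this document; below is a minimal sketch of what it typically looks like. The normalization constants are the commonly used MNIST mean and standard deviation, and the batch size of 64 is only an example.

```python
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Convert images to tensors and normalize with the usual MNIST mean/std
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Download the training and test splits (this is the step that takes a moment)
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Wrap the datasets in DataLoaders for batched iteration
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)
```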
The `optim` module in PyTorch provides several families of optimizers, which are algorithms designed to update the parameters of your model based on the gradients computed during backpropagation. Here are some of the most commonly used optimizer families and their references:
1. **Stochastic Gradient Descent (SGD)**
* `torch.optim.SGD`: Implements the classic SGD algorithm, with optional (Nesterov) momentum. [Reference](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html)
2. **Adam and its variants**
* `torch.optim.Adam`: Implements the Adam algorithm, a popular default choice for deep learning models. [Reference](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html)
* `torch.optim.AdamW`: A variant of Adam that applies decoupled weight decay during optimization. [Reference](https://pytorch.org/docs/stable/generated/torch.optim.AdamW.html)
* `torch.optim.SparseAdam`: A variant of Adam designed for sparse gradients. [Reference](https://pytorch.org/docs/stable/generated/torch.optim.SparseAdam.html)
3. **Learning Rate Schedulers** (not optimizers themselves, but commonly used alongside them)
* `torch.optim.lr_scheduler`: A collection of schedulers that adjust the learning rate during training. [Reference](https://pytorch.org/docs/stable/optim.html#learning-rate-schedulers)
4. **Other optimizers**
* `torch.optim.RMSprop`: Implements the RMSprop algorithm, which normalizes gradients using a running root mean square of recent gradients. [Reference](https://pytorch.org/docs/stable/generated/torch.optim.RMSprop.html)
* `torch.optim.Rprop`: Implements the Rprop algorithm, which adapts the step size individually for each parameter. [Reference](https://pytorch.org/docs/stable/generated/torch.optim.Rprop.html)
In this lab, we use the `optim.Adam` optimizer, a member of the Adam family, with the learning rate set to 0.001. The `nn.CrossEntropyLoss()` criterion is used to calculate the loss between the predicted and actual labels.
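For reference, the corresponding setup cell might look like this. It is a minimal sketch that assumes `model` is the `SimpleNN` instance defined in the sketch after the summary list below.

```python
criterion = nn.CrossEntropyLoss()                      # loss between predicted logits and true digit labels
optimizer = optim.Adam(model.parameters(), lr=0.001)   # Adam with a learning rate of 0.001
```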
Setting Up Environment: We ensure PyTorch is installed in the Google Colab environment.
Importing Libraries: We import necessary libraries such as PyTorch and torchvision.
Preparing Data: We download and load the MNIST dataset, applying necessary transformations.
Defining the Model: We create a simple neural network model using nn.Module (a minimal code sketch of this and the remaining steps follows this list).
Defining Loss and Optimizer: We set up the loss function and optimizer for training.
Training the Model: We train the model by running a training loop over the dataset.
Evaluating the Model: We evaluate the model's performance on the test dataset.
Saving and Loading the Model: We save the trained model to a file and load it back.
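The code cells for the model definition, training, evaluation, and saving steps are not reproduced in this document. The following is a minimal sketch of how they might look, assuming a small two-layer fully connected network named `SimpleNN` (the same class name referenced in the deployment notes below), the `train_loader`/`test_loader` from Step 3, and illustrative hyperparameters.

```python
# Define a simple fully connected network for 28x28 MNIST images
class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)        # flatten each image into a 784-value vector
        x = torch.relu(self.fc1(x))
        return self.fc2(x)

model = SimpleNN()

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
for epoch in range(5):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch + 1}, last batch loss: {loss.item():.4f}")

# Evaluate accuracy on the test set
model.eval()
correct, total = 0, 0
with torch.no_grad():
    for images, labels in test_loader:
        predicted = model(images).argmax(dim=1)
        correct += (predicted == labels).sum().item()
        total += labels.size(0)
print(f"Test accuracy: {100 * correct / total:.2f}%")

# Save the trained weights, then load them back into a fresh model
torch.save(model.state_dict(), 'simple_nn.pth')
model2 = SimpleNN()
model2.load_state_dict(torch.load('simple_nn.pth'))
model2.eval()
```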
Conclusion
In this lab, students learned the basics of building and training a simple neural network model using PyTorch. This foundational knowledge prepares them for more advanced topics, such as fine-tuning pre-trained models using transformers, which will be covered in subsequent labs.
What your output from this lab is: a PyTorch model weights file (the saved state_dict), which you can post on a model-sharing server.
The file (simple_nn.pth) lives within your Google Colab environment.
It contains the saved model parameters of the neural network in binary format and is therefore not human-readable.
To deploy this model to a server, follow these steps:
Save the Model in Colab:
Ensure your model is saved properly using torch.save(model.state_dict(), 'simple_nn.pth').
Download the Model File:
Download the saved model file from Colab to your local machine using:
from google.colab import files
files.download('simple_nn.pth')
Upload to Model Server:
Upload the simple_nn.pth file to your model server. The exact method will depend on your server's setup (e.g., FTP, SCP, direct upload via web interface).
Load the Model on the Server:
On your model server, you will need to load the model using PyTorch:
import torch
from your_model_definition import SimpleNN  # replace with your actual model class
model = SimpleNN()
model.load_state_dict(torch.load('simple_nn.pth'))
model.eval()  # set the model to evaluation mode
Deploy the Model:
Integrate the model into your application, such as a Flask or FastAPI server, to serve predictions.
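As an illustration only (not part of the lab), a minimal FastAPI endpoint around the saved weights might look like the following. The `your_model_definition` module and the request shape are placeholders, and the model expects a flattened, normalized 28x28 image (784 values).

```python
import torch
from fastapi import FastAPI
from pydantic import BaseModel

from your_model_definition import SimpleNN  # placeholder module, as in the loading step above

app = FastAPI()
model = SimpleNN()
model.load_state_dict(torch.load('simple_nn.pth', map_location='cpu'))
model.eval()  # inference mode

class ImageRequest(BaseModel):
    pixels: list[float]  # 784 flattened, normalized pixel values

@app.post("/predict")
def predict(req: ImageRequest):
    x = torch.tensor(req.pixels, dtype=torch.float32).view(1, -1)
    with torch.no_grad():
        logits = model(x)
    return {"digit": int(logits.argmax(dim=1).item())}
```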
PyTorch Lab Level 2: Using a Teacher Model
Let's proceed with the PyTorch Level 2 Lab, where we will fine-tune a pre-trained transformer model using the Hugging Face `transformers` library.
This lab will guide you through the process of leveraging a pre-trained model to create an AI language model.
---
PyTorch Level 2 Lab: Fine-Tuning a Pre-trained Transformer Model
### **Objective:**
By the end of this lab, students will learn how to:
1. Set up a Google Colab environment for PyTorch and the Hugging Face `transformers` library.
2. Load a pre-trained transformer model and tokenizer.
3. Prepare and preprocess data.
4. Fine-tune the pre-trained model on a specific dataset.
5. Evaluate the model and generate text.
### **Step 1: Set Up Your Environment**
First, ensure you have the `transformers` library installed in your Google Colab environment.
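In a Colab notebook, the install cell typically looks like the following (PyTorch itself ships with Colab, so only `transformers` needs installing; pin versions if you prefer):

```python
!pip install transformers
```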
### **Step 2: Import Required Libraries**

Next, import the required libraries. Note that `AdamW` is imported from `torch.optim`, since the copy that used to live in `transformers` is deprecated in recent releases.

```python
import torch
from torch.optim import AdamW
from transformers import GPT2Tokenizer, GPT2LMHeadModel, get_linear_schedule_with_warmup
from torch.utils.data import DataLoader, Dataset, random_split
import numpy as np
import pandas as pd
```
### **Step 3: Prepare and Preprocess Data**
We'll use a simple dataset of conversational text for this lab. For simplicity, let's define a small dataset directly in the code.
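The `tokenizer` and the `preprocess_data` helper used in the next cell are not defined in this document. Here is a minimal sketch of what they might look like, assuming GPT-2's tokenizer and causal-language-model labels that simply copy the input IDs (in practice you would often set label positions at padding tokens to -100 so they are ignored by the loss).

```python
# Load the GPT-2 tokenizer; GPT-2 has no pad token, so reuse the end-of-text token
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
tokenizer.pad_token = tokenizer.eos_token

def preprocess_data(text, tokenizer, max_length=32):
    # Tokenize one line of text, padding/truncating to a fixed length
    encoding = tokenizer(
        text,
        max_length=max_length,
        padding='max_length',
        truncation=True,
        return_tensors='pt',
    )
    # For causal language modeling, the labels are the input IDs themselves
    return {
        'input_ids': encoding['input_ids'],
        'attention_mask': encoding['attention_mask'],
        'labels': encoding['input_ids'].clone(),
    }
```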
```python
# Define a simple dataset
data = [
"Hello, how can I help you?",
"Hi, I need some assistance.",
"Sure, what do you need help with?",
"I am looking for information about your services.",
"We offer a variety of services, including AI development and consulting.",
"Can you tell me more about your AI development services?",
"Of course, we specialize in creating custom AI solutions for businesses.",
"That's great! How can I get started?",
"You can start by scheduling a consultation with one of our experts.",
"Thank you, I will do that.",
"You're welcome! Have a great day!"
]
# Convert to a pandas DataFrame for easy manipulation
df = pd.DataFrame(data, columns=["text"])
# Preprocess the data
input_data = [preprocess_data(text, tokenizer) for text in df['text']]
# Convert list of dictionaries to a single dictionary of tensors
input_ids = torch.cat([item['input_ids'] for item in input_data])
attention_mask = torch.cat([item['attention_mask'] for item in input_data])
labels = torch.cat([item['labels'] for item in input_data])
# Create a custom dataset class that returns one example per original sentence
class TextDataset(Dataset):
    def __init__(self, input_ids, attention_mask, labels):
        self.input_ids = input_ids
        self.attention_mask = attention_mask
        self.labels = labels

    def __len__(self):
        return len(self.input_ids)

    def __getitem__(self, idx):
        return {
            'input_ids': self.input_ids[idx],
            'attention_mask': self.attention_mask[idx],
            'labels': self.labels[idx],
        }
```
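The cells for loading the pre-trained GPT-2 model, fine-tuning it, and generating text are not included in this document. The following is a minimal sketch under the assumptions above (the `gpt2` weights, the `TextDataset` just defined, and illustrative batch size, epoch count, and learning rate).

```python
# Build the dataset and a DataLoader from the tensors prepared above
dataset = TextDataset(input_ids, attention_mask, labels)
train_loader = DataLoader(dataset, batch_size=2, shuffle=True)

# Load the pre-trained GPT-2 model, optimizer, and learning-rate scheduler
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = GPT2LMHeadModel.from_pretrained('gpt2').to(device)
optimizer = AdamW(model.parameters(), lr=5e-5)
num_epochs = 3
num_training_steps = num_epochs * len(train_loader)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=num_training_steps
)

# Fine-tuning loop
model.train()
for epoch in range(num_epochs):
    for batch in train_loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        outputs = model(**batch)   # the model computes the LM loss from `labels`
        loss = outputs.loss
        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
    print(f"Epoch {epoch + 1}, last batch loss: {loss.item():.4f}")

# Generate text from a seed prompt with the fine-tuned model
def generate_text(seed_text, max_length=50):
    model.eval()
    inputs = tokenizer(seed_text, return_tensors='pt').to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_length=max_length,
            do_sample=True,
            top_k=50,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(generate_text("Hello, how can I help you?"))
```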
1. **Setting Up Environment**: Ensure PyTorch and the Hugging Face `transformers` library are installed in the Google Colab environment.
2. **Importing Libraries**: Import necessary libraries such as PyTorch and transformers.
3. **Preparing Data**: Define and preprocess a simple dataset. Convert text data into tensors using the tokenizer.
4. **Loading Pre-trained Model**: Load a pre-trained GPT-2 model and set up the optimizer and learning rate scheduler.
5. **Fine-tuning the Model**: Define and execute the training loop to fine-tune the model on the dataset.
6. **Generating Text**: Define a function to generate text using the fine-tuned model and test it with a seed text.
### **Conclusion**
In this lab, you learned how to fine-tune a pre-trained transformer model using PyTorch.
You prepared data, loaded a pre-trained model, fine-tuned it on a specific dataset, and generated text. This foundational knowledge prepares you for more advanced NLP tasks and real-world applications, providing valuable skills for entry-level AI jobs.
---