
Lab: Build an AI Model with a Google Colab Notebook using TensorFlow


Let's build a simple AI text chatbot model using TensorFlow in a Google Colab Notebook. We'll go through all the steps and train it on a sample training text. Here's a detailed guide:

Step 1: Set Up Your Environment

First, ensure you have TensorFlow installed in your Google Colab environment. If not, you can install it using the following command:
!pip install tensorflow
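
Colab usually ships with TensorFlow pre-installed, so in practice a quick version check is often all you need. A minimal sanity check (any recent 2.x release should work for this lab):
# Optional: confirm the TensorFlow version available in your Colab runtime
import tensorflow as tf
print(tf.__version__)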

Step 2: Import Required Libraries

Next, import the necessary libraries.
import tensorflow as tf
import numpy as np
import re
import os
import zipfile
import urllib.request
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Bidirectional

Step 3: Prepare Training Data

For the training text, we'll define a simple example of conversational text.
training_text = """
Hello, how can I help you?
Hi, I need some assistance.
Sure, what do you need help with?
I am looking for information about your services.
We offer a variety of services, including AI development and consulting.
Can you tell me more about your AI development services?
Of course, we specialize in creating custom AI solutions for businesses.
That's great! How can I get started?
You can start by scheduling a consultation with one of our experts.
Thank you, I will do that.
You're welcome! Have a great day!
"""

Step 4: Tokenize and Prepare Sequences

Tokenize the text and prepare input sequences for the model.
tokenizer = Tokenizer()
tokenizer.fit_on_texts([training_text])
total_words = len(tokenizer.word_index) + 1

input_sequences = []
for line in training_text.split("\n"):
    token_list = tokenizer.texts_to_sequences([line])[0]
    for i in range(1, len(token_list)):
        n_gram_sequence = token_list[:i+1]
        input_sequences.append(n_gram_sequence)

# Pad sequences and create predictors and labels
max_sequence_len = max([len(x) for x in input_sequences])
input_sequences = np.array(pad_sequences(input_sequences, maxlen=max_sequence_len, padding='pre'))

predictors, label = input_sequences[:,:-1], input_sequences[:,-1]
label = tf.keras.utils.to_categorical(label, num_classes=total_words)
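
To see what these n-gram training pairs actually look like, you can decode one padded sequence back into words. This is an optional inspection step, not required for training:
# Optional sanity check: decode one padded training sequence back into words
index_word = {index: word for word, index in tokenizer.word_index.items()}
sample = input_sequences[3]
print(sample)  # padded token ids: zeros followed by word indices
print([index_word[i] for i in sample if i != 0])  # the words behind the non-zero ids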

Step 5: Build the Model

Define and compile the model.
model = Sequential([
    Embedding(total_words, 64, input_length=max_sequence_len-1),
    Bidirectional(LSTM(20)),
    Dense(total_words, activation='softmax')
])

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

Step 6: Train the Model

Train the model with the prepared data.
history = model.fit(predictors, label, epochs=100, verbose=1)
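
Since model.fit returns a History object, you can optionally plot the training accuracy to confirm the model is learning the tiny dataset (matplotlib comes pre-installed in Colab):
# Optional: visualize training accuracy across the 100 epochs
import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'])
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('Training accuracy')
plt.show()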

Step 7: Test the Chatbot

Create a function to generate responses based on a given seed text.
def generate_text(seed_text, next_words, max_sequence_len):
    for _ in range(next_words):
        token_list = tokenizer.texts_to_sequences([seed_text])[0]
        token_list = pad_sequences([token_list], maxlen=max_sequence_len-1, padding='pre')
        # Take the index of the highest-probability next word as a scalar
        predicted = np.argmax(model.predict(token_list, verbose=0), axis=-1)[0]
        output_word = ""
        for word, index in tokenizer.word_index.items():
            if index == predicted:
                output_word = word
                break
        seed_text += " " + output_word
    return seed_text

Test the chatbot with a seed text.
seed_text = "Hello"
next_words = 5
print(generate_text(seed_text, next_words, max_sequence_len))

Explanation of Steps

Setting Up Environment: We ensure TensorFlow is installed in the Google Colab environment.
Importing Libraries: We import TensorFlow and other necessary libraries for text processing and model building.
Preparing Training Data: We define a sample training text and tokenize it.
Tokenizing and Preparing Sequences: We convert the text into sequences of tokens and prepare input-output pairs for training.
Building the Model: We create a simple neural network model using Embedding, LSTM, and Dense layers.
Training the Model: We train the model on the prepared data.
Testing the Chatbot: We define a function to generate text based on a seed input and test the chatbot.
This is a basic implementation of a text-based chatbot using TensorFlow. You can extend and refine this model with more training data and advanced techniques for better performance.
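
As one example of such a refinement, the greedy argmax decoding above can be swapped for temperature sampling, so the chatbot does not always produce the exact same continuation. Here is a minimal sketch that reuses the model, tokenizer, and max_sequence_len defined above; the temperature value of 0.8 is an arbitrary starting point:
def generate_text_sampled(seed_text, next_words, max_sequence_len, temperature=0.8):
    # Like generate_text, but samples the next word from the softmax
    # distribution instead of always taking the argmax.
    for _ in range(next_words):
        token_list = tokenizer.texts_to_sequences([seed_text])[0]
        token_list = pad_sequences([token_list], maxlen=max_sequence_len - 1, padding='pre')
        probs = model.predict(token_list, verbose=0)[0]
        # Rescale by the temperature and renormalize into a distribution
        logits = np.log(probs + 1e-9) / temperature
        probs = np.exp(logits) / np.sum(np.exp(logits))
        predicted = np.random.choice(len(probs), p=probs)
        output_word = ""
        for word, index in tokenizer.word_index.items():
            if index == predicted:
                output_word = word
                break
        seed_text += " " + output_word
    return seed_text

print(generate_text_sampled("Hello", 5, max_sequence_len))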


Lab Part 2: Using a Teacher Model


In this context, the term "teacher model" refers to a pre-trained model or a dataset that we can use as a reference or starting point for training our own model.
In the previous example, we didn't use an explicit pre-trained teacher model. Instead, we trained a simple chatbot model from scratch using a small sample of conversational text.
Next, let’s use a pre-trained teacher model.
You could leverage existing large language models like GPT-3 or GPT-4 from OpenAI, which are pre-trained on vast amounts of text data. These models can be fine-tuned on your specific dataset to create a more sophisticated chatbot.


Let's highlight the differences between the two approaches and explain the usage of the `transformers` library in the context of this lab.

Part 1: Building a Simple AI Text Chatbot Model from Scratch

#### Approach
1. **Libraries Used**:
- TensorFlow
- Keras (for model building)
2. **Model Architecture**:
- Embedding Layer
- Bidirectional LSTM Layer
- Dense Layer with Softmax Activation
3. **Data Preparation**:
- Tokenization using Keras `Tokenizer`
- Padding sequences
4. **Training**:
- Model trained from scratch using a small sample of conversational text.
5. **Testing**:
- Generating responses based on seed text using the trained model.

Part 2: Fine-tuning a Pre-trained Model using the Transformers Library

#### Approach
1. **Libraries Used**:
- TensorFlow
- Transformers (from Hugging Face)
2. **Model Architecture**:
- Pre-trained GPT-2 model
3. **Data Preparation**:
- Tokenization using the `GPT2Tokenizer` from the `transformers` library
4. **Training**:
- Fine-tuning the pre-trained GPT-2 model on the same sample conversational text.
5. **Testing**:
- Generating responses using the fine-tuned GPT-2 model.

Differences and Why We Use the Transformers Library

1. **Pre-trained Models vs. Building from Scratch**:
- **From Scratch**: In Part 1, we built a model from scratch. This required defining a custom architecture and training it entirely on our small dataset. This approach is simpler but less powerful due to limited training data.
- **Pre-trained Model**: In Part 2, we leveraged a pre-trained GPT-2 model using the `transformers` library. This model has been trained on a vast amount of text data, making it much more capable of understanding and generating natural language.

2. **Use of the Transformers Library**:
- The `transformers` library from Hugging Face provides easy access to a variety of pre-trained models for natural language processing tasks. By using this library, we can quickly implement and fine-tune powerful models like GPT-2.
- **Advantages**:
  - **Higher Performance**: Pre-trained models have already learned a lot about language structure and context.
  - **Efficiency**: We can fine-tune a model on our specific dataset in a shorter amount of time compared to training from scratch.

3. **Model Architecture**:
- **Part 1**: We manually defined an architecture with an embedding layer, an LSTM layer, and a dense layer.
- **Part 2**: We used the pre-trained architecture of GPT-2, which includes multiple transformer layers that have been pre-trained on extensive datasets.

4. **Tokenization and Data Preparation**:
- **Part 1**: Tokenization and padding were done using Keras utilities.
- **Part 2**: Tokenization is handled by the `GPT2Tokenizer` from the `transformers` library, which is specifically designed to work with GPT-2 and other transformer models; the short sketch after this list shows the difference side by side.
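
A small sketch makes the tokenization difference concrete. The Keras Tokenizer builds a tiny vocabulary from whatever text you give it, while GPT2Tokenizer ships with GPT-2's fixed vocabulary of 50,257 subword tokens (the exact ids printed will differ):
from tensorflow.keras.preprocessing.text import Tokenizer
from transformers import GPT2Tokenizer

sentence = "Hello, how can I help you?"

# Part 1 style: the Keras Tokenizer learns its vocabulary from our own text
keras_tokenizer = Tokenizer()
keras_tokenizer.fit_on_texts([sentence])
print(keras_tokenizer.texts_to_sequences([sentence]))  # small corpus-specific ids

# Part 2 style: GPT2Tokenizer uses GPT-2's pre-trained subword vocabulary
gpt2_tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
print(gpt2_tokenizer.encode(sentence))  # ids from the pre-trained vocabulary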

Summary

In summary, Part 1 of this lab demonstrated how to build and train a simple chatbot model from scratch using basic neural network components.
Part 2 introduced the `transformers` library, which provides access to powerful pre-trained models like GPT-2.
By fine-tuning a pre-trained model, we can leverage the extensive training these models have undergone, resulting in a more capable and efficient chatbot.

Why Use the Transformers Library?
- **Performance**: Pre-trained models are highly effective at generating coherent and contextually relevant responses.
- **Ease of Use**: The `transformers` library simplifies the process of implementing and fine-tuning these models.
- **Flexibility**: You can fine-tune pre-trained models on your specific dataset to tailor them to your needs.

This comparison highlights the benefits of using pre-trained models and the `transformers` library for building more advanced and capable AI applications.

Here’s a modified approach using Hugging Face’s `transformers` library with TensorFlow to fine-tune a pre-trained language model (e.g., GPT-2) on your dataset.
Let’s continue in the Colab notebook:

Step 1: Set Up Your Environment

First, ensure you have the necessary libraries installed in your Google Colab environment.
!pip install transformers
!pip install tensorflow

Step 2: Import Required Libraries

Next, import the necessary libraries.
import tensorflow as tf
from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

Step 3: Load Pre-trained GPT-2 Model and Tokenizer

Load a pre-trained GPT-2 model and tokenizer.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = TFGPT2LMHeadModel.from_pretrained("gpt2")
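
As an optional sanity check that both loaded correctly, you can round-trip a sentence through the tokenizer:
# Encode a sentence to GPT-2 token ids and decode it back
ids = tokenizer.encode("Hello, how can I help you?")
print(ids)
print(tokenizer.decode(ids))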

Step 4: Prepare Training Data

For the training text, we'll define a simple example of conversational text.
training_text = """
Hello, how can I help you?
Hi, I need some assistance.
Sure, what do you need help with?
I am looking for information about your services.
We offer a variety of services, including AI development and consulting.
Can you tell me more about your AI development services?
Of course, we specialize in creating custom AI solutions for businesses.
That's great! How can I get started?
You can start by scheduling a consultation with one of our experts.
Thank you, I will do that.
You're welcome! Have a great day!
"""

Step 5: Tokenize the Text

Tokenize the text and prepare it for the model.
# GPT-2 has no padding token by default, so reuse the end-of-text token for padding
tokenizer.pad_token = tokenizer.eos_token
inputs = tokenizer(training_text, return_tensors='tf', max_length=512, truncation=True, padding='max_length')
inputs = inputs['input_ids']

Step 6: Fine-tune the Model

Fine-tune the pre-trained GPT-2 model on your dataset.
# Define the optimizer with a small learning rate, typical for fine-tuning
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5)

# Compile the model; on recent transformers versions you can instead call
# model.compile(optimizer=optimizer) and the model's built-in loss is used
model.compile(optimizer=optimizer, loss=model.compute_loss)

# Fine-tune: for causal language modeling, the labels are the input ids themselves
model.fit(inputs, inputs, epochs=3, batch_size=1)

Step 7: Test the Chatbot

Create a function to generate responses based on a given seed text.
def generate_text(seed_text, next_words=50):
    input_ids = tokenizer.encode(seed_text, return_tensors='tf')
    # Note: max_length counts the seed tokens too, so the output is at most
    # next_words tokens including the seed text
    output = model.generate(input_ids, max_length=next_words, num_return_sequences=1, no_repeat_ngram_size=2)
    return tokenizer.decode(output[0], skip_special_tokens=True)

Test the chatbot with a seed text.
seed_text = "Hello"
print(generate_text(seed_text))
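
model.generate also supports sampling-based decoding, which typically produces more varied replies than the default greedy search. A hedged example; the top_k, top_p, and temperature values here are arbitrary starting points to experiment with:
output = model.generate(
    tokenizer.encode("Hello", return_tensors='tf'),
    max_length=50,
    do_sample=True,       # sample instead of greedy decoding
    top_k=50,             # only sample from the 50 most likely tokens
    top_p=0.95,           # nucleus sampling cutoff
    temperature=0.8,      # <1 sharpens, >1 flattens the distribution
    no_repeat_ngram_size=2,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))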

Explanation of Steps

Setting Up Environment: Install necessary libraries in Google Colab.
Importing Libraries: Import TensorFlow and the transformers library for pre-trained models.
Loading Pre-trained Model and Tokenizer: Load a pre-trained GPT-2 model and its tokenizer.
Preparing Training Data: Define a sample training text.
Tokenizing the Text: Tokenize the text and prepare it for the model.
Fine-tuning the Model: Fine-tune the pre-trained GPT-2 model on the given conversational dataset.
Testing the Chatbot: Define a function to generate responses based on a seed input and test the chatbot.
Using a pre-trained model like GPT-2 significantly improves the chatbot's performance by leveraging the extensive training the model has already undergone. This approach is more effective than training a model from scratch with a small dataset.
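
If you want to keep the fine-tuned weights beyond the current Colab session, the transformers API can save and reload both the model and the tokenizer; the directory name below is just an example:
# Save the fine-tuned model and tokenizer to a local directory
model.save_pretrained("fine_tuned_gpt2")
tokenizer.save_pretrained("fine_tuned_gpt2")

# Later, reload them the same way the original checkpoint was loaded
model = TFGPT2LMHeadModel.from_pretrained("fine_tuned_gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("fine_tuned_gpt2")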