Building AI Model Embeddings with Artificial Neural Networks (Nov 7)

Learning Outcomes:
How to build an AI model using an embedding.
How artificial neural networks work and are used in this.
Lab Exercises: Doing the coding in a Google Colab notebook.

Your assignment is to make your own embedding.
We are here in this course to learn how to be AI Application Developers.
Your project requires you to build your own Generative AI Language Model. The embedding you make yourself will be the center of that model.

What is an Embedding?
A representation of words in the format of numbers.
Gematria: Assigning numbers to letters: Each word has a numerical value.
In building our embedding for our AI Model, we are creating a Numeric Algebraic Matrix. As we will see in the Lab, this numeric matrix - which is what the embedding is - is where the tokens and weightings of the Training Data live.
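As a minimal sketch of the two ideas above (not the lab's actual code): the first function scores a word gematria-style, assuming a simplified a=1 ... z=26 letter mapping, and the second part shows a tiny embedding matrix where each row holds the numbers for one token. The vocabulary, the helper names `word_value` and `embed`, and the matrix values are made up for illustration.

```python
# Gematria-style value: a=1, b=2, ..., z=26 (a simplified, assumed letter mapping)
def word_value(word):
    return sum(ord(ch) - ord('a') + 1 for ch in word.lower() if ch.isalpha())

# A toy embedding: one row of numbers per token (illustrative values, not trained)
vocab = {'ai': 0, 'model': 1, 'fox': 2}
embedding_matrix = [
    [0.9, 0.1, 0.3],   # row 0 -> 'ai'
    [0.8, 0.2, 0.4],   # row 1 -> 'model'
    [0.1, 0.9, 0.7],   # row 2 -> 'fox'
]

def embed(word):
    """Look up the row of numbers that represents a word."""
    return embedding_matrix[vocab[word]]

print(word_value('ai'))   # 1 + 9 = 10
print(embed('fox'))       # [0.1, 0.9, 0.7]
```

The gematria value is a single fixed number per word, while an embedding gives each word a whole row of numbers that training can adjust.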

Lecture 1: Introduction to Building AI Model Embeddings with Artificial Neural Networks

Good morning class, and welcome to the exciting world of Artificial Intelligence!
Today, we're going to take our first steps into building AI models, focusing on embeddings and how they work within artificial neural network architectures.
We'll be using Google Colab, an accessible platform that allows us to write and execute Python code through the browser.

What is an AI Model Embedding?

In the context of AI, an embedding is a representation of data in a lower-dimensional space.
Imagine you have a lot of data points with many features; embeddings allow us to convert these high-dimensional data points into fewer dimensions so that similar data points are placed closer together in this new space. {Bucketing}
Embeddings are crucial when dealing with types of data like text and images, where traditional numerical representations can be very sparse and inefficient.
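To make "similar data points are placed closer together" concrete, here is a small sketch that compares vectors with cosine similarity, one common closeness measure. The three vectors are hand-written for illustration; in a real model they would be learned from data.

```python
import math

def cosine_similarity(u, v):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hand-written 3-dimensional vectors (illustrative values, not learned)
vectors = {
    'cat': [0.9, 0.8, 0.1],
    'dog': [0.8, 0.9, 0.2],
    'car': [0.1, 0.2, 0.9],
}

sim_cat_dog = cosine_similarity(vectors['cat'], vectors['dog'])
sim_cat_car = cosine_similarity(vectors['cat'], vectors['car'])
print(sim_cat_dog > sim_cat_car)  # True: 'cat' sits closer to 'dog' than to 'car'
```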

Why Use Artificial Neural Networks?

Artificial Neural Networks (ANNs) are inspired by the biological neural networks that constitute animal brains.
They are a series of algorithms that endeavor to recognize underlying relationships in a set of data through a process that mimics the way the human brain operates.
ANNs are capable of learning and modeling complex patterns and decision boundaries.
They are fundamental in many AI tasks like classification, regression, and even generative models.

Building a Simple Neural Network in Google Colab

Google Colab is a free cloud service that supports Python.
Google Colab is also great for making computational notebooks that combine code with documentation and assignment questions.
Start making Google Colab notebooks part of your daily practice!
You can do personal learning projects and post your Google Colab notebooks to showcase your work to employers and win the interview!
It's an ideal platform for machine learning and AI education because it provides free access to hardware acceleration (GPUs and TPUs), which are essential for training neural network models efficiently.
You can also purchase additional cloud processing time and storage to scale up your AI model!
Let's start with a simple example:
We will create a neural network that learns to represent words as embeddings and uses those embeddings to predict a word from the context in which it appears. Next Token Generation!
Step 1: Setting Up the Environment
First, we need to set up our environment in Google Colab. Here's how you start a new notebook and import the necessary libraries.

# Import the required libraries
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

Step 2: Preparing the Data
We'll need some text data to work with. For simplicity, we will create a small corpus of sentences.

# Define a corpus of sentences
corpus = [
    'the quick brown fox jumps over the lazy dog',
    'I am learning AI in college',
    'building AI models is exciting',
    'deep learning is a branch of machine learning',
]

# Tokenize the corpus
tokenizer = tf.keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(corpus)  # build the vocabulary before reading word_index
vocab_size = len(tokenizer.word_index) + 1  # +1 because index 0 is reserved for padding
sequences = tokenizer.texts_to_sequences(corpus)

# Let's look at our vocabulary
print("Vocabulary:", tokenizer.word_index)

Step 3: Creating Training Data
We'll use the sequences to create our training data. The words surrounding each position (its context) will be used to predict the word at that position.

# Generate training pairs (context windows)
window_size = 2

def generate_context_word_pairs(corpus_seq, window_size, vocab_size):
    context_length = window_size * 2
    for words in corpus_seq:
        sentence_length = len(words)
        for index, word in enumerate(words):
            context_words = []
            label_word = []
            start = index - window_size
            end = index + window_size + 1
            # Collect the neighbors inside the window, skipping the target word itself
            context_words.append([words[i]
                                  for i in range(start, end)
                                  if 0 <= i < sentence_length
                                  and i != index])
            label_word.append(word)

            # Left-pad short contexts with 0 and one-hot encode the target word
            x = tf.keras.preprocessing.sequence.pad_sequences(context_words, maxlen=context_length)
            y = tf.keras.utils.to_categorical(label_word, vocab_size)
            yield (x, y)

# Prepare the data for training
pairs_gen = generate_context_word_pairs(sequences, window_size, vocab_size)
pairs = list(pairs_gen)
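
To see what a context window actually contains, here is a plain-Python sketch that works on words instead of integer ids, so you can read the pairs directly. The helper name `context_pairs` is hypothetical and the padding and one-hot steps are left out on purpose.

```python
def context_pairs(tokens, window_size):
    """Return (context_words, target_word) pairs for one tokenized sentence."""
    pairs = []
    for index, target in enumerate(tokens):
        start = max(0, index - window_size)
        end = min(len(tokens), index + window_size + 1)
        # Every word inside the window except the target itself
        context = [tokens[i] for i in range(start, end) if i != index]
        pairs.append((context, target))
    return pairs

sentence = 'the quick brown fox'.split()
for context, target in context_pairs(sentence, window_size=2):
    print(context, '->', target)
# ['quick', 'brown'] -> the
# ['the', 'brown', 'fox'] -> quick
# ['the', 'quick', 'fox'] -> brown
# ['quick', 'brown'] -> fox
```

Words near the start or end of a sentence get shorter contexts, which is why the Keras version pads them to a fixed length.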

Step 4: Building the Neural Network
Now, we create a simple neural network with an embedding layer.
# Define the embedding dimension
embed_size = 128

# Build the model
model = Sequential([
    layers.Embedding(input_dim=vocab_size, output_dim=embed_size, input_length=window_size*2),
    layers.Lambda(lambda x: tf.reduce_mean(x, axis=1)),  # average the context word vectors
    layers.Dense(vocab_size, activation='softmax'),  # one probability per vocabulary word
])

# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Summary of the model
model.summary()
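
The three layers above can be traced by hand. The following is a rough plain-Python sketch of one forward pass: look up the context vectors, average them, then apply a dense layer with softmax. The sizes (4 words, 2 dimensions) and all weights are made up, and the dense layer's bias is omitted to keep it short.

```python
import math

def softmax(scores):
    shifted = [s - max(scores) for s in scores]   # subtract the max for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

# Toy sizes: 4-word vocabulary, 2-dimensional embeddings (illustrative values)
embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dense_weights = [[0.5, -0.5, 0.2, 0.1],    # shape: embed_size x vocab_size
                 [0.3, 0.8, -0.4, 0.6]]

def predict(context_ids):
    # 1) embedding lookup, 2) average over the context, 3) dense layer + softmax
    vectors = [embeddings[i] for i in context_ids]
    mean = [sum(column) / len(vectors) for column in zip(*vectors)]
    scores = [sum(mean[d] * dense_weights[d][w] for d in range(len(mean)))
              for w in range(len(dense_weights[0]))]
    return softmax(scores)

probs = predict([1, 2])
print(len(probs), round(sum(probs), 6))  # 4 1.0
```

The output is one probability per vocabulary word, and training nudges the embedding rows so that the correct target word receives the highest probability.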

Step 5: Training the Model
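# Train the model
for epoch in range(1000):
    total_loss, last_acc = 0.0, 0.0
    for x, y in pairs:
        batch_loss, last_acc = model.train_on_batch(x, y)
        total_loss += batch_loss
    # The source is cut off after "%"; the reporting interval of 100 is an assumption
    if (epoch + 1) % 100 == 0:
        print(f"Epoch {epoch + 1}: loss={total_loss:.4f}, accuracy={last_acc:.4f}")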

What kind of Python data structure is this?

print("Vocabulary:", tokenizer.word_index)

The tokenizer.word_index in Python is a dictionary data structure. When using the Tokenizer class from the Keras preprocessing text module, word_index provides a mapping of words (as strings) to their integer indices. Each word is a key in the dictionary, and the corresponding value is the unique integer that has been assigned to that word.
Here's an example of what the word_index might look like if it were printed out (Keras gives lower indices to more frequent words, so 'learning', which appears three times in our corpus, comes first):
{
    'learning': 1,
    'the': 2,
    'ai': 3,
    'is': 4,
    'quick': 5,
    'brown': 6,
    'fox': 7,
    'jumps': 8,
    'over': 9,
    'lazy': 10,
    'dog': 11,
    'i': 12,
    'am': 13,
    'in': 14,
    'college': 15,
    'building': 16,
    'models': 17,
    'exciting': 18,
    'deep': 19,
    'a': 20,
    'branch': 21,
    'of': 22,
    'machine': 23
}

In this dictionary, each word from the corpus has been given a unique integer index. These indices are used internally by the neural network when processing text data, as the network itself cannot process raw text but instead processes numerical representations of the text.
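If you want to see how such a dictionary can be built, here is a simplified plain-Python sketch. It is not the Keras implementation: the helper name `build_word_index` is hypothetical, and it only lowercases and splits on whitespace, whereas the Keras Tokenizer also strips punctuation.

```python
from collections import Counter

def build_word_index(corpus):
    """Map each lowercased word to an integer, most frequent first (0 is left for padding)."""
    counts = Counter(word for sentence in corpus for word in sentence.lower().split())
    # most_common() orders by count; words with equal counts keep first-seen order
    return {word: i + 1 for i, (word, _) in enumerate(counts.most_common())}

corpus = [
    'I am learning AI',
    'deep learning is a branch of machine learning',
]
word_index = build_word_index(corpus)

# Convert a sentence into the integers the network will actually see
sequence = [word_index[word] for word in 'I am learning AI'.lower().split()]
print(word_index['learning'])  # 1 (the most frequent word)
print(sequence)                # [2, 3, 1, 4]
```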

How this embedding will be at the center of the project to build our own generative AI language model



Building the Generative AI Model from Neuron to Embedding

A Journey Through the Layers of the AI Architecture

Lecture Overview

Greetings, scholars. Today, we embark on a fascinating journey through the layers of AI architecture. Our destination? To construct a generative AI language model. But our focus will not merely be on the destination; rather, we shall relish the intricate pathway there, beginning with the humble neuron and culminating in the complex structure of model embeddings.
The heart of our endeavor lies in the embedding layer, a dense representation of features where words, or tokens, transform into vectors of real numbers. This transformation is pivotal, as it stands at the core of our generative model, enabling the nuanced understanding of language necessary to generate coherent and contextually relevant text.

Lecture 1: The Foundation - Understanding Neurons and Layers

The Biological Inspiration

Just as the human brain processes information through a network of neurons, artificial neural networks (ANNs) use artificial neurons, or nodes, to process data. Let's take a moment to appreciate the biological neuron: a cell that transmits information through electrical and chemical signals. This concept inspired the creation of the perceptron, the simplest form of an artificial neuron, where inputs, weighted sum, and an activation function mimic the biological process.

From Biological to Artificial

In our artificial constructs, we take inputs $x$, apply weights $w$, add a bias $b$, and run them through an activation function $f$ to produce an output $y$: $y = f\left(\sum_i w_i x_i + b\right)$.
Each neuron's output then serves as input to the next layer, creating a complex web of computation that forms the backbone of our AI models.
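The neuron described above fits in a few lines of plain Python. This is only a sketch with hand-picked inputs and weights, using the sigmoid as the activation function.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b, f=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through an activation function."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return f(z)

y = neuron(x=[1.0, 2.0], w=[0.5, -0.25], b=0.0)
print(y)  # 0.5, because the weighted sum is 0.5 - 0.5 + 0.0 = 0 and sigmoid(0) = 0.5
```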

Coding in Google Colab

Let's get hands-on. Open Google Colab and create a new Python 3 notebook:
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Embedding

# Let's create a single neuron with a sigmoid activation function
model = Sequential([
    Dense(1, activation='sigmoid', input_shape=(10,))
])

model.compile(optimizer='adam', loss='binary_crossentropy')

This simple model represents a single neuron. It's overly simplistic for our needs, but it's a starting block.

Lecture 2: Scaling Up - Neural Networks and Deep Learning

Layering Complexity

A single neuron is limited, but by layering neurons together, we create a neural network capable of learning complex patterns. In deep learning, we stack many such layers to create a 'deep' network. Each layer learns to transform its input data into a slightly more abstract and composite representation.
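As a small illustration of stacking, the sketch below feeds the output of one layer into the next. The helper `dense_layer`, the ReLU activation, and all weights are chosen for illustration only.

```python
def relu(z):
    return max(0.0, z)

def dense_layer(inputs, weights, biases, activation):
    """weights[j] holds the incoming weights of neuron j in this layer."""
    return [activation(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
            for neuron_w, b in zip(weights, biases)]

x = [1.0, -1.0]
# First layer: two neurons. Second layer: one neuron that reads the first layer's outputs.
hidden = dense_layer(x, weights=[[1.0, 1.0], [1.0, -1.0]], biases=[0.0, 0.0], activation=relu)
output = dense_layer(hidden, weights=[[0.5, 0.5]], biases=[0.1], activation=relu)
print(hidden)  # [0.0, 2.0]
print(output)  # [1.1]
```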

The Embedding Layer - The First Step Towards an AI Language Model

In the context of NLP, embeddings are a powerful way to represent language. Each word is mapped to a point in a dense vector space, typically a few hundred dimensions, where the semantic relationships between words are captured in the geometry of that space.

Google Colab Exercise - Creating an Embedding Layer

Now, let's create an embedding layer:
# Set the size of the embedding
vocab_size = 10000 # Let's assume we have a vocabulary of 10,000 words
embed_size = 300 # Each word will be represented by a 300-dimensional vector

embedding_layer = Embedding(input_dim=vocab_size, output_dim=embed_size)

This is a standalone embedding layer. In our final model, it will be the first layer, accepting the integer-encoded vocabulary and outputting the dense word vectors.
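A quick back-of-the-envelope check shows how large this layer is: it stores one weight per (token, dimension) pair. The batch size and sequence length in the second part are hypothetical values picked only to show the output shape.

```python
vocab_size = 10000   # number of rows: one per token id
embed_size = 300     # number of columns: one per embedding dimension

# An embedding layer stores one weight per (token, dimension) pair
num_parameters = vocab_size * embed_size
print(num_parameters)  # 3000000

def output_shape(batch_size, sequence_length):
    """Shape produced when a batch of integer sequences is looked up."""
    return (batch_size, sequence_length, embed_size)

print(output_shape(32, 12))  # (32, 12, 300)
```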

Lecture 3: The AI Model Embedding - At the Center of Generative AI

The Role of Embeddings in Generative Models

Generative AI models synthesize new content by learning from vast amounts of data. The embedding layer is central to this. It turns discrete, categorical data into continuous vectors, allowing the model to interpret words not just as unique IDs but as a spectrum of features that represent their meaning.
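One way to picture "discrete IDs become continuous vectors": treating a token id as a one-hot vector and multiplying it by the embedding matrix gives exactly the same result as reading out one row. The sketch below checks this with a made-up 3 x 2 matrix.

```python
embedding_matrix = [
    [0.2, 0.7],   # token id 0
    [0.5, 0.1],   # token id 1
    [0.9, 0.4],   # token id 2
]

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def matvec(vector, matrix):
    """Multiply a row vector by a matrix."""
    return [sum(v * row[col] for v, row in zip(vector, matrix))
            for col in range(len(matrix[0]))]

token_id = 2
via_one_hot = matvec(one_hot(token_id, 3), embedding_matrix)
via_lookup = embedding_matrix[token_id]
print(via_one_hot == via_lookup)  # True
```

This is why an embedding layer can be implemented as a simple table lookup instead of a full matrix multiplication.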
Let's enhance our Google Colab notebook:
# Define a simple sequential model
model = Sequential([
    embedding_layer,  # assumed: the embedding layer created earlier goes first
    # Other layers would follow here
])

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy')

Though our model is still incomplete, you've taken the first step in configuring its heart.

Lecture 4: Preparing for Generative Modeling

Architectural Considerations

Generative models often use architectures like RNNs, LSTMs, or Transformers, all of which can manage sequences of data. They require significant computational resources, particularly during training.

Google Colab and Beyond - Deployment

Colab notebooks can be easily shared. Once your model is ready, you can:
Save the model in Google Colab.
Export the notebook to a GitHub repository.
Share the link to the Colab notebook or the GitHub repo with me for evaluation.
Here's how you can save your model:
model.save('my_generative_model.h5')

And you can share your Colab notebook by clicking on 'Share' in the upper right, ensuring that the notebook is set to 'Anyone with the link can view'.

Epochs in Training

Finally, let's train our embedding within the larger network context for just 4 epochs, as a start. Adjust the epochs parameter below:
history = model.fit(X_train, y_train, epochs=4)

This is just the beginning. The real challenge lies in fine-tuning, experimenting, and discovering the power of embeddings in generative AI models.
Class, this concludes the first segment of our lecture series on AI model embeddings.

Where the training data is being provided

In the lecture I provided earlier, there was an implicit assumption that you would be dealing with preprocessed data; however, the actual step of providing the training data was not explicitly detailed. This is an oversight that should be addressed, as training data is a crucial aspect of any machine learning task. Let's correct that.
In a comprehensive lecture, the part where we would discuss and provide the training data would typically come after preparing the embedding layer and before defining the complete model architecture. It would involve preprocessing the data to be suitable for training, which is a crucial step for any machine learning or AI project.

Let’s add this now:

Preparing the Training Data for Our Generative AI Model

Understanding the Data

Before we can start training our generative model, we need to prepare our dataset. In the context of NLP and embeddings, our training data consists of text—lots of text. For our generative model, this text needs to be tokenized; that is, converted into numerical values that our model can understand.

Tokenization and Sequence Generation

Using Google Colab, we'll write code to tokenize our corpus of text and then generate sequences that will be used by the model to learn the language structure.
Let's add to our Google Colab notebook:
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Sample corpus
corpus = [
    'We are learning about AI',
    'Embeddings are central to NLP models',
    'This will help us in building generative models',
    # Imagine many more sentences here
]

# Initialize the tokenizer and build its vocabulary from the corpus
tokenizer = Tokenizer(num_words=vocab_size)
tokenizer.fit_on_texts(corpus)

# Convert sentences to sequences of integers
sequences = tokenizer.texts_to_sequences(corpus)

# Pad the sequences to have uniform length
# (padding at the front keeps a real word, not a 0, in the last position)
padded_sequences = pad_sequences(sequences, padding='pre')

In this snippet, we tokenize our corpus and pad the resulting sequences to ensure they have the same length, which is required for most neural network architectures.
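To see what padding does to the data, here is a plain-Python sketch of the idea. The helper `pad` is hypothetical and only mimics the basic behaviour of filling with zeros at the front ('pre') or at the back ('post').

```python
def pad(sequences, padding='post', value=0):
    """Fill every sequence with `value` until all have the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        fill = [value] * (max_len - len(seq))
        padded.append(seq + fill if padding == 'post' else fill + seq)
    return padded

sequences = [[4, 2, 7], [5, 1], [3]]
print(pad(sequences, 'post'))  # [[4, 2, 7], [5, 1, 0], [3, 0, 0]]
print(pad(sequences, 'pre'))   # [[4, 2, 7], [0, 5, 1], [0, 0, 3]]
```

Notice that with 'post' padding the last column is mostly zeros, which matters if you plan to use the last position as the word to predict.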

Generating Labels for Training

For a generative model, we typically want to predict the next word in a sequence, so our labels would be the word that comes after a given sequence of words.
# Prepare the data for the generative task
X_train, y_train = padded_sequences[:, :-1], padded_sequences[:, -1]
y_train = tf.keras.utils.to_categorical(y_train, num_classes=vocab_size)

Here, X_train contains sequences of words, and y_train contains the next word that the model should predict. We also one-hot encode our labels to match the output layer of our network, which will use a softmax activation function for multi-class classification.
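The code above takes one training example per sentence. A common alternative, shown here as a sketch and not as this lab's method, is to turn every prefix of a sentence into its own example so that one sentence yields several (input, next word) pairs. The helper names and the token ids are illustrative.

```python
def next_word_pairs(sequence):
    """Split one token-id sequence into (prefix, next_token) training examples."""
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]

def one_hot(index, num_classes):
    """Label vector with a single 1 at the position of the correct word."""
    return [1 if i == index else 0 for i in range(num_classes)]

sequence = [4, 2, 7, 5]
pairs = next_word_pairs(sequence)
print(pairs)                                 # [([4], 2), ([4, 2], 7), ([4, 2, 7], 5)]
print(one_hot(pairs[0][1], num_classes=8))   # [0, 0, 1, 0, 0, 0, 0, 0]
```

The prefixes have different lengths, so they would still need padding before being fed to the network.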

Next Steps

With our training data prepared, we can now feed it into our model to begin training. This will be the next step in our lab, where we start to see our generative model come to life.
This section of the lecture would typically be accompanied by a more in-depth exploration of the data, a discussion on the importance of data quality and quantity, and potentially a deep dive into data augmentation techniques for improving the robustness of the generative model. It would also include live coding within Google Colab to demonstrate these concepts in action.
