
Summer 23 Project Document AML 3304 Course Project Writing Python to construct a Simple Generative AI Language Model

DUE DATE is Week 13:
Project DUE Date is August 18 (Friday). On Saturday I will look at the uploaded project control document text files, and examine your Latex, Trello, and GitHub (if you are doing the Coding Version of the Project).
Your Latex Document should contain instructions to run your project and screen shots of its operation.

Change of plan: We will not be doing individual presentations.
Video is optional: Strongly encouraged but not required.
For your Video: You will make a POWERPOINT which will be the storyboard serving as the background while you are doing the screencast.

Steps to make a Video:

Download the free 30 day trial of Camtasia. (See instructions below).
Make a PowerPOINT which will be the storyboard for your Video.
Your video will be a Screencast. Turn on video recording software: Record your screen and your talking.
Save as MP4 file format and upload to YouTube
Put the YouTube URL into your Trello and Latex documents.

“What should I put in my video?”
It’s your video - put whatever you want!
You are telling the story of how you made your Project:
Talk about how you did it: tools, procedures
How you made those decisions
Problems
Cool new discoveries
Post it on LINKEDIN for potential employers to see: Show that you are very excited about doing this kind of work!

Notes on how to make the Video:
Install the Camtasia and Snagit 30 day trials
Make a Prezi.com / PowerPoint to be the Story Board for your Video.


References and Resources
By July 30:
How to do this: Submit TEXT FILE
Tell me your
Team Name,
Team members’ names and Student IDs
TRELLO Board LINK: (add as a member)
Account LINK (add as a member)
Put this into a TEXT file, upload to : XXX
** OR if doing the Research Paper Stream: Upload a text file,
named as your StudentName_StudentID.txt if individual
named as TeamName.txt if you are a coding project team

Upload to:
s23 Project Document AML 3304 Course Project s1 AM Session Project Control File Upload
This dropbox will be for all Work Product Submissions.

s23 Project Document AML 3304 Course Project s2 PM Session Project Control File Upload
This dropbox will be for all Work Product Submissions.


For Research Paper: Your project grade will be based:
100% on your Work Submission (Research Paper or Project Code)
For coding Project:
50% based on the project code and presentation/latex pdf/video
50% based on individual interviews with the Instructor

How to make your TRELLO BOARD (ONE Trello Board per Team).
Go to
Add team members and add Peter@PeterSigurdson.net
PowerPoint:

Make Latex Document
Get signed up at
One Overleaf Account Per Team
You can work in teams of up to 5 members for the Coding Stream Project.
Presentation: Done in the last 2 weeks of class in a separate Adobe Meeting.
According to Course Outline: Project is worth 35% of Final, assigned as follows:
1/2 of your Grade for the project will be shared among the team.
1/2 of your Grade will be based on your answers to my questions and your ability to explain what is going on with the details and operations of the Class Project.
2 streams for the Project
Stream A: You can write a RESEARCH Paper, on topics outlined below: Individual NOT team.
Stream B: You can work in a team of up to 5 members to make your own Python Generative AI Language Model.
Upload your Research Paper: Upload link will be posted.

Deliverable Requirements and Submission Format for the Coding Project.

Both the research stream and the coding stream of the project will make a Video.
**- We will download and learn to use Camtasia and Snagit
Techsmith.com
You will make a PowerPoint as the background storyboard: screencast over it as you narrate your Video.

Make a PowerPoint storyboard to be the backdrop for your Video: Name it as TeamName.pptx [Upload your PowerPoint Story Board also.]

Upload it to:
s1
s2
What we want to see, and what you want to show on your Video: the Generative AI Language Model that you built, or, for the Research Paper, how you WOULD build it.
Making a Video and presenting your work and Video in a LinkedIn BLOG Post is a REQUIRED WORK PRODUCT OF BOTH THE RESEARCH PAPER STREAM AND THE CODING STREAM.
It is a required output of the project that all team members have face time and talk time on the video.

You are doing this to let Employers see the great work you are doing!


LLama Model (Watch these to get started):
LLaMA & Alpaca: “ChatGPT” On Your Local Computer 🤯 | Tutorial
This video shows how to run state-of-the-art large language models on your local computer using the Dalai library, which runs both the foundational LLaMA model and the instruction-following Alpaca model. LLaMA is a foundational (or broad) language model that predicts the next token (word) from a given input sequence (sentence); Alpaca is a fine-tuned version of LLaMA capable of following instructions (think ChatGPT behaviour). Both models achieve results comparable to, or better than, their GPT counterparts while being small enough to run locally.
Companion Medium article: https://medium.com/p/17adda704c23
Dalai GitHub Repository: https://github.com/cocktailpeanut/dalai
Helpful LLaMA prompts: https://github.com/facebookresearch/llama/blob/main/FAQ.md#2
Self-Instruct: https://arxiv.org/pdf/2212.10560.pdf
Alpaca: https://crfm.stanford.edu/2023/03/13/alpaca.html
Introducing LLaMA: https://ai.facebook.com/blog/large-language-model-llama-meta-ai/
LLaMA Paper: https://arxiv.org/pdf/2302.13971.pdf


When you are presenting in your video:
[Also great advice for the Job Interview:]
You should demonstrate your Command and Control, your Thought Leadership, and your Insight into your Business Domain with this material.
Demonstrate your Thought Leadership of this material:
What new questions and thinking and insights can you bring into this?
For your school project, and at work: you should always be “In the Zone,” thinking about what new insights and questions you can get from your Creative Imagination by exposing yourself to these concepts every day.
How to do the video:
Download the free 30 day Trial of Camtasia. You can also get the free trial of Snagit Screen Capture to drop screen shots into your Video:
You will use the Free Trial Version of Camtasia. Alternatively, each team member can make their own segment of the Video: Dropbox your segment to one team member, who will assemble the various Video Clips.
Course MINDMAP
Why are we learning this?
Peter’s MINDMAP to give you resources for the PROJECT:
You will not lose your JOB to AI. You will lose your Job to people who are using AI to do that job better than you can.
You will move to the Top when you are the one who can build the AI solutions:
Generative AI Large Language Models to implement and deliver the Business processes.
Now you will see the emergence of new Job Categories:
AI Language Model Analyst
AI Application Developer
AI Entrepreneurship
Also: AI will be embedded into many other job fields such as Data Analytics, Business Process Engineering, AI-enabled Dev Ops (AI OPS )
What to deliver in your coding project:
Keep your GitHub up to date with your Code.
Make a Generative AI Language Model which the Instructor can test by asking Questions of it.
You can start by pre-training from another open source model.
You can layer on top of that some of your own Training Corpus content.

Coding Stream: Build a Python Program to present a Small Scale, proof of concept, MVP (Minimal Viable product) - also called a Toy Version, Prototype - of a Generative AI Language Model.
OR

Research Paper Stream:

If you do the Research Paper stream, you do not need to do any coding.
However, you may wish to include some code segments in your Paper.
For example you may study my LinkedIn Blog Posting on Open CHAT GPT:
Write a Research paper which discusses the Points noted below, and any other lines of inquiry you developed while doing this research to demonstrate Thought Leadership:

Upload your Paper to:

Your Research Paper will be Authored in Latex.
You will make a TRELLO Board to provide the outlining, research organization, and planning for your Research Paper.
You will make a Video presenting your idea: You will make a PowerPoint to present the StoryBoard background for the Video.
Length: Use your own judgement. Provide a completely detailed explanation of what you want to convey. Include pictures, charts, diagrams, and visual references.
Look for ways to illustrate your ideas with charts, pictures, diagrams, tables.
You can use walnut.io, Figma.com, Canva, or other tools to make some infographics and illustrations to include with your paper.
Have a title page: Name, Student ID, Course Code and Section CODE.
You will be using ChatGPT, You.com, etc., so include the PROMPTS you are using with the AI to do your research.
We learned that a PROMPT should be of this Format:
Tell the AI What you want.
Provide the context or the REASON or purpose that you want this information for.
Constrain or prescribe the format of what the Answer should look like.
For example:
Write a Movie Script about a Bichon Frise named Peanut who is a master Ace Spy fighting for the safety of his country from the Bad People. Cast Peanut as being like Jason Bourne - the superhuman (super canine) Jason Bourne.
The reason I want this is to start a spy adventure series based on Peanut the Spy.
This script should be for a 90 minute, Hollywood-style Action adventure movie.
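The three-part prompt format above can be captured in a small helper function. This is only a sketch: build_prompt is a hypothetical name invented for this illustration, not part of any library.

```python
def build_prompt(task, reason, answer_format):
    """Assemble a prompt from the three elements: what you want,
    why you want it, and what the answer should look like."""
    return (
        f"{task}\n"
        f"The reason I want this is: {reason}\n"
        f"The answer should be: {answer_format}"
    )

prompt = build_prompt(
    "Write a Movie Script about a Bichon Frise named Peanut who is a master Ace Spy.",
    "to start a spy adventure series based on Peanut the Spy",
    "a script for a 90 minute, Hollywood-style Action adventure movie",
)
print(prompt)
```

Keeping the three elements as separate arguments makes it easy to experiment with each part of the prompt independently.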

Here are elements to address for the Research Paper:

1. The technologies and theory involved: talk about HOW these applications are built, and HOW the software engineering, build processes, and software project management work in an AI Cloud DevOps world:
HOW DOES the Unified Process Project Management Methodology carry over into this new world of Cloud AI-optimized DevOps and building AI solutions?
2. How does CI/CD work here? How do we apply CI / CD to building and enhancing the ML OPS MODEL.
3. How do Bayesian Models work to power Generative AI Language Models?
4. All of the topics covered here:
Research Paper elements to address:
The kinds of tools that are used to build AI applications that integrate with and enhance the company’s business processes.
The kinds of skills you must develop if you want to work as an AI programmer/application developer.




For the Coding Project Stream: Here are your Instructions

Welcome to the lab workbook for AML 3304, where you will learn how to build a simple generative AI language model similar to CHAT GPT.

This lab workbook will guide you through the process of building your first language model, step-by-step.

What software/programming languages will you use?: PYTHON
Where will you source training data?
What kind of a PLATFORM will you deploy your Model to? [Cloud Devops and building the ML Ops Model]
How will you push updates? How will you push updates and do CI/CD - What CI/CD pipeline will you use?

Prerequisites

Before starting this lab, you should have a basic understanding of programming concepts like loops and conditional statements.
You should also have some experience with Python programming language.
Please address this topic in your Latex Document Project Write Up:
Does ML Ops solve all the same software engineering structural requirements that Unified Process does?

Step 1: Installing Required Libraries

Detailed Instructions:
Note: As Cloud Devops Build Engineers: We must solve problems such as provisioning shared team access to our Code Base.
Prompt Engineering:
What kind of a shared code repository should I use with my team to make an ML Ops Model
The first step in building a language model is to install the required PYTHON libraries.
In this lab, we will be using the following libraries:
Target to get this done by next class: Set up your PYTHON Environment with:
Hugging Face
TensorFlow
Keras
NumPy
Pandas
Matplotlib
You can install them using pip or conda command in your terminal.

Step 2: Data Collection and Preprocessing

The next step is to collect and preprocess the data.

You can choose any dataset you want, but for this lab, we will use the Gutenberg corpus.
You can download it from the following link: https://www.gutenberg.org/ebooks/search/%3Fsort_order%3Ddownloads

Once you have downloaded the data, you need to preprocess it by removing unwanted characters, converting text to lowercase, removing stop words, and tokenizing the text.
You can use the HuggingFace platform or the NLTK library for this task.
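Before reaching for NLTK, the preprocessing steps can be sketched with the standard library alone. This is a minimal illustration: the tiny stopword list here is made up for the example, whereas NLTK ships a full list via nltk.corpus.stopwords and much better tokenizers.

```python
import re

# A tiny hand-rolled stopword list, for illustration only; use
# nltk.corpus.stopwords.words('english') in the real pipeline.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}

def preprocess(text):
    """Lowercase, strip unwanted characters, tokenize, drop stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # remove unwanted characters
    tokens = text.split()                  # naive whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("The Project Gutenberg corpus is a source of free e-books!"))
```

Running this shows how punctuation is stripped and common words are dropped before the text reaches the model.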

Step 3: Building the Language Model

The next step is to build the language model itself.
We will be using a simple neural network with a single LSTM layer.
You can start by defining the architecture of your model. Contrast this with the old-school software engineering approach, in which we used the Unified Process and UML to design the CONTROL SOFTWARE. Now we start by defining the MACHINE LEARNING MODEL. This is ML Ops, which is replacing Dev Ops and becoming Cloud DevOps.

Lecture Notebook: The Role and Use of Neural Network with a Single LSTM Layer

Introduction

Neural networks are a powerful tool for solving complex problems in a variety of fields, including natural language processing, computer vision, and robotics.
One popular type of neural network used for processing and predicting sequential data is the Long Short-Term Memory (LSTM) network.

In this lecture notebook, we will explore the role and use of neural networks with a single LSTM layer. We will cover the basics of LSTMs, their architecture, and how they can be used for natural language processing tasks, such as language modeling and text generation.

Long Short-Term Memory (LSTM)

LSTMs are a type of recurrent neural network (RNN) that are capable of processing and predicting sequential data.
They were introduced by Hochreiter and Schmidhuber in 1997 as a solution to the vanishing gradient problem that occurs in traditional RNNs.
The vanishing gradient problem occurs when the gradients of the loss function with respect to the weights of the network become very small, making it difficult for the network to learn long-term dependencies. LSTMs solve this problem by introducing a memory cell that can remember information for a long time and selectively forget information that is no longer needed.
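The vanishing gradient problem can be illustrated with a toy calculation. Backpropagation through time multiplies one gradient factor per time step; the constant factor of 0.5 used here is an assumption chosen purely for illustration.

```python
# If each backpropagation step scales the gradient by a factor below 1,
# the product shrinks geometrically -- the vanishing gradient problem
# that LSTMs were designed to mitigate.
factor = 0.5      # assumed per-step gradient factor, for illustration
gradient = 1.0
for step in range(30):
    gradient *= factor

print(gradient)   # 0.5 ** 30, roughly 9.3e-10: effectively zero
```

After only 30 steps the gradient signal is about a billionth of its original size, so a plain RNN cannot learn dependencies that far back.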

LSTM Architecture

An LSTM network consists of a series of LSTM cells that are connected in a chain. Each LSTM cell has three gates: the input gate, the output gate, and the forget gate. These gates control the flow of information into and out of the cell.
The input gate decides which information to add to the cell state, the forget gate decides which information to discard from the cell state, and the output gate decides which information to output from the cell state.
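The gate equations above can be written out for a single-unit LSTM cell in plain Python. This is a toy sketch with scalar weights and no bias terms; real LSTM layers use weight matrices over whole vectors, and the weight values here are arbitrary.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a single-unit LSTM cell.

    w holds one (input, hidden) scalar weight pair per gate.
    Biases are omitted to keep the sketch short.
    """
    i = sigmoid(w["i_x"] * x + w["i_h"] * h_prev)    # input gate: what to add
    f = sigmoid(w["f_x"] * x + w["f_h"] * h_prev)    # forget gate: what to discard
    o = sigmoid(w["o_x"] * x + w["o_h"] * h_prev)    # output gate: what to emit
    g = math.tanh(w["g_x"] * x + w["g_h"] * h_prev)  # candidate cell values
    c = f * c_prev + i * g                           # new cell state (the memory)
    h = o * math.tanh(c)                             # new hidden state (the output)
    return h, c

w = {k: 0.5 for k in ("i_x", "i_h", "f_x", "f_h", "o_x", "o_h", "g_x", "g_h")}
h, c = lstm_step(1.0, 0.0, 0.0, w)
print(h, c)
```

Note how the forget gate f multiplies the previous cell state: with f near 1 the cell carries its memory forward almost unchanged, which is what lets LSTMs hold information over long spans.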

Using LSTMs for Natural Language Processing

LSTMs can be used for a variety of natural language processing tasks, such as language modeling, text classification, and text generation. One popular use case for LSTMs is building language models that can generate text that closely resembles human language.

To use LSTMs for language modeling, we first preprocess the text by tokenizing it, converting it to lowercase, and removing stop words. We then feed the preprocessed text into the LSTM network, which learns the underlying patterns and structure of the text. Once the network has been trained, we can use it to generate new text that is similar to the original text.
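To feed text into such a network, each character must first be mapped to an integer index. A minimal sketch of this vocabulary-building step, using a toy string in place of the real corpus (the names chars, char_to_int, and int_to_char match those used in the lab's training and generation code):

```python
# Build the character vocabulary and the two lookup tables used by a
# character-level language model.
text = "hello world"            # stands in for the preprocessed corpus
chars = sorted(set(text))       # distinct characters, in a fixed order
char_to_int = {c: i for i, c in enumerate(chars)}
int_to_char = {i: c for i, c in enumerate(chars)}

print(chars)                    # [' ', 'd', 'e', 'h', 'l', 'o', 'r', 'w']
```

These tables are what make the one-hot encoding in the training step possible: each character becomes an index, and each index becomes a position in a one-hot vector of length len(chars).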

Conclusion

In conclusion, neural networks with a single LSTM layer are a powerful tool for processing and predicting sequential data, such as natural language. LSTMs are capable of learning long-term dependencies and selectively remembering and forgetting information, making them ideal for natural language processing tasks that require understanding the underlying structure and patterns of text.
With their ability to generate new text that closely resembles human language, LSTMs have the potential to revolutionize the field of natural language processing and open up new possibilities for creative expression and communication.

Is LONG SHORT-TERM MEMORY (LSTM) used in ChatGPT?


No. ChatGPT is built on the GPT (Generative Pre-trained Transformer) architecture, which does not use LSTM layers. The Transformer replaced recurrent architectures such as the LSTM by using self-attention to process all the tokens in a sequence in parallel, which sidesteps the vanishing gradient problem in a different way.

GPT is a type of neural network that is pre-trained on large amounts of text data and then fine-tuned for specific natural language processing tasks such as language translation, question-answering, and text generation. The GPT architecture uses a series of transformer blocks, which in turn use self-attention mechanisms to encode and decode textual data.

Within each transformer block, the GPT architecture also includes a feedforward neural network and a multi-head self-attention mechanism. Additionally, the GPT architecture includes a positional encoding mechanism to ensure that the order of words within a sentence is preserved during encoding and decoding.

LSTMs are still a good way to learn sequence modeling, and they power the small-scale language model you will build in this lab, but production systems such as ChatGPT rely on attention rather than recurrence.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout

# max_len is the length of each input sequence and chars is the set of
# distinct characters in the corpus (both defined during preprocessing)
model = Sequential()
model.add(LSTM(128, input_shape=(max_len, len(chars))))  # single LSTM layer, 128 units
model.add(Dropout(0.2))                                  # dropout to reduce overfitting
model.add(Dense(len(chars), activation='softmax'))       # probability for each character

Step 4: Training the Language Model

Once you have defined the architecture of your model, you can start training it using the preprocessed data. You can use the following code to train your model:
model.compile(loss='categorical_crossentropy', optimizer='adam')  # one-hot targets, so categorical cross-entropy

model.fit(x, y, batch_size=128, epochs=20)  # x: one-hot input sequences, y: one-hot next characters

Step 5: Generating Text

Once your model has been trained, you can use it to generate text. You can use the following code to generate text:
import random
import numpy as np

def generate_text(model, length):
    # Start with a random seed sequence taken from the corpus
    start_index = random.randint(0, len(text) - max_len - 1)
    window = text[start_index: start_index + max_len]
    generated_text = window

    for i in range(length):
        # One-hot encode the current window of characters
        x_pred = np.zeros((1, max_len, len(chars)))
        for t, char in enumerate(window):
            x_pred[0, t, char_to_int[char]] = 1.

        # Predict the probability distribution over the next character
        preds = model.predict(x_pred, verbose=0)[0]
        next_index = sample(preds, 0.5)
        next_char = int_to_char[next_index]

        # Append the predicted character and slide the window forward one step
        generated_text += next_char
        window = window[1:] + next_char

    return generated_text
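The generation code calls a sample() helper that the lab does not define. A common choice is temperature sampling, sketched below; the exact behavior is an assumption, chosen to be consistent with the call sample(preds, 0.5) in Step 5.

```python
import math
import random

def sample(preds, temperature=1.0):
    """Draw an index from a probability distribution, reweighted by temperature.

    Lower temperatures sharpen the distribution (more conservative text);
    higher temperatures flatten it (more surprising text).
    """
    # Reweight in log space, then renormalize back to probabilities
    logits = [math.log(max(p, 1e-10)) / temperature for p in preds]
    exps = [math.exp(l) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the reweighted probabilities
    return random.choices(range(len(probs)), weights=probs, k=1)[0]
```

The max(p, 1e-10) guard avoids taking the log of zero when the model assigns a character no probability at all.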


Conclusion

In this lab workbook, you learned how to build a simple generative AI language model similar to CHAT GPT. You learned how to collect and preprocess data, build a neural network, train the model, and generate text using the model. This is just the beginning of your journey in the field of natural language processing, and we hope that this lab has sparked your interest in this exciting and rapidly growing field.

Prompt Engineering: expand to give explicit details on how all the tools and python libraries work

Step 1: Installing Required Libraries



The first step in building a language model is to install the required libraries. In this lab, we will be using the following libraries:


TensorFlow

TensorFlow is an open-source machine learning library developed by Google. It is used for building and training neural networks, including deep learning models. TensorFlow is one of the most popular libraries for machine learning and has a vast community of developers and contributors.

Keras

Keras is an open-source library that provides a high-level API for building and training neural networks. It is built on top of TensorFlow and allows developers to quickly prototype and build deep learning models.

NumPy

NumPy is a Python library used for numerical computing. It provides support for large, multi-dimensional arrays and matrices, as well as a large collection of mathematical functions to operate on these arrays.

Pandas

Pandas is a Python library used for data manipulation and analysis. It provides support for data structures like data frames, series, and panel, and functions to manipulate and analyze data.

Matplotlib

Matplotlib is a Python library used for data visualization. It provides support for creating various types of plots, including line plots, scatter plots, bar plots, and histograms.
You can install these libraries using pip or conda command in your terminal.

Step 2: Data Collection and Preprocessing

The next step is to collect and preprocess the data. You can choose any dataset you want, but for this lab, we will use the Gutenberg corpus. You can download it from the following link: https://www.gutenberg.org/ebooks/search/%3Fsort_order%3Ddownloads
Once you have downloaded the data, you need to preprocess it by removing unwanted characters, converting text to lowercase, removing stop words, and tokenizing the text. You can use the NLTK library for this task.

NLTK

NLTK (Natural Language Toolkit) is a Python library used for natural language processing tasks, such as tokenization, stemming, lemmatization, part-of-speech tagging, and more. It provides support for various corpora and resources for training and testing natural language processing models.

Step 3: Building the Language Model

The next step is to build the language model itself. We will be using a simple neural network with a single LSTM layer. You can start by defining the architecture of your model.

LSTM

LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) that is used for processing and predicting sequential data. It is a neural network layer that can remember information for a long time and avoid the vanishing gradient problem, which is common in traditional RNNs.

Step 4: Training the Language Model

Once you have defined the architecture of your model, you can start training it using the preprocessed data. You can use the following code to train your model:

Compile and Fit

The compile() method configures the model for training, including the loss function, optimizer, and metrics. The fit() method trains the model on the given data, with the specified batch size and number of epochs.

Step 5: Generating Text


Once your model has been trained, you can use it to generate text. You can use the following code to generate text:

Sampling

The sample() function is used to randomly select an index from the probability distribution of the predicted characters; its second argument is a temperature that controls how adventurous the sampling is. By sampling from this distribution, we can generate new text that is similar to the original data but not an exact copy.

Conclusion


In conclusion, building a simple generative AI language model requires a combination of several Python libraries and tools. We need to preprocess the data using NLTK, build the model using TensorFlow and Keras, train the model using the fit() method, and generate text using the sample() function. With these tools and libraries, we can build powerful language models that can generate text that closely resembles human language.


Tabulate the benefits of each language model.


In addition to the model "gpt-3.5-turbo," OpenAI offers several other models, including "code-davinci-002," "text-davinci-002," "text-davinci-003," "ChatGPT," "GPT-4," and "GPT-4-32k" [1][2][3][4]. Each model has its own strengths and use cases, and some are updated more frequently than others.
Here are the benefits of each language model:
gpt-3.5-turbo: This model descends from the 175-billion-parameter GPT-3 family, although OpenAI has not disclosed its exact size. It generates human-like text across a wide range of topics, is priced for high-volume use, and can power applications like chatbots, language translation, and content generation.
code-davinci-002: This model is specifically designed for programming and software development tasks. It can generate code snippets, debug code, and complete code prompts with high accuracy. It is particularly useful for developers who want to automate repetitive programming tasks.
text-davinci-002: This model is optimized for generating natural language text and can be used for applications like content creation, chatbots, and virtual assistants. It is trained on a wide range of text sources and can generate high-quality text across a variety of topics.
text-davinci-003: This is an updated version of the text-davinci-002 model, with improved performance and accuracy. It can generate more fluent and coherent text than its predecessor and is particularly useful for applications that require long-form text generation.
ChatGPT: This is the conversational product built on top of the GPT-3.5 (and later GPT-4) models, fine-tuned with human feedback to generate natural-sounding responses to user queries in a dialogue setting.
GPT-4: Released by OpenAI in March 2023, this is the successor to the GPT-3.5 family. It is larger and more capable than gpt-3.5-turbo and generates more accurate, more human-like text.
GPT-4-32k: This is a variant of the GPT-4 model with an extended context window of roughly 32,000 tokens. The longer context makes it suitable for tasks involving long documents, such as summarizing reports or analyzing large codebases.
Each of these language models has its own unique strengths and applications, and the choice of model will depend on the specific needs of the user.

Hand in work / How to submit the Coding Project.

Make a PowerPoint. Name it as TeamName.pptx

Into this PowerPoint: Put Team members’ names and Student IDs. GITHUB URL. A technical narrative of what you did, and how you did it.

Write your PYTHON CODE, get it working on your work Station.
Your hand in will be your GITHUB Repository URL. Make it PUBLIC
Your team will create and upload a POWERPOINT:
Present a Title Page with Team Members’ Names and Student IDs.
Present your GITHUB URL
Provide a technical narrative of what you did and how you did it. Discuss any details of interest, such as your Training Corpus.
Present some screenshots of your code in operation.