
W23 AML 3304 Assignment 1 How to do ML Ops to build an AI language model

For your Assignment:
Instructor's Overleaf Latex Document to give you some ideas of how to structure your team's Latex Document:
Build your own LLM using the principles in the Work Book
Make your own LLM based on this workbook:

The big goal of this assignment is to learn how to make a customized LLM that you can train on the documents and processes of a company. This is going to become a hugely in-demand skill over the next 2 to 3 years. We want you to be there first!

Appoint one team member to be the Team Librarian, who creates the Team Trello Board and shares access with everyone else.
As you start to put together your ideas for the assignment, everyone can put in their ideas and brainstorm, build up the ideas, rework, reformat → Finally, when everyone agrees → Use this to build your Latex Document.

Sample of a TRELLO Board such as you will make:

Class Exercise: Let’s get signed up on Trello
Go to : sign up with a personal email address and get the free version
Video on using TRELLO:


LINK to Instructor’s Demonstration Board for TRELLO:

What you are to do:

1. Send me a text file (see assignment instructions) with public links to your TRELLO Board and Latex Document so I can view them.
2. Do research on the questions below. Questions are the thinking tools for us as professional people. Use ChatGPT and the search engine perplexity.ai.
The whole purpose and learning outcome of this Assignment is to cultivate in yourself the ability to formulate strong, meaningful questions
(just like you did for the LI Article Assignment).
Use the AI as your teacher: ask it questions, gather the information it provides, and dig deeper into specific topics. Have conversations with your AI professor. Ask it to teach you. This Assignment is about learning to ask deep questions. (This is what we call Prompt Engineering).


I want one per team: send me a text file named as teamname.txt:
Team members’ names and student ID
LINK to your TRELLO Board
Add to your TRELLO Board
EDIT LINK to your Latex Document

Instructor’s TRELLO Board:


How to add members to your Trello board:

https://app.screencast.com/2qX3GDhw80XvA
Remember to add peter@petersigurdson.net

Upload link for the text file with LATEX Overleaf Share link and Trello Share LINK:


Instructor’s Latex Document:


How to Share Editor Access to LATEX:

The 2 Deliverables your Team will present for this Assignment:
TRELLO board
Latex Presentation document:

Here are some learning resources for LATEX: ​

PowerPoint notebook:

How to create your Latex Document:

Go to - make an Account.
Appoint one team member to be the in-charge person for the Team’s Latex Document
They will give all team members EDITOR ACCESS to the TEAM’s Latex Document:

Each team appoints a Documentarian to be in charge of setting up your TRELLO and Latex Documents and giving everyone in the team EDITOR ACCESS. Make a TEXT document that includes the SHARE LINK to your TRELLO Board and the LINK to your OVERLEAF LATEX Document.

See this Screencast for instructions on getting your TRELLO Board Share LINK
Once you have your TRELLO Share LINK and LATEX Share LINK in that text - Save that document as TEAMName.txt - and upload that document to:


Learn to use LATEX
Make a TRELLO BOARD to organize your team’s work.
Add as a member of your Trello Board: Along with all the other team members.

ONE TRELLO Board for the entire team. Appoint a team document manager to be in charge of creating this board. Add all the other team members’ emails to it. (Add the email address they signed into TRELLO with).

Also - Add to your board
Make a SWIM Lane for each of the Questions you have been asked to answer.
In each Swim Lane: Put cards (all team members brainstorm to do this) to address the Questions you have been asked to answer:


And use the Trello Board to start to organize your team’s work.
You will use the TRELLO board to build your Latex Document:
And this will be the Script or Storyboard for your Video.

Resources:

ChatGPT and Playground
Peter’s List of AI Research Resources:


What to deliver / How to do this work:
This Assignment is a preparation for doing the Project.
In the Project, you will create an MVP (Minimum Viable Product) for your own AI language model, to learn what all the core pieces are and how you fit them together.

This Assignment requires you to research and investigate HOW these technologies work.

What exactly am I supposed to submit for this assignment:
Your TRELLO Board LINK
Your LATEX Document LINK
Put both these LINKS INTO A TEXT FILE [1 per team] named as your TEAMName.txt
Where to upload your TEXT file (Tuesday Group, Section 2):


With your team, you will research and investigate the questions noted below, and any other topics you think of by yourself.
Develop the skills of Prompt Engineering so that you can offer your clients and employers real Thought Leadership.
Remember that to work in this field, you need to run the Red Queen’s Race, which means: have a mindset to invent your own work and come up with the right questions to ask.
The core element of an AI language model is the Machine Learning Model.
How do we as Cloud Dev Ops specialists build, run, administer, maintain, support the Software Build Process for the ML MODEL (which is the basis for the AI generative language model) in the Cloud?
ML Ops is a critical component in building and deploying machine learning models at scale.
ML Ops involves automating the entire machine learning lifecycle from model development to deployment and monitoring.

Learning Outcomes for this Assignment: / Key Questions and Concepts to answer in your TRELLO Board and Latex Document:

Trello Boards and Latex will be your presentation format for your Assignment (And project)

What is the generative AI Language MODEL and how do we BUILD it?


How does CI/CD work in an AI/ML software build, to deliver the ML Ops Model?

What metrics do we design, and how do we apply these metrics, to the software build process for AI/ML Cloud DevOps? What numbers do we measure by to see if we are doing a good enough job in terms of our AI/ML build process?

DEV OPS = The skill set to run a Software build process.
ML OPS is the skill to build the ML OPS MODEL.
BUT: What new dynamics do we need to factor in to do CLOUD dev ops, because our AI and ML build processes live in the CLOUD?
What are the deployment strategies for ML Models?

An easily accessible tool for machine learning ops is MLflow, where one can
save artifacts such as datasets, metrics, and training scripts with versioning,
and which also supports model registration and deployment.
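As a rough illustration of the tracking idea described above, here is a minimal sketch of logging one training run with MLflow. The function name log_training_run, the parameter and metric names, and the throwaway temporary tracking folder are all assumptions made for this example; a real team would point MLflow at a shared tracking server. The import is guarded so the sketch still loads on a machine where MLflow is not installed.

```python
import tempfile
from pathlib import Path

try:
    import mlflow  # third-party package: pip install mlflow
except ImportError:  # keep the sketch importable where MLflow is absent
    mlflow = None


def log_training_run(params: dict, metrics: dict, script_text: str):
    """Record one training run; return its run ID, or None if MLflow is missing.

    All names here are hypothetical and only illustrate the workflow.
    """
    if mlflow is None:
        return None

    workdir = Path(tempfile.mkdtemp())
    # Assumption: a throwaway local file store instead of a shared server.
    mlflow.set_tracking_uri((workdir / "mlruns").as_uri())

    # Save the training script to disk so it can be attached as an artifact.
    script_path = workdir / "train.py"
    script_path.write_text(script_text)

    with mlflow.start_run() as run:
        mlflow.log_params(params)              # e.g. hyperparameters
        mlflow.log_metrics(metrics)            # e.g. validation scores
        mlflow.log_artifact(str(script_path))  # versioned copy of the script
        return run.info.run_id


if __name__ == "__main__":
    run_id = log_training_run(
        {"learning_rate": 0.01},
        {"val_accuracy": 0.9},
        "print('training placeholder')\n",
    )
    print("logged run:", run_id)
```

Each call produces a separate run with its own ID, which is what gives you the versioned history of parameters, metrics, and scripts. Model registration and deployment are left out of this sketch.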

Here are some steps and resources on how to do ML Cloud Dev Ops to build a Generative AI language model:


How to develop the Project Plan and curate the appropriate data for training the AI ML MODEL:
Talk about what THE Machine Learning MODEL is: where it comes from, how and why we use it.
You have seen how Project Management and Software Engineering work for old-school traditional software projects:
3 Tier Web Applications / MODEL VIEW CONTROLLER
Desktop Applications
Distributed Applications / Edge Computing / Internet of Things
Project these insights into how:
Software Engineering
Software Project management
will be done in the context of AI/ML Cloud Dev Ops. (Which means: Software Building Considerations for AI and ML).
What is Cloud Dev Ops, how is it done, and how does it compare and contrast with traditional dev ops?
The first step in building an AI language model is to identify what you are going to use it for and why.
Then think about where you are going to source the Training data from.
What value is it going to deliver to the business?
See the PowerPoint slides for a discussion of how a generative AI Language model produces new knowledge.

This involves determining the problem statement, defining the target audience, and gathering relevant data to train the model. → Compare/Contrast/Discuss how this relates to the Unified Process way of doing things.
Modeling strategies: Once the data is gathered, the next step is to develop modeling strategies that align with the project goals (aligns with User Stories in UP). This includes selecting appropriate algorithms, feature engineering, and determining evaluation metrics.
Data pipelines: Data pipelines are an essential part of building the software processes that create ML Models (this is ML OPS). They involve gathering, cleaning, and validating datasets, and they ensure data consistency, quality, and reliability. What about the Veracity and Trustworthiness of the training data? Which human agency has the moral suasion to make these decisions?
Automated data and model validation: To automate the process of using new data to retrain models in production, it is essential to introduce automated data and model validation steps to the pipeline. This ensures the model's accuracy, reliability, and consistency. ​
Model baseline and concept drift: Talk about what these mean: Baseline. Concept Drift. Establishing a model baseline and addressing concept drift are critical in ensuring model accuracy and consistency. This involves monitoring the model's performance and adjusting the model when its performance deviates from the baseline.
Deploying and continuously improving the model: How does CI / CD work in an AI ML Software build world? Once the model is developed, the next step is to deploy it and continuously improve it. This involves selecting the appropriate deployment strategy, integrating it into the production system, and monitoring its performance.
Monitoring and alerts: Monitoring machine learning applications for operational and machine learning-related issues is crucial. DevOps, Cloud DevOps, and SRE (Site Reliability Engineering): how does SRE transform for AI/ML? Monitoring and alerting involves comparing model inputs between training and inference, exploring model-specific metrics, and providing monitoring and alerts on the machine learning infrastructure.
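One way to make "comparing model inputs between training and inference" concrete is a drift score for a single numeric input feature. The sketch below uses the Population Stability Index (PSI), which is a swapped-in example technique and not something this assignment prescribes. The function name, the bin count, and the 0.2 alert threshold (a common rule of thumb) are assumptions for illustration.

```python
import math
from typing import Sequence


def population_stability_index(
    baseline: Sequence[float], live: Sequence[float], bins: int = 10
) -> float:
    """Compare the distribution of one feature at training time vs. inference time."""
    lo = min(baseline)
    span = max(baseline) - lo
    width = span / bins if span > 0 else 1.0

    def bin_shares(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values to edge bins
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    p = bin_shares(baseline)
    q = bin_shares(live)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    training_inputs = [i / 100 for i in range(100)]
    inference_inputs = [0.5 + i / 200 for i in range(100)]  # shifted upward
    psi = population_stability_index(training_inputs, inference_inputs)
    ALERT_THRESHOLD = 0.2  # assumption: rule-of-thumb cut-off, tune per project
    if psi > ALERT_THRESHOLD:
        print("ALERT: input drift detected")
```

A score of zero means the two samples fall into the bins in identical proportions; larger scores mean the live inputs look less like the training inputs. In a monitoring job this check would run on a schedule and feed the alerting system.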
In your project presentation (which is your Video): Address each of these items and compare/contrast with the Unified Process way of doing things: From your research, come up with additional things to think about.
Building an AI language model with ML Ops involves:
scoping the project
identifying data needs
developing modeling strategies
building data pipelines
introducing automated data and model validation: what IS model validation, how is it done?
establishing a model baseline and addressing concept drift
deploying and continuously improving the model
monitoring and alerting for operational and machine learning-related issues ​
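To give the "automated data and model validation" item above a concrete shape, here is a minimal sketch of two pipeline gates: a data check that flags incomplete rows, and a model check that only promotes a retrained model if it does at least as well as the current baseline. The schema (text and label fields), the function names, and the use of plain accuracy as the metric are all assumptions for illustration; a real project would choose its own schema and evaluation metrics.

```python
from typing import Mapping, Sequence

# Hypothetical schema for a labelled text dataset.
REQUIRED_FIELDS = ("text", "label")


def find_invalid_rows(rows: Sequence[Mapping[str, object]]) -> list:
    """Data validation: return indexes of rows with a missing or empty required field."""
    return [
        i
        for i, row in enumerate(rows)
        if any(row.get(field) in (None, "") for field in REQUIRED_FIELDS)
    ]


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Share of predictions that match the held-out labels."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def should_promote(
    candidate_accuracy: float, baseline_accuracy: float, min_gain: float = 0.0
) -> bool:
    """Model validation: promote only if the candidate meets the baseline plus a margin."""
    return candidate_accuracy >= baseline_accuracy + min_gain


if __name__ == "__main__":
    rows = [{"text": "hello", "label": 1}, {"text": "", "label": 0}, {"text": "bye"}]
    print("invalid rows:", find_invalid_rows(rows))

    labels = [1, 0, 1, 1]
    candidate = accuracy([1, 0, 1, 0], labels)
    baseline = accuracy([0, 0, 1, 0], labels)
    print("promote candidate:", should_promote(candidate, baseline))
```

In a CI/CD pipeline, a failed data check would stop retraining before it starts, and a failed model check would keep the current production model in place.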



I expect your Team to think of additional questions based on your own research:
Here is an example:

In the 1990s and 2000s, famous Computer Scientists like Peter Sigurdson used Unified Process to combine software engineering and software project management. Predict what the methodology will be in the future for software project management and software engineering for building generative AI language models.
It's important to note that there is no record of a famous computer scientist named Peter Sigurdson. However, I understand your question is about the future of software project management and software engineering methodologies for building generative AI language models.
In the future, we can expect methodologies for software project management and software engineering to be more adaptable, data-driven, and focused on continuous learning and improvement. As AI and ML technologies become more sophisticated and pervasive, several trends and techniques may emerge:
Hybrid methodologies: Combining the best aspects of traditional methodologies, like the Unified Process or Waterfall, with Agile or Lean principles. These hybrid methodologies will allow teams to balance the need for structure and planning with the flexibility and adaptability required for rapidly changing AI/ML environments.
Data-driven development: AI/ML models rely on data, and future methodologies will put an even greater emphasis on data quality, consistency, and management. These methodologies will be built around data pipelines, data versioning, and data validation to ensure reliable and trustworthy AI/ML models.
Continuous integration, delivery, and deployment (CI/CD): CI/CD practices will become more advanced and better integrated with AI/ML projects. AI-specific CI/CD tools, like TensorFlow Extended (TFX) or MLflow, will evolve, allowing for seamless end-to-end pipelines for model training, validation, and deployment.
AutoML and Neural Architecture Search (NAS): With the growing complexity of generative AI language models, there will be an increased reliance on AutoML and NAS techniques. These will enable the automatic discovery and optimization of model architectures, hyperparameters, and training strategies, significantly reducing the manual effort required to create and maintain state-of-the-art models.
Explainable AI and model interpretability: As AI language models become more complex, there will be a greater need for explainability and interpretability. Future methodologies will incorporate techniques to make AI models more transparent and understandable to developers, stakeholders, and end-users, ensuring trust and accountability.
Collaboration between humans and AI: Future methodologies will emphasize human-AI collaboration, where AI assists in decision-making, automating tasks, and providing insights throughout the software development and project management processes.
AI ethics and responsible AI: Future methodologies will place more emphasis on ethical considerations and responsible AI development. This includes addressing issues like fairness, accountability, transparency, privacy, and avoiding biases in AI/ML models.
Cross-functional and interdisciplinary teams: Future methodologies will focus on fostering collaboration between different disciplines, like data scientists, ML engineers, software developers, and domain experts. This will ensure that AI language models are built with a deep understanding of the problem domain and optimized for real-world applications.
Overall, the future of software project management and software engineering methodologies for building generative AI language models will be centered around adaptability, data-driven approaches, automation, and the integration of AI throughout the entire development process. These methodologies will also consider ethical and responsible AI development, ensuring that AI language models are both effective and trustworthy.