Lesson Plan: Introduction to ML Ops and Automation Tools

Peter Sigurdson

What is CI / CD?

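CI/CD stands for continuous integration and continuous delivery (or deployment). It is a series of automated steps that build, test, and release software each time the code changes.

Therefore, most of the tools we will be using are automation servers.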
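
To make the idea of a pipeline as an ordered series of steps concrete, here is a minimal sketch in Python. It is not how any particular automation server is implemented. The step names and commands are hypothetical placeholders, and a real project would substitute its own lint, test, and build commands.

```python
import subprocess
import sys

# Hypothetical pipeline: each step is a name plus the command to run.
# The commands are placeholders that only need a Python interpreter.
STEPS = [
    ("check interpreter", [sys.executable, "--version"]),
    ("run tests", [sys.executable, "-c", "assert 1 + 1 == 2"]),
    ("build", [sys.executable, "-c", "print('build ok')"]),
]


def run_pipeline(steps):
    """Run the steps in order and stop at the first one that fails."""
    results = []
    for name, command in steps:
        completed = subprocess.run(command, capture_output=True, text=True)
        passed = completed.returncode == 0
        results.append((name, passed))
        print(f"[{'PASS' if passed else 'FAIL'}] {name}")
        if not passed:
            break  # later steps are skipped once a step fails
    return results


results = run_pipeline(STEPS)
```

Tools such as Jenkins or GitHub Actions do essentially this on a server, triggered by events such as a push to the repository, with the steps described in a configuration file rather than in a Python list.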

Objective:

Introduce students to the concept of ML Ops and its importance in managing and deploying machine learning models

Familiarize students with popular automation tools such as Ansible, Jenkins, Travis CI, GitLab CI/CD, GitHub Actions, and CircleCI

Materials needed:

Computers with internet access

Slides (provided in the lecture section)

Duration: 1 hour

Outline:

I. Introduction to ML Ops (15 minutes)

Explanation of what ML Ops is and why it is important

Overview of the different stages of ML Ops (development, testing, deployment, and monitoring)

ML Ops, or Machine Learning Operations, is the process of managing, deploying, and monitoring machine learning models in production. It involves a set of practices, tools, and processes that enable teams to build, test, deploy, and monitor machine learning models in an efficient and effective way.

The main goal of ML Ops is to ensure that machine learning models are deployed and running smoothly in production, and that they continue to perform as expected. It also helps to improve collaboration, communication, and transparency between data scientists, engineers, and other stakeholders.

Here are some key steps for doing ML Ops:

Model Development: The first step is to develop a machine learning model that solves a specific business problem. This typically involves data preprocessing, feature engineering, model selection and training.

Model Testing: After developing a model, it is important to test it to ensure that it is working as expected. This includes both unit and integration testing, as well as evaluating the model's performance using metrics such as accuracy, precision, and recall.

Model Deployment: Once a model has been developed and tested, it can be deployed to a production environment. This step typically involves creating a containerized or serverless version of the model and deploying it to a cloud platform or on-premises infrastructure.

Model Monitoring: After deploying a model, it's important to monitor its performance and behavior in production. This includes collecting metrics and logs and setting up alerts to detect any issues or anomalies.

Model Management: Finally, it's important to have a process for managing machine learning models in production. This includes versioning, tracking of changes, and maintaining a history of models.
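
The testing and management steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed workflow: the labels, predictions, threshold values, and the in-memory `registry` list are all hypothetical, and a real project would typically use a metrics library and a proper model registry.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def passes_gate(metrics, thresholds):
    """Return True only if every metric meets its minimum value."""
    return all(metrics[name] >= minimum for name, minimum in thresholds.items())


# Hypothetical test labels and model predictions.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Hypothetical minimum values a model must reach before deployment.
THRESHOLDS = {"accuracy": 0.7, "precision": 0.7, "recall": 0.7}

metrics = evaluate(y_true, y_pred)

# Hypothetical in-memory registry: one record per approved model version.
registry = []
if passes_gate(metrics, THRESHOLDS):
    registry.append({"version": len(registry) + 1, "metrics": metrics})

print(metrics)
print(registry)
```

In a CI/CD pipeline, a check like `passes_gate` would run as a test step, so a model that falls below the agreed thresholds never reaches the deployment step.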

To do ML Ops effectively, use tools and frameworks such as Ansible, Jenkins, GitLab CI/CD, GitHub Actions, and CircleCI to automate its stages. Automation improves the speed, reliability, and scalability of the ML process, which are also core concerns of site reliability engineering (SRE).

II. Ansible (a component of your assignment)

Explanation of what Ansible is and how it can be used for automation

Demonstration of how to create and run an Ansible playbook

III. Jenkins (a component of your assignment)

Explanation of what Jenkins is and how it can be used for continuous integration and continuous deployment

Demonstration of how to create and run a Jenkins job

IV. Travis CI (10 minutes)

Explanation of what Travis CI is and how it can be used for continuous integration and testing

Demonstration of how to configure a Travis CI build

V. GitLab CI/CD (10 minutes)

Explanation of what GitLab CI/CD is and how it can be used for continuous integration, testing, and deployment

Demonstration of how to configure a GitLab CI/CD pipeline (This will be a key element of your Project)

VI. GitHub Actions (10 minutes)

Explanation of what GitHub Actions are and how they can be used for continuous integration, testing, and deployment

Demonstration of how to configure a GitHub Actions workflow

VII. CircleCI (10 minutes)

Explanation of what CircleCI is and how it can be used for continuous integration, testing, and deployment

Demonstration of how to configure a CircleCI pipeline

VIII. Conclusion and Next Steps (5 minutes)

Set up and configure all the elements of a Python CI/CD process

Ansible is an open-source automation tool that can be used to automate various IT tasks, such as configuration management, application deployment, and task automation. It uses a simple and human-readable language called YAML to define automation tasks, making it easy to understand and use.

Ansible works by connecting to remote hosts through SSH (typically for Linux and Unix hosts) or WinRM (for Windows hosts), and then executing tasks on those hosts using a set of pre-defined modules. Most of these modules are written in Python and can perform a variety of tasks, such as installing software, configuring services, and managing files.

Ansible uses the concept of playbooks to define a set of tasks that need to be executed on a specific set of hosts. A playbook is a YAML file that contains a list of tasks, and the hosts on which those tasks should be executed. Playbooks can also include variables, which can be used to customize the tasks based on the environment or specific needs.

Ansible also uses the concept of roles, which allow you to organize your playbooks and tasks into reusable and modular components. Roles are a way to group related tasks, files, and variables together, making it easier to share and reuse playbooks across different projects.

Here are the main components of Ansible:

Inventory: A list of hosts or groups of hosts that Ansible will connect to and execute tasks on.

Playbooks: YAML files that contain a list of tasks and the hosts on which those tasks should be executed.

Modules: Pre-defined units of code, most of them written in Python, that perform specific tasks, such as installing software or configuring services.

Roles: Reusable and modular components that group related tasks, files, and variables together.

Variables: Used to customize tasks based on the environment or specific needs.

To use Ansible, you will need to install it on your local machine and define your inventory and playbooks. Once you have defined your inventory and playbooks, you can use the ansible-playbook command to execute your playbooks on the defined hosts.
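
As a rough illustration of the workflow just described, the Python sketch below writes a small inventory and playbook to a temporary directory and builds the matching `ansible-playbook` command. The group name `web`, the host `server1.example.com`, the variable `package_name`, and the file names are all hypothetical placeholders. The script only prints the command; running it would require Ansible to be installed and SSH access to the listed host.

```python
import tempfile
from pathlib import Path

# Hypothetical inventory: one group named "web" with one placeholder host.
INVENTORY = """\
[web]
server1.example.com
"""

# Hypothetical playbook: check connectivity, then make sure a package is present.
PLAYBOOK = """\
---
- name: Prepare web servers
  hosts: web
  become: true
  vars:
    package_name: git
  tasks:
    - name: Check that the host is reachable
      ansible.builtin.ping:

    - name: Ensure the package is installed
      ansible.builtin.package:
        name: "{{ package_name }}"
        state: present
"""


def write_project(directory):
    """Write the inventory and playbook files and return the command to run them."""
    directory = Path(directory)
    inventory_path = directory / "inventory.ini"
    playbook_path = directory / "playbook.yml"
    inventory_path.write_text(INVENTORY)
    playbook_path.write_text(PLAYBOOK)
    # -i selects the inventory file; the playbook path is the positional argument.
    return ["ansible-playbook", "-i", str(inventory_path), str(playbook_path)]


project_dir = tempfile.mkdtemp()
command = write_project(project_dir)
print(" ".join(command))
```

The playbook shows the pieces named in the component list: `hosts` points at an inventory group, `vars` defines a variable, and each task calls a module.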

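Ansible also provides a web-based interface called Ansible Tower that can be used to manage and monitor your Ansible automation. In current Red Hat releases, Tower is called automation controller and is part of Red Hat Ansible Automation Platform; AWX is its open-source upstream project.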

Ansible is a powerful automation tool that can be used to automate a wide range of IT tasks. It is easy to use, human-readable, and can be used to improve the speed, reliability, and scalability of your IT operations.

An Introduction to Ansible

In our first chapter, we are going to be looking at the technology world before tools such as Ansible came into existence in order to get an understanding of why Ansible was needed.

Before we start to talk about Ansible, let's quickly discuss the old world. I have been working with servers, mostly ones that serve web pages, since the late 90s, and the landscape is unrecognizable. To give you an idea of how I used to operate my early servers, here is a quick overview of my first few years running servers.

Like most people at the time, I started with a shared hosting account where I had very little control over anything on the server side. When the site I was running at the time outgrew shared hosting, I moved to a dedicated server. This is where I thought I would be able to flex my future system administrator muscles, but I was wrong.

The server I got was a Cobalt RaQ 3, a 1U server appliance, which, in my opinion, was ahead of its time. However, I did not have root level access to the machine and for everything I needed to do, I had to use the web-based control panel. Eventually, I got a level of access where I could access the server using SSH or Telnet (I know, it was the early days), and I started to teach myself how to be a system administrator by making changes in the web control panel and looking at the changes to the configuration files on the server.

After a while, I changed servers and this time opted to forego any web-based control panel and just use what I had learned with the Cobalt RaQ to configure my first proper Linux, Apache, MySQL, PHP (LAMP) server by using the pages of notes I had made. I had created my own runbooks of one-liners to install and configure the software I needed, as well as numerous scribbles to help me look into problems and keep the lights on.

After I got my second server for another project, I realized that was probably a good time to type out my notes so that I could copy and paste them when I needed to deploy a server, which I am glad I did, as it was shortly after my first server failed. My host apologized and replaced it with a higher-specification but completely fresh machine with an updated operating system.

So I grabbed my Microsoft Word file containing the notes I made and proceeded to then copy and paste each instruction, making tweaks based on what I needed to install and also on the upgraded operating system. Several hours later, I had my server up and running and my data restored.

One of the important lessons I learned, other than that there is no such thing as too many backups, was to not use Microsoft Word to store these types of notes; the command doesn't care if your notes are all nicely formatted with headings and courier font for the bits you need to paste. What it does care about is using proper syntax, which Word had managed to autocorrect and format for print.

So, I made a copy of the history file on the server and transcribed my notes in plaintext. These notes provided the base for the next few years as I started to script parts of them, mostly the bits that didn't require any user input.

These scraps of commands, one-liners, and scripts were all adapted through Red Hat Linux 6 (note the lack of the word Enterprise) all the way through to CentOS 3 and 4.

Things got complicated when I changed roles, stopped consuming services from web hosts, and started working for one. All of a sudden, I was building servers for customers who may have different requirements than my own projects; no one server was the same.

From here, I started working with Kickstart scripts, PXE boot servers, gold masters on imaging servers, virtual machines, and bash scripts that started prompting for information on the system that was being built. I had also moved from only needing to worry about maintaining my own servers to having to log in to hundreds of different physical and virtual servers, from ones that belonged to the company I was working for to customer machines.

Over the next few years, my single text file quickly morphed into a complex collection of notes, scripts, precompiled binaries, and spreadsheets of information that, if I am being honest, really only made sense to me.

While I had moved to automate quite a few parts of my day-to-day work using bash scripts and stringing commands together, I found that my days were still very much filled with running all of these tasks manually, as well as working a service desk dealing with customer-reported problems and queries.

My story is probably typical of many people, while the operating systems used will probably be considered quite ancient. Now, the entry point of using a GUI and moving to the command line, while also keeping a scratch pad of common commands, is quite a common one I have heard.

We will be covering the following topics:

Who is behind Ansible

The differences between Ansible and other tools

The problem Ansible solves

McKendrick, Russ. Learn Ansible: Automate cloud, security, and network infrastructure using Ansible 2.x. Packt Publishing. Kindle Edition.
