Increasing Software Availability to 99.9%

Company

Appitsimple
Product: CallHippo

Problem

CallHippo is a SaaS virtual telephony solution that helps businesses connect with their clients from anywhere. Sales and marketing teams rely on it to reach their clients, so it was important for the service to stay up and running.
With a continuous development approach, software issues surfaced from time to time, and some of them brought the CallHippo servers down and made the service unusable.
This led to customer complaints. With customers based in the US and the development team in India, we were also slow to respond when servers went down at odd hours for the development team, and CallHippo wanted a solution to this.
In January 2019, there was about 1 failure every 15 days.

Breakthrough Solution

Vishal was given the task of finding a way to increase service availability.
The solution came from implementing the FMEA framework, short for Failure Mode and Effects Analysis.
The team asked the most basic question - what stops a person from calling?
We started by listing the reasons that could lead to that failure, along with each one's probability of happening in a month and its impact. We found 35+ reasons that could stop a customer from making or receiving a call.
With that, we identified the measures needed to prevent each of them from happening in the first place.
This helped us prioritise product development based on impact, and we were able to pull it off. In 2-3 months, we eliminated all the major causes of our servers going down.
We also changed the pre-deployment process, making testing more versatile and detailed.
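
The scoring step described above can be sketched in a few lines. This is a minimal illustration, not the team's actual worksheet: it assumes each failure mode gets a monthly probability and an impact rating on a 1 to 10 scale, and that priority is simply probability times impact. The failure mode names and all numbers are hypothetical. Classical FMEA often also multiplies in a detection rating, which is left out here because the write-up mentions only probability and impact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str                   # hypothetical label, not taken from the case study
    monthly_probability: float  # estimated chance of occurring in a month (0.0 to 1.0)
    impact: int                 # assumed scale: 1 (minor) to 10 (nobody can call)

    @property
    def risk_score(self) -> float:
        # Simplified priority score: probability times impact.
        return self.monthly_probability * self.impact


def prioritise(modes):
    """Return failure modes ordered from highest to lowest risk score."""
    return sorted(modes, key=lambda m: m.risk_score, reverse=True)


# Illustrative entries only; every name and number here is invented.
modes = [
    FailureMode("server out of memory", 0.30, 9),
    FailureMode("telephony provider outage", 0.05, 10),
    FailureMode("slow dashboard load", 0.60, 2),
]

for mode in prioritise(modes):
    print(f"{mode.name}: {mode.risk_score:.2f}")
```

Sorting by the score puts frequent, high-impact failure modes at the top of the backlog, which is the kind of ordering used to decide what to fix first.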

Impact

Server downtime went from once every 15 days to once every 6 months.
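
To relate these frequencies to the 99.9% figure in the title, the arithmetic can be sketched as follows. The case study does not say how long each outage lasted, so availability itself cannot be computed from it. The sketch only converts the stated frequencies to failures per year and works out how much downtime a 99.9% target allows. The function names are illustrative, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a 365-day year


def failures_per_year(days_between_failures: float) -> float:
    """Convert 'one failure every N days' into failures per year."""
    return 365 / days_between_failures


def downtime_budget_hours(availability: float) -> float:
    """Hours of downtime per year permitted by an availability target."""
    return HOURS_PER_YEAR * (1 - availability)


before = failures_per_year(15)      # once in 15 days
after = failures_per_year(365 / 2)  # once in 6 months, taken as half a year
budget = downtime_budget_hours(0.999)

print(f"Failures per year before: {before:.1f}")
print(f"Failures per year after: {after:.1f}")
print(f"Reduction factor: {before / after:.1f}x")
print(f"Yearly downtime budget at 99.9%: {budget:.2f} h")
print(f"Budget per incident after the change: {budget / after:.2f} h")
```

Under these assumptions, the failure rate drops from roughly 24 per year to 2 per year, and a 99.9% target leaves a little under 9 hours of downtime per year, or a bit over 4 hours per incident at the new rate.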