Autonomous Operations – IT Operations on Autopilot


By: Gaurav Mahakud


Imagine a bank outage that lasts for days, leaving almost a million deposit accounts inaccessible and causing significant financial hardship for customers. Such incidents are not uncommon in the ever-evolving world of information technology: despite efforts to tame IT complexity, outages and performance issues remain prevalent. A technology built to solve a social problem, however, offers a different perspective on how to approach IT complexity.


Figure 1: Autonomous vehicles

Take a look at autonomous vehicles, a revolutionary mobility technology that promises to address social problems by helping to prevent the roughly 1.35 million road traffic deaths recorded worldwide each year, reducing an individual's average time in traffic congestion by 150 hours, and minimizing violations of road traffic rules. Tesla, Waymo, and GM's Cruise are a few examples of companies that have commercialized this technology. The ability of an autonomous vehicle to anticipate and plan for upcoming driving situations rests on the philosophical concept of prudence. Technically, an autonomous vehicle is prudent, or "looks ahead," through a combination of sensors and cameras that feed real-time data to artificial intelligence (AI). The AI then makes decisions autonomously and steers the car seamlessly.


In the IT world, operations are a fundamental part of organizations. They steer the collective vehicle of IT infrastructure components on a journey to provide the best user experience. Keeping critical infrastructure running for millions of end users every day is a daunting task. Most IT organizations experience a bumpy ride while dealing with the complexity of infrastructure, the cost of managing its performance, the scalability needed to support its user base, and the security needed to keep data safe. The top reasons for this are manual operations and the continuous addition of people to manage ever-growing infrastructure complexity. IT organizations try to maintain business momentum by spending huge resources on managing IT clutter instead of reducing it.


Artificial Intelligence for IT Operations (AIOps) emerged ten years ago to make this bumpy ride smoother. AIOps uses traditional AI to help the operations team detect issues faster and act quickly to resolve them. However, the new need is for "autonomous operations": proactive approaches that move past AIOps and prevent issues before they occur. Autonomous operations can radically reduce IT management costs and improve efficiency with minimal manual effort. This will become a reality only by leveraging advanced deep AI, continuously training it to understand current business behavior, and relating that behavior to the fundamental components that drive operations.


Figure 2: Autonomous IT operations

While being proactive is the new necessity, reactive approaches cannot be abandoned altogether. In AIOps, reactive root cause analysis (RCA) reveals the blast radius of an IT infrastructure issue, that is, the set of components it impacts. It also classifies the cause as either an internal variable of the infrastructure, which the organization can control, or an external variable such as network speed or a third-party integration, which it cannot. Beyond these reactive approaches, AIOps tools need to provide accurate information on possible impacts early enough to prevent them. This shift toward proactive AIOps is seen as an emerging market of pre-emptibility, with pre-emptive tech such as Autonomous Operations (Auto-Ops). Auto-Ops handles familiar and novel situations differently and uses knowledge from both to identify the early symptoms of an IT issue. Once recognized, these symptoms can be used to proactively protect an IT infrastructure against similar categories of failure.
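To make the idea of a blast radius concrete, here is a minimal Python sketch. It assumes the infrastructure can be described as a dependency map and walks that map to find everything downstream of a failed component. The component names, the `DEPENDENTS` map, the `EXTERNAL` set, and both helper functions are hypothetical and purely illustrative; they are not taken from any specific AIOps or Auto-Ops product.

```python
from collections import deque

# Hypothetical dependency map (an assumption for illustration):
# each key is a component, each value lists the components that
# depend on it directly.
DEPENDENTS = {
    "third-party-payments-api": ["payments-service"],
    "core-db": ["accounts-service", "payments-service"],
    "accounts-service": ["mobile-app", "web-portal"],
    "payments-service": ["mobile-app"],
    "mobile-app": [],
    "web-portal": [],
}

# Hypothetical set of components outside the organization's control.
EXTERNAL = {"third-party-payments-api"}


def blast_radius(failed, dependents):
    """Return every component downstream of `failed` (breadth-first search)."""
    impacted = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, []):
            # Skip components already visited and the failed one itself.
            if child not in impacted and child != failed:
                impacted.add(child)
                queue.append(child)
    return impacted


def classify_cause(component, external):
    """Label a root cause as 'internal' (controllable) or 'external'."""
    return "external" if component in external else "internal"


if __name__ == "__main__":
    root_cause = "core-db"
    print("Impacted:", sorted(blast_radius(root_cause, DEPENDENTS)))
    print("Cause type:", classify_cause(root_cause, EXTERNAL))
```

In this toy setup, a failure in the hypothetical `core-db` reaches both services and both front ends, while a failure in the third-party API reaches only the payments path. A real RCA tool would build the dependency map from discovered topology and telemetry rather than a hand-written dictionary.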


Auto-Ops can help businesses manage IT clutter with a new benchmark of efficiency and optimized cost. It can transform the IT world by identifying a new issue, recalling how it emerged, and preventing it from recurring. Return to the bank outage described at the start and envision the solution: with Auto-Ops, the early symptoms of an IT catastrophe are identified and immediate action is taken to mitigate a calculated impact.