Modern IT Operations: Ninja, Samurai or Ronin?

Author: Abdelelah Alzaghloul, CISA, CISM, CGEIT, CRISC, ITIL 4 MP, ITIL 4 SL
Date Published: 30 June 2021

For decades, the IT discipline has suffered from a division between its two main entities: the development team and the operations team. This segregation of duties (SoD) has resulted in different mindsets, models and tools, and in competing and conflicting objectives. IT literature refers to this as the “wall of confusion” because developers seem to throw their finished code over the wall to the operations team.

The development team focuses on speed, agility and the continuous delivery of changes to meet strict and challenging time-to-market objectives. In contrast, the operations team focuses on reducing the cadence of change to maintain stability, compliance and continuity of business operations to meet challenging availability targets.

These different mindsets based on skills and specialization can be compared to two famous Japanese warrior types: ninja and samurai. The development team has many characteristics in common with the ninja warrior:

  • Driven by agility and speed
  • Expert in the use of lightweight weapons such as knives, ninja stars and crossbows (i.e., integrated development environment tools)
  • Proficient in stealth and camouflage (i.e., the ability to work from anywhere at any time)

At the other extreme, the operations team resembles the traditional samurai warrior:

  • Heavily armored, slow moving, and situated close to fortresses and castles (i.e., data centers, server rooms and operations bridges)
  • Disciplined by an ancient code of ethics called Bushido (i.e., IT service management principles)
  • Expert in the use of swords and heavy weapons (i.e., IT equipment and devices)
  • Organized in hierarchical army legions (i.e., traditional tiered IT operations model)

Just as IT communities around the globe were bridging the gap between development and operations to meet the demands of high-velocity IT departments, an unprecedented disruption impacted both teams: the COVID-19 pandemic.

The samurai warriors (operations teams) were unable to stand close to their castles and strongholds (data centers and server rooms) due to lockdowns, or they had to risk their lives to do so. Ninja warriors (development teams) had to think twice (or even three times) before introducing any change, knowing that failure could put their colleagues (operations) in risky situations.

IT enterprises around the globe recognized the need for a new type of warrior, leading to the rise of the ronin. A ronin is a wanderer; someone who finds the way without belonging to any one place. A ronin is a hybrid warrior—as fast as a ninja and as disciplined as a samurai—with additional characteristics:

  • Mobile
  • Masterless
  • Skilled in the use of both heavy and light weapons

The need for such a warrior was evident before the pandemic and was the focus of many emerging disciplines (e.g., DevOps, AgileOps), but COVID-19 accelerated the adoption of this trend globally, along with many other changes in the landscape of IT operations, by introducing new battlefields, weapons, tactics and even ethics.

New Battlefields

Being wanderers, ronins are not bound to their castles or fortresses, just as operations teams are no longer required to be down the hall from IT data centers or server rooms. They can be anywhere and work at any time, as they proved during pandemic-related lockdowns and curfews. Even though some operational tasks may still require “touch labor” or a physical presence, such as replacing faulty devices or investigating hardware or power failures, the idea is to minimize these physical interventions by leveraging technology.

A dynamic, highly extensible workforce that is digitally enabled and secure balances employees’ safety, productivity and continuity of business operations. For example, giants like Facebook and Twitter introduced permanent work-from-home (WFH) policies, allowing employees to work remotely forever.1 Some of the technology enablers for this model include:

  • Connectivity tools—Virtual private networks (VPN) and virtual desktop infrastructure (VDI) are among many tools that enable operations teams to perform their day-to-day duties remotely, effectively and efficiently. When security principles are applied, such tools have proved that an offsite operator is as efficient as an onsite engineer, especially when physical presence poses a risk. The main challenge observed with large-scale remote connectivity is capacity and license management, which can be overcome by defining clear usage-based profiles for employees that balance profile objectives with available capacity:
    • VDI for administrators who require application-level access for configuration changes, user interface (UI) interactions and dashboard monitoring
    • VPN for operators whose daily activities require access to databases, back-end processes or infrastructure components
  • Everything as code (EaC)—The idea that software defines everything has changed the focus of IT operations from manual and repetitive tasks to automated workflows that require less physical intervention. Examples of this methodology include:
    • Infrastructure as code (IaC)—Managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration.2 This paradigm shift has taken infrastructure automation to new levels, requiring that operations teams’ traditional competencies now include “codifying” skills.
    • Software-defined network (SDN)—Enables dynamic, programmatically efficient network configuration to improve network performance and monitoring, making it more like cloud computing than traditional network management.3 SDN is another example of how “codifying” infrastructure components enables operators to manage, scale and retire infrastructure effectively and efficiently while minimizing the need for physical hands-on labor. In the telecommunications sector, SDNs will spare network field engineers from having to make trips across the country to manage and maintain network components.
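
The "everything as code" idea can be illustrated with a minimal sketch. It assumes a hypothetical Server definition and a hypothetical plan function; neither is taken from a real IaC or SDN product. The point is only that the desired state lives in a machine-readable definition and software, rather than hands-on labor, works out which changes are needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical resource definition; the fields are illustrative only."""
    name: str
    cpus: int
    memory_gb: int


def plan(desired: dict[str, Server], actual: dict[str, Server]) -> list[str]:
    """Compare desired definitions with the current state and list the changes needed."""
    actions = []
    # Defined but not yet running: create it.
    for name in sorted(desired.keys() - actual.keys()):
        actions.append(f"create {name}")
    # Running but different from its definition: update it.
    for name in sorted(desired.keys() & actual.keys()):
        if desired[name] != actual[name]:
            actions.append(f"update {name}")
    # Running but no longer defined: retire it.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(f"delete {name}")
    return actions


# Illustrative data only (assumed names and sizes).
desired = {
    "web-1": Server("web-1", cpus=2, memory_gb=4),
    "db-1": Server("db-1", cpus=4, memory_gb=16),
}
actual = {
    "web-1": Server("web-1", cpus=2, memory_gb=2),
    "old-1": Server("old-1", cpus=1, memory_gb=1),
}

for action in plan(desired, actual):
    print(action)
```

In practice, the definitions would be kept under version control and the resulting plan would be reviewed before it is applied, so an operator can manage, scale and retire components without being near the hardware.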

New Weapons

A ronin is a two-sword warrior, like the samurai. However, like the ninja, the ronin also uses a variety of other weapons such as knives, bows and even stars. As wanderers, ronins make use of any weapon that serves the purpose: winning the battle. IT operations’ traditional sword is the integrated information technology service management (ITSM) tool set. As technology advances and new ways of working are adopted, extending this tool set is vital to meet the demands of digital transformation. New opportunities that can be explored include the following:

  • Robotic process automation (RPA)—Software robots (bots) are developed to streamline and automate tasks typically handled by human operators. The application of RPA can vary from simple, repetitive, rule-based tasks to more sophisticated judgment-based tasks requiring robots that can automate complex decision-making.
  • Artificial intelligence for IT operations (AIOps)—The term AIOps originally referred to algorithmic IT operations, but it has become synonymous with artificial intelligence (AI) for IT operations. AIOps combines big data and machine learning to automate IT processes, including event correlation, anomaly detection, causality determination and transformation of data collected from different sources into actionable insights that can be used to:
    • Proactively monitor and tune applications
    • Predict and rapidly detect issues
    • Provide business teams with real-time insights into IT’s impact on business, thus facilitating informed decision-making
  • ChatOps—This model combines people, tools, processes and automation and connects them in a collaborative and transparent flow. It humanizes workflows by promoting feedback loops and improving communication, team collaboration and experience. This conversation-driven model brings systems and tools into the dialog, enabling IT operations to:
    • Optimize incident resolution and request fulfillment times
    • Improve knowledge and information sharing
    • Establish a culture of continuous improvement and innovation
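
As a rough illustration of the anomaly detection that AIOps platforms automate, the sketch below flags metric samples that deviate sharply from recent history. It is a deliberately simple rolling z-score, not the method of any particular product; the function name, window size and threshold are assumptions chosen for the example.

```python
from statistics import mean, stdev


def find_anomalies(samples: list[float], window: int = 5, threshold: float = 3.0) -> list[int]:
    """Return the indexes of samples that deviate strongly from the preceding window."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any different value as a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        # Flag the sample if it lies more than `threshold` standard deviations away.
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Illustrative CPU utilization readings (assumed values) with one spike.
cpu_percent = [50, 51, 49, 50, 52, 51, 95, 50, 51]
print(find_anomalies(cpu_percent))
```

A production platform would correlate many such signals across sources before alerting an operator; this sketch covers only the detection step for a single metric.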

For example, SK Telecom in South Korea deployed a robot in the battle against the COVID-19 pandemic. Leveraging AI, automation and 5G technology, the robot was capable of autonomously carrying out monitoring activities, such as contactless temperature screenings, to automate safety checks and help prevent the spread of COVID-19.4

New Tactics

Being masterless, a ronin does not adhere to the Japanese class system composed of tiers of warriors—a hierarchical structure similar to the traditional IT multitiered support model in which each level handles work items based on knowledge, competency and authority before elevating them to higher levels. This time-consuming model was the norm for decades and has been challenged by today’s high-velocity IT departments, leading to:

  • Long queues of requests and incidents
  • Delayed response, fulfillment and resolution times
  • Bouncing of issues between levels

To address the shortcomings of the tiered support model, enterprises are adopting a modern, dynamic, collaborative approach referred to as “swarming.” Swarming involves people from different backgrounds or levels working on an issue at the same time and having end-to-end visibility on issue resolution or request fulfillment.

For example, spectators at a Formula One race have seen swarming in action. When the race car enters the pit area, a team converges to work on it—changing the tires, refueling and fixing any mechanical issues.

Swarming can be used in conjunction with or as an alternative to a traditional tiered model (figure 1). It can be applied in different disciplines where collaboration among stakeholders is perceived as valuable, whether the issue involves an incident, a problem, a change or a release. Due to its multidisciplinary nature, swarming can take many forms. Some examples include the following:

  • Dispatch swarms—Frequent meetings throughout the day to review incoming work and select quick-to-complete items5
  • Backlog swarms—Meetings convened on a regular or ad hoc basis at the request of product or service specialists who need input from other specialist groups, thereby avoiding delays as work items are reassigned to different teams or queues6
  • Drop-in swarms—Experts are continuously available or continuously monitor the activity of other teams to decide whether and when to get involved.7

Despite its advantages, swarming also has some drawbacks that enterprises should consider. For example:

  • There is a perceived increase in cost when highly skilled specialists are included early in the work stream, especially if a per-item charging model is applied. In such cases, swarming can be used when work queues reach a certain threshold, or it can be limited to specific work item types.
  • The basis of employees’ performance reviews will shift from individual to team contributions. Traditional key performance indicators (KPIs) may have to be adjusted to consider both competencies.
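
The cost controls mentioned above (swarming only once a queue reaches a threshold, or only for certain work item types) can be expressed as a simple policy check. The sketch below combines both controls for illustration; the threshold value, the eligible types and all names are hypothetical and would need to be set by each enterprise.

```python
from dataclasses import dataclass

# Hypothetical policy values; tune them to the enterprise's own queues.
QUEUE_THRESHOLD = 20
SWARM_ELIGIBLE_TYPES = {"incident", "problem"}


@dataclass
class WorkItem:
    item_id: str
    item_type: str  # e.g., "incident", "problem", "change" or "release"


def should_swarm(item: WorkItem, queue_length: int) -> bool:
    """Swarm only eligible item types, and only once the queue is long enough."""
    return item.item_type in SWARM_ELIGIBLE_TYPES and queue_length >= QUEUE_THRESHOLD


print(should_swarm(WorkItem("INC-1", "incident"), queue_length=25))
print(should_swarm(WorkItem("CHG-7", "change"), queue_length=25))
```

An enterprise that prefers to apply only one of the two controls could drop the other condition from the check.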

One example is Tricentis, a DevOps tools provider, which published a case study on how intelligent swarming helped the enterprise improve overall performance and customer satisfaction.8

Evolving Ethics

For ronins, the Bushido represents a moral code concerning attitude, principle, behavior and lifestyle. For IT operations teams, ITSM practices and principles represent the same thing. As IT operations progress through the modernization journey, these practices should adapt to the evolving ethical standards and regulations of the digital age related to both information and technology:

  • Information—Information security, data protection and privacy regulations should be at the heart of ITSM. Digitization, automation and systems integration provide IT operations with access to new information, which increases overall data-related risk. IT operations managers should educate and train their employees to consider these regulations and implement the required controls in all phases of the service delivery life cycle.
  • Technology—As technology advances, IT operations should consider technology’s impact on the environment by adopting green and sustainable practices that minimize the carbon footprints of their enterprises. Examples of green practices include:
    • Virtualization and hardware consolidation
    • Telecommuting
    • Paperless offices
    • Resource reclamation
    • Environmentally friendly disposal of IT assets

For example, British Telecom, the United Kingdom’s leading broadband giant, announced two new initiatives that will help form the foundation for an eco-friendly recovery from the COVID-19 crisis.9

Conclusion

The world is volatile, uncertain, complex and ambiguous. IT as a discipline is evolving, and modernizing IT operations has become a vital part of adapting to technological advancements, new ways of working and new disruptions. The COVID-19 pandemic has illustrated how a disruption can accelerate technology trends and lead to the emergence of new trends at a faster pace. IT operations should be prepared to adapt to and adopt whatever changes will influence its three pillars: people, processes and technology. Maybe the day will come when the ronin warrior becomes obsolete and a new type of warrior rises.

Endnotes

1 Cao, S.; “Interest in Twitter, Facebook Jobs Surges After CEOs Allow Permanent Work From Home,” Observer, 28 May 2020, https://observer.com/2020/05/twitter-facebook-square-job-interest-surge-permanent-remote-work/
2 Wittig, A.; M. Wittig; Amazon Web Services in Action, Manning Press, USA, 2016
3 Benzekki, K.; A. El Fergougui; A. Elbelrhiti Elalaoui; “Software-Defined Networking (SDN): A Survey,” Security and Communication Networks, vol. 9, iss. 18, 2016, p. 5803–5833
4 Waring, J.; “SK Telecom Deploys Robot in Covid-19 Battle,” Mobile World Live, 26 May 2020, https://www.mobileworldlive.com/asia/asia-news/sk-telecom-deploys-robot-in-covid-19-battle
5 AXELOS, ITIL 4: Create, Deliver and Support, The Stationery Office (TSO), United Kingdom, 2020
6 Ibid.
7 Ibid.
8 Murray, K.; “New Case Study: Swarming at Tricentis,” Consortium for Service Innovation, https://www.serviceinnovation.org/tricentis-swarming/
9 Weatherley, D.; “Revealed: BT’s ‘Green’ Recovery From the COVID-19 Pandemic,” Energy, 5 June 2020, https://www.energydigital.com/smart-energy/revealed-bts-green-recovery-covid-19-pandemic

Abdelelah Alzaghloul, CISA, CRISC, CISM, CGEIT, ITIL 4 MP

Alzaghloul is an IT advisor with 16 years of experience in IT governance, service delivery and IT transformation programs. He is experienced in the deployment of various IT governance frameworks and standards in the telecommunications sector. He is also a certified trainer in IT governance and the service management fields.