Working Toward a White Box Approach: Transforming Complex Legacy Enterprise Applications

Authors: Sandipan Gangopadhyay, Stuart McGuigan, Vijay Chakravarthy, Dheeraj Misra and Sumit Tyagi
Date Published: 25 January 2022

It has long been assumed that, once built, all computer systems are Turing machines—devices that perform one small, deterministic step at a time—and are, therefore, deterministic in nature.1 For every set of inputs, the same set of outputs is always produced, regardless of which part of the system is exercised. Practical application development, however, involves manually programming the requirements needed to achieve a business objective, and programming has never been free of defects; it therefore entails a high degree of uncertainty.2 Over the years, quality processes have been developed to detect defects at every step, and all system development life cycles require thorough testing phases. One approach to systems and application development uses a set of transparent, closed-loop tools to eliminate uncertainty and defects within a well-bounded set of processes. Central to this approach is a logical systems repository (LSR) that, once implemented, allows automation to drive application development and legacy system migration from requirements to production release, achieving better accuracy and consistency and, consequently, greater speed and lower cost. The goal is to represent every element of complex computer systems (i.e., data, procedures, infrastructure) as data fully available for analysis and modeling. Alternative software development life cycle approaches can then leverage that capability.

What Is a White Box and What Is an LSR?

Virtually every important business process enabled by computer systems at an organization involves a certain level of complexity. Complexity arises from how computer systems are constructed and connected to each other. There are six levels of system complexity:

  • Level 1: Size—Large modules/components composed of single blocks of >50 lines of code signal complexity. Size can be an indicator that the block of code is dealing with multiple responsibilities and needs to be broken down and componentized. Keeping track of when a component needs to be updated or connected becomes more difficult with a large body of code.
  • Level 2: Hard-wired interfaces—Hard-wired components with brittle interfaces have reduced reusability, which can lead to the development of redundant code. When any component undergoes a change, the components that interface with it must be included in the corresponding updates and testing (figure 1). Studies show that interfaces with low cohesion tend to degrade the cohesion of the classes implementing them, compared with classes that do not implement those interfaces, thereby reducing maintainability.3
  • Level 3: Depth of the call stack—When components within a system call other components, each addition to this chain of invocations results in a longer control flow (call stack) (figure 2), and the complexity of dependencies rises. When a downstream component needs to be changed or connected, all upstream and downstream impacts need to be considered. Measures such as implementation length and volume represent different levels of complexity and can affect several quality factors, including maintainability, efficiency and performance.4
  • Level 4: Hierarchy of the call stack—When components are called by multiple other components, the hierarchy (multiple callers/parents) of the call stack becomes tangled (figure 3). A single child component can be shared by multiple parent components, so when a child must be changed or connected, the impact on all parents needs to be considered. Call stack hierarchy, often measured using McCabe’s cyclomatic complexity, has been found to correlate with several quality factors, such as maintainability, reliability and testability.5 Depth and fan-in can be computed directly from a call graph, as the sketch following this list illustrates.
  • Level 5: Multiple technologies and applications—When business processes cross generations of applications and infrastructure, the existence of multiple technologies that cross process and organizational boundaries can obfuscate the call stack (figure 4). How computer systems implement a business process becomes increasingly difficult to document and trace because a component in one process or language can call a component in a completely different process or technology. Tracing these dependencies manually is difficult and requires team members conversant in multiple programming languages and systems.
  • Level 6: Black box components—Multiple vendor products (black boxes) further complicate traceability because it is unknown how code flows through these closed products and how they process information (figure 5).
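To make levels 3 and 4 concrete, the following minimal sketch treats the call stack as a directed graph and computes the longest call chain (depth) and the number of distinct callers per component (fan-in). The component names and edges are hypothetical; real tooling would derive the edges from parsed source code rather than a hand-written dictionary.

```python
# Minimal sketch: measuring call-stack depth (level 3) and caller fan-in (level 4)
# for a hypothetical call graph. Component names are illustrative only.
from collections import defaultdict

# Directed edges: caller -> callees
CALL_GRAPH = {
    "OrderController": ["OrderService"],
    "BatchJob": ["OrderService"],
    "OrderService": ["PricingEngine", "EligibilityCheck"],
    "PricingEngine": ["RateTable"],
    "EligibilityCheck": ["RateTable"],
    "RateTable": [],
}

def max_call_depth(component, graph, seen=None):
    """Longest downstream call chain starting at `component` (level 3)."""
    seen = seen or set()
    if component in seen:          # guard against cycles
        return 0
    seen = seen | {component}
    callees = graph.get(component, [])
    if not callees:
        return 1
    return 1 + max(max_call_depth(c, graph, seen) for c in callees)

def fan_in(graph):
    """Number of distinct callers per component (level 4)."""
    callers = defaultdict(set)
    for caller, callees in graph.items():
        for callee in callees:
            callers[callee].add(caller)
    return {component: len(parents) for component, parents in callers.items()}

if __name__ == "__main__":
    for component in CALL_GRAPH:
        print(f"{component}: depth={max_call_depth(component, CALL_GRAPH)}")
    print("fan-in:", fan_in(CALL_GRAPH))
```

In this toy graph, OrderService is a shared child with a fan-in of two, so a change to it must consider both callers, exactly the tangle described in level 4.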

Limitations of a Manual Approach to Design and Development
Large, complex IT projects tend to fail. In a study published by The Standish Group, researchers reviewed more than 50,000 IT projects between 1992 and 2004 and found that only 29 percent could be classified as successes.6 A common driver of these failures is complexity itself.7, 8 Complexities such as the six types described leave many large programs fraught with risk, including security vulnerabilities,9 because it is not humanly possible for architects and developers to manually keep track of the vast number of dependencies and relationships10 in their visual short-term memory. Changes are often made to components without a full view of all interconnected components in the same system or in upstream/downstream systems. Project teams must rely on either the prior experience of subject matter experts (SMEs) or integration testing to discover such dependencies and the ensuing unintended consequences.

The high costs of manual migration—data migrations are often 200 percent of the cost of system acquisition11—force organizations to either postpone modernization or resort to spot upgrades necessitated by obsolete systems, resulting in a patchwork of bandages holding a brittle system together. There is also a high degree of correlation between Halstead’s effort, one of several measurements developed by Maurice Halstead to estimate the mental effort required to develop or maintain a program,12 and McCabe’s cyclomatic complexity measures.13
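For reference, Halstead’s effort is derived from counts of operators and operands in a module. The sketch below applies the standard formulas (volume, difficulty, effort) to illustrative counts; the numbers are assumptions, not measurements from any particular code base.

```python
# Minimal sketch of Halstead's effort, computed from the standard software
# science counts. The operator/operand counts below are illustrative only.
import math

def halstead_effort(n1, n2, N1, N2):
    """n1/n2: distinct operators/operands; N1/N2: total operators/operands."""
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)   # V = N * log2(n)
    difficulty = (n1 / 2) * (N2 / n2)         # D = (n1 / 2) * (N2 / n2)
    return difficulty * volume                # E = D * V

# Example: a small module with 15 distinct operators, 20 distinct operands,
# 120 operator occurrences and 90 operand occurrences.
print(round(halstead_effort(15, 20, 120, 90), 1))
```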

Complexities impact quality; introduce the risk of disruption to process and data; create uncertainty in terms of project timeline, milestone performance and resource requirements; and, ultimately, increase costs.

Furthermore, complex systems typically exhibit additional challenges for manual implementation, such as:

  • Deployments are large in terms of application code, systems integrations and database changes.
  • Business logic is difficult to extract and is often duplicated, sometimes inconsistently. 
  • Hundreds of contributors modify source code using inconsistent techniques and coding patterns.
  • Obsolete or dead code is not used, but it still needs to be maintained, increasing the nonproductive total cost of ownership.
  • Documentation can be out of date with weak traceability to the source code.
  • Capabilities are layered onto existing complex systems, rapidly compounding technology debt. 
  • Performance and security challenges increase due to complexity and scale.

An Alternative Approach
An alternative to the manual approach is a white box approach that exposes, as data, all of a system’s internal control flows, data lineage, traceability between business processes and the functions supported by the system, flow of logic, business rules and data elements, and then lets those data points drive decisions in architecture, design, development and testing. The elements of data include business functions, use cases and user stories, business logic and business rules, code flows and cyclomatic paths, data elements and data lineage, interfaces and dependencies, and infrastructure dependencies. This approach enables project teams to determine how business processes are supported by the associated computer systems; how systems implement use cases; how the use cases are implemented across components, technologies, languages and systems; and which code, data and infrastructure assets are involved. For an existing system, this information must be extracted with parsers that not only perform lexical and semantic analyses but also span these control and data flows automatically.
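As a simplified illustration of the kind of extraction such parsers perform, the sketch below uses Python’s standard ast module to pull function-level call edges out of Python source files. A production toolchain would cover many languages and also capture data lineage; this single-language example only shows the principle.

```python
# Simplified illustration of automated call-flow extraction: walk Python source
# files and record which function calls which. Real white box tooling spans many
# languages and also captures data lineage; this only demonstrates the idea.
import ast
import pathlib

def extract_call_edges(source_dir):
    edges = []  # (file, caller_function, callee_name)
    for path in pathlib.Path(source_dir).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files this parser cannot handle
        for node in ast.walk(tree):
            if isinstance(node, ast.FunctionDef):
                for call in ast.walk(node):
                    if isinstance(call, ast.Call):
                        callee = getattr(call.func, "id", None) or getattr(call.func, "attr", None)
                        if callee:
                            edges.append((path.name, node.name, callee))
    return edges

if __name__ == "__main__":
    for edge in extract_call_edges("."):
        print(edge)
```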


This approach is a powerful way to validate model-based software engineering principles14 and is now more relevant than ever as technical debt and the associated complexity of IT systems have multiplied. White box testing approaches are also important for validating big data systems.15 The number of test paths to be exercised grows rapidly with the size and complexity of the data and, as noted, is not easy to visualize comprehensively by hand. Furthermore, without visibility into the impact of changes to a specific area of the system, it is difficult to assess the level of testing required in terms of the number of logic paths over the data constructs and the corresponding scenarios to be tested. Managing technical debt through replatforming or modernization requires getting to a white box state that exposes internal control flows for all the collaborating systems within an enterprise.

When working with black box components, project teams do not have access to component source code or underlying data structures, which adds complexity. The white box approach analyzes and maps the configuration of such components and exhaustively details all interactions with them. This level of visibility exposes all relevant flows involving the black box components and any related defects. The data are then stored in the LSR, a database that houses all the previously listed elements of the internal control flows and their relationships.
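The following is a minimal, hypothetical sketch of what an LSR might look like as a relational store. The table and column names are assumptions for illustration only; they are not the authors’ schema.

```python
# Minimal, hypothetical sketch of an LSR as a relational store. Table and column
# names are assumptions for illustration; they are not the authors' schema.
import sqlite3

DDL = """
CREATE TABLE component (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    technology TEXT,              -- e.g., COBOL, C++, Java
    system TEXT                   -- owning application or platform
);
CREATE TABLE call_edge (           -- control flow: caller invokes callee
    caller_id INTEGER REFERENCES component(id),
    callee_id INTEGER REFERENCES component(id)
);
CREATE TABLE business_rule (
    id INTEGER PRIMARY KEY,
    component_id INTEGER REFERENCES component(id),
    description TEXT
);
CREATE TABLE data_element (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE data_usage (          -- data lineage: component reads/writes element
    component_id INTEGER REFERENCES component(id),
    data_element_id INTEGER REFERENCES data_element(id),
    access TEXT CHECK (access IN ('read', 'write'))
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
print("LSR tables:", [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")])
```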

Case Study: How a White Box Makes a Difference

To measure the benefits of a white box approach to system migration and interconnectivity, consider a specific project in which the system supports mission-critical business processes. The system is complex, involves multiple technologies and handles, among other functions, the processing of pharmaceutical prescription claims. It implements regulated processes such as drug utilization review (determining whether a drug prescribed to a patient by a physician is safe given the patient’s other drugs, conditions and allergies), prior authorization, coverage and pricing. Errors can result in significant harm to patients. Several studies have found that, when implemented correctly, computerized physician order entry (CPOE) systems reduce medication errors and overall patient harm compared with manual order entry.16


For these functions, even a system that conforms to Six Sigma, a set of techniques and tools for process improvement, exhibits a level of accuracy that falls short of the quality level the enterprise requires. At that level of accuracy, an organization that processes more than a billion prescriptions a year will encounter more than 3,000 errors per year, which is not acceptable when patient lives are at stake.17 Because of the increased accuracy of a data-driven design, implementation and testing process, a white box approach is an optimal solution when a system needs to be modernized (e.g., from Broadvision, C++ and CORBA to contemporary technologies); updated for new regulations (e.g., the US Health Insurance Portability and Accountability Act [HIPAA] D.0 or the International Classification of Diseases, 10th Revision [ICD-10]); connected to other consumers (for white labeling the underlying services); or integrated with new clinical programs (to identify opportunities to calibrate patient therapy and deliver better clinical results).
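The arithmetic behind that estimate is straightforward: Six Sigma is commonly quoted as roughly 3.4 defects per million opportunities, so a billion prescriptions per year still yields thousands of errors.

```python
# Six Sigma is commonly quoted as about 3.4 defects per million opportunities.
# At one billion prescriptions per year, that still leaves thousands of errors.
DEFECTS_PER_MILLION = 3.4
prescriptions_per_year = 1_000_000_000
errors = prescriptions_per_year / 1_000_000 * DEFECTS_PER_MILLION
print(f"{errors:,.0f} expected errors per year")   # 3,400
```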

Service Delivery System for Pharmacy Benefits Management
Pharmacy benefits management involves managing a highly complex customer service ecosystem. Call center operations typically address general inquiry and service capabilities to support patients directly (business to consumer [B2C]). The service capabilities include eligibility checks, prescription refills, order status, order fulfillment support, health benefit inquiries, insurance coverage inquiries, account inquiries, prior authorization support, case management and retail pharmacy support.

In this case study scenario, the systems responsible for call center operations and order fulfillment used a multitier architecture composed of multiple technologies and vendor products. The key developers of the systems were no longer available for support, and the system was not well documented; hence, system knowledge was tribal and limited to a few individuals in the organization. The reliability of the system had declined year over year while transaction volume had increased. The outdated technologies and lack of vendor support increased the risk of system outages, leading to suboptimal customer service and to delays and errors in order fulfillment. The first few conventional attempts to upgrade and transform the system had been in vain; the initiatives were put on hold as they became expensive and time consuming.

For this case study, a system that captures and manages patient preferences in pharmacy benefit management was selected, modernized and migrated to a new architecture. It is important to note the size (figure 6) and complexity of the system (figure 7).

The metrics presented in figures 6 and 7 provide a view into how difficult it would be for an architect or developer to manually assess dependencies, impacts and optimal methods for updates, migration or interconnectivity.

Automated tools were used to extract the call stack across all languages and systems. The infrastructure footprint and a comprehensive inventory of functions, use cases, business rules, components and data elements were scanned, identified, verified with SMEs and stored in the LSR database, which was then used to drive implementation of the future state.

Making a Difference in Visibility and Transparency: Reverse Engineering
The LSR provides high confidence in full code coverage and ensures that no functionality is lost from the current state, regardless of its disposition in the future state—e.g., rewrite, redesign or remove. The factory code parsers handle all programming languages and code file types used in the current state.

In the case study, this approach reduced reliance on SMEs for initial discovery activities, allowing them to focus more on review and validation. Moreover, it clearly identified and listed all reverse-engineering tasks required. Further, the factory process/tools assisted the SMEs in performing data lineage analysis within and across all technology platform boundaries. Data lineage analysis detailed how each data element involved in the call stack would evolve through its life cycle.

The approach involved code parsing that spanned platform and process boundaries yet allowed users to navigate and reverse navigate the call hierarchy within and across all technology platforms. The SMEs performed ad hoc and on-demand analysis and assessed the impact of any change. The LSR acted as the single, central source of information required for forward engineering activities.
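A minimal sketch of the kind of impact query this navigation supports follows: given call edges held in the LSR, collect everything upstream and downstream of a changed component. The edge data and component names here are hypothetical.

```python
# Minimal sketch of change-impact analysis over LSR call edges: collect every
# component upstream (callers) and downstream (callees) of a changed component.
# Edge data and component names are hypothetical.
from collections import defaultdict, deque

EDGES = [  # (caller, callee)
    ("web_page", "pref_service"),
    ("batch_export", "pref_service"),
    ("pref_service", "pref_dao"),
    ("pref_dao", "PATIENT_PREF_TABLE"),
]

def reachable(start, edges, reverse=False):
    adj = defaultdict(list)
    for caller, callee in edges:
        if reverse:
            adj[callee].append(caller)   # walk toward callers (upstream)
        else:
            adj[caller].append(callee)   # walk toward callees (downstream)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

changed = "pref_service"
print("upstream impact:  ", reachable(changed, EDGES, reverse=True))
print("downstream impact:", reachable(changed, EDGES))
```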

A Different Forward Engineering Life Cycle Driven by the LSR
With full and comprehensive visibility into how the system was implemented, a white box approach was used to build out the future state, including modernization, refactoring, migration and interconnectivity.

The white box approach ensured that all functionality was preserved, while allowing a coarse functionality to be broken down into granular units as required. The approach helped SMEs and developers to achieve clear separation of architectural concerns during the design and coding phases. The approach delivered reports that showed components that were either not forward engineered or that lacked traceability from the target state back to the current state. The visibility improved management of the developer workload and made engineering productivity more predictable.

Furthermore, the LSR was used as a source of information on all system components, business rules and data elements to auto-generate code, either partially or completely, per the target design. The LSR served as the meta model that code generators typically require to convert functions to code objects and components; the generators sourced information from the LSR about inbound and outbound data elements, business rules and the hierarchy of callers. When necessary, the functionality was enhanced manually to increase its business fit and value. Business SMEs/analysts were able to inject new user stories or update existing user stories as required. The approach also focused on the performance of the forward-engineered functionality, such that it matched or exceeded current production performance.
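The sketch below gives a highly simplified picture of LSR-driven generation: a record describing a function’s inbound and outbound data elements and business rules is rendered into a service stub. The record layout and template are assumptions for illustration, not the actual factory generator.

```python
# Highly simplified sketch of LSR-driven code generation: a record describing a
# function's inbound/outbound data elements and business rules is rendered into
# a service stub. The record layout and template are illustrative assumptions.
FUNCTION_RECORD = {
    "name": "get_patient_preferences",
    "inputs": ["member_id", "plan_id"],
    "outputs": ["delivery_channel", "language"],
    "business_rules": ["Default delivery_channel to 'mail' when unset"],
}

STUB_TEMPLATE = '''def {name}({params}):
    """Auto-generated from the LSR.
    Business rules:
{rules}
    """
    # TODO: generated body; enhanced manually where business fit requires it
    return {{{outputs}}}
'''

def generate_stub(record):
    return STUB_TEMPLATE.format(
        name=record["name"],
        params=", ".join(record["inputs"]),
        rules="\n".join(f"      - {rule}" for rule in record["business_rules"]),
        outputs=", ".join(f"'{field}': None" for field in record["outputs"]),
    )

print(generate_stub(FUNCTION_RECORD))
```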

The White Box Implementation Process Used in the Case Study

The implementation of the white box process in the pharmacy case study involves reverse and forward engineering using the LSR.

Reverse Engineering
At a high level, the reverse engineering phase is a six-step process (figure 8) that establishes the LSR and the reverse engineering dashboards. The dashboard views are customized to the needs of various project roles (e.g., views that are specific to business analysts, quality engineers and business users). In addition, users can run prebuilt or custom queries over the LSR for analysis.
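As an example of the kind of prebuilt analysis a dashboard might expose, the sketch below runs a query over the illustrative LSR schema sketched earlier to find components that carry business rules but have no recorded callers. The schema and data are hypothetical.

```python
# Example of a prebuilt analysis query over the illustrative LSR schema sketched
# earlier: list components that carry business rules but have no recorded callers
# (candidates for dead or orphaned logic). Schema and data are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE component (id INTEGER PRIMARY KEY, name TEXT, technology TEXT);
CREATE TABLE call_edge (caller_id INTEGER, callee_id INTEGER);
CREATE TABLE business_rule (id INTEGER PRIMARY KEY, component_id INTEGER, description TEXT);
INSERT INTO component VALUES (1, 'pref_service', 'Java'), (2, 'legacy_pricing', 'C++'), (3, 'web_page', 'JSP');
INSERT INTO call_edge VALUES (3, 1);
INSERT INTO business_rule VALUES (1, 2, 'Round copay to the nearest cent');
""")

ORPHANED_RULES = """
SELECT c.name, c.technology, r.description
FROM business_rule r
JOIN component c ON c.id = r.component_id
WHERE c.id NOT IN (SELECT callee_id FROM call_edge);
"""
for row in conn.execute(ORPHANED_RULES):
    print(row)   # ('legacy_pricing', 'C++', 'Round copay to the nearest cent')
```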


Forward Engineering
The forward engineering phase (figure 9) involves designing the future state of every component in the LSR, along with its traceability to its current state. Every component is assessed for its fit in the future state architecture and moved to the suitable architecture layer. For example, business logic trapped in web pages can be moved to an appropriate RESTful application programming interface (API) in the new, modern architecture, making the logic easier to maintain and reuse and improving the user experience. This enables business transformation and enhancement rather than a plain lift and shift. The traceability exercise is performed to ensure comprehensive migration.
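A schematic sketch of that refactoring, with hypothetical names, is shown below: a business rule embedded in page-rendering code is lifted into a standalone function that a RESTful endpoint, or any other consumer, can call and test independently.

```python
# Schematic sketch of the refactoring described above, with hypothetical names:
# a business rule embedded in page-rendering code is lifted into a standalone
# function that a RESTful endpoint (or any other consumer) can call and test.

# Before: rule trapped inside the web page handler (hard to reuse or test).
def render_preferences_page(member):
    channel = "mail" if not member.get("delivery_channel") else member["delivery_channel"]
    return f"<html><body>Delivery: {channel}</body></html>"

# After: rule extracted to the service layer...
def resolve_delivery_channel(member):
    """Business rule: default the delivery channel to 'mail' when unset."""
    return member.get("delivery_channel") or "mail"

# ...so a REST handler becomes a thin adapter over the reusable function.
def get_preferences_api(member):
    return {"delivery_channel": resolve_delivery_channel(member)}

print(render_preferences_page({"name": "A"}))
print(get_preferences_api({"name": "A"}))
```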

Benefits
The case study exhibits the numerous benefits of the white box approach, including:

  • Identification of where a change in the system can be performed the most efficiently and safely and be best integrated into existing business capabilities
  • Assessment of impact to all system components, interfaces and data elements 
  • Traceability to a new system to ensure that no business rule is inadvertently left behind
  • Determination of the sequence and logical grouping of components that need to be changed
  • Identification of opportunities for secure enclaves based on actual data flows to enhance security
  • Demonstration of the impact to downstream systems across multiple interfaces 
  • Prediction of effort and time and, thereby, resource needs
  • Accurate and definite selection of elements of the approved software development life cycle management standard operating procedure (SOP) of the organization, reducing duplication of work and improving team efficiency18
  • Creation of maps with cyclomatic flow, resulting in comprehensive test coverage, including test cases for unit testing and integration testing and for business simulation and user acceptance (a sketch of path enumeration from a control flow graph follows this list)
  • Selection of test data that exercises specific branches of code so that all the code is covered (further improving test coverage)
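The following minimal sketch shows how test paths and McCabe’s cyclomatic complexity can be derived from a control flow graph of the kind captured in the LSR; the graph itself is illustrative only.

```python
# Minimal sketch: enumerate acyclic paths through a control flow graph and
# compute McCabe's cyclomatic complexity V(G) = E - N + 2 for a single graph.
# The graph below (decision nodes and branches) is illustrative only.
CFG = {
    "entry": ["check_eligibility"],
    "check_eligibility": ["price_claim", "reject"],
    "price_claim": ["apply_copay", "apply_deductible"],
    "apply_copay": ["exit"],
    "apply_deductible": ["exit"],
    "reject": ["exit"],
    "exit": [],
}

def all_paths(graph, node="entry", path=None):
    path = (path or []) + [node]
    if not graph[node]:
        return [path]
    paths = []
    for nxt in graph[node]:
        paths.extend(all_paths(graph, nxt, path))
    return paths

edges = sum(len(callees) for callees in CFG.values())
nodes = len(CFG)
print("cyclomatic complexity:", edges - nodes + 2)   # 8 - 7 + 2 = 3
for p in all_paths(CFG):
    print(" -> ".join(p))                            # three candidate test paths
```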

It is important to measure and understand the impact of these incremental capabilities on effort, cost, elapsed time, resource requirements and SME dependence. An improvement in the accuracy of the migrated system translates into a better patient experience and eliminates disruptions that can have clinical impacts, by giving users (i.e., patients and physicians) faster access to prescriptions and accurate fulfillment and delivery of medicines.

Results
The patient preferences function within the service delivery system for pharmacy benefit management was replatformed using both the white box approach and the traditional approach to support an objective comparison between the two. The white box approach resulted in a zero-defect release because all scenarios of the system’s behavior were modeled and exercised. The application of a large number of controls eliminated disruption during reengineering and in system connectivity to downstream applications.

Although improved accuracy was the primary objective, a number of other improvements also occurred using the white box approach, compared to the traditional approach to system migration, including:

  • The number of person days required to analyze the existing code base was reduced from 55 to 10.
  • The number of person hours to migrate the functionality was reduced from 4,629 to 1,612.
    • The project team size was reduced by 50 percent (from 14 to 7).
    • The cycle time was reduced by about 25 percent (from two months to one and a half months).
  • There was a reduction of about 80 percent in person hours of SME time.
  • The white box approach determined that a target of 90 percent code coverage was required based on dependencies. All branches were exercised in testing.
  • Cycle time was reduced by more than 40 percent. 

Implementing COBIT Controls

The white box approach enables the implementation of specific controls that improve an organization’s ability to govern, plan for an optimal outcome, build accurately, and measure the effectiveness and efficiency of delivery. Specifically, COBIT can be applied alongside the white box approach through a recommended set of objectives.19

Planning
The first step to implement a controls regime that ensures accuracy and predictability is to plan for the optimal outcome while eliminating risk (figure 10).


Implementation and Monitoring
The next step is to apply corresponding controls during the build and monitor phases (figure 11).

These control objectives are implemented in a highly transparent, data-driven and, therefore, audit-friendly manner. The controls not only help optimize outcomes, but they also demonstrate the underlying drivers for decision making. They thereby improve the system’s recovery and resilience; when the situation changes, those underlying factors can be updated to rapidly produce a new set of designs, plans and control regimes.

Conclusion: Delivery of Incremental Value

With today’s computing capability and automated discovery tools, it is possible to create a white box, closed-loop methodology that fully realizes the promise of control frameworks such as COBIT. Once business and nonfunctional requirements are fully described in an LSR, much of the development and migration work can be automated, with all the benefits that accrue when manual effort is replaced with technology, such as better accuracy, lower cost and faster turnaround. White box methods can reduce dependence on scarce resources of all types (technical and business), shorten cycle times significantly and eliminate technical defects prior to release.

In addition, once a system is fully characterized in an LSR, “what if” and other analyses can be used to assess the impact of new requirements, service refactoring and migration to new platforms. White box engineering is a valuable process that can be used to fulfill the long-held promise of deterministic systems, and the impact of changes across interconnected systems is no longer a mystery.

Furthermore, the LSR enables more advanced business capabilities. In the system discussed herein, the LSR allowed for deeper analysis of, and traceability constructs for, how each claim was processed through the underlying systems, including adjudication. This capability, termed “claims neurology,” allowed long short-term memory (LSTM) deep learning neural networks to be used to effectively predict the margin performance of future claims.
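As a schematic sketch only, and not the authors’ model, the following shows the shape of such an approach: an LSTM reads a sequence of per-claim processing events and regresses a margin value. The feature layout, dimensions and data are placeholders.

```python
# Schematic sketch only (not the authors' model): an LSTM that reads a sequence
# of per-claim processing events and regresses a margin value. Feature layout,
# dimensions and input data are placeholders.
import torch
import torch.nn as nn

class ClaimMarginLSTM(nn.Module):
    def __init__(self, n_features=16, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # predicted margin per claim

    def forward(self, x):                  # x: (batch, steps, n_features)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1]).squeeze(-1)

model = ClaimMarginLSTM()
dummy_claims = torch.randn(8, 20, 16)      # 8 claims, 20 processing events each
print(model(dummy_claims).shape)           # torch.Size([8])
```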

Endnotes

1 Turing, A. M.; “On Computable Numbers, With an Application to the Entscheidungsproblem,” Proceedings of the London Mathematical Society, 12 November 1936, https://londmathsoc.onlinelibrary.wiley.com/doi/pdf/10.1112/plms/s2-42.1.230
2 Hinde, S.; “Why Do So Many Major IT Projects Fail?” Computer Fraud and Security, January 2005, www.sciencedirect.com/science/article/abs/pii/S136137230500148X?via%3Dihub
3 Abdeen, H.; H. Sahraoui; O. Shata; “How We Design Interfaces, and How to Assess It,” 2013 Institute of Electrical and Electronics Engineers (IEEE) International Conference on Software Maintenance, Netherlands, September 2013, https://ieeexplore.ieee.org/document/6676879
4 Lee, M. C.; “Software Quality Factors and Software Quality Metrics to Enhance Software Quality Assurance,” British Journal of Applied Science and Technology, vol. 4, iss. 21, 2014, https://www.journalcjast.com/index.php/CJAST/article/download/6739/11995/
5 Ibid.
6 Johnson, J.; My Life Is Failure: 100 Things You Should Know to Be a Successful Project Leader, The Standish Group International, USA, 2006
7 Basili, V.; L. Briand; W. Melo; “A Validation of Object-Oriented Design Metrics as Quality Indicators,” Institute of Electrical and Electronics Engineers (IEEE) Transactions on Software Engineering, vol. 22, iss. 10, October 1996, https://ieeexplore.ieee.org/document/544352
8 Nagappan, N.; T. Ball; A. Zeller; “Mining Metrics to Predict Component Failures,” Proceedings of the 28th International Conference on Software Engineering, May 2006, www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-2005-149.pdf
9 Shin, Y.; L. Williams; “An Empirical Model to Predict Security Vulnerabilities Using Code Complexity Metrics,” Proceedings of the Second Association for Computing Machinery (ACM) IEEE International Symposium on Empirical Software Engineering and Measurement, 9 October 2008, https://collaboration.csc.ncsu.edu/laurie/Papers/p315-shin.pdf
10 Marois, R.; J. Ivanoff; “Capacity Limits of Information Processing in the Brain,” Trends in Cognitive Sciences, September 2005, www.sciencedirect.com/science/article/abs/pii/S1364661305001178
11 Shetty, J.; M. Anala; G. Shobha; “A Survey on Techniques of Secure Live Migration of Virtual Machine,” International Journal of Computer Applications, February 2012, https://www.researchgate.net/publication/258650676_A_Survey_on_Techniques_of_Secure_Live_Migration_of_Virtual_Machine
12 IBM, “Halstead Effort,” https://www.ibm.com/docs/en/raa/6.1?topic=metrics-halstead-effort
13 Henry, S.; D. Kafura; K. Harris; “On the Relationships Among Three Software Metrics,” Proceedings of the 1981 ACM Workshop/Symposium on Measurement and Evaluation of Software Quality, 1 January 1981, https://dl.acm.org/doi/abs/10.1145/800003.807911
14 Küster, J.; M. Abd-El-Razik; “Validation of Model Transformations—First Experiences Using a White Box Approach,” International Conference on Model Driven Engineering Languages and Systems, Springer, Germany, 2006, https://doi.org/10.1007/978-3-540-69489-2_24
15 Gulzar, M.; S. Mardani; M. Musuvathi; M. Kim; “White Box Testing of Big Data Analytics With Complex User-Defined Functions,” Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 12 August 2019, https://doi.org/10.1145/3338906.3338953
16 Singh, H.; S. Mani; D. Espadas; N. Petersen; V. Franklin; L. Petersen; “Prescription Errors and Outcomes Related to Inconsistent Information Transmitted Through Computerized Order Entry: A Prospective Study,” Archives of Internal Medicine, vol. 169, iss. 10, 25 May 2009, https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/773518
17 Chassin, M.; “Is Health Care Ready for Six Sigma Quality?” The Milbank Quarterly, 26 December 2001, https://doi.org/10.1111/1468-0009.00106
18 Bhattacharya, K.; S. Gangopadhyay; C. DeBrule; “Design of an Expert System for Decision Making in Complex Regulatory and Technology Implementation Projects,” Design for Tomorrow, vol. 3, Springer, Singapore, 6 May 2021, https://www.researchgate.net/publication/342121877_DESIGN_OF_AN_EXPERT_SYSTEM_FOR_DECISION_MAKING_IN_COMPLEX_REGULATORY_AND_TECHNOLOGY_IMPLEMENTATION_PROJECTS
19 ISACA®, COBIT® 2019 Framework: Governance and Management Objectives, USA, 2018, https://www.isaca.org/resources/cobit

Sandipan Gangopadhyay, CGEIT

Is president and chief operations officer of GalaxE.Solutions. He is a member of ISACA®, the Institute of Electrical and Electronics Engineers (IEEE), and the Indian Institute of Chemical Engineers. He can be reached at sandipan@galaxe.com.

Stuart McGuigan

Has held executive positions in business and IT, including chief information officer of Liberty Mutual, CVS Health, Johnson and Johnson, and the US Department of State. He has led major IT transformation initiatives at those organizations in partnership with business and regulatory stakeholders. He can be reached at stmimc@gmail.com.

Vijay Chakravarthy

Is senior vice president of technology at GalaxE.Solutions. He is a certified enterprise architect and is the head of a strategic business unit focused on software engineering of medical and pharmacy systems. He can be reached at vchakravarthy@galaxe.com.

Dheeraj Misra

Is chief technology officer and chief automation officer at GalaxE.Solutions. He is responsible for the automation platforms and playbooks such as GxRay, GxMaps and GxInfra, used to implement the automation factory. He can be reached at dmisra@galaxe.com.

Sumit Tyagi

Is program manager at GalaxE.Solutions. He is responsible for the GxDash and GxMaps platforms that implement the logical system repository. He can be reached at sutyagi@galaxe.com.