Tracking What Matters: KPIs and Dashboards
April 10, 2025

In our previous chapter, Designing for Testability: Breaking the Next Barrier, we explored how testability became a key focus after resolving initial bottlenecks in team setup and manual testing. That chapter marked a turning point in our DevOps transformation, as we shifted attention to deeper software design principles that enabled smoother, faster testing.

Now in Chapter 4, we move into a critical next phase: defining and tracking the right KPIs and setting up dashboards to measure our progress. With foundational practices in place and testability improving, we needed objective, data-driven ways to visualize flow, identify constraints, and continuously improve.

Why KPIs and Dashboards Matter in DevOps

As we progressed in our journey to streamline end-to-end delivery, we recognized the need for a visual and objective mechanism to track work and uncover bottlenecks. This meant defining KPIs that weren’t just about numbers—but about highlighting actionable insights and driving the right behaviors.

When setting up KPIs, it’s essential to recognize that teams and organizations tend to optimize the metrics they monitor. This aligns with Goodhart’s Law, which states:

"When a measure becomes a target, it ceases to be a good measure."

This principle underscores the importance of selecting KPIs that guide teams toward the right outcomes rather than encouraging them to simply hit targets. The purpose of these KPIs is to measure progress toward our overarching goal. But what exactly is that goal?

Defining the Goal

As discussed in The Goal by Eliyahu M. Goldratt, the goal of a business is to increase net profit while simultaneously improving ROI and cash flow. Translating this to our software development context, where we are currently in the investment phase, we identified the goal as:

 “To develop and deliver a marketable software product that generates sustainable revenue.”

Achieving this goal requires meeting several necessary conditions, including:

  • Time to Market → Delivering features and products quickly to capture market opportunities.
  • Future Scalability → Ensuring that the software is designed to grow with demand without compromising performance.
  • Cost Optimization → Balancing development expenses while maximizing value delivery.
  • Return on Investment (ROI) → Ensuring that the product delivers long-term financial benefits that justify the investment.

These conditions guided how we defined our KPIs, focusing not just on speed and efficiency, but also on building a scalable, cost-effective product that delivers long-term business value.

Visualizing the Flow: Setting Up Dashboards

As outlined in Accelerate by Nicole Forsgren, Jez Humble, and Gene Kim, our goal was to set up a dashboard that could visualize and track the four key DevOps metrics:

  • Deployment Frequency → How often teams deploy code to production.
  • Lead Time for Changes → How quickly code moves from commit to production.
  • Change Failure Rate → The percentage of deployments causing failures.
  • Mean Time to Restore (MTTR) → How quickly teams recover from failures.
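These four metrics reduce to simple computations over deployment records. As a minimal sketch, assuming a hypothetical list of deployment events (the field names and figures here are illustrative, not our actual pipeline data):

```python
from datetime import datetime

# Hypothetical deployment records; fields are assumptions for illustration.
deployments = [
    {"commit": datetime(2025, 3, 1, 9),  "deploy": datetime(2025, 3, 1, 15),
     "failed": False, "restored": None},
    {"commit": datetime(2025, 3, 2, 10), "deploy": datetime(2025, 3, 3, 11),
     "failed": True,  "restored": datetime(2025, 3, 3, 13)},
    {"commit": datetime(2025, 3, 5, 8),  "deploy": datetime(2025, 3, 5, 12),
     "failed": False, "restored": None},
]

period_days = 7

# Deployment Frequency: deployments per day over the observed period
deployment_frequency = len(deployments) / period_days

# Lead Time for Changes: median hours from commit to production
lead_times = sorted((d["deploy"] - d["commit"]).total_seconds() / 3600
                    for d in deployments)
median_lead_time = lead_times[len(lead_times) // 2]

# Change Failure Rate: share of deployments that caused a failure
change_failure_rate = sum(d["failed"] for d in deployments) / len(deployments)

# Mean Time to Restore: average hours from failed deploy to restoration
restore_hours = [(d["restored"] - d["deploy"]).total_seconds() / 3600
                 for d in deployments if d["failed"]]
mttr = sum(restore_hours) / len(restore_hours)
```

The hard part in practice is not the arithmetic but capturing the underlying events consistently, which is exactly the problem described next.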

However, generating these metrics in a meaningful way was initially challenging due to a lack of systematic tracking of work across teams. This is where adopting Azure DevOps as a single, common work management solution for all teams made a significant difference.

We started by setting up team-specific dashboards to track:

  • All assigned work items per team.
  • Work currently in progress.
  • Completed work.
  • Defects categorized by stage and priority.
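Each of these dashboard tiles is essentially a count over work items grouped by team, state, or priority. A sketch of the idea, assuming a hypothetical export of work items (the schema below is illustrative, not the actual Azure DevOps field layout):

```python
from collections import Counter

# Illustrative work items; field names are assumptions for this sketch.
work_items = [
    {"team": "Platform", "type": "User Story", "state": "Active"},
    {"team": "Platform", "type": "Bug",        "state": "New",    "priority": 1},
    {"team": "Frontend", "type": "User Story", "state": "Closed"},
    {"team": "Frontend", "type": "Bug",        "state": "Active", "priority": 2},
    {"team": "Frontend", "type": "User Story", "state": "Active"},
]

# All assigned work items per team
assigned_per_team = Counter(w["team"] for w in work_items)

# Work currently in progress vs. completed
in_progress = sum(1 for w in work_items if w["state"] == "Active")
completed = sum(1 for w in work_items if w["state"] == "Closed")

# Defects categorized by priority
defects_by_priority = Counter(w["priority"] for w in work_items
                              if w["type"] == "Bug")
```

In Azure DevOps itself these counts come from query-backed dashboard widgets rather than hand-written code, but the grouping logic is the same.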

First Challenge: Inconsistent Usage

The first challenge we encountered was the inconsistent use of work item states and defect classifications, despite clear guidelines being in place. Teams applied statuses differently, which made cross-team flow analysis nearly impossible. Bringing consistency to how work was tracked became critical, ensuring that dashboards could be used not just by individual teams but for flow optimization across the organization.

We then focused on defining work items properly, ensuring they accurately reflected where work stood in the value chain while remaining generic enough to be used across multiple teams. Most tools provide default status categories, but these weren’t always sufficient. It was important to define status categories that aligned with our team structure, the way work flowed, and our known bottlenecks.
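One way to make such a definition explicit is to encode the agreed states and mark which ones count as work in progress. The state names below are hypothetical examples of value-chain-aligned categories, not the defaults of any particular tool:

```python
from enum import Enum

# Hypothetical status categories mapped to stages of the value chain.
class WorkState(Enum):
    NEW = "New"
    IN_DEVELOPMENT = "In Development"
    READY_FOR_TEST = "Ready for Test"
    IN_TEST = "In Test"
    READY_FOR_DEPLOY = "Ready for Deploy"
    DONE = "Done"

# States counted as work in progress for flow metrics; "Ready for ..."
# states are kept separate so that handover queues stay visible.
WIP_STATES = {WorkState.IN_DEVELOPMENT, WorkState.READY_FOR_TEST,
              WorkState.IN_TEST, WorkState.READY_FOR_DEPLOY}
```

Keeping the "Ready for ..." states distinct from the active ones is what later lets dashboards expose queues between teams rather than hiding them inside a single "In Progress" bucket.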

Analyzing Workflow to Identify Bottlenecks

The first dashboards were primarily set up to visualize work in progress, giving a clear picture of what was happening across teams. This included feature stories, user stories, defects, and test plans. These offered a snapshot of ongoing work, but still no clear picture of how work and value flowed overall.

Having visibility of work in progress was an important first step, but the next challenge was to connect this visibility to flow-based metrics that could highlight where work was getting stuck and where we needed to improve.

Using the status of work items, we created metrics showing the number of work items in each state. Initially, we only had counts of work items per state. More advanced analysis, such as tracking how long work items remained in each state, required additional tooling that we introduced later. At this initial stage, however, we began our analysis using only the counts.

Even with this basic data, the dashboards proved valuable in identifying key bottlenecks:

  • Too many work items in progress compared to the number of items being completed within a given period.
  • A growing backlog of new work items that far exceeded what we could realistically complete within the next 3 to 6 months.
  • Teams with the largest backlog of unfinished work items, indicating where additional focus and support were needed.

Each of these insights required a different solution, but the ability to visualize the flow of work was a major breakthrough. This improvement took the Tools aspect of our DevOps Triangle further, providing the impetus for progress in the other two aspects: Architecture and People.
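Even with counts alone, the bottleneck signals above can be turned into simple heuristics. A sketch under assumed figures and thresholds (all data and cut-offs here are illustrative, not our real targets):

```python
# Illustrative state counts and monthly throughput per team.
state_counts_per_team = {
    "Platform": {"New": 120, "Active": 35, "Closed": 10},
    "Frontend": {"New": 40,  "Active": 8,  "Closed": 12},
}
completed_per_month = {"Platform": 10, "Frontend": 12}

alerts = []
for team, counts in state_counts_per_team.items():
    # Too much WIP relative to what actually gets finished in a period
    if counts["Active"] > 2 * completed_per_month[team]:
        alerts.append((team, "excess WIP"))

    # Backlog runway: months needed to clear 'New' items at the current pace
    runway_months = counts["New"] / completed_per_month[team]
    if runway_months > 6:
        alerts.append((team, f"~{runway_months:.0f} month backlog"))
```

With this sketch, only the Platform team is flagged, on both counts: its WIP far exceeds its throughput, and its backlog would take roughly a year to clear at the current pace.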

Using KPIs to Drive Continuous Improvement

We returned to the topic of KPIs at several stages throughout our DevOps transformation. As our tracking capabilities improved, we were able to generate more detailed metrics that helped us:

  • Measure progress over time to ensure we were moving toward our goal of developing a marketable software product.
  • Identify dependencies and handovers that were causing delays in the flow of work.
  • Pinpoint areas where teams needed additional support or process improvements.
  • Align our KPIs more closely with the four key DevOps metrics from Accelerate, providing an objective measure of our success.

By aligning Tools, Architecture, and People with the right KPIs, we created a system that provided clear visibility into our progress, helped us identify and unblock bottlenecks, and ensured that we remained focused on delivering long-term business value.

What’s Next? Branching and Continuous Deployment

With our KPIs and dashboards in place, we’re now ready to optimize the flow of code through environments and into production. In Chapter 5, we’ll dive into how we approached Branching and Continuous Deployment, and the cultural and technical shifts that enabled us to deploy faster and safer.

Stay tuned!
