Top 11 KPIs for Engineering Teams

October 8, 2024

What is a software engineering KPI?

Measuring software development is no easy feat; in fact, determining the relevant areas to track and the metrics to measure can be a significant challenge for software engineering leaders. Engineering KPIs (key performance indicators) provide leaders with an indication of how well their teams are progressing with specific goals. Engineering KPIs are different to engineering metrics (e.g., the number of bugs reported), as they’re about tying specific metrics to a goal or objective.

When tailored for development teams and implemented effectively, KPIs can be leveraged to optimize workflows and determine the effectiveness of engineering teams and software development tools. KPIs can help engineering leaders demonstrate to their CEOs and other executives that their investments are paying off. They can also be used to track progress and prioritize goals.

Types of engineering KPIs

There are different types of KPIs, all of which provide value to engineering leaders and teams. These include business impact, system performance, and engineering effectiveness.

Business impact KPIs

How does your engineering team’s work impact the business? This set of KPIs encompasses areas such as investment, project timelines, adoption and time to market. They focus on how engineering work aligns with business goals, providing engineering leaders with a better idea of how well their investments in people and tools are faring. Stakeholders will want to be assured that engineering investment is justified by seeing a ROI.

System performance KPIs

System performance KPIs provide an insight into the reliability, quality, speed, and scalability of systems. If the underlying metrics are healthy, the chances are that users will be generally satisfied with the overall performance.

System performance KPIs may also be used to measure performance against service level agreements (SLAs), but engineering teams use them regardless to track performance for continuous improvement.

Engineering effectiveness KPIs

Engineering effectiveness is about removing friction and improving cycle times across value streams. It encompasses developer productivity, developer experience, satisfaction, and retention. The idea behind this set of KPIs is that productivity is not just about an individual’s contribution — it is about the investment made in the tools, processes and people that they work with, and more importantly, how any friction points can be improved. 

Research by GitHub and DX found that happy developers are more productive and are likely to contribute more in boosting innovation. Happier developers are more likely to stay put, too, helping an engineering organization to retain its most valuable assets.

Top 11 software engineering KPIs 

There are numerous KPIs related to software engineering and platform engineering, but here are eleven of the top KPIs that you should be aware of if you’re working in engineering.

1. Project return on investment (ROI)

Perhaps the most important measure for engineering leaders is a project’s return on investment (ROI). This measures the return produced by a project against the investment required for the project. It is calculated by dividing the net benefit of the investment (total benefit minus cost) by its initial cost. For example, a project to standardize the creation and management of developer environments may have cost $50k, while the savings from cleaning up unused ephemeral environments could be $200k. The net benefit is therefore $150k, and the ROI is $150k / $50k = 3, or 300%.
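The ROI arithmetic can be sketched in a few lines of Python. This is a minimal illustration using the $50k cost and $200k savings figures from the example; the function name `project_roi` is hypothetical.

```python
def project_roi(total_benefit: float, cost: float) -> float:
    """Return ROI as a ratio: net benefit divided by the initial cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    net_benefit = total_benefit - cost
    return net_benefit / cost


# Figures from the developer-environment example: $200k saved, $50k spent.
roi = project_roi(total_benefit=200_000, cost=50_000)
print(f"Net benefit: ${200_000 - 50_000:,}")
print(f"ROI: {roi:.0%}")
```

With these inputs the net benefit is $150k and the ratio is 3.0, which is usually reported as a 300% ROI.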

The difficulty with this metric is often quantifying the net project benefits, as they are not all necessarily financial. For instance, the value of a new feature that an existing user loves is difficult to quantify. The new feature is likely to help retain the customer, meaning there is an ROI. This is where you can use qualitative data, such as an in-product survey attached to the new feature that asks users how much likelier they are to continue using the application because of it, or how much happier they are with the application as a result.

Project ROI can provide a better understanding of an engineering project’s resource allocation (time, developers, investment in tools or processes). It would be of interest to numerous roles, including the VP of engineering and, depending on the project’s domain, product managers and AppSec managers.

2. Planning accuracy

While it makes sense to have a traditional project timeline in place, there needs to be a way for engineering managers to calculate how accurate these plans are.

To do so, you can first calculate schedule variance by subtracting the planned completion date from the actual completion date. For example, if a project is due on September 27 but the team finishes it on September 29, the variance is two days. This gives managers an indication of how accurate their plans are, and gives teams data they can use to better predict project timelines in the future.

As schedule variance data becomes more reliable, teams can measure planning accuracy as the ratio of the number of projects on track to the total number of projects planned in a specific iteration. So if 10 projects were on track out of 20 that were planned, the ratio is 1:2, or 50%. This KPI helps to ensure that the actual status of work stays aligned with the roadmap.
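As a rough sketch, both calculations fit in a few lines of Python. The function names are hypothetical, and the year 2024 is an assumption added only so the example dates are complete.

```python
from datetime import date


def schedule_variance_days(planned: date, actual: date) -> int:
    """Days late (positive) or early (negative) relative to the plan."""
    return (actual - planned).days


def planning_accuracy(on_track: int, planned: int) -> float:
    """Share of planned projects that are on track, as a ratio from 0 to 1."""
    if planned <= 0:
        raise ValueError("planned must be positive")
    return on_track / planned


# Due September 27, finished September 29 (year assumed for illustration).
variance = schedule_variance_days(date(2024, 9, 27), date(2024, 9, 29))
accuracy = planning_accuracy(on_track=10, planned=20)
print(f"Schedule variance: {variance} days")
print(f"Planning accuracy: {accuracy:.0%}")
```

A positive variance means the project finished late; a negative one means it finished ahead of plan.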

3. Time to market

Time to market refers to the period from the start of development to the release date of a feature or product. This incorporates all stages, including planning, design, development and release. The actual starting point may differ between organizations, but keeping a formal starting point (e.g., formal sign-off on the project or a project kick-off meeting) can help organizations to determine the efficiency of their development process.

For instance, if the concept of a new feature was first discussed at a kick-off on February 2 and it was released on June 2, it has a time to market of four months.
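A minimal sketch of this calculation, assuming the kick-off and release dates are recorded somewhere you can query. The helper name `whole_months_between` is hypothetical, and the year 2024 is an assumption added to make the dates concrete.

```python
from datetime import date


def whole_months_between(start: date, end: date) -> int:
    """Count complete calendar months elapsed between two dates."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


kick_off = date(2024, 2, 2)   # formal starting point
released = date(2024, 6, 2)   # release date

print(f"Time to market: {whole_months_between(kick_off, released)} months")
print(f"Time to market: {(released - kick_off).days} days")
```

Tracking the value in days as well as months makes it easier to compare features whose timelines differ by only a week or two.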

Metrics such as deployment frequency, lead time, cycle time and PR open-to-close can be tied to this KPI in an OKR framework. Any action to deliver better user experiences, such as implementing self-service actions, may help move the needle with these metrics, and therefore help to accomplish the end goal, which is faster time to market.

4. Mean Time To Recovery (MTTR)

Mean time to recovery (MTTR) refers to the average time it takes to restore normal operations after a product or system failure. For time to recovery (TTR), leaders assess the duration required to deploy a patch after a problem is reported, with the goal of returning the system to full functionality. MTTR represents the average of these recovery times across all failures. A longer MTTR may indicate lower code quality or inefficiencies in processes.

So if a number of incidents took place over a three month period, taking 2, 4, 5, and 8 days respectively to resolve, then the MTTR would be 4.75 days. 
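The same average can be computed directly from a list of recovery times. This is a minimal sketch using the four incident durations from the example; the function name is hypothetical.

```python
def mean_time_to_recovery(recovery_times_days: list[float]) -> float:
    """Average recovery time across all incidents in the period."""
    if not recovery_times_days:
        raise ValueError("at least one incident is required")
    return sum(recovery_times_days) / len(recovery_times_days)


# Recovery times, in days, for the incidents in the three-month period.
incidents = [2, 4, 5, 8]
mttr = mean_time_to_recovery(incidents)
print(f"MTTR: {mttr} days")
```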

While MTTR is a key metric for incident management, there are other metrics you can track to help you reduce it, for example, incidents created versus resolved over time and total open incidents over time. MTTR is a metric that almost everyone in engineering, from engineer to SRE to VP of engineering, will use to keep track of how long it takes to resolve incidents.

5. Engineering Net Promoter Score (eNPS)

While a Net Promoter Score (NPS) is a metric used to gauge how likely customers are to recommend a business to other prospective customers, an employee Net Promoter Score (eNPS) is a metric used to gauge how likely employees are to recommend their company as a place of work to other people. Within engineering, this is further nuanced, as it doesn’t just relate to the company’s benefits and work environment but also to the tech stack, engineering standards (both clarity on the standards and how they are enforced/tracked) and the use of emerging and innovative technologies.

The best way of measuring developer satisfaction is to regularly conduct surveys. Organizations can ask about areas such as:

  • the way they interact with their co-workers
  • the way problems are dealt with
  • the way tasks are prioritized and organized
  • how they feel about working on the code
  • how they feel about the definition and enforcement of standards
  • their thoughts on the technology stack.

An example of executing this could be as follows. Once a quarter, ask engineers how likely they are to recommend the company to another engineer as a place to work. Use a scale of 1-10, and allow them to elaborate on why they chose that number. Then, once a month, pick a number of the more detailed questions based on OKRs set by the organization and ask them on a scale of 1 to 5, where scores of 1 to 3 count as detractors and scores of 4 to 5 count as promoters. If an OKR is to enhance standards compliance, you could set a goal of a 100% NPS for “I trust the way our standards are defined and enforced”. By asking engineers why they picked the score they did, you can get clarity on what may need to be done to hit your goal of enhancing standards compliance. This should be used in conjunction with scorecards that can help you track your standards compliance.
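One way to score the monthly 1-to-5 questions is sketched below. It assumes the common NPS convention of subtracting the percentage of detractors from the percentage of promoters, which the text above does not spell out; the function name and the sample responses are hypothetical.

```python
def nps(scores: list[int], promoter_min: int = 4) -> float:
    """Percentage of promoters minus percentage of detractors.

    On the 1-to-5 scale described above, 4-5 are promoters and 1-3 are
    detractors, so every response falls into one of the two groups.
    """
    if not scores:
        raise ValueError("at least one response is required")
    promoters = sum(1 for score in scores if score >= promoter_min)
    detractors = len(scores) - promoters
    return 100 * (promoters - detractors) / len(scores)


# Hypothetical responses to "I trust the way our standards are defined and enforced".
responses = [5, 4, 4, 3, 2, 5, 4, 1]
print(f"NPS: {nps(responses):.0f}")
```

Under this convention, a score of 100 is only reached when every respondent answers 4 or 5, which is what a "100% NPS" goal would require.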

An engineering NPS is useful for engineering leaders - from VP of engineering to platform engineering team leads and SRE team leads. 

6. Deployment frequency 

Deployment frequency measures how often code changes are deployed to production, reflecting a team's agility and efficiency. It is calculated by counting deployments over a set time period, such as per day or week. For example, if a team deploys 3, 7, 6, and 8 times in four weeks, their average weekly deployment frequency is six. High-performing teams typically deploy multiple times a day, while lower performers deploy less frequently. Tracking this metric helps identify areas for improvement in developer productivity and processes. Engineering leaders are most likely to use deployment frequency to determine which engineering teams are excelling - and then try to use their best practices across other teams.
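A minimal sketch of the weekly average, using the four weekly counts from the example. The function name is hypothetical; in practice the counts would come from your CI/CD or deployment tooling.

```python
def average_deployment_frequency(deployments_per_period: list[int]) -> float:
    """Average number of production deployments per period (here, per week)."""
    if not deployments_per_period:
        raise ValueError("at least one period is required")
    return sum(deployments_per_period) / len(deployments_per_period)


weekly_deployments = [3, 7, 6, 8]
frequency = average_deployment_frequency(weekly_deployments)
print(f"Average weekly deployment frequency: {frequency}")
```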

7. Production readiness

The definition of production readiness may vary from team to team. Setting criteria for what production ready means for your team, and then reviewing items such as ownership, reliability, security, monitoring, and quality are key to accomplishing this goal. That means being able to answer questions such as, “is every software asset clearly owned and maintained by a designated team?” and, “are all security best practices being followed?” through data that can be visualized for clarity. Using scorecards is one approach to answering these questions. 

You should be able to track trends over time and drill down into areas that need improvement. This is just one example of a standards-based metric which can in itself be a KPI, or can contribute to a wider objective of standards compliance. Production readiness is predominantly used by software engineers and site reliability engineers (SREs).

8. Change failure rate (CFR)

Change failure rate (CFR) is another DORA metric that measures the percentage of changes made in a production environment that result in a negative impact, such as downtime, incidents or errors. 

CFR is calculated as the ratio of failed changes to the total number of changes deployed (number of failed changes / total number of changes), usually expressed as a percentage. A low CFR (ideally less than 5%) indicates a more reliable system.
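As a sketch, the calculation and the 5% target could be expressed like this. The change counts are hypothetical figures chosen for illustration, as is the function name.

```python
def change_failure_rate(failed_changes: int, total_changes: int) -> float:
    """Percentage of production changes that caused a failure."""
    if total_changes <= 0:
        raise ValueError("total_changes must be positive")
    return 100 * failed_changes / total_changes


# Hypothetical month: 80 production changes, 3 of which caused an incident.
cfr = change_failure_rate(failed_changes=3, total_changes=80)
print(f"CFR: {cfr}%")
print("Within the 5% target" if cfr < 5 else "Above the 5% target")
```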

9. Code quality

Code quality metrics provide insights into whether the codebase meets or falls short of the engineering team's standards and expectations.

Code quality can be determined using numerous metrics including:

  • Code churn
  • Average code review time
  • Code coverage
  • Code duplication
  • Technical debt

Improving code quality can be one way to help teams that are striving to meet service-level agreements (SLAs), but it is also a KPI that helps determine developer effectiveness, improve overall software quality, and contribute to faster development cycles. Engineering leaders could use scorecards to determine if the codebase meets standards by incorporating each of these metrics and establishing a threshold for the metric to be ‘good’, ‘neutral’ or ‘bad’. 
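A threshold-based rating of this kind could be sketched as follows. All thresholds and metric values here are hypothetical placeholders, not recommended targets, and the function is an illustration rather than how any particular scorecard tool works.

```python
def rate_metric(value: float, good: float, bad: float,
                higher_is_better: bool = True) -> str:
    """Classify a metric as 'good', 'neutral' or 'bad' against two thresholds."""
    if not higher_is_better:
        # Flip the sign so the same comparisons work for "lower is better".
        value, good, bad = -value, -good, -bad
    if value >= good:
        return "good"
    if value <= bad:
        return "bad"
    return "neutral"


# Hypothetical thresholds: coverage in %, duplication in %, review time in hours.
scorecard = {
    "code coverage": rate_metric(82, good=80, bad=60),
    "code duplication": rate_metric(7, good=3, bad=10, higher_is_better=False),
    "average code review time": rate_metric(30, good=8, bad=24, higher_is_better=False),
}
for metric, rating in scorecard.items():
    print(f"{metric}: {rating}")
```

Keeping the thresholds as explicit parameters lets each team agree on its own definition of "good" before the ratings are rolled up.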

10. Feature adoption

The rate of feature adoption shows how relevant and useful new functionalities are for users. It is measured by dividing the number of users using the new feature by the total number of users, and multiplying by 100 to get a percentage. 
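The formula is simple enough to sketch directly. The user counts below are hypothetical, as is the function name.

```python
def feature_adoption_rate(feature_users: int, total_users: int) -> float:
    """Percentage of all users who have used the new feature."""
    if total_users <= 0:
        raise ValueError("total_users must be positive")
    return 100 * feature_users / total_users


# Hypothetical figures: 1,200 of 4,800 active users tried the feature.
adoption = feature_adoption_rate(feature_users=1_200, total_users=4_800)
print(f"Feature adoption: {adoption}%")
```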

A user base that adopts new features quickly suggests that the engineering team understands (and perhaps even listens to) its core users. It also suggests that the quality of the software is good, as users are getting value out of the feature immediately.

11. Cloud costs 

With cloud costs skyrocketing, engineering teams have to be on guard to ensure they’re spending efficiently. Some of the metrics you can track to help reduce cloud costs include:

  • Infrastructure costs over time (per resource, per team and per business domain) 
  • Ephemeral environment costs
  • Cost per environment (Dev vs Prod)

One of the challenges is to ensure you relate resources and their costs so you can easily identify where to adjust spending for better financial efficiency. 
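Once each resource is tagged with its owner and environment, the breakdowns listed above are straightforward aggregations. The sketch below assumes a hypothetical list of cost records; the resource names, teams and amounts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical records: (resource, team, environment, monthly cost in USD).
cost_records = [
    ("cart-db", "checkout", "prod", 1200),
    ("cart-db-preview", "checkout", "ephemeral", 300),
    ("products-api", "catalog", "prod", 800),
    ("products-api-dev", "catalog", "dev", 150),
    ("feature-env-42", "catalog", "ephemeral", 250),
]


def total_by(records, field_index):
    """Sum monthly cost grouped by the field at the given tuple position."""
    totals = defaultdict(int)
    for record in records:
        totals[record[field_index]] += record[3]
    return dict(totals)


cost_per_team = total_by(cost_records, 1)
cost_per_environment = total_by(cost_records, 2)
print("Per team:", cost_per_team)
print("Per environment:", cost_per_environment)
```

The grouping only works if every resource carries its team and environment, which is why relating resources to their costs is the hard part.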

Using an internal developer portal to improve engineering KPIs

An internal developer portal can greatly enhance how organizations track and act on engineering metrics by consolidating them in one place. The portal’s software catalog integrates data across the software development lifecycle about microservices, APIs, deployments, compliance, incidents, and more. 

By aggregating this information, the portal reveals connections between metrics that might otherwise go unnoticed, such as how a higher deployment frequency relates to an increase in incidents. This could indicate that the team is focusing more on speed rather than quality, which could subsequently lead to more bugs and errors in production.

The software catalog enables you to bring business context to your existing software development lifecycle (SDLC) information. Extending the data model of the SDLC with assets like feature, sprint, epic and product allows you to track metrics such as time-to-market, planning accuracy, and more.

What’s more, the software catalog helps you quickly assess the impact of any drops in key metrics, allowing you to pinpoint which product lines or customers are affected, so you can respond with precise and targeted actions.

The portal’s scorecards help engineering teams to define and track standards in areas such as production readiness, AppSec, cloud costs, and code quality, helping to identify non-compliance. An internal developer portal is the ideal place to establish working agreements, giving developers a clear productivity goal to work towards — whether it’s standards-related or productivity-focused. 

The portal goes beyond an analytics interface and reports — it allows managers to deploy continuous improvement initiatives and keep track of them.

Engineers can also drive change using the portal. For instance, if PR review times are too long, they can create automations to nudge reviewers, and if MTTR is high, they can use self-service actions to automate hot fixes. 

By providing a unified view of both software engineering intelligence metrics, such as metrics within developer productivity and developer experience, and standards-based metrics, engineering leaders can use the portal’s capabilities to deliver better user experiences, which can help them accomplish their objective. 

A step-by-step guide of using a portal to improve engineering metrics

1. Engineering leaders can tie objectives to metrics. The internal developer portal provides a number of capabilities that can help leaders to make an impact on these metrics, to ultimately meet their objectives.

This diagram shows how engineering metrics can be improved by the capabilities of an internal developer portal

2. If a leader is trying to improve feature time to market, they may be tracking metrics like deployment frequency. The portal can show the average deployment frequency by value or in a graph form.

The portal enables users to keep track of metrics such as deployment frequency

3. To move the needle with this metric, users can attempt to speed up processes while remaining compliant with standards. The portal enables users to create an end-to-end experience to improve deployment frequency. For instance, if you break down the process of bringing a feature to production into smaller steps, you can use the portal to optimize each step.

Examples include enabling users to scaffold a new service through a self-service action, where all developers have to do is fill in a form.

The portal enables engineering teams to create self-service actions with guardrails built-in, providing developers with more autonomy

Then, in order to streamline the entire process you could use self-service actions to spin up a developer environment, send a reminder to review a PR, or rollback to a previous version. 

In addition, the portal’s scorecards can be used to track working agreements. In this example, deployment frequency is one of a number of metrics used to determine if a team is meeting expectations with regard to DORA metrics.

Define working agreements and track them directly in the portal

Learn more about how you can use an internal developer portal for your engineering metrics and KPIs using Port’s Insights. See what it’s like to use Port yourself by playing with our live demo. Or sign up to use Port for free here.

Order Domain

{
  "properties": {},
  "relations": {},
  "title": "Orders",
  "identifier": "Orders"
}

Cart System

{
  "properties": {},
  "relations": {
    "domain": "Orders"
  },
  "identifier": "Cart",
  "title": "Cart"
}

Products System

{
  "properties": {},
  "relations": {
    "domain": "Orders"
  },
  "identifier": "Products",
  "title": "Products"
}

Cart Resource

{
  "properties": {
    "type": "postgress"
  },
  "relations": {},
  "icon": "GPU",
  "title": "Cart SQL database",
  "identifier": "cart-sql-sb"
}

Cart API

{
 "identifier": "CartAPI",
 "title": "Cart API",
 "blueprint": "API",
 "properties": {
   "type": "Open API"
 },
 "relations": {
   "provider": "CartService"
 },
 "icon": "Link"
}

Core Kafka Library

{
  "properties": {
    "type": "library"
  },
  "relations": {
    "system": "Cart"
  },
  "title": "Core Kafka Library",
  "identifier": "CoreKafkaLibrary"
}

Core Payment Library

{
  "properties": {
    "type": "library"
  },
  "relations": {
    "system": "Cart"
  },
  "title": "Core Payment Library",
  "identifier": "CorePaymentLibrary"
}

Cart Service JSON

{
 "identifier": "CartService",
 "title": "Cart Service",
 "blueprint": "Component",
 "properties": {
   "type": "service"
 },
 "relations": {
   "system": "Cart",
   "resources": [
     "cart-sql-sb"
   ],
   "consumesApi": [],
   "components": [
     "CorePaymentLibrary",
     "CoreKafkaLibrary"
   ]
 },
 "icon": "Cloud"
}

Products Service JSON

{
  "identifier": "ProductsService",
  "title": "Products Service",
  "blueprint": "Component",
  "properties": {
    "type": "service"
  },
  "relations": {
    "system": "Products",
    "consumesApi": [
      "CartAPI"
    ],
    "components": []
  }
}

Component Blueprint

{
 "identifier": "Component",
 "title": "Component",
 "icon": "Cloud",
 "schema": {
   "properties": {
     "type": {
       "enum": [
         "service",
         "library"
       ],
       "icon": "Docs",
       "type": "string",
       "enumColors": {
         "service": "blue",
         "library": "green"
       }
     }
   },
   "required": []
 },
 "mirrorProperties": {},
 "formulaProperties": {},
 "calculationProperties": {},
 "relations": {
   "system": {
     "target": "System",
     "required": false,
     "many": false
   },
   "resources": {
     "target": "Resource",
     "required": false,
     "many": true
   },
   "consumesApi": {
     "target": "API",
     "required": false,
     "many": true
   },
   "components": {
     "target": "Component",
     "required": false,
     "many": true
   },
   "providesApi": {
     "target": "API",
     "required": false,
     "many": false
   }
 }
}

Resource Blueprint

{
 "identifier": "Resource",
 "title": "Resource",
 "icon": "DevopsTool",
 "schema": {
   "properties": {
     "type": {
       "enum": [
         "postgress",
         "kafka-topic",
         "rabbit-queue",
         "s3-bucket"
       ],
       "icon": "Docs",
       "type": "string"
     }
   },
   "required": []
 },
 "mirrorProperties": {},
 "formulaProperties": {},
 "calculationProperties": {},
 "relations": {}
}

API Blueprint

{
 "identifier": "API",
 "title": "API",
 "icon": "Link",
 "schema": {
   "properties": {
     "type": {
       "type": "string",
       "enum": [
         "Open API",
         "grpc"
       ]
     }
   },
   "required": []
 },
 "mirrorProperties": {},
 "formulaProperties": {},
 "calculationProperties": {},
 "relations": {
   "provider": {
     "target": "Component",
     "required": true,
     "many": false
   }
 }
}

Domain Blueprint

{
 "identifier": "Domain",
 "title": "Domain",
 "icon": "Server",
 "schema": {
   "properties": {},
   "required": []
 },
 "mirrorProperties": {},
 "formulaProperties": {},
 "calculationProperties": {},
 "relations": {}
}

System Blueprint

{
 "identifier": "System",
 "title": "System",
 "icon": "DevopsTool",
 "schema": {
   "properties": {},
   "required": []
 },
 "mirrorProperties": {},
 "formulaProperties": {},
 "calculationProperties": {},
 "relations": {
   "domain": {
     "target": "Domain",
     "required": true,
     "many": false
   }
 }
}

Microservices SDLC

  • Scaffold a new microservice

  • Deploy (canary or blue-green)

  • Feature flagging

  • Revert

  • Lock deployments

  • Add Secret

  • Force merge pull request (skip tests on crises)

  • Add environment variable to service

  • Add IaC to the service

  • Upgrade package version

Development environments

  • Spin up a developer environment for 5 days

  • ETL mock data to environment

  • Invite developer to the environment

  • Extend TTL by 3 days

Cloud resources

  • Provision a cloud resource

  • Modify a cloud resource

  • Get permissions to access cloud resource

SRE actions

  • Update pod count

  • Update auto-scaling group

  • Execute incident response runbook automation

Data Engineering

  • Add / Remove / Update Column to table

  • Run Airflow DAG

  • Duplicate table

Backoffice

  • Change customer configuration

  • Update customer software version

  • Upgrade - Downgrade plan tier

  • Create - Delete customer

Machine learning actions

  • Train model

  • Pre-process dataset

  • Deploy

  • A/B testing traffic route

  • Revert

  • Spin up remote Jupyter notebook

Engineering tools

  • Observability

  • Tasks management

  • CI/CD

  • On-Call management

  • Troubleshooting tools

  • DevSecOps

  • Runbooks

Infrastructure

  • Cloud Resources

  • K8S

  • Containers & Serverless

  • IaC

  • Databases

  • Environments

  • Regions

Software and more

  • Microservices

  • Docker Images

  • Docs

  • APIs

  • 3rd parties

  • Runbooks

  • Cron jobs
