The Practical Guide to internal developer portals

Automations

Chapter 6

What are automations?

Internal developer portals have the potential to massively increase developer productivity, and one of the main ways they do so is through automations.

Software development workflows are complex, and developers and managers often go through multiple steps to reach their desired end result. Automations tie these steps together to provide better developer experiences and outcomes.

The role of automations is process orchestration, allowing you to design workflows that operate efficiently with minimal manual intervention, all while adhering to your organization's guardrails. 

Automations connect the different pillars of the portal: self-service, the catalog, and scorecards.

What can you do with automations?

Automate repetitive tasks

The primary advantage of automation is its capacity to remove repetitive tasks like clean-ups, permission management, and shutting down unused development environments.

Without automation, these tasks either require human supervision or rely on ad-hoc, unmanaged cron jobs or scripts, which contradicts the platform engineering principle of scalable, reusable processes.

Manual and ad-hoc approaches lead to expensive errors and security or compliance problems. For instance, forgetting to revoke access for a former employee can expose sensitive information, and neglecting to terminate a development environment can unnecessarily deplete your cloud budget.

Your developer portal, which already integrates all aspects of the software development lifecycle, is ideally suited to automate these tasks. For instance, when a developer requests a new dev environment through a self-service action, the form should require specifying a TTL for the environment. Automations can then terminate the environment once the TTL expires.
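
The TTL flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DevEnvironment` record, the `find_expired` helper, and the `terminate` placeholder are hypothetical names, not the API of any particular portal, and a real automation would call your cloud or infrastructure tooling instead of returning a string.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class DevEnvironment:
    """Hypothetical catalog entity for a self-service dev environment."""
    name: str
    owner: str
    created_at: datetime
    ttl: timedelta  # value the developer entered in the self-service form

    def expires_at(self) -> datetime:
        return self.created_at + self.ttl


def find_expired(environments: list[DevEnvironment], now: datetime) -> list[DevEnvironment]:
    """Return the environments whose TTL has elapsed at `now`."""
    return [env for env in environments if env.expires_at() <= now]


def terminate(env: DevEnvironment) -> str:
    # Placeholder (assumption): a real automation would tear down the
    # environment through your cloud provider or IaC tooling here.
    return f"terminated {env.name} (owner: {env.owner})"


# Example run with a fixed clock so the outcome is deterministic.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
environments = [
    DevEnvironment("feature-login", "alice", now - timedelta(days=3), timedelta(days=2)),
    DevEnvironment("perf-test", "bob", now - timedelta(hours=6), timedelta(days=1)),
]
results = [terminate(env) for env in find_expired(environments, now)]
```

In this run only `feature-login` has outlived its two-day TTL, so it is the only environment terminated. Running the check on a schedule keeps the clean-up inside the portal rather than in an unmanaged cron job.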

Manage notifications and alerts

Because the software catalog acts as a “single pane of glass”, it helps reduce alert fatigue by giving a clear view of how software assets are connected, who owns them, how important they are, and what they depend on. This information helps automations prioritize and route alerts properly, making it easier to focus on what's important and ignore unnecessary noise.

  1. Developer-focused alerts and notifications include:
     • Nudging developers when pull requests are taking too long to review
     • Notifying developers when their deployments have succeeded or failed
     • Alerting developers when changes are made to services or components with dependencies that matter to them
  2. Manager-focused alerts and notifications include:
     • Alerting managers when SLOs are not being met or when key metrics show a decline in service performance
     • Notifying managers when cloud costs approach a predefined threshold
     • Providing context with a link to the relevant section of the software catalog
  3. Security/SRE alerts and notifications include:
     • Notifications of critical vulnerabilities or anomalies in system behavior
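
As a rough sketch of how catalog data could drive alert routing, the Python example below looks up a service's owner, importance, and dependents before deciding who is notified and through which channel. The `Service` fields, the `CATALOG` contents, the tier scale, and the channel names (`page`, `chat`, `digest`) are all assumptions made for illustration; they do not reflect any specific portal's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical catalog entry: ownership, importance, and dependencies."""
    name: str
    owner_team: str
    tier: int  # assumption: 1 = business-critical, higher = less critical
    depends_on: list[str] = field(default_factory=list)


# Illustrative catalog contents (made up for this sketch).
CATALOG = {
    "checkout": Service("checkout", "payments", tier=1, depends_on=["inventory"]),
    "inventory": Service("inventory", "logistics", tier=2),
    "internal-wiki": Service("internal-wiki", "it", tier=3),
}


def route_alert(service_name: str, severity: str) -> dict:
    """Decide who is notified, and how, using catalog ownership and tier."""
    service = CATALOG[service_name]
    # Teams owning services that depend on the affected one should be informed.
    dependents = sorted(
        s.owner_team for s in CATALOG.values() if service_name in s.depends_on
    )
    if severity == "critical" and service.tier == 1:
        channel = "page"    # wake someone up
    elif severity == "critical" or service.tier <= 2:
        channel = "chat"    # visible, but not interrupting
    else:
        channel = "digest"  # batched to keep the noise down
    return {"notify": service.owner_team, "also_inform": dependents, "channel": channel}
```

For example, `route_alert("inventory", "warning")` notifies the `logistics` team over chat and also informs `payments`, because `checkout` depends on `inventory`. A warning on `internal-wiki` only lands in a digest.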

Enforce policies

You can use automations to enforce policies consistently by integrating policies directly into your development workflows.

This means you can do the following, for example:

Limit resource use: You might want to control how many development environments a single developer can create. With Automations, a developer's attempt to create a third environment in a month can automatically trigger an approval request to their manager. You can also configure the rule so that developers who have been with the company for over two years get automatic approval.

Manage temporary access to production: For on-call engineers who need temporary access to production environments during their shifts, Automations can automatically grant access when the shift begins and revoke it when it ends.
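
Both policies above reduce to small, testable rules. The Python sketch below is a minimal illustration, not a portal API: the function names and constants are hypothetical, and tenure is approximated as 365 days per year for simplicity.

```python
from datetime import date, datetime, timezone

# Assumed thresholds taken from the examples in the text.
FREE_ENVS_PER_MONTH = 2          # the third environment in a month needs approval
AUTO_APPROVE_TENURE_DAYS = 2 * 365  # "over two years", approximated in days


def needs_manager_approval(envs_created_this_month: int, hire_date: date, today: date) -> bool:
    """True when a new environment request should be routed to a manager."""
    if envs_created_this_month < FREE_ENVS_PER_MONTH:
        return False  # still within the monthly allowance
    tenure_days = (today - hire_date).days
    # Long-tenured developers are approved automatically.
    return tenure_days <= AUTO_APPROVE_TENURE_DAYS


def has_production_access(shift_start: datetime, shift_end: datetime, now: datetime) -> bool:
    """Access is granted from the start of the on-call shift until it ends."""
    return shift_start <= now < shift_end


# Example: a developer hired in mid-2023 requests a third environment.
today = date(2024, 1, 1)
third_env_new_hire = needs_manager_approval(2, date(2023, 6, 1), today)
third_env_veteran = needs_manager_approval(2, date(2020, 1, 1), today)

# Example: an on-call shift from 18:00 to 02:00 UTC.
shift_start = datetime(2024, 1, 1, 18, 0, tzinfo=timezone.utc)
shift_end = datetime(2024, 1, 2, 2, 0, tzinfo=timezone.utc)
during_shift = has_production_access(
    shift_start, shift_end, datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc)
)
after_shift = has_production_access(
    shift_start, shift_end, datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)
)
```

Keeping each rule as a pure function of its inputs makes the policy easy to review and to test before it is attached to a self-service action.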

Example of automations in action

Here’s a real-life scenario that compares a 3 AM incident with and without Automations:

| | Traditional incident management | Incident management with Automations |
|---|---|---|
| Step 1 | Delay in finding out about the incident itself. | Automations notify the right on-call engineer instantly. |
| Step 2 | Manually open incident tickets and trawl through outdated documentation to find owners, dependencies, etc. | Incident tickets auto-filled with relevant details. |
| Step 3 | Manually open incident tickets and trawl through outdated documentation to find owners, dependencies, etc. | Auto-created Slack channels invite the right people. |
| Step 4 | Scramble through various tools and dashboards for clues. | Engineers access all needed info via a single link. |
| Step 5 | Rollback requires a change request approval (which can take hours). | Rollbacks initiated through self-service, approved automatically (with production access revoked at end of shift automatically, too). |
| Step 6 | Deploy rollback and update all stakeholders manually. | Stakeholders updated seamlessly. |
| | Result: The issue isn't resolved for a number of hours, leaving engineers exhausted. Highly inefficient and high potential for errors. | Result: By 3:30 AM, the issue is resolved, business impact minimized, and the engineer can return to sleep, knowing the system is stable. |
