A microservice catalog isn’t enough: why software catalogs are the right approach for developer portals

Let's take a deep dive into the best approach available today for platform engineers to catalog assets and services: the software catalog. In this article we'll discuss what a software catalog is and why it is the best approach for internal developer platforms, regardless of whether you build or buy it.

August 30, 2022

Introduction

A centralized service catalog, a resource catalog, a single pane of glass: everyone is talking about catalogs as the first building block for internal developer portals and a better developer experience.

Is there a better approach to cataloging infrastructure assets, services and anything in between for platform engineering? We believe there is one: the software catalog. In this blog post we’ll make a technical argument to explain exactly what a software catalog is and why it's the best approach for internal developer platforms, regardless of whether you build or buy.

The rise of developer portals

Engineering organizations have come a long way to become agile. The rise of DevOps has driven a lot of innovation, and one of the changes is that software monoliths are now broken into many small pieces: microservices, micro frontends, mono-repo, multi-repo and more. Infrastructure as code (IaC) is used to manage cloud resources and bring the benefits of Git to infrastructure, and third-party software of all kinds (SaaS, OSS, cloud services, etc.) is leveraged to write less code and focus on business logic.

Yet today almost everyone agrees that when developers need to interact with all these new tools, technologies, methodologies and processes, the result is a heavy cognitive load. This load has to be reduced so that engineering teams can stay productive. A shift-left culture of “you build it, you own it” requires internal developer portals. The underlying goal is to abstract away infrastructure and make the lives of DevOps engineers easier, while making developer self-service simple, with the proper guardrails in place.

Enter catalogs: software catalogs, microservice catalogs, resource catalogs

At the core of any developer portal is a catalog. The catalog is there to abstract away the complexity of DevOps technicalities and provide developers with a single pane of glass that represents all the software and infrastructure they need to interact with.

Ask any platform engineering team what their use cases for internal developer portals or platforms are, and you will receive many answers: self-service, dependencies, microservices, orphan environments, lack of ownership, lack of SDLC understanding and control, Jenkins self-service run amok and more.

Software catalog benefits

Is there an optimal architecture for a catalog - what I call a Software Catalog - that will cover all the use cases for an internal developer platform or a productized developer portal? There is, and its goal is to provide developers with a solid understanding and control of the Software Development Life Cycle across any DevOps architecture.

At the core of this approach is the unified catalog of (1) Service, (2) Environment, (3) Deployed Service and (4) Deployment. Only this unified approach can achieve the following:

  1. Eliminating developer cognitive load when creating (time to “Hello World!”) and deploying a service or resolving an issue.
  2. Eliminating orphan resources, from environments to cloud resources and permissions.
  3. A clear understanding of each service’s maturity and readiness.
  4. A detailed view of every service’s lifecycle from the first commit to many deployments running across different environments.
  5. Reducing the number of recurring tickets or Slack messages developers receive with questions and tasks that can easily be automated through the software catalog.
  6. Aligning work within the engineering organization by assigning an owner to each catalog entity, so that SREs, DevOps, developers and platform engineering all have a shared language.
  7. Reducing the time it takes a new developer to perform their first commit by providing clarity and faster onboarding.

Software Catalog: The Basic Model (0-1)

This model covers the main SDLC intersections: from Services, through Environments and Deployments, all the way up to the cloud.

Before I dive into the details of each component in the software catalog, here’s a brief explanation of the ontology behind the model:

  1. Service. A service can be a microservice, software monolith or any other software architecture.
  2. Environment. An environment can be production, staging, QA, a development environment, on-demand or any other environment type.
  3. Deployed Service. A deployed service is a representation of the current “live” version of a service running in a specific environment. It will include references to the service, environment and deployment, as well as real-time information such as status, uptime and any other relevant metadata.
  4. Deployment. A deployment could be described as an object representing a CD job. It includes the version of the deployed service and a link to the job itself. Unlike other objects, the deployment is an immutable item in the software catalog. It is important to keep it immutable to ensure the catalog remains a consistent source of truth. 
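
As a rough illustration of the four entities above, here is a minimal Python sketch. The field names are assumptions chosen for the example, not a prescribed schema; the one point taken directly from the model is that a Deployment is frozen (immutable) while the other entities can change.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime


@dataclass
class Service:
    identifier: str
    owner_team: str


@dataclass
class Environment:
    identifier: str
    env_type: str  # e.g. "production", "staging"


@dataclass(frozen=True)  # a deployment is an immutable record of a CD job
class Deployment:
    identifier: str
    service: str       # Service.identifier
    environment: str   # Environment.identifier
    version: str
    job_url: str
    created_at: datetime


@dataclass
class DeployedService:
    service: str       # Service.identifier
    environment: str   # Environment.identifier
    deployment: str    # Deployment.identifier of the live version
    status: str = "unknown"


def try_to_rewrite_history(deployment: Deployment) -> bool:
    """Return True if the deployment could be mutated (it should not be)."""
    try:
        deployment.version = "tampered"
    except FrozenInstanceError:
        return False
    return True
```

A Deployed Service can be updated as its status changes, whereas any attempt to edit a Deployment is rejected, which keeps the deployment history trustworthy.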

The data model should also show the dependencies between components, allowing developers to get answers to questions such as:

  1. What is the Datadog link for service x running in production?
  2. Who owns service y?
  3. What is the branch name of the running version in staging?
  4. What is the K8S dashboard link to production?
  5. What are the versions of the services that are deployed in the production environment?
  6. What is the deployment frequency of a specific service?
  7. What is the percentage of successful deployments?
  8. In which cloud providers are my services provisioned?
  9. Which programming languages are the most common in my organization?
  10. Who do I need to interact with when there is a bug in the production environment?
  11. Who do I need to interact with when I have a question about an API of a specific service?
  12. What are the relevant DORA Metrics for a team/service?
  13. And a lot more.
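
Once the catalog holds related entities, several of these questions reduce to simple lookups and aggregations. The sketch below uses small in-memory sample data; the records, field names and team names are all hypothetical and stand in for whatever your catalog actually returns.

```python
from collections import Counter

# Hypothetical catalog data, for illustration only.
SERVICES = {
    "cart": {"owner": "checkout-team", "language": "Go"},
    "payments": {"owner": "checkout-team", "language": "Go"},
    "products": {"owner": "catalog-team", "language": "Python"},
}

DEPLOYMENTS = [
    {"service": "cart", "environment": "production", "status": "success"},
    {"service": "cart", "environment": "production", "status": "failed"},
    {"service": "cart", "environment": "staging", "status": "success"},
    {"service": "products", "environment": "production", "status": "success"},
]


def owner_of(services, service):
    """Who owns service y?"""
    return services[service]["owner"]


def success_percentage(deployments, service):
    """What is the percentage of successful deployments for a service?"""
    own = [d for d in deployments if d["service"] == service]
    if not own:
        return None  # no deployments recorded yet
    ok = sum(1 for d in own if d["status"] == "success")
    return 100.0 * ok / len(own)


def most_common_language(services):
    """Which programming language is the most common?"""
    counts = Counter(s["language"] for s in services.values())
    return counts.most_common(1)[0][0]
```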

Let’s dive into each component type in the catalog.

Service 

The service profile should include everything about the service that is not related to the CI/CD pipeline itself. It is advisable to keep the profile clean and limited to the service itself. The relations between the service, the environment and the deployed service will provide the relevant SDLC information about the service’s life later on.

Relevant data includes:

  • Git repository URL
  • Responsible team (fetched from the identity provider)
  • Who is on-call
  • The main language and version the service is written in (Python, Node, Go, Java, etc.)
  • The business domain the service belongs to (if applicable)
  • The Slack channel for getting notifications about the service
  • Link to docs
  • Hosting infrastructure (deployed as a serverless function or on Kubernetes)
  • The Helm chart used to deploy the service
  • Services the service depends on

The service schema as it appears in Port’s developer portal
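
One practical use of a service profile is spotting gaps such as a missing owner or on-call contact. The following is a minimal sketch under the assumption that a profile is a plain dictionary; which fields count as required is a choice made for this example, not a rule from the model.

```python
# Fields treated as mandatory here; this selection is an assumption.
REQUIRED_FIELDS = (
    "repository_url",
    "team",
    "on_call",
    "language",
    "slack_channel",
    "docs_url",
)


def missing_fields(profile: dict) -> list:
    """Return the required fields that are absent or empty in a profile."""
    return [name for name in REQUIRED_FIELDS if not profile.get(name)]


# Hypothetical service profile with two gaps.
cart_profile = {
    "repository_url": "https://git.example.com/shop/cart",
    "team": "checkout-team",
    "language": "Go 1.19",
    "slack_channel": "#cart-alerts",
    "depends_on": ["payments"],
}
```

Calling `missing_fields(cart_profile)` lists the fields that still need an owner's attention before the service can be considered fully cataloged.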

Environment 

The way environments are represented varies greatly among engineering organizations. Some deploy services as Lambda functions, and others deploy them on Kubernetes clusters. Even when two companies use Kubernetes, they may have little in common with regard to what constitutes an environment. Some organizations represent an environment at the namespace level, while others treat an entire cluster as the environment services are deployed to. There is no single right way to do this.

As an example, let's assume you want to represent an environment as a Kubernetes cluster. You could then store these properties for the cluster:

  • Environment Type (for example dev, staging and prod)
  • Cluster Type (AKS, EKS, GKE), if you operate across multiple clouds
  • Region
  • Deep links to logs and observability tools for the cluster (New Relic, Kiali, Prometheus, and more); these are relevant for the DevOps engineer
  • Slack channel to get notifications about the cluster
  • Link to Runbooks
  • On-Call 
  • Cluster Configuration

The Environment schema as it appears in Port’s developer portal
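
Whether an environment maps to a whole cluster or to a namespace inside it can be captured in the identifier. This is a small sketch of one possible convention; the `cluster/namespace` format and the field names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Environment:
    cluster: str
    env_type: str                    # e.g. "dev", "staging", "prod"
    region: str
    namespace: Optional[str] = None  # set only when environments are namespaces

    @property
    def identifier(self) -> str:
        """Cluster-level environments use the cluster name alone."""
        if self.namespace is None:
            return self.cluster
        return f"{self.cluster}/{self.namespace}"
```

With this convention, an organization that treats clusters as environments and one that treats namespaces as environments can share the same entity type.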

Deployed Service

We want to enable developers to get information about how a specific service version behaves in a particular environment. The “Deployed Service” represents the runtime of this service@env.

The Deployed Service is related to the Service on one side and to the Environment on the other.

Here are the properties we want to store on “Deployed Service”:

  • Related service
  • Health status
  • Related environment
  • App URL
  • Chart version 
  • Links to logs (New Relic, Sentry, Prometheus, and more) about the deployed service (mostly the ones that are relevant for a developer)
  • CPU limit 
  • Memory limit
  • CPU requests
  • Memory requests
  • Branch
  • Version
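
Because a Deployed Service reflects what is currently live, part of it can be derived from the deployment history. The sketch below assumes deployment records are dictionaries with the fields shown, and that a failed job leaves the previously deployed version running; both are assumptions for this example.

```python
from datetime import datetime


def current_deployed_services(deployments):
    """Map (service, environment) to the latest successful deployment."""
    live = {}
    for dep in sorted(deployments, key=lambda d: d["date"]):
        if dep["status"] != "success":
            continue  # assumed: a failed job does not change what is running
        live[(dep["service"], dep["environment"])] = {
            "version": dep["version"],
            "branch": dep["branch"],
            "deployed_at": dep["date"],
        }
    return live


# Hypothetical deployment history.
history = [
    {"service": "cart", "environment": "production", "version": "1.4.1",
     "branch": "main", "status": "success", "date": datetime(2022, 8, 1)},
    {"service": "cart", "environment": "production", "version": "1.4.2",
     "branch": "main", "status": "failed", "date": datetime(2022, 8, 2)},
    {"service": "cart", "environment": "staging", "version": "1.5.0",
     "branch": "feature/coupons", "status": "success", "date": datetime(2022, 8, 3)},
]
```

Runtime properties such as health status or resource limits would still need to come from the cluster or observability tooling rather than from the deployment history.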

Deployment 

This is an immutable entity within the catalog. It represents the CD job and includes the relevant data about the service’s version:

  • Date
  • Job duration
  • The user who triggered the job
  • The deployed service
  • The environment the service was deployed to
  • The deployed image tag
  • The status of the deployment
  • Link to the job (Jenkins job, GitHub Actions, Azure DevOps, etc.)
  • Link to deployment logs

Some notes on data ingestion when building the catalog (vs. buying a product)

  • Ingesting data into the software catalog isn’t trivial. I recommend saving services as YAML files inside your git repository. This transfers the ownership of these files to the service owners.
  • Environments will be ingested through Terraform. 
  • Deployed Services and Deployments will be ingested using an API.

Make sure your software catalog is API-first so you can pipe in data from different sources and implement “exporters” to fetch data from meaningful intersections in your pipeline.
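
To make the idea of an exporter concrete, here is a minimal sketch of one that could run at the end of a CD job. The endpoint URL, the payload shape and the CI variable names are all hypothetical; substitute whatever your catalog API and CI system actually provide.

```python
import json
import urllib.request

# Hypothetical catalog endpoint, for illustration only.
CATALOG_URL = "https://catalog.example.com/api/deployments"


def deployment_from_ci(env: dict) -> dict:
    """Build a Deployment record from CI variables (names are assumptions)."""
    return {
        "service": env["SERVICE_NAME"],
        "environment": env["TARGET_ENV"],
        "image_tag": env["IMAGE_TAG"],
        "triggered_by": env.get("TRIGGERED_BY", "unknown"),
        "job_url": env.get("JOB_URL", ""),
        "status": env.get("JOB_STATUS", "unknown"),
    }


def build_request(record: dict) -> urllib.request.Request:
    """Prepare, but do not send, the POST request that reports the deployment."""
    return urllib.request.Request(
        CATALOG_URL,
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

In a real pipeline you would pass `os.environ` to `deployment_from_ci` and send the result with `urllib.request.urlopen(build_request(record))`, adding whatever authentication your catalog requires.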

Future Steps

We have started with four basic components: Service, Environment, Deployed Service and Deployment.

After we establish the basic structure of the software catalog, we will want to add more data to it. We will want to know which cloud resources services use in a specific environment, which tests are performed on each service and what the resulting service maturity is, which secrets services use, and what their current on-demand environments and costs are. To do this, we will gradually add entities that enrich the visibility the catalog provides.

This model is a great beginning that gives developers a good understanding and control of the Software Development Lifecycle, a first step towards a good developer experience. 

