What is a Software Catalog?
A software catalog is a centralized metadata repository that enables users to easily find and access application-related information such as ownership, versions, status, and more.
The software catalog can encompass CI/CD metadata, cloud resources, Kubernetes, services, security data, APIs and more.
A service catalog is a catalog of an organization’s microservices. A software catalog goes further: it integrates data from any tool in the internal developer platform, giving organizations more context on areas such as AppSec and FinOps, and on resources such as cloud resources and Kubernetes.
A software catalog can include an API catalog, a resource catalog, or other types of catalogs. This gives engineering teams additional context about a particular service or system. For instance, tying cost data about cloud resources to the microservice or system that a developer, team or leader is responsible for gives them a full picture of the service and shows whether it is exceeding cost limits.
The rise of cloud native, DevOps, GitOps and Kubernetes has added layers of complexity to the software development environment. The resulting architecture makes it difficult to see everything in one place and answer simple questions, such as:
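As a rough illustration of tying cost data to services, the sketch below sums the cost of each cloud resource under the service it belongs to and flags services that exceed a budget. The record shape, field names (`service`, `monthly_cost_usd`) and budget figures are assumptions made for this example, not part of any particular catalog product.

```python
from collections import defaultdict

# Hypothetical catalog records: each cloud resource points at the service it belongs to.
resources = [
    {"id": "rds-orders", "service": "orders", "monthly_cost_usd": 420.0},
    {"id": "s3-orders-archive", "service": "orders", "monthly_cost_usd": 95.5},
    {"id": "eks-node-payments", "service": "payments", "monthly_cost_usd": 310.0},
]

# Hypothetical per-service monthly budgets.
budgets = {"orders": 500.0, "payments": 400.0}


def cost_by_service(resources):
    """Sum resource costs per owning service."""
    totals = defaultdict(float)
    for resource in resources:
        totals[resource["service"]] += resource["monthly_cost_usd"]
    return dict(totals)


def over_budget(resources, budgets):
    """Return the services whose total cost exceeds their budget (no budget = no limit)."""
    totals = cost_by_service(resources)
    return sorted(
        name for name, total in totals.items()
        if total > budgets.get(name, float("inf"))
    )
```

Here `over_budget(resources, budgets)` reports only the `orders` service, because its two resources together cost more than its budget.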
- Who owns the code?
- What is the AppSec posture?
- What is the status of the services and APIs (production readiness, health)?
- What costs are associated with the services?
- What are the dependencies and related services or resources?
- Which packages are in use (a package catalog)?
As developers take on more responsibility for different parts of the software development lifecycle, they need an easy way to find information, one that doesn’t require them to learn several different UIs or ask for help.
Components of a Software Catalog
Entities
Entities are everything your developers need information about: microservices, cloud resources, CI/CD data, Kubernetes clusters, developer environments, pipelines, deployments, libraries, services, components and so on.
Data Model
To create a software catalog, you need a good data model, which provides a graph of how entities relate to one another. It is the metadata schema for each of the entities you want to represent in the software catalog, together with the relations between these entities. The data model should fit your organization’s needs, use cases, and user personas. It is therefore important to have an unopinionated data model that is flexible enough to adapt to your organization’s requirements.
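To make the idea of a metadata schema plus relations concrete, here is a minimal sketch in Python. The `Blueprint` and `Entity` classes, the `validate` helper and all property names are hypothetical and chosen for illustration; real catalog tools define their own schema formats.

```python
from dataclasses import dataclass, field


@dataclass
class Blueprint:
    """Metadata schema for one kind of entity (illustrative, not a vendor API)."""
    name: str
    properties: dict[str, type]  # property name -> expected Python type
    relations: dict[str, str] = field(default_factory=dict)  # relation name -> target blueprint


@dataclass
class Entity:
    """One catalog record that claims to follow a blueprint."""
    blueprint: str
    identifier: str
    properties: dict[str, object]
    relations: dict[str, str] = field(default_factory=dict)  # relation name -> target identifier


def validate(entity: Entity, blueprints: dict[str, Blueprint]) -> list[str]:
    """Return a list of schema problems; an empty list means the entity conforms."""
    blueprint = blueprints.get(entity.blueprint)
    if blueprint is None:
        return [f"unknown blueprint: {entity.blueprint}"]
    errors = []
    for prop, expected in blueprint.properties.items():
        if prop not in entity.properties:
            errors.append(f"missing property: {prop}")
        elif not isinstance(entity.properties[prop], expected):
            errors.append(f"wrong type for property: {prop}")
    for relation in entity.relations:
        if relation not in blueprint.relations:
            errors.append(f"undeclared relation: {relation}")
    return errors


blueprints = {
    "team": Blueprint("team", {"slack_channel": str}),
    "service": Blueprint("service", {"language": str, "tier": int}, {"owner": "team"}),
}

checkout = Entity("service", "checkout", {"language": "go", "tier": 1}, {"owner": "payments-team"})
broken = Entity("service", "legacy", {"language": "perl"}, {"runs_on": "cluster-a"})
```

Keeping the schema as data, rather than hard-coding entity types, is what lets the model stay unopinionated: adding a new kind of entity means adding a blueprint, not changing code.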
There are two common base data models that serve as frameworks for structuring software catalogs:
SDLC Model: The classic model includes three main components - service, environment, and running service. Its aim is to simplify the way users understand the interdependencies within the software development lifecycle (SDLC).
C4 Model: The C4 model is a framework for visualizing the architecture of software systems, from different perspectives and with differing layers of detail. The model advocates the use of diagrams that are easy to understand by technical and business audiences, and encourages simplicity.
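The three components of the SDLC model described above can be sketched as plain data classes. This is an illustrative reading of the model, assuming a running service is simply a service paired with an environment and a deployed version; the class and field names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    owner: str


@dataclass(frozen=True)
class Environment:
    name: str  # e.g. "staging" or "production"


@dataclass(frozen=True)
class RunningService:
    """A service as it is deployed to one environment."""
    service: Service
    environment: Environment
    version: str


def versions_by_environment(running, service_name):
    """Map environment name -> deployed version for one service."""
    return {
        rs.environment.name: rs.version
        for rs in running
        if rs.service.name == service_name
    }


checkout = Service("checkout", owner="payments-team")
staging, production = Environment("staging"), Environment("production")
running = [
    RunningService(checkout, staging, "1.5.0"),
    RunningService(checkout, production, "1.4.2"),
]
```

Separating the service from its running instances makes it possible to answer questions such as "which version of checkout is live in production?" without conflating the code with its deployments.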
Data Model Extensions
Using integrations, you can extend the data model by adding metadata about, for example:
- Kubernetes (K8s)
- APIs
- Cloud Resources
- CI/CD
- Packages & Libraries
- Vulnerabilities
- Incident Management
- Alerts
- Tests
- Feature Flags
- Misconfigurations
- FinOps
Software Catalog Dependency Graph
Not all questions require knowledge about dependencies, but many do; for example, ‘which resources are affected by this outage?’.
A dependency graph helps users visualize which entities in the software catalog depend on one another. This gives users clearer information in less time, helping them assess risk, rank priorities and gain a better understanding of the environment.
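A question like "which resources are affected by this outage?" amounts to walking the dependency graph backwards from the failed entity. The sketch below does this with a breadth-first search over a small, made-up set of entities; the entity names and the `depends_on` mapping are assumptions for illustration.

```python
from collections import defaultdict, deque

# Hypothetical edges: each key depends on the entities in its list.
depends_on = {
    "checkout-service": ["payments-api", "orders-db"],
    "payments-api": ["orders-db"],
    "storefront": ["checkout-service"],
    "reporting-job": ["warehouse-db"],
}


def affected_by(outage, depends_on):
    """Return every entity that directly or transitively depends on `outage`."""
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents = defaultdict(set)
    for entity, dependencies in depends_on.items():
        for dependency in dependencies:
            dependents[dependency].add(entity)

    seen = set()
    queue = deque([outage])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)
```

With this data, an outage of `orders-db` affects `checkout-service`, `payments-api` and `storefront`, while `reporting-job` is untouched.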
Scorecards
To measure and track the health and status of each service and application within the software catalog, engineering teams can use scorecards. Scorecards let teams establish metrics for production readiness, code quality, migration quality, operational performance, and more.
For example, an operational readiness scorecard can check if services are production-ready, by taking into account factors like ownership, documentation, runbooks, and infrastructure setup. Other scorecards include those for service maturity, operational maturity, DORA metrics, migrations, service health, code quality, cloud cost and security.
There are three reasons organizations use scorecards:
- Organizational alignment: Scorecards allow you to set clear, established standards and baselines for code quality, documentation, operational performance, and other metrics.
- Alerting and prioritization: By monitoring service scores against thresholds, you can be alerted when a service drops below acceptable baseline levels.
- Driving quality improvements: Initiatives are groups of scorecards that all fit within a strategic focus area. Engineering leaders rely on initiatives to set organizational priorities and to focus team energy on concerted improvements in a given area.
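A scorecard can be thought of as a set of rules evaluated against each service's metadata, with a threshold that triggers an alert. The following is a minimal sketch of an operational readiness scorecard; the rule names, metadata fields and the 75% threshold are assumptions made for this example.

```python
# Each rule is a (name, predicate) pair evaluated against a service's metadata.
# Rule names and metadata fields are hypothetical.
RULES = [
    ("has_owner", lambda s: bool(s.get("owner"))),
    ("has_docs", lambda s: bool(s.get("docs_url"))),
    ("has_runbook", lambda s: bool(s.get("runbook_url"))),
    ("has_on_call", lambda s: bool(s.get("on_call"))),
]


def score(service):
    """Return (percentage of rules passed, list of failed rule names)."""
    failed = [name for name, check in RULES if not check(service)]
    passed = len(RULES) - len(failed)
    return 100 * passed // len(RULES), failed


def below_threshold(services, threshold=75):
    """Return the names of services scoring under the threshold."""
    return [s["name"] for s in services if score(s)[0] < threshold]


services = [
    {"name": "checkout", "owner": "payments-team", "docs_url": "https://example.com/docs",
     "runbook_url": "https://example.com/runbook", "on_call": "alice"},
    {"name": "legacy-billing", "owner": "payments-team"},
]
```

Returning the failed rule names alongside the score tells the owning team what to fix, not just that the service fell short.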
Benefits of a Software Catalog
The key benefits of a software catalog are:
- Discoverability - a software catalog creates a central repository of all the data about microservices, their dependencies, packages, APIs and more, which keeps that data organized and removes the need to collect it in spreadsheets.
- Eliminating manual updates (and therefore errors) - software catalogs reduce the time developers spend updating AppSec, feature flag and other data about microservices. Because it is continually updated and enriched in real time, a software catalog gives developers confidence that they are using accurate information, reducing the chance of errors.
- Improved developer experience - without a software catalog, developers need to work across several interfaces, each showing different (and sometimes confusing) data about microservices and the associated infrastructure and tools. This can lead to errors in decision-making and to frustration for developers, particularly if they aren’t comfortable with certain interfaces. A catalog reduces the need to learn different interfaces and improves the overall developer experience.
Can you use a spreadsheet instead of a software catalog?
Many organizations still rely on spreadsheets to track software ownership, who is currently on call, dependencies, vulnerabilities, misconfigurations and more. However, this is a labor-intensive exercise in which each change requires a manual update. Spreadsheets lack standardization, up-to-date data and context, which makes it hard to identify resources, assets or owners. Users can’t get the critical answers they need, and can’t rely on the spreadsheet as a single pane of glass.
Can a CMDB serve as a software catalog?
Configuration management databases (CMDBs) enable users to store configuration information about IT infrastructure and assets. They help IT professionals manage changes, identify problems, conduct root cause analysis, resolve issues, assess the impact of changes, and more. However, CMDBs are very difficult to implement and require a lot of effort to maintain. In addition, they can’t offer additional context about a microservice or system (such as impact on other services, ownership or health). They also don’t let users find what they need as intuitively, so knowledge is less accessible across the organization. Some organizations may seek to integrate their existing CMDB with a modern software catalog so that they have an intuitive interface and more contextual information at their fingertips, while maintaining their existing way of storing configuration information.
Can a catalog inside an incident management tool serve as a software catalog?
Incident management tools that come equipped with a service catalog give users relevant information about incidents in the context of the service they are connected to. While this is a helpful way of tracking incidents, it doesn’t provide the whole context. For example, if a development team experienced API downtime, they wouldn’t be able to find information about the API within the incident management service catalog. A software catalog, on the other hand, would let users quickly look up API information such as the last response time, who last used it and its stability. Users can then quickly identify the root cause of the issue and take corrective action. Because this information is readily available to developers, it can also serve as a proactive measure to prevent incidents.
Implementing a Software Catalog
1. Consider the different personas that will be using the software catalog - they will all have different requirements. For example, developers may want to abstract away Kubernetes complexity, DevOps may want a better understanding of the infrastructure, and DevSecOps may want better visibility of vulnerabilities and misconfigurations.
2. This is where different data models come in. Decide which base model and which extensions your organization requires. The structure of the catalog is determined by the blueprints and relations defined in the data model.
3. It’s important to ensure that your data model supports a stateful representation of the environment. For example, the running service in the classic model reflects the real world, where “live” services are deployed to several environments, such as development, staging or production. This is critical for providing context.
The software catalog should sync in real time with data sources to provide a live graph of your services and applications. For example, integrating with CI/CD tools ensures that your software catalog will be kept up to date by making updates directly from your CI/CD.
4. Next, consider how users will access and view the software catalog. Do you want admins to be able to curate specific pages for different use cases or teams? Should pages be grouped into folders governed by RBAC, so that users can only access the pages relevant to their responsibilities? Should each person see a personalized view of the catalog when they log in, showing the information they need at their fingertips?
5. Establish metrics by using scorecards to benchmark each element within the software catalog. Define and measure quality, production readiness and developer productivity.
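The real-time sync described in step 3 can be sketched as an idempotent "upsert": each deployment event from a CI/CD pipeline creates the running-service entity if it is missing and updates it otherwise. The event shape, the in-memory `catalog` dictionary and the `upsert_running_service` function are assumptions for illustration; a real integration would call the catalog's own API.

```python
from datetime import datetime, timezone

catalog = {}  # in-memory stand-in for the catalog, keyed by entity identifier


def upsert_running_service(catalog, event):
    """Create or update a running-service entity from a deployment event."""
    identifier = f"{event['service']}-{event['environment']}"
    entity = catalog.setdefault(identifier, {
        "service": event["service"],
        "environment": event["environment"],
    })
    entity["version"] = event["version"]
    entity["last_synced"] = datetime.now(timezone.utc).isoformat()
    return entity


# Two hypothetical deployment events for the same service and environment.
events = [
    {"service": "checkout", "environment": "production", "version": "1.4.2"},
    {"service": "checkout", "environment": "production", "version": "1.5.0"},
]
for event in events:
    upsert_running_service(catalog, event)
```

After both events are processed the catalog holds a single `checkout-production` entity at the latest version, which is the stateful, always-current view the catalog is meant to provide.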