How we implemented our internal deployment pipeline in Port’s developer portal
April 3, 2024
Overview
As Port’s platform team, we love giving our developers self-service capabilities. Lucky for us, self-service is an integral part of Port’s arsenal. Thanks to Port self-service actions, we can allow our developers to trigger their day-to-day operations straight from the developer portal.
When looking at developer self-service, the sky's the limit. You can give your developers access to anything, from scaffolding a new service repository to spinning up a new Kubernetes cluster for development, and all of this while reducing ticket-ops and ensuring a certain standard for how things should be done.
But let’s zero in on one of these options.
In this blog, I will cover how we used Port to build our internal deploy pipeline, allowing Port developers to deploy new features (and bugs🥲) to production as fast as possible.
History
Before diving into the technical details, let’s review how Port developers deployed their services before we switched over to Port as our developer self-service hub.
Before using the portal, developers ran the deploy pipeline straight from GitHub, using GitHub workflows:
This would in turn run a chain of workflows, which validated the new deployment, built the service and deployed it to the requested runtime (production, staging, test, etc.).
Over time, this became difficult to manage:
- The data required for the deployment process was spread across GitHub repository secrets and variables, and was sometimes hard-coded in the different workflows. This meant that adding a new microservice to the list of services available for deployment required changes across multiple workflow files - not something that anyone can do quickly and easily.
- Over time, we started managing a new production environment in another AWS region, which increased the complexity of our deploy pipeline. Both of these issues made the pipeline harder to maintain, and that complexity was only set to increase unless we made a major refactor or change.
- For our developers, staring into the pipeline abyss and understanding where and why their deployments were failing became harder as well, which pushed us to look for a better solution.
We had to re-think how we were managing our deployment pipeline. Here are the main issues we wanted to address:
- Reducing pipeline complexity - It was becoming hard to keep up with infrastructure metadata for multiple environments, and we wanted a better solution for easily onboarding and supporting new environments.
- Hiding complexity from developers - Infrastructure is complicated, but a more complex pipeline shouldn’t mean a more complicated input form. We wanted to abstract away complexity from the devs when triggering the deploy pipeline, and we set a goal to keep the inputs required to trigger a deployment minimal, clear and simple - that way, even a brand new developer who just finished fixing their first bug could push the fix to production as soon as it passed review.
- Better logs for everyone - We wanted to show our developers only the information that was necessary. This would give them clarity and visibility into why pipelines fail, without needing our help.
Implementation
Reducing pipeline complexity
We wanted to be able to access information about our different environments dynamically. To make this possible, we created blueprints responsible for managing the different pieces of information about our microservices.
This makes it easy to fetch data about a service during the deployment pipeline. It also allows us to store information about new services (or update existing ones) and their metadata with ease, just by creating a new entity in Port.
Let’s look at our frontend Running Service entity for example:
Our frontend is deployed on AWS CloudFront, and when working with CloudFront, you need to provide the CloudFront distribution ID to run an invalidation. What better place to save this information than in your software catalog?
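To give a sense of the shape of this data, here is a minimal sketch of what a Running Service entity could look like when created through Port’s API or UI - the identifiers, property names and values below are illustrative, not our exact schema:

```json
{
  "identifier": "frontend",
  "title": "Frontend",
  "blueprint": "runningService",
  "properties": {
    "serviceType": "cloudfront",
    "cloudfrontDistributionId": "E1ABCDEFGHIJKL",
    "s3BucketArn": "arn:aws:s3:::example-frontend-assets"
  },
  "relations": {
    "runtime": "production"
  }
}
```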
Storing information about our runtimes and services in Port also allowed us to implement an interesting and dynamic use case: we wanted to control which regions a service is deployed to, based on the runtime it is being deployed to.
To support this use case we created an AWS Regions property on our Runtime entity:
We use this array to generate a matrix in our GitHub pipeline. This has been pretty awesome, because now adding a new region to our production deployments is as easy as adding another string to an array!
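As a rough illustration of the idea, an array output like this can drive a GitHub Actions matrix; the job and output names here are placeholders:

```yaml
deploy:
  needs: generate-matrix
  runs-on: ubuntu-latest
  strategy:
    matrix:
      # one job instance per region listed on the Runtime entity in Port
      region: ${{ fromJson(needs.generate-matrix.outputs.regions) }}
  steps:
    - run: echo "Deploying to ${{ matrix.region }}"
```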
On the GitHub workflows side, the pipeline was adapted to contain multiple generic pieces of logic that apply to all microservices. By leveraging the information available in Port, the pipeline chooses the correct workflows to trigger and provides them with the correct inputs.
This means that while our pipeline is composed of multiple pieces, only the ones required to deploy a particular service are triggered during the deploy workflow:
and the entire process requires just 3 inputs!
Now, let’s dive a bit deeper into how our new super generic deploy pipeline works:
Technical deep-dive
Port’s API and our other internal microservices are deployed to AWS ECS, and our frontend is deployed to an S3 bucket and served to our users via AWS CloudFront.
Our deploy pipeline consists of 6 main jobs. Each job reports run logs back to Port, updating the user who triggered the workflow with the step that is currently running, whether it succeeded, and additional information about the deployment process.
These are the jobs in our pipeline:
1. Deploy Caller
This is the entrypoint workflow for our deployment pipeline. This workflow takes the Port payload that is sent via the Port action trigger, parses it and hands it over to deploy-generic.yml, which is responsible for running the rest of the jobs on this list.
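A simplified sketch of what such an entrypoint can look like. The exact inputs Port sends depend on how the action backend is configured, so the input names and the payload field used for the run ID below are assumptions:

```yaml
# deploy-caller.yml (sketch)
name: Deploy Caller
on:
  workflow_dispatch:
    inputs:
      service:
        required: true
        type: string
      runtime:
        required: true
        type: string
      port_payload:
        description: Metadata sent by the Port action trigger (e.g. the action run ID)
        required: true
        type: string

jobs:
  call-deploy-generic:
    # hand everything over to the generic deployment workflow
    uses: ./.github/workflows/deploy-generic.yml
    secrets: inherit
    with:
      service: ${{ inputs.service }}
      runtime: ${{ inputs.runtime }}
      # assumed payload field; the real field name depends on the action configuration
      port_run_id: ${{ fromJson(inputs.port_payload).runId }}
```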
2. Generate region matrix
This step takes the runtime input which is passed to the workflow, uses it to fetch the matching Port entity and then calculates the regions where the microservice should be deployed, based on its properties.
Below you can see how we fetch the entity using Port’s GitHub action, and then use the retrieved data to build the parameter that runs the next jobs in a matrix.
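The snippet below is a condensed sketch of such a job using Port’s GitHub action (port-labs/port-github-action); the blueprint identifier, the aws_regions property name and the way the action’s output is consumed are illustrative and may differ from the exact interface:

```yaml
generate-matrix:
  runs-on: ubuntu-latest
  outputs:
    regions: ${{ steps.build-matrix.outputs.regions }}
  steps:
    - name: Fetch the Runtime entity from Port
      id: get-runtime
      uses: port-labs/port-github-action@v1
      with:
        clientId: ${{ secrets.PORT_CLIENT_ID }}
        clientSecret: ${{ secrets.PORT_CLIENT_SECRET }}
        operation: GET
        blueprint: runtime
        identifier: ${{ inputs.runtime }}
    - name: Build the region matrix parameter
      id: build-matrix
      run: |
        # extract the regions array property from the retrieved entity JSON
        echo "regions=$(echo '${{ steps.get-runtime.outputs.entity }}' | jq -c '.properties.aws_regions')" >> "$GITHUB_OUTPUT"
```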
3. Pre Deploy
This job runs in a matrix for each region that needs to be deployed (see the sketch after this list). It is responsible for:
- Checking the microservice lock status before initiating the deploy process.
- Fetching relevant service and runtime information about the microservice from Port - for example, the service type (ECS/CloudFront).
- Calculating variables required for the current run, such as the Task Definition version, ECR ARN, CloudFront distribution ID, S3 bucket ARN and any other information that might be needed for this deployment run.
- Reporting a new deployment entity to Port to keep track of the deployment progress.
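Here is a condensed sketch of what this job can look like; the lock property, blueprint identifiers and deployment properties are illustrative rather than our exact definitions:

```yaml
pre-deploy:
  needs: generate-matrix
  runs-on: ubuntu-latest
  strategy:
    matrix:
      region: ${{ fromJson(needs.generate-matrix.outputs.regions) }}
  steps:
    - name: Fetch the service entity from Port
      id: get-service
      uses: port-labs/port-github-action@v1
      with:
        clientId: ${{ secrets.PORT_CLIENT_ID }}
        clientSecret: ${{ secrets.PORT_CLIENT_SECRET }}
        operation: GET
        blueprint: runningService
        identifier: ${{ inputs.service }}
    - name: Check the microservice lock status
      run: |
        # abort the run early if the service is locked for deployments
        locked=$(echo '${{ steps.get-service.outputs.entity }}' | jq -r '.properties.locked')
        if [ "$locked" = "true" ]; then
          echo "Service is locked for deployments, aborting" && exit 1
        fi
    - name: Report a new deployment entity to Port
      uses: port-labs/port-github-action@v1
      with:
        clientId: ${{ secrets.PORT_CLIENT_ID }}
        clientSecret: ${{ secrets.PORT_CLIENT_SECRET }}
        operation: UPSERT
        blueprint: deployment
        identifier: ${{ inputs.service }}-${{ github.run_id }}-${{ matrix.region }}
        properties: |
          { "status": "in_progress", "region": "${{ matrix.region }}" }
```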
4. Build Handler
This job utilizes the parameters created in the pre-deploy job, and triggers either a workflow responsible for building an image (in the case of an ECS deployment) or a workflow responsible for building the static site bundles (in the case of a CloudFront deployment).
In the case of an image-based deployment, we only build the image once and push it to a central ECR repository that all regions use.
Static site deployments act differently, requiring each region to have its own build - hence this job runs in a matrix for each region.
This workflow file can easily be extended to support new build types for a new service type, by adding a new block with the relevant IF statement.
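For illustration, the dispatch logic can look roughly like this; the called workflow file names and the service_type output are placeholders:

```yaml
build-ecs-image:
  needs: pre-deploy
  # only runs for image-based (ECS) services; the image is built once for all regions
  if: needs.pre-deploy.outputs.service_type == 'ecs'
  uses: ./.github/workflows/build-image.yml
  secrets: inherit
  with:
    service: ${{ inputs.service }}

build-static-site:
  needs: [generate-matrix, pre-deploy]
  # static site builds run per region, hence the matrix
  if: needs.pre-deploy.outputs.service_type == 'cloudfront'
  strategy:
    matrix:
      region: ${{ fromJson(needs.generate-matrix.outputs.regions) }}
  uses: ./.github/workflows/build-static-site.yml
  secrets: inherit
  with:
    service: ${{ inputs.service }}
    region: ${{ matrix.region }}
```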
5. Deploy Handler
Similar to the build handler, the deploy handler uses the parameters generated in the pre-deploy workflow to trigger the correct deploy workflow.
ECS-based microservices are directed to the Deploy ECS workflow, while CloudFront-based services are directed to the Deploy CloudFront workflow.
It runs the deployment in a matrix to ensure that the deployment process happens in all regions configured in the Runtime Port entity.
This workflow can also be easily extended if a new service type is introduced in the future, by adding a new IF statement to direct the service to the relevant workflow.
6. Post Deploy
The responsibility of the Post Deploy workflow is to inform the user about the status of the deployment process, and to update 3rd party applications with information about the deployment process. This includes sending a Slack message regarding a successful or failed deployment attempt, updating the Port deployment entity with the relevant status and more.
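As a rough sketch, a post-deploy job along these lines can update the deployment entity and send the Slack notification; the job names, status values and notification step are illustrative:

```yaml
post-deploy:
  needs: [deploy-handler]
  if: always()
  runs-on: ubuntu-latest
  steps:
    - name: Update the deployment entity in Port
      uses: port-labs/port-github-action@v1
      with:
        clientId: ${{ secrets.PORT_CLIENT_ID }}
        clientSecret: ${{ secrets.PORT_CLIENT_SECRET }}
        operation: UPSERT
        blueprint: deployment
        identifier: ${{ inputs.service }}-${{ github.run_id }}
        properties: |
          { "status": "${{ needs.deploy-handler.result == 'success' && 'successful' || 'failed' }}" }
    - name: Send a Slack notification
      run: |
        # post the deployment result to an incoming webhook
        curl -X POST -H 'Content-type: application/json' \
          --data '{"text":"Deployment of ${{ inputs.service }} finished: ${{ needs.deploy-handler.result }}"}' \
          "${{ secrets.SLACK_WEBHOOK_URL }}"
```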
Here is an abstract diagram that shows the flow of our deployment pipeline. Notice how the flow diverges to either a CloudFront-based flow or an image-based (ECS) flow.
Hiding complexity from developers
Our developers don’t REALLY need to know what’s going on behind the scenes when deploying a microservice to one of their environments, so we wanted to keep the deployment form as simple as possible:
This user form contains the same data that the developer previously had to fill out in GitHub. To make things even easier, all of the fields that the developer needs to fill out are populated with data taken directly from the software catalog, which means they:
- are always up to date;
- prevent any bad inputs;
- negate the need for context switching (no more looking for the correct information in a different tab or a different tool).
This is all enabled by leveraging Port’s GitHub app internally, as well as configuring the form with “Entity” user inputs, which make it possible to leverage information from the catalog directly inside self-service action forms.
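For example, the user inputs of a self-service action can declare entity-backed fields along these lines; the identifiers below are illustrative:

```json
{
  "properties": {
    "service": {
      "title": "Service",
      "type": "string",
      "format": "entity",
      "blueprint": "runningService"
    },
    "runtime": {
      "title": "Runtime",
      "type": "string",
      "format": "entity",
      "blueprint": "runtime"
    }
  },
  "required": ["service", "runtime"]
}
```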
Better logs for everyone
Port’s Action run page allows the action runner to keep track of the action’s progress by viewing the status of the run and its logs. Each step of our deployment pipeline reports a log back to Port with relevant information (step started, step ended, step failed because…).
This ensures that developers don’t have to dig through the pipeline outputs, and can figure out what went wrong (if anything) directly from Port.
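A minimal sketch of such a log-reporting step, again using Port’s GitHub action; the run ID input name is an assumption carried over from the entrypoint sketch above:

```yaml
- name: Report progress to the Port action run
  uses: port-labs/port-github-action@v1
  with:
    clientId: ${{ secrets.PORT_CLIENT_ID }}
    clientSecret: ${{ secrets.PORT_CLIENT_SECRET }}
    operation: PATCH_RUN
    runId: ${{ inputs.port_run_id }}
    logMessage: "Pre deploy finished successfully for ${{ matrix.region }}"
```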
Summary
As a fast-growing company, keeping up with the never-ending scale of infrastructure is tough, and giving your developers easy access to work with that infrastructure is even tougher.
Using Port as an abstraction layer allowed us to give our developers just the right amount of control over the infrastructure, along with useful and continuous feedback on what is going on. The developers also use Port’s catalog to stay up to date on their microservices, closing the loop: they can both control the infrastructure and stay informed about it in a single pane of glass.
As both the DevOps and DevEx team for Port, we need to keep up with the growing complexity of our infrastructure and handle new requirements from our developers, all while keeping the infrastructure up and running. Being able to leverage Port internally really makes our job as platform engineers easier, and helps us keep up with the constantly increasing demand from our colleagues, the Port developers.