In a recent Lunch and Learn session, Rhys Smith, Senior Software Engineer at Audacia, provided an overview of YAML and its application within Azure Pipelines. The session covered the fundamentals of YAML, how to build efficient pipelines, and how YAML streamlines CI/CD processes in DevOps practices. Here’s a summary of the key points and insights shared during the talk.
What is YAML?
YAML, which stands for "YAML Ain't Markup Language," is a human-readable data serialisation format designed for simplicity and readability. It’s structured in a way that allows developers to create organised configurations that are easy to read and modify. This readability makes YAML particularly valuable for version control and collaborative coding. By storing YAML in source code repositories, teams can work together to build, test, and deploy software with consistency, traceability, and high code quality.
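To make the syntax concrete, here is a minimal sketch of the building blocks YAML offers: mappings (key-value pairs), sequences (lists), and scalars (strings, numbers, booleans). Every key and value below is hypothetical and chosen only for illustration; it is not taken from a real project.

```yaml
# Hypothetical configuration, for illustration only.
application:
  name: example-api      # scalar: string
  replicas: 2            # scalar: integer
  debug: false           # scalar: boolean
  environments:          # sequence of scalars
    - dev
    - test
    - production
  owners:                # sequence of mappings
    - name: Alice
      role: maintainer
    - name: Bob
      role: reviewer
  # A literal block scalar ("|") keeps line breaks as written.
  description: |
    Example service used to
    illustrate YAML syntax.
```

Structure comes from indentation (spaces, not tabs), which is a large part of why YAML diffs are easy to read in code review.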
The Shift to YAML in Azure Pipelines
Azure Pipelines supports both classic pipelines, which use a drag-and-drop interface, and YAML pipelines, which are defined through code. Teams are shifting focus to YAML pipelines for several reasons:
Version Control:
YAML configurations are stored alongside the source code, which makes changes easy to track and lets team members review each other's pipeline changes, for example through pull requests, helping to maintain quality.
Modularity and Reusability:
YAML allows developers to break down workflows into reusable modules. For example, a module that defines how to build an API can be reused across multiple projects, promoting consistency and reducing duplication.
Flexibility:
Unlike classic pipelines, which can become visually cluttered with numerous steps and dependencies, YAML pipelines scale more easily and can express complex workflows clearly.
Comparing Classic and YAML Pipelines
Classic pipelines rely on a visual interface, where tasks and processes are connected through drag-and-drop. This can be beneficial for users who appreciate a graphical overview of their workflows, similar to working with a logic app. However, this visual approach has limitations:
- As the complexity of workflows grows, the interface can become cumbersome, with many interconnected boxes and arrows.
- Versioning is more challenging, as changes are not inherently tied to specific branches or releases.
YAML pipelines, in contrast, define workflows in code, so the pipeline definition is versioned with the branch it lives on. When a code change is merged into a branch, the YAML pipeline defined on that branch runs the appropriate stages automatically.
Key Components of YAML Pipelines
Understanding YAML pipelines involves a few core concepts:
- Stages: These represent major phases of a pipeline, such as build, test, and deploy. Each stage corresponds to a significant part of the CI/CD process.
- Jobs: Jobs are units of work within a stage. Each job can have multiple steps and runs on a specific agent or machine. Jobs can be run sequentially or in parallel, and they are often used to break down different parts of the pipeline, such as building code, running tests, or deploying to environments.
- Steps: The smallest unit of work in a pipeline. A step is a single action within a job, such as running a command-line script.
- Tasks: Predefined, packaged steps, such as the .NET Core CLI task used for builds.
Summary of Hierarchy:
Stages → Group multiple jobs.
Jobs → Run sequentially or in parallel within a stage.
Steps → Define actions within jobs.
Tasks → Predefined actions, usually run as steps.
This structured approach simplifies the pipeline definition, and also makes it easier to maintain and troubleshoot complex processes.
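The hierarchy described above can be sketched as a single pipeline file. This is a minimal illustration rather than a recommended setup: the stage and job names, the project globs, and the placeholder deploy step are all assumptions made for the example.

```yaml
# azure-pipelines.yml (illustrative; names and paths are placeholders)
trigger:
  - main

stages:
  - stage: Build                      # stage: a major phase of the pipeline
    displayName: Build and test
    jobs:
      - job: BuildApi                 # job: a unit of work that runs on one agent
        pool:
          vmImage: ubuntu-latest
        steps:
          # step: an inline script
          - script: echo "Building $(Build.SourceBranchName)"
            displayName: Print branch name
          # step: a predefined task (the .NET Core CLI task)
          - task: DotNetCoreCLI@2
            displayName: Build projects
            inputs:
              command: build
              projects: '**/*.csproj'
          - task: DotNetCoreCLI@2
            displayName: Run unit tests
            inputs:
              command: test
              projects: '**/*Tests.csproj'

  - stage: Deploy
    dependsOn: Build                  # runs only after the Build stage succeeds
    jobs:
      - job: DeployToDev
        pool:
          vmImage: ubuntu-latest
        steps:
          - script: echo "Deployment steps would go here"
            displayName: Placeholder deploy step
```

Jobs within the same stage run in parallel by default when enough agents are available; adding `dependsOn` to a job makes it wait for another job instead.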
Build Agents and Pools
The execution of a YAML-defined pipeline is handled by build agents, which run the tasks and jobs on a specified machine, whether an on-premises server or a virtual machine in the cloud. These agents are organised into agent pools. When a new build is triggered, the pool assigns the job to an available agent, ensuring efficient resource use.
Managing agent contention is crucial: a pipeline that occupies too many agents in a shared pool can delay builds for other projects. It’s important to balance the number of agents each pipeline uses and to explore options such as dynamic scaling to adapt to varying workloads.
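As a rough sketch of how a job selects where it runs, the fragment below shows one job on a Microsoft-hosted agent and one on a self-hosted pool. The pool name `ExamplePool` is hypothetical, and the timeout value is an arbitrary choice for the example.

```yaml
jobs:
  - job: HostedBuild
    pool:
      vmImage: windows-latest         # Microsoft-hosted agent image
    steps:
      - script: echo Running on a Microsoft-hosted agent

  - job: SelfHostedBuild
    pool:
      name: ExamplePool               # hypothetical self-hosted agent pool
      demands:
        - Agent.OS -equals Linux      # only agents with this capability qualify
    timeoutInMinutes: 30              # cancel a stuck job so the agent is released
    steps:
      - script: echo Running on a self-hosted agent
```

Setting a job timeout is one simple way to limit contention, since a hung job otherwise holds its agent until the default timeout is reached.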
Using Templates and Reusable Modules for Efficiency
One of the primary advantages of YAML pipelines is their ability to reference external templates and modules. Audacia maintains its own templates in the Audacia.Build repository on GitHub. These contain predefined tasks and templates for building .NET projects such as ASP.NET APIs. By referencing these templates, our development teams can avoid reinventing the wheel for every new project, significantly reducing setup time and ensuring best practices are followed.
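To illustrate the mechanism, here is a minimal sketch of a step template and a pipeline that consumes it from another repository. The repository name, service connection, file path, and parameter names are all hypothetical; they do not reflect the actual layout of Audacia.Build.

A template file might declare its parameters and the steps that use them:

```yaml
# templates/build-dotnet.yml (hypothetical template file)
parameters:
  - name: projects
    type: string
    default: '**/*.csproj'
  - name: buildConfiguration
    type: string
    default: Release

steps:
  - task: DotNetCoreCLI@2
    displayName: Build ${{ parameters.buildConfiguration }}
    inputs:
      command: build
      projects: ${{ parameters.projects }}
      arguments: --configuration ${{ parameters.buildConfiguration }}
```

A pipeline in another repository could then declare the template repository as a resource and reference the file with the `@alias` suffix:

```yaml
# azure-pipelines.yml (consuming pipeline)
resources:
  repositories:
    - repository: templates                  # alias used after the @ below
      type: github
      name: ExampleOrg/build-templates       # hypothetical repository
      endpoint: example-github-connection    # hypothetical service connection
      ref: refs/heads/main

steps:
  - template: templates/build-dotnet.yml@templates
    parameters:
      projects: 'src/Example.Api/*.csproj'
```

Pinning `ref` to a tag or a specific branch means a change to the shared templates cannot silently alter every consuming pipeline.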
Automating with Triggers and Schedules
YAML pipelines also support advanced automation through triggers and schedules:
- Triggers: These are used to initiate pipelines automatically when changes are made to a specific branch. For example, a pipeline might automatically run validation tests when a pull request is opened against the dev branch.
- Schedules: These allow tasks to run at set times, such as daily dependency checks or nightly test automation runs.
For more advanced workflows, pipelines can even be configured to trigger each other, creating an intricate web of automated processes that ensure everything from code validation to deployment is handled smoothly.
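The trigger types above can be sketched in one file as follows. Branch names, the cron expression, and the upstream pipeline name `Example-Api-CI` are assumptions for illustration. Note that the `pr` key applies to repositories hosted on GitHub or Bitbucket Cloud; for Azure Repos Git, pull request validation is configured through branch policies instead.

```yaml
# CI trigger: run when commits are pushed to these branches
trigger:
  branches:
    include:
      - dev
      - main

# PR trigger (GitHub / Bitbucket Cloud repositories only)
pr:
  branches:
    include:
      - dev

# Scheduled trigger: cron expressions are evaluated in UTC
schedules:
  - cron: '0 2 * * *'
    displayName: Nightly test run
    branches:
      include:
        - main
    always: true                  # run even if nothing changed since the last run

# Pipeline trigger: run after another pipeline completes
resources:
  pipelines:
    - pipeline: apiBuild          # local alias for the upstream pipeline
      source: Example-Api-CI      # hypothetical upstream pipeline name
      trigger:
        branches:
          include:
            - main

steps:
  - script: echo "Triggered by $(Build.Reason)"
    displayName: Show trigger reason
```

The predefined `Build.Reason` variable records why a run started, which is useful when one pipeline needs to behave differently for scheduled and CI runs.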
Best Practices
Here are a few of our best practices:
- Keep pipelines DRY (Don’t Repeat Yourself).
- Use tasks published by Microsoft and other reputable vendors.
- Use Azure App Configuration and Azure Key Vault to manage environment-specific configuration securely, ensuring sensitive data remains protected.
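As a sketch of the Key Vault practice, the fragment below uses the `AzureKeyVault@2` task to fetch a secret at run time and pass it to a script. The service connection, vault name, secret name, and `deploy.sh` script are all hypothetical.

```yaml
steps:
  - task: AzureKeyVault@2
    displayName: Fetch secrets from Key Vault
    inputs:
      azureSubscription: example-service-connection   # hypothetical service connection
      KeyVaultName: example-keyvault                   # hypothetical vault name
      SecretsFilter: 'ApiKey'                          # hypothetical secret name
      RunAsPreJob: false

  - script: ./deploy.sh                                # hypothetical script in the repository
    displayName: Deploy using the secret
    env:
      API_KEY: $(ApiKey)    # secret variables must be mapped into env explicitly
```

Fetched secrets become secret pipeline variables, so they are masked in logs and are not exposed to scripts unless mapped through `env` as shown.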