GitHub Action Gotchas

I started with GitHub Actions a couple of years ago. Recently I came across a few interesting use cases while I was trying to set up a Terraform workflow with GitHub Actions. These use cases prompted me to make use of some new features in GitHub Actions, so I put them in a post here.

Runners can assume IAM Role in AWS

In many scenarios we want to execute AWS CLI commands from a GitHub Actions workflow. Executables such as terraform also pick up the credentials configured for the AWS CLI. Those credentials should be temporary and role-based, rather than the long-lived access keys of an IAM user.

There is a GitHub Action called configure-aws-credentials (published as aws-actions/configure-aws-credentials) that can configure the GitHub runner through an OIDC identity provider (supported since v1.6.0, Nov 2021). With this action, the GitHub runner can assume an IAM role either as an IAM user (with access keys) or with a web identity.

For a GitHub runner to have a web identity and thereby assume an IAM role, we first configure an OIDC provider in AWS. We can do that from the AWS console (under IAM) or with CloudFormation. Below is a snippet as an example:

Resources:
  GitHubOIDC:
    Type: AWS::IAM::OIDCProvider
    Properties:
      Url: https://token.actions.githubusercontent.com
      ClientIdList: 
        - sts.amazonaws.com
      ThumbprintList:
        - 6938fd4d98bab03faadb97b34396831e3780aea1

The thumbprint can be obtained from the configured OIDC provider; GitHub also publishes the thumbprint here. In AWS, we then configure an IAM role whose AssumeRolePolicyDocument trusts the OIDC provider as a federated principal. Here is an example. In the Condition section of AssumeRolePolicyDocument, we can also name a specific GitHub repository so that only Actions from that repository can assume the IAM role with their web identities. On the workflow side, the step that assumes the role looks like this:

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1-node16
        with:
          role-to-assume: ${{ vars.IAM_ROLE_ARN }}
          aws-region: ${{ vars.AWS_REGION }}

This way, a step using the Action above maps the GitHub runner’s web identity to an IAM role. The condition clause in the role statement also filters which GitHub org and repo can trigger actions that assume the role. If the step fails, we can look at CloudTrail on the AWS side for causes. Look for entries with AssumeRoleWithWebIdentity as the Event Name.
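The role itself is not shown above, so here is a minimal CloudFormation sketch of what its trust policy could look like. It assumes the role lives in the same template as the GitHubOIDC resource; the role name and the my-org/my-repo value are placeholders, not values from my setup.

```yaml
Resources:
  GitHubActionsRole:
    Type: AWS::IAM::Role
    Properties:
      # Hypothetical role name
      RoleName: github-actions-terraform
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action: sts:AssumeRoleWithWebIdentity
            Principal:
              # Trust the OIDC provider defined as GitHubOIDC in the same template
              Federated: !GetAtt GitHubOIDC.Arn
            Condition:
              StringEquals:
                "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
              StringLike:
                # Placeholder org/repo: only workflows from this repository match
                "token.actions.githubusercontent.com:sub": "repo:my-org/my-repo:*"
```

On the workflow side, the job also needs permissions with id-token: write, otherwise the runner cannot request the OIDC token and the credentials step fails.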

Reusable workflows

For better reusability of Action steps, GitHub introduced reusable workflows (generally available since Nov 2021). They are particularly helpful when we need to run the same workflow for different environments. The reusable workflow files (YML) can be placed in a separate repository and referenced from there, which allows enterprises to centralize the management of reusable workflows.
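As a rough sketch of the idea, a caller workflow could run the same reusable workflow once per environment. The repository my-org/shared-workflows, the file terraform.yml, and the environment input are all placeholder names, and the sketch assumes the reusable workflow declares an environment input under on.workflow_call.

```yaml
# .github/workflows/deploy.yml in the calling repository
name: Deploy
on:
  push:
    branches: [main]

jobs:
  plan-dev:
    # Reusable workflow kept in a separate, central repository (placeholder path)
    uses: my-org/shared-workflows/.github/workflows/terraform.yml@main
    with:
      environment: dev

  plan-prod:
    # Runs only after the dev job has succeeded
    needs: plan-dev
    uses: my-org/shared-workflows/.github/workflows/terraform.yml@main
    with:
      environment: prod
```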

I have been using the act project to emulate GitHub Actions locally on my MacBook. As of Jan 2023, act does not support reusable workflows.

With the split between caller and reusable workflows, we have a new challenge: passing secrets and variables between them. It is not straightforward, and the GitHub documentation on this topic could be clearer. Also, because the word “environment” is used in different contexts, it is ambiguous and therefore difficult to Google for relevant information.

Passing variables

First, there are several types of variables in GitHub Actions:

  • Environment variable: declared under the env keyword in a workflow. To use an environment variable, use the env context. For example: ${{ env.MY_VARIABLE }}
  • Configuration variable: introduced in Jan 2023, configuration variables are defined at the repository, environment, or organization level. To use a configuration variable, use the vars context; for an environment-level variable, ensure the workflow job specifies a value for the environment attribute.
  • Secrets: GitHub calls a secret an environment secret when it is defined at the environment level. It works the same way as an environment-level configuration variable because it is also specific to an environment. The content is not viewable once set.
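To make the three kinds concrete, here is a small sketch of one job reading each of them. The environment name dev and the secret API_TOKEN are placeholders; the sketch assumes AWS_REGION is defined as a configuration variable and API_TOKEN as a secret on that environment.

```yaml
name: Variable contexts
on: workflow_dispatch

env:
  MY_VARIABLE: hello   # environment variable, read through the env context

jobs:
  show:
    runs-on: ubuntu-latest
    environment: dev   # selects the dev environment's variables and secrets
    steps:
      - name: Print values
        env:
          # Secrets are read through the secrets context
          API_TOKEN: ${{ secrets.API_TOKEN }}
        run: |
          echo "env:  ${{ env.MY_VARIABLE }}"
          echo "vars: ${{ vars.AWS_REGION }}"
          test -n "$API_TOKEN" && echo "secret is set"
```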

The reason GitHub Actions makes this so confusing is that on one page, its documentation distinguishes between environment variables and configuration variables.

On another page, the documentation refers to configuration variables at environment level as environment variables.

It seems that “configuration variable” is too new for GitHub to have refined its documentation as of January. This semantic confusion gave me a hard time investigating how to pass an “environment variable” to reusable workflows. I will stick to the meaning on the first page and distinguish between an environment variable and a configuration variable at environment level.

Passing environment variables isn’t straightforward. In this discussion thread, people discussed how inconvenient it is. I used the workaround in this comment, where I had to create a job whose only purpose is to store the variable values as job outputs.
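A minimal sketch of that kind of workaround is below. It relies on the env context not being usable in the with block of a job that calls a reusable workflow, so an extra job copies the value into a job output first. The names TF_VERSION, export_env, tf_version, and terraform.yml are placeholders, and the reusable workflow is assumed to declare a tf_version input.

```yaml
name: Caller
on: workflow_dispatch

env:
  TF_VERSION: "1.3.7"

jobs:
  # This job exists only to turn the environment variable into a job output
  export_env:
    runs-on: ubuntu-latest
    outputs:
      tf_version: ${{ steps.export.outputs.tf_version }}
    steps:
      - id: export
        run: echo "tf_version=${{ env.TF_VERSION }}" >> "$GITHUB_OUTPUT"

  terraform:
    needs: export_env
    uses: ./.github/workflows/terraform.yml
    with:
      # The needs context is available here, unlike the env context
      tf_version: ${{ needs.export_env.outputs.tf_version }}
```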

Passing secrets is easier. This is an insightful blog post (Dec 2021) about passing secrets to a reusable workflow. Attempt 3 in the post works for me. First, we pass the name of the environment to the reusable workflow as an input, then set the job-level environment attribute to that value. Then in the job we can reference secrets as ${{ secrets.NAME }}, and the job picks up the secret from the matching environment.
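The reusable workflow side of this approach could be sketched as follows. The file name terraform.yml and the secret MY_SECRET are placeholders, and MY_SECRET is assumed to be defined as a secret on each environment.

```yaml
# .github/workflows/terraform.yml (reusable workflow, placeholder file name)
name: Terraform
on:
  workflow_call:
    inputs:
      environment:
        type: string
        required: true

jobs:
  plan:
    runs-on: ubuntu-latest
    # The environment name arrives as an input and is applied at job level
    environment: ${{ inputs.environment }}
    steps:
      - name: Use the environment's secret
        env:
          MY_SECRET: ${{ secrets.MY_SECRET }}
        run: |
          test -n "$MY_SECRET" && echo "secret resolved for ${{ inputs.environment }}"
```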

It appears that in May 2022 GitHub introduced the secrets: inherit keyword to address this. However, the method above still works, and it applies to configuration variables as well.
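For comparison, a caller job using the newer keyword might look like the fragment below. This is only the jobs section of a caller workflow; the job name, the terraform.yml path, and the environment input are placeholders.

```yaml
jobs:
  terraform:
    uses: ./.github/workflows/terraform.yml
    with:
      environment: dev
    # Make the caller's secrets available to the called workflow
    secrets: inherit
```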

Authentication of GitHub Actions

By default, a GitHub Actions workflow can access the code repository that triggers it, and no other repositories (with GITHUB_TOKEN). However, in many cases we need to access external repositories. For example, the terraform init command run from a workflow implicitly calls git clone to pull module code from external repositories. A GitHub workflow may also reference a workflow file from an external repository.

The question is how to authenticate a GitHub workflow so that it can access an external repo. This post has a thorough discussion. We may create a Personal Access Token and pass it to the set-git-credentials action. However, that essentially shares a personal credential (and its repo access) with the workflow, which is not a good practice.

The proper way to solve this problem is to create a separate GitHub App and grant it access only to the repos that the workflow needs. We generate a private key for the GitHub App, then supply the private key to workflow-application-token-action so that the workflow can act as the GitHub App and thereby access the external repos. The post has more details in its GitHub App section. Suppose we use the terraform get command to clone an external repo; the steps may look like this:

      - name: HashiCorp - Setup Terraform
        uses: hashicorp/setup-terraform@v2

      - name: Get RepoReader App Token
        id: get_repo_reader_token
        uses: peter-murray/workflow-application-token-action@v2
        with:
          application_id: ${{ vars.REPO_READER_APPLICATION_ID }}
          application_private_key: ${{ secrets.REPO_READER_PRIVATE_KEY }}

      - name: Cache Git Creds
        uses: de-vri-es/setup-git-credentials@v2
        with:
          credentials: https://x-access-token:${{ steps.get_repo_reader_token.outputs.token }}@github.com/

      # Terraform Get implicitly calls git clone, which uses the credential cached above
      - name: Terraform Get
        run: terraform get

Another benefit of using a GitHub App is that the token is a short-lived credential (GitHub App installation tokens expire after one hour), whereas a PAT stays valid until its preset expiry date. In this use case we can think of the GitHub App as a service account with minimal privileges, able to read only a short list of repos.

Conclusion

Since I first used GitHub Actions, it has evolved quite a bit with these new features, although the documentation is lagging somewhat. It is still very helpful, as all of these features are free for personal use. I look forward to more interesting features.