# Pages Build Container

Docker image for building and publishing static sites as part of the cloud.gov Pages platform.

Site builds generally run in three stages: clone, build, and publish, each broken into several steps. First, the container checks out the site from GitHub. Next, it builds the site with the specified build engine. It then gzip-compresses text files and sets cache control headers. Finally, it uploads the built site to S3 and creates redirect objects for directories, such as `/path` => `/path/`.
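
The directory redirects mentioned above can be illustrated with a minimal sketch. It assumes the built files are known as a list of S3-style keys; the function name `directory_redirects` is hypothetical and is not taken from this repository's source.

```python
from pathlib import PurePosixPath


def directory_redirects(file_keys):
    """Map each directory path (no trailing slash) to its slash-terminated form.

    Illustrative only: the real container may compute redirects differently.
    """
    redirects = {}
    for key in file_keys:
        # Walk every ancestor directory of the file, skipping the root (".").
        for parent in PurePosixPath(key).parents:
            if str(parent) != ".":
                redirects[f"/{parent}"] = f"/{parent}/"
    return redirects


built_files = ["index.html", "docs/index.html", "docs/api/v1/index.html"]
for source, target in sorted(directory_redirects(built_files).items()):
    print(f"{source} => {target}")
```

Files at the root of the site produce no redirect, since there is no directory path to normalize.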

## Usage

### Command
```
python main.py [options]
```

### Command options
One of the following flags *must* be specified:

| Flag | Example | Description |
| ---- | ------- | ----------- |
| `-p`, `--params` | `-p '{"foo": "bar"}'` | An encrypted JSON encoded string containing the [build arguments](#build-arguments) |
| `-f`, `--file` | `--file ./.local/my-build.json` | A path to a JSON file containing the [build arguments](#build-arguments) |
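
As a rough illustration of the "one flag must be specified" rule, the sketch below uses `argparse` with a required, mutually exclusive group. It is not the container's actual `main.py`: treating the two flags as mutually exclusive is an assumption, and the decryption of the `--params` value is omitted, so the example accepts plain JSON only.

```python
import argparse
import json


def parse_args(argv):
    parser = argparse.ArgumentParser(description="Run a site build (illustrative sketch).")
    # required=True makes argparse exit with an error if neither flag is given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-p", "--params", help="JSON encoded build arguments")
    group.add_argument("-f", "--file", help="path to a JSON file of build arguments")
    return parser.parse_args(argv)


def load_build_arguments(args):
    if args.file:
        with open(args.file, encoding="utf-8") as handle:
            return json.load(handle)
    # Assumption: the value is already decrypted; the real container decrypts it first.
    return json.loads(args.params)


example = parse_args(["-p", '{"foo": "bar"}'])
print(load_build_arguments(example))
```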

### Using cloud.gov tasks
```
cf run-task <APP_NAME> "cd app && python main.py [options]"
```

### Using `docker-compose`
```
docker-compose run --rm app python main.py [options]
```

### Full examples
```
# build arguments provided as a JSON encoded string

cf run-task pages-build-container "python main.py -p '{\"foo\": \"bar\"}'" --name "build-123"
```

```
# build arguments provided in a JSON encoded file

docker-compose run --rm app python main.py -f /tmp/local/my-build.json
```

## Environment variables

| Name | Optional? | VCAP Service | Description |
| ---- | :-------: | ------------ | ----------- |
| `CACHE_CONTROL` | Y | | Value to set for the `Cache-Control` header of all published files (default: `max-age=60`) |
| `DATABASE_URL` | N | | The URL of the database for database logging |
| `USER_ENVIRONMENT_VARIABLE_KEY` | N |  `federalist-{space}-uev-key` | Encryption key to decrypt user environment variables |
| `MAX_WORKERS` | N | | Maximum number of workers/threads to use when uploading files to S3 |

When running locally, environment variables are configured in `docker-compose.yml` under the `app` service.
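
The following sketch shows one way `CACHE_CONTROL` and `MAX_WORKERS` could be read and applied to a threaded upload. It is an assumption-laden illustration: `read_settings`, `fake_upload`, and `upload_all` are hypothetical names, and the upload itself is a stand-in that only returns the headers it would send rather than calling S3.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_settings(env=os.environ):
    """Return (cache_control, max_workers) from an environment mapping."""
    cache_control = env.get("CACHE_CONTROL", "max-age=60")  # documented default
    max_workers = int(env["MAX_WORKERS"])  # required, so a missing value raises KeyError
    return cache_control, max_workers


def fake_upload(key, cache_control):
    # Hypothetical stand-in for an S3 upload; returns what would be sent.
    return {"key": key, "headers": {"Cache-Control": cache_control}}


def upload_all(keys, cache_control, max_workers):
    # pool.map preserves the input order of keys in its results.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda key: fake_upload(key, cache_control), keys))


cache_control, max_workers = read_settings({"MAX_WORKERS": "2"})
for result in upload_all(["index.html", "css/site.css"], cache_control, max_workers):
    print(result)
```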

## Connected CF services

| Name | Type | Description |
| ---- | ---- | ----------- |
| `federalist-((env))-rds` | Brokered | The RDS db credentials |
| `federalist-((env))-uev-key` | User Provided | The site environment variable encryption key |
| `pages-((env))-encryption` | User Provided | The site build params encryption key |

## Build arguments

| Name | Optional? | Default | Description |
| ---- | :-------: | ------- | ----------- |
| `aws_access_key_id` | N | | AWS access key for the destination S3 bucket |
| `aws_secret_access_key` | N | | AWS secret key for the destination S3 bucket |
| `aws_default_region` | N | | AWS region for the destination S3 bucket |
| `bucket` | N | | AWS S3 bucket name for the destination S3 bucket |
| `github_token` | Y | `None` | GitHub auth token for cloning the repository |
| `status_callback` | N | | The URL the container should use to report the status of the completed build (i.e., success or failure) |
| `config` | Y | `None` | A yaml block of configuration to add to `_config.yml` before building. Currently only used in `jekyll` site builds |
| `generator` | N | | The engine to use to build the site (`'jekyll'`, `'hugo'`, `'node.js'`, or `'static'`) |
| `owner` | N | | The GitHub organization of the source repository |
| `repository` | N | | The name of the source repository |
| `branch` | N | | The branch of the source repository to build |
| `site_prefix` | N | | The S3 bucket "path" that the site files will be published to. It should **not** have a leading or trailing slash (Ex. `preview/<OWNER>/<REPOSITORY>/<BRANCH>`) |
| `baseurl` | Y | `None` | The base URL that will be used by the build engine to determine the absolute path for site assets (blank for custom domains, the `site_prefix` with a preceding `/` for preview domains) |
| `user_environment_variables` | Y | | Array of objects containing the name and encrypted values of user-provided environment variables (Ex. `[{ name: "MY ENV VAR", ciphertext: "ABC123" }]`) |
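
To make the table concrete, here is a minimal sketch that checks a build-arguments dictionary against the rules listed above. The helper names (`validate_build_arguments`, `preview_baseurl`) and all example values are hypothetical; the container's own validation, if any, may differ.

```python
REQUIRED = (
    "aws_access_key_id", "aws_secret_access_key", "aws_default_region", "bucket",
    "status_callback", "generator", "owner", "repository", "branch", "site_prefix",
)
GENERATORS = ("jekyll", "hugo", "node.js", "static")


def validate_build_arguments(args):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = [f"missing required argument: {name}" for name in REQUIRED if name not in args]
    if "generator" in args and args["generator"] not in GENERATORS:
        problems.append(f"unknown generator: {args['generator']}")
    prefix = args.get("site_prefix", "")
    if prefix.startswith("/") or prefix.endswith("/"):
        problems.append("site_prefix must not have a leading or trailing slash")
    return problems


def preview_baseurl(site_prefix):
    # For preview domains the table describes baseurl as the site_prefix with a preceding "/".
    return f"/{site_prefix}"


# Placeholder values for illustration only.
example = {
    "aws_access_key_id": "EXAMPLE_KEY_ID",
    "aws_secret_access_key": "EXAMPLE_SECRET",
    "aws_default_region": "us-gov-west-1",
    "bucket": "example-bucket",
    "status_callback": "http://localhost:8989/status",
    "generator": "jekyll",
    "owner": "example-org",
    "repository": "example-site",
    "branch": "main",
    "site_prefix": "preview/example-org/example-site/main",
}
print(validate_build_arguments(example))
print(preview_baseurl(example["site_prefix"]))
```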


### Encrypted params argument

When build parameters are passed to the build script using the `-p / --params` flag, they are an encrypted JSON encoded string created by the pages-core queue worker. The string is decrypted with the [decrypt cipher](./src/crypto/decrypt.py), using a shared key stored as the CF user provided service `pages-<env>-encryption`.

## Environment variables provided during builds

The following environment variables are available during site builds and when running the `federalist` npm script. They may be useful for customizing the display of certain information in the published site, for example, to display the current published branch name.

* `OWNER`
* `REPOSITORY`
* `BRANCH`
* `SITE_PREFIX`
* `BASEURL`
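
As a small illustration of using these variables, the sketch below reads them from the environment and formats a branch label. It assumes a Python-based build step; `build_context` and `branch_label` are hypothetical helpers, and unset variables fall back to an empty string.

```python
import os

BUILD_VARIABLES = ("OWNER", "REPOSITORY", "BRANCH", "SITE_PREFIX", "BASEURL")


def build_context(env=None):
    """Collect the build-time variables into a dict keyed by lowercase name."""
    env = os.environ if env is None else env
    return {name.lower(): env.get(name, "") for name in BUILD_VARIABLES}


def branch_label(context):
    # Example use: show which repository and branch the published site came from.
    return f"Built from {context['owner']}/{context['repository']} ({context['branch']})"


sample_env = {"OWNER": "example-org", "REPOSITORY": "example-site", "BRANCH": "main"}
print(branch_label(build_context(sample_env)))
```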

## Development

### Getting started

#### Requirements
- [Docker](https://www.docker.com/) and [Docker Compose](https://docs.docker.com/compose/)
- AWS S3 bucket name and associated credentials (key, secret, region)
- A GitHub repository with a Pages-compatible site
- A GitHub Personal Access Token if building a private repository; see [creating a new personal token for your GitHub account](https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/) for more information.

#### Clone the repository
```sh
  git clone git@github.com:cloud-gov/pages-build-container.git
  cd pages-build-container
```

#### Create build arguments
```sh
  mkdir -p .local
  cp .local.sample.json .local/my-build.json
```

#### Update build arguments
Update the appropriate fields to contain the desired values for your build; see [build arguments](#build-arguments) for options. The `.local` folder should not be checked into version control (it is in `.gitignore`) and will be mounted into the Docker container at `/tmp/local`.

#### Initialize the database
This only needs to be done once. To force a reinitialization of the database, remove the `tmp/db` folder in the project root and run the command below again.

```sh
  docker-compose run --rm db
```
Then kill the process when it is done.

#### Run the build
```sh
  docker-compose build
  docker-compose run --rm app python main.py -f /tmp/local/my-build.json
```
If the database is not ready when running a build (despite the healthcheck), just try running the build again.

#### Interact with the build environment
```sh
  docker-compose run --rm app bash
  python main.py -f /tmp/local/my-build.json
```

### Inspecting the database

1. Ensure the database is running (in the background)
```
docker-compose up -d --no-deps db
```

2. Run psql in the container
```
docker-compose exec db psql -U postgres -d pages
```

### Inspecting logs
During or after builds the echoserver and database logs can be viewed with:
```sh
  # all logs
  docker-compose logs

  # only the echo server
  docker-compose logs echoserver

  # only the db
  docker-compose logs db
```

### Testing
1. Build the test image
```sh
docker-compose build test
```

2. Run any testing steps
```sh
# unit tests
docker-compose run --rm test pytest

# unit tests with code coverage
docker-compose run --rm test pytest --cov-report xml:./coverage/coverage.xml --cov-report html:./coverage --cov-report term --cov=src

# lint
docker-compose run --rm test flake8

# static analysis
docker-compose run --rm test bandit -r src
```

### Continuous Integration
We use Concourse CI for our CI/CD system. To use Concourse, one must have appropriate permissions in UAA as administered by the cloud.gov operators. Access to Concourse also requires using the GSA VPN.

1. To get started, install and authenticate with the `fly` CLI:
- `brew install --cask fly`
- `fly -t <Concourse Target Name> login -n pages -c <concourse url>`

2. Update local credential files (see ci/vars/example.yml)

#### CI deployments
This repository contains two distinct deployment pipelines in Concourse:
- [__build-container__](./ci/pipeline.yml)
- [__build-container dev__](./ci/pipeline-dev.yml)

__build-container__ creates the site build container image, pushes it to ECR, and then deploys the image for the build container app.

__*&#8595; NOTICE &#8595;*__

> __build-container dev__ deploys the Pages app/api, the admin app, and the queues app when a PR is created into the `staging` branch. This uses a unique pipeline file: [./ci/pipeline-dev.yml](./ci/pipeline-dev.yml)

#### Deployment
##### Pipeline instance variables
Three instances of the pipeline are set for the `pages dev`, `pages staging` and `pages production` environments. Instance variables are used to fill in Concourse pipeline parameter variables bearing the same name as the instance variable. See more on [Concourse vars](https://concourse-ci.org/vars.html). Each instance of the pipeline has two instance variables associated with it: `deploy-env` and `git-branch`.

| Instance Variable | Pages Dev | Pages Staging | Pages Production |
| --- | --- | --- | --- |
|**`deploy-env`**|`dev`|`staging`|`production`|
|**`git-branch`**|`staging`|`staging`|`main`|

## Public domain

This project is in the worldwide [public domain](LICENSE.md). As stated in [CONTRIBUTING](CONTRIBUTING.md):

> This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the [CC0 1.0 Universal public domain dedication](https://creativecommons.org/publicdomain/zero/1.0/).
>
> All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.

[Federalist]: https://federalist.18f.gov
[cloud.gov Pages]: https://cloud.gov/pages
[Docker Compose]: https://docs.docker.com/compose/install/
[Docker]: https://docs.docker.com/engine/installation/
[pages-builder]: https://github.com/cloud-gov/pages-builder