Incremental AWS S3 deployment with resource fingerprints.

Description

Deploying a site (e.g. to AWS S3) currently uploads all files, even those that haven't changed. Add incremental deployment that uploads a file only if it differs from what is on the server.

One approach could be to check local records of timestamps as we do with incremental mummification in GUISE-104, but the situation is a bit different: the deployment target is further from our control and could be independently modified. We would need to check the timestamps on the deployment target, which may not be practical, and we have less of a guarantee that those timestamps wouldn't change.

So a more practical and robust approach is probably to check some sort of etag or hash (preferably some modern SHA variant).

It is undecided whether requesting a full deployment should piggyback on the --full flag or (more likely) use a separate --all flag.

Environment

None

Activity

Garret Wilson
May 3, 2020, 2:20 PM

We can update the AWS SDK in this ticket as well just to keep it at the latest.

Garret Wilson
May 3, 2020, 2:18 PM

The initial implementation worked with existing resources, but it neglected to check whether the resource even exists (i.e. for new deployments, or existing deployments uploading new resources). This bug must be fixed before release.
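
A minimal sketch of the corrected decision, assuming the remote side is summarized as an optional fingerprint: empty when the object does not exist on the target (or carries no fingerprint), present otherwise. The class and method names here are hypothetical and not taken from the actual implementation; how the remote fingerprint is looked up is deliberately left out.

```java
import java.util.Optional;

/** Hypothetical helper illustrating the upload decision for incremental deployment. */
final class IncrementalDeployDecision {

	private IncrementalDeployDecision() {}

	/**
	 * Decides whether a resource needs to be uploaded.
	 * @param localFingerprint The fingerprint of the locally generated artifact.
	 * @param remoteFingerprint The fingerprint recorded on the deployment target,
	 *        or empty if the object does not exist there or has no fingerprint.
	 * @return true if the resource is new on the target or its content differs.
	 */
	static boolean needsUpload(String localFingerprint, Optional<String> remoteFingerprint) {
		// A missing remote object must count as "changed"; skipping this
		// check is the bug described above.
		return remoteFingerprint
				.map(remote -> !remote.equals(localFingerprint))
				.orElse(true);
	}
}
```

With this shape, a brand-new deployment and a new resource in an existing deployment both fall into the empty case and are uploaded.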

Garret Wilson
February 18, 2020, 4:45 PM

Interestingly, if I add metadata in the form test=foobar, it appears on S3 as x-amz-meta-test=foobar in the AWS console, but via the SDK it still shows up programmatically as test=foobar. So I suppose this value would appear as an HTTP header x-amz-meta-test via normal HTTP calls (I don't know whether authentication/authorization plays a role).

In fact, now that I think about it, that's probably how this works behind the scenes anyway, as I retrieve the metadata via an SDK headObject() call. The HeadObjectResponse.metadata() method probably just parses out the x-amz-meta- headers.
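
To illustrate that guess, here is a sketch of what parsing out the x-amz-meta- headers could look like. This is not the SDK's implementation; the class and method names are hypothetical, and it only shows the prefix-stripping idea on a plain map of header names to values.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical illustration of deriving user metadata from HTTP response headers. */
final class UserMetadataHeaders {

	/** The header prefix under which user-defined metadata appears. */
	static final String PREFIX = "x-amz-meta-";

	private UserMetadataHeaders() {}

	/**
	 * Extracts user metadata from headers by keeping only the entries whose
	 * names start with the prefix, and stripping that prefix from the name.
	 * Header names are compared case-insensitively and returned in lowercase.
	 */
	static Map<String, String> extractUserMetadata(Map<String, String> headers) {
		Map<String, String> metadata = new LinkedHashMap<>();
		for (Map.Entry<String, String> header : headers.entrySet()) {
			String name = header.getKey().toLowerCase(Locale.ROOT);
			if (name.startsWith(PREFIX)) {
				metadata.put(name.substring(PREFIX.length()), header.getValue());
			}
		}
		return metadata;
	}
}
```

Given headers containing x-amz-meta-test=foobar alongside ordinary headers such as Content-Type, this yields a map with the single entry test=foobar, matching what the SDK shows programmatically.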

I don't know whether these headers would be included in a CloudFront distribution.

Garret Wilson
February 17, 2020, 4:14 PM

This ticket will also update mummy/targetModifiedAt to use content-modifiedAt instead, in line with the current use of content-type, with the view that all properties in the description, unless noted otherwise, will refer to the target artifact file (that is, to the resource itself as generated for the site).

Garret Wilson
February 17, 2020, 3:18 PM

After much thought on URF-98, it seems that what we want is not a checksum as an end in itself, but rather a fingerprint of the content (which may be a checksum/hash). Thus we will use SHA-256 and store it in the new content-fingerprint property from URF-98.
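
A minimal sketch of computing such a SHA-256 fingerprint with the standard Java MessageDigest API. The class and method names are hypothetical, and the lowercase hex encoding is an assumption; the ticket does not say how the value is encoded when stored in content-fingerprint.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for computing SHA-256 content fingerprints. */
final class ContentFingerprints {

	private ContentFingerprints() {}

	/** Returns the SHA-256 digest of the given bytes as lowercase hex. */
	static String sha256Hex(byte[] content) {
		return toHex(newSha256().digest(content));
	}

	/** Returns the SHA-256 digest of a file, streaming it to avoid loading it whole. */
	static String sha256Hex(Path file) throws IOException {
		MessageDigest digest = newSha256();
		try (InputStream in = Files.newInputStream(file)) {
			byte[] buffer = new byte[8192];
			int read;
			while ((read = in.read(buffer)) != -1) {
				digest.update(buffer, 0, read);
			}
		}
		return toHex(digest.digest());
	}

	private static MessageDigest newSha256() {
		try {
			return MessageDigest.getInstance("SHA-256");
		} catch (NoSuchAlgorithmException e) {
			// Every conforming Java platform is required to provide SHA-256.
			throw new IllegalStateException(e);
		}
	}

	/** Encodes bytes as lowercase hexadecimal (an assumed encoding). */
	private static String toHex(byte[] bytes) {
		StringBuilder hex = new StringBuilder(bytes.length * 2);
		for (byte b : bytes) {
			hex.append(Character.forDigit((b >> 4) & 0xF, 16));
			hex.append(Character.forDigit(b & 0xF, 16));
		}
		return hex.toString();
	}
}
```

Because the fingerprint depends only on the content, it stays stable if the deployment target's timestamps change independently.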

Fixed

Assignee

Garret Wilson

Reporter

Garret Wilson

Labels

None

Fix versions

Priority

Major