Extract image metadata.
When a site is being planned, an image mummifier should extract relevant information from supported images. (Probably at least Exif metadata from JPEG should be supported.)
The final implementation of this ticket supports the following:
IPTC: ObjectName (IIM 2:05, 0x0205)
Exif: XPTitle (0x9C9B)
IPTC: Caption (IIM 2:120, 0x0278)
Exif: ImageDescription (0x010E)
IPTC: By-line (IIM 2:80, 0x0250)
Exif: Artist (0x013B)
IPTC: CopyrightNotice (IIM 2:116, 0x0274)
Exif: Copyright (0x8298)
XMP: photoshop:DateCreated or xmp:CreateDate
IPTC: DateCreated (IIM 2:55, 0x0237), TimeCreated (IIM 2:60, 0x023C)
Exif: DateTimeOriginal (0x9003), SubSecTimeOriginal (0x9291), OffsetTimeOriginal (0x9011)
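The list above reads as a fallback order for each property. Here is a minimal sketch of that selection, assuming the order listed is the order of precedence (the ticket only says so explicitly for XMP dates). The `MetadataFallback` class and `firstPresent` helper are hypothetical names, and tag values are passed in as plain strings rather than read through any particular library:

```java
import java.util.Arrays;
import java.util.Optional;

/** Hypothetical helper: picks the first usable value from candidates given in precedence order. */
final class MetadataFallback {

  private MetadataFallback() {}

  /**
   * Returns the first candidate that is non-null and not blank, trimmed.
   * Candidates are assumed to be ordered from most to least preferred,
   * for example IPTC ObjectName first, then Exif XPTitle.
   */
  static Optional<String> firstPresent(String... candidates) {
    return Arrays.stream(candidates)
        .filter(value -> value != null && !value.trim().isEmpty())
        .map(String::trim)
        .findFirst();
  }
}
```

For a title this would be called as `MetadataFallback.firstPresent(iptcObjectName, exifXpTitle)`, where both arguments are whatever the metadata reader returned, possibly null.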
From the Exif 2.32 specification, it looks like OffsetTimeOriginal (0x9011) is what I really want to pair with DateTimeOriginal, not TimeZoneOffset. (It looks like TimeZoneOffset was proposed but never made it into the Exif standard. Or maybe it is that OffsetTimeOriginal is what actually made it into the spec.)
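As a rough sketch of how the three Exif tags could be combined into one absolute value with java.time: the `ExifTimestamps` class and `toInstant` method are hypothetical names, the tag values are assumed to arrive as raw strings, and the fallback to UTC when no offset is present is the choice made elsewhere in this ticket rather than anything the Exif specification mandates.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper combining Exif DateTimeOriginal, SubSecTimeOriginal and OffsetTimeOriginal. */
final class ExifTimestamps {

  // Exif stores date/time as "YYYY:MM:DD HH:MM:SS" with no zone information.
  private static final DateTimeFormatter EXIF_DATE_TIME =
      DateTimeFormatter.ofPattern("uuuu:MM:dd HH:mm:ss");

  private ExifTimestamps() {}

  private static boolean isBlank(String value) {
    return value == null || value.trim().isEmpty();
  }

  /**
   * @param dateTimeOriginal required, e.g. "2009:08:29 16:51:21"
   * @param subSecTimeOriginal optional fractional-second digits, e.g. "50" for 0.50 s
   * @param offsetTimeOriginal optional offset, e.g. "-07:00"; assumed UTC when absent
   */
  static Instant toInstant(String dateTimeOriginal, String subSecTimeOriginal,
      String offsetTimeOriginal) {
    LocalDateTime local = LocalDateTime.parse(dateTimeOriginal.trim(), EXIF_DATE_TIME);
    if (!isBlank(subSecTimeOriginal)) {
      // Right-pad (or truncate) the digits to nine places so they can be read as nanoseconds.
      // Non-digit content would throw NumberFormatException; this sketch does not handle that.
      String nanos = (subSecTimeOriginal.trim() + "000000000").substring(0, 9);
      local = local.withNano(Integer.parseInt(nanos));
    }
    ZoneOffset offset = isBlank(offsetTimeOriginal)
        ? ZoneOffset.UTC
        : ZoneOffset.of(offsetTimeOriginal.trim());
    return local.toInstant(offset);
  }
}
```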
Writing the Exif DateTimeOriginal information to image aspects is proving convoluted because it is supposed to go in a separate Exif sub-directory, and the Apache Commons Imaging library is confusing about how to do that. For the purposes here, we just need to read the original image createdAt value so that we can get it into the plan and sort images later. It's not so important for consumers of e.g. the previews and thumbnails. I've opened a separate ticket to add that later. Update: Writing createdAt has been moved to GUISE-169.
Here are example date/time values:
Exif - DateTimeOriginal: 2009:08:29 16:51:21
IPTC - DateCreated: 2009:08:29; TimeCreated: 16:51:21-07:00
XMP - xmp:CreateDate: 2009:08:29 16:51:21.00-07:00
As you can see, the IPTC and XMP values provide time zone offsets, but the Exif value does not. (Unfortunately this camera's time zone must not have been set correctly, or it did not have that option and Lightroom chose the processing time zone, because the photo was not taken in the time zone indicated.) For Exif there is a separate TimeZoneOffset field, but it doesn't seem to be present in my test photo; thus the Exif value would be interpreted in the current time zone or in UTC. I will choose to interpret it in UTC, and when I write Exif back out I will write the value in UTC and add a TimeZoneOffset of zero to hopefully make the value more deterministic.
The good news for Exif is that since we made the decision that XMP takes precedence, its correct absolute date/time value will be used when we write Exif back out for the image aspects, and thus the Exif value will be corrected and fixed in relation to UTC.
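To make the consequence concrete, here is a small sketch that resolves the sample Exif and XMP values above to instants and measures the gap between them. The class and method names are hypothetical, and the patterns are written to match only the sample strings shown in this ticket, not every form these fields may take.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper comparing the sample Exif and XMP values from this ticket. */
final class SampleTimestamps {

  // These patterns fit the sample values only; they are an assumption, not a general parser.
  private static final DateTimeFormatter EXIF =
      DateTimeFormatter.ofPattern("uuuu:MM:dd HH:mm:ss");
  private static final DateTimeFormatter XMP_SAMPLE =
      DateTimeFormatter.ofPattern("uuuu:MM:dd HH:mm:ss.SSxxx");

  private SampleTimestamps() {}

  /** Interprets a zone-less Exif value as UTC. */
  static Instant exifAsUtc(String value) {
    return LocalDateTime.parse(value, EXIF).toInstant(ZoneOffset.UTC);
  }

  /** Resolves an XMP value that carries its own offset. */
  static Instant xmpWithOffset(String value) {
    return OffsetDateTime.parse(value, XMP_SAMPLE).toInstant();
  }

  /** Time from the Exif-as-UTC instant to the XMP instant. */
  static Duration drift(String exifValue, String xmpValue) {
    return Duration.between(exifAsUtc(exifValue), xmpWithOffset(xmpValue));
  }
}
```

For the sample values, the Exif reading lands at 16:51:21 UTC and the XMP reading at 23:51:21 UTC, so the two sources disagree by seven hours even though the wall-clock digits are identical.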
I'm not finding any information on the format of IPTC DateCreated (IIM 2:55, 0x0237) and TimeCreated (IIM 2:60, 0x023C), so I'm going to parse them as ISO date and ISO time, and just emit a warning if there is a problem. That means no tests. Update: I managed to set values for IPTC DateCreated and TimeCreated using Metadata++ to find out what the format was, and also realized that metadata-extractor has a specialized method for extracting an instant from these fields.
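The first approach described here (parse as ISO, warn on failure) might look roughly like the following. This is only a sketch with hypothetical names; it uses the JDK's ISO parsers and java.util.logging, and it does not use the specialized metadata-extractor method mentioned above.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.OffsetTime;
import java.time.format.DateTimeParseException;
import java.util.Optional;
import java.util.logging.Logger;

/** Hypothetical helper: lenient handling of IPTC DateCreated and TimeCreated strings. */
final class IptcDates {

  private static final Logger LOGGER = Logger.getLogger(IptcDates.class.getName());

  private IptcDates() {}

  /**
   * Parses the two values as an ISO local date (e.g. "2009-08-29") and an ISO
   * offset time (e.g. "16:51:21-07:00"). Both arguments must be non-null.
   * Returns empty and logs a warning if either value is not in that form.
   */
  static Optional<Instant> parseCreated(String dateCreated, String timeCreated) {
    try {
      LocalDate date = LocalDate.parse(dateCreated);
      OffsetTime time = OffsetTime.parse(timeCreated);
      return Optional.of(date.atTime(time).toInstant());
    } catch (DateTimeParseException e) {
      LOGGER.warning("Unable to parse IPTC creation date/time `" + dateCreated + "` / `"
          + timeCreated + "`: " + e.getMessage());
      return Optional.empty();
    }
  }
}
```

Returning an empty `Optional` rather than throwing keeps one unreadable tag from aborting the whole plan; the caller can then fall back to the next source.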
Geocoding photo locations using Lightroom’s Map module and GPS data from December 2019 says:
The Exif standard recently added the ability for cameras to include timezone information in their photo timestamps, but few, if any, current cameras support it.
I'm not seeing time zone information as separate tags in any of my photos from Lightroom. Some XMP values seem to have the zone offset included (at least the Adobe API is indicating that), but one doesn't seem to be correct. Of course, maybe I forgot to update the time zone in the camera. Or maybe that camera model didn't even provide an option for setting the time zone. I'm seeing other XMP values that do have the correct zone offset in the value. But then maybe Lightroom just used the time zone I was in when I edited and produced the photos. It would be hard to know without a lot of testing and experimentation—across time zones, across camera models, and across processing software.
I think for now the best thing is to just get an absolute Instant in terms of UTC. At least that will get us more or less on the right day, and allow the images to be sorted later on. That is the main goal for the moment.
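Once every image has an Instant (or none at all), the later sort is straightforward. Below is a minimal sketch; `PlannedImage` is a hypothetical stand-in for whatever the plan actually stores, and putting images without a date last is my assumption, not something decided in this ticket.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for an image entry in the site plan. */
final class PlannedImage {

  final String name;
  final Instant createdAt; // null when no usable metadata was found

  PlannedImage(String name, Instant createdAt) {
    this.name = name;
    this.createdAt = createdAt;
  }

  /** Oldest first; images without a creation instant sort after all dated ones. */
  static final Comparator<PlannedImage> BY_CREATED_AT =
      Comparator.comparing((PlannedImage image) -> image.createdAt,
          Comparator.nullsLast(Comparator.naturalOrder()));

  /** Returns a sorted copy and leaves the input list untouched. */
  static List<PlannedImage> sortedByCreatedAt(List<PlannedImage> images) {
    List<PlannedImage> sorted = new ArrayList<>(images);
    sorted.sort(BY_CREATED_AT);
    return sorted;
  }
}
```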