Add support for language literals.

Description

To ease adding metadata to documents, we need to add a SURF/TURF literal for a common resource type: a human language (like a Java locale). The type would be urf-Language and the canonical lexical form would be a language tag as per BCP 47. In SURF/TURF the delimiter would be ~, bringing to mind an audio waveform for spoken language.

Examples would include:

  • ~en

  • ~en-US

  • ~en-CA-newfound

  • ~pt

  • ~pt-BR

  • ~sr-Latn-RS

  • ~es-419

  • ~sl-nedis

  • ~de-CH-1996

Thus ~en-US in a SURF/TURF document would be equivalent to |"en-us"|*urf-Language in a TURF document.
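
The equivalence above could be sketched roughly as follows. This is only an illustration, not the URF implementation: the function names are hypothetical, the long form is copied from the example in this ticket, and the pattern is a simplified well-formedness check rather than the full BCP 47 grammar.

```python
import re

# Delimiter proposed in this ticket for language literals.
LANGUAGE_DELIMITER = "~"

# Assumption: a simplified check of hyphen-separated subtags, each 1-8 ASCII
# alphanumerics. This approximates BCP 47 and is NOT the complete grammar.
_TAG_PATTERN = re.compile(r"[A-Za-z0-9]{1,8}(?:-[A-Za-z0-9]{1,8})*")


def canonical_lexical_form(literal: str) -> str:
    """Return the lowercase language tag of a literal such as ``~en-US``."""
    if not literal.startswith(LANGUAGE_DELIMITER):
        raise ValueError(f"Language literal must start with {LANGUAGE_DELIMITER!r}: {literal!r}")
    tag = literal[len(LANGUAGE_DELIMITER):]
    if not _TAG_PATTERN.fullmatch(tag):
        raise ValueError(f"Not a well-formed language tag: {tag!r}")
    # Language tags are not case-sensitive, so lowercase is used as the canonical form.
    return tag.lower()


def to_turf_long_form(literal: str) -> str:
    """Render the equivalent long form, following the example in this ticket."""
    return f'|"{canonical_lexical_form(literal)}"|*urf-Language'


if __name__ == "__main__":
    for example in ("~en-US", "~sr-Latn-RS", "~es-419"):
        print(example, "->", to_turf_long_form(example))
```

With this sketch, ~en-US and ~EN-us would both map to the same lexical form, en-us.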

Environment

None

Activity

Garret Wilson
February 2, 2020, 9:51 PM

As was decided for charset values in URF-91, the canonical lexical form for lexical ID generation in URF tags should be lowercase, as language tags are not case-sensitive.

Garret Wilson
February 8, 2019, 10:32 PM
Edited

We need to think about defining an info-lang property. There are several "info" properties that need to be defined, such as info-description, and the https://urf.name/info/ namespace needs to be reserved. These would provide a uniform way to describe general things (sort of like Dublin Core, but built into the framework). From experience, users find it confusing if such properties are in the https://urf.name/urf/ namespace (e.g. urf-lang), as it might seem that they are for URF-specific things rather than general descriptive properties. Update: The ticket for this is now URF-82.

Assignee

Garret Wilson

Reporter

Garret Wilson

Labels

None

Components

Fix versions

Priority

Major