Implement CSV-based format for URF: URF CSV.

Description

Besides the comprehensive URF format that is "CSV-like", URF needs a simple format that is based on CSV yet still allows specification of URF aspects such as type and ID. This format would allow one to "import" data from the wild with minimal effort, although it would still require manual tweaking (i.e. it wouldn't be a full data wrangling system). It would however be a rigorous URF format, not just a conversion from CSV.

  • Name: "URF CSV".

  • Filename extension: .urf.csv

  • MIME content type: text/urf+csv

  • RFC 4180 with CRLF or LF line endings, a required header line, and UTF-8 charset (with optional BOM?)

  • The resource type defaults to the name of the CSV file (e.g. FooBar for FooBar.urf.csv).

  • Each header represents a property handle.

  • If a header indicates no type, each cell is considered a string with no encoding other than the RFC 4180 encoding.

  • A header can indicate a type using : followed by a type handle, e.g. :urf-String, or a type tag, e.g. :|<https://urf.name/urf/String>|. In that case, the RFC 4180 decoded string is parsed as the TURF representation of the given type (without any delimiters), which for the most part corresponds to the lexical representation of the given type.

  • Nevertheless the urf-String type allows a special encoding for an empty string: """". An RFC 4180 parser will parse this as ", which is not a valid TURF string representation, yet the presence of that " will force the RFC 4180 compliant parser to distinguish the string from null; furthermore the presence of a single logical " in a column in the wild should be exceedingly rare anyway.

  • If a type ends in #, the RFC 4180 cell content is used as the type ID. For example, a Foo.urf.csv column with the header bar:Bar# and a cell value 123 would result in the property with the handle bar being set to Bar#123.

  • No comments or namespaces allowed; this is pure RFC 4180 content with the barest capabilities for import.

  • If a column header begins with #, the text content of each cell is to be used as an object identifier. If the header also has a name (and optional type), the value is additionally added as a resource property. For example, in Foo.urf.csv a column with the header # and a cell value 123 yields a resource with the tag Foo#123.

  • If a column header begins with !, the column is ignored.
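
The rules above can be sketched as a small reader. This is only an illustration under stated assumptions, not a reference implementation: the function names and the dictionary shape of a resource are hypothetical, an empty urf-String cell is assumed to mean null, and types other than urf-String and ID types are kept as raw (type, text) pairs because TURF parsing is out of scope here. It uses the : delimiter for types.

```python
import csv
import io

SUFFIX = ".urf.csv"

def default_type(filename):
    """Return the default resource type, e.g. 'FooBar' for 'FooBar.urf.csv'."""
    if not filename.endswith(SUFFIX):
        raise ValueError(f"not a URF CSV filename: {filename}")
    return filename[: -len(SUFFIX)]

def parse_header(header):
    """Return (is_id, name, type_handle), or None for an ignored '!' column."""
    if header.startswith("!"):
        return None
    is_id = header.startswith("#")
    rest = header[1:] if is_id else header
    name, _, type_handle = rest.partition(":")
    return is_id, name, type_handle or None

def decode_cell(text, type_handle):
    """Decode one RFC 4180 decoded cell according to its header type."""
    if type_handle is None:
        return text  # untyped: a plain string, no further decoding
    if type_handle.endswith("#"):
        return type_handle + text  # cell is the type ID, e.g. Bar# + 123
    if type_handle == "urf-String":
        if text == "":
            return None  # assumption: an empty cell means null
        if text == '"':
            return ""  # the cell """" decodes to a single quote: empty string
        return text
    return (type_handle, text)  # other types: TURF parsing not sketched here

def read_urf_csv(filename, text):
    """Read URF CSV text into a list of hypothetical resource dictionaries."""
    resource_type = default_type(filename)
    rows = csv.reader(io.StringIO(text, newline=""))
    headers = [parse_header(h) for h in next(rows)]  # header line is required
    resources = []
    for row in rows:
        resource = {"type": resource_type, "tag": None, "properties": {}}
        for header, cell in zip(headers, row):
            if header is None:
                continue  # '!' column
            is_id, name, type_handle = header
            if is_id:
                resource["tag"] = f"{resource_type}#{cell}"
            if name:
                resource["properties"][name] = decode_cell(cell, type_handle)
        resources.append(resource)
    return resources

sample = 'bar:Bar#,#,note:urf-String,!scratch\r\n123,7,"""",x\r\n456,8,,y\r\n'
resources = read_urf_csv("Foo.urf.csv", sample)
```

In this sample the first row gets the tag Foo#7 with an empty-string note, while the second row gets the tag Foo#8 with a null note; the !scratch column never reaches the result.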

Environment

None

Activity

Garret Wilson
January 3, 2019, 1:02 PM

The use of * as a delimiter for a type (e.g. fooBar*urf-Number) seems a little misleading, as in TURF this would mean that the fooBar property is of type urf-Number. After some thought, a more appropriate delimiter would be :, e.g. fooBar:urf-Number. This seems natural, is not ambiguous with TURF syntax, and in terms of TURF could be considered mapping the fooBar property to the urf-Number type. (What we really want to do is indicate the urf-Range of the property.)
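
One practical point about the : delimiter is that a type tag contains colons of its own (in its IRI), so a header has to be split at the first colon only. A minimal sketch, assuming property names never contain a colon; split_header is a hypothetical helper name:

```python
def split_header(header):
    """Split 'name:type' at the first colon; the type is None if absent."""
    name, _, type_part = header.partition(":")
    return name, type_part or None

by_handle = split_header("fooBar:urf-Number")
by_tag = split_header("fooBar:|<https://urf.name/urf/String>|")
untyped = split_header("fooBar")
```

Here by_tag keeps the whole tag, including the colon in https:, as the type part.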

Assignee

Garret Wilson

Reporter

Garret Wilson

Labels

None

Priority

Major

Epic Name

URF CSV