
MS Base

  • prefix: ms:
  • schema: b:


Schema element "b:meta-data"

Elements which give facts about the payload as defined by the unit, independent of how the payload is included.

Definitions

<group name="meta-data">
  <sequence>
    <element name="type"     type="b:Constant" />
    <element name="name"     type="string"       [0..1] />
    <element name="language" type="b:Constants"  [0..1] />
    <element name="charset"  type="token"        [0..1] />
  </sequence>
</group>
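As an illustration only, the group above could be mirrored in application code roughly as follows. This is a sketch under assumptions: the class name `MetaData` is hypothetical, constants are represented as plain strings, and `b:Constants` is assumed to hold zero or more constants (the schema page does not spell this out).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MetaData:
    """Hypothetical in-memory mirror of the "meta-data" group."""

    # type: b:Constant, the only required element
    type: str
    # name: string, [0..1]
    name: Optional[str] = None
    # language: b:Constants, [0..1]; modelled as a list (assumption)
    language: List[str] = field(default_factory=list)
    # charset: token, [0..1]
    charset: Optional[str] = None


# Placeholder values: real values must be constants from the schema's sets.
meta = MetaData(type="<type-constant>", charset="UTF-8")
```

Only `type` has to be given; the optional elements stay empty until set.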

Description

element charset

Charset is a common abbreviation of character set. The charset of the payload data is only of interest to applications which process that data. Binary data formats, like images, will not contain a charset specification.

The content of this field is defined by RFC 2978. See the page with all character-sets at IANA.

element language

The languages used in the payload, when it contains text. This fact MAY be used by search filters. Values are constants in one of the ms:Language/ sets.

The concept of language in text is complex: more details about contained languages may be included in other ways.

element name

A unit may have a name which is shown to users. The chosen unit id may not be the nicest way to present the unit to a person.

The name defaults to the percent-decoded version of the unit's id.
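A minimal sketch of that default, assuming the unit id is a percent-encoded string whose bytes are UTF-8. The function name `default_name` and the example id are hypothetical.

```python
from urllib.parse import unquote


def default_name(unit_id: str) -> str:
    """Return the name to show when no explicit name element is present."""
    # Bytes that are not valid UTF-8 become U+FFFD instead of raising.
    return unquote(unit_id, encoding="utf-8", errors="replace")


print(default_name("Monthly%20report.pdf"))  # prints: Monthly report.pdf
```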

Be warned that the name is (like everything else) encoded in UTF-8. See the discussion below about the problems with using filenames as name.

element type

The type of data which is contained in the payload of this unit. This is equivalent to the type of attachments in emails. The data itself may come in a serialization (format) which can be converted to the indicated type.

Discussion

Using filenames as name

When you want to use a filename as name (which is an obvious use case), you will need to include an encoded version of the filename, because filenames are often stored as UTF-16 (NTFS) or as "bytes" without a declared encoding (UNIX/Linux).

Also, the use of hidden files (leading dot on UNIX/Linux) and hidden meta-data directories (__MACOSX on Apple) will complicate the set-up. It hurts even more when you consider that file-systems may be case-insensitive, and that filenames may contain control characters and white-space. Some operating systems store ö as a single character, others as o + ¨ (a base letter followed by a combining diaeresis), which are not equivalent under simple comparison.
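The ö problem can be reproduced in a few lines. This sketch assumes the two forms are the precomposed code point U+00F6 and "o" followed by U+0308; comparing after Unicode normalization is one common workaround, not something this schema prescribes. The helper name `same_name` is hypothetical.

```python
import unicodedata

precomposed = "\u00f6"   # ö as a single code point
decomposed = "o\u0308"   # o followed by COMBINING DIAERESIS


def same_name(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(precomposed == decomposed)           # simple comparison
print(same_name(precomposed, decomposed))  # comparison after normalization
```

The simple comparison reports the names as different although both render as ö; the normalized comparison treats them as equal.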

The Meshy Space Base does not solve any of these problems. Extensions, like the Meshy Space Concept, may offer a better solution. They start by using the filename's precise byte-sequence, percent-encoded, as unit id, and a readable UTF-8 version of it as name. Search on name can be made case-insensitive.
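The approach described above might look roughly like this. It is a sketch, not the Meshy Space Concept's actual implementation: all function names are hypothetical, the readable name is produced by decoding as UTF-8 with replacement characters, and the case-insensitive search uses Python's `casefold`.

```python
from urllib.parse import quote, unquote_to_bytes


def unit_id_from_filename(raw: bytes) -> str:
    """Percent-encode the exact byte sequence of the filename."""
    return quote(raw, safe="")


def name_from_filename(raw: bytes) -> str:
    """Readable version; undecodable bytes become U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def name_matches(name: str, query: str) -> bool:
    """Case-insensitive substring search on the name."""
    return query.casefold() in name.casefold()


raw = b"R\xe9sum\xe9.txt"              # hypothetical Latin-1 filename bytes
unit_id = unit_id_from_filename(raw)   # "R%E9sum%E9.txt"
name = name_from_filename(raw)

# The id preserves the original bytes even when the name cannot.
assert unquote_to_bytes(unit_id) == raw
```

The unit id round-trips to the original bytes, while the name is only a best-effort presentation.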


mark@overmeer.net      Web-pages generated on 2023-12-19