Namespace Registration for Metadata Identifiers (META)
Namespace ID: META
Version: 1
Date: 2022-11-14
Registrant:
Name: Juha Hakala
E-mail: juha.hakala&helsinki.fi
Affiliation: Senior adviser, The National Library of Finland
Address: P.O.Box 15, 00014 Helsinki University, Finland.
Web URL: https://www.kansalliskirjasto.fi/en/
Background:
According to ISO 5127, metadata is data about other data, documents or
records that describes their content, context, structure, data format,
provenance and/or rights attached to them. Metadata elements and their
machine readable codes are specified in (cataloguing) formats.
Libraries have been using a metadata format (Machine Readable
Cataloguing, MARC) since the late 1960s, and starting in the 1990s,
museums, archives and many other organizations have developed their own metadata
formats. There are also metadata formats for administrative metadata
(rights, long-term preservation). As a result of this format
proliferation and increased availability of high-quality structured
metadata as open linked data, it is important to enable human and
machine users to make sense of metadata. This requires at least basic
understanding of metadata elements in these formats, such as Dublin
Core Title or MARC 21 tag 245, Title statement.
Metadata format specifications are usually available on the Web for
free. For instance, the Library of Congress has published MARC 21
formats at https://www.loc.gov/marc/. The National Library of Finland
maintains translations of these formats at
. A list of MARC 21 translations
(n = 20) is available at .
From users' point of view, using MARC 21 documentation is not as easy
as it should be. Information may have been published in a form (e.g.
a PDF document) which is not sufficiently machine understandable. And
even if information about each metadata element is available in HTML,
there are no explicit links between translations. There is no connection
between the page describing Title statement in English
, the page describing
it in Finnish
and the Swedish version
.
Most metadata format communities have chosen to use location-dependent
HTTP URLs as identifiers for their metadata formats and their elements.
The only exception is the Dublin Core Metadata Initiative, which chose
Persistent URLs (PURLs) as identifiers. For instance, the DC metadata
element Creator has two Persistent URLs:
http://purl.org/dc/terms/creator
in the /terms/ namespace, and
http://purl.org/dc/elements/1.1/creator
in the /elements/1.1/ namespace, because these two creators have
different semantics.
NOTE: These PURLs identify creator-related metadata elements in the
Dublin Core format, not the creators themselves. Creators have other
identifiers, such as ORCiDs and ISNIs, which may be expressed as URIs
(e.g. ).
The French translation of DCMI Metadata Terms uses the same PURLs, but
they resolve to English texts. In order to access French element
descriptions, it is necessary to use their URLs, such as
or creator in the Terms namespace. Translations to e.g. Czech, Japanese
and Italian are similar in this respect: the identifier of the element
is the PURL of the English version (which is OK), but the PURL does not
enable the user to retrieve documentation about the element in the
appropriate language.
From users' point of view it is confusing that in MARC 21 and all other
formats which rely on HTTP URLs, each translation of the format has its
own set of identifiers. In Dublin Core the same identifier is used for
several translations, but the identifier provides linking only to the
version in English. Therefore users who want to understand the metadata
they see may have difficulty finding this information in an
appropriate language.
NOTE: Some XML-based metadata formats have XML namespaces:
AudioMD: http://www.loc.gov/audioMD/
MARCXML: http://www.loc.gov/MARC21/slim
PREMIS: http://www.loc.gov/premis/rdf/v3/
VideoMD: http://www.loc.gov/videoMD/
Unfortunately namespaces for AudioMD, MARCXML and VideoMD are not
resolvable using e.g. HTTP, and functionality supported by the PREMIS
namespace is insufficient - links to individual tags such as
http://www.loc.gov/premis/rdf/v3/Copyright
do not provide useful results, since they all take the client only to
the beginning of the file containing all element descriptions. When
implemented in this manner links to descriptions of metadata elements do
not help human and machine users of metadata to understand the metadata
provided.
Purpose:
Provide a uniform basis and a tool for identification of elements in
metadata formats.
Provide functional improvements to the current URL-based identifiers of
metadata elements.
Benefits:
Linking to alternative versions (e.g. full / concise, human readable /
machine readable) of element descriptions with one URN.
Automatic selection of the appropriate language (see below). For
instance, a bibliographic record containing a link to the URI explaining
MARC tag 245 in Finnish is only useful for users who understand Finnish,
because the HTTP server holding these pages cannot redirect Swedish /
English speaking users to MARC tag 245 pages on other HTTP servers that
would be appropriate for them.
Having a URN namespace dedicated to metadata elements may improve
co-ordination on how identifiers for metadata elements are created and
what they resolve to. It may also encourage the organizations
maintaining metadata formats to provide documentation separately for
each metadata element and in a form more suitable to the Web than e.g.
PDF.
A resolution service for URN:META identifiers can direct users to
information in their preferred languages if the resolution method
communicates their language preferences to the resolver.
If a URN:META identifier is assigned to the Library of Congress
description of MARC 245 tag, it is possible for a resolution service
to direct a user to the descriptions of this tag in multiple languages,
depending on the language settings of the user's Web client. This
functionality can be implemented by supporting the HTTP Accept-Language
header in the URN resolver. A prerequisite for this functionality is
that all element URLs from translated versions are harvested to the
resolver's URN – URL mapping table. Since these URLs should be stable,
keeping the links up to date is feasible.
If the network protocol used does not support language negotiation, the
required functionality may also be implemented with the URN R-component.
NOTE: A URN should not be assigned if an element already has a
well-managed persistent identifier such as DOI
.
Syntax:
The Namespace-Specific String (NSS) consists of three parts:
o a prefix consisting of a code identifying the metadata format
and optional sub-namespace code(s) separated by colon(s);
o a hyphen (-) as the delimiting character; and,
o a string assigned under the auspices of the format maintenance
agency.
These strings may be constructed according to the local preferences as
long as they are aligned with the requirements of RFC 3986 and RFC 8141.
The format maintenance agency is the organization maintaining the
original version of the format, such as Dublin Core Metadata Initiative
for Dublin Core, or the Library of Congress for MARC 21 formats. A
maintenance agency shall specify the NSS syntax for its formats, and it
may outsource URN assignment and maintenance of the URN resolver to a
third party, such as an organization maintaining a translation of the
format.
The following formal definition uses ABNF [RFC5234].
meta-nss = prefix "-" meta-string
prefix = format-code *( ":" sub-namespace )
; The entire prefix is case insensitive.
format-code = 1*(ALPHA / DIGIT)
; As assigned by the National Library of Finland
; (identifies the metadata format and the maintenance
; agency to which the branch is delegated).
sub-namespace = 1*(ALPHA / DIGIT)
; As assigned by the respective format maintenance
; agency.
meta-string = path-rootless
; The "path-rootless" rule is defined in RFC 3986.
; Syntax requirements specified in RFC 8141 MUST be
; taken into account.
The meta-string is case-sensitive unless specified as case-insensitive
by the maintenance agency.
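The grammar above can be illustrated with a short Python sketch. It is
not part of the registration. The function name parse_meta_urn is
hypothetical, and the meta-string check is a simplification: it only
rejects a leading "/", whitespace, "?" and "#", and does not validate
the full "path-rootless" rule of RFC 3986.

```python
import re

# format-code and sub-namespace: 1*(ALPHA / DIGIT), as in the ABNF above.
_TOKEN = r"[A-Za-z0-9]+"

# Simplified meta-string check (assumption): non-empty, no leading "/",
# no whitespace, and no "?" or "#", which would start other URN parts.
_META_URN = re.compile(
    rf"^urn:meta:(?P<prefix>{_TOKEN}(?::{_TOKEN})*)-"
    r"(?P<meta_string>[^/?#\s][^?#\s]*)$",
    re.IGNORECASE,
)


def parse_meta_urn(urn):
    """Split a URN:META identifier into format code, sub-namespaces and meta-string."""
    match = _META_URN.match(urn)
    if match is None:
        raise ValueError(f"not a URN:META identifier: {urn!r}")
    # The prefix is case insensitive, so it is lower-cased here.
    format_code, *sub_namespaces = match.group("prefix").lower().split(":")
    return {
        "format_code": format_code,
        "sub_namespaces": sub_namespaces,
        # The meta-string is kept as written (case-sensitive by default).
        "meta_string": match.group("meta_string"),
    }
```

Because neither the format code nor a sub-namespace may contain a
hyphen, the first hyphen after "urn:meta:" always marks the end of the
prefix, which is what the pattern relies on.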
The following metadata format codes SHALL be used:
Descriptive metadata
Code      Format(s)    URL
BF        BIBFRAME     http://www.loc.gov/bibframe/
danMARC2  DANMARC2     http://www.kat-format.dk/danMARC2/
DW        Darwin Core  https://dwc.tdwg.org/terms/
DC        Dublin Core  https://www.dublincore.org/specifications/dublin-core/
DDI       DDI          https://ddialliance.org/explore-documentation
EAD       EAD          https://www.loc.gov/ead/
FINMARC   FINMARC      https://www.kiwi.fi/display/Marc21/FINMARC
IMARC     INTERMARC    https://www.bnf.fr/fr/intermarc-bibliographique-de-diffusion
LIDO      LIDO         http://network.icom.museum/cidoc/working-groups/lido/
MARC      MARC 21      https://www.loc.gov/marc/marcdocz.html
MARCXML   MARCXML      http://www.loc.gov/standards/marcxml/
MIX       MIX          http://www.loc.gov/standards/mix/
MODS      MODS         http://www.loc.gov/standards/mods/
ONIX      ONIX         https://www.editeur.org/8/ONIX/
UKMARC    UKMARC       https://www.webarchive.org.uk/wayback/archive/20160000000000/http://www.bl.uk/bibliographic/ukmarc.html
UNIMARC   UNIMARC      https://www.ifla.org/unimarc
Administrative metadata
Code      Format       URL
PREMIS    PREMIS       http://www.loc.gov/standards/premis/
TEXTMD    textMD       https://www.loc.gov/standards/textMD/
AUDIOMD   audioMD      https://www.loc.gov/standards/amdvmd/
VIDEOMD   videoMD      https://www.loc.gov/standards/amdvmd/
Cataloguing rules
Code      Rules        URL
ISBD      ISBD         http://iflastandards.info/ns/isbd/elements/
RDA       RDA          https://www.rdaregistry.info
One code may cover an entire family of formats (e.g. MARC Authority,
Bibliographic and Holdings formats). Sub-namespaces may be used to
differentiate formats within these format families if necessary.
Since all prefixes with the same format-code are delegated to the same
maintenance agency, such families are perforce maintained by the same
agency.
National translations of metadata standards and cataloguing rules shall
use the codes and URNs of the original specifications. Thus the Finnish
translation of MARC 21 shall use the prefix MARC of MARC 21 and URNs
assigned to the elements of the English version of the format. If
resolution is done via HTTP, the URN resolver can use HTTP language
negotiation to direct the client to the correct language version. Other
network protocols may support similar functionality in the future.
National variants of metadata formats (e.g. the historical FINMARC format,
which was based on equally outdated USMARC) shall have their own format
codes, since their tags and semantics may differ from the original ones.
For instance, UKMARC tag 245 is not the same as USMARC tag 245, since in
the former subtitle had its own tag, 248, whereas in the latter,
subtitle was included in tag 245.
Metadata application profiles such as Darwin Core, an extension of
Dublin Core intended for sharing of information about biological
diversity, may have codes of their own if they a) extend the base
format substantially and b) are well documented and stable.
The structure (if any) of the meta-string is determined by the authority
for the prefix. Within the meta-string, it is recommended that a hyphen
is used for separating different sections of the identifier from one
another in order to improve the human readability of the string.
Maintenance agencies SHOULD NOT use characters requiring
percent-encoding in meta-strings.
Registering format codes:
New codes will be added by the National Library of Finland on request.
Requests should be sent to meta-request&helsinki.fi.
The list of registered format codes will be maintained as a part of this
document.
NOTE: Formats included above are the ones most commonly used by
libraries, archives, museums and in publishing. It is anticipated that
the need for adding new formats will not be frequent.
Rules for lexical equivalence:
Whereas the prefix is case insensitive, meta-strings MAY be case
sensitive at the preference of the assigning authority; parsers
therefore SHALL treat these as case sensitive, and any case mapping
needed to introduce case insensitivity is the responsibility of the
relevant resolution system.
Case insensitivity of the prefix must be taken into account when
URN:META identifiers are compared.
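A comparison that respects these rules might look like the following
Python sketch. The function names are hypothetical. The sketch treats
every meta-string as case sensitive and ignores other equivalence
details of RFC 8141, such as the case of percent-encoded hex digits.

```python
def normalize_meta_urn(urn):
    """Return a comparison key: prefix lower-cased, meta-string untouched."""
    # The prefix cannot contain a hyphen, so the first hyphen is the delimiter.
    head, hyphen, meta_string = urn.partition("-")
    if not hyphen or not head.lower().startswith("urn:meta:"):
        raise ValueError(f"not a URN:META identifier: {urn!r}")
    return head.lower() + "-" + meta_string


def meta_urns_equivalent(first, second):
    """True if two URN:META identifiers are lexically equivalent."""
    return normalize_meta_urn(first) == normalize_meta_urn(second)
```

With this sketch, URN:META:MARC-bd245 and urn:meta:marc-bd245 compare
as equivalent, while urn:meta:marc-bd245 and urn:meta:marc-BD245 do
not.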
META assignment:
National and international metadata format maintenance agencies may use
URN META when they want to assign persistent identifiers for the
metadata elements and tags of their formats, and provide URN-based
access to machine or human readable descriptions about these metadata
elements. For the time being these descriptions are unstructured text
on Web pages.
The URN assigned to the element shall not change even if the description
of the element is changed. URNs assigned to deleted elements shall not
be re-used.
Metadata format maintenance agencies shall have procedures in place to
make sure that the assigned URNs are unique and persistent. Since the
number of metadata elements in formats is relatively low (at most a few
hundred), such procedures can be simple (e.g. a URN can be based on the
name of the element).
Security and Privacy:
URN:META identifiers do not have any known security or privacy issues.
They are intended to have publicly-known meanings and do not refer to
specific individuals, groups, or organizations.
Interoperability:
URN:META identifiers do not have any known interoperability-related
issues.
Resolution:
General
URNs in the URN:META namespace MUST be resolvable.
Each registrant of a format-code MUST register a base URL (ending in
"/") for a resolution service for URN:METAs that have that format-code.
A registrant may provide additional base URLs for prefixes composed of
that format-code and one or more following sub-namespaces. The base URL
for resolving a particular URN:META is the base URL for the longest
registered prefix which is an initial part of the URN's prefix.
A URN:META is resolved by composing a URL by concatenating the base URL
with the URN:META and fetching that URL using the normal HTTP/HTTPS GET
method. The retrieved resource SHOULD describe the identified metadata
element and MAY provide or reference further information about it.
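The base URL selection and URL composition described above can be
sketched in Python as follows. The registry contents are assumptions
made for illustration: only http://example.com/ appears in this
document, and the separate base URL for the dc:terms prefix is
hypothetical.

```python
# Hypothetical registry of resolver base URLs, keyed by lower-cased prefix.
REGISTERED_BASE_URLS = {
    "marc": "http://example.com/",
    "dc": "http://example.com/",
    "dc:terms": "http://terms.example.com/",  # assumed extra registration
}


def resolution_url(urn, registry=REGISTERED_BASE_URLS):
    """Compose the URL to fetch: base URL of the longest registered prefix + URN."""
    head, hyphen, _meta_string = urn.partition("-")
    pieces = head.lower().split(":")
    if not hyphen or len(pieces) < 3 or pieces[:2] != ["urn", "meta"]:
        raise ValueError(f"not a URN:META identifier: {urn!r}")
    parts = pieces[2:]  # format-code followed by any sub-namespaces
    # Try the full prefix first, then drop trailing sub-namespaces one by one.
    for end in range(len(parts), 0, -1):
        base_url = registry.get(":".join(parts[:end]))
        if base_url is not None:
            return base_url + urn
    raise LookupError(f"no resolver registered for {urn!r}")
```

The returned URL would then be fetched with an ordinary HTTP or HTTPS
GET request. Matching is done on whole prefix components, so a base URL
registered for dc:terms is never used for a different sub-namespace of
dc.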
Resolution services MAY respond to GET requests with a redirection
response whose Location header field is a URL of a preexisting
description of the element. If information about the element is
available in multiple languages, a resolution service SHOULD use the
HTTP Accept-Language header to select a URL of a resource in the user's
requested language.
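As an illustration of the language selection step, the following Python
sketch picks a redirect target from an Accept-Language header. It is
one possible approach, not a prescribed one. The function names and the
"en" fallback are assumptions, the header parsing is simplified (it
handles only the q parameter), and the target URLs are the ones used
for MARC Bibliographic tag 245 in Example 1 of this document.

```python
# Redirect targets for one URN, keyed by language tag (from Example 1).
TARGETS_BD245 = {
    "en": "https://www.loc.gov/marc/bibliographic/bd245.html",
    "fi": "https://marc21.kansalliskirjasto.fi/bib/20X-24X.htm#245",
    "sv": "http://www.kb.se/katalogisering/Formathandboken/Bibliografiska-formatet/210-249/#245",
}


def preferred_languages(header):
    """Return language tags from an Accept-Language value, best first."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # default weight when no q parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            ranked.append((-quality, position, tag))
    # Highest quality first; ties keep the order used in the header.
    return [tag for _, _, tag in sorted(ranked)]


def select_target(targets, accept_language, default="en"):
    """Pick the URL for the first acceptable language, else the default."""
    for tag in preferred_languages(accept_language or ""):
        if tag == "*":
            break
        primary = tag.split("-")[0]  # e.g. "fi-FI" falls back to "fi"
        if tag in targets:
            return targets[tag]
        if primary in targets:
            return targets[primary]
    return targets[default]
```

A resolver could call select_target with the header value of the
incoming GET request and place the result in the Location header field
of its redirection response.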
A URN-to-URL resolution service that maps each URN to the URL of the
page describing the identified metadata element MUST be supported. Other
resolution services, such as a link to the appropriate cataloguing rule,
may be provided if appropriate.
Example 1
Namespace URN:META:MARC contains URNs which identify the tags of MARC
21 metadata formats in various languages.
Unlike current HTTP URLs, all language versions share the same
identifier.
In this example, URNs are expressed as HTTP URIs which use a
(non-existent) URN resolver located at http://example.com/.
If the language setting of the client is English (which is also the
default), URNs http://example.com/urn:meta:marc- will be resolved to
MARC 21 format-specific pages in the directories
https://www.loc.gov/marc/bibliographic/
https://www.loc.gov/marc/authority/
https://www.loc.gov/marc/holdings/
on the Library of Congress site.
Case 1
URN of the MARC Bibliographic format tag 245 (Title Statement)
Assuming that the registered resolver base URL for urn:meta:marc is
http://example.com/, urn:meta:marc-bd245 would be resolved by fetching
http://example.com/urn:meta:marc-bd245
which (absent an Accept-Language header in the GET request) would
redirect either to
https://www.loc.gov/marc/bibliographic/bd245.html
(full description of the tag)
or to
https://www.loc.gov/marc/bibliographic/concise/bd245.html
(concise description of the tag)
depending on the service requested. If full description is set as a
default, concise description can be made available via an R-component
based service request.
If the GET request contained "Accept-Language: fi" (preferring
Finnish), it would always redirect to
https://marc21.kansalliskirjasto.fi/bib/20X-24X.htm#245
since there is only one Finnish translation, based on the concise
version.
If the GET request contained "Accept-Language: sv" (preferring Swedish),
it would always redirect to
http://www.kb.se/katalogisering/Formathandboken/Bibliografiska-formatet/210-249/#245
since there is only one Swedish translation, based on the concise version.
Because file names of pages describing tags in MARC 21 formats have the
same syntax (MARC Bibliographic uses bdxxx.html, where xxx is the tag
number), no URN – URL mapping table is required in the resolver. Target
URLs can be generated from URNs programmatically. Since the target URLs
have been stable for decades, the need to modify programmatic mapping
should be very infrequent.
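A programmatic mapping of this kind could be sketched in Python as
follows. The function name is hypothetical. The "bd" and "ad" file name
prefixes come from the cases in this example, while "hd" for the
Holdings format is an assumption made by analogy. The sketch covers
only the English pages on the Library of Congress site and does not
handle sub-namespaces.

```python
import re

# File name prefix -> directory on the Library of Congress site.
LOC_DIRECTORIES = {
    "bd": "bibliographic",
    "ad": "authority",
    "hd": "holdings",  # assumption: not shown in the cases of this example
}

# Meta-string: a two-letter format prefix followed by a three-digit tag.
_TAG = re.compile(r"^(bd|ad|hd)(\d{3})$")


def loc_target_url(urn, concise=False):
    """Generate the English target URL for a urn:meta:marc identifier."""
    head, hyphen, meta_string = urn.partition("-")
    if not hyphen or head.lower() != "urn:meta:marc":
        raise ValueError(f"not a urn:meta:marc identifier: {urn!r}")
    match = _TAG.match(meta_string)
    if match is None:
        raise ValueError(f"unsupported meta-string: {meta_string!r}")
    directory = LOC_DIRECTORIES[match.group(1)]
    # Concise descriptions live in a "concise/" subdirectory.
    middle = "concise/" if concise else ""
    return f"https://www.loc.gov/marc/{directory}/{middle}{meta_string}.html"
```

Translated versions that group several tags on one page, such as the
Finnish and Swedish pages cited above, would need their own rules or a
mapping table.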
Case 2
URN of the MARC Authority format tag 100
Assuming that the registered resolver base URL for urn:meta:marc is
http://example.com/, urn:meta:marc-ad100 would be resolved by fetching
http://example.com/urn:meta:marc-ad100
which (absent an Accept-Language header in the GET request) would
redirect to
https://www.loc.gov/marc/authority/ad100.html
If the request contained "Accept-Language: fi" (preferring Finnish), it
would redirect to
https://marc21.kansalliskirjasto.fi/aukt/1XX.htm#100
If the request contained "Accept-Language: sv" (preferring Swedish), it
would redirect to
http://www.kb.se/katalogisering/Formathandboken/Auktoritetsformatet/1XX/#100
Example 2
URN of the Dublin Core "terms" namespace property Title and the Dublin
Core "elements/1.1" namespace element Title.
Assuming that the registered resolver base URL for urn:meta:dc is
http://example.com/, urn:meta:dc:terms-title would be resolved by
fetching
http://example.com/urn:meta:dc:terms-title
which (absent an Accept-Language header in the GET request) would
redirect to
http://purl.org/dc/terms/title
Assuming that the registered resolver base URL for urn:meta:dc is
http://example.com/, urn:meta:dc:elements1.1-title would be resolved by
fetching
http://example.com/urn:meta:dc:elements1.1-title
which (absent an Accept-Language header in the GET request) would
redirect to
http://purl.org/dc/elements/1.1/title
Persistence:
Persistence of URN:META resolution services depends on the persistence
of metadata formats.
Metadata about deprecated MARC formats such as USMARC, UKMARC or
FINMARC may no longer be available at all, and even if it is, its form
and content may not be suitable for URN resolution. See for instance
UKMARC documentation at
https://www.webarchive.org.uk/wayback/archive/20160107133726/http://www.bl.uk/bibliographic/ukmarc.html
or FINMARC documentation at
https://www.kiwi.fi/display/Marc21/FINMARC
Additional documentation / information:
None