Procedures for the IDN Repository
Overview
The Repository of IDN Practices was created to support the development of internationalized domain names (IDNs) by promoting the sharing of registry IDN policies. These policies are referred to as “Label Generation Rulesets” (LGRs), and were historically known as “IDN tables” or “variant tables.”
Specifically, as described in the Guidelines for the Implementation of Internationalized Domain Names:
A registry will publish one or several lists of Unicode code points that are permitted for registration and will not accept the registration of any name containing an unlisted code point. Each such list will indicate the script or language(s) it is intended to support. If registry policy treats any code point in a list as a variant of any other code point, the nature of that variance and the policies attached to it will be clearly articulated. All such code point listings will be placed in the IANA Repository for IDN TLD Practices in tabular format together with any rules applied to the registration of names containing those code points, before any such registration may be accepted.
We do not maintain the content of the data and hence are not responsible for its accuracy. The sole purpose of the repository is to publish LGRs that have been verified as coming from representatives of domain registries. Registries that implement IDN support are strongly encouraged to use this repository, and some may be contractually required to do so by ICANN.
LGR Requirements
LGRs should be listed in one of the following formats:
- Label Generation Ruleset format (recommended). See RFC 7940.
- ASCII or UTF-8 encoded text file, adhering to common conventions such as those documented in RFC 3743 and RFC 4290. The format should be applied consistently throughout the document and allow for machine parsing.
An LGR must contain the following elements in its header:
- Script or Language Designator (see below for guidance)
- Version Number (this must increase with each amendment to the LGR, even if the updates are limited to the header itself)
- Effective Date (the date at which the policy becomes applicable in operational use)
- Registry Contact Details (contact name, email address, and/or phone number)
Bear in mind that the main purpose of this repository is to encourage re-use and analysis. As such, we recommend against submitting data in other formats unless it is not possible to represent the LGR using these available standards.
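As a rough illustration of the header requirements above, the following sketch reports which required elements are missing from a header. It assumes the header has already been parsed into a dictionary. The key names (`designator`, `version`, `effective_date`, `contact`) and the function name are hypothetical and are not defined by RFC 7940 or by this repository.

```python
# Hypothetical key names for the four required header elements.
REQUIRED_HEADER_FIELDS = ("designator", "version", "effective_date", "contact")


def missing_header_fields(header):
    """Return the required fields that are absent or blank in a parsed header."""
    missing = []
    for field in REQUIRED_HEADER_FIELDS:
        value = header.get(field)
        # Treat None and whitespace-only values the same as an absent field.
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


if __name__ == "__main__":
    example = {"designator": "zh-Hans", "version": "2", "effective_date": ""}
    print(missing_header_fields(example))
```

A submitter could run a check like this before sending a request, since an LGR that lacks any required element is not forwarded for confirmation.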
Registration Process
All delegated TLD registries can apply to register an LGR or submit an update to an existing LGR. Deprecated LGRs will be archived and marked as historical.
gTLD Submissions
Service requests should be submitted to ICANN via the appropriate Naming Services portal. For more information, see IDN service requests.
ccTLD and Other Submissions
- Anyone may submit a request to list or update an LGR, but we will ask for its posting to be approved by the current Administrative Contact for the domain in the Root Zone Database. Before submitting any registration or change request, the submitter should ensure that the Administrative Contact is aware of the proposed LGR and ready to consent to its publication.
- A registration request must include all of the necessary elements described in the requirements listed above. Requests should be sent to the email address idn-tables@iana.org. The submitter will receive a confirmation email containing a ticket number.
- Requests comprising multiple LGRs should be sent in the form of an archive (such as a ZIP or TAR file). For particularly complex applications involving many LGRs or domains, please contact us in advance to identify the most efficient method of submission, one that avoids unnecessary repetition.
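One possible way to prepare a multi-LGR request is to bundle the files into a single ZIP archive. The sketch below does this in memory with Python's standard `zipfile` module. The function name and the example file names are hypothetical; no particular archive layout is prescribed here.

```python
import io
import zipfile


def bundle_lgrs(files):
    """Pack a {filename: text content} mapping into ZIP archive bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Sort names so the archive listing is stable between runs.
        for name in sorted(files):
            archive.writestr(name, files[name])
    return buffer.getvalue()


if __name__ == "__main__":
    data = bundle_lgrs({"example_sv_1.xml": "<lgr/>", "example_ko_1.xml": "<lgr/>"})
    with open("lgr-submission.zip", "wb") as handle:
        handle.write(data)
```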
Within five days of submission, either:
- the Administrative Contact will receive an email, with the Technical Contact and submitter in copy, requesting the Administrative Contact’s confirmation that the details of the request are correct, or
- the submitter will receive notice that the LGR does not contain all of the elements necessary for registration. The LGR will not be sent to the Administrative Contact for confirmation until any necessary changes have been made.
LGRs may be revised according to the same process (i.e., a request to idn-tables@iana.org followed by a confirmation request to the Administrative Contact). An updated LGR must have a version number that is distinct from the current and any previously posted version of the LGR. Updates should be accompanied by a description of the reason(s) for the request.
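The version rule can be sketched as a simple comparison. The example assumes version numbers are plain integers, which is a simplification; the function name is hypothetical. Because the header requirements ask for a version number that increases with each amendment, requiring the new number to exceed every posted one also guarantees that it is distinct from them.

```python
from typing import Iterable


def is_acceptable_version(new_version: int, posted_versions: Iterable[int]) -> bool:
    """Return True if new_version is higher than every version already posted."""
    posted = list(posted_versions)
    if not posted:
        # A first registration has nothing to conflict with.
        return True
    # Exceeding the maximum implies the number is distinct from all posted ones.
    return new_version > max(posted)


if __name__ == "__main__":
    print(is_acceptable_version(3, [1, 2]))
    print(is_acceptable_version(2, [1, 2]))
```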
Language and Script Codes
Each Label Generation Ruleset (LGR) needs to specify the languages and scripts it covers. The following is a list of common languages and scripts to aid in the classification process. For less common classifications, refer to the IANA Language Subtag Registry and RFC 5646.
Language/Script | Code | Notes |
---|---|---|
Korean | ko | |
Chinese | zh | |
Simplified Chinese | zh-Hans | |
Traditional Chinese | zh-Hant | |
Brazilian Portuguese | pt-BR | |
American English | en-US | |
English | en | |
Swedish | sv | |
Latin (script) | Latn | |
Arabic (language) | ar | |
Arabic (script) | Arab | Includes languages beyond Arabic language, such as Urdu and Persian |
Cyrillic | Cyrl | |
Japanese (script) | Jpan | Includes Katakana, Hiragana and Kanji |
For special cases there may be a need to list entries without a specific language or script classification. In such cases use "und".
Note that we are unable to accept registrations for language and script codes that are not recognized in the Language Subtag Registry.
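A designator can be checked against the Language Subtag Registry before submission. The sketch below assumes the registry's record-jar layout (records separated by `%%` lines, with `Type:` and `Subtag:` fields) and uses a short illustrative excerpt reduced to those two fields rather than the real file. The function names are hypothetical. The matching is simplified: it handles only a lone script code or a language code followed by optional script and region subtags, and it expects the registry's own casing.

```python
import re

# Illustrative excerpt, abbreviated to the two fields this sketch reads.
SAMPLE_REGISTRY = """\
%%
Type: language
Subtag: ko
%%
Type: language
Subtag: zh
%%
Type: language
Subtag: und
%%
Type: script
Subtag: Hans
%%
Type: script
Subtag: Latn
%%
Type: region
Subtag: TW
"""


def load_subtags(registry_text):
    """Collect subtags by type from "%%"-separated registry records."""
    subtags = {}
    for record in re.split(r"(?m)^%%\s*$", registry_text):
        fields = {}
        for line in record.splitlines():
            # Skip blank lines, folded continuation lines, and non-field lines.
            if not line.strip() or line[0].isspace() or ":" not in line:
                continue
            key, value = line.split(":", 1)
            fields.setdefault(key.strip(), value.strip())
        if "Type" in fields and "Subtag" in fields:
            subtags.setdefault(fields["Type"], set()).add(fields["Subtag"])
    return subtags


def is_recognized(designator, subtags):
    """Simplified check: a script code alone, or language[-script][-region]."""
    parts = designator.split("-")
    if len(parts) == 1 and parts[0] in subtags.get("script", set()):
        return True
    if parts[0] not in subtags.get("language", set()):
        return False
    for part in parts[1:]:
        kind = "script" if len(part) == 4 else "region"
        if part not in subtags.get(kind, set()):
            return False
    return True


if __name__ == "__main__":
    known = load_subtags(SAMPLE_REGISTRY)
    for tag in ("zh-Hans", "Latn", "und", "zh-Hanz"):
        print(tag, is_recognized(tag, known))
```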
Common confusion with country codes
As language names and country names are often similar, if not identical, there is often confusion between the two. Be careful not to accidentally use a country code when you intend to use a language code. Common errors we see include "kr" instead of "ko" for Korean, "jp" instead of "ja" for Japanese, and "dk" instead of "da" for Danish.
Country codes may be used, however, as geographic designators (which appear in upper case in language tags). For example, "ja-JP" means "Japanese as used in Japan," and "zh-TW" means "Chinese as used in Taiwan." Use script tags ("Hant," "Hans," etc.) to denote distinctions between writing systems.
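The two points above can be sketched in code. The first helper flags the three mix-ups named in this section; the mapping is limited to those examples and is not exhaustive. The second applies the conventional casing for language tags: lower-case language, title-case four-letter script, and upper-case two-letter region. Both function names are hypothetical, and the casing helper ignores other subtag types such as variants and extensions.

```python
# The three mistakes named in the text; not an exhaustive list.
COUNTRY_CODE_MISTAKES = {"kr": "ko", "jp": "ja", "dk": "da"}


def suggest_language_code(code):
    """Return the likely intended language code, or None if no known mix-up."""
    return COUNTRY_CODE_MISTAKES.get(code.strip().lower())


def apply_tag_casing(tag):
    """Apply conventional casing to a language[-script][-region] tag."""
    parts = tag.strip().split("-")
    formatted = [parts[0].lower()]
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            formatted.append(part.title())   # script, e.g. "Hant"
        elif len(part) == 2 and part.isalpha():
            formatted.append(part.upper())   # region, e.g. "TW"
        else:
            formatted.append(part.lower())   # anything else is left lower-case
    return "-".join(formatted)


if __name__ == "__main__":
    print(suggest_language_code("jp"))
    print(apply_tag_casing("zh-hant"))
```

Treat a suggestion as a prompt to double-check rather than a verdict, since the same two-letter string can be valid in both the country and the language code lists.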