Hi all

On 06/05/2015 14:46, Job Snijders wrote:

Hi all,

On Wed, May 06, 2015 at 02:07:56PM +0200, Piotr Strzyzewski wrote:

correct name? Just for argument's sake, changing my name into Chinese
with Google translate changes the space to a '.'. If that is correct
then the current syntax check fails.

Well spotted.

The syntax check might need fixing then. The assumption that every name
consists of at least two strings separated by a space is based on
nothing. I would consider this a bug. Who will file the issue on GitHub?
:-)
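
To make the failure concrete, here is a minimal sketch in Python. It is not the actual RIPE Database check, only a hypothetical stand-in for "at least two strings separated by a space". The function names and the sample name are assumptions for illustration.

```python
# Hypothetical stand-in for the check discussed above; the real
# RIPE Database syntax rule may be implemented differently.
def strict_name_check(name: str) -> bool:
    """Accept only names with at least two whitespace-separated words."""
    return len(name.split()) >= 2


def relaxed_name_check(name: str) -> bool:
    """Accept any name that is not empty or whitespace only."""
    return bool(name.strip())


if __name__ == "__main__":
    # Illustrative sample: a transliterated name joined by a dot
    # instead of a space, so it forms a single "word".
    for sample in ("Job Snijders", "约翰·史密斯"):
        print(sample, strict_name_check(sample), relaxed_name_check(sample))
```

The strict version rejects the second name because it contains no space, while the relaxed version only requires the value to be non-empty.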

I agree there is no reason to keep that specific syntax check. But my point was that changing any attribute to UTF8 only may affect syntax checks or business rules, and these need to be considered.

Also "person:", "role:" and "org-name:" are all defined as 'lookup
keys'.

This could introduce some inconveniences when using the CLI.

Why would this be an issue with UTF8? Can someone from RIPE NCC comment
on how this looks from the technical side of things?

There was a Labs article written some time ago on
UTF8 https://labs.ripe.net/Members/kranjbar/internationalisation-of-ripe-database

This article put forward the idea of keeping all existing attributes
in ASCII (but really meant Latin1) and allowing additional optional
attributes for name and contact details in local language.

It was back in 2010 during RIPE 61 when I proposed person-idn: and
other similar attributes. Although I understand your point of view, I
believe that the situation has changed over the years.

So you two are leaning towards allowing UTF8 in some fields, and in
other places adding an optional new attribute (such as person-idn) if
people want to describe more clearly what their actual name is?

If this is the case it would be good if you go over all
fields/attributes the database currently knows, and compile a full list
of attributes that should receive an idn-sibling or should accept UTF8
instead of whatever they currently accept.

This is what I suggested a few years ago. Someone (maybe a task force?) needs to look at every attribute in every object and choose one of three categories for it:

- Latin1 only: some attributes make no sense in local language, e.g. status, import
- Duplicated: some attributes may need to be available in Latin1 for registry consistency, legal reasons, or simply to maintain a database that the whole region can make use of, but could also be duplicated in local language, e.g. org-name, abuse-mailbox
- UTF8 only: some attributes could be open to any character set, e.g. remarks, notify (only relevant to maintainer of object)
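
As a rough sketch of how such a classification could be used, the Python below maps the example attributes from the list to the three categories and checks a value against them. The "-idn" suffix for the local-language sibling follows the person-idn idea mentioned earlier in the thread. The names Charset, ATTRIBUTE_CATEGORY and value_allowed are assumptions for illustration, not anything that exists in the database software.

```python
from enum import Enum


class Charset(Enum):
    LATIN1_ONLY = "latin1-only"
    DUPLICATED = "duplicated"   # Latin1 attribute plus a local-language sibling
    UTF8_ONLY = "utf8"          # open to any character set


# Example assignments taken from the list above; a real list would
# cover every attribute of every object type.
ATTRIBUTE_CATEGORY = {
    "status": Charset.LATIN1_ONLY,
    "import": Charset.LATIN1_ONLY,
    "org-name": Charset.DUPLICATED,
    "abuse-mailbox": Charset.DUPLICATED,
    "remarks": Charset.UTF8_ONLY,
    "notify": Charset.UTF8_ONLY,
}

IDN_SUFFIX = "-idn"  # hypothetical naming convention for sibling attributes


def is_latin1(value: str) -> bool:
    """True if every character can be encoded as Latin1 (ISO 8859-1)."""
    try:
        value.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def value_allowed(attribute: str, value: str) -> bool:
    """Check a value against the character-set category of its attribute."""
    if attribute.endswith(IDN_SUFFIX):
        # A sibling is only valid for attributes marked as duplicated.
        base = attribute[: -len(IDN_SUFFIX)]
        return ATTRIBUTE_CATEGORY.get(base) is Charset.DUPLICATED
    category = ATTRIBUTE_CATEGORY[attribute]
    if category is Charset.UTF8_ONLY:
        return True
    # Latin1-only attributes and the primary copy of duplicated ones.
    return is_latin1(value)
```

With this layout, a value such as "示例" would be rejected for org-name but accepted for a hypothetical org-name-idn, while remarks accepts either.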

This requires a bit more preparation work and introducing new attributes, but in the end it allows much more of the database to be opened up to the possibility of UTF8 without restricting any of its value or usage throughout the whole region.

cheers
denis


Kind regards,

Job