Re: [db-wg] UTF8

6 May 2015

      On Wed, May 06, 2015 at 09:13:28AM +0000, denis walker wrote:

Dear Denis
...
Thanks for the clarification. I don't think it makes sense to restrict
the UTF8 to only character sets defined within the RIPE region. (Not
sure it is even technically possible.) But if a Chinese person lives
and works in this region why would they not be able to enter their
This idea came from the fact that if someone lives in this region,
they probably have some documents issued by local authorities.
Of course, in some cases this may not be true. So, good point.
...
correct name? Just for argument's sake, changing my name into Chinese
with Google translate changes the space to a '.'. If that is correct
then the current syntax check fails.
Well spotted.
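
To illustrate the failure mode described above, here is a minimal
sketch in Python. Both patterns are hypothetical and are not the
actual RIPE Database syntax rules; the sample Chinese name is only an
illustrative transliteration, and the separator is assumed to be the
middle dot (U+00B7).

```python
import re

# Hypothetical ASCII-only rule (NOT the real RIPE Database syntax):
# two or more words of ASCII letters/digits separated by single spaces.
ASCII_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.'-]*( [A-Za-z0-9.'-]+)+")

# Hypothetical relaxed rule: any Unicode letters/digits, with either a
# space or a middle dot (U+00B7) accepted as the word separator.
UNICODE_NAME = re.compile(r"[^\W\d_][\w.'-]*([ \u00b7][\w.'-]+)+")


def is_valid_name(value, pattern):
    """Return True if the whole value matches the given pattern."""
    return pattern.fullmatch(value) is not None


# Illustrative values only.
latin_name = "Denis Walker"
polish_name = "Piotr Strzy\u017cewski"
chinese_name = "\u4e39\u5c3c\u65af\u00b7\u6c83\u514b"

for name in (latin_name, polish_name, chinese_name):
    print(name,
          is_valid_name(name, ASCII_NAME),
          is_valid_name(name, UNICODE_NAME))
```

Under these assumptions the ASCII-only rule rejects both the Polish
and the Chinese spelling, while the relaxed rule accepts all three.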
...
Also "person:", "role:" and "org-name:" are all defined as 'lookup
keys'. That means you can enter their values in a query as the query
string and that will be searched on in the database. The individual
This could introduce some inconveniences when using the CLI.
...
'words' from these attribute values are stored in index tables in the
database and searched as part of the query to return objects with
matching values. I believe it is problematic to do string comparison
in UTF8.
I really doubt it. Have you used Google search recently? ;-)

Being more serious, I believe that most countries with their own
alphabets use Internet tools and web pages without translating all
the names, addresses and other things to US-ASCII or Latin1.
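
As a rough sketch of how word indexing and comparison could cope with
UTF8 values, the Python below normalises each word before storing or
looking it up. The function names, the object key and the choice of
NFKC plus case folding are assumptions made for illustration; this is
not how the RIPE Database software is known to work.

```python
import unicodedata
from collections import defaultdict


def normalise(word):
    """Fold a word to a canonical form for comparison.

    NFKC makes composed and decomposed spellings identical;
    casefold() is Unicode-aware lower-casing.
    """
    return unicodedata.normalize("NFKC", word).casefold()


def build_index(objects):
    """Map each normalised word of a name to the keys of objects using it."""
    index = defaultdict(set)
    for key, name in objects.items():
        for word in name.split():
            index[normalise(word)].add(key)
    return index


def lookup(index, query):
    """Return the keys of all objects whose name contains the query word."""
    return index.get(normalise(query), set())


# Hypothetical object key and value, for illustration only.
objects = {"PS1-EXAMPLE": "Piotr Strzy\u017cewski"}
index = build_index(objects)

# The query uses upper case and a decomposed "Z + combining dot above".
print(lookup(index, "STRZYZ\u0307EWSKI"))
```

The point of the sketch is that byte-for-byte comparison of UTF8
strings is indeed unreliable, but comparison after normalisation is a
well-understood technique.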
...
Also the Full Text Search allows searches on all these attributes as
well as "address:", "descr:" and "remarks:". Again all the component
parts of these values are indexed for this search.
...
So to allow any attribute in UTF8 only may require software changes
and may put restrictions on some of the services the database
currently provides. If you cannot rely on a search returning the
correct objects then you cannot allow those searches.
I'm aware that any modification may require software changes.
I hope you are not suggesting that we should abandon any
improvement just because it requires some work.
...
There was a Labs article written some time ago on UTF8:
https://labs.ripe.net/Members/kranjbar/internationalisation-of-ripe-database
...
This article put forward the idea of keeping all existing attributes
in ASCII (but really meant Latin1) and allowing additional optional
attributes for name and contact details in local language. I think
that would be a good first step to provide additional benefits of
localisation without breaking any of the current functionality. Even
if it was only an interim step it would allow time to assess any issues
and monitor the usefulness of these new attributes.
It was back in 2010, during RIPE 61, that I proposed person-idn: and
other similar attributes. Although I understand your point of view, I
believe that the situation has changed over the years.

Best regards,
Piotr

-- 
gucio -> Piotr Strzyżewski
E-mail: Piotr.Strzyzewski@polsl.pl
