Hi David & DB folks!
What happened:
The software first searches for 'U. Schaefer' and cannot find anything. Then it breaks 'U. Schaefer' up into the following fields:
U (trailing dots are discarded)
Schaefer
The U key is then discarded because it is too short (for a good reason).
As far as I remember, this is in line with a recommendation from the DB-WG.
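The key generation described above can be sketched roughly as follows. This is only an illustration in Python, not the actual RIPE database code: the function name `name_keys` and the threshold `MIN_KEY_LENGTH` are assumptions, since the thread only says that the 'U' key is "too small".

```python
# Hypothetical sketch of the key generation described in the thread.
MIN_KEY_LENGTH = 2  # assumed threshold; the thread gives no exact value


def name_keys(name, min_len=MIN_KEY_LENGTH):
    """Split a name into lookup keys.

    Trailing dots are discarded, and keys shorter than min_len are dropped.
    """
    keys = []
    for part in name.split():
        key = part.rstrip(".")  # "U." becomes "U", "Schaefer." becomes "Schaefer"
        if len(key) >= min_len:
            keys.append(key)
    return keys


print(name_keys("U. Schaefer"))  # ['Schaefer']
```

Because trailing dots are stripped before indexing, 'Schaefer.' and 'Schaefer' produce the same key, which is the behaviour the discussion further down wants to preserve.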
... Short discussion:
- U. Schaefer
Names like this will be rejected in the next release of the updating code. Names should consist of at least two parts, not counting abbreviated parts/titles.
Good stuff. But I think we should try to weed out offending things before (or in parallel with) the introduction of these more stringent checks. I think we just add another level of complexity and user confusion if we don't...
- The inetnum object is incorrect: It references a non-existing person object. This will not happen in normal circumstances. The new (not yet deployed) updating code will disallow (most) non-existing references.
Hmmm.... What about allowing an update but issuing a warning? And the other way 'round - issue a warning if an object which is still referenced gets removed. I'm not sure whether it makes sense to *sometimes* enforce link sanity and to allow for breaking them later (e.g. with a delete for the referenced object). Maybe those are just some aspects of a (hypothetical) package of sanity checks that might be activated regularly or upon request.
- the software should have found 'U. Schaefer' if the object existed (and nothing more, see algorithm above)
- I can change the indexing in such a way that trailing dots are indexed. However, this might be worse than the problem: accidental dots after names cause the generation of different keys from the objects where the dots are left out, and thus objects that are obviously the same are not seen as such anymore (by the software).
I don't think that we should make software changes to accommodate syntax errors. In a lot of different documents, the syntax for a person object is clearly defined: name [initials] name - the initials to be registered without a trailing dot. At the same time, titles and gender-specific local-language "syntactic sugar" are not allowed. Not sure whether there's a chance to guard against the proliferation of "Herr", "Frau", "Professor" etc...

WRT "fuzzy" / wildcard matches - it would make the RIPE-DB software more user friendly (and maybe more attractive for deployment by other registries) to provide for (implicit/explicit) regexp or partial key matches. But I think that would be a major (conceptual) change and should be discussed, both technically and in terms of the human resources involved. Even if it might be easy to implement :-)

Wilfried.
--------------------------------------------------------------------------
Wilfried Woeber                   :  e-mail: Woeber@CC.UniVie.ac.at
Computer Center - ACOnet          :
Vienna University                 :  Tel: +43 1 4065822 355
Universitaetsstrasse 7            :  Fax: +43 1 4065822 170
A-1010 Vienna, Austria, Europe    :  NIC: WW144
--------------------------------------------------------------------------
Hi,

As APNIC uses the RIPE database software, I thought I might intrude here for a second and throw in my two yen.
Names like this will be rejected in the next release of the updating code. Names should consist of at least two parts, not counting abbreviated parts/titles.
As I have mentioned in the past, there are people who have only one name. If you decide to do this, please make a compile or run-time configuration option that will defeat this syntax check.
WRT "fuzzy" / wildcard matches - it would make the RIPE-DB software more user friendly (and maybe more attractive for deployment by other registries) to provide for (implicit/explicit) regexp or partial key matches.
To what end? Is the purpose of the database registration information lookup or a more general white pages service? I would suggest the former and leave the latter to the WHOIS++/LDAP/X.500 crowd.

Regards,
-drc
Hi David,
David R. Conrad writes :
As APNIC uses the RIPE database software, I thought I might intrude here for a second and throw in my two yen.
Names like this will be rejected in the next release of the updating code. Names should consist of at least two parts, not counting abbreviated parts/titles.
As I have mentioned in the past, there are people who have only one name. If you decide to do this, please make a compile or run-time configuration option that will defeat this syntax check.
Don't worry. I have already heard about your concerns. It is supported:

# NROFNAMES
#
# minimal number of components that a name should consist of

NROFNAMES 2
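To illustrate how a configurable minimum such as NROFNAMES might interact with the rule "at least two parts, not counting abbreviated parts/titles", here is a minimal Python sketch. It is not the real updating code: the names `counted_parts`, `name_ok` and the `TITLES` list are hypothetical, and treating single letters and words ending in a dot as abbreviations is an assumption.

```python
# Hypothetical sketch of a configurable name-component check.
NROFNAMES = 2  # minimal number of components, as in the config quoted above

# Assumed, deliberately short title list (the thread asks to keep it minimal).
TITLES = {"herr", "frau", "professor", "miss"}


def counted_parts(name):
    """Return the name parts that count towards the minimum."""
    parts = []
    for word in name.split():
        # Assumption: a trailing dot or a single letter marks an abbreviation.
        if word.endswith(".") or len(word) == 1:
            continue
        if word.lower() in TITLES:
            continue
        parts.append(word)
    return parts


def name_ok(name, min_parts=NROFNAMES):
    """True if the name has at least min_parts countable components."""
    return len(counted_parts(name)) >= min_parts
```

With the default of 2, 'U. Schaefer' is rejected and 'David R. Conrad' is accepted. A registry that needs to accept single names could call `name_ok(name, min_parts=1)`, which mirrors setting NROFNAMES to 1.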
WRT "fuzzy" / wildcard matches - it would make the RIPE-DB software more user friendly (and maybe more attractive for deployment by other registries) to provide for (implicit/explicit) regexp or partial key matches.
To what end? Is the purpose of the database registration information lookup or a more general white pages service? I would suggest the former and leave the latter to the WHOIS++/LDAP/X.500 crowd.
In general, I agree with this principle: The whois servers should be fast & reliable. We should only support queries that the users (e.g. the networking community) need. We had better not support lookup methods that are more useful for the 'sales & marketing' department. However, this doesn't mean that we shouldn't be user-friendly.

David K.
On Fri, 19 Jul 1996 10:13:53 +0200 (MET DST) David.Kessens@ripe.net wrote:
Names like this will be rejected in the next release of the updating code. Names should consist of at least two parts, not counting abbreviated parts/titles.
As I have mentioned in the past, there are people who have only one name. If you decide to do this, please make a compile or run-time configuration option that will defeat this syntax check.
Don't worry. I have already heard about your concerns. It is supported:
# NROFNAMES
#
# minimal number of components that a name should consist of
NROFNAMES 2
I'm not convinced that enforcing the name to consist of 2 words is really going to solve the problem. There are a couple of problems here:

1. People enter names in the database with an illegal format:
   U. Schultz
   Miss Emma Peel
   Smith (Joe Smith, but only the last name)

2. Namespace collisions, which are more likely the smaller and the more common the key is (I have not found somebody else with my name yet, but it is more likely that 'Smith' collides with something else). With the growth of the Internet, I believe that using people's names as (exclusive) search key is no longer sufficient.

3. In some areas of the world (India?), people *really* have only one name. Making the database resistant against these names doesn't help: I see no reason to lock them out from using registries other than the APNIC 'names fix' version.

Rather than locking out people from India, I believe that the correct approach is to require people to use NIC handles. We have been migrating to that for quite some time, and maybe now is the time to cut over. We all know that NIC handles work quite well; there is already much operational experience with this as it is used in day-to-day operation. There may still be a couple of small open issues but most of them have been dragging around for a long time and they will be resolved much more quickly once NIC handles are enforced.

As to the 'illegal name' issue, I wonder if there are places where the correct spelling of names allows a dot in the first word or the last word (middle words are OK). Rejecting a dot there would allow cases like:
   David R. Conrad
but block cases like:
   Prof. Jones
   U. Schultz
This check isn't perfect, but is reasonable because it catches most common mistakes, and the ones that are left are harmless.

My stuiver's worth,

Geert Jan
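The dot check suggested above (no dot in the first or last word, dots in middle words are fine) could look something like this in Python. The function name `dots_ok` is hypothetical, and the sketch covers only this one rule, not the rest of the person object syntax.

```python
# Hypothetical sketch of the "no dot in first or last word" check.
def dots_ok(name):
    """Accept a name only if neither its first nor its last word has a dot."""
    words = name.split()
    if not words:
        return False  # an empty name is rejected outright
    return "." not in words[0] and "." not in words[-1]


for candidate in ("David R. Conrad", "Prof. Jones", "U. Schultz"):
    print(candidate, "->", "ok" if dots_ok(candidate) else "rejected")
```

Note that a single-word name such as 'Smith' passes this check, so it does not by itself lock out people with only one name.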
Hi Wilfried,
Wilfried Woeber, UniVie/ACOnet writes :
Short discussion:
- U. Schaefer
Names like this will be rejected in the next release of the updating code. Names should consist of at least two parts, not counting abbreviated parts/titles.
Good stuff. But I think we should try to weed out offending things before (or in parallel with) the introduction of these more stringent checks. I think we just add another level of complexity and user confusion if we don't...
I feel that it is best to introduce these checks immediately. The earlier you do it, the better. The database grows fast (as the Internet does). It can only save (much) work in the future.
- The inetnum object is incorrect: It references a non-existing person object. This will not happen in normal circumstances. The new (not yet deployed) updating code will disallow (most) non-existing references.
Hmmm.... What about allowing an update but issuing a warning?
I can always issue a warning instead of an error. However, I think that an error message is more appropriate in case of an error - missing references are errors in my view.
And the other way 'round - issue a warning if an object which is still referenced gets removed. I'm not sure whether it makes sense to *sometimes* enforce link sanity and to allow for breaking them later (e.g. with a delete for the referenced object).
Deletion of referenced objects will now also be checked. This will also generate an error. I can easily change this to a warning if people want to, although my personal opinion is that this should be an error.
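The two reference checks discussed here (refusing an update that points at a non-existing object, and refusing to delete an object that is still referenced) can be sketched as below. This is a toy in-memory model in Python, not the RIPE database implementation; `TinyDB`, `IntegrityError` and the object keys are all hypothetical.

```python
# Hypothetical in-memory model of the two reference checks.
class IntegrityError(Exception):
    """Raised when an update or delete would leave a dangling reference."""


class TinyDB:
    def __init__(self):
        self.objects = {}  # object key -> set of keys it references

    def update(self, key, refs=()):
        """Add or replace an object; every reference must already exist."""
        missing = sorted(r for r in refs if r != key and r not in self.objects)
        if missing:
            raise IntegrityError(f"{key} references non-existing {missing}")
        self.objects[key] = set(refs)

    def delete(self, key):
        """Remove an object unless another object still references it."""
        users = sorted(k for k, refs in self.objects.items()
                       if k != key and key in refs)
        if users:
            raise IntegrityError(f"{key} is still referenced by {users}")
        self.objects.pop(key, None)
```

Turning either error into a warning, as floated in the thread, would amount to logging the message and carrying on instead of raising.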
- the software should have found 'U. Schaefer' if the object existed (and nothing more, see algorithm above)
- I can change the indexing in such a way that trailing dots are indexed. However, this might be worse than the problem: accidental dots after names cause the generation of different keys from the objects where the dots are left out, and thus objects that are obviously the same are not seen as such anymore (by the software).
I don't think that we should make software changes to accommodate syntax errors. In a lot of different documents, the syntax for a person object is clearly defined: name [initials] name - the initials to be registered without a trailing dot.
Agreed.
At the same time, titles and gender-specific local-language "syntactic sugar" are not allowed. Not sure whether there's a chance to guard against the proliferation of "Herr", "Frau", "Professor" etc...
I do some minimal checking. You are most likely to be caught if you live in North-West Europe ;-). You may always send me a list of very common titles in your country, and I will add them, but be aware that I want to keep this checking minimal, to avoid rejecting people (in other countries) whose name happens to be the same as a title.
WRT "fuzzy" / wildcard matches - it would make the RIPE-DB software more user friendly (and maybe more attractive for deployment by other registries) to provide for (implicit/explicit) regexp or partial key matches.
See my comments in my mail to David Conrad.
But I think that would be a major (conceptual) change and should be discussed, both technically and in terms of the human resources involved. Even if it might be easy to implement :-)
It might not be that difficult to implement. However, it might very well violate one of the success factors of the database, that is: quick answers.

David K.
participants (4)
- David R. Conrad
- David.Kessens@ripe.net
- Geert Jan de Groot
- Wilfried Woeber, UniVie/ACOnet