Stavros,

On Wed, 19 Aug 2015 16:00:59 +0200
Stavros Konstantaras <stavros@nlnetlabs.nl> wrote:
> To make things clearer, our project at NLnet Labs is the development of "a modern IRRtoolset" written in Python and targeting any operator, not just another homemade tool for our own needs. Or, in more detail, the creation of a tool that is able to configure BGP routers directly and automatically by extracting the policies from the RIPE DB. This means that I am trying hard to avoid the (re)use of any past or legacy software that exists, otherwise we lose independence and inherit restrictions from the past.
I don't know of any Python RPSL parser you can use.

Your goals make a certain amount of sense. Unfortunately RPSL is an ugly language... it's actually more like several languages in one, all bad. :P

* Start with simple attribute/value, one per line
* Oh, but then "RPSL-ize", which allows line continuation and end-of-line comments
* Oh, actually we want to make a separate grammar for each attribute
* But be sure to make all of this "extensible", to ensure maximum confusion!

My approach in the past has been to start with a simple generic parser that splits text into objects and objects into attributes, then extracts attribute names & values (handling line continuation and end-of-line comments). You can do all of this with Python regular expressions, something like r"\n(?![ \t+])" to split objects into attributes; cleaning end-of-line comments & line continuation characters is straightforward.

Then you can deal with the grammar for the attributes you know, and ignore the rest. You can extend the parsing to be more and more comprehensive until you are able to parse the data set you care about.

I'm not sure this is worth doing... but if there were a good, easy-to-use Python library then maybe interesting things would result?

Cheers,

--
Shane
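
For illustration, here is a minimal sketch of the generic first pass described above: split text into objects, objects into attributes, then clean comments and continuation markers. Only the attribute-splitting regex comes from this message. The function names, the blank-line object separator, the handling of lines starting with "#" or "%", and the sample data are assumptions made for the example, and the sketch deliberately knows nothing about per-attribute grammars.

```python
import re

# Assumption: objects are separated by one or more blank lines.
OBJECT_SPLIT = re.compile(r"\n\s*\n")
# From the message: a newline NOT followed by space, tab or "+" starts
# a new attribute; anything else is a continuation line.
ATTRIBUTE_SPLIT = re.compile(r"\n(?![ \t+])")
# End-of-line comment: "#" up to the end of the physical line.
COMMENT = re.compile(r"#.*")


def parse_attribute(chunk):
    """Turn one (possibly multi-line) attribute into a (name, value) pair."""
    name, sep, value = chunk.partition(":")
    if not sep:
        return None  # no "name:" prefix, so not an attribute
    lines = [COMMENT.sub("", line) for line in value.split("\n")]
    # Every line after the first begins with a continuation character
    # (space, tab or "+"); drop it before joining.
    parts = [lines[0]] + [line[1:] for line in lines[1:]]
    cleaned = " ".join(p.strip() for p in parts if p.strip())
    return name.strip().lower(), cleaned


def parse_objects(text):
    """Split text into objects; each object is a list of (name, value)."""
    text = text.replace("\r\n", "\n").strip("\n")
    objects = []
    for block in OBJECT_SPLIT.split(text):
        attributes = []
        for chunk in ATTRIBUTE_SPLIT.split(block):
            # Assumption: lines starting with "#" or "%" are whole-line
            # comments or server remarks, not attributes.
            if not chunk.strip() or chunk[0] in "#%":
                continue
            parsed = parse_attribute(chunk)
            if parsed is not None:
                attributes.append(parsed)
        if attributes:
            objects.append(attributes)
    return objects


# Made-up sample using documentation AS numbers and prefixes.
SAMPLE = """\
aut-num:    AS64496   # end-of-line comment
as-name:    EXAMPLE-AS
import:     from AS64497
            accept ANY
export:     to AS64497
+           announce AS64496

route:      192.0.2.0/24
origin:     AS64496
"""

if __name__ == "__main__":
    for obj in parse_objects(SAMPLE):
        print(obj)
```

Each object comes back as an ordered list of pairs rather than a dict, because attributes such as import and export can repeat within one object. Per-attribute grammars could then be layered on top by dispatching on the attribute name and leaving unknown names as plain strings.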