Hi,
I assume you're referring to the daily dumps that we release here:
https://data-store.ripe.net/datasets/atlas-daily-dumps/
    
There are a couple of things that I find are relatively slow to deal with on the command line: standard bzip2 tooling, and jq for JSON parsing. So I lean on a couple of other tools to speed things up for me:
- the lbzip2 suite parallelises parts of the compress/decompress pipeline
- GNU parallel can split data in a pipe onto one process per core
    
So, for example, on my laptop I can reasonably quickly pull out all of the traceroutes my own probe ran:

  lbzcat traceroute-2018-07-23T0700.bz2 | parallel -q --pipe jq '. | select(.prb_id == 14277)'
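A variation on the same pipeline, in case you only need a couple of fields from each result (a rough sketch; I'm assuming the usual msm_id/dst_addr/timestamp keys from the traceroute result format, and the output filename is just an example):

  # keep only a few fields per result to shrink the output a lot
  lbzcat traceroute-2018-07-23T0700.bz2 \
    | parallel -q --pipe jq -c 'select(.prb_id == 14277) | {msm_id, dst_addr, timestamp}' \
    > probe-14277.ndjson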
      
Stéphane has written about using jq to parse Atlas results on labs.ripe.net also:
https://labs.ripe.net/Members/stephane_bortzmeyer/processing-ripe-atlas-results-with-jq
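In the same spirit, jq combines nicely with the standard sort/uniq tools. As a sketch, something like this should give a rough count of results per probe in one hourly dump (same example filename as above):

  # ten most prolific probes in this dump
  lbzcat traceroute-2018-07-23T0700.bz2 \
    | parallel --pipe jq -r '.prb_id' \
    | sort | uniq -c | sort -rn | head

The sort step also undoes the interleaving that parallel's block-splitting introduces, so the counts come out correct.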
Happy to hear from others what tools they use for data processing!
Cheers,
S.
    
Dear RIPE Atlas users,

I am studying the processing of the data collected by the probes as a Big Data problem. For instance, one hour of traceroute data amounts to 500 MB (bzip2-compressed), so about 7 GB of data in text format. Can you share how you deal with this data in practice? Are you using a powerful machine, or Big Data tools?

Best regards,
Hayat