My Atlas probe (ID 14587) has been showing as offline since December 16th. I can see DHCP requests/replies, it ARPs for the gateway and I can see DNS requests and pings from the probe, and successful replies to all of this traffic being sent back to the probe. e.g.:

12:13:38.790096 IP 192.168.0.115.35404 > 217.146.115.154.53: 2+ AAAA? ctr-nue15.atlas.ripe.net. (42)
12:13:38.797337 IP 217.146.115.154.53 > 192.168.0.115.35404: 2 1/6/12 AAAA 2a01:4f8:120:14ee::2 (482)
12:13:38.798179 IP 192.168.0.115.57785 > 217.146.115.154.53: 3+ A? ctr-nue15.atlas.ripe.net. (42)
12:13:38.799649 IP 217.146.115.154.53 > 192.168.0.115.57785: 3 1/6/12 A 78.46.87.51 (470)
12:14:58.990977 IP 192.168.0.115.37934 > 217.146.115.154.53: 2+ AAAA? IUSB-READONLY.U36.M6466B3B0EC98.sos.atlas.ripe.net. (68)
12:14:59.060132 IP 217.146.115.154.53 > 192.168.0.115.37934: 2 1/1/2 AAAA 2001:67c:2e8:11::c100:1337 (164)
12:14:59.060983 IP 192.168.0.115.57523 > 217.146.115.154.53: 3+ A? IUSB-READONLY.U36.M6466B3B0EC98.sos.atlas.ripe.net. (68)
12:14:59.114915 IP 217.146.115.154.53 > 192.168.0.115.57523: 3 1/1/2 A 193.0.19.55 (152)
12:14:59.115925 IP 192.168.0.115 > 193.0.19.55: ICMP echo request, id 954, seq 0, length 64
12:14:59.142586 IP 193.0.19.55 > 192.168.0.115: ICMP echo reply, id 954, seq 0, length 64
12:15:00.116800 IP 192.168.0.115 > 193.0.19.55: ICMP echo request, id 954, seq 1, length 64
12:15:00.143773 IP 193.0.19.55 > 192.168.0.115: ICMP echo reply, id 954, seq 1, length 64

I can't see any other IPv4 traffic, but I do see IPv6 HTTPS traffic. I'm not sure where to go next with figuring out why the probe seems to have vanished. Any ideas would be greatly appreciated.

Many thanks.

--
- Steve
Hi Steve,

You may notice "USB-READONLY" in the DNS queries in your capture. This is the SOS message indicating that the probe has a failed flash drive. If you replace it with a new flash drive, the probe will be back in business.

/vty

On 1/6/16 1:23 PM, Steve Hill wrote:
My Atlas probe (ID 14587) has been showing as offline since December 16th. [...]
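For anyone who wants to spot such SOS messages in their own captures, here is a minimal Python sketch. It assumes tcpdump's default one-line text output as shown in the original message, and it only pulls out the first label of names queried under sos.atlas.ripe.net. The function name `sos_labels` and the regular expression are illustrative; how the labels are actually encoded is not described in this thread, so the sketch does not try to interpret them.

```python
import re

# Matches the queried name in a tcpdump DNS query line,
# e.g. "... 2+ AAAA? some.name.example. (68)".
QUERY_RE = re.compile(r"\s(?:A|AAAA)\?\s+(\S+?)\.\s")

# Suffix seen in the capture from this thread.
SOS_SUFFIX = ".sos.atlas.ripe.net"


def sos_labels(capture_lines):
    """Return the distinct first labels of names queried under SOS_SUFFIX."""
    found = []
    for line in capture_lines:
        match = QUERY_RE.search(line)
        if not match:
            continue  # not an A/AAAA query line (e.g. a reply or ICMP)
        name = match.group(1)
        if name.lower().endswith(SOS_SUFFIX):
            label = name.split(".", 1)[0]
            if label not in found:
                found.append(label)
    return found


capture = [
    "12:13:38.790096 IP 192.168.0.115.35404 > 217.146.115.154.53: 2+ AAAA? ctr-nue15.atlas.ripe.net. (42)",
    "12:14:58.990977 IP 192.168.0.115.37934 > 217.146.115.154.53: 2+ AAAA? IUSB-READONLY.U36.M6466B3B0EC98.sos.atlas.ripe.net. (68)",
    "12:14:59.060132 IP 217.146.115.154.53 > 192.168.0.115.37934: 2 1/1/2 AAAA 2001:67c:2e8:11::c100:1337 (164)",
    "12:14:59.060983 IP 192.168.0.115.57523 > 217.146.115.154.53: 3+ A? IUSB-READONLY.U36.M6466B3B0EC98.sos.atlas.ripe.net. (68)",
]
print(sos_labels(capture))
```

With the lines from this thread, the only label found is the one containing "USB-READONLY".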
Hi,

your probe is tagged as "readonly flash drive".

Disconnect the probe, connect the USB stick to your computer and test it (e.g. delete all partitions, create your own partitions and write some data to it). If data cannot be written to the USB stick, either a) disable the write lock, or b) replace the stick with a new one (4 GB capacity at minimum).

Then reconnect the probe (power and ethernet, but without the USB stick), wait 5 minutes and then plug in the USB stick.

I hope that helps.

Greetings,
Christian

On 06.01.2016 at 13:23, Steve Hill wrote:
My Atlas probe (ID 14587) has been showing as offline since December 16th. [...]
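The "write some data to it" check described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the stick is already mounted on your computer, the function name `is_writable` is made up for this example, and the mount point you pass in is whatever your system uses. It does not repartition anything; it just tries to create a small file, force it to the device, and read it back.

```python
import os
import tempfile


def is_writable(mount_point):
    """Try to create, sync, and read back a small file under mount_point."""
    payload = b"usb-write-test"
    try:
        fd, path = tempfile.mkstemp(dir=mount_point)
    except OSError:
        # Could not even create a file, e.g. read-only or missing filesystem.
        return False
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push the data to the device
        with open(path, "rb") as handle:
            return handle.read() == payload
    except OSError:
        return False
    finally:
        try:
            os.remove(path)  # clean up the test file if it exists
        except OSError:
            pass


# Demo on a temporary directory, which should be writable on any machine.
with tempfile.TemporaryDirectory() as demo_dir:
    print(is_writable(demo_dir))
```

To test a stick, call `is_writable()` with its mount point instead of the temporary directory. Note that the read-back may be served from the operating system's cache, so unplugging the stick, plugging it back in, and checking again is a more conclusive test.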
On 06/01/16 13:17, Estelmann, Christian wrote:
your probe is tagged as "readonly flash drive".
Thanks - looks like that was it. The original drive has gone read-only; swapping it for another USB stick has fixed the issue.

--
- Steve
On 2016-01-06 15:58, Steve Hill wrote:
Thanks - looks like that was it. The original drive has gone read-only; swapping it for another USB stick has fixed the issue.
A little bit of background information that could help:

The probes use USB storage to buffer results. This comes in handy when the probe is off-line, and especially handy if the error is outside the host's network.

Power losses and other events that are unfriendly to filesystems are unpredictable and, of course, affect these sticks too. This is not unexpected, so the probes do file system checks and repairs as needed to overcome the resulting filesystem corruption (and, as some people know from experience, they can even format and use a new USB stick if it's inserted while the probe is powered up).

We learned the hard way that one particular type of USB stick (Sandisk nano) has a particular "feature", namely that it switches to permanent read-only mode if it detects possible corruption. This is most likely there to prevent further escalation of the problem, while allowing recovery of the remaining data from the stick. This is probably ok for generic use, as the sticks are cheap and replaceable. However, in the RIPE Atlas context, this is bad, as there's nothing we can do to fix it, besides reporting it to the host. We're unsure what exactly triggers the behaviour.

Because of this behaviour we started using a different brand for storage a while ago. We see no need to pro-actively swap out the sticks (and it's also very difficult to do in practice), as the problem only occurs with a small number of probes.

Cheers,
Robert
On 10/01/16 11:42, Robert Kisteleki wrote:
A little bit of background information that could help:
Thanks for the detailed explanation. I had seen the "USB drive readonly" notification, but didn't know whether it was a warning or if the drive was supposed to be readonly anyway. Once I'd been told that this was a problem I popped the drive out of the probe and put it in another machine and found that yes it was indeed readonly. Some Googling confirmed that this is a permanent state for those USB disks after they detect corruption.

Bit of a pain that these sticks don't have a "just wipe the thing and start over" option in that case.

--
- Steve
Hi,

as one of my probes also had problems with the USB drive, it would be nice to have the procedure for recovering from this failure described in the FAQ.

I just tried my luck: I booted the probe without the drive and inserted it once the probe had booted. This worked, but a ticket to the RIPE team could have been avoided if the procedure had been described in the FAQ.

On 11/01/16 10:44, Steve Hill wrote:
Bit of a pain that these sticks don't have a "just wipe the thing and start over" option in that case.
On 2016-01-11 10:52, Annika Wickert wrote:
Hi,

as one of my probes also had problems with the USB drive, it would be nice to have the procedure for recovering from this failure described in the FAQ.

I just tried my luck: I booted the probe without the drive and inserted it once the probe had booted. This worked, but a ticket to the RIPE team could have been avoided if the procedure had been described in the FAQ.
Hello,

We've been reluctant to publish the procedure in the FAQ, as the most likely outcome is that it'll be exercised even when there's no reason to.

However, we're working on a feature to give probe hosts more guidance about what's going on (and especially what's going wrong) with their probe (*), and there we will make it clear if a USB replacement is in order.

Regards,
Robert

(*) anything we can detect remotely, such as DNS configuration errors, firewalls, readonly USBs, flaky power sources, ...
Hi,

On Mon, Jan 11, 2016 at 11:53:29AM +0100, Robert Kisteleki wrote:
However, we're working on a feature to give probe hosts more guidance about what's going on (and especially what's going wrong) with their probe (*), and here we will make it clear if the USB replacement is in order.
This is much appreciated... I've been bitten by USB outages a few times, and it wasn't always obvious why the probe was acting up.

Gert Doering
        -- NetMaster
--
have you enabled IPv6 on something today...?

SpaceNet AG                      Management board: Sebastian v. Bomhard
Joseph-Dollinger-Bogen 14        Supervisory board chair: A. Grundner-Culemann
D-80807 Muenchen                 HRB: 136055 (AG Muenchen)
Tel: +49 (0)89/32356-444         VAT ID: DE813185279
participants (6)
- Annika Wickert
- Estelmann, Christian
- Gert Doering
- Robert Kisteleki
- Steve Hill
- Viktor Naumov