Recently my probe stopped responding to RIPE even though it was pingable through the local network. Putting aside what the cause might have been* is there any way to 're-awaken' or cycle in some way a probe without physical access to it? (* sfaict it didn't restart when the USB socket on the server it takes power from was remotely restarted) Sent via RIPE Forum -- https://www.ripe.net/participate/mail/forum
Hi, On Mon, Apr 11, 2016 at 01:34:53PM +0200, Alison Wheeler wrote:
> Recently my probe stopped responding to RIPE even though it was pingable through the local network. Putting aside what the cause might have been* is there any way to 're-awaken' or cycle in some way a probe without physical access to it?
> (* sfaict it didn't restart when the USB socket on the server it takes power from was remotely restarted)
Check the SOS messages on the "my probes -> network" web page - most likely the USB flash is borked. The probe will signal this (and other errors) to the NCC controller, but the controller doesn't bother to tell anyone about this; it just lists it on the web page, next to the "beware of the leopard" sign. Gert Doering -- NetMaster -- have you enabled IPv6 on something today...? SpaceNet AG Vorstand: Sebastian v. Bomhard Joseph-Dollinger-Bogen 14 Aufsichtsratsvors.: A. Grundner-Culemann D-80807 Muenchen HRB: 136055 (AG Muenchen) Tel: +49 (0)89/32356-444 USt-IdNr.: DE813185279
Sadly, there was nothing in there to indicate a problem, other than no records:
<pre>
74.125.47.144 2016-04-07 18:57:44 A 0h 1m
74.125.47.140 2016-04-07 18:57:44 AAAA 0h 1m
2a00:1450:400c:c08::116 2016-03-26 00:58:46 A 8d 9h 17m
74.125.47.14 2016-03-26 00:58:46 AAAA 8d 9h 17m
</pre>
and as I don't check the probe's status daily (mea culpa!) it wasn't until I was mailed after a week of it being down that I knew there was an issue. But even then, if the RIPE servers can't 'see' the probe I can't do anything via the web interface, hence my wondering if there is any way to awaken it via a local network action (analogous to sending a WakeOnLAN ping)
Hi, On Mon, Apr 11, 2016 at 01:49:04PM +0200, Alison Wheeler wrote:
> Sadly, there was nothing in there to indicate a problem, other than no records:
> <pre>
> 74.125.47.144 2016-04-07 18:57:44 A 0h 1m
> 74.125.47.140 2016-04-07 18:57:44 AAAA 0h 1m
> 2a00:1450:400c:c08::116 2016-03-26 00:58:46 A 8d 9h 17m
> 74.125.47.14 2016-03-26 00:58:46 AAAA 8d 9h 17m
> </pre>
These are not the SOS messages.
> and as I don't check the probe's status daily (mea culpa!) it wasn't until I was mailed after a week of it being down that I knew there was an issue. But even then, if the RIPE servers can't 'see' the probe I can't do anything via the web interface, hence my wondering if there is any way to awaken it via a local network action (analogous to sending a WakeOnLAN ping)
Most likely it will need flash stick cycling. But you'll see that in the SOS messages. (Repeating myself, it would be *so* helpful if the mail "hey, your probe is down!" actually included available SOS diagnostics...) Gert Doering -- NetMaster
On 2016/04/11 13:54 , Gert Doering wrote:
> Most likely it will need flash stick cycling. But you'll see that in the SOS messages.
> (Repeating myself, it would be *so* helpful if the mail "hey, your probe is down!" actually included available SOS diagnostics...)
There is a project to try to have a section of the probe web site dedicated to diagnosing probe problems. I think that is close to going live.
Hi, On Mon, Apr 11, 2016 at 01:57:52PM +0200, Philip Homburg wrote:
> On 2016/04/11 13:54 , Gert Doering wrote:
>> Most likely it will need flash stick cycling. But you'll see that in the SOS messages.
>> (Repeating myself, it would be *so* helpful if the mail "hey, your probe is down!" actually included available SOS diagnostics...)
> There is a project to try to have a section of the probe web site dedicated to diagnosing probe problems. I think that is close to going live.
While that is an improvement, actually having diagnostic info in the *mails* sent (especially about "we're having USB troubles, so better bring a fresh USB stick with you before you drive out to the probe site!") would be even better. The reason why I'm so insistent about this: I have two probes sitting in remote locations and both needed *multiple* visits because the flash was broken, the system *knew* about it ("ROUSB"), and didn't tell me (and yes, because I'm old and stupid and forgot to check the well-hidden SOS message list when the second probe broke...) - would have saved me about three visits, which accumulate to about half a day of human lifetime. Far too valuable to waste on lack of proper diagnostics. Gert Doering -- NetMaster
On 2016/04/11 14:39 , Gert Doering wrote:
> While that is an improvement, actually having diagnostic info in the *mails* sent (especially about "we're having USB troubles, so better bring a fresh USB stick with you before you drive out to the probe site!") would be even better.
> The reason why I'm so insistent about this: I have two probes sitting in remote locations and both needed *multiple* visits because the flash was broken, the system *knew* about it ("ROUSB"), and didn't tell me (and yes, because I'm old and stupid and forgot to check the well-hidden SOS message list when the second probe broke...) - would have saved me about three visits, which accumulate to about half a day of human lifetime. Far too valuable to waste on lack of proper diagnostics.
Okay. I just created a ticket for this. No idea when someone gets around to actually doing it though.
On 2016-04-11 13:54:40 CET, Gert Doering wrote:
>> 74.125.47.144 2016-04-07 18:57:44 A 0h 1m
>> 74.125.47.140 2016-04-07 18:57:44 AAAA 0h 1m
>> 2a00:1450:400c:c08::116 2016-03-26 00:58:46 A 8d 9h 17m
>> 74.125.47.14 2016-03-26 00:58:46 AAAA 8d 9h 17m
> These are not the SOS messages.
That's the content under the "SOS History" heading. I don't see anything which could be described as the "beware of the leopard" sign.
Hi, On Mon, Apr 11, 2016 at 02:54:44PM +0200, Alison Wheeler wrote:
> On 2016-04-11 13:54:40 CET, Gert Doering wrote:
>>> 74.125.47.144 2016-04-07 18:57:44 A 0h 1m
>>> 74.125.47.140 2016-04-07 18:57:44 AAAA 0h 1m
>>> 2a00:1450:400c:c08::116 2016-03-26 00:58:46 A 8d 9h 17m
>>> 74.125.47.14 2016-03-26 00:58:46 AAAA 8d 9h 17m
>> These are not the SOS messages.
> That's the content under the "SOS History" heading.
Oh, good point. I never looked there when one of my probes was actually *working* - if it's broken, but still *has* working DNS, you should see at least a few queries each time it boots and, if the probe is unhappy, something in the "Info" column (which is empty if everything is fine). Since you do not see anything - try power-cycling again, and possibly sniffing on the probe's network port to see what it's doing...
> I don't see anything which could be described as the "beware of the leopard" sign.
This is a reference to an obscure book :-) (and was intended to read "it's there, if you know where to find it"). Gert Doering -- NetMaster
On 2016/04/11 13:49 , Alison Wheeler wrote:
> Sadly, there was nothing in there to indicate a problem, other than no records:
> <pre>
> 74.125.47.144 2016-04-07 18:57:44 A 0h 1m
> 74.125.47.140 2016-04-07 18:57:44 AAAA 0h 1m
> 2a00:1450:400c:c08::116 2016-03-26 00:58:46 A 8d 9h 17m
> 74.125.47.14 2016-03-26 00:58:46 AAAA 8d 9h 17m
> </pre>
> and as I don't check the probe's status daily (mea culpa!) it wasn't until I was mailed after a week of it being down that I knew there was an issue. But even then, if the RIPE servers can't 'see' the probe I can't do anything via the web interface, hence my wondering if there is any way to awaken it via a local network action (analogous to sending a WakeOnLAN ping)
Unfortunately, there is an issue where after some time the filesystem on the USB stick just becomes corrupt. There is no obvious pattern. Some probes are running for years without any problem. If the filesystem is corrupt, then to recover it should be sufficient to connect the probe without the USB stick, wait for about 10 minutes, and then insert the USB stick. This works only if the probe can get an address from DHCP. It doesn't work in all cases; sometimes the USB stick is actually broken. And there is at least one way the filesystem can be corrupted that fools this trick.
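For anyone who wants to check after such a recovery attempt whether the probe has phoned home again, the SOS history rows pasted earlier in this thread can be parsed mechanically. What follows is only a minimal sketch: it assumes each row is "address, date, time, query type, uptime", which is a guess from the pasted data and not a documented RIPE Atlas format, and the names parse_sos_row, parse_uptime and latest_sos are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Rows as pasted earlier in the thread (one SOS record per line).
SAMPLE = """
74.125.47.144 2016-04-07 18:57:44 A 0h 1m
74.125.47.140 2016-04-07 18:57:44 AAAA 0h 1m
2a00:1450:400c:c08::116 2016-03-26 00:58:46 A 8d 9h 17m
74.125.47.14 2016-03-26 00:58:46 AAAA 8d 9h 17m
"""

# ASSUMED layout: <address> <YYYY-MM-DD> <HH:MM:SS> <A|AAAA> <uptime>
ROW = re.compile(
    r"^(?P<addr>\S+)\s+(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2})"
    r"\s+(?P<qtype>A|AAAA)\s+(?P<uptime>(?:\d+[dhm]\s*)+)$"
)
UNITS = {"d": "days", "h": "hours", "m": "minutes"}


def parse_uptime(text):
    """Turn a string such as '8d 9h 17m' into a timedelta."""
    parts = {UNITS[unit]: int(num) for num, unit in re.findall(r"(\d+)([dhm])", text)}
    return timedelta(**parts)


def parse_sos_row(line):
    """Parse one row into a dict; raise ValueError if it does not fit the assumed layout."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised SOS row: {line!r}")
    seen = datetime.strptime(f"{m['date']} {m['time']}", "%Y-%m-%d %H:%M:%S")
    return {
        "addr": m["addr"],
        "seen": seen,
        "qtype": m["qtype"],
        "uptime": parse_uptime(m["uptime"]),
    }


def latest_sos(lines):
    """Return the most recent parsed row, skipping blank lines."""
    rows = [parse_sos_row(line) for line in lines if line.strip()]
    return max(rows, key=lambda row: row["seen"])


if __name__ == "__main__":
    last = latest_sos(SAMPLE.splitlines())
    print("last SOS seen at", last["seen"], "with reported uptime", last["uptime"])
```

If the newest row does not get any newer after the stick has been re-inserted and the probe power-cycled, that would be consistent with the probe still not booting far enough to send its SOS queries.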
participants (3)
- Alison Wheeler
- Gert Doering
- Philip Homburg