Hi list,
A couple of weeks ago, I asked in this thread why some built-in
measurements are missing or not performed at the scheduled interval.
Cristel and Robert very kindly shared what they thought could be the
causes: scheduling, probe reboots, task list updates, etc.
The discussion gave me the idea to check whether measurements also go
missing while the probe is powered and connected to an Atlas
controller, i.e. while the probe seemingly works in good condition.
Here, I would like to share one case among many I've observed where
built-in measurements are missed continuously for a long time, even
when the probe is well connected to a controller.
Let's look at the time window from 2016-06-16 21:53:20 +0000 to
2016-06-18 20:16:40 +0000 for probe 22144.
First, I queried its connection events to the Atlas controller
(msm_id 7000).
The result says the probe connected to a controller at
2016-06-16 21:54:19 +0000 and disconnected at
2016-06-18 20:13:48 +0000. Between these two moments, the probe is
supposed to have remained connected, and thus continuously powered.
Then, I queried the built-in ping measurements toward b-root
(msm_id 1010) within the time window.
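For anyone who wants to reproduce the query, here is a minimal sketch of how the request could be built. It assumes the public RIPE Atlas v2 REST results endpoint and its `probe_ids`, `start`, `stop` and `format` parameters; this is not necessarily how I fetched the data, and the helper names `to_epoch` and `results_url` are only for illustration.

```python
from datetime import datetime
from urllib.parse import urlencode


def to_epoch(ts: str) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS +0000' to Unix seconds."""
    return int(datetime.strptime(ts, "%Y-%m-%d %H:%M:%S %z").timestamp())


def results_url(msm_id: int, probe_id: int, start: str, stop: str) -> str:
    """Build a results URL for one probe and one measurement.

    Assumption: the endpoint layout and parameter names follow the
    public RIPE Atlas v2 REST API.
    """
    query = urlencode({
        "probe_ids": probe_id,
        "start": to_epoch(start),
        "stop": to_epoch(stop),
        "format": "json",
    })
    return f"https://atlas.ripe.net/api/v2/measurements/{msm_id}/results/?{query}"


# Built-in ping toward b-root (msm_id 1010) for probe 22144.
url = results_url(1010, 22144,
                  "2016-06-16 21:53:20 +0000",
                  "2016-06-18 20:16:40 +0000")
print(url)
```

The same helper would work for the connection events (msm_id 7000) and the k-root pings (msm_id 1001) by changing the first argument.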
Below are the timestamps at which measurements were performed.
['2016-06-16 22:51:08 +0000', '2016-06-17 02:07:08 +0000', '2016-06-17 03:11:02 +0000',
 '2016-06-17 04:03:07 +0000', '2016-06-17 05:23:03 +0000', '2016-06-17 07:35:17 +0000',
 '2016-06-17 10:51:06 +0000', '2016-06-17 14:07:04 +0000', '2016-06-17 15:11:03 +0000',
 '2016-06-17 17:23:06 +0000', '2016-06-17 18:27:04 +0000', '2016-06-17 20:39:04 +0000',
 '2016-06-17 22:51:09 +0000', '2016-06-17 23:55:03 +0000', '2016-06-18 02:07:08 +0000',
 '2016-06-18 03:11:04 +0000', '2016-06-18 06:27:05 +0000', '2016-06-18 08:39:02 +0000',
 '2016-06-18 10:27:13 +0000', '2016-06-18 10:51:08 +0000', '2016-06-18 11:55:13 +0000',
 '2016-06-18 12:03:13 +0000', '2016-06-18 15:11:04 +0000', '2016-06-18 17:23:05 +0000',
 '2016-06-18 18:27:09 +0000', '2016-06-18 18:43:13 +0000', '2016-06-18 19:35:18 +0000',
 '2016-06-18 20:15:06 +0000']
We can see that the intervals between neighbouring measurements are
much larger than the planned value of 240 sec.
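To make the comparison concrete, here is a rough sketch of how the gaps between neighbouring timestamps can be computed and compared against the 240 sec schedule. The `tolerance` threshold and the function name `find_gaps` are my own illustrative choices, and only the first three b-root timestamps listed above are used as sample input.

```python
from datetime import datetime

SCHEDULED_INTERVAL = 240  # seconds, the planned interval of the built-in ping


def parse(ts: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS +0000' into a timezone-aware datetime."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S %z")


def find_gaps(timestamps, interval=SCHEDULED_INTERVAL, tolerance=0.5):
    """Return (earlier, later, gap_seconds, approx_missed) for every pair of
    neighbouring results whose spacing exceeds interval * (1 + tolerance).

    `tolerance` is a hypothetical threshold chosen for illustration only.
    """
    times = sorted(parse(t) for t in timestamps)
    gaps = []
    for earlier, later in zip(times, times[1:]):
        delta = (later - earlier).total_seconds()
        if delta > interval * (1 + tolerance):
            # Number of scheduled runs that apparently produced no result.
            missed = round(delta / interval) - 1
            gaps.append((earlier, later, int(delta), missed))
    return gaps


sample = [
    "2016-06-16 22:51:08 +0000",
    "2016-06-17 02:07:08 +0000",
    "2016-06-17 03:11:02 +0000",
]
gaps = find_gaps(sample)
for earlier, later, seconds, missed in gaps:
    print(f"{earlier} -> {later}: {seconds} s, about {missed} runs missing")
```

For this sample, the first gap is 11760 s (about 48 scheduled runs without a result) and the second is 3834 s (about 15).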
I also investigated other built-in ping measurements, say toward
k-root (msm_id 1001).
Below are the timestamps:
['2016-06-16 22:51:08 +0000', '2016-06-17 02:07:07 +0000', '2016-06-17 03:11:05 +0000',
 '2016-06-17 04:03:08 +0000', '2016-06-17 05:23:03 +0000', '2016-06-17 07:35:17 +0000',
 '2016-06-17 10:51:10 +0000', '2016-06-17 14:07:04 +0000', '2016-06-17 15:11:04 +0000',
]
A very similar phenomenon is observed.
Between the first two measurements in the above lists, there is an
interval of more than three hours, which can hardly be explained by
measurement scheduling issues or temporarily high load.
What's more, the probe remained connected at those moments, and was
therefore not subject to reboot or power-off.
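As a rough cross-check of this reasoning, the connection window can be turned into an expected number of results and compared with what was actually seen. The sketch below assumes one result per 240 sec while connected; the function name `coverage` and the short sample list are only illustrative.

```python
from datetime import datetime


def parse(ts: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS +0000' into a timezone-aware datetime."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S %z")


def coverage(connected_at, disconnected_at, observed, interval=240):
    """Return (expected, seen) for one connection window.

    expected: number of full intervals that fit into the window,
              assuming one result per interval (an assumption).
    seen:     number of observed results that fall inside the window.
    """
    start, end = parse(connected_at), parse(disconnected_at)
    expected = int((end - start).total_seconds() // interval)
    seen = sum(1 for t in observed if start <= parse(t) <= end)
    return expected, seen


# Connection window of probe 22144, plus a few of the b-root timestamps.
sample = [
    "2016-06-16 22:51:08 +0000",
    "2016-06-17 02:07:08 +0000",
    "2016-06-17 03:11:02 +0000",
    "2016-06-18 20:15:06 +0000",  # falls after the disconnect event
]
expected, seen = coverage("2016-06-16 21:54:19 +0000",
                          "2016-06-18 20:13:48 +0000",
                          sample)
print(f"expected about {expected} results, saw {seen} in the sample")
```

With the connect and disconnect times above, the window spans 166769 s, which leaves room for roughly 694 scheduled runs.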
As a reference, probe 12657 has all its measurements arriving at the
due interval within the same time window.
What I wonder is what the possible causes behind such missing
measurements could be.
I would appreciate your thoughts on this, so that the measurements can
be processed and analysed with proper caution.
Thanks.
Regards,
wenqin
On 2016-09-02 12:20, Wenqin SHAO wrote:
Thanks for confirming.
The specified frequency is indeed well respected. When there is no
missing data, the interval shift rarely exceeds 14 s, which is small
compared to the scheduled interval of 240 s.
What intrigues me is that the exact phase/timing is also kept after a
power cut and reboot.
The probes have a crontab-like mechanism to remember what they need to
do. As long as their clock is more or less ok, they will stick to the
pre-allocated times and tasks.
By the way, can a measurement also be skipped, as designed behaviour,
due to the scheduling issues mentioned by @Cristel?
We're trying to avoid overloading probes, but not everything is under
our full control. Some measurements can pile up; Cristel & Randy & co.
had a paper about the observed (worst-case) behaviour.
Regards,
Robert