Command-line fetch¶
The fetch_ska
application allows use of the Ska engineering archive
without getting into Python or using any scripting. From a single command
line tool you can access most of the common processing steps associated with
fetching and using telemetry data:
Fetch a set of MSIDs over a time range, specifying the sampling as either full-resolution, 5-minute, or daily data.
Filter out bad or missing data.
Interpolate (resample) all MSID values to a common uniformly-spaced time sequence.
Remove or select time intervals corresponding to specified Kadi event types.
Change the time format from CXC seconds (seconds since 1998.0) to something more convenient like GRETA time.
Write the MSID telemetry data to a zip file.
Aside from the first two steps (fetching data and filtering bad data), all the steps are optional.
Getting started¶
The very first thing is to get set up to use the Ska environment following the
instructions in the Ska Analysis Tutorial. Assuming that is
done, then you need to enter the Ska environment using the ska
(or skatest
)
alias:
% ska
(In case you don’t use linux frequently, the %
is meant to represent the command
prompt, so don’t type that). After doing ska
you should see your prompt change to
include a ska-
prefix.
Getting help¶
You can get help by asking ska_fetch
to print its command line options:
% ska_fetch --help
usage: ska_fetch [-h] [--start START] [--stop STOP] [--sampling SAMPLING]
[--unit-system UNIT_SYSTEM] [--interpolate-dt INTERPOLATE_DT]
[--remove-events REMOVE_EVENTS] [--select-events SELECT_EVENTS]
[--time-format TIME_FORMAT] [--outfile OUTFILE] [--quiet]
[--max-fetch-Mb MAX_FETCH_MB] [--max-output-Mb MAX_OUTPUT_MB]
MSID [MSID ...]
Fetch telemetry from the Ska engineering archive.
Examples
========
# Get full-resolution TEPHIN, AOPCADMD for last 30 days, and save as telem.zip
% ska_fetch --sampling=5min --outfile=telem.zip --time-format=greta TEPHIN AOPCADMD
# Get daily temps since 2000, removing times within 100000 seconds of safe- or normal- sun
% ska_fetch --sampling=daily --outfile=tephin.zip \
--remove-events='safe_suns[pad=100000] | normal_suns[pad=100000]' \
tephin tcylaft6
# Get daily IRU-2 temps since 2004, removing known LTT bad times
% ska_fetch AIRU2BT --start 2004:001 --sampling=daily --outfile=airu2bt.zip \
--remove-events='ltt_bads[msid="AIRU2BT"]'
Arguments
=========
positional arguments:
MSID MSID to fetch
optional arguments:
-h, --help show this help message and exit
--start START Start time for data fetch (default=<stop> - 30 days)
--stop STOP Stop time for data fetch (default=NOW)
--sampling SAMPLING Data sampling (full|5min|daily) (default=5min)
--unit-system UNIT_SYSTEM
Unit system for data (eng|sci|cxc) (default=eng)
--interpolate-dt INTERPOLATE_DT
Interpolate to uniform time steps (secs, default=None)
--remove-events REMOVE_EVENTS
Remove kadi events expression (default=None)
--select-events SELECT_EVENTS
Select kadi events expression (default=None)
--time-format TIME_FORMAT
Output time format (secs|date|greta|jd|frac_year|...)
--outfile OUTFILE Output file name (default=fetch.zip)
--quiet Suppress run-time logging output
--max-fetch-Mb MAX_FETCH_MB
Max allowed memory (Mb) for fetching (default=1000)
--max-output-Mb MAX_OUTPUT_MB
Max allowed memory (Mb) for file output (default=20)
Try it out¶
There are plenty of options but frequently you’ll only need a few. Let’s start by trying the first example provided in the help output:
% ska_fetch TEPHIN AOPCADMD --start=2013:001 --stop=2013:030 --sampling=5min \
--time-format=greta --outfile=telem.zip
Fetching 5min-resolution data for MSIDS=['TEPHIN', 'AOPCADMD']
from 2013:001:12:00:00.000 to 2013:030:12:00:00.000
Writing data to telem.zip
That was easy, now let’s unzip the archive and see what we got. First look at the archive contents:
% unzip -l telem.zip
Archive: telem.zip
Length Date Time Name
--------- ---------- ----- ----
460424 03-06-2014 11:46 TEPHIN.csv
221559 03-06-2014 11:46 AOPCADMD.csv
--------- -------
681983 2 files
Now let’s unzip:
% unzip telem.zip
Archive: telem.zip
inflating: TEPHIN.csv
inflating: AOPCADMD.csv
The first data file is a comma-separated values file TEPHIN.csv
. This could be
imported into Excel or any number of other applications. Let’s look at the first few
lines of the file with the linux head
command:
% head TEPHIN.csv
times,samples,vals,mins,maxes,means,midvals
2013001.120424816,10,113.798,113.798,113.798,113.798,113.798
2013001.120952816,10,113.798,113.798,113.798,113.798,113.798
2013001.121520816,10,113.798,113.798,113.798,113.798,113.798
2013001.122048816,10,113.798,113.798,113.798,113.798,113.798
2013001.122616816,10,113.798,113.798,113.798,113.798,113.798
2013001.123144816,10,113.798,113.798,113.798,113.798,113.798
2013001.123712816,10,113.798,113.798,113.798,113.798,113.798
2013001.124240816,10,113.798,113.798,113.798,113.798,113.798
2013001.124808816,10,113.798,113.798,113.798,113.798,113.798
For the TEPHIN data the column names are mostly straighforward. For 5-minute or daily
data, the vals
column is the same as the mean. This is a convience so you can
use vals
for full
, 5min
and daily
sampling analysis. The midvals
column represents the telemetered value at exactly the midpoint of the interval.
Now let’s examine the AOPCADMD output:
% head AOPCADMD.csv
times,samples,vals,raw_vals
2013001.120424816,320,NPNT,1
2013001.120952816,320,NPNT,1
2013001.121520816,320,NPNT,1
2013001.122048816,320,NPNT,1
2013001.122616816,320,NPNT,1
2013001.123144816,320,NPNT,1
2013001.123712816,320,NPNT,1
2013001.124240816,320,NPNT,1
2013001.124808816,320,NPNT,1
For the AOPCADMD data notice there are no statistic values. This is because it is a state
code MSID and so there is no useful meaning for a mean or max. The final raw_vals
column is the raw telemetered value, while vals
has been translated into the
corresponding state code string.
Details¶
There are many options controlling fetch_ska
, but they can be broken down
into manageable subsets as in the following sections. This will include detailed
discussion of how to use each of the options.
Desired telemetry¶
Argument |
Description |
---|---|
msids |
MSID(s) to fetch (string or list of strings) |
–start |
Start time for data fetch (default=<stop> - 30 days) |
–stop |
Stop time for data fetch (default=NOW) |
–sampling |
Data sampling (full | 5min | daily) (default=5min)) |
–unit_system |
Unit system for data (eng | sci | cxc) (default=eng) |
The first argument msids
is the only one that always has to be provided. It should be
either a single string like COBSRQID
or a list of strings like TEPHIN
TCYLAFT6 TEIO
. Note that the MSID is case-insensitive so tephin
is fine.
The --start
and --stop
arguments are typically a string like 2012:001
or
2012:001:02:03:04
(ISO time) or 2012001.020304
(GRETA time). If not provided
then the last 30 days of telemetry will be fetched.
The --sampling
argument will choose between either full-resolution telemetry
or the 5-minute or daily summary statistic values. The default is 5min
.
The --unit_system
argument selects the output unit system. The choices are engineering
units (i.e. what is in the TDB and GRETA), science units (mostly just temperatures in C
instead of F), or CXC units (whatever is in CXC decom, which e.g. has temperatures in K).
Interpolation¶
Argument |
Description |
---|---|
–interpolate_dt |
Interpolate to uniform time steps (secs, default=None) |
In general different MSIDs will come down in telemetry with different sampling and time
stamps. Interpolation allows you to put all the MSIDs onto a common time sequence so you
can compare them, plot one against the other, and so forth. You can see the
Interpolation section for the gory details, but if you need to have your MSIDs on
a common time sequence then set interpolate_dt
to the desired time step
in seconds. When interpolating ska_fetch
uses filter_bad=True
and
union_bad=True
(as described in Interpolation).
Intervals¶
Argument |
Description |
---|---|
–remove_events |
Remove kadi events expression (default=None) |
–select_events |
Select kadi events expression (default=None) |
These arguments allow you to select or remove intervals in the data using the Kadi event definitions. For instance we can select times of stable NPM dwells during radiation zones:
% ska_fetch AOATTER1 AOATTER2 AOATTER3 --start=2014:001 --stop=2014:010 \
select_events='dwells & rad_zones'
Note the use of a single-quote string for the select events expression. This makes sure the expression is treated as a single entity and special characters are not interpreted by the shell.
The order of processing is to first remove event intervals, then select event intervals.
The expression for --remove_events
or --select_events
can be any logical expression
involving Kadi query names (see the event definitions table). The
following string would be valid: 'dsn_comms | (dwells[pad=-300] & ~eclipses)'
, and for
select_events
this would imply selecting telemetry which is either during a DSN pass
or (within a NPM dwell and not during an eclipse). The [pad=-300]
qualifier means
that a buffer of 300 seconds is applied on each edge to provide padding from the maneuver.
A positive padding expands the event intervals while negative contracts the intervals.
Another example of practical interest is using the LTT bad times event to remove bad times for long-term trending plots by MSID. In this case we get daily IRU-2 temps since 2004, removing known LTT bad times:
% ska_fetch AIRU2BT --start 2004:001 --sampling=daily --outfile=airu2bt.zip \
--remove-events='ltt_bads[msid="AIRU2BT"]'
Notice the syntax here which indicates selecting all the LTT bad times corresponding
to AIRU2BT
. See the
LTT bad times
section for more details.
Output¶
Argument |
Description |
---|---|
–time_format |
Output time format (secs|date|greta|jd|…, default=secs) |
–outfile |
Output file name (default=’fetch.zip’) |
By default the times
column for each MSID output is provided in the format of seconds
since 1998.0 (CXC seconds). The time_format
argument allows selecting any time format
supported by Chandra.Time. A common
option for FOT analysis will be greta
.
The MSID set will always be written out as a compressed zip archive with the given name
(or fetch.zip
if not provided). This archive will contain one or more CSV files
corresponding to the MSIDs in the set.
Process control¶
Argument |
Description |
---|---|
–quiet |
Suppress run-time logging output (default=False) |
–max_fetch_Mb |
Max allowed memory (Mb) for fetching (default=1000) |
–max_output_Mb |
Max allowed memory (Mb) for output (default=100) |
Normally ska_fetch
outputs a few lines of progress information as it is processing the
request. To disable this logging use the --quiet
flag.
The next two arguments are in place to prevent accidentally doing a huge query that will consume all available memory or generate a large file that will be slow to read. For instance getting all the gyro count data for the mission will take more than 70 Gb of memory.
The --max_fetch_Mb
argument specifies how much memory the fetched MSID set can
take. This has a default of 1000 Mb = 1 Gb.
The --max_output_Mb
checks the size of the actual output MSID set (the uncompressed
binary in memory), which may be smaller than the fetch object if data sampling has been
reduced via the --interpolate_dt
argument. This has a default of 100 Mb.
As an example of what happens if you run into the limits, here is an attempt at the aforementioned gyro counts query:
% ska_fetch AOGYRCT1 AOGYRCT2 AOGYRCT3 AOGYRCT4 --start=2000:001 --sampling=full
Fetching full-resolution data for MSIDS=['AOGYRCT1', 'AOGYRCT2', 'AOGYRCT3', 'AOGYRCT4']
from 2000:001:12:00:00.000 to 2014:065:17:35:42.347
********************************************************************************
ERROR: Requested fetch requires 76821.73 Mb vs. limit of 1000.00 Mb
********************************************************************************
Both of the defaults here are relatively conservative, and with experience you can set larger values.