Data Utilities¶
Fetching data can be a complex task. Data can be requested based on a time range or an OBSID. Image telemetry data can be accessed through MAUDE, using either blob or VCDU frame queries, or read from mica level0 files. The image data is enriched with data from other sources to provide information on attitude and star catalogs. These sources include the REST API on the kadi web server and various packages that access data on disk. The actual fetching of the data can optionally be done in multiple processes.
ACA-view includes utilities (known as services) to fetch the data and notify the application as the data becomes available. It also includes utilities to fetch data in standalone mode (no GUI).
- Low-level Fetching Functions. Simple functions to get the data.
- Data Service. The top-level class to encapsulate data access.
- Multi-process Data Service. A collection of tools to fetch data in multiple processes.
- File Service. A worker class to read data from files.
- MAUDE Service. A worker class to get data from MAUDE.
- Other. Other useful functions.
- aca_view_fetch. A script to fetch data in standalone mode with the same arguments as aca_view.
Fetching Data¶
The fetch function and the aca_view_fetch script are high-level utilities to get data without the graphical user interface. They wrap all fetching functionality and are configured in the same way as aca_view configures data fetching.
The following statement reads a list of mica level0 files and outputs an AcaTelemetryTimeline using only local data:
from pathlib import Path
from aca_view import tests
from aca_view.data.fetch import fetch

TEST_DATA_DIR = Path(tests.__file__).parent / 'data'
timeline = fetch(
    filenames=TEST_DATA_DIR.glob('aca*fits.gz'),
    mica=True,
    ska_data_sources=['local']
)
and the following statement fetches the first 100 seconds of OBSID 8008 image telemetry from MAUDE:
from aca_view.data.fetch import fetch

timeline = fetch(
    obsid=8008,
    stop=100,
    maude=True,
    ska_data_sources=['ska_api']
)
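A time range can also be given instead of an OBSID. The following sketch assumes that start and stop accept any CxoTime-compatible string, as in the validate_config examples below; the times themselves are illustrative:
from aca_view.data.fetch import fetch

timeline = fetch(
    start='2022:240:10:00',
    stop='2022:240:10:30',
    maude=True,
    ska_data_sources=['ska_api']
)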
Configuring Data Fetching¶
The validate_config function produces a standard dictionary with settings for the data-fetching task. It tries to provide sensible defaults for arguments that are not given and to make sure the input arguments are consistent.
Within aca-view, the input to this function is a combination of settings in the persistent store and arguments passed on the command line or in the data-service dialog.
- aca_view.data.config.validate_config(**kwargs)[source]¶
Make sure that settings are compatible for configuring data fetching.
Examples of some checks:
maude channel must be a valid channel
start/stop must be the right format
start/stop must be different
if using the ASVT MAUDE channel, do not use mica or cheta, as the data would be inconsistent
if neither maude nor mica are enabled, enable both
if maude is enabled and mica is not, then set the cheta sources to [‘maude’]
Arguments:
max_workers: Maximum number of workers for multi-processing task (default: None)
real_time: Boolean flag to select real-time fetching (default: False)
multi_process: Boolean flag to select multi-process mode (default: False)
maude: Boolean flag to enable MAUDE (default: False)
mica: Boolean flag to enable reading mica level0 files (default: True)
filenames: List of filenames to read (default: [])
start: Start of the time range (default: None)
stop: End of the time range (default: None)
obsid: Observation ID (default: None)
maude_channel: MAUDE channel to use (default: ‘FLIGHT’, ignored if maude == False)
cheta_sources: data sources for cheta (default: [‘cxc’, ‘maude’])
ska_data_sources: data sources for Ska data (default: [‘local’, ‘ska_api’])
ska_api_url: Kadi web server URL (default: ‘https://web-kadi.cfa.harvard.edu/api/ska_api’)
guess_agasc_id: Boolean flag to enable guessing AGASC IDs based on ra/dec (default: False)
level0_times: shift times to match the times in mica level0 files (default: True)
Examples:
validate_config(real_time=True)
validate_config(filenames=['filename.fits'])
validate_config(obsid=8008)
validate_config(start='2022:240:10:00', stop='2022:240:30:00')
validate_config(start='2022:240:10:00', stop='2022:240:30:00', maude=True)
- Parameters:
args_in – dict A dictionary with settings, usually from the command line.
- Returns:
dict
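For illustration, a minimal sketch of how the returned settings might be inspected; the assumption here is that the returned dictionary exposes the documented arguments as keys:
from aca_view.data.config import validate_config

# Defaults are filled in for everything not given explicitly, and inconsistent
# combinations of arguments are rejected.
settings = validate_config(obsid=8008, maude=True, mica=False)
print(settings['maude_channel'])
print(settings['ska_data_sources'])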
API¶
Low-level Fetching Functions¶
These are simple functions that can be called directly and do not depend on Qt. They are the ones that actually get the data.
- aca_view.data.core.fetch.from_file(filename, settings)[source]¶
Get image data from a file.
Required settings:
level0_times. Shift times to match the times in level0 files.
ska_data_sources. A list of sources for Ska data.
ska_api_url. The URL of the kadi web server.
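As a sketch of how this might be called (the filename is a placeholder for an actual mica level0 file, and the settings come from validate_config, which supplies the required keys above):
from aca_view.data.config import validate_config
from aca_view.data.core.fetch import from_file

# Hypothetical level0 file name; replace with a real file on disk.
settings = validate_config(filenames=['acaf_level0.fits.gz'], mica=True)
data = from_file('acaf_level0.fits.gz', settings)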
- aca_view.data.core.fetch.from_maude(start, stop, settings)[source]¶
Get image data from MAUDE VCDU frames for the given time interval.
Data is supplemented with some telemetry data from cheta.
Normally, the settings are the output of validate_config.
Required settings:
level0_times. Shift times to match the times in level0 files.
maude_channel. MAUDE channel.
ska_data_sources. A list of sources for Ska data.
ska_api_url. The URL of the kadi web server.
- aca_view.data.core.fetch.from_blobs(start, stop, settings)[source]¶
Get image data from MAUDE blobs for the given time interval.
Required settings:
level0_times. Shift times to match the times in level0 files.
maude_channel. MAUDE channel.
ska_data_sources. A list of sources for Ska data.
ska_api_url. The URL of the kadi web server.
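from_maude and from_blobs share the same calling pattern. A minimal sketch, assuming start and stop accept CxoTime values and that validate_config supplies the required settings; the times are illustrative:
from cxotime import CxoTime
from aca_view.data.config import validate_config
from aca_view.data.core.fetch import from_maude

start = CxoTime('2022:240:10:00')
stop = CxoTime('2022:240:10:05')
settings = validate_config(start=start.date, stop=stop.date, maude=True)
data = from_maude(start, stop, settings)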
Services¶
Data Service¶
The top-level class encapsulating all data access.
- class aca_view.data.data_service.DataService(settings)[source]¶
QObject class to encapsulate all data fetching and emit signals to notify the Qt application.
This class is typically created by the Qt application directly.
This class does not do any of the work itself; it merely bridges a worker class and the application. The worker class should take care of the details and implement a standard interface, something like:
class Worker(QObject):
    # a signal to update the timeline
    update_timeline = QtC.pyqtSignal(dict)

    # a constructor taking a dictionary with settings
    def __init__(self, settings):
        pass

    def start(self):
        ...

    def quit(self):
        ...
Multi-process Data Service¶
A collection of tools to spread the data fetching over multiple child processes (reading either from files or from MAUDE).
- class aca_view.data.multiprocess_service.MultiProcessWorker(settings)[source]¶
QObject-derived class to do the work of fetching data.
This particular implementation spreads the load over multiple child processes.
This class implements our interface for data service workers.
- class aca_view.data.multiprocess_server.MultiProcessServer(settings, queue_in, queue_out)[source]¶
Start a server to do multi-process fetching.
This class takes a time range as input and two queues to communicate with the parent process. It creates a pool of concurrent processes that do the fetching and combines their results. It runs three concurrent tasks: _concurrent_fetch constantly appends the results from the child processes, _server constantly listens for and responds to requests from the parent process, and _monitor constantly monitors the state of the child processes.
- aca_view.data.multiprocess_server.run(logging_queue, queue_in, queue_out, settings)[source]¶
Top-level function to be run in the main data-service process (a child of the GUI process).
- Parameters:
logging_queue – multiprocessing.Queue Queue where logging messages are sent
queue_in – multiprocessing.Queue Queue where the parent process puts commands
queue_out – multiprocessing.Queue Queue where the data-service process sends results back to the parent process
settings – Settings to initialize the MultiProcessServer
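A rough sketch of how the server might be launched from a parent process; the command protocol on the queues is not shown here and the settings are illustrative:
import multiprocessing as mp

from aca_view.data import multiprocess_server
from aca_view.data.config import validate_config

if __name__ == '__main__':
    # Queues for log records, commands from the parent, and results back to it.
    logging_queue = mp.Queue()
    queue_in = mp.Queue()
    queue_out = mp.Queue()

    settings = validate_config(obsid=8008, maude=True)

    # The data service runs in its own process, a child of the calling process.
    proc = mp.Process(
        target=multiprocess_server.run,
        args=(logging_queue, queue_in, queue_out, settings),
    )
    proc.start()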
File Service¶
A worker class to read data from files. This has been superseded by the MultiProcessWorker.
- class aca_view.data.file_service.FileServiceWorker(settings)[source]¶
QObject-derived class to do the work of fetching data.
This particular implementation knows how to read files.
This class implements our interface for data service workers.
MAUDE Service¶
A worker class to get data from MAUDE. This has been superseded by the MultiProcessWorker.
- class aca_view.data.maude_service.MaudeServiceWorker(settings)[source]¶
QObject-derived class to do the work of fetching data.
This particular implementation knows how to fetch from MAUDE.
This class implements our interface for data service workers.
Other¶
- aca_view.data.core.add_centroids.add_centroids(data, settings)[source]¶
Get the centroids in telemetry for each image.
Input data should be a dictionary of the form:
{'slot_data': Table, 'non_slot_data': Table}
The centroids are added as columns ‘YAGS’/’ZAGS’ to data[‘slot_data’].
The input data can optionally have the ‘blobs’ entry, in which case the centroids are extracted from the blobs. These blobs are modified versions of the output from maude. If blobs are not present, the centroids come from cheta (either maude or cxc, depending on settings[‘cheta_sources’]).
This function should never raise an exception.
- Parameters:
data – dict
settings – dict
- aca_view.data.core.add_dark_cal.add_dark_cal(data, settings=None)[source]¶
Get dark-cal ID corresponding to the times in non-slot data.
It expects that the input data is a dictionary of the form:
{'slot_data': Table, 'non_slot_data': Table}
The dark-cal ID is added as column ‘dark_cal_id’ to data[‘non_slot_data’].
This function should never raise an exception.
- Parameters:
data – dict
settings – dict
- aca_view.data.core.add_non_img_data.add_non_img_data(data, settings=None)[source]¶
Get non-slot data either from blobs or from cheta.
Input data should be a dictionary of the form:
{'start': float, 'stop': float, 'slot_data': Table, 'non_slot_data': Table}
If data[‘non_slot_data’] is a non-empty table, this function does nothing. Otherwise, this function adds a table to data[‘non_slot_data’] with the following columns:
INTEG. Integration time.
COBSRQID. Obs ID.
AOATTQT<N>. Estimated attitude quaternion.
AOKALSTR. Number of Kalman stars.
CVCMJCTR. Major frame counter.
CVCMNCTR. Minor frame counter.
AOPCADMD. PCAD mode.
AOACASEQ. Aspect camera processing sequence.
AACCCDPT. ACA CCD temperature from OBC telemetry.
The input data can optionally have the ‘blobs’ entry, in which case the data is extracted from the blobs. These blobs are modified versions of the output from maude.
This function should never raise an exception.
- Parameters:
data – dict
settings – dict
- aca_view.data.core.add_residuals.add_residuals(data, settings)[source]¶
Add residuals for each slot in data[‘slot_data’].
Input data should be a dictionary of the form:
{'slot_data': Table, 'non_slot_data': Table}
This adds the following masked columns: PRED_YAGS, PRED_ZAGS, DYAGS, DZAGS, AGASC_ID, RA, DEC. If there is no catalog associated with a given time, the corresponding row is masked.
This does not use chandra_aca.centroid_resid, because:
chandra_aca.centroid_resid uses cheta for attitudes (not available in real-time)
chandra_aca.centroid_resid does a bunch of things we do not need (such as shifting times, using other aspect solutions)
if there is no star or no catalog, it raises a rather generic exception
This function should never raise an exception.
- Parameters:
data – dict
settings – dict
- aca_view.data.core.add_starcat.add_starcat(data, settings=None)[source]¶
Get the starcat_date for the corresponding catalog at each time in data[‘non_slot_data’].
Input data should be a dictionary of the form:
{'slot_data': Table, 'non_slot_data': Table}
The starcat date is added as column ‘starcat_date’ to data[‘non_slot_data’].
This function should never raise an exception.
- Parameters:
data – dict
settings – dict
- aca_view.data.core.add_status_strings.add_status_strings(data)[source]¶
Add status flags and their string representations to the slot data.
Input data should be a dictionary of the form:
{'slot_data': Table, 'non_slot_data': Table}
This function sets the following columns on data[‘slot_data’]:
GLBSTAT
GLBSTAT_STR
IMGSTAT
IMGSTAT_STR
This function should never raise an exception.
- Parameters:
data – dict
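All of the helpers above share the same in-place convention, so they can be chained on one data dictionary. The sketch below uses empty tables and dummy times purely for illustration; in aca_view the tables come from the low-level fetching functions, and the actual call order may differ:
from astropy.table import Table

from aca_view.data.config import validate_config
from aca_view.data.core.add_centroids import add_centroids
from aca_view.data.core.add_dark_cal import add_dark_cal
from aca_view.data.core.add_non_img_data import add_non_img_data
from aca_view.data.core.add_residuals import add_residuals
from aca_view.data.core.add_starcat import add_starcat
from aca_view.data.core.add_status_strings import add_status_strings

settings = validate_config(obsid=8008, maude=True)
# Empty placeholder tables; real slot/non-slot data come from the fetching functions.
data = {'start': 0.0, 'stop': 100.0, 'slot_data': Table(), 'non_slot_data': Table()}

# Each call adds columns to data['slot_data'] or data['non_slot_data'] in place.
add_non_img_data(data, settings)
add_starcat(data, settings)
add_centroids(data, settings)
add_dark_cal(data, settings)
add_residuals(data, settings)
add_status_strings(data)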
- aca_view.data.fetch.fetch(obsid=None, start=None, stop=None, filenames=None, mica=None, maude=None, maude_channel='FLIGHT', multi_process=True, ska_data_sources=None, real_time=None, real_time_offset=0, max_workers=None)[source]¶
Utility function to run the ACA-view data services in standalone mode (no GUI).
This is NOT called from aca_view, but it is equivalent. It is intended as a way to test data utilities without having to start aca_view.
- Parameters:
obsid – int The observation ID (optional)
start – CxoTime Get data after this time (optional)
stop – CxoTime Get data before this time (optional)
filenames – list List of filenames (optional)
mica – bool Flag to enable mica (if neither MAUDE nor mica are enabled explicitly, both are enabled)
maude – bool Flag to enable MAUDE (if neither MAUDE nor mica are enabled explicitly, both are enabled)
maude_channel – str MAUDE channel (default: “FLIGHT”)
multi_process – bool Flag to enable/disable fetching in multiple processes
real_time – bool Flag to enable real-time mode
real_time_offset – int Time offset between ‘now’ and the time to query MAUDE (for testing real-time-like behavior with old data)
max_workers – int Maximum number of workers in multi-process mode
aca_view_fetch¶
Example script that runs the ACA-view data service in standalone mode.
usage: aca_view_fetch [-h] [--out OUT] [--obsid OBSID] [--start START]
[--stop STOP] [--mica] [--maude]
[--maude-channel {FLIGHT,ASVT}]
[--real-time-offset REAL_TIME_OFFSET] [--real-time]
[--max-workers MAX_WORKERS]
[filenames ...]
Positional Arguments¶
- filenames
If files are specified, obsid/start/stop are ignored.
Default: []
Named Arguments¶
- --out
- --obsid
ObsID. If given, start/stop are ignored.
- --start
Starting time (any valid CxoTime string format).
- --stop
Stopping time (any valid CxoTime string format). Default is NOW if --start is given.
- --mica
Use mica. If neither --mica nor --maude is given, it will try MAUDE and then mica.
Default: False
- --maude
Use MAUDE. If neither --mica nor --maude is given, it will try MAUDE and then mica.
Default: False
- --maude-channel
Possible choices: FLIGHT, ASVT
MAUDE channel to use (one of: FLIGHT, ASVT)
Default: “FLIGHT”
- --real-time-offset
Shift times in MAUDE requests by this amount in real-time mode.
Default: 0.0
- --real-time
Use MAUDE to fetch real-time data.
Default: False
- --max-workers
Maximum number of workers in multi-process mode.
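For example (the OBSID and times are illustrative):
aca_view_fetch --obsid 8008 --maude
aca_view_fetch --start 2022:240:10:00 --stop 2022:240:10:30 --mica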