Ska Package Helpers¶
Ska_helpers is a collection of utilities for the Ska3 runtime environment.
Chandra Models Data¶
Get data from chandra_models repository.
- ska_helpers.chandra_models.chandra_models_cache(func)[source]¶
Decorator to cache outputs for a function that gets chandra_models data.
The key used for caching the function output includes the passed arguments and keyword arguments, as well as the values of the environment variables below. This ensures that the cache is invalidated if any of these environment variables change:
CHANDRA_MODELS_REPO_DIR
CHANDRA_MODELS_DEFAULT_VERSION
THERMAL_MODELS_DIR_FOR_MATLAB_TOOLS_SW
Example:
@chandra_models_cache
def get_aca_spec_info(version=None):
    _, info = get_data("chandra_models/xija/aca/aca_spec.json", version=version)
    return info
- ska_helpers.chandra_models.get_data(file_path: str | Path, version: str | None = None, repo_path: str | Path | None = None, require_latest_version: bool = False, timeout: int | float = 5, read_func: Callable | None = None, read_func_kwargs: dict | None = None) tuple [source]¶
Get data from chandra_models repository.
There are three environment variables that impact the behavior:
CHANDRA_MODELS_REPO_DIR or THERMAL_MODELS_DIR_FOR_MATLAB_TOOLS_SW
: override the default root for the chandra_models repository
CHANDRA_MODELS_DEFAULT_VERSION
: override the default repo version. You can set this to a fixed version in unit tests (e.g. with monkeypatch), or set it to a development branch to test a model file update with applications like yoshi, where specifying a version would require a long chain of API updates.
THERMAL_MODELS_DIR_FOR_MATLAB_TOOLS_SW is used to define the chandra_models repository location when running in the MATLAB tools software environment. If this environment variable is set then the git is_dirty() check of the chandra_models directory is skipped, because the chandra_models repository is verified via SVN in that environment. Users of the FOT MATLAB tools should exercise caution when using locally-modified files for testing, since the version information reported by this function will not be correct in that case.
- Parameters:
- file_path : str, Path
Name of model file within the chandra_models repository
- version : str
Tag, branch or commit of chandra_models to use (default=latest tag from repo). If the CHANDRA_MODELS_DEFAULT_VERSION environment variable is set then this is used as the default. This is useful for testing.
- repo_path : str, Path
Path to directory or URL containing chandra_models repository (default is $SKA/data/chandra_models or either of the CHANDRA_MODELS_REPO_DIR or THERMAL_MODELS_DIR_FOR_MATLAB_TOOLS_SW environment variables if set).
- require_latest_version : bool
Require that version matches the latest release on GitHub.
- timeout : int, float
Timeout (sec) for querying GitHub for the expected chandra_models version. Default = 5 sec.
- read_func : callable
Optional function to read the data file. This function must take the file path as its first argument. If not provided then read the file as a text file.
- read_func_kwargs : dict
Optional dict of kwargs to pass to read_func.
- Returns:
- tuple of dict, str
Xija model specification dict, chandra_models version
Examples
First we read the model specification for the ACA model. The get_data() function returns the text of the model spec, so we need to use json.loads() to convert it to a dict.
>>> import json
>>> from astropy.io import fits
>>> from ska_helpers import chandra_models
>>> txt, info = chandra_models.get_data("chandra_models/xija/aca/aca_spec.json")
>>> model_spec = json.loads(txt)
>>> model_spec["name"]
'aacccdpt'
Next we read the acquisition probability model image. Since the image is a gzipped FITS file we need to use a helper function to read it.
>>> def read_fits_image(file_path):
...     with fits.open(file_path) as hdus:
...         out = hdus[1].data
...     return out, file_path
...
>>> acq_model_image, info = chandra_models.get_data(
...     "chandra_models/aca_acq_prob/grid-floor-2018-11.fits.gz",
...     read_func=read_fits_image
... )
>>> acq_model_image.shape
(141, 31, 7)
Now let’s get the version of the chandra_models repository:
>>> chandra_models.get_repo_version()
'3.47'
Finally get version 3.30 of the ACA model spec from GitHub. The use of a lambda function to read the JSON file is compact but not recommended for production code.
>>> model_spec_3_30, info = chandra_models.get_data(
...     "chandra_models/xija/aca/aca_spec.json",
...     version="3.30",
...     repo_path="https://github.com/sot/chandra_models.git",
...     read_func=lambda fn: (json.load(open(fn, "rb")), fn),
... )
>>> model_spec_3_30 == model_spec
False
- ska_helpers.chandra_models.get_github_version(url: str = 'https://api.github.com/repos/sot/chandra_models/releases/latest', timeout: int | float = 5) bool | None [source]¶
Get latest chandra_models GitHub repo release tag (version).
This queries GitHub for the latest release of chandra_models.
- Parameters:
- url : str
URL for latest chandra_models release on GitHub API
- timeout : int, float
Request timeout (sec, default=5)
- Returns:
- str, None
Tag name (str) of latest chandra_models release on GitHub. None if the request timed out, indicating indeterminate answer.
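A minimal usage sketch: query GitHub for the latest release tag and handle the timed-out case, which returns None (the release number shown is illustrative).
from ska_helpers import chandra_models

version = chandra_models.get_github_version(timeout=10)
if version is None:
    print("GitHub request timed out; latest release is indeterminate")
else:
    print(f"Latest chandra_models release: {version}")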
Environment¶
The ska_helpers.environment
module provides a function to configure the Ska3
runtime environment at the point of import of every Ska3 package.
- ska_helpers.environment.configure_ska_environment()[source]¶
Configure environment for Ska3 runtime.
This is called by ska_helpers.version.get_version() and thus gets called upon import of every Ska3 package.
This includes setting NUMBA_CACHE_DIR to $HOME/.ska3/cache/numba if that env var is not already defined. This is to avoid problems with read-only filesystems.
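A minimal sketch of the effect on the environment; calling the function directly is normally unnecessary since it runs on import of any Ska3 package:
import os

from ska_helpers.environment import configure_ska_environment

configure_ska_environment()
# NUMBA_CACHE_DIR is now defined: either its pre-existing value or the
# $HOME/.ska3/cache/numba default.
print(os.environ.get("NUMBA_CACHE_DIR"))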
Git helpers¶
Helper functions for using git.
- ska_helpers.git_helpers.make_git_repo_safe(path: str | Path) None [source]¶
Ensure git repo at path is a safe git repository.
A “safe” repo is one which is owned by the user calling this function. See: https://github.blog/2022-04-12-git-security-vulnerability-announced/#cve-2022-24765
If an unsafe repo is detected then this command issues a warning to that effect and then updates the user’s git config to add this repo as a safe directory.
This function should only be called for known safe git repos such as $SKA/data/chandra_models.
- Parameters:
path – str, Path Path to top level of a git repository
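For example, an application that reads from the shared chandra_models repository might call this before any git operations. A sketch, assuming a standard Ska environment with $SKA defined:
import os
from pathlib import Path

from ska_helpers.git_helpers import make_git_repo_safe

repo_path = Path(os.environ["SKA"]) / "data" / "chandra_models"
make_git_repo_safe(repo_path)  # warns and updates git config only if needed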
Logging¶
- ska_helpers.logging.basic_logger(name, format='%(asctime)s %(funcName)s: %(message)s', propagate=False, **kwargs)[source]¶
Create logger name using logging.basicConfig.
This is a thin wrapper around logging.basicConfig, except:
- Uses logger named name instead of the root logger
- Defaults to a standard format for Ska applications. Specify format=None to use the default basicConfig format.
- Not recommended for multithreaded or multiprocess applications due to using a temporary monkey-patch of a global variable to create the logger. It will probably work but it is not guaranteed.
This function does nothing if the name logger already has handlers configured, unless the keyword argument force is set to True. It is a convenience method intended to do one-shot creation of a logger.
The default behaviour is to create a StreamHandler which writes to sys.stderr, set a formatter using the format string "%(asctime)s %(funcName)s: %(message)s", and add the handler to the name logger with a level of WARNING.
By default the created logger will not propagate to parent loggers. This is to prevent unexpected logging from other packages that set up a root logger. To propagate to parent loggers, set propagate=True. See https://docs.python.org/3/howto/logging.html#logging-flow, in particular how the log level of parent loggers is ignored in message handling.
Example:
# In __init__.py for a package or in any module
from ska_helpers.logging import basic_logger
logger = basic_logger(__name__, level='INFO')

# In other submodules within a package the normal usage is to inherit
# the package logger.
import logging
logger = logging.getLogger(__name__)
A number of optional keyword arguments may be specified, which can alter the default behaviour.
- filename
Specifies that a FileHandler be created, using the specified filename, rather than a StreamHandler.
- filemode
Specifies the mode to open the file, if filename is specified (if filemode is unspecified, it defaults to ‘a’).
- format
Use the specified format string for the handler.
- datefmt
Use the specified date/time format.
- style
If a format string is specified, use this to specify the type of format string (possible values ‘%’, ‘{’, ‘$’, for %-formatting, str.format() and string.Template - defaults to ‘%’).
- level
Set the name logger level to the specified level. This can be a number (10, 20, …) or a string (‘NOTSET’, ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘ERROR’, ‘CRITICAL’) or logging.DEBUG, etc.
- stream
Use the specified stream to initialize the StreamHandler. Note that this argument is incompatible with ‘filename’ - if both are present, ‘stream’ is ignored.
- handlers
If specified, this should be an iterable of already created handlers, which will be added to the name logger. Any handler in the list which does not have a formatter assigned will be assigned the formatter created in this function.
- force
If this keyword is specified as true, any existing handlers attached to the name logger are removed and closed, before carrying out the configuration as specified by the other arguments.
Note that you could specify a stream created using open(filename, mode) rather than passing the filename and mode in. However, it should be remembered that StreamHandler does not close its stream (since it may be using sys.stdout or sys.stderr), whereas FileHandler closes its stream when the handler is closed.
Note this function is probably not thread-safe.
- Parameters:
- name : str
logger name
- format : str
format string for handler
- propagate : bool
propagate to parent loggers (default=False)
- **kwargs : dict
other keyword arguments for logging.basicConfig
- Returns:
- logger : Logger object
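As a further illustration, the optional keyword arguments above pass straight through to logging.basicConfig, so a file-based logger can be configured in one call. A sketch; the logger name and file name are illustrative:
from ska_helpers.logging import basic_logger

# Log INFO and above to a file, replacing any handlers already attached
# to this logger.
logger = basic_logger("my_task", level="INFO", filename="my_task.log", force=True)
logger.info("processing started")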
Retry¶
Retry package initially copied from https://github.com/invl/retry.
That project appears to be abandoned, so the code has been moved into ska_helpers.
LICENSE:
Copyright 2014 invl
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
- exception ska_helpers.retry.RetryError(failures)[source]¶
Keep track of the stack of exceptions when trying multiple times.
- Parameters:
- failures : list of dict, each with keys ‘type’, ‘value’, ‘trace’.
- ska_helpers.retry.retry(exceptions=<class 'Exception'>, tries=-1, delay=0, max_delay=None, backoff=1, jitter=0, logger=<Logger ska_helpers.retry.api (WARNING)>, mangle_alert_words=False)[source]¶
Returns a retry decorator.
- Parameters:
exceptions – an exception or a tuple of exceptions to catch. default: Exception.
tries – the maximum number of attempts. default: -1 (infinite).
delay – initial delay between attempts. default: 0.
max_delay – the maximum value of delay. default: None (no limit).
backoff – multiplier applied to delay between attempts. default: 1 (no backoff).
jitter – extra seconds added to delay between attempts. default: 0. fixed if a number, random if a range tuple (min, max)
logger – logger.warning(fmt, error, delay) will be called on failed attempts. default: retry.logging_logger. if None, logging is disabled.
mangle_alert_words – if True, mangle alert words “warning”, “error”, “fatal”, “exception” when issuing a logger warning message. Default: False.
- Returns:
a retry decorator.
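A usage sketch of the decorator with exponential backoff; the flaky function below is made up for illustration:
import random

from ska_helpers import retry

@retry.retry(exceptions=ConnectionError, tries=4, delay=1, backoff=2, jitter=(0, 1))
def fetch_data():
    # Stand-in for a flaky network call that sometimes fails transiently
    if random.random() < 0.5:
        raise ConnectionError("transient failure")
    return "data"

# fetch_data() is attempted up to 4 times, waiting roughly 1, 2 and 4 seconds
# (plus jitter) between attempts; the last exception is re-raised if all fail.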
- ska_helpers.retry.retry_call(f, args=None, kwargs=None, exceptions=<class 'Exception'>, tries=-1, delay=0, max_delay=None, backoff=1, jitter=0, logger=<Logger ska_helpers.retry.api (WARNING)>, mangle_alert_words=False)[source]¶
Calls a function and re-executes it if it failed.
- Parameters:
f – the function to execute.
args – the positional arguments of the function to execute.
kwargs – the named arguments of the function to execute.
exceptions – an exception or a tuple of exceptions to catch. default: Exception.
tries – the maximum number of attempts. default: -1 (infinite).
delay – initial delay between attempts. default: 0.
max_delay – the maximum value of delay. default: None (no limit).
backoff – multiplier applied to delay between attempts. default: 1 (no backoff).
jitter – extra seconds added to delay between attempts. default: 0. fixed if a number, random if a range tuple (min, max)
logger – logger.warning(fmt, error, delay) will be called on failed attempts. default: retry.logging_logger. if None, logging is disabled.
mangle_alert_words – if True, mangle alert words “warning”, “error”, “fatal”, “exception”, “fail” when issuing a logger warning message. Default: False.
- Returns:
the result of the f function.
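A sketch of the call-time form, useful when the function being retried cannot be decorated. The connect_to_server function, host and port here are hypothetical:
import socket

from ska_helpers import retry

def connect_to_server(host, port, timeout=5.0):
    # Illustrative function that may raise OSError on transient failures
    return socket.create_connection((host, port), timeout=timeout)

conn = retry.retry_call(
    connect_to_server,
    args=("host.example.com", 6000),   # positional args for the function
    kwargs={"timeout": 2.0},           # keyword args for the function
    exceptions=OSError,
    tries=3,
    delay=0.5,
    backoff=2,
)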
- ska_helpers.retry.tables_open_file(*args, **kwargs)[source]¶
Call tables.open_file(*args, **kwargs) with retry up to 3 times.
This only catches tables.exceptions.HDF5ExtError. After an initial failure it will try again after 2 seconds and once more after 4 seconds.
- Parameters:
*args –
args passed through to tables.open_file()
mangle_alert_words – (keyword-only) if True, mangle alert words “warning”, “error”, “fatal”, “exception”, “fail” when issuing a logger warning message. Default: True.
retry_delay – (keyword-only) initial delay between attempts. default: 2.
retry_tries – (keyword-only) the maximum number of attempts. default: 3.
retry_backoff – (keyword-only) multiplier applied to delay between attempts. default: 2.
retry_logger – (keyword-only) logger.warning(msg) will be called.
**kwargs –
additional kwargs passed through to tables.open_file()
- Returns:
tables file handle
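A usage sketch (requires the pytables package; the file path is illustrative):
from ska_helpers.retry import tables_open_file

with tables_open_file("/path/to/data.h5", mode="r") as h5:
    table = h5.root.data  # read from the file as usual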
Setup Helpers¶
- ska_helpers.setup_helper.duplicate_package_info(vals, name_in, name_out)[source]¶
Duplicate a list or dict of values in place, replacing name_in with name_out.
Normally used in setup.py for making a namespace package that copies a flat one. For an example see setup.py in the ska_sun or Ska.Sun repo.
- Parameters:
vals – list or dict of values
name_in – string to replace at start of each value
name_out – output string
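A sketch of typical setup.py usage, assuming a flat package ska_xyz that should also be importable under a namespace name Ska.Xyz (the names are illustrative):
from ska_helpers.setup_helper import duplicate_package_info

packages = ["ska_xyz", "ska_xyz.tests"]
package_dir = {"ska_xyz": "ska_xyz"}

duplicate_package_info(packages, "ska_xyz", "Ska.Xyz")
duplicate_package_info(package_dir, "ska_xyz", "Ska.Xyz")
# packages should now also include "Ska.Xyz" and "Ska.Xyz.tests", and
# package_dir should map "Ska.Xyz" to the same source directory.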
Utilities¶
- class ska_helpers.utils.LRUDict(capacity=128)[source]¶
Dict that maintains a fixed capacity and evicts least recently used item when full.
Inherits from collections.OrderedDict to maintain the order of insertion.
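A minimal sketch of the capacity behavior (a small capacity is used here for illustration):
from ska_helpers.utils import LRUDict

cache = LRUDict(capacity=2)
cache["a"] = 1
cache["b"] = 2
cache["c"] = 3  # exceeds capacity, so the least recently used item is evicted
print(len(cache))    # 2
print("a" in cache)  # False: "a" was the least recently used entry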
- class ska_helpers.utils.LazyDict(load_func, *args, **kwargs)[source]¶
Dict which is lazy-initialized using the supplied function load_func.
This class allows defining a module-level dict that is expensive to initialize, where the initialization is done lazily (only when actually needed).
- Parameters:
- load_func : function
Reference to a function that returns a dict to init this dict object
- *args
Arguments list for load_func
- **kwargs
Keyword arguments for load_func
Examples
from ska_helpers.utils import LazyDict

def load_func(a, b):
    # Some expensive function in practice
    print('Here in load_func')
    return {'a': a, 'b': b}

ONE = LazyDict(load_func, 1, 2)
print('ONE is defined but not yet loaded')
print(ONE['a'])
- copy() a shallow copy of D ¶
- get(key, default=None, /)¶
Return the value for key if key is in the dictionary, else default.
- items() a set-like object providing a view on D's items ¶
- keys() a set-like object providing a view on D's keys ¶
- pop(k[, d]) v, remove specified key and return the corresponding value. ¶
If the key is not found, return the default if given; otherwise, raise a KeyError.
- popitem()¶
Remove and return a (key, value) pair as a 2-tuple.
Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.
- setdefault(key, default=None, /)¶
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- values() an object providing a view on D's values ¶
- class ska_helpers.utils.LazyVal(load_func, *args, **kwargs)[source]¶
Value which is lazy-initialized using the supplied function load_func.
This class allows defining a module-level value that is expensive to initialize, where the initialization is done lazily (only when actually needed).
The lazy value is accessed using the val property.
- Parameters:
- load_func : function
Reference to a function that returns the value to init this object
- *args
Arguments list for load_func
- **kwargs
Keyword arguments for load_func
Examples
from ska_helpers.utils import LazyVal

def load_func(a):
    # Some expensive function in practice
    print('Here in load_func')
    return a

ONE = LazyVal(load_func, 1)
print('ONE is defined but not yet loaded')
print(ONE.val)
- class ska_helpers.utils.TypedDescriptor(*, default=None, required=False, cls=None)[source]¶
Class to create a descriptor for a dataclass attribute that is cast to a type.
This is a base class for creating a descriptor that can be used to define an attribute on a dataclass that is cast to a specific type. The type is specified by setting the cls class attribute on the descriptor class.
Most commonly cls is a class like CxoTime or Quat, but it could also be a built-in like int or float or any callable function.
This descriptor can be used either as a base class with the cls class attribute set accordingly, or as a descriptor with the cls keyword argument set.
This descriptor class is recommended for use within a dataclass. In a normal class the default value must be set to the correct type since it will not be coerced to the correct type automatically.
The default value cannot be list, dict, or set since these are mutable and are disallowed by the dataclass machinery. In most cases a list can be replaced by a tuple and a dict can be replaced by an OrderedDict.
- Parameters:
- default : optional
Default value for the attribute. If specified and not None, it will be coerced to the correct type via cls(default). If not specified, the default for the attribute is None.
- required : bool, optional
If True, the attribute is required to be set explicitly when the object is created. If False the default value is used if the attribute is not set.
Examples
>>> from dataclasses import dataclass
>>> from ska_helpers.utils import TypedDescriptor
Here we make a dataclass with an attribute that is cast to an int.
>>> @dataclass
... class SomeClass:
...     int_val: int = TypedDescriptor(required=True, cls=int)
>>> obj = SomeClass(10.5)
>>> obj.int_val
10
Here we define a QuatDescriptor class that can be used repeatedly for any quaternion attribute.
>>> from Quaternion import Quat
>>> class QuatDescriptor(TypedDescriptor):
...     cls = Quat
>>> @dataclass
... class MyClass:
...     att1: Quat = QuatDescriptor(required=True)
...     att2: Quat = QuatDescriptor(default=[10, 20, 30])
...     att3: Quat | None = QuatDescriptor()
...
>>> obj = MyClass(att1=[0, 0, 0, 1])
>>> obj.att1
<Quat q1=0.00000000 q2=0.00000000 q3=0.00000000 q4=1.00000000>
>>> obj.att2.equatorial
array([10., 20., 30.])
>>> obj.att3 is None
True
>>> obj.att3 = [10, 20, 30]
>>> obj.att3.equatorial
array([10., 20., 30.])
- ska_helpers.utils.convert_to_int_float_str(val: str) int | float | str [source]¶
Convert an input string into an int, float, or string.
This tries to convert the input string into an int using the built-in int() function. If that fails then it tries float(), and finally if that fails it returns the original string.
This function is often useful when parsing text representations of structured data where the data types are implicit.
- Parameters:
- val : str
The input string to convert
- Returns:
- int, float, or str
The input value as an int, float, or string.
Notes
An input string like “01234” is interpreted as a decimal integer and will be returned as the integer 1234. In some contexts a leading 0 indicates an octal number and to avoid confusion in Python a leading 0 is not allowed in a decimal integer literal.
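A few illustrative conversions:
from ska_helpers.utils import convert_to_int_float_str

convert_to_int_float_str("42")      # 42 (int)
convert_to_int_float_str("1.5e3")   # 1500.0 (float)
convert_to_int_float_str("01234")   # 1234 (leading zero is accepted here)
convert_to_int_float_str("hello")   # 'hello' (unchanged string)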
- ska_helpers.utils.lru_cache_timed(maxsize=128, typed=False, timeout=3600)[source]¶
LRU cache decorator where the cache expires after timeout seconds.
This wraps the functools.lru_cache decorator so that the entire cache gets cleared if the cache is older than timeout seconds.
This is mostly copied from this gist, with no license specified: https://gist.github.com/helix84/05ee246d6c80bc7bacdfa6a62fbff3fa
The cachetools package provides a way to apply the timeout per-item, if that is required.
- Parameters:
- maxsize : int
functools.lru_cache maxsize parameter
- typed : bool
functools.lru_cache typed parameter
- timeout : int, float
Clear cache after timeout seconds from last clear
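A usage sketch with a made-up slow function:
import time

from ska_helpers.utils import lru_cache_timed

@lru_cache_timed(maxsize=8, timeout=60)
def slow_lookup(key):
    time.sleep(1)  # stand-in for an expensive computation or query
    return key.upper()

slow_lookup("abc")  # first call: takes ~1 second and caches the result
slow_lookup("abc")  # cached result until 60 seconds after the last cache clear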
- ska_helpers.utils.temp_env_var(name, value)[source]¶
A context manager that temporarily sets an environment variable.
Example:
>>> os.environ.get("MY_VARIABLE")
None
>>> with temp_env_var("MY_VARIABLE", "my_value"):
...     os.environ.get("MY_VARIABLE")
...
'my_value'
>>> os.environ.get("MY_VARIABLE")
None
- Parameters:
name – str Name of the environment variable to set.
value – str Value to set the environment variable to.
Version Info¶
The ska_helpers.version
module provides utilities to handle package
versions. The version of a package is determined using importlib if it is
installed, and setuptools_scm
otherwise.
- ska_helpers.version.get_version(package, distribution=None)[source]¶
Get version string for package with optional distribution name.
If the package is not from an installed distribution then get version from git using setuptools_scm.
- Parameters:
- package
package name, typically __package__
- distribution
name of distribution if different from package (Default value = None)
- Returns:
- str
Version string
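The common pattern is a single line in a package's __init__.py (a minimal sketch; the package name is whatever module this lives in):
# my_package/__init__.py
from ska_helpers.version import get_version

__version__ = get_version(__package__)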
- ska_helpers.version.parse_version(version)[source]¶
Parse version string and return a dictionary with version information. This only handles the default scheme.
- Parameters:
- version
str
- Returns:
- dict
version information
Default versioning scheme¶
What follows is the scheme as described in setuptools_scm’s documentation.
In the standard configuration setuptools_scm
takes a look at three things:
latest tag (with a version number)
the distance to this tag (e.g. number of revisions since latest tag)
workdir state (e.g. uncommitted changes since latest tag)
and uses roughly the following logic to render the version:
- no distance and clean: {tag}
- distance and clean: {next_version}.dev{distance}+{scm letter}{revision hash}
- no distance and not clean: {tag}+dYYYYMMDD
- distance and not clean: {next_version}.dev{distance}+{scm letter}{revision hash}.dYYYYMMDD
The next version is calculated by adding 1 to the last numeric component of the tag.
For Git projects, the version relies on git describe, so you will see an additional g prepended to the {revision hash}.
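For illustration, applying these templates to a git repo whose latest tag is 3.47 would render roughly as follows (the hash and date are made up):
- no distance and clean: 3.47
- distance of 2 and clean: 3.48.dev2+g1a2b3c4
- no distance and not clean: 3.47+d20240115
- distance of 2 and not clean: 3.48.dev2+g1a2b3c4.d20240115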
Due to the default behavior it’s necessary to always include a patch version (the 3 in 1.2.3), or else the automatic guessing will increment the wrong part of the SemVer (e.g. tag 2.0 results in 2.1.devX instead of 2.0.1.devX). So please make sure to tag accordingly.
Run time information¶
The ska_helpers.run_info
module provides convenience functions to get
and print relevant run time information such as machine name, user name, date,
program version, and so on. This is aimed at executable scripts and cron jobs.
- ska_helpers.run_info.get_run_info(opt=None, *, version=None, stack_level=1)[source]¶
Get run time information as dict.
- Parameters:
- opt
argparse options (Default value = None)
- version
program version (default=__version__ in calling module)
- stack_level
stack level for getting calling module (Default value = 1)
- Returns:
- dict
run information
- ska_helpers.run_info.get_run_info_lines(opt=None, *, version=None, stack_level=2)[source]¶
Get run time information as formatted lines.
- Parameters:
- opt
argparse options (Default value = None)
- version
program version (default=__version__ in calling module)
- stack_level
stack level for getting calling module (Default value = 2)
- Returns:
- list
formatted information lines
- ska_helpers.run_info.log_run_info(log_func, opt=None, *, version=None, stack_level=3)[source]¶
Output run time information as formatted lines via log_func.
Each formatted line is passed to log_func.
- Parameters:
- log_func
logger output function (e.g. logger.info)
- opt
argparse options (Default value = None)
- version
program version (default=__version__ in calling module)
- stack_level
stack level for getting calling module (Default value = 3)
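A sketch of typical usage in a script entry point, assuming an argparse namespace opt and a module-level __version__ in the calling module (the names are illustrative):
from ska_helpers.logging import basic_logger
from ska_helpers.run_info import log_run_info

logger = basic_logger(__name__, level="INFO")

def main(opt=None):
    # Emit one log line per formatted run-info line (user, machine, version, ...)
    log_run_info(logger.info, opt)
    ...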