ska_numpy
Provide useful utilities for numpy.
- ska_numpy.Numpy.add_column(recarray, name, val, index=None)
Add a column name with value val to recarray and return a new record array.
- Parameters
recarray – Input record array
name – Name of the new column
val – Value of the new column (np.array or list)
index – Add column before index (default: append at end)
- Return type
New record array with the column added
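The behavior described above can be approximated with plain NumPy. The following is a minimal sketch, not the library's implementation: the name add_column_sketch is hypothetical, and it assumes val is one-dimensional with one entry per row.

```python
import numpy as np


def add_column_sketch(recarray, name, val, index=None):
    """Hypothetical stand-in: return a new record array with `name` added."""
    val = np.asarray(val)
    # Describe the existing fields as (name, dtype) pairs.
    descr = [(n, recarray.dtype[n]) for n in recarray.dtype.names]
    # Insert the new field before `index`, or append when index is None.
    if index is None:
        index = len(descr)
    descr.insert(index, (name, val.dtype))
    # Allocate the output and copy the columns across; the input is untouched.
    out = np.empty(len(recarray), dtype=descr)
    for n in recarray.dtype.names:
        out[n] = recarray[n]
    out[name] = val
    return out.view(np.recarray)


arec = np.rec.fromrecords([(1, 2.0), (3, 4.0)], names=('a', 'b'))
new = add_column_sketch(arec, 'c', [10, 20], index=1)
print(new.dtype.names)
```

Because a new array is allocated, the input record array keeps its original columns.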
- ska_numpy.Numpy.compress(recarray, delta=None, indexcol=None, diff=None, avg=None, colnames=None)
Compress recarray rows into intervals where adjacent rows are similar.
In addition to the original column names, the output recarray will have these columns:
<indexcol>_start – start value of the indexcol column.
<indexcol>_stop – stop value of the indexcol column (inclusive up to the next interval).
samples – number of samples in interval
If indexcol is None (default) then the table row index will be used and the output columns will be row_start and row_stop.
delta is a dict mapping column names to a delta value defining whether a column is sufficiently different to break the interval. These are used when generating the default diff functions for numerical columns (i.e. those for which abs(x) succeeds).
diff is a dict mapping column names to functions that take as input two values and return a boolean indicating whether the values are sufficiently different to break the interval. Default diff functions will be generated if diff is None or for columns without an entry.
avg is a dict mapping column names to functions that calculate the average of a numpy array of values for that column. Default avg functions will be generated if avg is None or for columns without an entry.
Example:
    import numpy
    from ska_numpy import compress

    # Rows of (col1, col2, greet, time)
    a = ((1, 2, 'hello', 2.),
         (1, 4, 'hello', 3.),
         (1, 2, 'hello', 4.),
         (1, 2, 'hi there', 5.),
         (1, 2, 'hello', 6.),
         (3, 2, 'hello', 7.),
         (1, 2, 'hello', 8.),
         (2, 2, 'hello', 9.))
    arec = numpy.rec.fromrecords(a, names=('col1','col2','greet','time'))
    acomp = compress(arec, indexcol='time', delta={'col1':1.5})
- Parameters
delta – dict of delta thresholds defining when to break interval
indexcol – name of column to report start and stop values for interval.
diff – dict of functions defining the diff of 2 vals for that column name.
avg – dict of functions defining the average value for that column name.
colnames – list of column names to include (default = all).
- Return type
record array of compressed values
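To make the interval idea concrete, here is a heavily simplified sketch for a single numeric column. It is not the library's algorithm: compress_sketch is a hypothetical name, it assumes a non-empty input, it starts a new interval whenever a value differs from the previous sample by more than delta, it averages with the mean, and it reports the stop as the last index value inside the interval. The real function may make different choices on each of these points.

```python
import numpy as np


def compress_sketch(values, index, delta):
    """Hypothetical one-column illustration of interval compression."""
    values = np.asarray(values, dtype=float)
    index = np.asarray(index)
    # Row positions where the jump from the previous row exceeds delta.
    breaks = np.flatnonzero(np.abs(np.diff(values)) > delta) + 1
    starts = np.concatenate(([0], breaks))
    stops = np.concatenate((breaks, [len(values)]))
    out = np.empty(len(starts), dtype=[('value', float),
                                       ('index_start', index.dtype),
                                       ('index_stop', index.dtype),
                                       ('samples', int)])
    for i, (i0, i1) in enumerate(zip(starts, stops)):
        # Average value, first and last index value, and row count.
        out[i] = (values[i0:i1].mean(), index[i0], index[i1 - 1], i1 - i0)
    return out


col1 = [1, 1, 1, 1, 1, 3, 1, 2]
time = [2., 3., 4., 5., 6., 7., 8., 9.]
print(compress_sketch(col1, time, delta=1.5))
```

With the col1 and time values from the example above and delta=1.5, this sketch produces three intervals of 5, 1 and 2 samples.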
- ska_numpy.Numpy.filter(recarray, filters)
Apply the list of filters to the numpy record array recarray and return the filtered recarray. See match for a description of the filter syntax.
- Parameters
recarray – Input numpy record array
filters – List of filters
- Return type
Filtered record array
- ska_numpy.Numpy.interpolate(yin, xin, xout, method='linear', sorted=False, cython=True)
Interpolate the curve defined by (xin, yin) at points xout. The array xin must be monotonically increasing. The output has the same data type as the input yin.
- Parameters
yin – y values of input curve
xin – x values of input curve
xout – x values of output interpolated curve
method – interpolation method (‘linear’ | ‘nearest’)
sorted – xout values are sorted so use search_both_sorted
cython – use Cython interpolation code if possible (default=True)
- Return type
numpy array with interpolated curve
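The two methods can be illustrated with a small pure-NumPy sketch. This is not the library's code (which may use Cython): interpolate_sketch is a hypothetical name, it assumes xin is strictly increasing with at least two points, and it holds the end values for xout outside the range of xin, which is an assumption rather than documented behavior.

```python
import numpy as np


def interpolate_sketch(yin, xin, xout, method='linear'):
    """Hypothetical illustration of 'linear' and 'nearest' interpolation."""
    yin = np.asarray(yin)
    xin = np.asarray(xin, dtype=float)
    xout = np.asarray(xout, dtype=float)
    # Bracket each xout between input points i0 and i1 = i0 + 1.
    i1 = np.clip(np.searchsorted(xin, xout), 1, len(xin) - 1)
    i0 = i1 - 1
    if method == 'nearest':
        # Pick whichever bracketing point is closer (ties go to the left).
        idx = np.where(np.abs(xout - xin[i0]) <= np.abs(xin[i1] - xout), i0, i1)
        return yin[idx]
    if method == 'linear':
        # Fractional position inside the bracket, clipped so that points
        # outside the input range take the end values.
        frac = np.clip((xout - xin[i0]) / (xin[i1] - xin[i0]), 0.0, 1.0)
        yout = yin[i0] + frac * (yin[i1] - yin[i0])
        # Match the documented behavior: output dtype follows yin.
        return yout.astype(yin.dtype)
    raise ValueError("method must be 'linear' or 'nearest'")


xin = np.array([0.0, 1.0, 2.0])
yin = np.array([0.0, 10.0, 20.0])
print(interpolate_sketch(yin, xin, [0.4, 1.75]))
print(interpolate_sketch(yin, xin, [0.4, 1.75], method='nearest'))
```

Note that casting back to the dtype of yin truncates linear results when yin holds integers.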
- ska_numpy.Numpy.match(recarray, filters)
Apply the list of filters to the numpy record array recarray and return the corresponding boolean mask array.
Each filter is a string with a simple boolean comparison of the form:
colname op value
where colname is a column name in recarray, op is an operator (e.g. == or < or >= etc), and value is a value. String values can optionally be enclosed in single or double quotes.
The pseudo-column name ‘_row_’ can be used to filter on the row number.
- Parameters
recarray – Input numpy record array
filters – List of filters or string with one filter
- Return type
boolean mask array
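The filter syntax can be illustrated with a small parser built on the standard library. This is a sketch under stated assumptions, not the library's parser: match_sketch, OPS and FILTER_RE are hypothetical names, only six operators are handled, and string columns are assumed to be unicode.

```python
import operator
import re

import numpy as np

# Hypothetical operator table and filter pattern for this sketch.
OPS = {'==': operator.eq, '!=': operator.ne, '<=': operator.le,
       '>=': operator.ge, '<': operator.lt, '>': operator.gt}
FILTER_RE = re.compile(r'\s*(\w+)\s*(==|!=|<=|>=|<|>)\s*(.+?)\s*$')


def match_sketch(recarray, filters):
    """Return a boolean mask of rows that satisfy every filter."""
    if isinstance(filters, str):
        filters = [filters]
    mask = np.ones(len(recarray), dtype=bool)
    for filt in filters:
        m = FILTER_RE.match(filt)
        if m is None:
            raise ValueError('cannot parse filter: %r' % filt)
        colname, op, value = m.groups()
        # '_row_' is the pseudo-column holding the row number.
        if colname == '_row_':
            column = np.arange(len(recarray))
        else:
            column = recarray[colname]
        # Convert the value text to match the column type.
        if column.dtype.kind in 'iu':
            value = int(value)
        elif column.dtype.kind == 'f':
            value = float(value)
        else:
            value = value.strip('\'"')  # optional quotes around strings
        mask &= OPS[op](column, value)
    return mask


rows = [(1, 'hello', 2.0), (1, 'hi there', 3.0), (3, 'hello', 4.0)]
arec = np.rec.fromrecords(rows, names=('col1', 'greet', 'time'))
mask = match_sketch(arec, ["col1 == 1", "greet == 'hello'"])
print(mask)
print(arec[mask])
```

Indexing the record array with the mask, as in the last line, gives the kind of result that filter is documented to return.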
- ska_numpy.Numpy.pformat(recarray, fmt=None)
Light wrapper around ska_numpy.pprint to return a string instead of printing to a file.
- Parameters
recarray – input record array
fmt – dict of format specifiers (optional)
- Return type
string
- ska_numpy.Numpy.pprint(recarray, fmt=None, out=sys.stdout)
Print a nicely-formatted version of recarray to the out file-like object. If fmt is provided it should be a dict of colname:fmt_spec pairs where fmt_spec is a format specifier (e.g. ‘%5.2f’).
- Parameters
recarray – input record array
fmt – dict of format specifiers (optional)
out – output file-like object
- Return type
None
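The relationship between pprint and pformat (one writes to a file-like object, the other returns a string) can be sketched as follows. The names pprint_sketch and pformat_sketch are hypothetical, and the right-aligned column layout is an assumption; the library's actual output format may differ.

```python
import io
import sys

import numpy as np


def pprint_sketch(recarray, fmt=None, out=sys.stdout):
    """Write a simple right-aligned table of `recarray` to `out`."""
    fmt = fmt or {}
    cols = []
    for name in recarray.dtype.names:
        # Use the per-column format specifier when given, else '%s'.
        spec = fmt.get(name, '%s')
        cells = [spec % val for val in recarray[name].tolist()]
        width = max([len(name)] + [len(c) for c in cells])
        cols.append([name.rjust(width)] + [c.rjust(width) for c in cells])
    # Transpose the formatted columns into rows and write them out.
    for row in zip(*cols):
        out.write(' '.join(row) + '\n')


def pformat_sketch(recarray, fmt=None):
    """Return the table as a string by printing into an in-memory buffer."""
    buf = io.StringIO()
    pprint_sketch(recarray, fmt, buf)
    return buf.getvalue()


arec = np.rec.fromrecords([(1, 2.5), (10, 3.25)], names=('a', 'b'))
print(pformat_sketch(arec, fmt={'b': '%5.2f'}))
```

Writing to an io.StringIO buffer is what lets the string-returning variant reuse the printing code unchanged.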
- ska_numpy.Numpy.search_both_sorted(a, v)
Find indices where elements should be inserted to maintain order.
Find the indices into a sorted float array a such that, if the corresponding elements in float array v were inserted before the indices, the order of a would be preserved.
Similar to np.searchsorted but BOTH a and v must be sorted in ascending order. If len(v) < len(a) / 100 then the normal np.searchsorted is called. Otherwise both v and a are cast to np.float64 internally and a Cython function is called to compute the indices in a fast way.
- Parameters
a – input float array, sorted in ascending order
v – float values to insert into a, sorted in ascending order
- Returns
indices as int np.array
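When both arrays are sorted, the insertion indices can be found in a single forward pass instead of a separate binary search per value. The pure-Python sketch below illustrates that idea; it is not the library's Cython routine, search_both_sorted_sketch is a hypothetical name, and the tie handling is assumed to match the default side='left' of np.searchsorted.

```python
import numpy as np


def search_both_sorted_sketch(a, v):
    """Hypothetical single-pass search; both a and v must be ascending."""
    a = np.asarray(a, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    out = np.empty(len(v), dtype=np.int64)
    i = 0
    for j, val in enumerate(v):
        # Since v is sorted, the scan through a never has to move backwards.
        while i < len(a) and a[i] < val:
            i += 1
        out[j] = i
    return out


a = [1.0, 2.0, 2.0, 5.0]
v = [0.0, 2.0, 3.0, 6.0]
print(search_both_sorted_sketch(a, v))
print(np.searchsorted(a, v))
```

The pass costs on the order of len(a) + len(v) steps, which is consistent with the documented fallback to np.searchsorted when v is much shorter than a.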
- ska_numpy.Numpy.smooth(x, window_len=10, window='hanning')
Smooth the data using a window with requested size.
This method is based on the convolution of a scaled window with the signal. The signal is prepared by introducing reflected copies of the signal (with the window size) at both ends so that transient parts are minimized at the beginning and end of the output signal.
Example:
    import ska_numpy
    from pylab import linspace, sin, randn, plot

    t = linspace(-2, 2, 50)
    y = sin(t) + randn(len(t)) * 0.1  # noisy sine curve
    ys = ska_numpy.smooth(y)
    plot(t, y, t, ys)
See also:
numpy.hanning, numpy.hamming, numpy.bartlett, numpy.blackman, numpy.convolve, scipy.signal.lfilter
- Parameters
x – input signal
window_len – dimension of the smoothing window
window – type of window (‘flat’, ‘hanning’, ‘hamming’, ‘bartlett’, ‘blackman’)
- Return type
smoothed signal
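The windowed-convolution approach can be sketched with NumPy alone. This is an illustration rather than the library's code: smooth_sketch and WINDOWS are hypothetical names, the signal is assumed to be longer than the window, and the sketch pads by roughly half a window on each side (using np.pad with mode='reflect') so that the output has the same length as the input, which may differ from how the library prepares the signal.

```python
import numpy as np

# Hypothetical lookup from the documented window names to NumPy functions.
WINDOWS = {'flat': np.ones, 'hanning': np.hanning, 'hamming': np.hamming,
           'bartlett': np.bartlett, 'blackman': np.blackman}


def smooth_sketch(x, window_len=10, window='hanning'):
    """Smooth x by convolving a reflect-padded copy with a unit-sum window."""
    x = np.asarray(x, dtype=float)
    if window not in WINDOWS:
        raise ValueError('unknown window: %r' % window)
    if window_len < 3:
        return x  # window too short to smooth anything
    w = WINDOWS[window](window_len)
    w = w / w.sum()  # scale so that a constant signal is preserved
    # Reflect the signal at both ends to reduce edge transients.
    left = window_len // 2
    right = window_len - 1 - left
    padded = np.pad(x, (left, right), mode='reflect')
    # 'valid' keeps only fully overlapping positions: len(x) samples.
    return np.convolve(padded, w, mode='valid')


t = np.linspace(-2, 2, 50)
y = np.sin(t) + np.random.randn(len(t)) * 0.1
ys = smooth_sketch(y)
print(len(y), len(ys))
```

Dividing the window by its sum is what "scaled window" refers to in this sketch: the weights add up to one, so the smoothed curve stays at the same level as the input.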
- ska_numpy.Numpy.structured_array(vals, colnames=None)
Create a numpy structured array (ndarray) given a dict of numpy arrays. The arrays can be multidimensional but must all have the same length (same size of the first dimension).
- Parameters
vals – dict of numpy ndarrays
colnames – column names (default=sorted vals keys)
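Building a structured array from a dict of equal-length arrays can be sketched as below. The function name structured_array_sketch is hypothetical and this is not the library's implementation; it only illustrates the documented behavior that columns default to the sorted dict keys and that multidimensional arrays become fields with a sub-array shape.

```python
import numpy as np


def structured_array_sketch(vals, colnames=None):
    """Hypothetical stand-in: pack a dict of arrays into a structured array."""
    if colnames is None:
        colnames = sorted(vals)
    arrays = [np.asarray(vals[name]) for name in colnames]
    lengths = {len(arr) for arr in arrays}
    if len(lengths) != 1:
        raise ValueError('all arrays must have the same length')
    # 1-d arrays become scalar fields; extra dimensions become the
    # sub-array shape of the field.
    dtype = [(name, arr.dtype) if arr.ndim == 1
             else (name, arr.dtype, arr.shape[1:])
             for name, arr in zip(colnames, arrays)]
    out = np.empty(lengths.pop(), dtype=dtype)
    for name, arr in zip(colnames, arrays):
        out[name] = arr
    return out


vals = {'b': np.arange(3), 'a': np.ones((3, 2))}
sarr = structured_array_sketch(vals)
print(sarr.dtype.names)
print(sarr['a'].shape)
```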