pypeit.sampling module

Provides a set of functions to handle resampling.

class pypeit.sampling.Resample(y, e=None, mask=None, x=None, xRange=None, xBorders=None, inLog=False, newRange=None, newpix=None, newLog=True, newdx=None, base=10.0, ext_value=0.0, conserve=False, step=True)[source]

Bases: object

Resample regularly or irregularly sampled data to a new grid using integration.

This is a generalization of the routine ppxf_util.log_rebin from the ppxf package by Michele Cappellari.

The abscissa coordinates (x) or the pixel borders (xBorders) for the data (y) should be provided for irregularly sampled data. If the input data is linearly or geometrically sampled (inLog=True), the abscissa coordinates can be generated using the input range for the (geometric) center of each grid point. If x, xBorders, and xRange are all None, the function assumes grid coordinates of x=numpy.arange(y.shape[-1]).

The function resamples the data by constructing the borders of the output grid using the new* keywords and integrating the input function between those borders. The output data will be set to ext_value for any data beyond the abscissa limits of the input data.

The data to resample (y) can be a 1D or 2D array; the abscissa coordinates must always be 1D. If y is 2D, the resampling is performed along the last axis (i.e., axis=-1).

The nominal assumption is that the provided function is a step function based on the provided input (i.e., step=True). If the output grid is substantially finer than the input grid, the assumption of a step function will be very apparent. To assume the function is instead linearly interpolated between each provided point, choose step=False; higher-order interpolations are not provided.
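To illustrate the step-function assumption, here is a minimal pure-Python sketch of integrating a piecewise-constant function between new pixel borders. It is not the code used by Resample: the name resample_step_sketch is hypothetical, and treating any output pixel that is not fully covered by the input as extrapolated is a simplifying assumption of the sketch.

```python
def resample_step_sketch(borders, y, new_borders, ext_value=0.0):
    """Average a piecewise-constant function over each new pixel.

    borders     -- len(y) + 1 increasing borders of the input pixels
    y           -- value (in density units) within each input pixel
    new_borders -- increasing borders of the output pixels
    """
    out = []
    for lo, hi in zip(new_borders[:-1], new_borders[1:]):
        # Assumption of this sketch: partially covered pixels are extrapolated.
        if lo < borders[0] or hi > borders[-1]:
            out.append(ext_value)
            continue
        integral = 0.0
        for b0, b1, value in zip(borders[:-1], borders[1:], y):
            overlap = min(hi, b1) - max(lo, b0)
            if overlap > 0:
                integral += value * overlap
        # Divide by the new pixel width to return to density units.
        out.append(integral / (hi - lo))
    return out


coarse = resample_step_sketch([0, 1, 2, 3, 4], [1.0, 2.0, 3.0, 4.0], [0, 2, 4])
fine = resample_step_sketch([0, 1, 2, 3, 4], [1.0, 2.0, 3.0, 4.0], [0, 0.5, 1.0])
```

With a finer output grid, both output pixels fall inside the first input pixel and take its value, which is the visible "step" behavior the paragraph above warns about.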

If errors are provided, a nominal error propagation is performed to provide the errors in the resampled data.

Warning

Depending on the details of the resampling, the output errors are likely highly correlated. Any later analysis of the resampled function should account for this. A covariance calculation will be provided in the future on a best-effort basis.
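One plausible reading of "nominal error propagation" for the step-function case is a quadrature sum of the input errors weighted by their fractional overlap with each output pixel. The sketch below assumes exactly that; it is not taken from the Resample source, and the name propagate_step_errors_sketch is hypothetical. It also shows why the warning applies: adjacent output pixels that share an input pixel have correlated errors that this calculation ignores.

```python
import math


def propagate_step_errors_sketch(borders, e, new_borders):
    """Quadrature-sum the 1-sigma errors e over each new pixel."""
    out = []
    for lo, hi in zip(new_borders[:-1], new_borders[1:]):
        variance = 0.0
        for b0, b1, sigma in zip(borders[:-1], borders[1:], e):
            overlap = min(hi, b1) - max(lo, b0)
            if overlap > 0:
                # Weight is the fraction of the new pixel filled by this input pixel.
                variance += (overlap / (hi - lo) * sigma) ** 2
        out.append(math.sqrt(variance))
    return out


# Both output pixels overlap the middle input pixel, so their errors are correlated.
errors = propagate_step_errors_sketch([0, 1, 2, 3], [1.0, 1.0, 1.0], [0, 1.5, 3])
```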

The conserve keyword sets how the units of the input data should be treated. If conserve=False, the input data are expected to be in density units (i.e., per x coordinate unit) such that the integral over \(dx\) is independent of the units of \(x\) (i.e., flux per unit angstrom, or flux density). If conserve=True, the value of the data is assumed to have been integrated over the size of each pixel (i.e., units of flux). If conserve=True, \(y\) is converted to units of per step in \(x\) such that the integral before and after the resample is the same. For example, if \(y\) is a spectrum in units of flux, the function first converts the units to flux density and then computes the integral over each new pixel to produce the new spectra with units of flux.
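The conserve=True procedure described above (convert to density, then integrate over each new pixel) can be sketched in plain Python. This is an illustration under the step-function assumption, not the library implementation; resample_conserve_sketch is a hypothetical name.

```python
def resample_conserve_sketch(borders, flux, new_borders):
    """Resample integrated quantities (e.g., flux) so that their sum is conserved."""
    # Convert the integrated values to densities (per unit x).
    density = [f / (b1 - b0)
               for f, b0, b1 in zip(flux, borders[:-1], borders[1:])]
    out = []
    for lo, hi in zip(new_borders[:-1], new_borders[1:]):
        total = 0.0
        for b0, b1, d in zip(borders[:-1], borders[1:], density):
            overlap = min(hi, b1) - max(lo, b0)
            if overlap > 0:
                total += d * overlap
        # The integral over the new pixel is left as is, i.e. in flux units.
        out.append(total)
    return out


old_flux = [1.0, 4.0]                    # pixels of width 1 and 2
new_flux = resample_conserve_sketch([0, 1, 3], old_flux, [0, 1.5, 3])
```

When the new grid covers the same range as the old one, the sum of the output equals the sum of the input.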

Todo

Allow the user to provide the output pixel borders directly.
Allow for higher order interpolations.
Allow for a covariance matrix calculation.

Parameters:

y (numpy.ndarray) – Data values to resample. Can be a numpy.ma.MaskedArray, and the shape can be 1 or 2D. If 1D, the shape must be \((N_{\rm pix},)\); otherwise, it must be \((N_y,N_{\rm pix})\). I.e., the length of the last axis must match the input coordinates.
e (numpy.ndarray, optional) – Errors in the data that should be resampled. Can be a numpy.ma.MaskedArray, and the shape must match the input y array. These data are used to perform a nominal calculation of the error in the resampled array.
mask (numpy.ndarray, optional) – A boolean array (masked values are True) indicating values in y that should be ignored during the resampling. The mask used during the resampling is the union of this object and the masks of y and e, if they are provided as numpy.ma.MaskedArrays.
x (numpy.ndarray, optional) – Abscissa coordinates for the data, which do not need to be regularly sampled. If the pixel borders are not provided, they are assumed to be half-way between adjacent pixels, and the first and last borders are assumed to be equidistant about the provided value. If these coordinates are not provided, they are determined by the input borders, the input range, or just assumed to be the indices, \(0..N_{\rm pix}-1\).
xRange (array-like, optional) – A two-element array with the starting and ending value for the coordinates of the centers of the first and last pixels in y. Default is \([0,N_{\rm pix}-1]\).
xBorders (numpy.ndarray, optional) – An array with the borders of each pixel that must have a length of \(N_{\rm pix}+1\).
inLog (bool, optional) – Flag that the input is logarithmically binned, primarily meaning that the coordinates are at the geometric center of each pixel and the centers are spaced logarithmically. If false, the sampling is expected to be linear.
newRange (array-like, optional) – A two-element array with the (geometric) centers of the first and last pixel in the output vector. If not provided, assumed to be the same as the input range.
newpix (int, optional) – Number of pixels for the output vector. If not provided, assumed to be the same as the input vector.
newLog (bool, optional) – The output vector should be logarithmically binned.
newdx (float, optional) – The sampling step for the output vector. If newLog=True, this has to be the change in the logarithm of x for the output vector! If not provided, the sampling is set by the output range (see newRange above) and number of pixels (see newpix above).
base (float, optional) – The base of the logarithm used for both input and output sampling, if specified. The default is 10; use numpy.exp(1) for natural logarithm.
ext_value (float, optional) – Set extrapolated values to the provided float. By default, extrapolated values are set to 0. If set to None, values are just set to the linear extrapolation of the data beyond the provided limits; use ext_value=None with caution!
conserve (bool, optional) – Conserve the integral of the input vector. For example, if the input vector is a spectrum in flux units, you should conserve the flux in the resampling; if the spectrum is in units of flux density, you do not want to conserve the integral.
step (bool, optional) – Treat the input function as a step function during the resampling integration. If False, use a linear interpolation between pixel samples.
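The default border construction described for x above (borders half-way between adjacent pixels, with the first and last borders equidistant about the end coordinates) can be sketched as follows. The helper name borders_from_centers_sketch is hypothetical, and the sketch assumes at least two coordinates.

```python
def borders_from_centers_sketch(x):
    """Return len(x) + 1 pixel borders for (possibly irregular) centers x."""
    # Interior borders sit half-way between adjacent centers.
    mid = [(a + b) / 2 for a, b in zip(x[:-1], x[1:])]
    # End borders mirror the nearest interior border about the end coordinate.
    first = x[0] - (mid[0] - x[0])
    last = x[-1] + (x[-1] - mid[-1])
    return [first] + mid + [last]


borders = borders_from_centers_sketch([1.0, 2.0, 4.0])
```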

x

The coordinates of the function on input.

Type:: numpy.ndarray

xborders

The borders of the input pixel samples.

Type:: numpy.ndarray

y

The function to resample.

Type:: numpy.ndarray

e

The 1-sigma errors in the function to resample.

Type:: numpy.ndarray

m

The boolean mask for the input function.

Type:: numpy.ndarray

outx

The coordinates of the function on output.

Type:: numpy.ndarray

outborders

The borders of the output pixel samples.

Type:: numpy.ndarray

outy

The resampled function.

Type:: numpy.ndarray

oute

The resampled 1-sigma errors.

Type:: numpy.ndarray

outf

The fraction of each output pixel that includes valid data from the input function.

Type:: numpy.ndarray

Raises:: ValueError – Raised if y is not of type numpy.ndarray, if y is not one-dimensional, or if xRange is not provided and the input vector is logarithmically binned (see inLog above).

_input_coordinates(x, xRange, xBorders, inLog, base)[source]: Determine the centers and pixel borders of the input coordinates.

_output_coordinates(newRange, newpix, newLog, newdx, base)[source]: Set the output coordinates.

_resample_linear(v, quad=False)[source]: Resample the vectors.

_resample_step(v, quad=False)[source]: Resample the vectors.

pypeit.sampling._pixel_borders(xlim, npix, log=False, base=10.0)[source]

Determine the borders of the pixels in a vector given the first and last pixel centers and the number of pixels.

Parameters:

xlim (numpy.ndarray) – (Geometric) Centers of the first and last pixel in the vector.
npix (int) – Number of pixels in the vector.
log (bool) – (Optional) The input range is (to be) logarithmically sampled.
base (float) – (Optional) The base of the logarithmic sampling. The default is 10.0; use numpy.exp(1.) for the natural logarithm.

Returns:

A vector with the (npix+1) borders of the pixels and the sampling rate. If logarithmically binned, the sampling is the step in \(\log x\).

Return type:

numpy.ndarray, float
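For reference, a minimal sketch of how such borders can be computed from the documented inputs: each border lies half a sampling step (linear or logarithmic) to either side of a pixel center. This is an assumption-based illustration, not the source of _pixel_borders; pixel_borders_sketch is a hypothetical name and it returns a plain list rather than a numpy.ndarray.

```python
import math


def pixel_borders_sketch(xlim, npix, log=False, base=10.0):
    """Return (borders, step) for npix pixels with end centers xlim."""
    if log:
        lo = math.log(xlim[0]) / math.log(base)
        hi = math.log(xlim[1]) / math.log(base)
        dlogx = (hi - lo) / (npix - 1)
        # Borders are offset by half a step in log(x) from each center.
        return [base ** (lo + (i - 0.5) * dlogx) for i in range(npix + 1)], dlogx
    dx = (xlim[1] - xlim[0]) / (npix - 1)
    return [xlim[0] + (i - 0.5) * dx for i in range(npix + 1)], dx


lin_borders, lin_step = pixel_borders_sketch([0.0, 4.0], 5)
log_borders, log_step = pixel_borders_sketch([1.0, 100.0], 3, log=True)
```

In the logarithmic case the first center is the geometric mean of the first two borders, which is what "(geometric) center" refers to.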

pypeit.sampling._pixel_centers(xlim, npix, log=False, base=10.0)[source]

Determine the centers of pixels in a linearly or geometrically sampled vector given the first and last pixel centers and the number of pixels.

Parameters:

xlim (numpy.ndarray) – (Geometric) Centers of the first and last pixel in the vector.
npix (int) – Number of pixels in the vector.
log (bool) – (Optional) The input range is (to be) logarithmically sampled.
base (float) – (Optional) The base of the logarithmic sampling. The default is 10.0; use numpy.exp(1.) for the natural logarithm.

Returns:

A vector with the npix centers of the pixels and the sampling rate. If logarithmically binned, the sampling is the step in \(\log x\).

Return type:

numpy.ndarray, float
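A comparable sketch for the centers, again as an assumption-based illustration rather than the source of _pixel_centers (pixel_centers_sketch is a hypothetical name, and a list is returned instead of a numpy.ndarray):

```python
import math


def pixel_centers_sketch(xlim, npix, log=False, base=10.0):
    """Return (centers, step) for npix pixels with end centers xlim."""
    if log:
        lo = math.log(xlim[0]) / math.log(base)
        hi = math.log(xlim[1]) / math.log(base)
        dlogx = (hi - lo) / (npix - 1)
        # Centers are evenly spaced in log(x), i.e. geometrically spaced in x.
        return [base ** (lo + i * dlogx) for i in range(npix)], dlogx
    dx = (xlim[1] - xlim[0]) / (npix - 1)
    return [xlim[0] + i * dx for i in range(npix)], dx


centers, step = pixel_centers_sketch([1.0, 100.0], 3, log=True)
```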

pypeit.sampling.angstroms_per_pixel(wave, log=False, base=10.0, regular=True)[source]

Return a vector with the angstroms per pixel at each channel.

When regular=True, the function assumes that the wavelengths are either sampled linearly or geometrically. Otherwise, it calculates the size of each pixel as the difference between the wavelength coordinates. The first and last pixels are assumed to have a width as determined by assuming the coordinate is at its center.

Parameters:

wave (numpy.ndarray) – (Geometric) centers of the spectrum pixels in angstroms.
log (bool, optional) – The vector is geometrically sampled.
base (float, optional) – Base of the logarithm used in the geometric sampling.
regular (bool, optional) – The vector is regularly sampled.

Returns:

The angstroms per pixel.

Return type:

numpy.ndarray
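The two modes can be sketched as below. For regular geometric sampling the sketch uses the first-order relation \(d\lambda = \lambda \ln(b)\, d\log_b\lambda\); for irregular sampling it differences borders placed half-way between the coordinates. Both choices are assumptions made for illustration, and angstroms_per_pixel_sketch is a hypothetical name.

```python
import math


def angstroms_per_pixel_sketch(wave, log=False, base=10.0, regular=True):
    """Return the width in angstroms of each pixel centered at wave."""
    n = len(wave)
    if regular:
        if log:
            dlogw = (math.log(wave[-1]) - math.log(wave[0])) / math.log(base) / (n - 1)
            return [w * math.log(base) * dlogw for w in wave]
        dw = (wave[-1] - wave[0]) / (n - 1)
        return [dw] * n
    # Irregular sampling: borders half-way between adjacent coordinates, with
    # the end pixels assumed to be centered on their coordinates.
    mid = [(a + b) / 2 for a, b in zip(wave[:-1], wave[1:])]
    borders = [(3 * wave[0] - wave[1]) / 2] + mid + [(3 * wave[-1] - wave[-2]) / 2]
    return [b1 - b0 for b0, b1 in zip(borders[:-1], borders[1:])]


linear = angstroms_per_pixel_sketch([4000.0, 4001.0, 4002.0])
irregular = angstroms_per_pixel_sketch([1.0, 2.0, 4.0], regular=False)
```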

pypeit.sampling.rectify_image(img, col, bpm=None, ocol=None, max_ocol=None, extract_width=None, mask_threshold=0.5)[source]

Rectify the image by shuffling flux along columns using the provided column mapping.

The image rectification is one dimensional, treating each image row independently. It can be done either by a direct resampling of the image columns using the provided mapping of output to input column location (see col and Resample) or by an extraction along the provided column locations (see extract_width). The latter is generally faster; however, when resampling each row, the flux is explicitly conserved (see the conserve argument of Resample).

Parameters:

img (numpy.ndarray) – The 2D image to rectify. Shape is \((N_{\rm row}, N_{\rm col})\).
col (numpy.ndarray) – The array mapping each output column to its location in the input image. That is, e.g., col[:,0] provides the column coordinate in img that should be rectified to column 0 in the output image. Shape is \((N_{\rm row}, N_{\rm map})\).
bpm (numpy.ndarray, optional) – Boolean bad-pixel mask for pixels to ignore in input image. If None, no pixels are masked in the rectification. If provided, shape must match img.
ocol (numpy.ndarray, optional) –
The column in the output image for each column in col. If None, assume:
```
ocol = numpy.arange(col.shape[1])
```
These coordinates can fall off the output image (i.e., \(<0\) or \(\geq N_{\rm out,col}\)), but those columns are removed from the output.
max_ocol (int, optional) – The last viable column index to include in the output image; i.e., for an image with ncol columns, this should be ncol-1. If None, assume max(ocol).
extract_width (float, optional) – The width of the extraction aperture to use for the image rectification. If None, the image rectification is performed using Resample along each row.
mask_threshold (float, optional) – Either due to bpm or the bounds of the provided img, pixels in the rectified image may not be fully covered by valid pixels in img. Pixels in the output image with less than this fractional coverage of an input pixel are flagged in the output.

Returns:

Two numpy.ndarray objects are returned, both with shape (nrow,max_ocol+1): the rectified image and its boolean bad-pixel mask.

pypeit.sampling.resample_vector_npix(outRange=None, dx=None, log=False, base=10.0, default=None)[source]

Determine the number of pixels needed to resample a vector given the first and last pixel centers and dx.

Parameters:

outRange (list, numpy.ndarray) – Two-element array with the starting and ending x coordinate of the pixel centers to divide into pixels of a given width. If log is True, this must still be the linear value of the x coordinate, not log(x)!
dx (float) – Linear or logarithmic pixel width.
log (bool) – Flag that the range should be logarithmically binned.
base (float) – Base for the logarithm.
default (int) – Default number of pixels to use. The default is returned if either outRange or dx are not provided.

Returns:

Returns two objects: The number of pixels to cover outRange with pixels of width dx and the adjusted range such that number of pixels of size dx is the exact integer.

Return type:

tuple

Raises:

ValueError – Raised if the range is not a two-element vector.
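The documented behavior (count the whole steps of size dx that fit in the range, then pull the end of the range in so the count is exact) can be sketched as follows. The truncation toward zero and the returned list type are assumptions of this sketch, and resample_vector_npix_sketch is a hypothetical name.

```python
import math


def resample_vector_npix_sketch(outRange=None, dx=None, log=False, base=10.0,
                                default=None):
    """Return (npix, adjusted_range) for pixels of width dx spanning outRange."""
    if outRange is None or dx is None:
        return default, outRange
    if len(outRange) != 2:
        raise ValueError('Output range must be a two-element vector.')
    start, end = outRange
    if log:
        # The range is given in linear units; the step is in log(x).
        span = (math.log(end) - math.log(start)) / math.log(base)
        npix = int(span / dx) + 1
        new_end = base ** (math.log(start) / math.log(base) + dx * (npix - 1))
    else:
        npix = int((end - start) / dx) + 1
        new_end = start + dx * (npix - 1)
    return npix, [start, new_end]


npix, new_range = resample_vector_npix_sketch([0.0, 10.0], dx=3.0)
```

Because the count comes from truncating a floating-point ratio, a range that is nominally an exact multiple of dx can lose a pixel to round-off, particularly in the logarithmic case.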

pypeit.sampling.spectral_coordinate_step(wave, log=False, base=10.0)[source]

Return the sampling step for the input wavelength vector.

If the sampling is logarithmic, return the change in the logarithm of the wavelength; otherwise, return the linear step in angstroms.

Parameters:

wave (numpy.ndarray) – Wavelength coordinates of each spectral channel in angstroms.
log (bool) – (Optional) Input spectrum has been sampled geometrically.
base (float) – (Optional) If sampled geometrically, the sampling is done using a logarithm with this base. For natural logarithm, use numpy.exp(1).

Returns:

Spectral sampling step in either angstroms (log=False) or the step in log(angstroms).

Return type:

float
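A minimal sketch of the quantity being returned, assuming a regularly sampled vector. Averaging the step over the full vector is a choice made for this illustration and may differ from which pixels the library uses; spectral_coordinate_step_sketch is a hypothetical name.

```python
import math


def spectral_coordinate_step_sketch(wave, log=False, base=10.0):
    """Return the mean sampling step of wave, in angstroms or in log(angstroms)."""
    n = len(wave)
    if log:
        return (math.log(wave[-1]) - math.log(wave[0])) / math.log(base) / (n - 1)
    return (wave[-1] - wave[0]) / (n - 1)


linear_step = spectral_coordinate_step_sketch([4000.0, 4002.0, 4004.0])
log_step = spectral_coordinate_step_sketch([1.0, 10.0, 100.0], log=True)
```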

pypeit.sampling.spectrum_velocity_scale(wave)[source]

Determine the velocity sampling of an input wavelength vector when log sampled.

Note

The wavelength vector is assumed to be geometrically sampled! However, the input units are expected to be in angstroms, not, e.g., log(angstrom).

Parameters:: wave (numpy.ndarray) – Wavelength coordinates of each spectral channel in angstroms. It is expected that the spectrum has been sampled geometrically.
Returns:: Velocity scale of the spectrum in km/s.
Return type:: float
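For a geometrically sampled spectrum, the velocity width of a pixel is constant and equal to \(c\,\Delta\ln\lambda\). The sketch below computes that quantity directly; it is an illustration rather than the library code, spectrum_velocity_scale_sketch is a hypothetical name, and the speed of light is hard-coded in km/s.

```python
import math

C_KMS = 299792.458  # speed of light in km/s


def spectrum_velocity_scale_sketch(wave):
    """Return the velocity step (km/s) per pixel of a log-sampled spectrum."""
    # Step in the natural logarithm of the wavelength, averaged over the vector.
    dlnw = (math.log(wave[-1]) - math.log(wave[0])) / (len(wave) - 1)
    return C_KMS * dlnw


# Wavelengths in angstroms, sampled with a step of 1e-4 in ln(wavelength).
wave = [4000.0 * math.exp(1e-4 * i) for i in range(10)]
velscale = spectrum_velocity_scale_sketch(wave)
```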