pypeit.tracepca module

Implements a general-purpose object used to decompose and predict traces using principal-component analysis.

class pypeit.tracepca.TracePCA(trace_cen=None, npca=None, pca_explained_var=99.0, reference_row=None, coo=None)

Bases: DataContainer

Class to build and interact with a PCA model of traces.

This is primarily a container class for the results of pca_decomposition(), fit_pca_coefficients(), and pca_predict().

The datamodel attributes are:

Version: 1.1.0

| Attribute | Type | Array Type | Description |
| --- | --- | --- | --- |
| input_npca | int | | Requested number of PCA components if provided. |
| input_pcav | float, numpy.floating | | Requested variance accounted for by PCA decomposition. |
| npca | int | | Number of PCA components used. |
| nspec | int | | Number of pixels in the image spectral direction. |
| ntrace | int | | Number of traces used to construct the PCA. |
| pca_coeffs | numpy.ndarray | float, numpy.floating | PCA component coefficients. If the PCA decomposition used \(N_{\rm comp}\) components for \(N_{\rm vec}\) vectors, the shape of this array must be \((N_{\rm vec}, N_{\rm comp})\). The array can be 1D with shape \((N_{\rm vec},)\) if there was only one PCA component. |
| pca_coeffs_model | numpy.ndarray | PypeItFit | An array of PypeItFit objects, one per PCA component, that models the trend of the PCA component coefficients with the reference coordinate of each vector. These models are used by predict() to model the expected coefficients at a new reference coordinate. |
| pca_components | numpy.ndarray | float, numpy.floating | Vectors with the PCA components. Shape must be \((N_{\rm comp}, N_{\rm spec})\). |
| pca_mean | numpy.ndarray | float, numpy.floating | The mean offset of the PCA decomposition for each spectral pixel. Shape is \((N_{\rm spec},)\). |
| reference_row | int, numpy.integer | | The row (spectral position) used as the reference coordinate system for the PCA. |
| trace_coo | numpy.ndarray | float, numpy.floating | Trace coordinates. Shape must be \((N_{\rm spec},N_{\rm trace})\). |

Parameters:
  • trace_cen (numpy.ndarray, optional) – A floating-point array with the spatial location of each trace. Shape is \((N_{\rm spec}, N_{\rm trace})\). If None, the object is “empty” and all of the other keyword arguments are ignored.

  • npca (int, optional) – The number of PCA components to keep. See pca_decomposition().

  • pca_explained_var (float, optional) – The percentage (i.e., not the fraction) of the variance in the data accounted for by the PCA used to truncate the number of PCA coefficients to keep (see npca). Ignored if npca is provided directly. See pca_decomposition().

  • reference_row (int, optional) – The row (spectral position) in trace_cen to use as the reference coordinate system for the PCA. If None, set to \(N_{\rm spec}/2\).

  • coo (numpy.ndarray, optional) – Floating-point array with the reference coordinates for each trace. If provided, the shape must be \((N_{\rm trace},)\). If None, the reference coordinate system is defined by the value of trace_cen at the spectral position defined by reference_row. See the mean argument of pca_decomposition().
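As a rough illustration of how a percentage threshold such as pca_explained_var can be converted into a number of retained components, the sketch below uses plain NumPy. It is not PypeIt's pca_decomposition(); the function name n_components_for_variance and the synthetic vectors are assumptions made only for this example.

```python
import numpy as np

def n_components_for_variance(vectors, explained_var_percent=99.0):
    """Hypothetical helper: smallest number of PCA components whose
    cumulative explained variance reaches the requested percentage.

    vectors has shape (n_vec, n_pix); each row is one trace vector.
    """
    # Center each pixel (column) on its mean over all vectors.
    centered = vectors - vectors.mean(axis=0)
    # Squared singular values are proportional to the variance per component.
    s = np.linalg.svd(centered, compute_uv=False)
    cumulative = 100.0 * np.cumsum(s**2) / np.sum(s**2)
    # First index at which the cumulative percentage reaches the threshold.
    n = int(np.searchsorted(cumulative, explained_var_percent)) + 1
    # Guard against round-off when the threshold is at or near 100.
    return min(n, cumulative.size)

# Synthetic data (assumption): one dominant mode plus one much weaker mode.
t = np.linspace(-1.0, 1.0, 50)
a = np.linspace(-1.0, 1.0, 8)
b = a**2 - np.mean(a**2)
vectors = 10.0 * np.outer(a, t) + 0.1 * np.outer(b, t**2)

n_loose = n_components_for_variance(vectors, 99.0)
n_strict = n_components_for_variance(vectors, 99.9999)
```

Here the looser threshold keeps only the dominant mode, while the stricter one also keeps the weak mode. Note that the threshold is a percentage (99.0), not a fraction (0.99).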

_bundle(ext='PCA')

Bundle the data for writing.

classmethod _parse(hdu, hdu_prefix=None, **kwargs)

Parse the data from the provided HDU.

See the base-class DataContainer._parse() for the argument descriptions.

_reinit()

Erase and/or define all the attributes of the class.

build_interpolator(order, ivar=None, weights=None, function='polynomial', lower=3.0, upper=3.0, maxrej=1, maxiter=25, minx=None, maxx=None, debug=False)

Wrapper for fit_pca_coefficients() that uses class attributes and saves the input parameters.

datamodel = {
    'input_npca': {'descr': 'Requested number of PCA components if provided.', 'otype': <class 'int'>},
    'input_pcav': {'descr': 'Requested variance accounted for by PCA decomposition.', 'otype': (<class 'float'>, <class 'numpy.floating'>)},
    'npca': {'descr': 'Number of PCA components used.', 'otype': <class 'int'>},
    'nspec': {'descr': 'Number of pixels in the image spectral direction.', 'otype': <class 'int'>},
    'ntrace': {'descr': 'Number of traces used to construct the PCA.', 'otype': <class 'int'>},
    'pca_coeffs': {'atype': (<class 'float'>, <class 'numpy.floating'>), 'descr': 'PCA component coefficients. If the PCA decomposition used :math:`N_{\\rm comp}` components for :math:`N_{\\rm vec}` vectors, the shape of this array must be :math:`(N_{\\rm vec}, N_{\\rm comp})`. The array can be 1D with shape :math:`(N_{\\rm vec},)` if there was only one PCA component.', 'otype': <class 'numpy.ndarray'>},
    'pca_coeffs_model': {'atype': <class 'pypeit.core.fitting.PypeItFit'>, 'descr': 'An array of PypeItFit objects, one per PCA component, that models the trend of the PCA component coefficients with the reference coordinate of each vector. These models are used by :func:`predict` to model the expected coefficients at a new reference coordinate.', 'otype': <class 'numpy.ndarray'>},
    'pca_components': {'atype': (<class 'float'>, <class 'numpy.floating'>), 'descr': 'Vectors with the PCA components. Shape must be :math:`(N_{\\rm comp}, N_{\\rm spec})`.', 'otype': <class 'numpy.ndarray'>},
    'pca_mean': {'atype': (<class 'float'>, <class 'numpy.floating'>), 'descr': 'The mean offset of the PCA decomposition for each spectral pixel. Shape is :math:`(N_{\\rm spec},)`.', 'otype': <class 'numpy.ndarray'>},
    'reference_row': {'descr': 'The row (spectral position) used as the reference coordinate system for the PCA.', 'otype': (<class 'int'>, <class 'numpy.integer'>)},
    'trace_coo': {'atype': (<class 'float'>, <class 'numpy.floating'>), 'descr': 'Trace coordinates. Shape must be :math:`(N_{\\rm spec},N_{\\rm trace})`.', 'otype': <class 'numpy.ndarray'>},
}

Object datamodel.

decompose(trace_cen, npca=None, pca_explained_var=99.0, reference_row=None, coo=None)

Construct the PCA from scratch.

Parameters:
  • trace_cen (numpy.ndarray) – A floating-point array with the spatial location of each trace. Shape is \((N_{\rm spec}, N_{\rm trace})\). Cannot be None.

  • npca (int, optional) – The number of PCA components to keep. See pca_decomposition().

  • pca_explained_var (float, optional) – The percentage (i.e., not the fraction) of the variance in the data accounted for by the PCA used to truncate the number of PCA coefficients to keep (see npca). Ignored if npca is provided directly. See pca_decomposition().

  • reference_row (int, optional) – The row (spectral position) in trace_cen to use as the reference coordinate system for the PCA. If None, set to \(N_{\rm spec}/2\).

  • coo (numpy.ndarray, optional) – Floating-point array with the reference coordinates for each trace. If provided, the shape must be \((N_{\rm trace},)\). If None, the reference coordinate system is defined by the value of trace_cen at the spectral position defined by reference_row. See the mean argument of pca_decomposition().
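The roles of reference_row and coo can be sketched with plain NumPy as follows. This is a minimal illustration under stated assumptions, not the PypeIt implementation: the function decompose_traces is hypothetical, the PCA is done with a direct SVD, and the default reference row is taken as the integer midpoint of the spectral axis.

```python
import numpy as np

def decompose_traces(trace_cen, npca, reference_row=None, coo=None):
    """Hypothetical sketch of a trace PCA; trace_cen is (nspec, ntrace)."""
    nspec, ntrace = trace_cen.shape
    if reference_row is None:
        # Assumption: integer midpoint of the spectral axis.
        reference_row = nspec // 2
    if coo is None:
        # Reference coordinates default to the trace positions at that row.
        coo = trace_cen[reference_row, :]
    # Express each trace relative to its reference coordinate; one row per trace.
    vectors = (trace_cen - coo[None, :]).T                   # (ntrace, nspec)
    pca_mean = vectors.mean(axis=0)                          # (nspec,)
    # Rows of vt are the principal directions of the mean-subtracted vectors.
    _, _, vt = np.linalg.svd(vectors - pca_mean, full_matrices=False)
    pca_components = vt[:npca]                               # (npca, nspec)
    pca_coeffs = (vectors - pca_mean) @ pca_components.T     # (ntrace, npca)
    return coo, pca_mean, pca_components, pca_coeffs

# Synthetic traces (assumption): curvature grows linearly with spatial position.
spec = np.arange(100)
x0 = np.linspace(100.0, 900.0, 9)
curve = ((spec - 50) / 50.0) ** 2
trace_cen = x0[None, :] + 0.01 * x0[None, :] * curve[:, None]

coo, pca_mean, pca_components, pca_coeffs = decompose_traces(trace_cen, npca=1)
# Rebuild the traces from the decomposition and add the reference coordinates back.
rebuilt = (pca_coeffs @ pca_components + pca_mean).T + coo[None, :]
```

The array shapes follow the datamodel conventions listed for the class: pca_components is \((N_{\rm comp}, N_{\rm spec})\), pca_coeffs is \((N_{\rm vec}, N_{\rm comp})\), and pca_mean is \((N_{\rm spec},)\).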

classmethod from_dict(d=None)

Instantiate from a dictionary.

This is a basic wrapper for the base-class from_dict() that appropriately toggles is_empty.

internals = ['is_empty']

A list of strings identifying a set of internal attributes that are not part of the datamodel.

predict(x)

Predict one or more traces given the functional forms for the PCA coefficients.

Warning

The PCA coefficients must first be modeled by a function (see build_interpolator()) before using this method. An error will be raised if pca_coeffs_model is not defined.

Parameters:

x (float, numpy.ndarray) – One or more spatial coordinates (at the PCA reference row) at which to sample the PCA coefficients and produce the PCA model for the trace spatial position as a function of spectral pixel.

Returns:

The array with the predicted spatial locations of the trace. If the provided coordinate is a single value, the returned shape is \((N_{\rm spec},)\); otherwise it is \((N_{\rm spec}, N_{\rm x})\).

Return type:

numpy.ndarray
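The prediction step can be sketched as follows: evaluate each coefficient model at the new reference coordinate, combine the coefficients with the PCA components and mean, and add the reference coordinate back. This is an illustrative NumPy sketch, not the PypeIt code; predict_traces is a hypothetical function, and plain polynomial coefficients (np.polyval ordering) stand in for the PypeItFit models.

```python
import numpy as np

def predict_traces(x, pca_mean, pca_components, coeff_polys):
    """Hypothetical sketch: predict trace positions at reference coordinates x.

    coeff_polys holds one polynomial coefficient array per PCA component,
    ordered as expected by np.polyval (highest power first).
    """
    scalar_input = np.ndim(x) == 0
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Model coefficients at the requested coordinates: shape (n_x, npca).
    coeffs = np.column_stack([np.polyval(p, x) for p in coeff_polys])
    # Offsets relative to the reference coordinate, then add it back: (nspec, n_x).
    model = (coeffs @ pca_components + pca_mean[None, :]).T + x[None, :]
    return model[:, 0] if scalar_input else model

# Toy model (assumption): one component whose coefficient is 0.5 * x.
pca_mean = np.array([1.0, 0.0, -1.0, 0.0, 1.0])
pca_components = np.arange(5, dtype=float)[None, :]
coeff_polys = [np.array([0.5, 0.0])]

single = predict_traces(2.0, pca_mean, pca_components, coeff_polys)
multi = predict_traces(np.array([1.0, 2.0, 3.0]), pca_mean, pca_components, coeff_polys)
```

A scalar input gives a 1D result with one value per spectral pixel; an array input gives one column per requested coordinate.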

version = '1.1.0'

Datamodel version.

pypeit.tracepca.pca_trace_object(trace_cen, order=None, trace_bpm=None, min_length=0.6, npca=None, pca_explained_var=99.0, reference_row=None, coo=None, minx=None, maxx=None, trace_wgt=None, function='polynomial', lower=3.0, upper=3.0, maxrej=1, maxiter=25, debug=False)

Decompose and reconstruct the provided traces using principal-component analysis.

Parameters:
  • trace_cen (numpy.ndarray) – A floating-point array with the spatial location of each trace. Shape is \((N_{\rm spec}, N_{\rm trace})\).

  • order (int, list, optional) –

    The order of the polynomial used to fit each PCA coefficient as a function of trace position. If None, order is set to \(3.3 N_{\rm use}/N_{\rm trace}\), where \(N_{\rm use}\) is the number of traces used to construct the PCA and \(N_{\rm trace}\) is the number of total traces provided. If order is an integer (including the value determined automatically when the argument is None), the order per PCA component (see npca) cascades from high to low order as follows:

    _order = np.clip(order - np.arange(npca), 1, None).astype(int)
    

  • trace_bpm (numpy.ndarray, optional) – Bad-pixel mask for the trace data (True is bad; False is good). Must match the shape of trace_cen.

  • min_length (float, optional) – The good (unmasked) positions of a trace must cover at least this fraction of the spectral dimension for the trace to be used in the PCA decomposition.

  • npca (int, optional) – The number of PCA components to keep. See pca_decomposition().

  • pca_explained_var (float, optional) – The percentage (i.e., not the fraction) of the variance in the data accounted for by the PCA used to truncate the number of PCA coefficients to keep (see npca). Ignored if npca is provided directly. See pca_decomposition().

  • reference_row (int, optional) – The row (spectral position) in trace_cen to use as the reference coordinate system for the PCA. If None, set to \(N_{\rm spec}/2\) or to the spectral position that crosses the largest number of valid trace positions.

  • coo (numpy.ndarray, optional) – Floating-point array with the reference coordinates to use for each trace. If None, coordinates are defined at the reference row of trace_cen. Shape must be \((N_{\rm trace},)\).

  • minx (float, optional) – Minimum value used to rescale the independent axis data. If None, the minimum value of coo is used. See robust_fit().

  • maxx (float, optional) – Maximum value used to rescale the independent axis data. If None, the maximum value of coo is used. See robust_fit().

  • trace_wgt (numpy.ndarray, optional) – Weights to apply to the PCA coefficient of each trace during the fit. Weights are independent of the PCA component. See weights parameter of fit_pca_coefficients(). Shape must be \((N_{\rm trace},)\).

  • function (str, optional) – Type of function used to fit the data.

  • lower (float, optional) – Number of standard deviations used for rejecting data below the mean residual. If None, no rejection is performed. See robust_fit().

  • upper (float, optional) – Number of standard deviations used for rejecting data above the mean residual. If None, no rejection is performed. See robust_fit().

  • maxrej (int, optional) – Maximum number of points to reject during fit iterations. See robust_fit().

  • maxiter (int, optional) – Maximum number of rejection iterations allowed. To force no rejection iterations, set to 0.

  • debug (bool, optional) – Show plots useful for debugging.