pypeit.metadata module

Provides a class that handles the fits metadata required by PypeIt.

class pypeit.metadata.PypeItMetaData(spectrograph, par, files=None, data=None, usrdata=None, strict=True)[source]

Bases: object

Provides a table and interface to the relevant fits file metadata used during the reduction.

The content of the fits table is dictated by the header keywords specified for the provided spectrograph. It is expected that this table can be used to set the frame type of each file.

The metadata is validated using checks specified by the provided spectrograph class.

For the data table, one should typically provide either the file list from which to grab the data from the fits headers or the data directly. If neither is provided, the table is instantiated without any data.

Parameters:

spectrograph (pypeit.spectrographs.spectrograph.Spectrograph) – The spectrograph used to collect the data saved to each file. The class is used to provide the header keyword data to include in the table and specify any validation checks.
par (pypeit.par.pypeitpar.PypeItPar) – PypeIt parameters used to set the code behavior.
files (str, list, optional) – The list of files to include in the table.
data (table-like, optional) – The data to include in the table. The type can be anything allowed by the instantiation of astropy.table.Table.
usrdata (astropy.table.Table, optional) – A user provided set of data used to supplement or overwrite metadata read from the file headers. The table must have a filename column that is used to match to the metadata table generated within PypeIt. Note: This is ignored if data is also provided. This functionality is only used when building the metadata from the fits files.
strict (bool, optional) – Function will fault if there is a problem reading the header for any of the provided files; see get_headarr(). Set to False to instead report a warning and continue.

spectrograph: (Spectrograph): The spectrograph used to collect the data save to each file. The class is used to provide the header keyword data to include in the table and specify any validation checks.

par

PypeIt parameters used to set the code behavior. If not provided, the default parameters specific to the provided spectrograph are used.

Type:: PypeItPar

configs

A dictionary of the unique configurations identified.

Type:: dict

type_bitmask

The bitmask used to set the frame type of each fits file.

Type:: FrameTypeBitMask

calib_bitmask

The bitmask used to keep track of the calibration group bits.

Type:: BitMask

table

The table with the relevant metadata for each fits file to use in the data reduction.

Type:: astropy.table.Table

_build(files, strict=True, usrdata=None)[source]

Generate the fitstbl that will be at the heart of PypeItMetaData.

Parameters:

files (str, list) – One or more files to use to build the table.
strict (bool, optional) – Function will fault if astropy.io.fits.getheader fails to read any of the headers. Set to False to report a warning and continue.
usrdata (astropy.table.Table, optional) – Parsed for frametype for a few instruments (e.g., VLT) where metadata may not be required.

Returns:

Dictionary with the data to assign to table.

Return type:

dict

_check_calib_groups()[source]

Check that the calibration groups are valid.

This currently only checks that the science frames are associated with one calibration group.

TODO: Is this appropriate for NIR data?

_get_cfgs(copy=False, rm_none=False)[source]

Convenience method to return configs with possible alterations.

This method should not be called by any method outside of this class; use unique_configurations() instead.

Parameters:

copy (bool, optional) – Return a deep copy of configs instead of the object itself.
rm_none (bool, optional) – Remove any configurations set to ‘None’. If copy is True, this is done after configs is copied to a new dictionary.

Returns:

A nested dictionary, one dictionary per configuration with the associated metadata for each.

Return type:

dict

_impose_types(columns, types)[source]

Impose a set of types on certain columns.

Note

table is edited in place.

Parameters:

columns (list) – List of column names
types (list) – List of types

_repr_html_()[source]

_set_calib_group_bits()[source]: Set the calibration group bit based on the string values of the ‘calib’ column.
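
As a rough illustration of turning string group labels into bits, the sketch below converts 'calib' strings into integers with one bit per calibration group. It does not use PypeIt itself: the helper name calib_string_to_bits, the comma-separated format, and the treatment of 'all' are assumptions made for this example.

```python
def calib_string_to_bits(calib, n_groups):
    """Convert a 'calib' string into an integer with one bit per group."""
    if calib == 'all':
        # Assumption: 'all' places the frame in every calibration group.
        groups = range(n_groups)
    else:
        # Assumption: multiple groups are given as comma-separated integers.
        groups = [int(g) for g in calib.split(',')]
    bits = 0
    for grp in groups:
        bits |= 1 << grp  # turn on the bit for this group
    return bits


calib_column = ['0', '1', '0,1', 'all']
calibbit = [calib_string_to_bits(c, n_groups=2) for c in calib_column]
print(calibbit)
```

Storing the groups as bits lets a single integer record membership in several calibration groups at once.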

_vet_instrument(meta_tbl)[source]

Confirm the metadata gathered for a set of measurements are all from this spectrograph.

This function only issues warnings; no exceptions are raised.

Parameters:: meta_tbl (astropy.table.Table) – Table with the meta data; see PypeItMetaData.

property calib_groups: Return the calibration group identifiers.

clean_configurations()[source]

Ensure that configuration-defining keywords all have values that will yield good PypeIt reductions. Any frames that do not are removed from table, meaning this method may modify that attribute directly.

The valid values for configuration keys are set by valid_configuration_values().

static configuration_generator(start=0)[source]

construct_basename(row, obstime=None)[source]

Construct the root name primarily for PypeIt file output.

Parameters:

row (int) – The 0-indexed row of the frame.
obstime (astropy.time.Time, optional) – The MJD of the observation. If None, constructed using construct_obstime().

Returns:

The root name for file output.

Return type:

str

construct_obstime(row)[source]

Construct the MJD of when the frame was observed.

Parameters:: row (int) – The 0-indexed row of the frame.
Returns:: The MJD of the observation.
Return type:: astropy.time.Time

static default_keys()[source]

edit_frame_type(indx, frame_type, append=False)[source]

Edit the frame type by hand.

Parameters:

indx (int) – The 0-indexed row in the table to edit
frame_type (str, list) – One or more frame types to append/overwrite.
append (bool, optional) – Append the frame type. If False, all existing frame types are overwritten by the provided type.
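
The difference between appending and overwriting can be sketched with a plain integer bitmask. This is only an illustration: the FRAME_BITS values and the edit_type helper are hypothetical and are not the bit assignments used by FrameTypeBitMask.

```python
# Hypothetical bit assignments, for illustration only.
FRAME_BITS = {'arc': 1, 'bias': 2, 'science': 4, 'trace': 8}


def edit_type(current_bits, frame_type, append=False):
    """Return the new type bits for one row."""
    if isinstance(frame_type, str):
        frame_type = [frame_type]
    new_bits = 0
    for name in frame_type:
        new_bits |= FRAME_BITS[name]
    # Appending keeps the existing bits; otherwise they are replaced.
    return (current_bits | new_bits) if append else new_bits


print(edit_type(FRAME_BITS['arc'], 'trace', append=True))
print(edit_type(FRAME_BITS['arc'], ['bias']))
```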

finalize_usr_build(frametype, setup)[source]

Finalize the build of the table based on user-provided data, typically pulled from the PypeIt file.

This function:

sets the frame types based on the provided object
sets all the configurations to the provided setup
assigns all frames to a single calibration group, if the ‘calib’ column does not exist
if the ‘comb_id’ column does not exist, this sets the combination groups to be either undefined or to be unique for each science or standard frame, see set_combination_groups().

Note

This should only be run if all files are from a single instrument configuration. table is modified in-place.

Todo

Why isn’t frametype just in the user-provided data? It may be (see get_frame_types) and I’m just not using it…

Parameters:

frametype (dict) – A dictionary with the types designated by the user. The file name and type are expected to be the key and value of the dictionary, respectively. The number of keys therefore must match the number of files in table. For frames that have multiple types, the types should be provided as a string with comma-separated types.
setup (str) – If the ‘setup’ column does not exist, fill the configuration setup columns with this single identifier.
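
A minimal sketch of the expected shape of the frametype dictionary follows. The file names are hypothetical and the type names are only examples; the check that every file has an entry mirrors the requirement stated above but is not PypeIt's own validation code.

```python
# Hypothetical file names; types for multi-type frames are comma-separated.
frametype = {
    'b1.fits': 'bias',
    'f1.fits': 'pixelflat,trace',
    's1.fits': 'science',
}
table_files = ['b1.fits', 'f1.fits', 's1.fits']

# The number of keys must match the number of files in the table.
missing = [f for f in table_files if f not in frametype]
if missing or len(frametype) != len(table_files):
    raise ValueError(f'Frame types do not match the table files: {missing}')

# Split the comma-separated strings into individual type names.
types_per_file = {name: types.split(',') for name, types in frametype.items()}
print(types_per_file['f1.fits'])
```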

find_calib_group(grp)[source]

Find all the frames associated with the provided calibration group.

Parameters:: grp (int) – The calibration group integer.
Returns:: Boolean array selecting those frames in the table included in the selected calibration group.
Return type:: numpy.ndarray
Raises:: PypeItError – Raised if the ‘calibbit’ column is not defined.
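
The selection can be pictured as a bit test on each row. The sketch below uses a plain list in place of the ‘calibbit’ column, and the helper name select_calib_group is hypothetical; the real method returns a numpy.ndarray.

```python
def select_calib_group(calibbit, grp):
    """Return one boolean per row: True if the bit for group grp is set."""
    return [(bits & (1 << grp)) != 0 for bits in calibbit]


# Rows belonging to group 0, group 1, both groups, and no group.
calibbit = [1, 2, 3, 0]
in_group_1 = select_calib_group(calibbit, 1)
print(in_group_1)
```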

find_configuration(setup, index=False)[source]

Find all frames associated with the provided setup/configuration.

Parameters:

setup (str) – The setup/configuration to search on.
index (bool, optional) – Return an array of 0-indexed indices instead of a boolean array.

Returns:

A boolean array, or an integer array if index=True, with the table rows associated with the requested setup/configuration.

Return type:

numpy.ndarray
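
The two return styles selected by index can be sketched with plain lists. The helper find_rows and the sample setup column are hypothetical, and the real method returns numpy arrays rather than lists.

```python
def find_rows(setups, setup, index=False):
    """Select the rows whose setup matches; a mask by default, indices on request."""
    mask = [s == setup for s in setups]
    if index:
        # 0-indexed positions of the selected rows.
        return [i for i, keep in enumerate(mask) if keep]
    return mask


setup_column = ['A', 'B', 'A', 'None']
print(find_rows(setup_column, 'A'))
print(find_rows(setup_column, 'A', index=True))
```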

find_frame_calib_groups(row)[source]: Find the calibration groups associated with a specific frame.

find_frame_files(ftype, calib_ID=None)[source]

Return the list of files with a given frame type.

The frames must also match the science frame index, if it is provided.

Parameters:

ftype (str) – The frame type identifier. See the keys for FrameTypeBitMask.
calib_ID (int, optional) – Index of the calibration group that it must match. If None, any row of the specified frame type is included.

Returns:

List of file paths that match the frame type and science frame ID, if the latter is provided.

Return type:

list

find_frames(ftype, calib_ID=None, index=False)[source]

Find the rows with the associated frame type.

If the index is provided, the frames must also be matched to the relevant science frame.

Parameters:

ftype (str) – The frame type identifier. See the keys for FrameTypeBitMask. If set to the string ‘None’, this returns all frames without a known type.
calib_ID (int, optional) – Index of the calibration group that it must match. If None, any row of the specified frame type is included.
index (bool, optional) – Return an array of 0-indexed indices instead of a boolean array.

Returns:

A boolean array, or an integer array if index=True, with the rows that contain the frames of the requested type.

Return type:

numpy.ndarray

Raises:

PypeItError – Raised if the framebit column is not set in the table.

frame_paths(indx)[source]

Return the full paths to one or more frames.

Parameters:: indx (int, array-like) – One or more 0-indexed rows in the table with the frames to return. Can be an array of indices or a boolean array of the correct length.
Returns:: List of the full paths of one or more frames.
Return type:: list

get_configuration(indx, cfg_keys=None, modified=False)[source]

Return the configuration dictionary for a given frame.

Parameters:

indx (int) – The index of the table row to use to construct the configuration.
cfg_keys (list, optional) – The list of metadata keys to use to construct the configuration. If None, the configuration_keys of spectrograph is used.
modified (bool, optional) – Return the configuration as modified by the spectrograph-specific modify_config().

Returns:

A dictionary with the metadata values from the selected row.

Return type:

dict

get_configuration_names(ignore=None, return_index=False, configs=None)[source]

Get the list of the unique configuration names.

This provides just the list of setup identifiers (‘A’, ‘B’, etc.) and the row index where it first occurs. This is different from unique_configurations() because the latter determines and provides the configurations themselves.

This is mostly a convenience function for the writing routines.

Parameters:

ignore (list, optional) – Ignore configurations in the provided list.
return_index (bool, optional) – Return row indices with the first occurrence of these configurations.
configs (str, list, optional) – One or more strings used to select the configurations to include in the returned objects. If 'all', pass back all configurations. Otherwise, only return the configurations matched to this provided string or list of strings (e.g., ['A', 'C']).

Returns:

The list of unique setup names. A tuple is returned with a second numpy.ndarray object providing the indices of the first occurrence of these setups, if requested (using return_index).

Return type:

tuple, numpy.ndarray

Raises:

PypeItError – Raised if the ‘setup’ column hasn’t been defined.

get_frame_types(flag_unknown=False, user=None, merge=True)[source]

Generate a table of frame types from the input metadata object.

Todo

Here’s where we could add a SPIT option.

Parameters:

flag_unknown (bool, optional) – Instead of crashing out if there are unidentified files, leave without a type and continue.
user (dict, optional) – A dictionary with the types designated by the user. The file name and type are expected to be the key and value of the dictionary, respectively. The number of keys therefore must match the number of files in table. For frames that have multiple types, the types should be provided as a string with comma-separated types.
merge (bool, optional) – Merge the frame typing into the existing table.

Returns:

A Table with two columns, the type names and the type bits. See FrameTypeBitMask for the allowed frame types.

Return type:

astropy.table.Table

ignore_frames()[source]

Construct a list of frame types to ignore, and the corresponding indices of these frametypes in the table.

Returns:: Two objects are returned, (1) A dictionary where the keys are the frame types that are configuration-independent and the values are the metadata keywords that can be used to assign the frames to a configuration group, and (2) an integer numpy.ndarray with the table rows that should be ignored when defining the configuration.
Return type:: tuple

keys()[source]

static maximum_number_of_configurations()[source]

merge(usrdata, match_type=True)[source]

Use the provided table to supplement or overwrite the metadata.

If the internal table already contains the column in usrdata, the function will try to match the data type of the usrdata column to the existing data type. If it can’t it will just add the column anyway, with the type in usrdata. You can avoid this step by setting match_type=False.

Parameters:

usrdata (astropy.table.Table) – A user provided set of data used to supplement or overwrite metadata read from the file headers. The table must have a filename column that is used to match to the metadata table generated within PypeIt.
match_type (bool, optional) – Attempt to match the data type in usrdata to the type in the internal table. See above.

Raises:

TypeError – Raised if usrdata is not an astropy.table.Table
KeyError – Raised if filename is not a key in the provided table.
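
The type-matching behavior described for match_type can be sketched as a cast with a fallback. This is an assumption-laden illustration using plain lists as columns; the helper merge_column is hypothetical and is not the code PypeIt uses.

```python
def merge_column(existing, user, match_type=True):
    """Return the user column, cast to the existing column's type if possible."""
    if not match_type or not existing:
        return list(user)
    target = type(existing[0])
    try:
        return [target(value) for value in user]
    except (TypeError, ValueError):
        # The cast failed, so keep the user-provided values as they are.
        return list(user)


print(merge_column([1.0, 2.0], ['3', '4']))
print(merge_column([1, 2], ['a', 'b']))
```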

property n_calib_groups: Return the number of calibration groups.

property n_configs

remove_rows(rows, regroup=False)[source]

Remove the provided rows from the data table.

This edits the object directly, nothing is returned.

Parameters:

rows (int, array-like) – One or more rows that should be removed from the datatable. This is passed directly to astropy.table.Table.remove_rows; see astropy documentation to confirm allowed types.
regroup (bool, optional) – If True, reset the setup/configuration, calibration, and combination groups.

set_calibration_groups(global_frames=None, default=False, force=False)[source]

Group calibration frames into sets.

Requires the ‘setup’ column to have been defined. For now this is a simple grouping of frames with the same configuration.

Todo

Maintain a detailed description of the logic.

The ‘calib’ column has a string type to make sure that it matches with what can be read from the pypeit file. The ‘calibbit’ column is actually what is used to determine the calibration group of each frame; see calib_bitmask.

Parameters:

global_frames (list, optional) – A list of strings with the frame types to use in all calibration groups (e.g., [‘bias’, ‘dark’]).
default (bool, optional) – If the ‘calib’ column is not present, set a single calibration group for all rows.
force (bool, optional) – Force the calibration groups to be reconstructed if the ‘calib’ column already exists.

Raises:

PypeItError – Raised if ‘setup’ column is not defined, or if global_frames is provided but the frame types have not been defined yet.

set_combination_groups(assign_objects=True)[source]

Set combination groups.

Note

table is edited in place.

This function can be used to initialize the combination group and background group columns, and/or to assign a unique integer combination group to each object (science or standard frame).

If the ‘comb_id’ or ‘bkg_id’ columns do not exist, they’re set to -1.

Parameters:: assign_objects (bool, optional) – If all of ‘comb_id’ values are less than 0 (meaning they’re unassigned), the combination groups are set to be unique for each standard and science frame. For some instruments (e.g., Keck/NIRES), this will also parse known dither patterns and use them to set default difference-imaging groups.
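
The default assignment can be sketched as follows, using plain lists for the frame type and ‘comb_id’ columns. The helper set_comb_ids is hypothetical, and starting the identifiers at 1 is an assumption of this example; instrument-specific dither parsing is not shown.

```python
def set_comb_ids(frametypes, comb_id=None, assign_objects=True):
    """Initialize comb_id to -1 and give each object frame a unique integer."""
    if comb_id is None:
        comb_id = [-1] * len(frametypes)  # column does not exist yet
    else:
        comb_id = list(comb_id)
    # Only assign groups if every value is still unassigned (< 0).
    if assign_objects and all(c < 0 for c in comb_id):
        next_id = 1  # Assumption: identifiers start at 1.
        for i, ftype in enumerate(frametypes):
            names = ftype.split(',')
            if 'science' in names or 'standard' in names:
                comb_id[i] = next_id
                next_id += 1
    return comb_id


print(set_comb_ids(['bias', 'science', 'standard', 'arc,tilt']))
```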

set_configurations(configs=None, force=False, fill=None)[source]

Assign each frame to a configuration (setup) and include it in the metadata table.

The internal table is edited in place. If the ‘setup’ column already exists, the configurations are not reset unless you call the function with force=True.

Parameters:

configs (dict, optional) – A nested dictionary, one dictionary per configuration with the associated values of the metadata associated with each configuration. The metadata keywords in the dictionary should be the same as in the table, and the keywords used to set the configuration should be the same as returned by the spectrograph configuration_keys method. The latter is not checked. If None, this is set by unique_configurations().
force (bool, optional) – Force the configurations to be reset.
fill (str, optional) – If the ‘setup’ column does not exist, fill the configuration setup columns with this single identifier. Ignores other inputs.

Raises:

PypeItError – Raised if none of the keywords in the provided configuration match the metadata keywords. Also raised when some frames cannot be assigned to a configuration because the spectrograph defines frame types that are ignored when determining the unique configurations, but the frame types have not been set yet.

set_frame_types(type_bits, merge=True)[source]

Set and return a Table with the frame types and bits.

Parameters:

type_bits (numpy.ndarray) – Integer bitmask with the frame types. The length must match the existing number of table rows.
merge (bool, optional) – Merge the types and bits into the existing table. This will overwrite any existing columns.

Returns:

Table with two columns, the frame type name and bits.

Return type:

astropy.table.Table

set_pypeit_cols(write_bkg_pairs=False, write_manual=False)[source]

Generate the list of columns to be included in the fitstbl (nearly the complete list).

Parameters:

write_bkg_pairs (bool, optional) – Add additional PypeIt columns for calib, comb_id and bkg_id
write_manual (bool, optional) – Add additional PypeIt columns for manual extraction

Returns:

Array of columns to be used in the fits table.

Return type:

numpy.ndarray

set_user_added_columns()[source]

Set columns that the user might add.

Note

table is edited in place.

This function can be used to initialize columns that the user might add.

sort(col)[source]

unique_configurations(force=False, copy=False, rm_none=False)[source]

Return the unique instrument configurations.

If run before the 'setup' column is initialized, this function determines the unique instrument configurations by finding unique combinations of the items in the metadata table listed by the spectrograph configuration_keys method.

If run after the 'setup' column has been set, this simply constructs the configuration dictionary using the unique configurations in that column.

This is used to set the internal configs. If this attribute is not None, this function simply returns configs (cf. force).

Warning

Any frame types returned by the config_independent_frames() method for spectrograph will be ignored in the construction of the unique configurations. If config_independent_frames() does not return None and the frame types have not yet been defined (see get_frame_types()), this method will fault!

Parameters:

force (bool, optional) – Force the configurations to be redetermined. Otherwise the configurations are only determined if configs has not yet been defined.
copy (bool, optional) – Return a deep copy of configs instead of the object itself.
rm_none (bool, optional) – Remove any configurations set to ‘None’. If copy is True, this is done after configs is copied to a new dictionary.

Returns:

A nested dictionary, one dictionary per configuration with the associated metadata for each.

Return type:

dict

Raises:

PypeItError – Raised if there is a list of frame types to ignore but the frame types have not been defined yet.
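
Finding unique combinations of the configuration keys can be sketched with plain dictionaries. The helper unique_configs, the sample rows, and the lettered naming are illustrative assumptions; handling of ignored frame types and of an existing 'setup' column is not shown.

```python
from string import ascii_uppercase


def unique_configs(rows, config_keys):
    """Collect the unique combinations of config_keys, named 'A', 'B', ..."""
    configs = {}
    for row in rows:
        cfg = {key: row[key] for key in config_keys}
        if cfg not in configs.values():
            # First time this combination is seen: give it the next letter.
            configs[ascii_uppercase[len(configs)]] = cfg
    return configs


# Hypothetical metadata rows and configuration keys.
rows = [
    {'filename': 'a.fits', 'dispname': '600/4310', 'dichroic': 'd55'},
    {'filename': 'b.fits', 'dispname': '600/4310', 'dichroic': 'd55'},
    {'filename': 'c.fits', 'dispname': '452/3306', 'dichroic': 'd57'},
]
configs = unique_configs(rows, ['dispname', 'dichroic'])
print(configs)
```

The result is a nested dictionary with one inner dictionary per configuration, which is the shape described for the return value above.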

write(output=None, rows=None, columns=None, sort_col=None, overwrite=False, header=None)[source]

Write the metadata either to a file or to the screen.

The method allows you to set the columns to print and which column to use for sorting.

Parameters:

output (str, optional) – Output signature or file name. If None, the table contents are printed to the screen. If 'table', the table that would have been printed/written to disk is returned. Otherwise, the string is interpreted as the name of an ascii file to which to write the table contents.
rows (numpy.ndarray, optional) – A boolean vector selecting the rows of the table to write. If None, all rows are written. Shape must match the number of the rows in the table.
columns (str, list, optional) – A list of columns to include in the output file. Can be provided as a list directly or as a comma-separated string. If None or 'all', all columns are written; if 'pypeit', the columns are the same as those included in the pypeit file. Each selected column must be a valid pypeit metadata keyword, specific to spectrograph. Additional valid keywords, depending on the processing level of the metadata table, are directory, filename, frametype, framebit, setup, calib, and calibbit.
sort_col (str, optional) – Name of the column to use for sorting the output. If None, the table is printed in its current state.
overwrite (bool, optional) – Overwrite any existing file; otherwise raise an exception.
header (str, list, optional) – One or more strings to write to the top of the file, one string per file line; '# ' is added to the beginning of each string. Ignored if output does not specify an output file.

Returns:

The table object that would have been written/printed if output == 'table'. Otherwise, the method always returns None.

Return type:

astropy.table.Table

Raises:

ValueError – Raised if the columns to include are not valid, or if the column to use for sorting is not valid.
FileExistsError – Raised if overwrite is False and the file exists.
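
How the columns and header arguments might be normalized can be sketched as below. Both helpers (parse_columns, format_header) are hypothetical, the 'pypeit' option is deliberately left out, and the available column names are only examples.

```python
def parse_columns(columns, available):
    """Turn None, 'all', a comma-separated string, or a list into a list."""
    if columns is None or columns == 'all':
        return list(available)
    if isinstance(columns, str):
        columns = [c.strip() for c in columns.split(',')]
    invalid = [c for c in columns if c not in available]
    if invalid:
        raise ValueError(f'Invalid columns: {invalid}')
    return list(columns)


def format_header(header):
    """Prefix each header string with '# ', one string per file line."""
    if isinstance(header, str):
        header = [header]
    return [f'# {line}' for line in header]


available = ['directory', 'filename', 'frametype', 'setup']
print(parse_columns('filename,frametype', available))
print(format_header(['Example header', 'Second line']))
```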

write_pypeit(output_path=None, cfg_lines=None, write_bkg_pairs=False, write_manual=False, configs=None, config_subdir=True, version_override=None, date_override=None)[source]

Write a pypeit file in data-table format.

The pypeit file is the main configuration file for PypeIt, configuring the control-flow and algorithmic parameters and listing the data files to read. This function writes the columns selected by the pypeit.spectrographs.spectrograph.Spectrograph.pypeit_file_keys(), which can be specific to each instrument.

Parameters:

output_path (str, optional) – Root path for the output pypeit files. If None, set to current directory. If the output directory does not exist, it is created.
cfg_lines (list, optional) – The list of configuration lines to include in the file. If None are provided, the vanilla configuration is included.
write_bkg_pairs (bool, optional) – When constructing the pypeit.metadata.PypeItMetaData object, include two columns called comb_id and bkg_id that identify object and background frame pairs.
write_manual (bool, optional) – Add additional PypeIt columns for manual extraction
configs (str, list, optional) – One or more strings used to select the configurations to include in the returned objects. If 'all', pass back all configurations. Otherwise, only return the configurations matched to this provided string or list of strings (e.g., ['A', 'C']). See configs.
config_subdir (bool, optional) – Flag to place the pypeit file in a subdirectory named for each configuration. If True, the pypeit file is written to {spec}_{config}/{spec}_{config}.pypeit (e.g., shane_kast_blue_A/shane_kast_blue_A.pypeit). If False, the pypeit file is placed directly in the output_path.
version_override (str, optional) – Override the current version and use this one instead. For documentation purposes only!
date_override (str, optional) – Override the current date and use this one instead. For documentation purposes only!

Raises:

PypeItError – Raised if the ‘setup’ isn’t defined and split is True.

Returns:

List of PypeIt files generated.

Return type:

list
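
The effect of config_subdir on the output location can be sketched with pathlib. The helper pypeit_file_path is hypothetical, and the assumption that the file keeps the {spec}_{config}.pypeit name when config_subdir is False is not confirmed by the text above.

```python
from pathlib import Path


def pypeit_file_path(output_path, spec, config, config_subdir=True):
    """Build the path of the pypeit file for one configuration."""
    root = f'{spec}_{config}'
    base = Path(output_path)
    if config_subdir:
        # Place the file in a subdirectory named for the configuration.
        base = base / root
    # Assumption: the file name is the same with or without the subdirectory.
    return base / f'{root}.pypeit'


print(pypeit_file_path('out', 'shane_kast_blue', 'A').as_posix())
print(pypeit_file_path('out', 'shane_kast_blue', 'A', config_subdir=False).as_posix())
```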

write_sorted(ofile, overwrite=True, ignore=None, write_bkg_pairs=False, write_manual=False)[source]

Write the sorted file.

The sorted file lists all the unique instrument configurations (setups) and the frames associated with each configuration. The output data table is identical to the pypeit file output.

Todo

This is for backwards compatibility, but we should consider reformatting/removing it.

Parameters:

ofile (str, Path) – Name for the output sorted file.
overwrite (bool, optional) – Overwrite any existing file with the same name.
ignore (list, optional) – Ignore configurations in the provided list.
write_bkg_pairs (bool, optional) – Add additional PypeIt columns for calib, comb_id and bkg_id
write_manual (bool, optional) – Add additional PypeIt columns for manual extraction

Raises:

PypeItError – Raised if the ‘setup’ column hasn’t been defined.