pypeit.data.utils module
Data utilities for built-in PypeIt data files
Note
If the hostname URL for the telluric atmospheric grids on S3 changes, the
only place that needs to change is the file s3_url.txt.
Implementation Documentation
This module contains the organization scheme for the pypeit/data files needed
by the PypeIt package. Any routine in the package that needs to load a data
file stored in this directory should use the paths supplied by this module and
not call, e.g., importlib.resources.files or otherwise attempt to access the
package directory structure directly. In this way, if structural changes to
this directory are needed, only this module need be modified and the remainder
of the package can remain ignorant of those changes and continue to call the
paths supplied by this module.
Furthermore, all paths returned by this module are pathlib.Path objects rather
than pure strings, with all of the functionality therein contained.
Most (by number) of the package data files here are distributed with the
PypeIt package and are accessed via the Paths class. For instance, the NIR
spectrophotometry for Vega is accessed via:
vega_file = data.Paths.standards / 'vega_tspectool_vacuum.dat'
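Because the result is a pathlib.Path, the usual Path operations are available on it. The following is a minimal sketch only: a temporary directory stands in for the real standards directory (an assumption made so the example runs without PypeIt installed); in PypeIt the directory would come from Paths.standards.

```python
import tempfile
from pathlib import Path

# Stand-in for ``data.Paths.standards``; a temporary directory is used here
# only so the example runs without PypeIt installed.
standards_dir = Path(tempfile.mkdtemp())
(standards_dir / "vega_tspectool_vacuum.dat").write_text("# placeholder\n")

# The ``/`` operator joins path components and returns another Path.
vega_file = standards_dir / "vega_tspectool_vacuum.dat"

print(vega_file.name)       # file name without the directory
print(vega_file.suffix)     # file extension
print(vega_file.is_file())  # whether the file exists on disk
```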
For some directories, however, the size of the included files is large enough
that it was beginning to cause problems with distributing the package via PyPI.
For these specific directories, the data is still stored in the GitHub
repository but is not distributed with the PyPI package. In order to access
and use these files, we use the AstroPy download/cache system, and specific
functions (get_*_filepath()) are required to interact with these files.
Currently, the directories covered by the AstroPy download/cache system are:
arc_lines/reid_arxiv
skisim
sensfuncs
From time to time, it may be necessary to add additional files/directories to
the AstroPy download/cache system. In this case, there is a particular
sequence of steps required. Because the caching routines look for remote-hosted
data files in either the develop tree or a tagged version tree (e.g., 1.8.0) of
the repository, any new files must already be present in the repo before testing
a new get_*_filepath() routine. The order of operations is:
1. Add any new remote-hosted files to the GitHub repo via a separate PR that also modifies MANIFEST.in to exclude these files from the distributed package.
2. Create a new get_*_filepath() function in this module, following the example of one of the existing functions.
3. Elsewhere in PypeIt, load the needed file by invoking the new get_*_filepath() function. An example of this can be found in pypeit/core/flux_calib.py, where get_skisim_filepath() is called to locate sky transmission files.
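The general shape of step 2 can be sketched as follows. This is an illustration under stated assumptions, not the module's actual code: get_example_filepath, the example_dir subdirectory, and the fake_fetch helper are hypothetical names, and the fetcher is passed in as an argument so the sketch runs without network access. In PypeIt the fetching role is played by fetch_remote_file().

```python
import tempfile
from pathlib import Path
from typing import Callable


def get_example_filepath(
    example_file: str,
    package_dir: Path,
    fetch: Callable[[str, str], Path],
) -> Path:
    """Hypothetical get_*_filepath(): prefer a local copy, else fetch it."""
    local_path = package_dir / example_file
    if local_path.is_file():
        return local_path
    # Fall back to the download/cache layer (fetch_remote_file() in PypeIt).
    return fetch(example_file, "example_dir")


# --- demonstration with a fake fetcher (no network access) ---
package_dir = Path(tempfile.mkdtemp())
cache_dir = Path(tempfile.mkdtemp())


def fake_fetch(filename: str, filetype: str) -> Path:
    """Pretend to download ``filename`` and return its cached location."""
    cached = cache_dir / filename
    cached.write_text(f"downloaded from {filetype}\n")
    return cached


(package_dir / "local.dat").write_text("bundled\n")
local_result = get_example_filepath("local.dat", package_dir, fake_fetch)
remote_result = get_example_filepath("remote.dat", package_dir, fake_fetch)
```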
If new package-included data are added that are not very large (total directory
size < a few MB), it is not necessary to use the AstroPy cache/download system.
In this case, simply add the directory path to the Paths class and access the
enclosed files similarly to the Vega example above.
- class pypeit.data.utils.Paths[source]
Bases: object
List of hardwired paths within the pypeit.data module
Each @property method returns a pathlib.Path object
- _data = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pypeit/envs/release/lib/python3.9/site-packages/pypeit/data')
- class property arc_plot: Path
- class property arclines: Path
- static check_isdir(path: Path) Path [source]
Check that the hardwired directory exists
If yes, return the directory path, else raise an error message
- class property data: Path
- class property extinction: Path
- class property filters: Path
- class property linelist: Path
- class property nist: Path
- class property reid_arxiv: Path
- class property sensfuncs: Path
- class property skisim: Path
- class property sky_spec: Path
- class property spectrographs: Path
- class property standards: Path
- class property static_calibs: Path
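The pattern of a single class that owns the hardwired directories, with check_isdir() guarding each one, can be sketched as below. ExamplePaths is a hypothetical, simplified analogue: it uses a plain classmethod rather than class properties, points at a temporary directory, and raises NotADirectoryError, which is an assumption since the exception type used by the real check_isdir() is not stated here.

```python
import tempfile
from pathlib import Path


class ExamplePaths:
    """Hypothetical, simplified analogue of a hardwired-path container."""

    # Root data directory; a temporary directory stands in for pypeit/data.
    _data = Path(tempfile.mkdtemp())

    @staticmethod
    def check_isdir(path: Path) -> Path:
        """Return ``path`` if it is an existing directory, else raise."""
        if not path.is_dir():
            raise NotADirectoryError(f"Unable to find {path}")
        return path

    @classmethod
    def standards(cls) -> Path:
        # A plain classmethod is used here; the real class exposes properties.
        return cls.check_isdir(cls._data / "standards")


# Create the subdirectory so that the lookup succeeds.
(ExamplePaths._data / "standards").mkdir()
standards_dir = ExamplePaths.standards()
```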
- pypeit.data.utils.fetch_remote_file(filename: str, filetype: str, remote_host: str = 'github', install_script: bool = False, force_update: bool = False, full_url: str | None = None) Path [source]
Use astropy.utils.data to fetch file from remote or cache
The function download_file() will first look in the local cache (the option cache=True is used with this function to retrieve downloaded files from the cache, as needed) before downloading the file from the remote server.
The remote file can be forcibly downloaded through the use of force_update.
- Parameters:
filename (str) – The base filename to search for
filetype (str) – The subdirectory of pypeit/data/ in which to find the file (e.g., arc_lines/reid_arxiv or sensfuncs)
remote_host (str, optional) – The remote host scheme. Currently only ‘github’ and ‘s3_cloud’ are supported. Defaults to ‘github’.
install_script (bool, optional) – This function is being called from an install script (i.e., pypeit_install_telluric) – relates to warnings displayed. Defaults to False.
force_update (bool, optional) – Force astropy.utils.data.download_file() to update the cache by downloading the latest version. Defaults to False.
full_url (str, optional) – The full url (i.e., skip _build_remote_url()). Defaults to None.
- Returns:
The local path to the desired file in the cache
- Return type:
pathlib.Path
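The cache-first behavior and the effect of force_update can be illustrated with a small stand-alone sketch. It does not call AstroPy: fetch_with_cache, the dictionary cache, the counting fake_download helper, and the example.invalid URL are all hypothetical stand-ins chosen so the logic can be run without network access.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def fetch_with_cache(
    url: str,
    cache: Dict[str, Path],
    download: Callable[[str], Path],
    force_update: bool = False,
) -> Path:
    """Return the cached file for ``url``, downloading only when needed."""
    if not force_update and url in cache:
        return cache[url]
    # Cache miss, or a forced refresh: download and (re)store the entry.
    cache[url] = download(url)
    return cache[url]


# --- demonstration with a counting fake downloader (no network access) ---
download_dir = Path(tempfile.mkdtemp())
calls: List[str] = []


def fake_download(url: str) -> Path:
    """Record the request and write a new placeholder file for it."""
    calls.append(url)
    path = download_dir / f"file_{len(calls)}.dat"
    path.write_text(url)
    return path


cache: Dict[str, Path] = {}
url = "https://example.invalid/pypeit/data/sensfuncs/example.fits"
first = fetch_with_cache(url, cache, fake_download)
second = fetch_with_cache(url, cache, fake_download)  # served from the cache
third = fetch_with_cache(url, cache, fake_download, force_update=True)
```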
- pypeit.data.utils.get_extinctfile_filepath(extinction_file: str) Path [source]
Return the full path to the extinction file
Unlike other get_*_filepath() functions, the extinction files are included with the PyPI distribution since they are small text files. The purpose of this function is to be able to load in user-installed extinction files from observatories not already included. Users may self-install such files using the pypeit_install_extinctfile script.
- Parameters:
extinction_file (str) – The base filename of the extinction file to be located
- Returns:
The full path to the extinction file
- Return type:
pathlib.Path
- pypeit.data.utils.get_linelist_filepath(linelist_file: str) Path [source]
Return the full path to the linelist file
It is desired to allow users to utilize their own arc line lists for wavelength calibration without modifying the distributed version of the package. We can utilize the astropy download/cache system added previously to include this functionality.
Using the script pypeit_install_linelist, custom arc line lists can be installed into the PypeIt cache (nominally ~/.pypeit/cache), and are not placed into the package directory itself.
Given the line list filename, this function checks first for the existence of the file in the package directory, then checks the PypeIt cache. For all built-in line lists, this function returns the file location within the package directory. For user-supplied lists that were installed using the script, this function returns the location within the cache.
The cache keeps a hash of the file URL, which contains the PypeIt version number. As users update to newer versions, the linelist files must be reinstalled using the included script.
- Parameters:
linelist_file (str) – The base filename of the linelist file to be located
- Returns:
The full path to the linelist file
- Return type:
pathlib.Path
- pypeit.data.utils.get_reid_arxiv_filepath(arxiv_file: str) tuple[pathlib.Path, str] [source]
Return the full path to the reid_arxiv file
In an attempt to reduce the size of the PypeIt package as distributed on PyPI, the reid_arxiv files are no longer distributed with the package. The collection of files is hosted remotely, and only the reid_arxiv files needed by a particular user are downloaded to the local machine.
This function checks for the local existence of the reid_arxiv file, and downloads it from the remote server using AstroPy’s download_file() function. The file downloaded in this fashion is kept in the PypeIt cache (nominally ~/.pypeit/cache) and is not placed into the package directory itself.
The cache keeps a hash of the file URL, which contains the PypeIt version number. As users update to newer versions, the reid_arxiv files will be downloaded again (matching the new version #) to catch any changes.
As most users will need only a small number of reid_arxiv files for their particular reductions, the remote fetch will only occur once per file (per version of PypeIt).
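Why a version bump triggers a fresh download can be seen from a short sketch of a URL-keyed cache. Everything specific here is an assumption for illustration: the example.invalid URL layout, the use of SHA-256 as the hash, and the version string 1.9.0 are hypothetical; the text above states only that the cache keeps a hash of a URL that contains the version number.

```python
import hashlib


def remote_url(version: str, filetype: str, filename: str) -> str:
    """Hypothetical URL layout with the version tag as one path component."""
    return (
        f"https://example.invalid/pypeit/{version}/pypeit/data/"
        f"{filetype}/{filename}"
    )


def cache_key(url: str) -> str:
    # SHA-256 is an assumption; only "a hash of the file URL" is stated above.
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


key_old = cache_key(remote_url("1.8.0", "arc_lines/reid_arxiv", "example.fits"))
key_new = cache_key(remote_url("1.9.0", "arc_lines/reid_arxiv", "example.fits"))
key_old_again = cache_key(
    remote_url("1.8.0", "arc_lines/reid_arxiv", "example.fits")
)

# Same file, different version: the keys differ, so the newer version
# misses the cache and the file is fetched again.
print(key_old == key_new)
# Same file, same version: the key is stable, so the cached copy is reused.
print(key_old == key_old_again)
```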
- pypeit.data.utils.get_sensfunc_filepath(sensfunc_file: str, symlink_in_pkgdir: bool = False) Path [source]
Return the full path to the sensfunc file
In an attempt to reduce the size of the PypeIt package as distributed on PyPI, the sensfunc files are no longer distributed with the package. The collection of files is hosted remotely, and only the sensfunc files needed by a particular user are downloaded to the local machine.
This function checks for the local existence of the sensfunc file, and downloads it from the remote server using AstroPy’s download_file() function. The file downloaded in this fashion is kept in the PypeIt cache (nominally ~/.pypeit/cache) and is not placed into the package directory itself.
The cache keeps a hash of the file URL, which contains the PypeIt version number. As users update to newer versions, the sensfunc files will be downloaded again (matching the new version #) to catch any changes.
As most users will need only a small number of sensfunc files for their particular reductions, the remote fetch will only occur once per file (per version of PypeIt).
- Parameters:
- Returns:
The full path to the sensfunc file
- Return type:
pathlib.Path
- pypeit.data.utils.get_skisim_filepath(skisim_file: str) Path [source]
Return the full path to the skisim file
In an attempt to reduce the size of the PypeIt package as distributed on PyPI, the skisim files are no longer distributed with the package. The collection of files is hosted remotely, and only the skisim files needed by a particular user are downloaded to the local machine.
This function checks for the local existence of the skisim file, and downloads it from the remote server using AstroPy’s download_file() function. The file downloaded in this fashion is kept in the PypeIt cache (nominally ~/.pypeit/cache) and is not placed into the package directory itself.
The cache keeps a hash of the file URL, which contains the PypeIt version number. As users update to newer versions, the skisim files will be downloaded again (matching the new version #) to catch any changes.
As most users will need only a small number of skisim files for their particular reductions, the remote fetch will only occur once per file (per version of PypeIt).
- Parameters:
skisim_file (str) – The base filename of the skisim file to be located
- Returns:
The full path to the skisim file
- Return type:
pathlib.Path
- pypeit.data.utils.get_telgrid_filepath(telgrid_file: str) Path [source]
Return the full path to the telgrid file
Atmospheric Telluric Grid files are not part of the PypeIt package itself due to their large (~4-8GB) size. These files are hosted remotely (see the PypeIt documentation), and only the telgrid files needed by a particular user are downloaded to the local machine.
This function checks for the local existence of the telgrid file, and downloads it from the remote server using AstroPy’s download_file() function. The file downloaded in this fashion is kept in the PypeIt cache (nominally ~/.pypeit/cache) and is not placed into the package directory itself.
As most users will need only a small number of telgrid files for their particular reductions, the remote fetch will only occur once per file.
- Parameters:
telgrid_file (str) – The base filename of the telgrid file to be located
- Returns:
The full path to the telgrid file
- Return type:
pathlib.Path
- pypeit.data.utils.load_sky_spectrum(sky_file: str) XSpectrum1D [source]
Load a sky spectrum into an XSpectrum1D object
NOTE: This is where the path to the data directory is added!
Todo
Try to eliminate the XSpectrum1D dependency
- Parameters:
sky_file (str) – The filename (NO PATH) of the sky file to use.
- Returns:
Sky spectrum
- Return type:
- pypeit.data.utils.load_telluric_grid(filename: str)[source]
Load a telluric atmospheric grid
NOTE: This is where the path to the data directory is added!
- Parameters:
filename (str) – The filename (NO PATH) of the telluric atmospheric grid to use.
- Returns:
Telluric Grid FITS HDU list
- Return type:
- pypeit.data.utils.load_thar_spec()[source]
Load the archived ThAr spectrum
NOTE: This is where the path to the data directory is added!
- Returns:
ThAr Spectrum FITS HDU list
- Return type:
- pypeit.data.utils.search_cache(pattern_str: str) list[pathlib.Path] [source]
Search the cache for items matching a pattern string
This function searches the PypeIt cache for files whose URL keys contain the input pattern_str, and returns the local filesystem path to those files.
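The matching described here amounts to a substring test on the URL keys of the cache. The sketch below assumes the cache contents are available as a mapping from URL to local path; search_url_keys, the example.invalid URLs, and the /cache/... paths are hypothetical and used only for illustration.

```python
from pathlib import Path
from typing import Dict, List


def search_url_keys(cache_contents: Dict[str, Path], pattern_str: str) -> List[Path]:
    """Return local paths whose URL key contains ``pattern_str``."""
    return [path for url, path in cache_contents.items() if pattern_str in url]


# Hypothetical cache contents: URL key -> local file in the cache.
contents = {
    "https://example.invalid/data/sensfuncs/a.fits": Path("/cache/1/contents"),
    "https://example.invalid/data/skisim/b.dat": Path("/cache/2/contents"),
    "https://example.invalid/data/sensfuncs/c.fits": Path("/cache/3/contents"),
}

matches = search_url_keys(contents, "sensfuncs")
no_matches = search_url_keys(contents, "telgrid")
```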
- pypeit.data.utils.write_file_to_cache(filename: str, cachename: str, filetype: str, remote_host: str = 'github')[source]
Use astropy.utils.data to save local file to cache
This function writes a local file to the PypeIt cache as if it came from a remote server. This is useful for being able to use locally created or separately downloaded files in place of PypeIt-distributed versions.
- Parameters:
filename (str) – The filename of the local file to save
cachename (str) – The name of the cached version of the file
filetype (str) – The subdirectory of pypeit/data/ in which to find the file (e.g., arc_lines/reid_arxiv or sensfuncs)
remote_host (str, optional) – The remote host scheme. Currently only ‘github’ and ‘s3_cloud’ are supported. Defaults to ‘github’.
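The idea of storing a local file "as if it came from a remote server" can be sketched by filing a copy under the key that the remote URL would have had. This is not the AstroPy implementation: import_local_file, the one-directory-per-hashed-URL layout, the SHA-256 hash, and the example.invalid URL are all assumptions made for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def import_local_file(filename: Path, url_key: str, cache_dir: Path) -> Path:
    """Copy ``filename`` into ``cache_dir`` under the hash of ``url_key``."""
    # Assumed layout: one directory per hashed URL, holding a "contents" file.
    entry = cache_dir / hashlib.sha256(url_key.encode("utf-8")).hexdigest()
    entry.mkdir(parents=True, exist_ok=True)
    cached = entry / "contents"
    shutil.copyfile(filename, cached)
    return cached


# --- demonstration with temporary directories ---
work_dir = Path(tempfile.mkdtemp())
cache_dir = Path(tempfile.mkdtemp())

local_file = work_dir / "my_sensfunc.fits"
local_file.write_text("my custom data\n")

# The URL a remote copy named "example.fits" would have had (hypothetical).
url_key = "https://example.invalid/pypeit/data/sensfuncs/example.fits"
cached = import_local_file(local_file, url_key, cache_dir)
```

A later lookup that hashes the same URL would then find the locally supplied file instead of downloading one.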