FileSystem#

settings activitysim.core.configuration.FileSystem#

Manage finding and loading files for ActivitySim’s command line interface.

Fields:
  • cache_dir (pathlib.Path)

  • configs_dir (tuple[pathlib.Path, ...])

  • data_dir (tuple[pathlib.Path, ...])

  • output_dir (pathlib.Path)

  • pipeline_file_name (str)

  • profile_dir (pathlib.Path)

  • settings_file_name (str)

  • sharrow_cache_dir (pathlib.Path)

  • working_dir (pydantic.types.DirectoryPath)

Validators:
  • configs_dirs_must_exist » configs_dir

  • data_dirs_must_exist » data_dir

field cache_dir: Path = None#

Name of the output directory for general cache files.

If not given, a directory named “cache” will be created inside the usual output directory.

field configs_dir: tuple[Path, ...] = ('configs',)#

Name[s] of the config directory.

Validated by:
  • configs_dirs_must_exist

field data_dir: tuple[Path, ...] = ('data',)#

Name[s] of the data directory.

Validated by:
  • data_dirs_must_exist

field output_dir: Path = 'output'#

Name of the output directory.

This directory will be created on access if it does not exist.

field pipeline_file_name: str = 'pipeline'#

The name for the base pipeline file or directory.

field profile_dir: Path = None#

Name of the output directory for pyinstrument profiling files.

If not given, a unique time-stamped directory will be created inside the usual output directory.

field settings_file_name: str = 'settings.yaml'#

field sharrow_cache_dir: Path = None#

Name of the output directory for sharrow cache files.

If not given, a directory named “__sharrowcache__” will be created inside the general cache directory.

field working_dir: DirectoryPath = None#

Name of the working directory.

All other directories (configs, data, output, cache), when given as relative paths, are assumed to be relative to this working directory. If it is not provided, the usual Python current working directory is used.

Constraints:
  • format = directory-path
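The path rules described for the fields above can be pictured with a small standalone sketch. It is not ActivitySim code: `resolve_dir` and `default_layout` are hypothetical helpers, written under the assumption that relative directories are joined to the working directory and that the cache defaults nest as the field descriptions state. Unlike the real accessors, it creates nothing on disk.

```python
from pathlib import Path


def resolve_dir(working_dir, given):
    """Return `given` unchanged if it is absolute, else join it to `working_dir`."""
    given = Path(given)
    return given if given.is_absolute() else Path(working_dir) / given


def default_layout(working_dir, output_dir="output", cache_dir=None):
    """Hypothetical helper mirroring the documented defaults (not ActivitySim code)."""
    output = resolve_dir(working_dir, output_dir)
    # cache_dir falls back to "<output>/cache" when it is not given
    cache = output / "cache" if cache_dir is None else resolve_dir(working_dir, cache_dir)
    # sharrow_cache_dir falls back to "<cache>/__sharrowcache__"
    sharrow_cache = cache / "__sharrowcache__"
    return {"output": output, "cache": cache, "sharrow_cache": sharrow_cache}


layout = default_layout(Path("example_model"))
custom = default_layout(Path("example_model"), cache_dir="scratch")
```

Here `layout["sharrow_cache"]` ends in `output/cache/__sharrowcache__`, while `custom["cache"]` sits directly under the working directory because a relative `cache_dir` was given.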

validator configs_dirs_must_exist  »  configs_dir#

validator data_dirs_must_exist  »  data_dir#

expand_input_file_list(input_files) list[Path]#

Expand a list of input files by replacing any glob patterns with the files they match.
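As a rough illustration of what glob expansion means here, the sketch below uses only the standard library. `expand_globs` is a hypothetical stand-in, and its details (sorting the matches, passing non-pattern names through unchanged, resolving against a single base directory) are assumptions rather than documented behavior of `expand_input_file_list`.

```python
import glob
import tempfile
from pathlib import Path


def expand_globs(input_files, base_dir):
    """Replace each glob pattern with its sorted matches; keep plain names as-is."""
    expanded = []
    for name in input_files:
        if any(ch in str(name) for ch in "*?["):
            # Looks like a glob pattern: substitute the files it matches.
            matches = sorted(glob.glob(str(Path(base_dir) / name)))
            expanded.extend(Path(m) for m in matches)
        else:
            expanded.append(Path(base_dir) / name)
    return expanded


with tempfile.TemporaryDirectory() as tmp:
    for fname in ("households.csv", "persons.csv", "land_use.parquet"):
        (Path(tmp) / fname).touch()
    result = expand_globs(["*.csv", "land_use.parquet"], tmp)
    names = [p.name for p in result]
```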

find_trace_file_path(file_name, trace_dir=None, return_all=False, file_type=None)#

Find the complete path to one or more existing trace file(s).

Parameters:
file_name : str

Base name of the trace file.

trace_dir : path-like, optional

Construct the trace file path within this directory. If not provided (typically for normal operation) the “trace” sub-directory of the normal output directory given by get_output_dir is used. The option to give a different location is primarily used to conduct trace file validation testing.

return_all : bool, default False

By default, exactly one matching file is returned, and an exception is raised if there is not exactly one match. Set this to True to return all matches instead.

file_type : str, optional

If provided, ensure that the located file path(s) have this extension.

Returns:
Path or list[Path]

A single Path if return_all is False, otherwise a list of Paths.

Raises:
FileNotFoundError

If there are zero OR multiple matches.
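The "exactly one match" contract can be sketched as follows. `find_existing` is a hypothetical helper, not the ActivitySim method; it assumes candidates are files whose names begin with the base name, and that `return_all=True` simply returns whatever was found.

```python
import tempfile
from pathlib import Path


def find_existing(trace_dir, file_name, return_all=False, file_type=None):
    # Assumption: candidates are files whose names start with `file_name`.
    matches = sorted(p for p in Path(trace_dir).iterdir() if p.name.startswith(file_name))
    if file_type is not None:
        matches = [p for p in matches if p.suffix == f".{file_type}"]
    if return_all:
        return matches
    if len(matches) != 1:
        # Zero matches and multiple matches are both treated as errors.
        raise FileNotFoundError(
            f"expected exactly one match for {file_name!r}, found {len(matches)}"
        )
    return matches[0]


with tempfile.TemporaryDirectory() as tmp:
    for fname in ("tours-a1.csv", "trips-a1.csv", "trips-b2.csv"):
        (Path(tmp) / fname).touch()
    single = find_existing(tmp, "tours", file_type="csv").name
    every = [p.name for p in find_existing(tmp, "trips", return_all=True)]
    try:
        find_existing(tmp, "trips")
        ambiguous_raises = False
    except FileNotFoundError:
        ambiguous_raises = True
```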

get_cache_dir(subdir=None) Path#

Get the cache directory, creating it if needed.

The cache directory is used to store:
  • skim memmaps created by skim_dict_factory

  • tvpb tap_tap table cache

  • pre-compiled sharrow modules

Parameters:
subdir : Path-like, optional

If given, get this subdirectory of the cache directory.

Returns:
Path

get_config_file_path(file_name: Path | str, mandatory: bool = True, allow_glob: bool = False) Path#

Find the first matching file among config directories.

Parameters:
file_name : Path-like

The name of the file to match.

mandatory : bool, default True

Raise a FileNotFoundError if no match is found. If set to False, this method returns None when there is no match.

allow_glob : bool, default False

Allow glob-style matches.

Returns:
Path or None
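A first-match search over an ordered tuple of directories might look like the sketch below. `first_match` and the directory names are hypothetical; the sketch only demonstrates the documented ordering and the `mandatory` switch, and it leaves out glob support.

```python
import tempfile
from pathlib import Path


def first_match(search_dirs, file_name, mandatory=True):
    # Directories are checked in order, so earlier entries take precedence.
    for directory in search_dirs:
        candidate = Path(directory) / file_name
        if candidate.exists():
            return candidate
    if mandatory:
        raise FileNotFoundError(
            f"{file_name} not found in any of {len(search_dirs)} directories"
        )
    return None


with tempfile.TemporaryDirectory() as tmp:
    override, base = Path(tmp) / "configs_override", Path(tmp) / "configs"
    override.mkdir()
    base.mkdir()
    (override / "settings.yaml").touch()
    (base / "settings.yaml").touch()
    (base / "network_los.yaml").touch()
    dirs = (override, base)
    found_settings = first_match(dirs, "settings.yaml").parent.name
    found_los = first_match(dirs, "network_los.yaml").parent.name
    missing = first_match(dirs, "absent.yaml", mandatory=False)
```

Listing an override directory first is what lets a file placed there shadow the same-named file in a later directory.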
get_configs_dir() tuple[Path]#

Get the configs directories.

Returns:
tuple[Path]

get_data_dir() tuple[Path]#

Get the data directories.

Returns:
tuple[Path]

get_data_file_path(file_name, mandatory=True, allow_glob=False, alternative_suffixes=()) Path#

Find the first matching file among data directories.

Parameters:
file_name : Path-like

The name of the file to match.

mandatory : bool, default True

Raise a FileNotFoundError if no match is found. If set to False, this method returns None when there is no match.

allow_glob : bool, default False

Allow glob-style matches.

alternative_suffixes : Iterable[str], optional

Other file suffixes to search for, if the expected filename is not found. This allows, for example, the data files to be stored as compressed csv (“*.csv.gz”) without changing the config files.

Returns:
Path or None
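The fallback to alternative suffixes can be sketched with `pathlib.Path.with_suffix`. This is an illustration under assumptions, not the library's implementation: `find_with_alternatives` is hypothetical, suffixes are assumed to be written with a leading dot (for example `.csv.gz`), and the expected name is tried in every directory before any alternative.

```python
import tempfile
from pathlib import Path


def find_with_alternatives(data_dirs, file_name, alternative_suffixes=()):
    # Try the expected name first, then the same name with each alternative suffix.
    candidates = [Path(file_name)]
    candidates += [Path(file_name).with_suffix(alt) for alt in alternative_suffixes]
    for name in candidates:
        for directory in data_dirs:
            path = Path(directory) / name
            if path.exists():
                return path
    return None


with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data"
    data.mkdir()
    (data / "households.csv.gz").touch()  # only the compressed file exists
    strict = find_with_alternatives([data], "households.csv")
    relaxed = find_with_alternatives([data], "households.csv", (".csv.gz",))
    relaxed_name = relaxed.name
```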
get_log_file_path(file_name) Path#

Get the complete path to a log file.

Parameters:
file_name : str

Base name of the log file.

Returns:
Path

get_output_dir(subdir=None) Path#

Get an output directory, creating it if needed.

Parameters:
subdir : Path-like, optional

If given, get this subdirectory of the output_dir.

Returns:
Path

get_output_file_path(file_name) Path#

get_pipeline_filepath() Path#

Get the complete path to the pipeline file or directory.

Returns:
Path

get_profiling_file_path(file_name) Path#

Get the complete path to a profile output file.

Parameters:
file_name : str

Base name of the profiling output file.

Returns:
Path

get_segment_coefficients(model_settings: pydantic.main.BaseModel | dict, segment_name: str)#

get_sharrow_cache_dir() Path#

Get the sharrow cache directory, creating it if needed.

The sharrow cache directory is used to store only sharrow’s cache of pre-compiled functions.

Returns:
Path

get_trace_file_path(file_name, tail=None, trace_dir=None, create_dirs=True, file_type=None)#

Get the complete path to a trace file.

Parameters:
file_name : str

Base name of the trace file.

tail : str or False, optional

Add this suffix to filenames. If not given, a quasi-random short string is derived from the current time. Set to False to omit the suffix entirely. Having a unique suffix makes it easier to open multiple comparable trace files side-by-side in Excel, which doesn’t allow identically named files to be open simultaneously. Omitting the suffix can be valuable for using automated tools to find file differences across many files simultaneously.

trace_dir : path-like, optional

Construct the trace file path within this directory. If not provided (typically for normal operation) the “trace” sub-directory of the normal output directory given by get_output_dir is used. The option to give a different location is primarily used to conduct trace file validation testing.

create_dirs : bool, default True

If the path to the containing directory of the trace file does not yet exist, create it.

file_type : str, optional

If provided, ensure that the generated file path has this extension.

Returns:
Path
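The three modes of the `tail` argument (explicit suffix, time-derived suffix, no suffix) can be compared in a short sketch. `build_trace_path` is hypothetical: the dash separator, the six-character hexadecimal token, and the way the extension is appended are all assumptions made for illustration, and no directories are created.

```python
import time
from pathlib import Path


def build_trace_path(trace_dir, file_name, tail=None, file_type=None):
    if tail is None:
        # Assumption: any short time-derived token will do for illustration.
        tail = format(time.time_ns() % 16**6, "06x")
    # tail=False means "no suffix at all", which keeps names stable across runs.
    stem = file_name if tail is False else f"{file_name}-{tail}"
    if file_type is not None:
        stem = f"{stem}.{file_type}"
    return Path(trace_dir) / stem


fixed = build_trace_path("output/trace", "trips", tail="run1", file_type="csv")
bare = build_trace_path("output/trace", "trips", tail=False, file_type="csv")
auto = build_trace_path("output/trace", "trips", file_type="csv")
```

Stable names such as `bare` suit automated diffing across runs, whereas distinct tails let comparable files be opened side by side.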
get_working_subdir(subdir) Path#

open_log_file(file_name, mode, header=None, prefix=False)#

classmethod parse_args(args)#

persist_sharrow_cache() None#

Change the sharrow cache directory to a persistent, user-global location.

The change is made in-place to sharrow_cache_dir for this object. The location for the cache is selected by platformdirs.user_cache_dir. An extra directory layer based on the current numba version is also added to the cache directory, which allows different sets of cache files to co-exist for different versions of numba (i.e. different conda envs). This location is not configurable; to select a different location, change the value of FileSystem.sharrow_cache_dir itself.

read_model_coefficients(model_settings: Optional[Union[LogitComponentSettings, dict[str, Any]]] = None, file_name: Optional[str] = None) DataFrame#

read_model_settings(file_name, mandatory=False)#

read_model_spec(file_name: Path | str)#

read_settings_file(file_name: str, mandatory: bool = True, include_stack: bool = False, configs_dir_list: Optional[tuple[Path]] = None, validator_class: Optional[type[pydantic.main.BaseModel]] = None)#

Load settings from one or more yaml files.

This method looks for the first occurrence of a yaml file named <file_name> in the directories listed in configs_dir, and reads settings from that yaml file.

A settings file may contain directives that affect which file's settings are returned:

  • inherit_settings (boolean)

    If found and set to true, this method will backfill settings in the current file with values from the next settings file in configs_dir list (if any)

  • include_settings: string <include_file_name>

    Read settings from specified include_file in place of the current file. To avoid confusion, this directive must appear ALONE in the target file, without any additional settings or directives.

Parameters:
file_name : str

mandatory : boolean, default True

If true, raise SettingsFileNotFoundError if no matching settings file is found in any config directory, otherwise this method will return an empty dict or an all-default instance of the validator class.

include_stack : boolean or list

Only used for recursive calls, provides a list of files included so far to detect and prevent cycles.

validator_class : type[pydantic.BaseModel], optional

This model is used to validate the loaded settings.

Returns:
dict or validator_class
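The backfilling behavior of the inherit_settings directive can be pictured with plain dictionaries standing in for parsed yaml files. This is a sketch under assumptions, not the actual loader: `backfill_settings` is hypothetical, the setting names are only illustrative, and dropping the directive key from the result is a choice made here for clarity.

```python
def backfill_settings(stack):
    """Merge parsed same-named settings files, given in configs_dir order."""
    merged = {}
    for settings in stack:
        for key, value in settings.items():
            if key != "inherit_settings":
                merged.setdefault(key, value)  # values from earlier files win
        if not settings.get("inherit_settings", False):
            break  # stop at the first file that does not ask to inherit
    return merged


override = {"inherit_settings": True, "households_sample_size": 100}
base = {"households_sample_size": 0, "chunk_size": 0}

merged = backfill_settings([override, base])
standalone = backfill_settings([{"households_sample_size": 100}, base])
```

In `merged`, the override's sample size is kept and `chunk_size` is backfilled from the base file; in `standalone`, the first file does not inherit, so the base file is never consulted.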