Disaggregate Accessibility#

The disaggregate accessibility model is an extension of the base accessibility model. While the base accessibility model is based on a mode-specific decay function and uses fixed market segments in the population (i.e., income), the disaggregate accessibility model extracts the actual destination choice logsums by purpose (i.e., mandatory fixed school/work location and non-mandatory tour destinations by purpose) from the actual model calculations using a user-defined proto-population. This enables users to include features that may be more critical to destination choice than just income (e.g., automobile ownership).

Structure#

Inputs

  • disaggregate_accessibility.yaml - Configuration settings for disaggregate accessibility model.

  • annotate.csv [optional] - Users can specify additional annotations specific to disaggregate accessibility, for example to annotate the proto-population tables.

Outputs

  • final_disaggregate_accessibility.csv [optional]

  • final_non_mandatory_tour_destination_accessibility.csv [optional]

  • final_workplace_location_accessibility.csv [optional]

  • final_school_location_accessibility.csv [optional]

  • final_proto_persons.csv [optional]

  • final_proto_households.csv [optional]

  • final_proto_tours.csv [optional]

The above tables are created in the model pipeline, but the model will not save any outputs unless they are listed under output_tables in settings.yaml. Users can return the proto-population tables for inspection, as well as the raw logsum accessibilities for mandatory school/work and non-mandatory destinations. The logsums are then merged at the household level in final_disaggregate_accessibility.csv, with the logsums for each tour purpose shown as separate columns.
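The household-level merge described above amounts to reshaping long-form logsums (one row per household and purpose) into one row per household with a column per purpose. The following is only a minimal sketch of that reshaping, not ActivitySim's implementation; the function name, the field names (proto_household_id, purpose, logsum), and the logsum values are hypothetical and chosen purely for illustration.

```python
def pivot_logsums(long_rows):
    """Reshape long-form logsum rows into one dict per household.

    Each input row is assumed (hypothetically) to carry a household id,
    a tour purpose, and the logsum computed for that purpose.
    """
    wide = {}
    for row in long_rows:
        household = wide.setdefault(row["proto_household_id"], {})
        # One column per tour purpose.
        household[f"{row['purpose']}_accessibility"] = row["logsum"]
    return wide


# Illustrative values only; these are not model results.
rows = [
    {"proto_household_id": 1, "purpose": "workplace_location", "logsum": 8.1},
    {"proto_household_id": 1, "purpose": "school_location", "logsum": 6.4},
    {"proto_household_id": 2, "purpose": "workplace_location", "logsum": 9.0},
]
wide_table = pivot_logsums(rows)
```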

Usage

The disaggregate accessibility model is run as a model step in the model list. There are two necessary steps:

  • initialize_proto_population

  • compute_disaggregate_accessibility

The reason the steps must be separate is to enable multiprocessing. The proto-population must be fully generated and initialized before activitysim slices the tables into separate threads. These steps must also occur before initialize_households in order to avoid conflict with the shadow_pricing model.
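The ordering constraints above can be checked mechanically before a run. The sketch below is a hypothetical helper, not part of ActivitySim; it assumes the model list has already been read from the settings file into a plain Python list of step names, and it assumes the proto-population is initialized before accessibilities are computed.

```python
PROTO_STEPS = ["initialize_proto_population", "compute_disaggregate_accessibility"]


def check_step_order(models):
    """Return a list of ordering problems found in a model list (empty if none)."""
    problems = [f"missing step: {step}" for step in PROTO_STEPS if step not in models]
    if problems:
        return problems

    init_idx = models.index("initialize_proto_population")
    compute_idx = models.index("compute_disaggregate_accessibility")
    if init_idx > compute_idx:
        problems.append(
            "initialize_proto_population must precede compute_disaggregate_accessibility"
        )
    if "initialize_households" in models:
        hh_idx = models.index("initialize_households")
        if max(init_idx, compute_idx) > hh_idx:
            problems.append("both steps must precede initialize_households")
    return problems


# Illustrative model list containing only the steps named in this section.
example_models = [
    "initialize_proto_population",
    "compute_disaggregate_accessibility",
    "initialize_households",
]
order_problems = check_step_order(example_models)
```

An empty result means the list satisfies the constraints stated in this section; it says nothing about the other steps a full model run needs.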

The model steps can be run either as part of the ActivitySim model run or set up as a standalone run to pre-compute the accessibility values. For standalone implementations, final_disaggregate_accessibility.csv is read into the pipeline and initialized with the initialize_households model step.

  • Configuration File: disaggregate_accessibility.yaml

  • Core Table: Users define the variables to be generated for ‘PROTO_HOUSEHOLDS’, ‘PROTO_PERSONS’, and ‘PROTO_TOURS’ tables. These tables must include all basic fields necessary for running the actual model. Additional fields can be annotated in pre-processing using the annotation settings of this file.

Configuration#

settings activitysim.abm.models.disaggregate_accessibility.DisaggregateAccessibilitySettings#

Bases: PydanticReadable

Fields:
field BASE_RANDOM_SEED: int = 0#
field CREATE_TABLES: dict[str, DisaggregateAccessibilityTableSettings | str] = {}#
field DESTINATION_SAMPLE_SIZE: float | int = 0#

Number of destination zone alternatives sampled for calculating the destination logsum.

Setting this to zero implies sampling all zones.

Decimal values < 1 will be interpreted as a percentage, e.g., 0.5 = 50% sample.
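The three cases for the sample size (zero, a fraction below 1, and a whole number) can be sketched as a small function. This is an illustration under stated assumptions, not the library's code: the function name is hypothetical, and the choices to round fractional samples up and to cap the result at the number of zones are assumptions the text does not specify.

```python
import math


def resolve_sample_size(setting, n_zones):
    """Translate a sample-size setting into a number of zones to sample."""
    if setting < 0:
        raise ValueError("sample size must be non-negative")
    if setting == 0:
        # Zero means no sampling: use every zone.
        return n_zones
    if setting < 1:
        # Fractions are read as a share of all zones (rounding up is an assumption).
        return max(1, math.ceil(setting * n_zones))
    # Whole numbers are an absolute count (capping at n_zones is an assumption).
    return min(int(setting), n_zones)


print(resolve_sample_size(0, 200))    # prints 200
print(resolve_sample_size(0.5, 200))  # prints 100
print(resolve_sample_size(50, 200))   # prints 50
```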

field FROM_TEMPLATES: bool = False#
field MERGE_ON: dict[str, list[str]] [Required]#

Fields used to merge the proto-population logsums onto the full synthetic population.

The proto-population should be designed so that the logsums can be joined exactly to the full population on the variables specified here. Users specify the columns to join on using:

  • by: An exact merge will be attempted using these discrete variables.

  • asof [optional]: The model can perform an “asof” join for continuous variables, which finds the nearest value. This method should not be necessary since synthetic populations are all discrete.

  • method [optional]: Optional join method; can be “soft”, default is None. For cases where a full inner join is not possible, a Naive Bayes clustering method is available; it is fast but discretely constrained. The proto-population is treated as the “training data” to match each synthetic population value to the best possible proto-population candidate. Some refinement may be necessary to make this procedure work.
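To make the exact join on the by variables concrete, the sketch below attaches proto-population logsums to synthetic households using a key built from the by columns, and reports any households with no exact match. It is a simplified illustration, not ActivitySim's merge logic; the function name, the column names (home_zone_id, auto_ownership, workplace_location_accessibility), and all values are hypothetical.

```python
def merge_logsums(population, proto_logsums, by):
    """Exact join of proto-population logsums onto population records.

    Returns (merged, unmatched); unmatched records had no proto row
    with identical values in every `by` column.
    """
    lookup = {tuple(row[k] for k in by): row for row in proto_logsums}
    merged, unmatched = [], []
    for record in population:
        match = lookup.get(tuple(record[k] for k in by))
        if match is None:
            unmatched.append(record)
            continue
        # Copy over only the non-key columns (the logsums).
        extra = {k: v for k, v in match.items() if k not in by}
        merged.append({**record, **extra})
    return merged, unmatched


# Illustrative records only; these are not model results.
population = [
    {"household_id": 1, "home_zone_id": 10, "auto_ownership": 0},
    {"household_id": 2, "home_zone_id": 10, "auto_ownership": 2},
    {"household_id": 3, "home_zone_id": 99, "auto_ownership": 1},
]
proto_logsums = [
    {"home_zone_id": 10, "auto_ownership": 0, "workplace_location_accessibility": 8.1},
    {"home_zone_id": 10, "auto_ownership": 2, "workplace_location_accessibility": 9.4},
]
merged, unmatched = merge_logsums(
    population, proto_logsums, by=["home_zone_id", "auto_ownership"]
)
```

Here household 3 ends up in unmatched because no proto-population row shares its zone and auto ownership. That is the situation the proto-population design is meant to avoid, and the one the “asof” and “soft” options exist to handle.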

field NEAREST_METHOD: str = 'skims'#
field ORIGIN_SAMPLE_METHOD: Literal[None, 'full', 'uniform', 'uniform-taz', 'kmeans'] = None#

The method in which origins are sampled.

Population weighted sampling can be TAZ-based or “TAZ-agnostic” using KMeans clustering. The potential advantage of KMeans is to provide a more geographically even spread of MAZs sampled that do not rely on TAZ hierarchies. Unweighted sampling is also possible using ‘uniform’ and ‘uniform-taz’.

  • None [Default] - Sample zones weighted by population, ensuring at least one MAZ is sampled per TAZ when the sample size allows. If n-samples > n-tazs, sample 1 MAZ from each TAZ until n-remaining-samples < n-tazs, then sample n-remaining-samples TAZs and sample an MAZ within each of those TAZs. If n-samples < n-tazs, the procedure skips directly to that final step.

  • “kmeans” - K-Means clustering is performed on the zone centroids (must be provided as maz_centroids.csv), weighted by population. The clustering yields k XY coordinates weighted by zone population, where k-clusters equals the n-samples specified. Once the k new cluster centroids are found, each is snapped to the nearest available zone centroid, and accessibilities are calculated for those zones. By default, the k-means method is run on 10 different initial cluster seeds (n_init) using the [k-means++ seeding algorithm](https://en.wikipedia.org/wiki/K-means%2B%2B). The k-means method runs for max_iter iterations (default=300).

  • “uniform” - Unweighted sample of N zones independent of each other.

  • “uniform-taz” - Unweighted sample of 1 zone per taz up to the N samples specified.
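The default (None) procedure is easier to follow as code. The sketch below is one reading of that description, written with the standard library only; it is not ActivitySim's implementation. The function names are hypothetical, and sampling without replacement, weighting TAZs by the population of their still-unsampled MAZs, and falling back to uniform picks when all weights are zero are assumptions.

```python
import random


def weighted_pick(items, weights, rng):
    """Pick one item proportionally to its weight (uniform if all weights are zero)."""
    if sum(weights) <= 0:
        return rng.choice(items)
    return rng.choices(items, weights=weights, k=1)[0]


def sample_origins(maz_to_taz, population, n_samples, seed=0):
    """Population-weighted MAZ sample that covers TAZs before repeating any."""
    rng = random.Random(seed)
    taz_to_mazs = {}
    for maz, taz in maz_to_taz.items():
        taz_to_mazs.setdefault(taz, []).append(maz)

    n_samples = min(n_samples, len(maz_to_taz))
    chosen = set()
    while len(chosen) < n_samples:
        remaining = n_samples - len(chosen)
        # MAZs not yet sampled, grouped by TAZ; drop exhausted TAZs.
        available = {
            taz: [m for m in mazs if m not in chosen]
            for taz, mazs in taz_to_mazs.items()
        }
        available = {taz: mazs for taz, mazs in available.items() if mazs}
        tazs = list(available)
        if remaining < len(tazs):
            # Final pass: draw `remaining` distinct TAZs, weighted by population.
            picked = []
            for _ in range(remaining):
                weights = [sum(population[m] for m in available[t]) for t in tazs]
                taz = weighted_pick(tazs, weights, rng)
                tazs.remove(taz)
                picked.append(taz)
            tazs = picked
        # Draw one population-weighted MAZ from each TAZ in this pass.
        for taz in tazs:
            mazs = available[taz]
            chosen.add(weighted_pick(mazs, [population[m] for m in mazs], rng))
    return sorted(chosen)


# Illustrative zone system: six MAZs nested in three TAZs.
maz_to_taz = {101: 1, 102: 1, 103: 1, 201: 2, 202: 2, 301: 3}
population = {101: 50, 102: 10, 103: 0, 201: 30, 202: 30, 301: 5}
origins = sample_origins(maz_to_taz, population, n_samples=4)
```

With four samples and three TAZs, the first pass takes one MAZ from every TAZ and the final pass draws one more MAZ from a single population-weighted TAZ.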

field ORIGIN_SAMPLE_SIZE: float | int = 0#

The number of sampled origins where logsum is calculated.

Setting this to zero implies sampling all zones.

Origins without a logsum will draw from the nearest zone with a logsum. This parameter is useful for systems with a large number of zones with similar accessibility. Fractional values less than 1 will be interpreted as a percentage, e.g., 0.5 = 50% sample.
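The fallback for unsampled origins can be sketched as a nearest-neighbour lookup. The example below is an illustration only: the function name is hypothetical, the distance dictionary stands in for whatever impedance measure is used (presumably a skim, given the NEAREST_METHOD default of 'skims'), and the logsum and distance values are made up.

```python
def fill_from_nearest(all_zones, sampled_logsums, distance):
    """Give every zone a logsum, borrowing from the nearest sampled zone if needed.

    `distance` maps (origin_zone, sampled_zone) pairs to an impedance value.
    """
    filled = {}
    for zone in all_zones:
        if zone in sampled_logsums:
            filled[zone] = sampled_logsums[zone]
        else:
            # Borrow the logsum of the closest zone that was actually sampled.
            nearest = min(sampled_logsums, key=lambda s: distance[(zone, s)])
            filled[zone] = sampled_logsums[nearest]
    return filled


# Illustrative values only; zones 1 and 4 were sampled, zones 2 and 3 were not.
sampled_logsums = {1: 5.2, 4: 3.1}
distance = {(2, 1): 1.0, (2, 4): 3.0, (3, 1): 2.5, (3, 4): 0.8}
filled = fill_from_nearest([1, 2, 3, 4], sampled_logsums, distance)
print(filled)  # prints {1: 5.2, 2: 5.2, 3: 3.1, 4: 3.1}
```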

field ORIGIN_WEIGHTING_COLUMN: str [Required]#
field add_size_tables: bool = True#
field annotate_proto_tables: list[DisaggregateAccessibilityAnnotateSettings] = []#

Allows modification of the proto-population.

Annotation configurations are available here, if users wish to modify the proto-population beyond basic generation in the YAML.

field suffixes: DisaggregateAccessibilitySuffixes = DisaggregateAccessibilitySuffixes(source_file_paths=None, SUFFIX='proto_', ROOTS=['persons', 'households', 'tours', 'persons_merged', 'person_id', 'household_id', 'tour_id'])#
field zone_id_names: dict[str, str] = {'index_col': 'zone_id'}#

Examples#

Implementation#