General purpose modeling tools

“Tools” can include, inter alia:

  • Code for retrieving data from specific data sources and adapting it for use with message_ix_models.

  • Code for modifying scenarios. Tools for building models, however, belong in message_ix_models.model.

Exogenous data (tools.exo_data)

Generic tools for working with exogenous data sources.

MEASURES

Supported measures.

SOURCES

Known sources for data.

DemoSource(source, source_kw)

Example source of exogenous population and GDP data.

ExoDataSource(source, source_kw)

Base class for sources of exogenous data.

prepare_computer(context, c[, source, source_kw])

Prepare c to compute GDP, population, or other exogenous data.

register_source(cls)

Register cls as a source of exogenous data.

class message_ix_models.tools.exo_data.DemoSource(source, source_kw)[source]

Example source of exogenous population and GDP data.

Parameters:
  • source (str) – Must be of the form test s1, where “s1” is one of the scenario IDs “s0” … “s4”.

  • source_kw (dict) – Must contain an element “measure”, one of MEASURES.

id: str = 'DEMO'

Identifier for this particular source.

static random_data()[source]

Generate some random data with n, y, s, and v dimensions.

message_ix_models.tools.exo_data.MEASURES = ('GDP', 'POP')

Supported measures.

Todo

Store this in a separate code list or concept scheme.

message_ix_models.tools.exo_data.SOURCES: Dict[str, Type[ExoDataSource]] = {'DEMO': <class 'message_ix_models.tools.exo_data.DemoSource'>}

Known sources for data. Use register_source() to add to this collection.

message_ix_models.tools.exo_data.prepare_computer(context, c: Computer, source='test', source_kw: Mapping | None = None)[source]

Prepare c to compute GDP, population, or other exogenous data.

Returns a tuple of keys. The first, like {m}:n-y, triggers the following computations:

  1. Load data by invoking an ExoDataSource.

  2. Aggregate on the n (node) dimension according to Config.regions.

  3. Interpolate on the y (year) dimension according to Config.years.
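The aggregation and interpolation steps above can be illustrated with a small standalone sketch. It does not use message_ix_models or genno; the node-to-region mapping, the data values, and the helper names aggregate_n and interpolate_y are all hypothetical. It also assumes that aggregation is a sum and that interpolation is linear, neither of which is stated on this page.

```python
from collections import defaultdict

# Hypothetical raw data keyed by (node, year); the values are made up.
raw = {
    ("AT", 2020): 1.0, ("AT", 2030): 2.0,
    ("DE", 2020): 3.0, ("DE", 2030): 5.0,
}
# Hypothetical mapping from country nodes to a model region.
region_of = {"AT": "R_EUR", "DE": "R_EUR"}


def aggregate_n(data, mapping):
    """Sum values over all nodes that map to the same region (assumed: sum)."""
    out = defaultdict(float)
    for (node, year), value in data.items():
        out[(mapping[node], year)] += value
    return dict(out)


def interpolate_y(data, years):
    """Linearly interpolate each node's series onto the requested years."""
    by_node = defaultdict(dict)
    for (node, year), value in data.items():
        by_node[node][year] = value

    out = {}
    for node, series in by_node.items():
        known = sorted(series)
        for y in years:
            if y in series:
                out[(node, y)] = series[y]
                continue
            # Find the known years that bracket y; skip y if it lies outside.
            lower = [k for k in known if k < y]
            upper = [k for k in known if k > y]
            if not lower or not upper:
                continue
            y0, y1 = lower[-1], upper[0]
            w = (y - y0) / (y1 - y0)
            out[(node, y)] = series[y0] + w * (series[y1] - series[y0])
    return out


result = interpolate_y(aggregate_n(raw, region_of), [2020, 2025, 2030])
```

In the package itself, the regions and years come from Config.regions and Config.years rather than from literals like these.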

Additional key(s) include:

  • {m}:n-y:y0 indexed: same as {m}:n-y, indexed to values as of y0, that is, the first model year.
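As a rough illustration of what "indexed to values as of y0" might mean, the sketch below divides each value by the same node's value in the base year. Treating the index as a ratio is an assumption; the data and the helper name index_to are hypothetical and not part of message_ix_models.

```python
# Hypothetical (node, year) values; the numbers are made up.
data = {("R_EUR", 2020): 4.0, ("R_EUR", 2025): 5.5, ("R_EUR", 2030): 7.0}


def index_to(data, y0):
    """Divide every value by the same node's value in year y0."""
    base = {node: value for (node, year), value in data.items() if year == y0}
    return {
        (node, year): value / base[node] for (node, year), value in data.items()
    }


indexed = index_to(data, 2020)
```

Under this assumption every node has the value 1.0 in the base year, and later years express growth relative to it.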

Todo

Extend to also prepare to compute values indexed to a particular n.

Parameters:
  • source (str) – Identifier of the source, possibly with other information to be handled by an ExoDataSource.

  • source_kw (dict, optional) –

    Keyword arguments for a Source class. These can include indexers, selectors, or other information needed by the source class to identify the data to be returned.

    If the key “measure” is present, it must be one of MEASURES.

message_ix_models.tools.exo_data.register_source(cls: Type[ExoDataSource]) Type[ExoDataSource][source]

Register cls as a source of exogenous data.
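Because register_source() takes a class and returns a class, it can plausibly be applied as a class decorator. The standalone sketch below shows that registry pattern in general terms. REGISTRY, register, and MySource are hypothetical stand-ins, not the real SOURCES or register_source(), and the duplicate check is an assumption about reasonable behaviour rather than documented behaviour.

```python
from typing import Dict

# Stand-in for SOURCES; hypothetical, not the real module-level dict.
REGISTRY: Dict[str, type] = {}


def register(cls: type) -> type:
    """Store cls under its ``id`` attribute and return it unchanged."""
    if cls.id in REGISTRY:
        # Assumed behaviour: refuse to overwrite an existing entry.
        raise ValueError(f"Duplicate source ID {cls.id!r}")
    REGISTRY[cls.id] = cls
    return cls


@register
class MySource:
    # Plays the role of the ``id`` class attribute documented above.
    id = "MY_SOURCE"
```

Returning the class unchanged is what lets the decorator form work without altering the decorated class.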

class message_ix_models.tools.exo_data.ExoDataSource(source: str, source_kw: Mapping)[source]

Base class for sources of exogenous data.

abstract __call__() Quantity[source]

Return the data.

The Quantity returned by this method must have dimensions “n” and “y”. If the original/upstream/raw data has additional dimensions or different dimension IDs, the code must transform these, make appropriate selections, etc.

abstract __init__(source: str, source_kw: Mapping) None[source]

Handle source and source_kw.

An implementation must:

  • Raise ValueError if it does not recognize or cannot handle the arguments in source or source_kw.

  • Recognize and handle (if possible) a “measure” keyword in source_kw, whose value is one of MEASURES.

It may:

  • Transform these into other values, for instance by mapping certain values to others, applying regular expressions, or other operations.

  • Store those values as instance attributes for use in __call__(), below.

It should not actually load data or perform any time- or memory-intensive operations.
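The division of labour between __init__() and __call__() can be sketched without importing the package. SourceBase and ToySource below are hypothetical stand-ins for ExoDataSource and a subclass. The "test s1" pattern follows the DemoSource description on this page, and the returned dict is only a placeholder for the Quantity that a real source must return.

```python
import re
from abc import ABC, abstractmethod
from typing import Mapping

MEASURES = ("GDP", "POP")  # Copied from the documented value on this page.


class SourceBase(ABC):
    """Hypothetical stand-in for ExoDataSource."""

    @abstractmethod
    def __init__(self, source: str, source_kw: Mapping) -> None: ...

    @abstractmethod
    def __call__(self): ...


class ToySource(SourceBase):
    def __init__(self, source: str, source_kw: Mapping) -> None:
        # Reject anything that does not look like "test s0" ... "test s4".
        match = re.fullmatch(r"test (s[0-4])", source)
        if match is None:
            raise ValueError(f"Unrecognized source {source!r}")
        measure = source_kw.get("measure")
        if measure not in MEASURES:
            raise ValueError(f"Unsupported measure {measure!r}")
        # Only store lightweight values; no data is loaded here.
        self.scenario = match.group(1)
        self.measure = measure

    def __call__(self):
        # Expensive loading would happen here; this returns placeholder data
        # keyed by (n, y) instead of a real Quantity.
        return {("n0", 2020): 1.0}
```

Keeping __init__() cheap means that a caller can try several source classes and discard those that raise ValueError without paying for any data loading.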

id: str = ''

Identifier for this particular source.

ADVANCE data (tools.advance)

get_advance_data([query])

Return data from the ADVANCE Work Package 2 data snapshot at LOCATION.

advance_data(variable[, query])

Return a single ADVANCE data variable as a genno.Quantity.

message_ix_models.tools.advance.LOCATION = ('advance', 'advance_compare_20171018-134445.csv.zip')

Expected location of the ADVANCE WP2 data snapshot.

This is a location relative to a parent directory. The specific parent directory depends on whether message_data is available:

Without message_data:

The code finds the data within (3) Other, system-specific (“local”) directories (see discussion there for how to configure this location). Users should:

  1. Visit https://tntcat.iiasa.ac.at/ADVANCEWP2DB/dsd?Action=htmlpage&page=about and register for access to the data.

  2. Log in.

  3. Download the snapshot with the file name given in LOCATION to a subdirectory advance/ within their local data directory.

With message_data:

The code finds the data within (2) data/ directory in the message_data repo. The snapshot is stored directly in the repository using Git LFS.
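Since LOCATION is a tuple of path components relative to a parent directory, the full path can be formed by joining the components onto that parent. The sketch below uses only pathlib; the parent directory shown is a hypothetical placeholder, not a path that the package defines.

```python
from pathlib import Path

# Value copied from the documentation of LOCATION above.
LOCATION = ("advance", "advance_compare_20171018-134445.csv.zip")

# Hypothetical local data directory; substitute the configured one.
local_data = Path("/path/to/local/data")

# Join the relative components onto the parent directory.
snapshot_path = local_data.joinpath(*LOCATION)
```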

Handle data from the ADVANCE project.

message_ix_models.tools.advance.DIMS = ['model', 'scenario', 'region', 'variable', 'unit', 'year']

Standard dimensions for data produced as snapshots from the IIASA ENE Program “WorkDB”.

Todo

Move to a common location for use with other snapshots in the same format.

message_ix_models.tools.advance.NAME = 'advance_compare_20171018-134445.csv'

Name of the data file within the archive.

message_ix_models.tools.advance._fuzz_data(size=100.0, include: List[Tuple[str, str]] = [])[source]

Select a subset of the data for use in testing.

Parameters:
  • size (numeric) – Number of rows to include.

  • include (sequence of 2-tuple (str, str)) – (variable name, unit) to include. The data will be partly duplicated to ensure the given variable name(s) are included.

message_ix_models.tools.advance._read_workdb_snapshot(path: Path, name: str) Series[source]

Read the data file.

The expected format is a ZIP archive at path containing a member at name in CSV format, with columns corresponding to DIMS, except for “year”, which is stored as column headers (‘wide’ format). (This corresponds to an older version of the “IAMC format,” without more recent additions intended to represent sub-annual time resolution using a separate column.)
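To make the "wide" layout concrete, the sketch below builds a tiny ZIP archive in memory and converts its CSV member to long form, using only the standard library. It is not the implementation of _read_workdb_snapshot(): the member name, the single data row, and the helper read_wide are made up, and the real file's column names may be capitalized differently from DIMS.

```python
import csv
import io
import zipfile

DIMS = ["model", "scenario", "region", "variable", "unit", "year"]

# Build a tiny in-memory archive; the row is made up for illustration.
wide_csv = (
    "model,scenario,region,variable,unit,2020,2030\n"
    "M,S,World,Population,million,7800,8500\n"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("example.csv", wide_csv)


def read_wide(archive, name):
    """Read CSV member `name`; return {(model, ..., unit, year): value}."""
    id_cols = DIMS[:-1]  # Every dimension except "year" has its own column.
    long = {}
    with zipfile.ZipFile(archive) as zf, zf.open(name) as f:
        text = io.TextIOWrapper(f, encoding="utf-8", newline="")
        for row in csv.DictReader(text):
            key = tuple(row[c] for c in id_cols)
            for col, value in row.items():
                if col in id_cols or value == "":
                    continue
                # Remaining column headers are years.
                long[key + (int(col),)] = float(value)
    return long


buffer.seek(0)
data = read_wide(buffer, "example.csv")
```

Each year column becomes one entry in the result, so that "year" turns into a key component alongside the other dimensions in DIMS.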

Todo

Move to a general location for use with other files in the same format.

Data returned by this function is cached using cached(); see also SKIP_CACHE.

message_ix_models.tools.advance.advance_data(variable: str, query: str | None = None) Quantity[source]

Return a single ADVANCE data variable as a genno.Quantity.

Parameters:

query (str, optional) – Passed to get_advance_data().

Returns:

with the dimensions DIMS and name variable. If the units of the data for variable are consistent and parseable by pint, the returned Quantity has these units; otherwise units are discarded and the returned Quantity is dimensionless.

Return type:

genno.Quantity
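The "consistent units" condition can be pictured as a check that all rows for the variable share one unit string. The sketch below covers only that part, and does not attempt the pint parsing step. The helper name common_unit and the use of an empty string for "dimensionless" are hypothetical choices.

```python
from typing import Iterable


def common_unit(units: Iterable[str]) -> str:
    """Return the single shared unit, or "" (dimensionless) if they differ."""
    unique = set(units)
    return unique.pop() if len(unique) == 1 else ""
```

For example, common_unit(["EJ/yr", "EJ/yr"]) gives "EJ/yr", while a mix of different unit strings gives the empty string.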

message_ix_models.tools.advance.get_advance_data(query: str | None = None) Series[source]

Return data from the ADVANCE Work Package 2 data snapshot at LOCATION.

Parameters:

query (str, optional) – Passed to pandas.DataFrame.query() to limit the returned values.

Returns:

with a pandas.MultiIndex having the levels DIMS.

Return type:

pandas.Series

Data returned by this function is cached using cached(); see also SKIP_CACHE.