General purpose modeling tools (tools)
“Tools” can include, inter alia:
Codes for retrieving data from specific data sources and adapting it for use with message_ix_models.
Codes for modifying scenarios; although tools for building models should go in message_ix_models.model.
Exogenous data (tools.exo_data)
Generic tools for working with exogenous data sources.
The tools in this module support use of data from arbitrary sources and formats in model-building code.
For each source/format, a subclass of ExoDataSource adds tasks to a genno.Computer that retrieve/load and transform the source data into genno.Quantity.
An example using one such class, message_ix_models.project.advance.data.ADVANCE:
from genno import Computer

from message_ix_models.project.advance.data import ADVANCE

# Keyword arguments corresponding to ADVANCE.Options
kw = dict(
    measure="Transport|Service demand|Road|Passenger|LDV",
    model="MESSAGE",
    scenario="ADV3TRAr2_Base",
)

# Add tasks to retrieve and transform data.
# `context` is an existing message_ix_models Context instance.
c = Computer()
keys = c.apply(ADVANCE.add_tasks, context=context, **kw)

# Retrieve some of the data
q_result = c.get(keys[0])

# Pass the data into further calculations.
# `k_other` is the key of another quantity already added to c.
c.add("derived", "mul", keys[1], k_other)
MEASURES | Measures recognized by some data sources.
SOURCES | Registered sources for data.
BaseOptions | Options for a concrete ExoDataSource subclass.
DemoSource | Example source of exogenous population and GDP data.
ExoDataSource | Abstract class for sources of exogenous data.
add_structure | Add structural information to c.
prepare_computer | Prepare c to compute GDP, population, or other exogenous data.
register_source | Register ExoDataSource cls as a source of exogenous data.
- class message_ix_models.tools.exo_data.BaseOptions(aggregate: bool = True, interpolate: bool = True, measure: str = '', name: str = '', dims: tuple[str, ...] = ('n', 'y'))[source]
Options for a concrete ExoDataSource subclass.
- aggregate: bool = True
True if ExoDataSource.transform() should aggregate data on the \(n\) dimension.
- classmethod from_args(source_id: str | ExoDataSource, *args, **kwargs)[source]
Construct an instance from keyword arguments.
- Parameters:
source_id – For backwards-compatibility with prepare_computer().
- interpolate: bool = True
True if ExoDataSource.transform() should interpolate data on the \(y\) dimension.
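As a minimal sketch of how an options class might be extended, the following uses a stand-in dataclass that mirrors the BaseOptions signature shown above. That BaseOptions behaves like a dataclass is an assumption based on that signature; BaseOptionsSketch and MyOptions are hypothetical names, not part of message_ix_models.

```python
from dataclasses import dataclass


@dataclass
class BaseOptionsSketch:
    """Stand-in mirroring the BaseOptions signature (not the library class)."""

    aggregate: bool = True
    interpolate: bool = True
    measure: str = ""
    name: str = ""
    dims: tuple[str, ...] = ("n", "y")


@dataclass
class MyOptions(BaseOptionsSketch):
    """Hypothetical subclass: changes one default and adds one option."""

    interpolate: bool = False  # changed default value
    scenario: str = ""  # additional option, similar to DemoSource.Options


# Unspecified options keep their (possibly overridden) defaults
opts = MyOptions(measure="POP", scenario="SSP2")
```

With the real classes, such a subclass would be set as the Options attribute of a concrete ExoDataSource subclass.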
- class message_ix_models.tools.exo_data.DemoSource(*args, **kwargs)[source]
Example source of exogenous population and GDP data.
- class Options(aggregate: bool = True, interpolate: bool = True, measure: str = '', name: str = '', dims: tuple[str, ...] = ('n', 'y'), scenario: str = '')[source]
- get() AnyQuantity [source]
Return the data.
Implementations in concrete classes may load data from file, retrieve from remote sources or local caches, generate data, or anything else.
The Quantity returned by this method must have dimensions corresponding to key. If the original/upstream/raw data has different dimensionality (fewer or more dimensions; different dimension IDs), a concrete class must transform these, make appropriate selections, etc.
- message_ix_models.tools.exo_data.MEASURES = ('GDP', 'POP')
Measures recognized by some data sources. Concrete ExoDataSource subclasses may provide support for other measures.
Todo: Store this in a separate code list or concept scheme.
- message_ix_models.tools.exo_data.SOURCES: dict[str, type[ExoDataSource]] = {'ADVANCE': <class 'message_ix_models.project.advance.data.ADVANCE'>, 'DemoSource': <class 'message_ix_models.tools.exo_data.DemoSource'>, 'GEA': <class 'message_ix_models.project.gea.data.GEA'>, 'GFEI': <class 'message_ix_models.tools.gfei.GFEI'>, 'IEA_EEI': <class 'message_ix_models.tools.iea.eei.IEA_EEI'>, 'IEA_EWEB': <class 'message_ix_models.tools.iea.web.IEA_EWEB'>, 'PRICE_EMISSION': <class 'message_ix_models.model.emissions.PRICE_EMISSION'>, 'SHAPE': <class 'message_ix_models.project.shape.data.SHAPE'>, 'SSPOriginal': <class 'message_ix_models.project.ssp.data.SSPOriginal'>, 'SSPUpdate': <class 'message_ix_models.project.ssp.data.SSPUpdate'>}
Registered sources for data. Use register_source() to add to this collection.
- message_ix_models.tools.exo_data.add_structure(c: Computer, *, context: Context, strict: bool = True) None [source]
Add structural information to c.
Helper for ExoDataSource.add_tasks() and prepare_computer().
The added tasks include:
“context”: context, if not already set.
“n::codes”: get_codes() for the node code list according to Config.regions.
“n::groups”: codelist_to_groups() called on “n::codes”.
“y”: list of periods according to Config.years, if not already set.
“y0”: first element of “y”.
“y::coords”: dict mapping str("y") to the elements of “y”.
“yv::coords”: dict mapping str("yv") to the elements of “y”.
“y0::coord”: dict mapping str("y") to “y0”.
- Parameters:
strict – if True, raise exceptions if the keys to be added are already in c.
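To make the period-related entries concrete, here is a plain-Python sketch of the values they would hold for a given list of periods. The function y_structure is hypothetical and only illustrates the shapes described above; the real add_structure() adds these as tasks on a genno.Computer rather than returning a dict.

```python
def y_structure(periods: list[int]) -> dict[str, object]:
    """Return the values described for the period-related keys."""
    y0 = periods[0]  # "y0": first element of "y"
    return {
        "y": periods,
        "y0": y0,
        "y::coords": {"y": periods},  # dict mapping "y" to the elements of "y"
        "yv::coords": {"yv": periods},  # dict mapping "yv" to the elements of "y"
        "y0::coord": {"y": y0},  # dict mapping "y" to "y0"
    }


# Illustrative periods only
tasks = y_structure([2020, 2025, 2030])
```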
- message_ix_models.tools.exo_data.register_source(cls: type[ExoDataSource], *, id: str | None = None) type[ExoDataSource] [source]
Register ExoDataSource cls as a source of exogenous data.
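Because the function returns the class it is given, it can plausibly be used as a class decorator. The sketch below shows that registry pattern with stand-in names (SOURCES_SKETCH, register_source_sketch); defaulting the ID to the class name is an assumption suggested by the keys of SOURCES, not a statement about the library's implementation.

```python
from typing import Optional

# Stand-in for the SOURCES registry
SOURCES_SKETCH: dict[str, type] = {}


def register_source_sketch(cls: type, *, id: Optional[str] = None) -> type:
    """Store cls under id (assumed default: the class name) and return it."""
    SOURCES_SKETCH[id or cls.__name__] = cls
    return cls


@register_source_sketch
class MySource:
    """Hypothetical data source class."""


class OtherSource:
    """Hypothetical data source registered under an explicit ID."""


register_source_sketch(OtherSource, id="custom-id")
```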
- class message_ix_models.tools.exo_data.ExoDataSource(*args, **kwargs)[source]
Abstract class for sources of exogenous data.
As an abstract class ExoDataSource must be subclassed to be used. Concrete subclasses must implement at least the get() method that performs the loading of the raw data when executed, and may override others, as described below.
The class method ExoDataSource.add_tasks() adds tasks to a genno.Computer. It returns a genno.Key that refers to the loaded and transformed data. This method usually should not be modified for subclasses.
The behaviour of a subclass can be customized in these ways:
Create a subclass of BaseOptions and set it as the Options class attribute.
Override __init__(), which receives keyword arguments via add_tasks().
Override transform(), which is called to add further tasks which will transform the data.
See the documentation for these methods and attributes for further details.
- Options
Class defining per-instance options understood by this data source.
A concrete class may override this with a subclass of BaseOptions. That subclass may change the default values of any attributes of BaseOptions, or add others.
alias of BaseOptions
- __init__(*args, **kwargs) None [source]
Create an instance and prepare info for transform()/get().
The base implementation:
Sets options (if not already set) by passing kwargs to Options.
Raises an exception if there are other/unhandled args or kwargs.
If key is not set, constructs it. Subclasses may pre-empt this behaviour by setting key statically or dynamically.
A concrete class implementation must:
Set options, either directly or by calling super().__init__() with or without keyword arguments.
Set key, either directly or by calling super().__init__(). In the latter case, it may set name, measure, and/or dims to control the behaviour.
Raise an exception if unrecognized or invalid kwargs are passed.
and may:
Transform kwargs or options arguments into other values, for instance by mapping certain values to others, applying regular expressions, or other operations.
Store those values as instance attributes for use in get().
Log messages that give information that helps to debug exceptions.
It must not perform any time- or memory-intensive operations, such as actually loading or fetching data. Those operations should be in get().
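The division of labour between __init__() and get() can be illustrated with a small stand-alone class. This is a sketch of the pattern only: CheapInitSource, its KNOWN_MEASURES attribute, and the placeholder value are hypothetical, and the class does not derive from ExoDataSource.

```python
class CheapInitSource:
    """Hypothetical source: validates options in __init__, loads in get()."""

    KNOWN_MEASURES = ("GDP", "POP")

    def __init__(self, *, measure: str = "", **kwargs) -> None:
        # Raise an exception for unrecognized keyword arguments
        if kwargs:
            raise TypeError(f"Unhandled keyword arguments: {sorted(kwargs)}")
        if measure not in self.KNOWN_MEASURES:
            raise ValueError(f"Unrecognized measure {measure!r}")
        # Store cheap, validated values for later use in get()
        self.measure = measure
        self.loaded = False

    def get(self) -> dict:
        # Time- or memory-intensive work belongs here, not in __init__
        self.loaded = True
        return {("World", 2020): 1.0}  # placeholder, not real data


source = CheapInitSource(measure="POP")
```

Constructing the instance stays cheap, so invalid arguments are reported before any data is loaded or fetched.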
- classmethod _where() list[str | Path] [source]
Helper for __init__() methods in concrete classes.
Return where. If use_test_data is True, also append "test".
- classmethod add_tasks(c: Computer, *args, context: Context | None = None, strict: bool = True, **kwargs) tuple [source]
Add tasks to c to provide and transform the data.
The first returned key is key, and will trigger the following tasks:
1. Load or retrieve data by invoking ExoDataSource.get().
2. If BaseOptions.aggregate is True, aggregate on the \(n\) (node) dimension according to Config.regions.
3. If BaseOptions.interpolate is True, interpolate on the \(y\) (year) dimension according to Config.years.
Steps (2) and (3) are added by transform() and may differ in concrete classes.
Other returned keys include further transformations:
key + "y0_indexed": same as key, but indexed to the values as of the first model period.
Other keys that are created but not returned can be accessed on c:
key + "message_ix_models.foo.bar.CLASS": the raw data, with a tag from the fully-qualified name of the ExoDataSource class.
To support the loading and transformation of data, add_structure() is first called with c.
Todo: Add option/tasks to index to a particular label on the \(n\) dimension.
- Parameters:
context – Passed to add_structure().
strict – Passed to add_structure().
- Return type:
tuple
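One common reading of "indexed to the values as of the first model period" is that every value is divided by the value in period y0, so that the first period equals 1. The following sketch assumes that reading; the exact operation used by the library is not stated here, and index_to_y0 and the numbers are purely illustrative.

```python
def index_to_y0(values: dict[int, float], y0: int) -> dict[int, float]:
    """Divide each value by the value in the first model period y0."""
    base = values[y0]
    return {y: v / base for y, v in values.items()}


# Illustrative values only
indexed = index_to_y0({2020: 50.0, 2030: 60.0, 2040: 75.0}, y0=2020)
```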
- abstractmethod get() AnyQuantity [source]
Return the data.
Implementations in concrete classes may load data from file, retrieve from remote sources or local caches, generate data, or anything else.
The Quantity returned by this method must have dimensions corresponding to key. If the original/upstream/raw data has different dimensionality (fewer or more dimensions; different dimension IDs), a concrete class must transform these, make appropriate selections, etc.
- key: Key
Key for the returned Quantity. This may either be set statically on a concrete subclass, or created via __init__().
- options: BaseOptions
Instance of the Options class.
A concrete class that overrides Options should redefine this attribute, to facilitate type checking.
- transform(c: Computer, base_key: Key) Key [source]
Add tasks to c to transform raw data from base_key.
base_key refers to the Quantity returned by get(). Via add_tasks(), transform() adds additional tasks to c that further transform the data. (Such operations may be done in get() directly, but transform() allows use of genno operators and conveniences.)
In the default implementation:
If aggregate is True, aggregate the data (genno.operator.aggregate()) on the \(n\) dimension using the key “n::groups”.
If interpolate is True, interpolate the data (genno.operator.interpolate()) on the \(y\) dimension using “y::coords”.
Concrete classes may override this method to, for instance, change how aggregate and interpolate are handled, or add further steps. Such overrides may call the base implementation, or not.
- Returns:
referring to the data from base_key after any transformation. This may be the same as base_key.
- Return type:
Key
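The two default steps can be pictured with a plain-Python stand-in. The sketch assumes that aggregation sums the members of each group and that interpolation is linear; aggregate_n, interpolate_y, the group name "R_EU", and all numbers are hypothetical and are not the genno operators themselves.

```python
def aggregate_n(data: dict, groups: dict) -> dict:
    """Sum values keyed by (n, y) over the members of each group."""
    out: dict = {}
    for group, members in groups.items():
        for (n, y), value in data.items():
            if n in members:
                out[(group, y)] = out.get((group, y), 0.0) + value
    return out


def interpolate_y(series: dict, periods: list) -> dict:
    """Linearly interpolate {year: value} at periods within the known range."""
    known = sorted(series)
    result = {}
    for y in periods:
        if y in series:
            result[y] = series[y]
            continue
        lo = max(k for k in known if k < y)  # nearest known year below
        hi = min(k for k in known if k > y)  # nearest known year above
        weight = (y - lo) / (hi - lo)
        result[y] = series[lo] + weight * (series[hi] - series[lo])
    return result


# Illustrative raw data keyed by (node, year)
raw = {("AT", 2020): 1.0, ("DE", 2020): 3.0, ("AT", 2030): 2.0, ("DE", 2030): 4.0}
agg = aggregate_n(raw, {"R_EU": ["AT", "DE"]})
series = {y: v for (_, y), v in agg.items()}
filled = interpolate_y(series, [2020, 2025, 2030])
```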
- use_test_data: bool = False
True to allow the class to look up and use test data. If no test data exists, this setting has no effect. See _where().
- where: list[str | Path] = []
where keyword argument to path_fallback(). See _where().
- message_ix_models.tools.exo_data.prepare_computer(context, c: Computer, source='test', source_kw: Mapping | None = None, *, strict: bool = True) tuple[Key, ...] [source]
Prepare c to compute GDP, population, or other exogenous data.
Check each ExoDataSource in SOURCES to determine whether it recognizes and can handle source and source_kw. If a source is identified, add tasks to c that retrieve and process data into a Quantity with, at least, dimensions \((n, y)\).
Deprecated since version 2025-06-06: Use ExoDataSource.add_tasks() instead. See exo_data.
- Return type:
tuple[Key, ...]
- Raises:
ValueError – if no source is registered which can handle source and source_kw.
Deprecated since version 2025-06-06: Use c.apply(SOURCE.add_tasks, …) as shown above.
IAMC data structures (tools.iamc)
Tools for working with IAMC-structured data.
- message_ix_models.tools.iamc.describe(data: DataFrame, extra: str | None = None) StructureMessage [source]
Generate SDMX structure information from data in IAMC format.
- Parameters:
data – Data in “wide” or “long” IAMC format.
extra (str, optional) – Extra text added to the description of each Codelist.
- Returns:
The message contains one Codelist for each of the MODEL, SCENARIO, REGION, VARIABLE, and UNIT dimensions. Codes for the VARIABLE code list have annotations with id="preferred-unit-measure" that give the corresponding UNIT Code(s) that appear with each VARIABLE.
- Return type:
StructureMessage
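The information gathered for these code lists can be sketched without pandas or sdmx: collect the distinct labels per dimension, and for each VARIABLE the UNIT labels it appears with. The function summarize and the rows are hypothetical illustrations; describe() itself returns an SDMX StructureMessage rather than plain dicts.

```python
from collections import defaultdict

DIMS = ("MODEL", "SCENARIO", "REGION", "VARIABLE", "UNIT")


def summarize(rows: list) -> tuple:
    """Return (labels per dimension, units appearing with each variable)."""
    codes = {dim: sorted({row[dim] for row in rows}) for dim in DIMS}
    units = defaultdict(set)
    for row in rows:
        units[row["VARIABLE"]].add(row["UNIT"])
    return codes, {var: sorted(u) for var, u in units.items()}


# Illustrative "long" IAMC-like rows; labels are made up
rows = [
    {"MODEL": "m", "SCENARIO": "s", "REGION": "World", "VARIABLE": "Population", "UNIT": "million"},
    {"MODEL": "m", "SCENARIO": "s", "REGION": "World", "VARIABLE": "GDP", "UNIT": "USD"},
    {"MODEL": "m", "SCENARIO": "s", "REGION": "Other", "VARIABLE": "GDP", "UNIT": "EUR"},
]
codes, preferred_units = summarize(rows)
```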
- message_ix_models.tools.iamc.iamc_like_data_for_query(path: pathlib.Path, query: str, *, archive_member: str | None = None, drop: list[str] | None = None, non_iso_3166: Literal['keep', 'discard'] = 'discard', replace: dict | None = None, unique: str = 'MODEL SCENARIO VARIABLE UNIT', **kwargs) AnyQuantity [source]
Load data from path in an IAMC-like format and transform to Quantity.
The steps involved are:
1. Read the data file. Additional kwargs are passed to pandas.read_csv(). By default (unless kwargs explicitly give a different value), pyarrow is used for better performance.
2. Pass the result through to_quantity(), with the parameters query, drop, non_iso_3166, replace, and unique.
3. Cache the result using cached. Subsequent calls with the same arguments will yield the cached result rather than repeating steps (1) and (2).
- Parameters:
archive_member (str, optional) – If given, path may be a tar or ZIP archive with 1 or more members. The member named by archive_member is extracted and read using tarfile.TarFile or zipfile.ZipFile.
- Returns:
of the same structure returned by to_quantity().
- Return type:
AnyQuantity
Data returned by this function is cached using cached(); see also SKIP_CACHE.
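Reading a single named member from an archive can be sketched with the standard library alone. The example builds a small ZIP archive in memory and reads one member as CSV; read_member, the member names, and the values are hypothetical, and the real function reads with pandas.read_csv() instead of the csv module.

```python
import csv
import io
import zipfile


def read_member(archive: io.BytesIO, member: str) -> list:
    """Read one named member of a ZIP archive as CSV rows."""
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as f:
            text = io.TextIOWrapper(f, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


# Build an in-memory archive with two members; contents are illustrative
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(
        "data.csv",
        "MODEL,SCENARIO,REGION,VARIABLE,UNIT,2020\nm,s,World,Population,million,1.5\n",
    )
    zf.writestr("README.txt", "not data")
buffer.seek(0)

rows = read_member(buffer, "data.csv")
```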
Policies (tools.policy)
Policies.
- class message_ix_models.tools.policy.Policy[source]
Base class for policies.
This class has no attributes or public methods. Other modules in message_ix_models:
should subclass Policy to represent different kinds of policy.
may add attributes, methods, etc. to aid with the implementation of those policies in concrete scenarios.
in contrast, may use minimal subclasses as mere flags to be interpreted by other code.
The default implementation of hash() returns the same value for every instance of a subclass. This means that two instances of the same subclass hash equal. See Config.policy.
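The hashing behaviour described above can be sketched with a stand-in base class. Hashing on the type is one way to obtain it and is an assumption here, not necessarily how Policy is implemented; PolicySketch, CarbonTaxFlag, and SubsidyFlag are hypothetical names.

```python
class PolicySketch:
    """Stand-in base class: all instances of one subclass hash equal."""

    def __hash__(self) -> int:
        return hash(type(self))


class CarbonTaxFlag(PolicySketch):
    """Hypothetical policy carrying an attribute."""

    def __init__(self, value: float) -> None:
        self.value = value


class SubsidyFlag(PolicySketch):
    """Hypothetical minimal subclass used as a mere flag."""


a, b = CarbonTaxFlag(1.0), CarbonTaxFlag(2.0)
same_class_hash_equal = hash(a) == hash(b)
```

Note that attribute values do not affect the hash in this sketch: a and b differ in value but hash the same.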
World Bank structures (tools.wb)
Tools for World Bank data.
- message_ix_models.tools.wb.assign_income_groups(cl_node: sdmx.model.common.Codelist, cl_income_group: sdmx.model.common.Codelist, method: str = 'population', replace: dict[str, str] | None = None) None [source]
Annotate cl_node with income groups.
Each node is assigned an Annotation with id="wb-income-group", according to the income groups of its children (countries), as reflected in cl_income_group (see get_income_group_codelist()).
- Parameters:
method ("population" or "count") – Method for aggregation:
"population" (default): the WB World Development Indicators (WDI) 2020 population for each country is used as a weight, so that the node’s income group is the income group of the plurality of the population of its children.
"count": each country is weighted equally, so that the node’s income group is the mode (most frequently occurring value) of its children’s.
replace (dict) – Mapping from wb-income-group annotation text appearing in cl_income_group to texts to be attached to cl_node. Mapping two keys to the same value effectively combines or aggregates those groups. See make_map().
Example
Annotate the R12 node list with income group information, mapping high income countries (HIC) and upper-middle income countries (UMC) into one group and aggregating by population.
>>> cl_node = get_codelist("node/R12")
>>> cl_ig = get_income_group_codelist()
>>> replace = make_map({"HIC": "HMIC", "UMC": "HMIC"})
>>> assign_income_groups(cl_node, cl_ig, replace=replace)
>>> cl_node["R12_NAM"].get_annotation(id="wb-income-group").text
HMIC
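The difference between the two aggregation methods can be seen in a small stand-alone sketch. The function income_group, the country codes, and the population figures are hypothetical (they are not WDI data); the sketch only shows how a population-weighted plurality and an unweighted mode can disagree.

```python
from collections import Counter


def income_group(children: dict, method: str = "population") -> str:
    """Return the group with the largest total weight among children.

    children maps a country ID to (income group, population).
    """
    if method not in ("population", "count"):
        raise ValueError(f"Unknown method {method!r}")
    weights: Counter = Counter()
    for group, population in children.values():
        # "population": weight by population; "count": weight each country equally
        weights[group] += population if method == "population" else 1
    return weights.most_common(1)[0][0]


# Made-up example: one populous HIC country, two smaller LIC countries
children = {"AAA": ("HIC", 10.0), "BBB": ("LIC", 3.0), "CCC": ("LIC", 4.0)}
by_population = income_group(children, "population")
by_count = income_group(children, "count")
```

Here the populous country dominates under "population", while the two smaller countries outnumber it under "count".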
- message_ix_models.tools.wb.fetch_codelist(id: str) sdmx.model.common.Codelist [source]
Retrieve code lists related to the WB World Development Indicators.
In principle this could be done with sdmx.Client("WB_WDI").codelist(id), but the World Bank SDMX REST API does not support queries for a specific code list. See https://datahelpdesk.worldbank.org/knowledgebase/articles/1886701-sdmx-api-queries.
fetch_codelist() retrieves http://api.worldbank.org/v2/sdmx/rest/codelist/WB/, the structure message containing all code lists; and extracts and returns the one with the given id.
- message_ix_models.tools.wb.get_income_group_codelist() sdmx.model.common.Codelist [source]
Return a Codelist with World Bank income group information.
The returned code list is a modified version of the one with URN …Codelist=WB:CL_REF_AREA_WDI(1.0), via fetch_codelist().
This is augmented with information about the income group and lending category concepts as described at https://datahelpdesk.worldbank.org/knowledgebase/articles/906519
The information is stored in two ways:
Existing codes in the list like “HIC: High income” that designate groups of countries are associated with child codes that are designated as members of that group. These can be accessed at Code.child.
Existing codes in the list like “ABW: Aruba” are annotated with:
id="wb-income-group": the URN of the income group code, for instance “urn:sdmx:org.sdmx.infomodel.codelist.Code=WB:CL_REF_AREA_WDI(1.0).HIC”. This is an unambiguous reference to a code in the same list.
id="wb-lending-category": the name of the lending category, if any.
These can be accessed using Code.annotations, Code.get_annotation, and other methods.
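Since the wb-income-group annotation holds a URN rather than a bare code ID, code that only needs the ID has to extract it. A minimal sketch using plain string handling follows; code_id_from_urn is a hypothetical helper, and the sdmx package may offer its own URN utilities that would be preferable in practice.

```python
def code_id_from_urn(urn: str) -> str:
    """Return the code ID that follows the '(version).' part of a Code URN."""
    _, _, reference = urn.partition("=")  # e.g. "WB:CL_REF_AREA_WDI(1.0).HIC"
    return reference.rpartition(").")[2]


# URN as given in the description above
urn = "urn:sdmx:org.sdmx.infomodel.codelist.Code=WB:CL_REF_AREA_WDI(1.0).HIC"
group_id = code_id_from_urn(urn)
```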
Tools for scenario manipulation
Add
Add bound for generic relation at the global level.
Add accounting possibility for CO2 emissions from FFI.
Add structure and data for emission constraints.
Add a budget constraint to a given region.
Modify scen to include an emission bound.
Add a global CO2 price to scen.
Remove all
Revise hydrogen-blending constraints.