Reporting
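Top-level methods and classes:
| configure(**config) | Configure reporting globally. |
| Reporter(**kwargs) | Class for generating reports on ixmp.Scenario objects. |
| Key(name[, dims, tag]) | A hashable key for a quantity that includes its dimensionality. |
| Quantity(data, *args, **kwargs) | Convert arguments to the internal Quantity data format. |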
Others:
-
reporting.configure(**config)
Configure reporting globally.
Modifies global variables that affect the behaviour of all Reporters and computations, namely RENAME_DIMS and REPLACE_UNITS.
Valid configuration keys, passed as config keyword arguments, include:
- Other Parameters
units (mapping) – Configuration for handling of units. Valid sub-keys include:
replace (mapping of str -> str): replace units before they are parsed by pint. Added to REPLACE_UNITS.
define (str): block of unit definitions, added to the pint application registry so that units are recognized. See the pint documentation on defining units.
rename_dims (mapping of str -> str) – Update RENAME_DIMS.
- Warns
UserWarning – If config contains unrecognized keys.
-
class ixmp.reporting.Reporter(**kwargs)
Class for generating reports on ixmp.Scenario objects.
A Reporter is used to postprocess data from one or more ixmp.Scenario objects. The get() method can be used to:
- Retrieve individual quantities. A quantity has zero or more dimensions and optional units. Quantities include the ‘parameters’, ‘variables’, ‘equations’, and ‘scalars’ available in an ixmp.Scenario.
- Generate an entire report composed of multiple quantities. A report may:
  - Read in non-model or exogenous data,
  - Trigger output to file(s) or a database, or
  - Execute user-defined methods.
Every report and quantity (including the results of intermediate steps) is identified by a Key; all the keys in a Reporter can be listed with keys().
Reporter uses a graph data structure to keep track of computations, the atomic steps in postprocessing: for example, a single calculation that multiplies two quantities to create a third. The graph allows
get() to perform only the requested computations. Advanced users may manipulate the graph directly, but common reporting tasks can be handled by using Reporter methods:
| add(data, *args, **kwargs) | General-purpose method to add computations. |
| add_file(path[, key]) | Add exogenous quantities from path. |
| add_product(key, *quantities[, sums]) | Add a computation that takes the product of quantities. |
| add_queue(queue[, max_tries, fail]) | Add tasks from a list or queue. |
| add_single(key, *computation[, strict, index]) | Add a single computation at key. |
| aggregate(qty, tag, dims_or_groups[, …]) | Add a computation that aggregates qty. |
| apply(generator, *keys, **kwargs) | Add computations by applying generator to keys. |
| check_keys(*keys) | Check that keys are in the Reporter. |
| configure([path]) | Configure the Reporter. |
| describe([key, quiet]) | Return a string describing the computations that produce key. |
| disaggregate(qty, new_dim[, method, args]) | Add a computation that disaggregates qty using method. |
| finalize(scenario) | Prepare the Reporter to act on scenario. |
| full_key(name_or_key) | Return the full-dimensionality key for name_or_key. |
| get([key]) | Execute and return the result of the computation key. |
| keys() | Return the keys of graph. |
| set_filters(**filters) | Apply filters ex ante (before computations occur). |
| visualize(filename, **kwargs) | Generate an image describing the reporting structure. |
| write(key, path) | Write the report key to the file path. |
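The graph idea described above (only the requested computations are performed) can be illustrated with a toy, dask-style graph held in a plain dict. This is a minimal sketch, not ixmp's implementation: the names `graph`, `get`, `traced` and `calls` are hypothetical, and each task is assumed to be a tuple of a callable followed by the keys of its arguments.

```python
from operator import mul

calls = []  # records which tasks actually ran


def traced(name, func):
    """Wrap func so that each execution is recorded under name."""
    def wrapper(*args):
        calls.append(name)
        return func(*args)
    return wrapper


# Each key maps to a literal value or to a task tuple (callable, *argument keys)
graph = {
    "price": 2.0,
    "activity": 10.0,
    "unused": (traced("unused", lambda: 0.0),),
    "cost": (traced("cost", mul), "price", "activity"),
}


def get(graph, key):
    """Compute key by recursively computing only its dependencies."""
    node = graph[key]
    if isinstance(node, tuple) and callable(node[0]):
        func, *deps = node
        return func(*(get(graph, dep) for dep in deps))
    return node  # a literal value


result = get(graph, "cost")
```

Requesting "cost" never touches the "unused" task, which is the property that makes a graph attractive for large reports.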
-
add(data, *args, **kwargs)
General-purpose method to add computations.
add() can be called in several ways; its behaviour depends on data; see below. It chains to methods such as add_single(), add_queue(), and apply(), which can also be called directly.
- Parameters
data (various) –
args (various) –
- Other Parameters
sums (bool, optional) – If True, all partial sums of the key data are also added to the Reporter.
- Returns
Some or all of the keys added to the Reporter.
- Return type
list of Key-like
- Raises
KeyError – If a target key is already in the Reporter; any key referred to by a computation does not exist; or sums=True and the key for one of the partial sums of key is already in the Reporter.
See also
add() may be called with:
- list: data is a list of computations like [(list(args1), dict(kwargs1)), (list(args2), dict(kwargs2)), ...] that are added one-by-one.
- the name of a function in computations (e.g. ‘select’): A computation is added with key args[0], applying the named function to args[1:] and kwargs.
- str, the name of a Reporter method (e.g. ‘apply’): the corresponding method (e.g. apply()) is called with the args and kwargs.
- Any other str or Key: the arguments are passed to add_single().
add() may also be used to:
- Provide an alias from one key to another:
>>> from message_ix.reporting import Reporter
>>> rep = Reporter()  # Create a new Reporter object
>>> rep.add('aliased name', 'original name')
- Define an arbitrarily complex computation in a Python function that operates directly on the ixmp.Scenario:
>>> def my_report(scenario):
...     # many lines of code
...     return 'foo'
>>> rep.add('my report', (my_report, 'scenario'))
>>> rep.finalize(scenario)
>>> rep.get('my report')
'foo'
Note
Use care when adding literal str() values as a computation argument for add(); these may conflict with keys that identify the results of other computations.
-
apply(generator, *keys, **kwargs)
Add computations by applying generator to keys.
- Parameters
generator (callable) – Function to apply to keys.
keys (hashable) – The starting key(s).
kwargs – Keyword arguments to generator.
The generator may have a type annotation for Reporter on its first positional argument. In this case, a reference to the Reporter is supplied, and generator may use the Reporter methods to add computations:
def gen0(r: ixmp.Reporter, **kwargs):
    # add_file() is the Reporter method that loads a file
    r.add_file('file0.txt', **kwargs)
    r.add_file('file1.txt', **kwargs)

# Use the generator to add several computations
rep.apply(gen0, units='kg')
Or, generator may yield a sequence (0 or more) of (key, computation), which are added to the graph:
from functools import partial

def gen1(**kwargs):
    op = partial(computations.load_file, **kwargs)
    # Each item is (key, computation); the computation is a task tuple
    yield from ((f'file:{i}', (op, f'file{i}.txt')) for i in range(2))

rep.apply(gen1, units='kg')
-
add_file(path, key=None, **kwargs)
Add exogenous quantities from path.
Reporting the key or using it in other computations causes path to be loaded and converted to Quantity.
- Parameters
path (os.PathLike) – Path to the file, e.g. ‘/path/to/foo.ext’.
key (str or Key, optional) – Key for the quantity read from the file.
- Other Parameters
dims (dict or list or set) – Either a collection of names for dimensions of the quantity, or a mapping from names appearing in the input to dimensions.
units (str or pint.Unit) – Units to apply to the loaded Quantity.
- Returns
Either key (if given) or e.g. file:foo.ext based on the path name, without directory components.
- Return type
-
add_product(key, *quantities, sums=True)
Add a computation that takes the product of quantities.
- Parameters
- Returns
The full key of the new quantity.
- Return type
-
add_queue(queue, max_tries=1, fail='raise')
Add tasks from a list or queue.
- Parameters
queue (list of 2-tuple) – The members of each tuple are the arguments (i.e. a list or tuple) and keyword arguments (i.e. a dict) to add().
max_tries (int, optional) – Retry adding elements up to this many times.
fail ('raise' or log level, optional) – Action to take when a computation from queue cannot be added after max_tries.
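The retry behaviour controlled by max_tries can be pictured with a simplified queue. The sketch below is an illustration under assumptions, not the library's code: the function name is reused only for readability, each queue item is assumed to be a (key, dependencies) pair, and an item is assumed to fail only when one of its dependencies is not yet in the graph.

```python
from collections import deque


def add_queue(graph, queue, max_tries=1):
    """Add (key, dependencies) items to graph, retrying items that fail."""
    pending = deque((item, 1) for item in queue)
    while pending:
        (key, deps), attempt = pending.popleft()
        if all(dep in graph for dep in deps):
            graph[key] = deps  # all dependencies exist, so the item is added
        elif attempt < max_tries:
            pending.append(((key, deps), attempt + 1))  # retry later
        else:
            raise KeyError(f"cannot add {key!r} after {max_tries} tries")
    return graph


# 'c' depends on 'b', which only appears later in the queue
queue = [("c", ("b",)), ("b", ("a",))]
graph = add_queue({"a": ()}, queue, max_tries=2)
```

With max_tries=1 the same queue raises KeyError, because 'c' is attempted before 'b' exists and is not retried.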
-
add_single(key, *computation, strict=False, index=False)
Add a single computation at key.
- Parameters
key (str or Key or hashable) – A string, Key, or other value identifying the output of computation.
computation (object) – Any dask computation, i.e. one of:
1. any existing key in the Reporter.
2. any other literal value or constant.
3. a task, i.e. a tuple with a callable followed by one or more computations.
4. a list containing one or more of #1, #2, and/or #3.
strict (bool, optional) – If True, key must not already exist in the Reporter, and any keys referred to by computation must exist.
index (bool, optional) – If True, key is added to the index as a full-resolution key, so it can be later retrieved with full_key().
-
aggregate(qty, tag, dims_or_groups, weights=None, keep=True, sums=False)
Add a computation that aggregates qty.
- Parameters
qty (Key or str) – Key of the quantity to be aggregated.
tag (str) – Additional string to add to the end of the key for the aggregated quantity.
dims_or_groups (str or iterable of str or dict) – Name(s) of the dimension(s) to sum over, or nested dict.
weights (xarray.DataArray, optional) – Weights for weighted aggregation.
keep (bool, optional) – Passed to computations.aggregate.
- Returns
The key of the newly-added node.
- Return type
-
check_keys(*keys)
Check that keys are in the Reporter.
If any of keys is not in the Reporter, KeyError is raised. Otherwise, a list is returned with either the key from keys, or the corresponding full_key().
-
configure(path=None, **config)
Configure the Reporter.
Accepts a path to a configuration file and/or keyword arguments. Configuration keys loaded from file are replaced by keyword arguments.
Valid configuration keys include:
- default: the default reporting key; sets default_key.
- filters: a dict, passed to set_filters().
- files: a list where every element is a dict of keyword arguments to add_file().
- alias: a dict mapping aliases to original keys.
- Warns
UserWarning – If config contains unrecognized keys.
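As a rough picture of the configuration keys listed above, the following sketch builds a configuration as a plain dict and shows keyword arguments replacing values loaded from a file. The file contents, paths and key names are hypothetical, and the shallow merge is an assumption about how "replaced by keyword arguments" behaves.

```python
# Hypothetical contents of a configuration file, already parsed to a dict
file_config = {
    "default": "report_a",
    "filters": {"t": ["2020", "2030"]},
    "files": [{"path": "/path/to/foo.ext", "key": "foo:a-b"}],
    "alias": {"short name": "report_a"},
}

# Keyword arguments given alongside the file
keyword_config = {"default": "report_b"}

# Assumption: a shallow merge in which keyword arguments take precedence
config = {**file_config, **keyword_config}
```

Here the 'default' key from the file is replaced, while 'filters', 'files' and 'alias' are kept as loaded.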
-
default_key = None
The default reporting key.
-
describe(key=None, quiet=True)
Return a string describing the computations that produce key.
If key is not provided, all keys in the Reporter are described.
The string is also printed to the console if quiet is False.
-
disaggregate(qty, new_dim, method='shares', args=[])
Add a computation that disaggregates qty using method.
- Parameters
qty (hashable) – Key of the quantity to be disaggregated.
new_dim (str) – Name of the new dimension of the disaggregated variable.
method (callable or str) – Disaggregation method. If a callable, then it is applied to qty with any extra args. If a str, then a method named ‘disaggregate_{method}’ is used.
args (list, optional) – Additional arguments to the method. The first element should be the key for a quantity giving shares for disaggregation.
- Returns
The key of the newly-added node.
- Return type
-
finalize(scenario)
Prepare the Reporter to act on scenario.
The Scenario object scenario is associated with the key 'scenario'. All subsequent processing will act on data from this scenario.
-
classmethod from_scenario(scenario, **kwargs)
Create a Reporter by introspecting scenario.
- Parameters
scenario (ixmp.Scenario) – Scenario to introspect in creating the Reporter.
kwargs (optional) – Passed to Scenario.configure().
- Returns
A Reporter instance containing:
A ‘scenario’ key referring to the scenario object.
Each parameter, equation, and variable in the scenario.
All possible aggregations across different sets of dimensions.
Each set in the scenario.
- Return type
-
full_key(name_or_key)
Return the full-dimensionality key for name_or_key.
An ixmp variable ‘foo’ with dimensions (a, c, n, q, x) is available in the Reporter as 'foo:a-c-n-q-x'. This Key can be retrieved with:
rep.full_key('foo')
rep.full_key('foo:c')
# etc.
-
get(key=None)
Execute and return the result of the computation key.
Only key and its dependencies are computed.
- Parameters
key (str, optional) – If not provided, default_key is used.
- Raises
ValueError – If key and default_key are both None.
-
set_filters(**filters)
Apply filters ex ante (before computations occur).
Filters are stored in the reporter at the 'filters' key, and are passed to ixmp.Scenario.par() and similar methods. All quantity values read from the Scenario are filtered before any other computations take place.
- Parameters
filters (mapping of str → (list of str or None)) –
Argument names are dimension names; values are lists of allowable labels along the respective dimension, or None to clear any existing filters for the dimension.
If no arguments are provided, all filters are cleared.
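The three cases described for filters (set labels, clear one dimension with None, clear everything with no arguments) can be sketched with a plain dict. The helper `update_filters` is hypothetical and only mirrors the documented semantics; it is not how the Reporter stores filters internally.

```python
def update_filters(current, **filters):
    """Return a new filters dict following the documented rules."""
    if not filters:
        return {}  # no arguments: all filters are cleared
    updated = dict(current)
    for dim, labels in filters.items():
        if labels is None:
            updated.pop(dim, None)  # None clears the filter for this dimension
        else:
            updated[dim] = list(labels)  # allowable labels along this dimension
    return updated


f1 = update_filters({}, t=["2020"], n=["A", "B"])
f2 = update_filters(f1, t=None)
f3 = update_filters(f2)
```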
-
property unit_registry
The pint.UnitRegistry() used by the Reporter.
-
visualize(filename, **kwargs)
Generate an image describing the reporting structure.
This is a shorthand for dask.visualize(). Requires graphviz.
-
write(key, path)
Write the report key to the file path.
-
class ixmp.reporting.Key(name, dims=[], tag=None)
A hashable key for a quantity that includes its dimensionality.
Quantities in a Scenario can be indexed by one or more dimensions. A Key refers to a quantity using three components: name, dims, and tag.
For example, an ixmp parameter with three dimensions can be initialized with:
>>> scenario.init_par('foo', ['a', 'b', 'c'], ['apple', 'bird', 'car'])
Key allows a specific, explicit reference to various forms of “foo”:
- in its full resolution, i.e. indexed by a, b, and c:
>>> k1 = Key('foo', ['a', 'b', 'c'])
>>> k1 == 'foo:a-b-c'
True
Notice that a Key has the same hash as, and compares equal (==) to, its str().
- in a partial sum over one dimension, e.g. summed along c with dimensions a and b:
>>> k2 = k1.drop('c')
>>> k2 == 'foo:a-b'
True
- in a partial sum over multiple dimensions, etc.:
>>> k1.drop('a', 'c') == k2.drop('a') == 'foo:b'
True
Note
Some remarks:
- repr(key) prints the Key in angle brackets (‘<>’) to signify it is a Key object.
>>> repr(k1)
<foo:a-b-c>
- Keys are immutable: the properties name, dims, and tag are read-only, and the methods append(), drop(), and add_tag() return new Key objects.
- Keys may be generated concisely by defining a convenience method:
>>> def foo(dims):
...     return Key('foo', dims.split())
>>> foo('a b c')
foo:a-b-c
-
add_tag(tag)
Return a new Key with tag appended.
-
append(*dims)
Return a new Key with additional dimensions dims.
-
drop(*dims)
Return a new Key with dims dropped.
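The properties described for Key (same hash as its str(), equality with its str(), and drop() returning a new object) can be mimicked in a few lines. `SketchKey` below is a hypothetical stand-in written only for illustration; it ignores tags and is not the ixmp class.

```python
class SketchKey:
    """Hypothetical stand-in for Key: a name plus ordered dimensions."""

    def __init__(self, name, dims=()):
        self.name = name
        self.dims = tuple(dims)

    def __str__(self):
        return f"{self.name}:{'-'.join(self.dims)}"

    def __hash__(self):
        return hash(str(self))  # same hash as the string form

    def __eq__(self, other):
        return str(self) == str(other)  # compares equal to the string form

    def drop(self, *dims):
        """Return a new key without dims; the original is unchanged."""
        return SketchKey(self.name, [d for d in self.dims if d not in dims])


k1 = SketchKey("foo", ["a", "b", "c"])
k2 = k1.drop("c")

# Because hash and equality match the string, either form finds the entry
lookup = {"foo:a-b": "partial sum over c"}
found = lookup[k2]
```

This is why a graph keyed by strings can be queried with key objects, and vice versa.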
-
classmethod from_str_or_key(value, drop=[], append=[], tag=None)
Return a new Key from value.
- Parameters
drop (list of str or True, optional) – Existing dimensions of value to drop. See drop().
append (list of str, optional) – New dimensions to append to the returned Key. See append().
tag (str, optional) – Tag for returned Key. If value has a tag, the two are joined using a ‘+’ character. See add_tag().
- Returns
- Return type
-
iter_sums()
Generate (key, task) for all possible partial sums of the Key.
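To give a feel for how many partial sums a key has, the sketch below enumerates them for a three-dimensional quantity. It is an illustration under assumptions: the helper `partial_sums` is hypothetical, it returns the summed dimensions rather than a task, and it assumes that "all possible partial sums" means every way of keeping fewer than all dimensions (including keeping none).

```python
from itertools import combinations


def partial_sums(name, dims):
    """Yield (key string, summed dimensions) for each partial sum."""
    for n_kept in range(len(dims)):
        for kept in combinations(dims, n_kept):
            summed = tuple(d for d in dims if d not in kept)
            yield f"{name}:{'-'.join(kept)}", summed


sums = dict(partial_sums("foo", ["a", "b", "c"]))
```

Under that assumption a quantity with n dimensions has 2**n - 1 partial sums, so three dimensions give seven entries.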
-
classmethod product(new_name, *keys, tag=None)
Return a new Key that has the union of dimensions on keys.
Dimensions are ordered by their first appearance:
1. First, the dimensions of the first of the keys.
2. Next, any additional dimensions in the second of the keys that were not already added in step 1.
3. etc.
- Parameters
new_name (str) – Name for the new Key. The names of keys are discarded.
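The ordering rule for the union of dimensions can be written out directly. The helper `union_dims` is hypothetical and works on plain lists of dimension names rather than on Key objects.

```python
def union_dims(*dim_lists):
    """Return the union of dimensions, ordered by first appearance."""
    seen = []
    for dims in dim_lists:
        for dim in dims:
            if dim not in seen:
                seen.append(dim)
    return seen


dims = union_dims(["a", "b"], ["b", "c", "a"], ["d"])
```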
-
ixmp.reporting.Quantity(data, *args, **kwargs)
Convert arguments to the internal Quantity data format.
- Parameters
data – Quantity data.
args – Positional arguments, passed to AttrSeries or SparseDataArray.
kwargs – Keyword arguments, passed to AttrSeries or SparseDataArray.
- Other Parameters
name (str, optional) – Quantity name.
units (str, optional) – Quantity units.
attrs (dict, optional) – Dictionary of attributes; similar to attrs.
The Quantity constructor converts its arguments to an internal, xarray.DataArray-like data format:
# Existing data
data = pd.Series(...)
# Convert to a Quantity for use in reporting calculations
qty = Quantity(data, name="Quantity name", units="kg")
rep.add("new_qty", qty)
Common ixmp.reporting usage, e.g. in message_ix, creates large, sparse data frames (billions of possible elements, but <1% populated); DataArray’s default, ‘dense’ storage format would be too large for available memory.
- Currently, Quantity is AttrSeries, a wrapped pandas.Series that behaves like a DataArray.
- In the future, ixmp.reporting will use SparseDataArray, and eventually DataArray backed by sparse data, directly.
The goal is that reporting code, including built-in and user computations, can treat quantity arguments as if they were DataArray.
Computations
Elementary computations for reporting.
Unless otherwise specified, these methods accept and return Quantity objects for data arguments/return values.
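Calculations:
| | Sum across multiple quantities. |
| aggregate(quantity, groups, keep) | Aggregate quantity by groups. |
| apply_units(qty, units[, quiet]) | Simply apply units to qty. |
| | Disaggregate quantity by shares. |
| product(*quantities) | Return the product of any number of quantities. |
| ratio(numerator, denominator) | Return the ratio numerator / denominator. |
| select(qty, indexers[, inverse]) | Select from qty based on indexers. |
| sum(quantity[, weights, dimensions]) | Sum quantity over dimensions, with optional weights. |
Input and output:
| load_file(path[, dims, units]) | Read the file at path and return its contents as a Quantity. |
| | Write a quantity to a file. |
Data manipulation:
| concat(*objs, **kwargs) | Concatenate Quantity objs. |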
-
ixmp.reporting.computations.aggregate(quantity, groups, keep)
Aggregate quantity by groups.
- Parameters
quantity (Quantity) –
groups (dict of dict) – Top-level keys are the names of dimensions in quantity. Second-level keys are group names; second-level values are lists of labels along the dimension to sum into a group.
keep (bool) – If True, the members that are aggregated into a group are returned with the group sums. If False, they are discarded.
- Returns
Same dimensionality as quantity.
- Return type
Quantity
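The shape of the groups argument and the effect of keep can be illustrated on plain dicts. This sketch is not the library function: it reuses the name for readability only, represents a quantity as a dict from label tuples to numbers, and takes an extra `dims` argument (an assumption) to say which position each dimension occupies.

```python
def aggregate(values, dims, groups, keep=True):
    """Sum labels into groups.

    values: dict mapping label tuples (one label per name in dims) to numbers.
    groups: {dimension name: {group name: [labels to sum]}}.
    """
    result = dict(values) if keep else {}
    for dim, dim_groups in groups.items():
        pos = dims.index(dim)
        for group, members in dim_groups.items():
            for labels, value in values.items():
                if labels[pos] in members:
                    # Replace the member label with the group name and add up
                    new = labels[:pos] + (group,) + labels[pos + 1:]
                    result[new] = result.get(new, 0.0) + value
    return result


values = {("x", "2020"): 1.0, ("y", "2020"): 2.0, ("z", "2020"): 4.0}
groups = {"n": {"xy": ["x", "y"]}}
with_members = aggregate(values, ("n", "t"), groups, keep=True)
only_groups = aggregate(values, ("n", "t"), groups, keep=False)
```

The result keeps the same dimensions; only the set of labels along the aggregated dimension changes.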
-
ixmp.reporting.computations.apply_units(qty, units, quiet=False)
Simply apply units to qty.
Logs on level WARNING if qty already has existing units.
-
ixmp.reporting.computations.concat(*objs, **kwargs)
Concatenate Quantity objs.
Any strings included amongst objs are discarded, with a logged warning; these usually indicate that a quantity is referenced which is not in the Reporter.
-
ixmp.reporting.computations.data_for_quantity(ix_type, name, column, scenario, config)
Retrieve data from scenario.
- Parameters
ix_type ('equ' or 'par' or 'var') – Type of the ixmp object.
name (str) – Name of the ixmp object.
column ('mrg' or 'lvl' or 'value') – Data to retrieve. ‘mrg’ and ‘lvl’ are valid only for ix_type='equ', and ‘level’ otherwise.
scenario (ixmp.Scenario) – Scenario containing data to be retrieved.
config (dict of (str -> dict)) – The key ‘filters’ may contain a mapping from dimensions to iterables of allowed values along each dimension. The key ‘units’/’apply’ may contain units to apply to the quantity; any such units overwrite existing units, without conversion.
- Returns
Data for name.
- Return type
Quantity
Disaggregate quantity by shares.
-
ixmp.reporting.computations.load_file(path, dims={}, units=None)
Read the file at path and return its contents as a Quantity.
Some file formats are automatically converted into objects for direct use in reporting code:
.csv: Converted to Quantity. CSV files must have a ‘value’ column; all others are treated as indices, except as given by dims. Lines beginning with ‘#’ are ignored.
- Parameters
path (pathlib.Path) – Path to the file to read.
dims (collections.abc.Collection or collections.abc.Mapping, optional) – If a collection of names, other columns besides these and ‘value’ are discarded. If a mapping, the keys are the column labels in path, and the values are the target dimension names.
units (str or pint.Unit) – Units to apply to the loaded Quantity.
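The CSV rules above (a ‘value’ column, other columns as indices, ‘#’ lines ignored, dims as a collection or a mapping) can be sketched with the standard csv module. The helper `read_quantity_csv` is hypothetical: it works on text rather than a path, returns a plain dict instead of a Quantity, and assumes that columns not named in dims are dropped.

```python
import csv


def read_quantity_csv(text, dims=None):
    """Return (dimension names, {index tuple: value}) from CSV text."""
    # Lines beginning with '#' are comments
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(lines)
    columns = [c for c in reader.fieldnames if c != "value"]
    if dims is None:
        rename = {c: c for c in columns}  # every other column is an index
    elif isinstance(dims, dict):
        rename = dict(dims)  # column label in the file -> dimension name
    else:
        rename = {c: c for c in columns if c in dims}  # keep only these
    data = {}
    for row in reader:
        index = tuple(row[c] for c in rename)
        data[index] = float(row["value"])
    return tuple(rename.values()), data


text = "# example data\nn,t,value\nA,2020,1.5\nB,2020,2.5\n"
names, data = read_quantity_csv(text, dims={"n": "node", "t": "year"})
```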
-
ixmp.reporting.computations.product(*quantities)
Return the product of any number of quantities.
-
ixmp.reporting.computations.ratio(numerator, denominator)
Return the ratio numerator / denominator.
- Parameters
numerator (Quantity) –
denominator (Quantity) –
-
ixmp.reporting.computations.select(qty, indexers, inverse=False)
Select from qty based on indexers.
-
ixmp.reporting.computations.sum(quantity, weights=None, dimensions=None)
Sum quantity over dimensions, with optional weights.
- Parameters
quantity (Quantity) –
weights (Quantity, optional) – If dimensions is given, weights must have at least these dimensions. Otherwise, any dimensions are valid.
dimensions (list of str, optional) – If not provided, sum over all dimensions. If provided, sum over these dimensions.
Internal format for reporting quantities
-
ixmp.reporting.quantity.assert_quantity(*args)
Assert that each of args is a Quantity object.
- Raises
TypeError – with an indicative message.
-
class ixmp.reporting.attrseries.AttrSeries(data=None, *args, name=None, attrs=None, **kwargs)
pandas.Series subclass imitating xarray.DataArray.
The AttrSeries class provides similar methods and behaviour to xarray.DataArray, so that ixmp.reporting.computations methods can use xarray-like syntax.
- Parameters
-
align_levels(other)
Work around https://github.com/pandas-dev/pandas/issues/25760.
Return a copy of obj with common levels in the same order as ref.
-
assign_coords(**kwargs)
-
property coords
Like xarray.DataArray.coords. Read-only.
-
property dims
Like xarray.DataArray.dims.
-
drop(label)
Like xarray.DataArray.drop().
-
classmethod from_series(series, sparse=None)
-
item(*args)
Like xarray.DataArray.item().
-
rename(new_name_or_name_dict)
-
sel(indexers=None, drop=False, **indexers_kwargs)
Like xarray.DataArray.sel().
-
squeeze(dim=None, *args, **kwargs)
-
sum(*args, **kwargs)
Like xarray.DataArray.sum().
-
to_dataframe()
-
to_series()
-
transpose(*dims)
-
class ixmp.reporting.sparsedataarray.SparseAccessor(obj)
xarray accessor to help SparseDataArray.
See the xarray accessor documentation, e.g. register_dataarray_accessor().
-
property COO_data
True if the DataArray has sparse.COO data.
-
convert()
Return a SparseDataArray instance.
-
property dense
Return a copy with dense (ndarray) data.
-
property dense_super
Return a proxy to a ndarray-backed DataArray.
-
class ixmp.reporting.sparsedataarray.SparseDataArray(data: Any = <NA>, coords: Optional[Union[Sequence[Tuple], Mapping[Hashable, Any]]] = None, dims: Optional[Union[Hashable, Sequence[Hashable]]] = None, name: Hashable = None, attrs: Mapping = None, indexes: Dict[Hashable, pandas.core.indexes.base.Index] = None, fastpath: bool = False)
DataArray with sparse data.
SparseDataArray uses sparse.COO for storage with numpy.nan as its sparse.COO.fill_value. Some methods of DataArray are overridden to ensure data is in sparse, or dense, format as necessary, to provide expected functionality not currently supported by sparse, and to avoid exhausting memory for some operations that require dense data.
-
equals(other) → bool
True if two SparseDataArrays have the same dims, coords, and values.
Overrides equals() for sparse data.
-
classmethod from_series(obj, sparse=True)
Convert a pandas.Series into a SparseDataArray.
-
property loc
Attribute for location based indexing like pandas.
Note
This version does not allow assignment, since the underlying sparse array is read-only. To modify the contents, create a copy or perform an operation that returns a new array.
-
sel(indexers=None, method=None, tolerance=None, drop=False, **indexers_kwargs) → ixmp.reporting.sparsedataarray.SparseDataArray
Return a new array by selecting labels along the specified dim(s).
Overrides sel() to handle >1-D indexers with sparse data.
-
to_dataframe(name=None)
Convert this array and its coords into a DataFrame.
Overrides to_dataframe().
-
to_series() → pandas.core.series.Series
Convert this array into a Series.
Overrides to_series() to create the series without first converting to a potentially very large numpy.ndarray.
-
Utilities
-
ixmp.reporting.utils.RENAME_DIMS = {}
Dimensions to rename when extracting raw data from Scenario objects. Mapping from Scenario dimension name -> preferred dimension name.
-
ixmp.reporting.utils.REPLACE_UNITS = {'%': 'percent'}
Replacements to apply to quantity units before parsing by pint. Mapping from original unit -> preferred unit.
-
ixmp.reporting.utils.clean_units(input_string)
Tolerate messy strings for units.
Handles two specific cases found in MESSAGEix test cases:
Dimensions enclosed in ‘[]’ have these characters stripped.
The ‘%’ symbol cannot be supported by pint, because it is a Python operator; it is translated to ‘percent’.
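The two cases handled above can be sketched as simple string operations. The function `clean` and the constant `REPLACE` are hypothetical names used for illustration; the replacement table copies the documented default of REPLACE_UNITS, and the real function may differ in detail.

```python
# Copy of the documented default replacement table
REPLACE = {"%": "percent"}


def clean(input_string):
    """Strip square brackets, then apply unit replacements."""
    cleaned = input_string.replace("[", "").replace("]", "")
    for original, preferred in REPLACE.items():
        cleaned = cleaned.replace(original, preferred)
    return cleaned


examples = [clean("[kg]"), clean("%"), clean("USD/kWa")]
```

The cleaned strings are then suitable for passing to a unit parser such as pint.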
-
ixmp.reporting.utils.collect_units(*args)
Return a list of ‘_unit’ attributes for args.
-
ixmp.reporting.utils.dims_for_qty(data)
Return the list of dimensions for data.
If data is a pandas.DataFrame, its columns are processed; otherwise it must be a list.
ixmp.reporting.RENAME_DIMS is used to rename dimensions.
-
ixmp.reporting.utils.filter_concat_args(args)
Filter out str and Key from args.
A warning is logged for each element removed.