This page introduces considerations, tools, and features for using distributed or high-throughput computing with MESSAGEix-GLOBIOM.
Scenarios in the MESSAGEix-GLOBIOM global model family are characterized by:
solve() times of between 10 and 60 minutes, depending on hardware and configuration, plus a similar amount of time to run the legacy reporting.
Memory usage of ~10 GB or more using JDBCBackend, currently the only supported backend.
These resource needs can be a bottleneck in applications, for example:
where many, often related, scenarios must be solved.
when iteration (repeatedly solving one or more scenarios) is a key approach in developing code that sets up scenarios.
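To make the scale of this bottleneck concrete, the following back-of-envelope sketch estimates wall-clock time and total memory for a batch of scenarios. The function name `estimate_batch` and its parameters are hypothetical, introduced only for illustration. The inputs are taken from the figures above (up to 60 minutes to solve, a similar time for reporting, roughly 10 GB per scenario), and the sketch assumes that scenarios are independent, take equal time, and that each worker handles one scenario at a time.

```python
import math


def estimate_batch(n_scenarios, minutes_per_scenario, workers=1, gb_per_worker=10.0):
    """Rough wall-clock hours and total memory (GB) for a batch of scenarios.

    Hypothetical helper: assumes independent scenarios of equal duration and
    one scenario per worker at a time; ignores queueing and data transfer.
    """
    if n_scenarios < 0 or workers < 1:
        raise ValueError("n_scenarios must be >= 0 and workers >= 1")
    # Scenarios are processed in "rounds" of at most `workers` at once
    rounds = math.ceil(n_scenarios / workers)
    wall_hours = rounds * minutes_per_scenario / 60
    # Only as many workers as there are scenarios actually hold memory
    total_gb = min(workers, n_scenarios) * gb_per_worker
    return wall_hours, total_gb


# 20 scenarios, each 60 min to solve plus 60 min of reporting
local = estimate_batch(20, 120)  # one machine, one scenario at a time
distributed = estimate_batch(20, 120, workers=5)
print(local)        # (40.0, 10.0)
print(distributed)  # (8.0, 50.0)
```

Under these assumptions, the same batch takes about 40 hours sequentially and about 8 hours across five workers, at the cost of roughly 50 GB of memory in total.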
To improve research productivity, researchers may choose to run scenarios or ‘workflows’ (a combination of solving scenarios and pre- and post-processing steps or codes) through distributed computing, i.e. not on their local machine. Hardware and software environments for distributed computing can vary widely, and can be categorized in multiple ways, such as:
More powerful single-CPU systems, accessed remotely.
Cloud services, e.g. Google Compute Engine, Amazon Web Services (AWS), or GitHub Actions, providing access to one or more machines.
Dedicated cluster systems (sometimes labelled high-throughput computing, HTC, or high-performance computing, HPC, systems) for scientific computing, operated by a variety of parties.
Specific configuration necessarily depends on the specific system(s) in use and the researcher’s application.
The individual features, tools, utilities, etc. should each be simple, i.e. do one thing, and do it well.
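As an illustration of the ‘workflow’ idea described above, and of keeping each piece small, the following minimal sketch composes three single-purpose steps and runs several independent workflows concurrently. All names here (`prepare`, `solve`, `report`, `run_workflow`, `run_all`) are hypothetical placeholders and are not part of MESSAGEix-GLOBIOM or its tooling; the step bodies only pass a small dictionary along, where real steps would set up, solve, and report on a scenario.

```python
from concurrent.futures import ThreadPoolExecutor


# Each step does one thing; all three are hypothetical placeholders.
def prepare(name):
    """Pre-processing: stand-in for code that sets up a scenario."""
    return {"scenario": name, "status": "prepared"}


def solve(state):
    """Stand-in for the solve step (10 to 60 minutes in practice)."""
    return {**state, "status": "solved"}


def report(state):
    """Post-processing: stand-in for reporting on a solved scenario."""
    return {**state, "status": "reported"}


def run_workflow(name):
    """Run the steps in order for a single scenario."""
    return report(solve(prepare(name)))


def run_all(names, max_workers=2):
    """Run independent workflows concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_workflow, names))


if __name__ == "__main__":
    for result in run_all(["baseline", "policy-a", "policy-b"]):
        print(result["scenario"], result["status"])
```

The thread pool here only stands in for whatever mechanism the target system provides. On a cluster, each `run_workflow` call would more likely be submitted as a separate job through that system's own scheduler.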