ESEm design

Here we provide a brief description of the main architectural decisions behind ESEm, to help contributors and users alike understand the various components and how they fit together.

Emulation

We try to provide a seamless interface for users whether they provide iris Cubes, xarray DataArrays or numpy ndarrays. This is done using esem.wrappers.DataWrapper and its associated subclasses, which keep a copy of the provided object but expose only the underlying numpy array to the emulation engines. When the data is requested from this wrapper using the esem.wrappers.DataWrapper.wrap() method, it returns a copy of the input object (Cube or DataArray) with the data replaced by the emulated data.
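For example, training on an iris Cube and receiving a Cube back from prediction might look like the following minimal sketch (the shapes and random data are purely illustrative):

    import numpy as np
    import iris.cube
    from esem import gp_model

    # Illustrative training set: five samples of two parameters, each
    # mapping to a 10x10 field carried in an iris Cube
    params = np.random.uniform(size=(5, 2))
    training_cube = iris.cube.Cube(np.random.rand(5, 10, 10))

    emulator = gp_model(params, training_cube)
    emulator.train()

    # Only training_cube.data (a numpy ndarray) ever reaches the GP
    # engine; predict() wraps the emulated fields back up as Cubes
    # carrying the original metadata
    mean, variance = emulator.predict(params[:1])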

This layer also ensures the underlying (numpy) data is wrapped in an esem.wrappers.ProcessWrapper. This class transparently applies any requested esem.data_processors.DataProcessor instances in sequence.
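As a concrete illustration, a custom processor only needs to provide the forward and inverse transforms; the ProcessWrapper applies them in order on the way in and reverses them on the way out. A minimal sketch, assuming the process()/unprocess() interface of esem.data_processors.DataProcessor:

    from esem.data_processors import DataProcessor

    class Scale(DataProcessor):
        """Illustrative processor scaling the data by a fixed factor."""

        def __init__(self, factor=100.):
            self.factor = factor

        def process(self, data):
            # Applied to the training data on the way in to the engine
            return data * self.factor

        def unprocess(self, mean, variance):
            # Applied to the emulator output on the way back out; note
            # that the variance scales with the square of the factor
            return mean / self.factor, variance / self.factor ** 2

Such a processor could then be passed alongside the built-in ones, e.g. data_processors=[Scale(), Flatten()], when constructing an emulator.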

The user can then create an esem.emulator.Emulator object by providing a concrete esem.model_adaptor.ModelAdaptor such as an esem.model_adaptor.KerasModel. There are two layers of abstraction here: the first deals with the different interfaces of the various emulation libraries, and the second applies the pre- and post-processing and allows a single esem.emulator.Emulator.batch_stats() method to work across all of them. The esem.emulator.Emulator._predict() method provides an important internal interface to the underlying model which reverses any data processing but leaves the emulator output as a TensorFlow Tensor to allow optimal sampling.
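Putting these pieces together by hand might look roughly like the following sketch, wrapping a user-defined Keras network (the network itself, the data, and the exact constructor arguments are illustrative assumptions):

    import numpy as np
    import tensorflow as tf
    from esem.emulator import Emulator
    from esem.model_adaptor import KerasModel

    # Illustrative training set: 50 samples of 2 parameters, each
    # mapping to 100 outputs
    training_params = np.random.uniform(size=(50, 2)).astype('float32')
    training_data = np.random.rand(50, 100).astype('float32')

    # Any compiled Keras model mapping parameters to outputs would do
    network = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation='relu', input_shape=(2,)),
        tf.keras.layers.Dense(100),
    ])
    network.compile(optimizer='adam', loss='mse')

    # KerasModel hides the Keras-specific fit/predict calls; Emulator
    # layers the data wrapping and (un)processing on top
    emulator = Emulator(KerasModel(network), training_params, training_data)
    emulator.train()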

The top-level functions esem.gp_model(), esem.cnn_model() and esem.rf_model() provide a simple interface for constructing these emulators and should be sufficient for most users.
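A typical workflow with these helpers, here using the random forest (the data is illustrative, and keyword arguments are assumed to be forwarded to the underlying scikit-learn regressor):

    import numpy as np
    from esem import rf_model

    params = np.random.uniform(size=(50, 3))
    data = np.random.rand(50, 20)

    emulator = rf_model(params, data, n_estimators=100)
    emulator.train()

    # The random forest backend provides no variance estimate, so only
    # the mean prediction is of interest here
    mean, _ = emulator.predict(np.random.uniform(size=(10, 3)))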

Calibration

We try to keep this interface very simple; an esem.sampler.Sampler should be initialised with an esem.emulator.Emulator object to sample from, some observations and their associated uncertainties. The only method it has to provide is esem.sampler.Sampler.sample(), which should return samples of \(\theta\) from the posterior.
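A new calibration scheme therefore only has to subclass Sampler and implement sample(). A hypothetical rejection-style sketch (the self.model and self.obs attributes used here are assumptions about what the base class stores):

    import tensorflow as tf
    from esem.sampler import Sampler

    class RejectionSampler(Sampler):
        """Illustrative sampler keeping candidates whose emulated mean
        lies within two standard deviations of the observations."""

        def sample(self, prior_x, n_samples=100):
            # prior_x: candidate parameters, shape (n_candidates, n_params)
            mean, variance = self.model._predict(prior_x)
            distance = tf.abs(mean - self.obs)
            accept = tf.reduce_all(distance < 2. * tf.sqrt(variance), axis=-1)
            return prior_x[accept.numpy()][:n_samples]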

Wherever possible these samplers should take advantage of the fact that the esem.emulator.Emulator._predict() method returns TensorFlow tensors, and prefer to use them directly rather than calling esem.emulator.Emulator.predict() or calling .numpy() on them. This allows the sampling to happen on GPUs where available and can substantially speed up sampling.
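For example, an implausibility-style metric can be computed entirely in TensorFlow from the _predict() output (a sketch; the helper function and its uncertainty argument are our own illustration, not part of the library):

    import tensorflow as tf

    def absolute_implausibility(emulator, sample_points, obs, total_variance):
        # Both outputs are tf.Tensors, so the whole computation can stay
        # on the GPU with no intermediate .numpy() copies
        mean, variance = emulator._predict(sample_points)
        return tf.abs(mean - obs) / tf.sqrt(variance + total_variance)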

The esem.abc_sampler.ABCSampler extends this interface with the esem.abc_sampler.ABCSampler.get_implausibility() and esem.abc_sampler.ABCSampler.batch_constrain() methods. The first allows inspection of the effect of different observations on the constraint, and the second provides a streamlined way of rejecting samples in batches, taking advantage of the large amounts of memory available on modern GPUs.
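In practice these might be used along the following lines (a sketch reusing an emulator, observations and candidate parameters like those in the examples above; the uncertainty and batch size values are illustrative):

    from esem.abc_sampler import ABCSampler

    sampler = ABCSampler(emulator, observations, obs_uncertainty=0.1)

    # Per-observation implausibility, useful for inspecting which
    # observations actually do the constraining
    implausibility = sampler.get_implausibility(candidate_params)

    # Boolean mask of the candidates surviving a 3-sigma cut, evaluated
    # in batches sized to fit in GPU memory
    valid = sampler.batch_constrain(candidate_params, threshold=3.0, batch_size=10)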