ECCO Adjoint Modeling

What makes an ocean model an even more powerful tool? Its adjoint!

Perhaps best known for their use in data assimilation or state estimation, adjoint models are increasingly employed for other purposes as well, for example as a tool to investigate workings of the ocean that are difficult to ascertain otherwise.

Here we describe what an adjoint is, how it’s used, and what’s special about ECCO’s adjoint.

What is an adjoint?

An adjoint is a transformation used in studying mathematical relationships.

An adjoint model is an adjoint-transformed version of a model. In this context, a "model" is a computer program that applies a set of rules to take input and calculate output.

To distinguish it from its adjoint, the original, pre-transformed model is often referred to as the "forward model", in contrast to its adjoint, which computes things "backward" by switching what is input and what is output.

Adjoint models are particularly useful for computing how forward model output depends on its input, i.e., its sensitivity.

These images show the relationships between forward and adjoint models' inputs and outputs. Adjoint variables are distinguished by their carets (e.g., â).

Relationships between forward and adjoint model inputs and outputs

For illustration, suppose we have a simple "forward model" written as follows:
a = x - 2y + 3z
b = 4x - 5z

Now suppose we have a quantity of interest J (often called a "cost function", "objective function", or "target function") defined in terms of the forward model output:
J = 6a - 7b

We want to calculate the sensitivity of J: how much would J change if each of the forward model's inputs changed, one at a time? Here's a way... start by changing x by 1. So, x = 1 | y = 0 | z = 0:

a = 1 - (2*0) + (3*0) = 1
b = (4*1) - (5*0) = 4

With a = 1 and b = 4, let's calculate J:

J = (6*1) - (7*4) = -22

Since J = 0 when x, y, and z are all 0, this value of -22 is the sensitivity of J to x. But we still need to change y and z by 1 to evaluate the sensitivity to y and z. Each requires a separate evaluation of the forward model. Evaluating the sensitivity this way gives (-22, -12, 53) for (x, y, z).
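The one-at-a-time procedure above can be sketched in a few lines of Python. This is only an illustration of the toy example; the function names are made up for this sketch and are not part of ECCO or MITgcm.

```python
# Toy forward model from the text: two outputs (a, b) from three inputs (x, y, z).
def forward(x, y, z):
    a = x - 2 * y + 3 * z
    b = 4 * x - 5 * z
    return a, b


# Quantity of interest J, defined on the forward model output.
def cost(a, b):
    return 6 * a - 7 * b


def sensitivity_by_perturbation():
    """Perturb each input by 1 in turn: one forward evaluation per input."""
    base = cost(*forward(0, 0, 0))  # J at the unperturbed state (0 here)
    perturbations = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    return [cost(*forward(*p)) - base for p in perturbations]


print(sensitivity_by_perturbation())  # [-22, -12, 53]
```

Note that the loop calls the forward model three times, once for each of x, y, and z.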

That's a lot of steps... is there an easier way? Yes! The "adjoint model", in comparison, allows us to evaluate this sensitivity in one step.

Get ready to enter the "world of the matrix." Remember the equations of the forward model? Each can be shown in matrix form, like this:
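The original figure is not reproduced here. Reading the coefficients off the two forward equations above, the matrix form would be:

```latex
\[
\begin{pmatrix} a \\ b \end{pmatrix}
=
\begin{pmatrix} 1 & -2 & 3 \\ 4 & 0 & -5 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
\]
```

The zero in the second row reflects that b does not depend on y.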

Matrix form makes it easier to see how an adjoint is the reverse of a forward model. Below, our forward and adjoint models are highlighted. The positions of (a, b) and (x, y, z) are switched for the adjoint variables (â, b̂) and (x̂, ŷ, ẑ). Also, rows in the forward model matrix become columns in the adjoint model matrix; in other words, the adjoint matrix is the transpose of the forward one.

So, what are the equations for the adjoint model?
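The interactive image is not reproduced here. As a reconstruction from the toy forward model, turning the rows of the forward matrix into columns would give the following adjoint model, in matrix form and written out equation by equation:

```latex
\[
\begin{pmatrix} \hat{x} \\ \hat{y} \\ \hat{z} \end{pmatrix}
=
\begin{pmatrix} 1 & 4 \\ -2 & 0 \\ 3 & -5 \end{pmatrix}
\begin{pmatrix} \hat{a} \\ \hat{b} \end{pmatrix}
\quad\Longleftrightarrow\quad
\begin{aligned}
\hat{x} &= \hat{a} + 4\hat{b} \\
\hat{y} &= -2\hat{a} \\
\hat{z} &= 3\hat{a} - 5\hat{b}
\end{aligned}
\]
```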

Now that we have the adjoint equations, how do we calculate the sensitivity of J? The adjoint variables are the sensitivity!

Namely, by setting the adjoint model's input (â, b̂) to be the sensitivity of J to (a,b)...

â = (6*1) - (7*0) = 6
b̂ = (6*0) - (7*1) = -7

... the adjoint model's output (x̂, ŷ, ẑ) is the sensitivity of J to (x, y, z):

x̂ = (1*6) + (4*-7) = -22
ŷ = (-2*6) + (0*-7) = -12
ẑ = (3*6) + (-5*-7) = 53

These results match the previous calculations. Happily, this computation requires only one evaluation of the adjoint model instead of the three that were required by the forward model.
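For comparison with the forward approach, here is a minimal Python sketch of the toy adjoint model. The function name and variable names are illustrative only.

```python
# Toy adjoint model: the transpose of the forward model
#   a = x - 2y + 3z
#   b = 4x - 5z
# Each forward row becomes an adjoint column.
def adjoint(a_hat, b_hat):
    x_hat = 1 * a_hat + 4 * b_hat
    y_hat = -2 * a_hat + 0 * b_hat
    z_hat = 3 * a_hat - 5 * b_hat
    return x_hat, y_hat, z_hat


# Adjoint input: the sensitivity of J = 6a - 7b to (a, b).
a_hat, b_hat = 6, -7

# A single adjoint evaluation yields the sensitivity of J to (x, y, z).
print(adjoint(a_hat, b_hat))  # (-22, -12, 53)
```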

These equations are simple and thus the efficiency of using the adjoint might not be obvious. But imagine the model being more complex such as in ECCO, where a single forward model evaluation requires days of computation on a state-of-the-art supercomputer and its input consists of millions of variables! Evaluating the sensitivity with the forward model would take one evaluation per input variable. Using the adjoint model, on the other hand, requires only one evaluation in total.

Thus, it is impractical to use the forward model to evaluate the sensitivity of a model like ECCO’s, whereas it is routine with its adjoint. This computational efficiency is what makes an adjoint model indispensable.
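To make the scaling concrete, the sketch below counts evaluations for a made-up linear model with 1000 inputs. The random matrix, the weights defining J, and all names are assumptions chosen for illustration; a real ocean model is nonlinear and vastly larger.

```python
import random

N_INPUTS, N_OUTPUTS = 1000, 3

# Hypothetical linear "model": a random integer matrix (outputs x inputs).
rng = random.Random(0)
matrix = [[rng.randint(-5, 5) for _ in range(N_INPUTS)] for _ in range(N_OUTPUTS)]

# Hypothetical quantity of interest: J = sum(weights[i] * output[i]).
weights = [6, -7, 2]


def forward(inputs):
    """Forward model: outputs = matrix @ inputs."""
    return [sum(m * v for m, v in zip(row, inputs)) for row in matrix]


def adjoint(out_hat):
    """Adjoint model: in_hat = transpose(matrix) @ out_hat."""
    return [sum(matrix[i][j] * out_hat[i] for i in range(N_OUTPUTS))
            for j in range(N_INPUTS)]


# Forward approach: perturb one input at a time, one evaluation per input.
forward_calls = 0
grad_forward = []
for j in range(N_INPUTS):
    unit = [0] * N_INPUTS
    unit[j] = 1
    outputs = forward(unit)
    forward_calls += 1
    grad_forward.append(sum(w * o for w, o in zip(weights, outputs)))

# Adjoint approach: a single evaluation, fed with the sensitivity of J to the outputs.
grad_adjoint = adjoint(weights)
adjoint_calls = 1

print(forward_calls, adjoint_calls)  # 1000 1
print(grad_forward == grad_adjoint)  # True
```

Both approaches return the same sensitivities; only the number of model evaluations differs.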

What are other examples of applications using adjoints to compute sensitivity? Check out how airplane wings and motorcycles are optimized in terms of airflow.

How is an adjoint used?

Sensitivity (the gradient) is a foundational concept of calculus, which is used in every branch of science and engineering that employs math. Just as calculus has advanced these disciplines, adjoint-derived sensitivities provide new insight into problems.

Adjoint models were first introduced in oceanography in the context of data assimilation. Based on sensitivity of model-data differences, adjoint models are used to fit corresponding forward models to observations. This approach is also at the heart of ECCO's state estimation products. Values of an adjoint's sensitivity also tell us what observations are most effective for monitoring different quantities of interest and are useful in designing observing systems. Check out this StoryMap for one example:

ECCO's adjoint models help pinpoint sensible locations for climate sensors in the North Atlantic-Arctic gateway region.

StoryMap | Loose et al. 2020
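Returning to the data assimilation idea described above, fitting a forward model to observations using the sensitivity of model-data differences can be sketched with the toy model from earlier. This is a plain gradient descent on a squared misfit, chosen for simplicity; it is not ECCO's actual optimization procedure, and the observations, step size, and iteration count are made up.

```python
def forward(x, y, z):
    """Toy forward model from the earlier example."""
    return x - 2 * y + 3 * z, 4 * x - 5 * z


def adjoint(a_hat, b_hat):
    """Toy adjoint model (transpose of the forward model)."""
    return a_hat + 4 * b_hat, -2 * a_hat, 3 * a_hat - 5 * b_hat


def fit(a_obs, b_obs, step=0.01, n_iter=500):
    """Adjust (x, y, z) so the model output approaches the observations."""
    x = y = z = 0.0  # first guess
    for _ in range(n_iter):
        a, b = forward(x, y, z)
        # Misfit J = (a - a_obs)^2 + (b - b_obs)^2; its sensitivity to (a, b)
        # is the adjoint model's input.
        a_hat = 2 * (a - a_obs)
        b_hat = 2 * (b - b_obs)
        # One adjoint evaluation gives the gradient of J with respect to all inputs.
        gx, gy, gz = adjoint(a_hat, b_hat)
        x -= step * gx
        y -= step * gy
        z -= step * gz
    return x, y, z


a_obs, b_obs = 1.0, 4.0  # hypothetical observations
x, y, z = fit(a_obs, b_obs)
a, b = forward(x, y, z)
misfit = (a - a_obs) ** 2 + (b - b_obs) ** 2
print(misfit)  # very close to zero
```

With three inputs and only two observations, many input combinations fit equally well; the sketch simply finds one of them.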

The adjoint also provides an effective means to investigate the workings of the ocean. For instance, the evolution of a passive tracer (think of dye spreading in water) can tell us how that tracer-tagged water moves over time, in other words, the "fate" of the water. In contrast, the sensitivity of such a quantity to tracer concentrations in the past is determined using the adjoint. It tells us where that tracer-tagged water came from, or the "origin" of the water.
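The fate-versus-origin idea can be illustrated with a toy one-dimensional "ocean". The sketch below assumes a ring of 12 cells with flow toward higher cell numbers, where half of each cell's tracer moves downstream per step. The grid, the mixing fraction, and the function names are invented for illustration and have nothing to do with ECCO's actual tracer code.

```python
N_CELLS = 12
ALPHA = 0.5  # assumed fraction of tracer carried one cell downstream per step


def forward_step(c):
    """Forward tracer step: each cell keeps part and receives from upstream."""
    return [(1 - ALPHA) * c[i] + ALPHA * c[(i - 1) % N_CELLS]
            for i in range(N_CELLS)]


def adjoint_step(c_hat):
    """Transpose of forward_step: sensitivity is carried one cell upstream."""
    return [(1 - ALPHA) * c_hat[i] + ALPHA * c_hat[(i + 1) % N_CELLS]
            for i in range(N_CELLS)]


def run(step, state, n_steps):
    for _ in range(n_steps):
        state = step(state)
    return state


def occupied(state):
    """Indices of cells holding any tracer (or any sensitivity)."""
    return [i for i, value in enumerate(state) if value > 0]


# Tag the water in cell 5 at "time 0".
tagged = [0.0] * N_CELLS
tagged[5] = 1.0

fate = run(forward_step, tagged, 3)    # where the tagged water goes
origin = run(adjoint_step, tagged, 3)  # where the tagged water came from

print(occupied(fate))    # [5, 6, 7, 8]
print(occupied(origin))  # [2, 3, 4, 5]
```

The forward run spreads downstream from the tagged cell, while the adjoint run spreads upstream, mirroring the "fate" and "origin" described above.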

These movies show how ECCO is used to examine the origin and fate of water in the surface layer of the central Equatorial Pacific (5°S-5°N, 150°W-90°W) at a chosen year ("0"), deduced by a passive tracer (years > 0) and its adjoint (years < 0). The adjoint passive tracer is evaluated backward in time but is animated forward in time below. The evolving spatial extent of the tracer, shown in blue, illustrates the pathway of ocean circulation. Note the vast spatial extent of both origin and fate compared to where the tracer is at year 0.

Movement of tracer-tagged water, forward model

Movement of tracer-tagged water, adjoint model

Blue indicates the outer envelope of the origin and fate of the water mass, inside of which 95% is found. Red indicates regions of highest concentration; it is visible in this 3D perspective only where the blue outer envelope intersects the ocean surface. The evolving blue envelope illustrates the convoluted pathway of ocean circulation; water converges from the subtropics at depth (adjoint model) and then spreads away on the surface moving westward (forward model). (Adapted from Fukumori et al., 2004, doi: 10.1175/2515.1.)

Adjoint models are also employed in studies of causation and attribution ("adjoint gradient decomposition"). With this method, sensitivity is used to quantify the effects of the different elements driving the ocean. Thus, it provides a tool to assess the relative contributions of various drivers (e.g., north-south wind, east-west wind). In contrast to the common practice of using correlation, which quantifies similarity, adjoints reveal causation because they are based on first principles. Check out these StoryMap examples:

The ECCO adjoint is used to untangle the contributions to extreme sea level rise in the Beaufort Sea.

StoryMap | Fukumori et al. 2021

The adjoint helped isolate contributions of wind, temperature & salinity to the North Atlantic’s overturning circulation.

StoryMap | Kostov et al. 2021

What sets the heat content of Southern Ocean mode water formation regions? Not only that but where and when?

StoryMap | Boland et al. 2021
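The attribution idea behind these studies can be sketched with the toy model from earlier: multiply each adjoint-derived sensitivity by the change in the corresponding input to get that input's contribution to the change in J. The input changes below are invented numbers, and the toy inputs x, y, z merely stand in for real drivers such as winds.

```python
def forward(x, y, z):
    """Toy forward model from the earlier example."""
    return x - 2 * y + 3 * z, 4 * x - 5 * z


def cost(a, b):
    """Toy quantity of interest."""
    return 6 * a - 7 * b


def adjoint(a_hat, b_hat):
    """Toy adjoint model (transpose of the forward model)."""
    return a_hat + 4 * b_hat, -2 * a_hat, 3 * a_hat - 5 * b_hat


# Sensitivity of J to each input, from one adjoint evaluation.
sensitivity = dict(zip(("x", "y", "z"), adjoint(6, -7)))

# Hypothetical changes in each "driver".
change = {"x": 0.5, "y": -1.0, "z": 0.25}

# Contribution of each driver = sensitivity * change in that driver.
contribution = {name: sensitivity[name] * change[name] for name in sensitivity}
total = sum(contribution.values())

# Cross-check against a direct forward evaluation with all changes applied.
direct = cost(*forward(change["x"], change["y"], change["z"]))

print(contribution)   # {'x': -11.0, 'y': 12.0, 'z': 13.25}
print(total, direct)  # 14.25 14.25
```

Because the toy model is linear, the contributions sum exactly to the total change; for a nonlinear model such a sum would only be an approximation.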

Use of adjoint models is still at a nascent stage and is ripe for innovation. What would you do with an adjoint?

What's special about ECCO's adjoint?

ECCO's ocean model, MITgcm, is one of the few state-of-the-art general circulation models that have an adjoint readily available.

Coding the adjoint of a complex numerical model is time-consuming and difficult. It can be comparable in effort to the development of the forward code itself. However, ECCO's underlying ocean model was purposefully written so that its adjoint code can be obtained automatically. How? Using Algorithmic Differentiation (AD).

AD software can automatically transform computer programs line by line into their adjoint. In fact, MITgcm was created with automatic differentiation in mind, with ECCO's ocean state estimation advancing hand-in-hand with the establishment of such tools. As MITgcm was developed, new AD features were advanced to work seamlessly with ocean circulation models. Furthermore, MITgcm's programming style was restricted to constructs amenable to AD. Today, a rigorous process is still in place to test the "adjointability" of any new feature introduced in MITgcm. This ensures that the adjoint is available for the latest version of MITgcm for use in ECCO's ocean state estimation and its various applications.
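To give a flavor of what "line by line" means, here is a hand-written Python sketch of a tiny three-line program and its adjoint. It only mimics the kind of code an AD tool might produce; it is not output from any AD tool, nor MITgcm code, and the example program is invented.

```python
import math


def forward(x, y):
    t = x * y        # line 1
    u = math.sin(t)  # line 2
    j = u + x        # line 3
    return j


def adjoint(x, y, j_hat=1.0):
    """Adjoint of forward(): each line is differentiated, in reverse order."""
    # Recompute the intermediate value needed by the adjoint statements.
    t = x * y
    # Adjoint of line 3: j = u + x
    u_hat = j_hat
    x_hat = j_hat
    # Adjoint of line 2: u = sin(t)
    t_hat = math.cos(t) * u_hat
    # Adjoint of line 1: t = x * y
    x_hat += y * t_hat
    y_hat = x * t_hat
    return x_hat, y_hat


def finite_difference(x, y, eps=1e-6):
    """Independent check using small perturbations of the forward code."""
    dj_dx = (forward(x + eps, y) - forward(x - eps, y)) / (2 * eps)
    dj_dy = (forward(x, y + eps) - forward(x, y - eps)) / (2 * eps)
    return dj_dx, dj_dy


print(adjoint(0.7, 1.3))
print(finite_difference(0.7, 1.3))
```

The two printed pairs should agree to several decimal places. Comparing adjoint gradients against perturbed forward runs is one common way to check an adjoint.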

Publications that may help you better understand how adjoints are generated:

Thacker, W., and Long, R. (1988). Fitting Dynamics to Data, JGR 93(C2), 1227-1240, doi: 10.1029/JC093iC02p01227.
Giering, R., and Kaminski, T. (1998). Recipes for Adjoint Code Construction, ACM Transactions on Mathematical Software 24(4), 437-474, doi: 10.1145/293686.293695.
Marotzke, J., et al. (1999). Construction of the adjoint MIT ocean general circulation model and application to Atlantic heat transport sensitivity, JGR Oceans 104(C12), 29529-29547, doi: 10.1029/1999JC900236.
Stammer, D., et al. (2002). Global ocean circulation during 1992–1997, estimated from ocean observations and a general circulation model, JGR Oceans 107(C9), doi: 10.1029/2001JC000888.
Heimbach, P., et al. (2008). The MITgcm/ECCO adjoint modelling infrastructure, CLIVAR Exchanges 13(1), 13-17.
Wunsch, C., et al. (2009). The global general circulation of the ocean estimated by the ECCO-Consortium, Oceanography 22(2), 88-103, doi: 10.5670/oceanog.2009.41.

Want to learn more about MITgcm? Check out the User Manual. You may find Chapter 7 (Automatic Differentiation) and Chapter 10 (Ocean State Estimation Packages) particularly useful.