### Communicating Across Disciplines

“The ass that tries to sit between two chairs ends up on the floor.” Evgeni Fedorovich

I recently gave a talk at my alma mater, the University of Science and Arts of Oklahoma. The title of the talk was “Using Mathematics to Understand and Predict the Weather,” and it introduced three simple applications of basic mathematical thinking to problems of interest in meteorology. Putting it together was a lot of fun and tied together many of my experiences in learning the basic ideas and problems of the field, and I tried to convey to the audience that “I picked it up as I went along.”

USAO is a liberal arts college with a very strong focus on interdisciplinary thinking. This is a buzzword, with different meanings depending on the source. What I mean by it is that as a student, I was immersed in an environment that eschewed the traditional practice of extracting a set of disciplinary knowledge and treating it as independent. Instead, our courses were structured so that history, literature, politics, economics, etc. were treated as different angles on “big problems” related to achieving the best life possible for as many of us as possible, and to defining what that actually means.

I’ve thought a lot about my education at USAO. In terms of mathematics, the program was not as strong as if I’d attended OU, simply because it was smaller and had fewer course offerings. But in a real way that education trained my mind to see big picture connections that would be very difficult to appreciate otherwise.

For example, decisions made by meteorologists have scientific aspects, safety impacts, and economic costs, and all of these things interact with one another. It’s not good enough to know how a hurricane should behave. The social dimension of the problem and the economic costs of evacuation are as important as where, when, and with what intensity the hurricane will make landfall.

Understanding and celebrating the importance of each component, and coming up with balanced solutions (for example, balancing the physical science research with better social understanding for communicating warnings), is what my time at USAO taught me to do. These problems can be tackled individually in a disciplinary setting, but the solutions can really only be implemented with an interdisciplinary focus.

### Numerical Integration and the Fundamental Theorem of Calculus

I occasionally get to think about fun math problems while I’m doing work in applied science. The most recent example of this came up in a conference call. We are concerned about how a satellite measurement will respond to surface pressure changes. Theoretically, the measurement is an integral of the amount of a trace gas in the column, multiplied by an instrument specific kernel:

$\int_0^{p^*} q(p)\Delta\sigma(p) dp$

Note that the Fundamental Theorem of Calculus tells us that the sensitivity of this quantity with respect to surface pressure $p^*$ should be exactly the integrand evaluated at $p^*$. One of my collaborators asked me to verify that this was also true for our model approximation, which is a simple numerical calculation of this integral. I thought it would be a piece of cake. In our model coordinate system, the vertical coordinate is specified as a function of the surface pressure so that the model levels follow the terrain. This means that the knots on which we compute the numerical approximation also depend on the surface pressure. More specifically, we have vectors $\vec{A}$ and $\vec{B}$, with components $a_i$ and $b_i$, so that the vertical coordinate is $\vec{p} = \vec{A}+\vec{B}p^*$, that is, $p_i = a_i + b_i p^*$. For simplicity, let $f(p) = q(p) \Delta \sigma(p)$. A simple right hand approximation (since the surface pressure occurs at the right end of the integration domain) would be

$\int_0^{p^*} f(p) dp \approx \sum_{i=1}^{N} f(p_i) (p_i-p_{i-1})$
$= \sum_{i=1}^{N} f(a_i + b_i p^*) \left((a_i-a_{i-1}) +(b_i-b_{i-1})p^*\right)$

Now, the right hand side is a function of $p^*$ only, and is easily differentiable using the product rule. After taking a derivative, one would expect that some nice math tricks would lead to cancellation of terms, so that we end up with only $q(p^*) \Delta\sigma(p^*)$, but try as I might, I couldn’t get it to come out. This could easily be due to my own lack of cleverness, but a bit of time searching via Google turned up no results for the problem either. I’m hopeful that a clever mathematician out there can figure out some conditions on $\vec{A}$ and $\vec{B}$ that would force this to be true, but I’ve had to put it aside for now, after trying multiple approximations (left hand, trapezoid, Gaussian quadrature).
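
For reference, here is the product-rule derivative written out, as a sketch rather than a resolution of the problem. It uses only $p_i = a_i + b_i p^*$, so that $dp_i/dp^* = b_i$, and the prime on $f$ (notation introduced here) denotes the derivative with respect to $p$:

$\frac{d}{dp^*}\sum_{i=1}^{N} f(p_i)(p_i-p_{i-1}) = \sum_{i=1}^{N} \left[ b_i f'(p_i)(p_i-p_{i-1}) + f(p_i)(b_i-b_{i-1}) \right]$

If the last knot is assumed to sit at the surface, $p_N = p^*$ (that is, $a_N = 0$ and $b_N = 1$), the Fundamental Theorem of Calculus result would require this sum to collapse to $f(p_N)$, and nothing in the expression forces that for general $a_i$ and $b_i$.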

Numerical evidence suggests that in fact the statement is not true. I ran the (Enthought Distribution) Python script:

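    from numpy import arange, linspace

    # Reference grid: 11 knots on [0, 1], sampling f(x) = x**2
    x = linspace(0, 1, 11)
    y = x**2
    for i in arange(10):
        # Same number of knots, with the right endpoint moved out by 10**(-i)
        xn = linspace(0, 1. + 10.**(-i), 11)
        yn = xn**2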

In this case, the surface pressure would be 1. An important facet of the problem is keeping the number of knots (11) fixed as we take the limit (as $i\rightarrow\infty$). Running the script, you’ll note that the limit appears to be in the neighborhood of 2 (before roundoff kills the computation), even though it ought to be 1. Further experimentation with different degrees of polynomials points to what may be an interesting math result, which I’ll let you discover for yourself.
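
One way to carry the experiment through to a number is sketched below. It assumes a right hand rule on evenly spaced knots and a forward difference quotient in the surface pressure. The helper names `right_hand_sum`, `sensitivity`, and `square` are made up for this illustration, and the choice of rule is a guess.

```python
import numpy as np

def right_hand_sum(f, p_surf, n_knots=11):
    """Right hand rule on n_knots evenly spaced knots spanning [0, p_surf]."""
    p = np.linspace(0.0, p_surf, n_knots)
    return np.sum(f(p[1:]) * np.diff(p))

def sensitivity(f, p_surf=1.0, eps=1e-6, n_knots=11):
    """Forward difference quotient of the right hand sum with respect to p_surf."""
    base = right_hand_sum(f, p_surf, n_knots)
    bumped = right_hand_sum(f, p_surf + eps, n_knots)
    return (bumped - base) / eps

def square(p):
    """Test integrand f(p) = p**2, as in the experiment above."""
    return p ** 2

# Shrink the perturbation while keeping the number of knots fixed at 11.
for i in range(10):
    print(i, sensitivity(square, eps=10.0 ** (-i)))
```

For $f(p) = p^2$ on 11 evenly spaced knots, the right hand sum works out to $0.385\,(p^*)^3$, so this particular quotient settles near 1.155 instead of $f(1) = 1$. The listing above does not show which rule or quotient was used, so this number need not match the figure quoted there.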

I am posting this in the hope that someone will think the problem is interesting enough to have a whack at it. It seems unsatisfying that our tried and true numerical methods would fail such a simple test.

### Success and Failure in Research

“Success consists of going from failure to failure without loss of enthusiasm.”  Winston Churchill

### Math for Manufacturing Jobs

I listened to the NPR story “For Manufacturing Jobs, Workers Brush Up On Math”, in which the thesis of the reporter is that a lot of people are not getting jobs because they don’t have the basic arithmetic skills necessary to input the correct parameters to complex manufacturing machinery. The costs associated with inputting the wrong numbers to these machines are extremely large (~$1K to $10K) relative to the salary of the workers. As a result, manufacturing math courses are popping up at community colleges and training centers. The thing that’s odd about this story is that we begin learning these things in middle school, and the recursive nature of math education means that students are going to see them in almost every math course from then on. Is it just that students forget these skills after they leave the high school environment? Or do they ever learn them? And if they’re not learning them, how are they graduating from high school? Is it possible to pass high school level math courses without knowing basic arithmetic?

### Craig Bishop: The Secrets of Model/Background Error Covariance Revealed

This is my first content post for the satellite data assimilation summer school, and it covers the first big insights that I’ve gotten by being here. These come courtesy of Dr. Craig Bishop of the Naval Research Lab in Monterey, California. In my previous post, I talked about the difficulty I had envisioning model error as a random process with zero mean and some specified covariance $\mathbf{B}$. Craig gave a pair of illuminating lectures about this topic.

His first point was that we can actually envision model error using basic probability notions. Imagine an infinite collection of earths, each of which has a weather forecast office with a forecast model. Each of these earths has a true state, a forecast, and an observational history. We can collect all of the earths that have the same current true state, and examine the distribution of different forecasts. The resulting density, which we can denote $p\left(x^f|x^t\right)$, is called the “fixed truth” error. The important idea here is that the error is being described statistically, and it’s a mistake to put too much physical emphasis on individual correlations.

Given this basic statistical description of model error, the practical issue becomes actually estimating the “true” model error covariance. We discussed a few approaches. One classic method was proposed by another speaker at the summer school, John Derber, along with his coauthor Parrish, in a paper in the early 1990s. The basic idea is to make pairs of forecasts, one for 48 hours and the other for 24 hours, but both valid for the same end time. By taking differences, and averaging over all of the pairs, we can get a sense of the variability in the model starting from different initial conditions and integrating for two different lengths of time. This is usually called the “NMC Method”. Another method is to use a perturbed initial condition to generate an ensemble of forecasts, and then calculate the sample covariance of these different forecasts, much like you would do in an ensemble data assimilation method. These methods yield a static background error covariance, which doesn’t change with time.
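
Both estimates reduce to a sample covariance, which the rough sketch below illustrates on synthetic data. The function names, array shapes, and random “forecasts” are all made up for illustration. Removing the mean difference and leaving out any empirical rescaling of the NMC estimate are choices made here, not details taken from the lectures.

```python
import numpy as np

def nmc_covariance(f48, f24):
    """Sample covariance of 48 h minus 24 h forecast differences.

    Each row is one forecast pair valid at the same time; columns are state variables.
    """
    diffs = np.asarray(f48) - np.asarray(f24)
    diffs = diffs - diffs.mean(axis=0)          # remove the mean difference
    return diffs.T @ diffs / (diffs.shape[0] - 1)

def ensemble_covariance(members):
    """Sample covariance of an ensemble (rows are members, columns are state variables)."""
    return np.cov(np.asarray(members), rowvar=False)

# Synthetic stand-ins for real forecasts: 200 pairs of a 5-variable state.
rng = np.random.default_rng(0)
n_pairs, n_state = 200, 5
f24 = rng.normal(size=(n_pairs, n_state))
f48 = f24 + rng.normal(scale=0.5, size=(n_pairs, n_state))

B_nmc = nmc_covariance(f48, f24)
B_ens = ensemble_covariance(rng.normal(size=(20, n_state)))
```

In both cases the result is a symmetric `n_state` by `n_state` matrix. The difference between the two methods lies only in where the samples come from.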

In his second talk, Craig highlighted some of the issues with using flow dependent error covariances from ensemble methods. The raw sample error covariance can contain noisy features that persist far from the physical location of interest, which are understood to be spurious, and a result of the finite ensemble size. Localization is the name for the methods that do away with these unwanted features, and Craig spent the remainder of his talk explaining how different people do this, and then presenting a newer method of doing it that he’s been working on. To me, localization is one of the unpleasant things about ensemble methods. It’s a fix for a shortcoming of the method, but there’s not a really satisfying way to do it.
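
To make the idea concrete, here is a sketch of one widely used form of localization: multiplying the sample covariance elementwise by a taper that decays with distance. This is not necessarily any of the schemes covered in the talk. The Gaussian taper, the 1-D grid, the length scale, and the function names are illustrative assumptions, and practical systems often use compactly supported tapers instead.

```python
import numpy as np

def gaussian_taper(n_state, length_scale):
    """Taper with entries exp(-d**2 / (2 L**2)), where d = |i - j| on a 1-D grid."""
    idx = np.arange(n_state)
    dist = np.abs(idx[:, None] - idx[None, :])
    return np.exp(-0.5 * (dist / length_scale) ** 2)

def localize(sample_cov, length_scale):
    """Elementwise (Schur) product of a sample covariance with the distance taper."""
    return sample_cov * gaussian_taper(sample_cov.shape[0], length_scale)

# Synthetic example: 10 members of a 40-variable state with no true correlations,
# so every off-diagonal entry of the raw sample covariance is sampling noise.
rng = np.random.default_rng(1)
n_members, n_state = 10, 40
ensemble = rng.normal(size=(n_members, n_state))
raw = np.cov(ensemble, rowvar=False)
localized = localize(raw, length_scale=3.0)
```

The variances on the diagonal are untouched, while entries linking distant grid points are damped toward zero. The unsatisfying part is visible even here, because the length scale has to be chosen by hand.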

The real contribution from these talks for me was a much more intuitive understanding of how background error covariances should be thought about. The frequentist ideas that I talked about above are a really useful interpretation. Probably this isn’t a sticking point for anyone else, but it really was for me.

### Satellite Data Assimilation Summer School

I arrived in Santa Fe earlier this afternoon, where I will be spending the next two weeks with some of the brightest grad students and postdocs in DA in the US, not to mention the star-studded cast of presenters. I’ll write a little bit about each day, and try to keep the tweets fresh as well (@seanmcrowell). Perhaps I’m the only one who reads this blog, but I feel the need to communicate about this amazing opportunity with everyone, even the people who get here by accident while looking for that other Sean Crowell.

### My New Postdoc

I have been working at NSSL only since last September. My work has gone well. I’ve learned a lot of new skills that are important for doing independent research, like keeping up with literature, using and editing other people’s software, visualizing complex model output, and trying to find interesting questions to ask. More specifically to atmospheric science, I have learned a lot about how atmospheric models work, and how the data assimilation problem is much more complex when you are dealing with such complex models. These are things that are not at all apparent from the mathematical formalisms.

I am transitioning to a new project, one that is very different from my previous experiences, because the spatial and temporal scales are so much larger. My new project involves estimating the sources of carbon dioxide at the surface using concentration measurements from satellites. The mathematics of this problem is almost the same as that of short range initial condition estimation, except that we’re interested in the lower boundary condition rather than the initial condition.

I’ll write more in a future post about the formal problem.  For now, just let me say that I’m excited about the opportunity, partly because it’s closer to the problem of global climate prediction, and partly because it’s a chance to learn about large scale atmospheric fluid mechanics.  I’ve spent the last couple of years on the small, short time scale events, and now I’m ready to learn about global, long time scale events.