Home > Combinatorics and Probability, Differential Equations, Linear Algebra, Markov Chains, Mathematics > On Lanchester's Differential Equations and their Transform into a Markov Transition Matrix (or, the Markovization of the Lanchester Equations)

On Lanchester's Differential Equations and their Transform into a Markov Transition Matrix (or, the Markovization of the Lanchester Equations)

I have been very curious as to whether the Markovization of differential equations is applicable to all systems, linear or not.  In the case of the SIR differential equations, Markovization was possible even when the system is both non-linear and coupled.  I think that perhaps there is an algorithm to frame the differential equations thusly... I haven't fleshed it out completely (other than by using quite a bit of intuition, as in the SIR case).  Markovization, I think, is distinct from putting differential equations in state-space form; the vector pulled through the state-space matrix represents the trajectories of a function and its derivatives, not the trajectories of a set of functions related in that their sum is one: probabilities or proportions.  Who knows, though, a happy mix may be possible.

I was examining a Calculus book, specifically by Deborah Hughes-Hallet, Andrew M. Gleason, et al. (New York, John Wiley & Sons, Inc., 1994), and came across this very interesting problem that models how armies or soldiers fight it out... and tries to predict who wins.  I hope you don't judge: even after having read the most complicated books in mathematics, I honestly believe that revisiting more basic books can bring about insights that one didn't think of before.  Maybe even ground the complicated knowledge one acquired, or refine it.  Such has been my own experience... so I think it's a useful exercise, for all of those who have resisted "going back to the basics." If you are one, perhaps you are missing the opportunity to bridge the gaps you may (or not) still have!  :-\

The problem on page 550 says that, if $x$ represents the number of soldiers from army $X$ and $y$ represents the number of soldiers from army $Y$, the rate at which soldiers in one army are injured or die can be thought of as proportional to the number of "firepower" the other army can concentrate on them.  The model was first thought by F.W. Lanchester:

$\frac{dx}{dt} = -ay$

$\frac{dy}{dt} = -bx$ with $a, b > 0$.

In order to Markovize Lanchester's differential equations, we need to flesh out what the states are: clearly, you can be alive and belong to army $X$, or army $Y$, or be dead (from a different vantage point, we could have "Dead from $X$" and "Dead from $Y$" if we wanted to count each individually).  The Markov transition matrix should look something like this, then:

$L = \left[ \begin{array}{ccc} something & 0 & something \\ 0 & something & something \\ 0 & 0 & 1 \end{array} \right]$

where $L_{1,1}$ represents the probability of being alive in army $X$ (if you are from army $X$) after a time step (or the proportion of alive soldiers in $X$ after a time step), $L_{1,2}$ represents the probability of "defecting" to army $Y$ if you are in army $X$ (I assume such doesn't happen), $L_{1,3}$ represents the probability of dying if you are in army $X$.  Then $L_{2,1}$ represents the probability of defecting to army $X$ if you are in army $Y$ (I also assume this doesn't happen), $L_{2,2}$ represents the probability of being alive in army $Y$, if you are from army $Y$, and $L_{2,3}$ is the probability of dying if you are in army $Y$.  The third row of $L$ simply represents the fact that one cannot come back from the dead once dead:  there are no zombies (such may be in fact possible in videogames, for example, hence why I make a point of saying this).

The next order of business is to note that the total population does not change (alives and deads).  Thus we would have that:

$p_{alive,x} = x/N$

$p_{alive,y} = y/N$, and

$p_{alive,x} + p_{alive,y} + p_{dead} = 1$.  Rewriting Lanchester's equations in terms of proportions, we would have that:

$\frac{d p_{alive,x}}{dt} = -a p_{alive,y}$

$\frac{d p_{alive,y}}{dt} = -b p_{alive,x}$

To be complete, we must model what the rate of change of the dead is, which we can figure out by differentiating in time the equation $p_{alive,x} + p_{alive,y} + p_{dead} = 1$.  In effect, what I'm saying is that, Lanchester's equations are incomplete - they are missing the key information that:

$\frac{d p_{dead}}{dt} = a p_{alive,y} + b p_{alive,x}$

This equation basically states that the dead increases as the rate at which soldiers in army $X$ are put down and the rate at which soldiers in army $Y$ are put down.

Euler's method suggests that to write the approximation to these differential equations, we need think of the derivative not at the limit, but taking tiny time steps.  So in the following equation

$lim_{\Delta t \rightarrow \infty} \frac{\Delta p_{alive,x}}{\Delta t} = -a p_{alive,y}$

let's forget about the limit for a second, and solve thus:

$\Delta p_{alive,x} \approx -a p_{alive,y} \Delta t$

and

$p_{alive,x} (t+\Delta t) \approx p_{alive,x} (t) - a p_{alive,y} \Delta t$.  Writing all differential equations thus:

$p_{alive,y} (t+\Delta t) \approx p_{alive,y} (t) -b p_{alive,x} \Delta t$ and

$p_{dead} (t+\Delta t) \approx p_{dead} (t) + a p_{alive,y} \Delta t + b p_{alive,x} \Delta t$

Euler's method gives us just what we need for our transition matrix construct.  If the approximations are in fact equalities, from the point-of-view of Markov chain theory, the left-hand side is the next-step vector, and the right-hand side explains how this vector would be obtained by a transformation on the at-step vector.  In terms of Markov dynamics:

$\textbf{p(t +} \Delta t \textbf{)} = \textbf{p(t)} \cdot L$

Playing a bit with the matrix $L$, it should look like:

$L = \left[ \begin{array}{ccc} 1 - a \cdot \frac{p_{alive,y}}{p_{alive,x}} \Delta t & 0 & a \cdot \frac{p_{alive,y}}{p_{alive,x}} \Delta t \\ 0 & 1 - b \cdot \frac{p_{alive,x}}{p_{alive,y}} \Delta t & b \cdot \frac{p_{alive,x}}{p_{alive,y}} \Delta t \\ 0 & 0 & 1 \end{array} \right]$

Letting $\Delta t$ be unity,

$\textbf{p(t +1)} = \textbf{p(t)} \cdot L$

with

$L = \left[ \begin{array}{ccc} 1 - a \cdot \frac{p_{alive,y}}{p_{alive,x}} & 0 & a \cdot \frac{p_{alive,y}}{p_{alive,x}} \\ 0 & 1 - b \cdot \frac{p_{alive,x}}{p_{alive,y}} & b \cdot \frac{p_{alive,x}}{p_{alive,y}} \\ 0 & 0 & 1 \end{array} \right]$

and, I think, we have successfully Markovized the differential equations, in effect discretizing them at unit time steps.  We should be cautious though, because now both $a, b$ have additional restrictions in order that the entries of $L$ remain below or equal to 1 (and specifically, the sum of each row must be equal to one).

Euler's method in effect is the link here between the continuous differential equations and the Markov transition matrix.  In the SIR discretization of my previous post, I went backwards, showing how we can write continuous differential equations from the Markov transition matrix.

Several interesting things to note about the matrix $L$: the probability of death, if one is in army $X$ is proportional to the attrition $a$ and by how many soldiers army $Y$ exceeds $X$ at the time.  A similar argument works for the death probability of soldiers in $Y$, with attrition $b$ and excedent of $X$ to $Y$.

As with the SIR equations, since $L$ is really changing every time step, it is $L(t)$.  So, from initial conditions, the amount of soldiers alive in army $X$, army $Y$, and dead at time step $n$ can be calculated using the discretized matrix by pulling the original vector through, as:

$\textbf{p(t +} n \Delta t \textbf{)} = \textbf{p(t)} \cdot L(t) \cdot L(t + \Delta t) \cdot \ldots \cdot L(t + (n-1) \Delta t)$

In my next post, I want to model, as in the Calculus book I mentioned, the Iwo Jima encounter during World War II, the fight between Japan and the US, and solving the trajectory numerically using the discretization above.

An additional note, from examining our assumptions of matrix $L$, we realize now simple ways in which we could extend Lanchester's differential equations, by filling in "probabilities that a soldier will defect to the other army."  We have indeed now enriched the model in the sense that we can now take into account other complicated dynamics (the least of which is the "creation of zombies").  I think this is very useful in any model, because it allows us to contemplate transitions into states we wouldn't have considered otherwise, and, more powerfully, allows us to transform our mindset to the realm of differential equations, where we can have DE-related insights, and back again, where we can have Markovian insights.

1. August 17th, 2010 at 07:37 | #1

Thanks for this! I've been working around the edges trying to figure this out, but I'm weak in DE and Markov theory, so I haven't been making much progress. Given your example as a pattern I might be able to work out the Markov form of the "Linear" Lanchester equations for myself.

2. August 17th, 2010 at 09:30 | #2

I'm so glad it's useful! :) I hope you'll let me know what your project is all about!!

3. August 17th, 2010 at 15:09 | #3

>I hope you’ll let me know what your project is all about!

Recently I've been writing about Lanchester's Laws, and I have been wanting to show the Markov or probability based version to demonstrate the inherent uncertainty of the results of this sort of prediction. I could do this demonstration in a spreadsheet, but I knew there had to be a more rigorous way to show it. I'll use this to better understand the games I like play, and maybe (eventually) to help design a game of my own.

More generally, I'm a biostatistician and wargamer with a hobby interest in the mathematics of games, which is the general topic of my blog. I'm pretty good at applied math and stats, but my education is lacking on some topics and I've never been great at theory. I can read the journals though (and math blogs!) , so I search for others writing about related topics to fill the gaps in my own knowledge.

I now have you on my RSS feed, added to my blog roll, and I'll be linking to this in an upcoming blog post. No doubt you will be see more of me in the future. :-)
---Dan

4. August 17th, 2010 at 15:43 | #4

Thank you! I'll take a look at yours too and add you to my blogroll as well. Perhaps you'll be interested to know then that, since this method I talk about here is Euler-approximation based, the further you are from the initial conditions (say, by n time-steps), the error grows as 1/n (the error on the probability/proportion trajectory, that is). I've written a bit about it on the SIR posts, more specifically here and here. Rigor can be added by researching Euler's method proper. :) Let me know if I can be of help! Markov chains are a topic that currently greatly interest me!

5. August 17th, 2010 at 21:52 | #5

I knew that you didn't get exactly the same (expected) result between the two representations, and approximation error could easily be the difference.

If I were to ask for help, it would be in calculating the variance of a Markov model. For instance, what is the variability of the results in the SIR model (on the model above). A nagging voice in my head tells me this ought to be analogous to calculating the variance of the geometric distribution, and therefore something I could do if I understood more about Markov chains. On the other hand, the variance at a given time will be dependent on the previous time, leading to a distribution of all possible outcomes at time t, and that seems like it might be a much harder problem.

Now that I've written all this, I'm beginning to see I may be misunderstanding what you have really done in the Markovization above, which may render my question meaningless. I should study! :-)

If this is of any interest to you, here is a link to a fairly recent paper on Lanchester attrition models, which also gives a Markov version of the model:
http://dspace.dsto.defence.gov.au/dspace/bitstream/1947/4233/1/DSTO-TR-1822%20PR.pdf
I also find this paper much more readable than most of the other literature on the topic.

6. August 18th, 2010 at 00:10 | #6

Hi! The link you provided is actually very interesting: I've been looking at the stochastic modeling part. The approach there and my approach are completely different. For example, I consider only three states or statuses: Alive (army X), Alive (army Y), and Dead. The link you provided is (vastly) more complicated in that it considers $N^2$ (countable because the Cartesian product of countable sets is countable) states, these being the actual outcome of the battle/intermediate and start states: say reds wins over greens with 5 units versus 0, this outcome would be in $N^2$ as $(5,0)$. Attached to each coordinate is a probability of transitioning to that state (and adjacent, "downward" states). Statements (18) and (19) are meant to remind us that the Markov process is a time process, that there is a probability attached to each state, and that we can reach any final state in $q$ steps by going through an intermediate state in $t$ steps and then jumping to the final state in $s$ steps provided $t + s = q$ (Chapman-Kolmogorov equation), and that we can take tiny time steps to make the process continuous (in time). Lastly, obviously we can't have an increasing path as we go from coordinate to coordinate because this means producing alive people from dead (i.e., the path is necessarily decreasing and absorbing into the x-coordinate or the y-coordinate or the origin). It later describes how to build the transition matrix by giving an algorithm on how to attach a probability to each coordinate state (20). Finally, the stochastic matrix can probably be calculated at its steady state by finding eigenvalues and using spectral decomposition-type approaches, I presume.

My approach: simpler! I don't consider the probability that $(5,0)$ is an outcome. Rather, I consider the probability of being alive (in army X or army Y) at the next time-step. This is easy because at the next time step we can calculate exactly how many people have died, from the actual differential equations (or approximate such). However there's a problem, in that the matrix changes with time, which is an approach we don't usually take in the Markov treatment (the transition matrix is usually fixed in time, so that powers of such a matrix, another matrix, tells us the distribution of a start vector after a time step equal to the power of the matrix). In my approach, we essentially have to drag the initial vector across a changing matrix, to obtain the trajectory of a start vector. So perhaps it is not a Markovian matrix after all (since we can only take powers of such if the time component drops out), but a discretization of the differential equations in a convenient (matrix) form, which I reasoned out thinking Markovianly.

As to the expected value and variance of the Markov model in the pdf you provided, I'm not sure what you mean by it being a geometric distribution (but I did catch the reference to a geometric distribution when it talks about fractal models). At worst, you can probably simulate the Markov model and calculate the expectation and variance, like this: pick a suitable bijection of the (allowable) coordinates to the positive integers (your new, y'-axis, there may be a question about how to order them, perhaps, although my hunch is use a diagonal method?), and remember that there will be several absorbing states, namely $m + n + 1$, where m reds and n greens are the start coordinates. Pick a time step (1-unit), and let unit increases be your new x'-axis (we want to visualize the actual Markov process). Next go to your start coordinate. Run the model and see how it ends (in other words, get a realization of the process, it ends when it hits an absorbing state). Calculate the expectation and variance (?) of the realization. Repeat many times and review point-estimators to figure out how to make sense of this data.

I don't recall actually seeing a treatment of expectation and variance of a Markov process, although I agree I should know it/investigate it further; the preoccupation is usually more about steady-state probabilities (in regular matrices, the higher-power matrices stabilize, which means that an initial vector will end up as a steady-vector at infinite steps of time). Thus, since any vector will eventually result in the steady-state (for regular matrices anyway), the expectation of the process (ensemble), as we take more and more (infinite) unit time steps, should be the inifinite-step probability steady-state vector dot the state vector (in other words, weigh the states by their respective steady probabilities). This however, is different from what I think you mean; from this viewpoint, you are getting an expected state, the intermediate state of all paths, which may not be an end-state (an outcome of the process, or, who wins on average). The variance of the process can probably be as great or as small depending on your state space and on whether states communicate with each other, so it may be realization-dependent. (Hence why we don't usually talk about variance in this context).

Does this help??

I think the pdf is really interesting, I'll write a post with a simulation, it may make things clearer.

7. August 18th, 2010 at 07:10 | #7

That helps very much, and there is more here for me after I have a chance to process all you wrote. Thank you for taking the time to respond in such detail.
You answered what I most needed to know - that to get at what I really want will require a simulation. I feel a bit better that I knowing I wasn't missing out on some simple and obvious solution. Fortunately simulation is something I understand quite well, and I agree with your formulation.

I have some simple simulations set up in a spreadsheet, intended for a post on my blog, but I want to set up something a little better in R. The spreadsheet needs some more work before it's ready to share, but I'll show it to you when it's done.

Again, many thanks for this discussion. :-)
---Dan