Archive

Archive for October, 2010

On Patchix by Patchix Products – Tying Up Loose Ends - (RWLA,MCT,GT,AM Part V)

October 17th, 2010 No comments

In this post I want to "tie up a few lose ends."  For example, in my last post I stated that the patchix pattern

 \begin{array}{ccc} p_1(x,y) & = & 1 - cos(2 \pi x) cos(2 \pi y) \\ p_2(x,y) & = & 1 + \frac{cos(2 \pi x) cos(2 \pi y)}{2} \\ p_3(x,y) & = & 1 - \frac{cos(2 \pi x) cos(2 \pi y)}{4} \\ p_2(x,y) & = & 1 + \frac{cos(2 \pi x) cos(2 \pi y)}{8} \\ \vdots \\ p_t(x,y) & = & 1 - \frac{cos(2 \pi x) cos(2 \pi y)}{(-2)^{t-1}} \end{array}

for  t \in \mathbb{Z^+} , but I didn't prove it.  It's simple to do by induction: by the inductive hypothesis,

 p_1(x,y) = 1 - cos(2 \pi x) cos(2 \pi y) = 1 - \frac{cos(2 \pi x) cos(2 \pi t)}{(-2)^{1-1}}

By the inductive step, assume

 p_k(x,y) = 1 - \frac{cos(2 \pi x) cos(2 \pi y)}{(-2)^{k-1}}

Then,

 \begin{array}{ccc} p_{k+1}(x,t) & = & \int_0^1 p_1(1-y,t) \cdot p_k(x,y) dy \\ & = & \int_0^1 \left( 1 - cos(2 \pi (1-y))cos(2 \pi t) \right) \cdot \left( 1 - \frac{cos(2 \pi x) cos(2 \pi y)}{(-2)^{k-1}} \right) dy \end{array}

Now, if one dislikes shortcuts one can expand the product and integrate term by term to one's heart's content.  The "shorter" version is to relate the story: notice the product of 1 with itself is 1, and such will integrate to 1 in the unit interval.  So we save it.  The integrals  \int_0^1 cos(2 \pi y) dy and  \int_0^1 cos(2 \pi - 2\pi y) dy both evaluate to zero, so we are left only with the task of evaluating the crossterm:

 \begin{array}{ccc} && \int_0^1 cos(2 \pi (1-y))cos(2 \pi t) \cdot \frac{cos(2 \pi x) cos(2 \pi y)}{(-2)^{k-1}} dy \\ & = & \frac{cos(2 \pi t) cos (2 \pi x)}{(-2)^{k-1}} \int_0^1 cos(2 \pi - 2 \pi y) cos(2 \pi y) dy \\ & = & \frac{cos(2 \pi t) cos (2 \pi x)}{(-2)^{k-1}} \int_0^1 cos^2(2 \pi y) dy \\ & = & \frac{cos(2 \pi t) cos (2 \pi x)}{(-2)^{k-1}} \cdot \frac{1}{2} \\ & = & -\frac{cos(2 \pi t) cos (2 \pi x)}{(-2)^{k}} \end{array}

Let's not forget the 1 we had saved, so:

 p_{k+1}(x,t) = 1 - \frac{cos(2 \pi x) cos(2 \pi t)}{(-2)^{k}} \rightsquigarrow 1 - \frac{cos(2 \pi x) cos(2 \pi y)}{(-2)^{k}} = p_{k+1}(x,y)

as we wanted to show.

So finally notice that, of course, if we take the limit as  t approaches infinity, the patch evolution tendency is to become 1, the uniform distribution:

 \lim_{t \rightarrow \infty} p_t(x,y) = 1 = u(x,y)

From here on out, I want to set up the operative framework of patchixes, in analogy with discrete matrices.  I want to show that in general, patchix products are non-commutative.  This is easily done by counterexample:

We want to show that  p(x,y) \star q(x,y) \neq q(x,y) \star p(x,y) . So suppose the patchixes  p(x,y) = x and  q(x,y) = y . Then

 p(x,y) \star q(x,y) = \int_0^1 p(1-y,t) \cdot q(x,y) dy = \int_0^1 (1-y) y dy = \int_0^1 y - y^2 dy = \frac{1}{6}

and

 q(x,y) \star p(x,y) = \int_0^1 q(1-y,t) \cdot p(x,y) dy = \int_0^1 (t \cdot x) dy = t \cdot x \rightsquigarrow x \cdot y

are clearly not-equal.  It would be great to say that, because patchixes are non-commutative, patches are too, but we don't know that patches as a whole subset of patchixes commute, so let's disprove it.  Now suppose the patches  p(x,y) = x + \frac{1}{2} and  q(x,y) = 1 + xy - \frac{y}{2} .  Then

 \begin{array}{ccc} p(x,y) \star q(x,y) & = & \int_0^1 p(1-y,t) \cdot q(x,y) dy \\ & = & \int_0^1 \left( \frac{3}{2} - y \right) \cdot \left( 1 + xy - \frac{y}{2} \right) dy \\ & = & \frac{5x}{12} + \frac{19}{24} \end{array}

where

 \begin{array}{ccc} q(x,y) \star p(x,y) & = & \int_0^1 q(1-y,t) \cdot p(x,y) dy \\ & = & \int_0^1 q(1-y,t) \cdot p(x) dy \\ & = & p(x) \int_0^1 q(1-y,t) dy \\ & = & p(x) \cdot u(t) = p(x) \\ & = & x + \frac{1}{2} \end{array}

By refraining from calculating this last bit explicitly, we have (serendipitously) proved that any patch by a patch that is solely a function of  x returns the last patch, a result which reminds us of the analogous distribution by patch result I have shown in my previous post (a distribution on [0,1] times a patch that is solely a function of  x returns the patch, that viewed from the point of view of functions is a distribution on [0,1]).  A quick note: the integral  \int_0^1 q(1-y,t) dy is the unit distribution because  \int_0^1 q(x,y) dx = u(y) and  x \rightsquigarrow (1-y) and  dx \rightsquigarrow -dy .

The end result of these observations is that patches are also, in general, non-commutative.

Next, I want to show that patchixes in general are associative.  This is a bit tricky because of the "after integral" transformations we have to do, but it is doable if we keep careful track of our accounting.  We want to show that  [p(x,y) \star q(x,y)] \star r(x,y) = p(x,y) \star [q(x,y) \star r(x,y)] .  Let's begin with the left hand side.

 \begin{array}{ccc} [p(x,y) \star q(x,y)] \star r(x,y) & \rightsquigarrow & [p(x,w) \star q(x,w)] \star r(x,y) \\ & = & \left( \int_0^1 p(1-w, y) \cdot q(x, w) dw \right) \star r(x, y) \\ & = & \int_0^1 \left( \int_0^1 p(1-w, t) \cdot q(1-y, w) dw \right) \cdot r(x, y) dy \\ & = & \int_0^1 \int_0^1 p(1-w, t) \cdot q(1-y, w) \cdot r(x, y) dw dy \\ & = & s(x,t) \rightsquigarrow s(x,y) \end{array}

Now the right hand side

 \begin{array}{ccc} p(x,y) \star [q(x,y) \star r(x,y)] & \rightsquigarrow & p(x,w) \star \left( \int_0^1 q(1-y, w) \cdot r(x,y) dy \right) \\ & = & \int_0^1 p(1-w, t) \cdot \left( \int_0^1 q(1-y, w) \cdot r(x,y) dy \right ) dw \\ & = & \int_0^1 \int_0^1 p(1-w,t) \cdot q(1-y, w) \cdot r(x,y) dy dw \\ & = & s(x,t) \rightsquigarrow s(x,y) \end{array}

The two sides are equal when we can apply the Fubini theorem to exchange the order of integration.

Of course, patches, being a subset of patchixes, inherit associativity.

Defining a patchix left and right identity is extremely difficult, in the sense that, if we take a hint from discrete matrices, we'd be looking at a very special function on the  xy plane, so that  i(1-y,y) = i(x,1-x) = 1 and  0 everywhere else.  Because there is no "pretty" way to define this as a function of  x and  y both, showing that when we multiply a patchix by this function on either the right or the left requires elaborate explication. Unless we take it as axiomatic high ground, postulating the existence of an identity function  i(x,y) so that  i(x,y) \star p(x,y) = p(x,y) = p(x,y) \star i(x,y) to make the framework work, there is no easy way out.  Let's give it a shot then.

Left identity:

 i(x,y) \star p(x,y) = \int_0^1 i(1-y,t) \cdot p(x,y) dy

Now  i(1-y,t) = 1 only for values where  t = y , as we've defined it, otherwise the integral is zero and there is nothing to solve.  So then we've got

 \int_0^1 i(1-t,t) \cdot p(x,t) dy = \int_0^1 (1) \cdot p(x,t) dy = p(x,t) \rightsquigarrow p(x,y)

which is essentially the argument I make for the zero patch power in my informal paper on continuous Markov transition matrices or patches (however, there's a problem with this definition on patches, more of this below).  There's the question of why we didn't force the change of  dy \rightsquigarrow dt , and this is because the only way to obtain a function of both  x and  t is to force the patchix to the  x t plane and let the integral be taken in the  x y plane.  If this argument is unsatisfactory, consider this one:  at  t = 0 = y the patchix takes the values  p(x, 0) which is a function of  x alone.  Thus,

 \int_0^1 i(1,0) \cdot p(x,0) dy = p(x,0) \int_0^1 (1) dy = p(x,0)

if we do this for all  t \in [0,1] , we are certainly left with  p(x,t) .  We may raise the objection that, if we create a mental picture of the situation, at  t = 0 ,  i takes a value of 1 only at  y = 0 , so that, on the  x y plane, all values of  p(x, y) are zeroed except those at  y = 0 .  Thinking about it this way creates the difficulty of the integral with respect to  y : it evaluates to zero (there is no "area" in the  x y plane anymore, only a filament or fiber at  y=0 ), and we would be left with the zero patchix.  There is no way to resolve this except two ways: to send the patchix  p(x,y) to  p(x,t) before we take the integral in the  x,y plane, and then toss the integral out the window (or take it on the uniform distribution), or, to think of the filament  p(x,0) = p_0(x) as  p_0(x) \times [0,1] = p_0(x,y) and then integrate in the  x y plane to obtain  p_0(x) \rightsquigarrow p(x,0) and do this for all  t to get  p(x,t) .  Hence yes, the difficulty of defining the identity function on "surface" matrices (because it is not smooth like they are and because it is defined piece-wise).

Right identity:

 p(x,y) \star i(x,y) = \int_0^1 p(1-y,t) \cdot i(x,y) dy

Here we remind ourselves that  i(x,1-x) = 1 and zero otherwise, so that we can make the substitution

 \int_0^1 p(x,t) \cdot i(x,1-x) dy = \int_0^1 p(x,t) \cdot (1) dy = p(x,t) \rightsquigarrow p(x,y)

We of course have issues: it may seem redundant to send  x \rightsquigarrow 1-y \rightsquigarrow x , sending  x back to itself, but again this is the only way to remain consistent and get back the original function.  Again there's an issue of why we didn't send the integral  dy \rightsquigarrow -dx , but this has to remain in the  x y plane for the mechanics to work.  Other objections are likewise not easily resolved; but the argument would work out algebraically if we concede on a few things: otherwise we cannot but shrug at the fact that it is, indeed, a little bit of hocus pocus, and we return to our suggestion to postulate the identity function as an axiom. Perhaps maybe these issues can be resolved or elucidated a little later, I don't lose hope.

Defining inverse patchixes will also present a great difficulty, particularly because they have to produce the identity function when we "patchix multiply" two mutually inverse patchixes  together.  I was thinking that we could perhaps determine whether a particular patchix has one, by extending Sarrus's rule (for determinants) to be continuous, which would involve, I'm sure, multiple integrations.  This will be a topic of further investigation for me. The cool thing is, if we can elucidate how this "continuous version" of the determinant works, many different results from Linear Algebra could follow.  I am also trying to figure out how two inverse patchixes would look like, and if I can produce an example (at all), virtually from thin air.  If I can, then perhaps we're on our way to constructing patchix groups of all flavors.

Unfortunately, patches can't inherit the identity as we've defined it: the integral with respect to  x of  i(x,y) is zero for all  y .  Thus  i(x,y) is not a patch.

This problem makes us want to think of the uniform distribution  u(x,y) as another possible candidate for the identity for patchixes all, and it might just work if we agree that, when we don't have a function of  t or of  x after doing the setup-transformations for the integral, we send whatever function remains there before taking the integral.

Left identity:

 u(x,y) \star p(x,y) = \int_0^1 u(1-y,t) \cdot p(x,y) dy \rightsquigarrow \int_0^1 (1) \cdot p(x,t) dy = p(x,t) \rightsquigarrow p(x,y)

Right identity:

 p(x,y) \star u(x,y) = \int_0^1 p(1-y,t) \cdot u(x,y) dy \rightsquigarrow \int_0^1 p(x,t) \cdot (1) dy = p(x,t) \rightsquigarrow p(x,y)

This has several happy consequences: we avoid dealing with a piece-wise defined function  i(x,y) which is zero everywhere except on  y = 1-x , the uniform distribution is smooth, we can now more easily define inverses (by finding multiplicative inverse functions, more on this below), and, specifically regarding patches,  \int_0^1 u(x,y) dx = u(y) = 1 so the uniform distribution is indeed a patch.

In my mental picture, the "patchix product" of the uniform distribution with a patchix (and vice versa) doesn't "add up" (pun intended), but the algebraic trickery would seem to be the same even when using the alternative  i(x,y) .  So.  At this point I sort of have to convince myself into accepting this for now.

On Patch by Patch Products, Part II - (RWLA,MCT,GT,AM Part IV)

October 15th, 2010 3 comments

Last time I talked about a concept I invented, and based on my studies on Markov chains.  They are, essentially, "continuous matrices" (a surface on  [0,1] \times [0,1] ) with the property that they add to 1 if we take the integral with respect to  x for any  y , in analogy to the requirement in the usual Markov matrix treatment.  I dubbed such "patches," and explained a way to construct them.  In my previous post, I began thinking that patches seem to be very special, in the sense that self patch powers can represent the state of a liquid in time, if we allow ourselves to be a little imaginative.  Let's say that we disturb a uniform distributed patch to an initial state, the initial state patch, like this:

 p(x,y) = 1 - cos(2 \pi x) cos(2 \pi y)

It is easy to see that if we integrate with respect to  x our result is 1, so that it is indeed a patch. (I also constructed this function by letting  f_1(x) = cos(2 \pi x), g_1(y) = cos(2 \pi y), f_2(x) = 1 and calculated that  g_2(y) = 1 using the technique I talked about here).

Let's say we have depressed the liquid at the four corners and center of the confined space (necessarily a cube of dimensions  1 \times 1 \times h ), essentially giving it energy. Next calculate the patch powers (as described in my previous posts).  Interestingly, if we map the patch powers of such liquid, they will converge to a steady state, just like Markov matrixes would:

 p(x,y) = 1 - cos(2 \pi x) cos(2 \pi y)

 p_2(x,y) = 1 + \frac{cos(2 \pi x) cos(2 \pi y)}{2}

 p_3(x,y) = 1 - \frac{cos(2 \pi x) cos(2 \pi y)}{4}

 p_4(x,y) = 1 + \frac{cos(2 \pi x) cos(2 \pi y)}{8}

 p_5(x,y) = 1 - \frac{cos(2 \pi x) cos(2 \pi y)}{16}

 p_6(x,y) = 1 + \frac{cos(2 \pi x) cos(2 \pi y)}{32}

The evolution in time of this particular patch is easy to guess (although I should, technically, prove this by induction... I do in my next post):

 p_t(x,y) = 1 - \frac{cos(2 \pi x) cos(2 \pi y)}{(-2)^{t-1}}

for  t \in \mathbb{Z^+} , and letting this parameter represent both time and the patch power.  Of course, if we integrate with respect to  x any of these, the result is 1, and so, they are, indeed, patches.

I would like to state in a different post the conditions in which a steady-state is achievable; my suspicion is that, in analogy to Markov chains, steady-state happens if the patch is non-zero for some power (and above) on  [0,1] \times [0,1] , a property that is called regularity within that context, and of course, I would like to be able to calculate the steady state as easily as it can be done with discrete Markov chains (I was afraid that, in this particular example, I wouldn't be able to achieve steady state because of the initial patch having zeroes at the corners and center).  It's pinned as one of my to-dos.  At any rate, the fact that there are patches that converge to a state (a 2D surface), and, specifically, that can converge to the uniform distribution surface, suggests that such systems, from the viewpoint of Physics, must dissipate energy and there is linked the concept of entropy.  Of course from a probabilistic point of view, entropy in this sense is non-existent; patches merely describe the probability of movement to another "position" on each  y fiber.

There are of course patches that do not converge to the uniform surface distribution, but to other types: in my previous post, the patch I constructed converged to a plane that is tilted in the unit cube.  I wonder if such cannot have a physical interpretation that relates it to gravity: the liquid experiences a uniform acceleration (gravity) normal to the (converging) plane, which of course says, from a physical interpretation, "the cube is tilted."  Again from the probabilistic point of view, the concept of gravity is an explanatory link to Physics, but the end-state arises without its action on the fluid at all!

There are fun topological considerations too: the fact that we can do this on the unit cube does not preclude us from doing it in, for example, a unit cylinder (a cup or mug!), provided we can find the appropriate retract-into-the-square function and vice-versa.  This I think might be very interesting to map movement in all kinds of containers.

I have already talked about a couple potential venues in Group Theory, which I really would also like to go into further at some point.

As in other posts, another possible area of investigation is the evolution of the surface in smaller bits of time. I was able to link, in previous Markov treatments, discrete representations of Markov chains to continuous time differential equations.  Here is where it would be immensely interesting to see if patches, under this light, do not converge to partial differential equation representations.  Which leads me to the last point, regarding Navier-Stokes turbulent flow (which I admit know very little about), and a potential link to its differential equation representation:

Here is why I think that turbulent flow can be explained by generalizing patches a little bit, to "megapatches" (essentially 3d-patches or tensors), since, now we can think not of a 2D surface converging, but a 3D one in time: a water sphere in space (I once saw a cool video on this and was left thoroughly fascinated) or a water balloon being poked could be understood this way, for example, so that the movement of water throughout the flexible container could be similarly traced (by mapping the probability of movement in the container)!  I need to flesh this out a little more, but I think it's also very interesting, potentially.

These studies make me ask myself, again: what is the relationship between stochastic processes and deterministic representations?  They seem to be too intimately linked to be considered separate.

On Patch by Patch Products - (RWLA,MCT,GT,AM Part III)

October 12th, 2010 No comments

In my previous post, I described the concept of a "patchix" and of a special kind, the "patch." I described how to multiply a continuous function on  [0,1] by a patch(ix). Today I want to talk about how to multiply a patch by a patch and certain properties of it.

In my description in my informal paper, I basically said that in order to (right) multiply a patchix by a patchix, say  p(x,y) with  q(x,y) , we would have to send  p(x,y) \rightsquigarrow p(1-y,t) and then integrate as:

 r(x,t) = \int_0^1 p(1-y,t) \cdot q(x,y) dy \rightsquigarrow r(x,y)

If the patchix is furthermore special, so that both  \int_0^1 q(x,y) dx = \int_0^1 p(x,y) dx = u(y) = 1 , the uniform distribution on [0,1] (so that  u of any fiber is 1), then  p(x,y) and  q(x,y) are "patches." I want to show that, when we "patchix multiply" two patches we obtain another one. Here's why: assume then  p(x,y) and  q(x,y) are "patches." Then the resultant  r(x,t) is a patch too if  \int_0^1 r(x,t) dx = u(t) = 1 . Thus:

 \int_0^1 r(x,t) dx = \int_0^1 \int_0^1 p(1-y,t) \cdot q(x,y) dy dx

If there is no issue with absolute convergence of the integrals (as there shouldn't be in a patch), by the Fubini theorem we can exchange the order of integration:

 \int_0^1 \int_0^1 p(1-y,t) \cdot q(x,y) dx dy = \int_0^1 p(1-y,t) \int_0^1 q(x,y) dx dy

The inner integral evaluates to  u(y) = 1 by hypothesis. Then we have  \int_0^1 p(1-y,t) dy = u(t) = 1 because the transformation  x \rightsquigarrow 1-y changes the orientation so that the direction of integrating to 1 now changes to be in the  y direction ( dx \rightsquigarrow -dy ). Nicely, we have just proven closure of patches.

Because this strongly suggests that patches may form a group (as may patchixes with other properties), I want to attempt to show associativity, identity and inverses of patches in my next post (and of other patchixes with particular properties).

For now, I'm a little more interested in solving a concrete example by calculating self-powers. In my last post, I constructed the following patch:

 p(x,y) = x^2 y^3 + x \left( 2 - \frac{2 y^3}{3} \right)

To calculate the second power, send  p(x,y) \rightsquigarrow p(1-y,t) . I get, in expanded form, from my calculator:

 p(1-y,t) = t^3 y^2 - \frac{4 t^3 y}{3} + \frac{t^3}{3} - 2y + 2

so that

 p_2(x,t) = \int_0^1 p(1-y,t) \cdot p(x,y) dy = \frac{29 x}{15} + \frac{t^3 x}{90} + \frac{x^2}{10} - \frac{t^3 x^2}{60}

Last, let's send  p_2(x,t) \rightsquigarrow p_2(x,y) = \frac{29 x}{15} + \frac{y^3 x}{90} + \frac{x^2}{10} - \frac{y^3 x^2}{60}

We can corroborate that this is a patch by integrating

 \int_0^1 p_2(x,y) dx = \int_0^1 \frac{29 x}{15} + \frac{y^3 x}{90} + \frac{x^2}{10} - \frac{y^3 x^2}{60} dx = 1 which is indeed the case.

To calculate the third power, we can:

 p_3(x,t) = \int_0^1 p(1-y,t) \cdot p_2(x,y) dy = \frac{1741 x}{900} - \frac{t^3 x}{5400} + \frac{59 x^2}{600} + \frac{t^3 x^2}{3600}

Then, send  p_3(x,t) \rightsquigarrow p_3(x,y) = \frac{1741 x}{900} - \frac{y^3 x}{5400} + \frac{59 x^2}{600} + \frac{y^3 x^2}{3600}

Again, we can corroborate that this is a patch by

 \int_0^1 p_2(x,y) dx = \int_0^1 \frac{1741 x}{900} - \frac{y^3 x}{5400} + \frac{59 x^2}{600} + \frac{y^3 x^2}{3600} dx = 1

which is the case.

Here's a countour 3D-plot of  p(x,y), p_2(x,y) \ldots p_7(x,y) : in other words, the 7-step time evolution of the patch (the "brane").  By looking at the plot, you can probably begin to tell where I'm trying to get at:  the patch evolution shows how a fluid could evolve in time (its movement, oscillation), provided the appropriate first-patch generator can be found for a particular movement.  The fact that, if patches mirror Markov "thinking", a patch that will eventually settle to its long-term stable distribution means that this (patch) treatment, when applied to the physical world, takes into account some sort of "entropy," or loss of energy of the system.  Also some sort of "viscosity," is my belief.  The patch evolution catches nicely and inherently all relevant physical properties.  I will continue to explore this in my next post, I think.

The above image has been scaled differently for the different functions so that they can be better seen as they converge.  In my next post, I would like to expound on the evolution of the following first-patch:

On Patchixes and Patches - or Pasqualian Matrixes - (RWLA,MCT,GT,AM Part II)

October 10th, 2010 3 comments

For the last few months I have been thinking about several very curious properties of patchixes and patches (mentioned here); in particular, having studied patch behavior in a "continuous Markov chain" context, and, at having been drinking a bowl of cereal and  observing the interesting movement of the remaining milk, it hit me: a patch could certainly describe milk movement at particular time steps.  It is my hope to try to elucidate this concept a little better here today.  In particular, I think I have discovered a new way to describe waves and oscillations, or rather, "cumulative movement where the amount of liquid is constant" in general, but, in my honest belief, I think this new way and the old way converge in limit (this based on my studies, here and here, or discrete Markov chains at the limit of tiny time steps, so that time is continuous), although it is a little bit unclear to me how at the moment.  It is my hope that this new way not only paves the way for a new and rich field of research, but I foresee it clarifying studies in, for example, turbulence, and, maybe one day, Navier-Stokes related concepts.  This last part may sound a little lofty and ambitious, but an approach in which, for example, vector fields of force or velocity need to be described for every particle and position of space, with overcomplicated second and third order partial derivatives, is in itself somewhat ambitious and lofty, and often prohibitive for finding exact solutions;  perhaps studying particle accumulations through a method of approximation, rather than individual particles, is the answer.

I want to attempt to describe the roadmap that led me to the concept of a patchix (pasqualian matrix) in the first place; it was in the context of discrete Markov chains.  Specifically, I thought that, as we study linear algebra, for a function or transformation  T(\textbf{v}) , with  \textbf{v} is an n-vector with  n entries (finite), we have  T can be described succinctly by an  n \times n matrix.  Such a matrix then, converts  \textbf{v} into another n-dimensional vector, say  \textbf{w} .  This field is very well studied of course: in particular, invertible transformations are very useful, and many matrixes can be used to describe symmetries, so that they underlie Group Theory:

 \textbf{v} \underrightarrow{T} \textbf{w}

Another useful transformation concept resides in  l_2 , the space of sequences whose lengths squared (dot product with itself) converge, that was used, for example by Heisenberg, in quantum mechanics, as I understand it.  For example, the sequence  x_1 + x_2 + \ldots can be transported to another  y_1 + y_2 + \ldots via  T , as by  T(x_1 + x_2 + \ldots) = y_1 + y_2 + \ldots .  Key here then was the fact that  x_1^2 + x_2^2 + \ldots converged, so that  \sqrt{x_1^2 + x_2^2 + \ldots} , the norm, is defined.  Also the dot product  x_1 y_1 + x_2 y_2 + \ldots converges (why?).  Importantly, however, this information points in the direction that a transformation matrix could be created for  T to facilitate computation, with an infinite number of entries, so that indeed a sequence is taken into another in this space in a manner that is easy and convenient.  I think this concept was used by Kolmogorov in extending Markov matrices as well, but I freely admit I am not very versed in mathematical history.  Help in this regard is muchly appreciated.

In function space such as  C^{\infty}[0,1] , the inner product of, say, f(x) with g(x) is also defined, as  \langle f(x), g(x) \rangle = \int_0^{1} f(x) \cdot g(x) dx , point-wise continuous multiplications of the functions summed absolutely convergently (which results from the integral).  Then the norm of  f(x) is  \sqrt{\langle f(x), f(x) \rangle} = \sqrt{\int_0^{1} f(x)^2 dx} .  The problem is of course no convenient "continuous matrix" that results in the transform  T(f(x)) = g(x) , although transforms of a kind can be achieved through a discrete matrix, if its coefficients represent, say, the coefficients of a (finite) polynomial.  Thus, we can transform polynomials into other polynomials, but this is limiting in scope in many ways.

The idea is that we transform a function to another by point-wise reassignment: continuously.  Thus the concept of a patchix (pasqualian matrix) emerges, we need only mimic the mechanical motions we go through when conveniently calculating any other matrix product.  Take a function  f(x) defined continuously on  [0,1] , send  x \rightsquigarrow 1-y so that  f(1-y) is now aligned with the y-axis. From the another viewpoint, consider  f(1-y) as  f(1-y,t) so that, at any value of  t , the cross-section looks like  f .  Define a patchix  p(x,y) on  [0,1] \times [0,1] .  Now "multiply" the function (actually a patchix itself from the different viewpoint) with the patchix as  \int_{0}^{1} f(1-y) \cdot p(x,y) dy = g(x) to obtain  g(x) .  The patchix has transformed  f(x) \rightsquigarrow g(x) as we wanted.  I think there are profound implications from this simple observation; one may now consider, for example, inverse patchixes (or how to get  g(x) back to  f(x) , identity patchixes, and along with these one must consider what it may mean, as crazy as it sounds, to solve an infinite (dense) system of equations; powers of patchixes and what they represent; eigenpatchixvalues and eigenfunctionvectors; group theoretical concepts such as symmetry groups the patchixes may give rise to, etc.

As much as that is extremely interesting to me, and I plan on continuing with my own investigations, my previous post and informal paper considered the implications of multiplying functions by functions, functions by patchixes, and patchixes by patchixes.  Actually I considered special kinds of patchixes  p(x,y) , those having the property that for any specific value  y_c \in [0,1] , then  \int_0^1 p(x,y_c) dx = 1 .  Such special patchixes I dubbed patches (pasqualian special matrixes), and I went on to attempt an extension of a Markov matrix and its concept into a Continuous Markov Patch, along with the logical extension of the Chapman-Kolmogorov equation by first defining patch (discrete) powers (this basically means "patchix multiplying" a patch with itself).  The post can be found here.

So today what I want to do is continue the characterization of patches that I started.  First of all, emulating some properties of the Markov treatment, I want to show how we can multiply a probability distribution (function) "vector" by a patch to obtain another probability distribution function vector. Now this probability distribution is special, in the sense that it doesn't live in all of  \mathbb{R} but in  [0,1] .  A beta distribution, such as  B(2,2) = 6(x)(1-x) , is the type that I'm specifically thinking about. So suppose we have a function  b(x) , which we must convert first to  b(1-y) in preparation to multiply by the patch.  Suppose then the patch is  p(x,y) with the property that, for any specific  y_c , then  \int_0^1 p(x,y_c) dx = 1 .  Now, the "patchix multiplication" is done by

 \int_0^1 b(1-y) \cdot p(x,y) dy

and is a function of  x .  We can show that this is indeed a probability distribution function vector by taking the integral for every infinitesimal change in  x , and see if it adds up to one, like this:

 \int_0^1 \int_0^1 b(1-y) \cdot p(x,y) dy dx

If there is no issue with absolute convergence of the integrals, there is no issue with the order of integration by the Fubini theorem, so we have:

 \int_0^1 \int_0^1 b(1-y) \cdot p(x,y) dx dy = \int_0^1 b(1-y) \int_0^1 p(x,y) dx dy

Now for the inner integral,  p(x,y) adds up to 1 for any choice of  y , so the whole artifact it is in effect a uniform distribution in  [0,1] with value 1 (i.e., for any choice of  y \in [0,1] , the value of the integral is 1).  Thus we have, in effect,

 \int_0^1 b(1-y) \int_0^1 p(x,y) dx dy = \int_0^1 b(1-y) \cdot u(y) dy = \int_0^1 b(1-y) (1) dy

for any choice of  y in  [0,1] , and that last part we know is 1 by hypothesis.

Here's a specific example:  Let's declare  b(x) = 6(x)(1-x) and  p(x,y) = x + \frac{1}{2} .  Of course, as required,  \int_0^1 p(x,y) dx = \int_0^1 x + \frac{1}{2} dx = (\frac{x^2}{2} + \frac{x}{2}) \vert^1_0 = 1 .  So then  b(1-y) = 6(1-y)(y) , and by "patchix multiplication"

 \int_0^1 b(1-y) \cdot p(x,y) dy = \int_0^1 6(1-y)(y) \cdot \left(x + \frac{1}{2} \right) dy = x + \frac{1}{2}

Thus, via this particular patch, the function of  b(x) = 6(x)(1-x) \rightsquigarrow c(x) = x + \frac{1}{2} , point by point.  Which brings me to my next point.

If  p(x,y) is really solely a function of  x , then it follows that  b(x) \rightsquigarrow p(x) any initial probability distribution becomes the patch function distribution (from the viewpoint of a single dimension, than two).  Here's why:

 \int_0^1 b(1-y) \cdot p(x,y) dy = \int_0^1 b(1-y) \cdot p(x) dy = p(x) \int_0^1 b(1-y) dy = p(x)

I think, of course, a lot more interesting are patches that are in fact functions of both  x and of  y .  There arises a problem in constructing them.  For example, let's assume that we can split  p(x,y) = f(x) + g(y) .  Forcing our requirement that  \int_0^1 p(x,y) dx = 1 for any  y \in [0,1] means:

 \int_0^1 p(x,y) dx = \int_0^1 f(x) dx + g(y) \int_0^1 dx = \int_0^1 f(x) dx + g(y) = 1

which implies certainly that   g(y) = 1 - \int_0^1 f(x) dx is a constant since the integral is a constant.  Thus it follows that  p(x,y) = p(x) is a function of  x alone.  Then we may try  p(x,y) = f(x) \cdot g(y) .  Forcing our requirement again,

 \int_0^1 p(x,y) dx = \int_0^1 f(x) \cdot g(y) dx = g(y) \int_0^1 f(x) dx = 1

means that  g(y) = \frac{1}{\int_0^1 f(x) dx} , again, a constant, and  p(x,y) = p(x) once more.  Clearly the function interactions should be more complex, let's say something like:  p(x,y) = f_1(x) \cdot g_1(y) + f_2(x) \cdot g_2(y) .

 \int_0^1 p(x,y) dx = g_1(y) \int_0^1 f_1(x) dx + g_2(y) \int_0^1 f_2(x) dx = 1

so that, determining three of the functions determines the last one, say

 g_2(y) = \frac{1-g_1(y) \int_0^1 f_1(x) dx}{\int_0^1 f_2(x) dx} is in fact, a function of  y .

Let's construct a patch in this manner and see its effect on a  B(2,2) .  Let  f_1(x) = x^2 , and  g_1(y) = y^3 , and  f_2(x) = x , so that

 g_2(y) = \frac{1 - g_1(y) \int_0^1 f_1(x) dx}{\int_0^1 f_2(x) dx} = \frac{1 - y^3 \int_0^1 x^2 dx}{\int_0^1 x dx} = \frac{1 - \frac{y^3}{3}}{\frac{1}{2}} = 2 - \frac{2y^3}{3}

and  p(x,y) = x^2 y^3 + x \left(2 - \frac{2y^3}{3} \right) .

So now the "patchix product" is

 \int_0^1 6(1-y)(y) \cdot \left(x^2 y^3 + x \left(2 - \frac{2y^3}{3} \right) \right) dy = \frac{x^2}{5} + \frac{28x}{15} which is a probability distribution on the interval  [0,1] and, as a matter of check, we can integrate with respect to  x to obtain 1.  Thus the probability distribution function  6(x)(1-x) is carried, point by point, as  6(x)(1-x) \rightsquigarrow \frac{x^2}{5} + \frac{28x}{15} which, quite frankly, is very amusing to me!

From an analytical point of view, it may be interesting or useful to see what happens to the uniform distribution on  [0,1] when it's "patchix multiplied" by the patch above.  We would have:

 \int_0^1 u(y) \cdot \left(x^2 y^3 + x \left(2 - \frac{2y^3}{3} \right) \right) dy = \int_0^1 (1) \cdot \left(x^2 y^3 + x \left(2 - \frac{2y^3}{3} \right) \right) dy = \frac{x^2}{4} + \frac{11x}{12}

so that  u(x) \rightsquigarrow \frac{x^2}{4} + \frac{11x}{12} .

In my next post, I want to talk about more in detail about "patchix multiplication" of, not a probability distribution on [0,1] vectors by a patch, but of a patch by a patch, which is the basis of (self) patch powers: with this I want to begin a discussion on how we can map oscillations and movement in a different way, so that perhaps we can trace my cereal milk movement in time.

On a Battle Markov Chain Model

October 1st, 2010 4 comments

About a month ago I befriended Dan, from Giant Battling Robots, online, as he was doing investigations into Lanchester's differential equations to model the outcome of battles, particularly those of videogames.  He sent me this very interesting link that describes a Markov chain model for battles, since he saw that I was trying figure out a way to coalesce Markov reasoning with Lanchesterian DEs.  The pdf describes quite a different approach that I've been taking or investigating, but I thought it would be interesting nevertheless to attempt and illustrate it by example.  The subject of this post is therefore expository, based on Dan's link, but the example and explanation is entirely mine, and so any error in concept is my own.

Suppose BLUE and GREEN are feuding - in fact, battling to the death, each possessing the same or differing technologies for their engagements.  The particular technologies translate to a probability of kill of the other's units.  For example, GREEN may have bazooka technology that assures that, when launched, the probability of kill/disabling of its enemy unit (perhaps a tank) is 0.45, where BLUE's tanks may kill bazooka-launchers with probability 0.55.  These probabilities represent the fact that, if an encounter is to occur at a time step, in 55 percent of encounters the tanker (BLUE) is successful in killing or disabling GREEN's unit;  on the other hand, it is unsuccessful in 45 percent of encounters.

For a battlefield in a given area, depending on hiding places, like trees, or forests, or caves (terrain), the probability of a tank or a bazooka-launcher to encounter at that time step is an exponentially distributed random variable.  What this means is basically this:  a single BLUE and a single GREEN will find each other with a given (expected probability) at a given time-step, and the higher probabilities of encounter are less likely than the lesser ones for any time-step (I may be having a conceptual problem here, since, an exponentially distributed random variable has a domain from 0 to infinity, where of course the probability space is 0 to 1). So in other words, the probability of encounter of a GREEN with a BLUE, let's say, is 0.87 on expectation.  It's a fuzzy 0.87, because in the next time step it may be less or more, but it's 0.87 on average.

Now, since the probability of finding each other and the probability of winning an encounter (we assume 1-on-1) is independent, then the probability of killing the opposing unit is simply the probability of encounter times the probability of succeeding in the kill.  If BLUE has 4 units and GREEN 3, our initial state, the probability of transitioning to state (3,3) is 0.39: 0.87, the (expected) encounter probability at the time step, times 0.45, the success probability of GREEN.  Notice we allow for only ONE encounter (or NONE) at any time step: this makes sense if the time-steps are sufficiently small and resolution of the encounter, if there is one, is instantaneous.  The probability of transitioning to state (4,2) is 0.48: the expected probability of encounter, 0.87, times the probability of success by BLUE, 0.55.  Finally, the probability of staying in state (4,3) is 1-.39-.48 = 0.13.

Neglecting the natural variations in our expected encounter probability, we can now craft the following chart that describes the transition probabilities to each state succinctly.  Yellow cells are possible states.  Directly to the left of a state is the probability of BLUE being killed one unit, and to the bottom the probability of GREEN being killed one unit at the next time-step.  The diagonal probability is the probability of remaining in that state at the next time-step:

The battle ends whenever one of the coordinates of our state is 0: for example, (0,3), or (0,1), or (2,0)...  Clearly, state (0,0) is unreachable: there must be at least one unit that remains in the end.  Of course, in this model, who wins in the end does not only depend on the technology constants (probability of kill or success), but on numerical superiority: namely, initial conditions or the initial state coordinate.

It's difficult to order the states in any meaningful way if they are Cartesian coordinates like in this example.  I used a diagonal method to craft the transition probabilities in matrix form.  Orange states are absorbing:

As in the usual Markov chain treatment, the nth power of this matrix indicates the transition probabilities at the nth time step.  After many time steps, if the transition probability matrix possesses certain properties, like regularity, then this basically means that the difference between a matrix at the nth step and the (n+1)th is negligible: all entries are "stable."  At the same time, the initial vector state (in our case 0 everywhere except 1 at state (4,3)) morphs into (converges to) a stable steady-state vector (while going through several intermediate vectors).  This stable vector indicates the probability of each outcome after a very long time: it indicates the proportion of time we will see GREEN or BLUE win after considering every possible battle, even the very long ones; so it is indeed a true probability of win.  For example, the following chart

shows that, if we simulate battles with identical initial states and transition probabilities and up to 20 time-steps, after 20 time-steps, BLUE is more likely to have won the majority of simulations (by obliterating the opposition, and remaining with 1, 2, 3 or 4 units up in the end).  Notice the fix proportion of wins for BLUE and for GREEN.  Else think about it this way.  Let U represent the battle is undecided at that time step, and B that BLUE has won at that time-step.  B will continue to be the state of the battle after it has become B for the first time (the battle is now over, so B remains the state of the system).  G indicates likewise that GREEN has won at that time step, and an infinite string of Gs after the first G indicates that G will continue to be the state of the battle after GREEN has won:

Simulation 1:  UUBBBBBBBBBBB....

Simulation 2: UUUUUUUGGGGG.....

etc.

We will have strings of Us followed by either B or G but not both, infinitely.  If we count the proportion of Us at the first time step (1st position of the string) across all simulations, this corresponds to the chart's 1 proportions.  If we count the proportion of Gs and Bs at the 20th step across all simulations, this corresponds to the chart's 20th step proportions.  The higher the time-step will indicate the true probability of win after most battles are (long) over, and this probability is stable/convergent.  In my particular example, most battles are over by the time they reach the 20th time step, and BLUE has won most battles, no matter how short or long they took.

I have crafted an .xls file that might make things clearer.  The items that you may want to modify are in grey, and the chart is there too, in a different sheet.  For the (grey) state vector, put a 1 in the initial state and 0 everywhere else; otherwise make sure you enter proportions and that they add up to 1 in the end. I only did the chart with states up to (4,3), but you can probably extrapolate the idea for states that have more than 4 BLUE units and 3 GREEN ones.  You can probably use this chart for smaller dimension games without much effort, too.

As always, comments or questions are welcome!

Enjoy!

MarkovBattle