Archive

Posts Tagged ‘Mexico’

On the Swine Flu

April 27th, 2009

So I've been MIA for the last month-and-a-half because of several exciting things that have happened to me, and I apologize for the hiatus... but it may be prolonged for a while still.  However, in the meantime, since a lot of my Mexican colleagues and students have been asking me about the swine flu, in somewhat of a panic, I thought I would try to put some numbers into the well-known SIR equations and see what insights can be obtained regarding the dynamics and proportions of the purported epidemic.  I believe this kind of analysis is routine at the CDC in the US and at the Secretaría de Salud in Mexico, although I'm not too sure how mathematically equipped the latter institution is in my home country. In much of what follows, I have to do some hand-waving because accurate information is unavailable in the web-o-sphere, or because I am unable to access true statistics.  Much of what I know I got from today's CNN and NYTIMES reports.  Personally, I find such fuzzy math unacceptable except to get a notion of the magnitude of a problem, and so it would be a mistake to take the calculations as set fact or hard evidence.  With that caveat in mind, I don't derive the model (it's up to the reader to find one of the several excellent sites that explain it), but instead try to account for my assumptions and my results.

The SIR model (Susceptibles-Infecteds-Recovereds) uses coupled differential equations to analyze the progress of a disease in a closed population.  I focus on the population of Mexico City, assuming that the infected individuals counted nationally can mostly be found there.  Thus, for simplicity, I place all of the suspected 1600 currently-infecteds among Mexico City's 20 million people.  Recovered (that is, removed) individuals are those who have died (149) plus those who got better.  The coupled time-differential equations are: 

 \frac{dS}{dt} = -a S I

 \frac{dR}{dt} = b I

 \frac{dI}{dt} = a S I - b I

which basically says that the change in susceptible individuals is proportional to the amount infected and the amount currently susceptible, that the change in recovereds (removed from infection) is proportional to those who are already infected, and that the change in infecteds is the rate at which susceptibles get sick minus the rate at which infecteds get removed from the population.  The proportionality constants can be calculated with some cleverness (though not necessarily accuracy), like this:  

 a = \frac{-\frac{dS}{dt}}{S I}

and so we must guesstimate  \frac{dS}{dt} as well as  S and  I .  If there are currently 1600 infected individuals then, assuming linear growth, 400 have been infected per day since this became news four days ago.  So my guesstimate is that the current rate is  \frac{dS}{dt} = -400 individuals per day. The number of susceptibles is the population of Mexico City, so all 20 million, minus the infecteds, or about 19,998,400 (I'm thinking naturally immune individuals are so few that the population number doesn't change much).  Finally, the number of current infecteds (Mexico-wide, though I take them all to be in Mexico City) is 1600 according to the NYTIMES article I've been linking to.  This gives a value of  a \approx 1.25 \times 10^{-8} .  Calculating  b is a bit trickier.  The datum says that 149 people have died from the swine flu, but I don't think all people infected with the swine flu die.  I'm going to guesstimate that approximately 20% of the infecteds either die or recover (since apparently 10% of them die) in a day.  There's no reason for this except my hunch. So  b \approx 0.2 .
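For anyone who wants to redo the arithmetic, here is a minimal Python sketch of the guesstimate above. The variable names are mine, and every input is one of the rough figures quoted in the text, not a verified statistic.

```python
# Rough inputs quoted in the text (guesstimates, not verified statistics).
population = 20_000_000        # Mexico City, treated as a closed population
infected = 1_600               # suspected current cases, I
new_cases_per_day = 400        # linear guess, so dS/dt = -400 per day

susceptible = population - infected   # S, ignoring natural immunity

# a = -(dS/dt) / (S * I)
a = new_cases_per_day / (susceptible * infected)

# b is a pure hunch: 20% of infecteds are removed (die or recover) per day.
b = 0.2

print(f"S = {susceptible:,}")
print(f"a = {a:.3e}")
print(f"b = {b}")
```

Changing `new_cases_per_day` or `b` shows quickly how sensitive everything downstream is to these two guesses.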

By the chain rule,  \frac{dI}{dt} = \frac{dI}{dS} \cdot \frac{dS}{dt} , and so  \frac{dI}{dS} = \frac{\frac{dI}{dt}}{\frac{dS}{dt}} .  With  I not zero, this means

 \frac{dI}{dS} = \frac{1.25 \times 10^{-8} S I - 0.2 I}{-1.25 \times 10^{-8} S I} = -1 + \frac{16,000,000}{S} .  This derivative is zero at  S_* = 16,000,000 , and the initial conditions are  S_0 = 19,998,400 and  I_0 = 1600 (twenty million total).

The value  S_* is called the threshold value, and it is less than the initial condition  S_0 \approx 20,000,000 .  This suggests an epidemic in fact occurs.
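The threshold check can be sketched in a few lines of Python. This assumes the rounded coefficients from the text; the function name `dI_dS` is mine.

```python
a = 1.25e-8      # rounded transmission coefficient from the text
b = 0.2          # guessed removal rate per day
S0 = 19_998_400  # initial susceptibles

def dI_dS(S):
    """Slope of I as a function of S: -1 + (b/a)/S."""
    return -1.0 + (b / a) / S

S_threshold = b / a   # the slope vanishes here

print(f"threshold S* = {S_threshold:,.0f}")
# dI/dt = I * (a*S - b) is positive while S is above the threshold,
# so the number of infecteds grows at first.
print("infecteds growing initially:", a * S0 - b > 0)
```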

Luckily,  \frac{dI}{dS} can be solved in closed form, as

 I = -S + 1.6 \times 10^{7} \ln(S) + C .

The value of  C is of course determined by the initial conditions, as  C = I_0 + S_0 - 1.6 \times 10^7 \ln(S_0) \approx -2.48979 \times 10^8 . (I keep six significant figures here because  I is a small difference of large numbers, and rounding  C any further shifts the answer by tens of thousands.) Then 

 I = -S + 1.6 \times 10^7 \ln (S) - 2.48979 \times 10^8 .  

With this in mind, the maximum number of infecteds at a time occurs at  S_* = 16,000,000 and is

 I_* \approx 430,000 ,

or a little under half a million people, equivalent to about 2.2% of Mexico City's population.
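Here is a minimal Python sketch of this evaluation, assuming the rounded threshold of 16,000,000 and the initial conditions given above. It carries C at full floating-point precision, which matters: the peak is a small difference of numbers of order 10^8, so rounding C to four significant figures moves the result by tens of thousands. Evaluated this way, the peak comes out at roughly 430,000.

```python
import math

rho = 16_000_000              # threshold b/a from the text
S0, I0 = 19_998_400, 1_600    # initial conditions from the text

# Constant of integration from I = -S + rho*ln(S) + C at (S0, I0).
C = I0 + S0 - rho * math.log(S0)

def infected(S):
    """Number of infecteds as a function of the susceptibles S."""
    return -S + rho * math.log(S) + C

I_peak = infected(rho)        # the maximum occurs at S = S*

print(f"C      = {C:.6e}")
print(f"I_peak = {I_peak:,.0f}")
print(f"share  = {I_peak / 20_000_000:.2%}")
```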

If there is enough interest, I may calculate the time dynamics (how long the epidemic lasts, etc.) with numerical methods (such as Euler's method), unfortunately by hand, since my access to fast computers and cool software is limited at present.
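For readers who do have a computer handy, the following is a rough sketch of what such a forward-Euler calculation might look like in Python. It is not the promised follow-up, only an illustration under the same guessed coefficients; the step size `dt`, the stopping rule (fewer than one infected person) and the function name `simulate_sir` are arbitrary choices of mine.

```python
def simulate_sir(a=1.25e-8, b=0.2, S0=19_998_400.0, I0=1_600.0,
                 dt=0.1, max_days=3650.0):
    """Forward-Euler integration of the SIR equations, time in days."""
    S, I, R = S0, I0, 0.0
    t = 0.0
    peak_I, peak_t = I, t
    # Stop once fewer than one person is infected (or after max_days).
    while I >= 1.0 and t < max_days:
        new_infections = a * S * I * dt   # susceptibles falling ill this step
        removals = b * I * dt             # infecteds dying or recovering
        S -= new_infections
        I += new_infections - removals
        R += removals
        t += dt
        if I > peak_I:
            peak_I, peak_t = I, t
    return {"peak_I": peak_I, "peak_day": peak_t, "end_day": t,
            "S": S, "I": I, "R": R}

result = simulate_sir()
print(f"peak infecteds ~ {result['peak_I']:,.0f} around day {result['peak_day']:.0f}")
print(f"epidemic over around day {result['end_day']:.0f}")
print(f"never infected ~ {result['S']:,.0f}")
```

A smaller `dt` gives a more accurate trajectory at the cost of more steps. With coefficients this uncertain, the timings should be read as orders of magnitude at best.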

------

UPDATE May 5, 2009.

So it appears that the foundational numbers above were vastly overstated (since Mexico hadn't confirmed the particular strains of the alleged infecteds due to under-equipment): from the number of actual infecteds to the actual number of deaths related to the illness.  It now appears that the progression of the swine flu is a lot slower, and also that our derived coefficients are vastly different than originally thought.  Still, a happy exercise using the SIR equations.  I may yet post a new derivation that reflects reality more truly.

On the National Mexican Lottery, II

January 3rd, 2009

The jackpot this week is up at 300 million (MXP)! A few million more and this game turns into a fair or favorable game, how cool!

In my last post, I mentioned that some of my friends and students said that one could be really lucky and win the jackpot if one bought the first few tickets and won: 

"Unless he got truly lucky and won the jackpot before he spent too much, as in buying the first few tickets, and then quit, they argued!"

And there wasn't really much I could say about it.  It is true after all that one could be so lucky, with minuscule probability.  However, "what is the average number of tickets you have to buy before you win the first time, if you buy the 6-choice/7-choice/etc. repeatedly?" was a question that people kept asking me with some insistence.  Although to me it seemed somewhat evident that you needed to buy about  \binom{56}{6} tickets on average for the 6-choice, my friends and students weren't convinced until I showed them the mathematics that supported this.

It is really not difficult to calculate such if one understands what expected value is.  So let us assume the words

LLLLLLLLLW

LW

W

LLLLLLLLLLLLLLW,

etc., are Bernoulli-trial strings (there really are two possibilities, the binary win or lose), and they are allowable if we stop after the first win.  For the 6-choice, a word of length  n has probability  (1 / \binom{56}{6}) \cdot (1 - (1 / \binom{56}{6}))^{n-1}  because a first win on the nth ticket is preceded by  n - 1 losses.

The expected value is:

 (1 / \binom{56}{6}) \sum_{n=1}^{\infty} n \cdot (1 - (1 / \binom{56}{6}))^{n-1}

One recognizes the sum as the term-by-term derivative of a convergent geometric series* (all probabilities are less than one so they lie inside the interval of convergence), and thus the above equals

 (1 / \binom{56}{6}) \cdot \frac{1}{(1 / \binom{56}{6})^2} = \binom{56}{6} ,

the sum having been substituted adequately.  Confirming my "far-out" claim (to me really unsurprisingly), you have to wait an average of  \binom{56}{6} or about thirty-two million tickets before you'll see the first win.
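A quick numerical sanity check of this result, sketched in Python. Summing the real series term by term would take hundreds of millions of terms, so the partial sum below is checked on a toy game with win probability 1/20 (my choice, purely for illustration); the function name `expected_tickets` is also mine.

```python
import math

def expected_tickets(p, terms):
    """Partial sum of n * p * (1 - p)**(n - 1) for n = 1..terms."""
    return sum(n * p * (1 - p) ** (n - 1) for n in range(1, terms + 1))

# Toy game: a win probability of 1/20 should need about 20 tickets on average.
toy = expected_tickets(1 / 20, 2_000)
print(f"toy game: {toy:.6f} tickets on average")

# The 6-choice: the closed form 1/p is simply the number of combinations.
combinations = math.comb(56, 6)
print(f"6-choice: {combinations:,} tickets on average")   # 32,468,436
```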

This idea can be extended for the 7, 8, 9, 10-choice and so on.

*NB

The function  \frac{1}{(1-x)^2} has the series representation

 \frac{1}{(1-x)^2} = \sum_{n=1}^{\infty} n \cdot x^{n-1} with interval of convergence  -1 < x < 1 .  

This is obtainable by taking the derivative, term by term, of the series representation for:

 \frac{1}{1-x} = \sum_{n=0}^{\infty} x^n with interval of convergence  -1 < x < 1 .

On the National Mexican Lottery, I (a Cool Combinatorial Identity)

December 27th, 2008

I sometimes help people to prepare for any of the plethora of standardized tests required for everything academic, and one of my students (now applying for a Fulbright), also a high school friend, while on the improbable topic (since it scarcely appears in the general exams) of probability, asked me about the likelihood of winning the most popular game of chance by the National Mexican Lottery (Julio is naturally curious, but these hard times surely provide an additional motivation!).  In the beginning he phrased it thusly: you have a piece of paper with fifty-six numbers and you can pick six of them.  Then, at the lottery, if the six balls match your chosen six, you win the grand prize.  Naturally, I replied almost without thinking, that the probability of winning was

 \frac{1}{\binom{56}{6}} ,

or one in about thirty-two million.

This follows from basic considerations in combinatorics and probability: suppose you can fill six slots. In the first slot, you can place any one of the 56 numbers.  Having chosen one number in the first slot, there are 55 left that can go into the second slot and so on... 54... 53, 52, and finally 51 remaining numbers can go in the last slot.  By the counting principle, the number of arrangements is simply then  56 \cdot 55 \cdots 51 = \frac{56!}{50!} .  Since the ordering doesn't matter, and there are 6! ways in which a particular configuration of six can be ordered, the total number of arrangements must be divided by 6!.  Thus, we obtain  \frac{56!}{50!6!} possible outcomes.  This is really the definition of the combinatorial operation "choose:"   \binom{n}{s} = \frac{n!}{s! (n-s)!} .
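The counting argument above can be checked directly with Python's standard library (`math.perm` and `math.comb` need Python 3.8 or later). This is only a sketch of the same arithmetic.

```python
import math

ordered = math.perm(56, 6)                 # 56 * 55 * 54 * 53 * 52 * 51
unordered = ordered // math.factorial(6)   # divide out the 6! orderings

print(f"ordered arrangements: {ordered:,}")
print(f"unordered outcomes:   {unordered:,}")
print("matches 56 choose 6: ", unordered == math.comb(56, 6))
```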

My friend Julio then told me that there was the option of buying an extra choice.  In other words, rather than the 6 basic choices available, you could purchase 7 (the lottery would still pick 6 of 56 balls).  In fact, he said, you can purchase 8, 9, and even 10 choices.  He wanted to know how much his probability of winning increased in each of the cases.  He was thinking of buying a ticket or several, especially because at the time (last week or so) the jackpot was 206 million pesos or about 18 million dollars (now it's at 240 mill MXP, or about 21 mill USD, there having been no winner), and was interested in knowing what strategy maximized his probability of winning.  I thought to myself: "Hmm... lotteries aren't usually fair games... but let's try it out and see if we can figure something with the power of mathematics."  Admittedly at first I was stumped... I had to think this through a bit! It wasn't as obvious as you might have thought from the first derivation.  In the end I reasoned it as follows:

The fact that now I can purchase 7 options means, by the above reasoning, that I can choose seven numbers in   \binom{56}{7} ways, ordering not mattering.  This is my sample space.  Now, the thing is that out of these, there are six set or marked-for-win balls, so if I assume I have actually chosen them and I will win, there is one remaining number that can be wrong, and it can be any of the other 50 numbers.  There are  \binom{50}{1} ways to choose it.  My probability of winning is therefore  \frac{\binom{50}{1}}{\binom{56}{7}} .

For eight options, a similar argument means that my sample space is  \binom{56}{8} .  With six balls set to win, I have two balls that can be wrong, or  \binom{50}{2} ways in which I can be right.  The probability of winning in this case is  \binom{50}{2}/\binom{56}{8} .

For nine options, the probability is  \binom{50}{3}/\binom{56}{9} , and for ten it is  \binom{50}{4}/\binom{56}{10} .

Generally speaking and by the above argument, I thought, if I have a set of n balls to choose from, with s marked to win, and the possibility of choosing r, with  r \geq s , it must be that the probability of win is therefore:

 \binom{n-s}{r-s} / \binom{n}{r}        (A)

Happy at my apparent triumph, it never occurred to me that there was another way to argue the matter at all.  In fact I would soon discover there was a simpler way to think about it!  I began concocting a table of probabilities to show Julio, and my sister, as she does, meandered circuitously and then into my room, finally asking what I was doing.

A genius engineer, my sister tends to think of things in much simpler and more efficient ways than I ever can.  I think it is a blessing to have someone like my sister.  In so many ways she's very much like me, but also so dissimilar, and so she comes up with different considerations on a problem... such as sometimes lead to tiny discoveries, like the identity I'll be proving in a bit.  Working on the problem of winning probabilities, she argued the following: there are  \binom{56}{6} possible outcomes of choosing six balls from fifty-six.  If I have seven slots to choose six correct balls, then there are  \binom{7}{6} ways I can do this.  If there are eight slots, there are  \binom{8}{6} ways to do this, and so on. 

In general, my sister's argument for the probability of win can be expressed as:

 \binom{r}{s} / \binom{n}{s}       (B)
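To see the two arguments side by side, here is a small Python sketch that evaluates (A) and (B) as exact fractions for the 6- through 10-choice tickets. The function names are mine.

```python
from fractions import Fraction
from math import comb

def win_prob_a(n, s, r):
    """Formula (A): choose the r - s 'wrong' numbers among the n - s losers."""
    return Fraction(comb(n - s, r - s), comb(n, r))

def win_prob_b(n, s, r):
    """Formula (B): winning draws covered by r picks, over all possible draws."""
    return Fraction(comb(r, s), comb(n, s))

base = win_prob_b(56, 6, 6)   # the plain 6-choice ticket
for r in range(6, 11):
    pa, pb = win_prob_a(56, 6, r), win_prob_b(56, 6, r)
    print(f"{r}-choice: agree={pa == pb}, P(win)={pb}, times base={pb / base}")
```

Using `Fraction` keeps everything exact, so agreement between the two formulas is not an artifact of floating-point rounding.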

The most interesting thing about this exchange is that it happened in a matter of minutes... such is opportunity.  Thrilling, amusing, and... evanescent.  Kind of like life.

The Pasquali-Pasquali combinatorial identity.  I'm calling it this temporarily because, despite my efforts, I have been unable to find it explicitly in this form in either combinatorics or probability texts.  Unfortunately, I don't have access to scholarly mathematics journals, but I would be very grateful to my readers if they could point me to a proper reference.  In the meantime, it's nice that this identity has its motivation in a real-world combinatorial argument. 

 \binom{n-s}{r-s} \cdot \binom{n}{s} = \binom{n}{r} \cdot \binom{r}{s} , with  n \geq r \geq s \geq 0 .

Proof.

By the definition of the choice operation,

 \binom{n-s}{r-s} \cdot \binom{n}{s} = \frac{(n-s)!}{(r-s)!(n-r)!} \cdot \frac{n!}{s!(n-s)!} = \frac{n!}{(r-s)!(n-r)!s!}

 = \frac{n! r!}{r!(n-r)!s!(r-s)!} = \frac{n!}{r!(n-r)!} \cdot \frac{r!}{s!(r-s)!} = \binom{n}{r} \cdot \binom{r}{s} \quad \Box
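The proof settles it, but a brute-force check is a cheap safeguard against slips in the algebra. This Python sketch tests the identity for every valid triple up to an arbitrary bound of my choosing (n at most 40).

```python
from math import comb

def identity_holds(n, r, s):
    """Check C(n-s, r-s) * C(n, s) == C(n, r) * C(r, s)."""
    return comb(n - s, r - s) * comb(n, s) == comb(n, r) * comb(r, s)

# Every triple with n >= r >= s >= 0 up to the bound.
all_ok = all(
    identity_holds(n, r, s)
    for n in range(41)
    for r in range(n + 1)
    for s in range(r + 1)
)
print("identity holds for all n <= 40:", all_ok)
```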

I am sure I have read this interesting datum about the National Mexican Lottery somewhere, but I cannot pinpoint exactly from what book: it is the number one (or two) source of income of the Mexican government.  Everybody plays this game of chance, in the hopes of becoming millionaires from one day to the next; such is the Mexican inclination, such is the Mexican character: sensation-craving.  A real desire to change an otherwise ordinary existence.

Although it is true, as will be seen in the file I'll be linking to, that the probability of win is increased by 7 times if one purchases the 7th extra choice (as compared to the base case of 6 choices), by 28 times if one purchases the 8th choice, 84 times for the 9th, 210 times for the tenth, and so on... this can only be leading or encouraging pieces of information (in fact the National Lottery publishes these probabilities in the hopes of persuading people to play).  Firstly, it is still extremely improbable to win. Secondly, what one must really focus on is expected profit.  Negative expected profit means that, if you play repeatedly, on average, you will be losing money despite the occasional win (!).  Such are called "unfair" games, because money goes out of your pocket and into the coffers of the House. "Fair" games are those in which the expected profit is equal to zero, and "favorable" if expected profit is positive, as you are in effect winning money on average.

The point at which this particular game is "fair" is in reality determined by the cost of the ticket.  The normal 6 choice ticket is 15 pesos, but the expected profit on the ticket is about -9 pesos:

 P = (1 / \binom{56}{6}) \cdot (206,000,000 - 15) + (1 - (1 / \binom{56}{6})) \cdot (-15)

To be fair, the jackpot would have to be around 488 million pesos:

 0 = (1 / \binom{56}{6}) \cdot (J - 15) + (1 - (1 / \binom{56}{6})) \cdot (-15)
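Both numbers can be reproduced with a short Python sketch using the same expected-profit formula. The function name and the default arguments are mine. With these inputs the profit comes out a little better than -9 pesos, and the break-even jackpot lands close to 487 million, in line with the rounded figures quoted here.

```python
from math import comb

outcomes = comb(56, 6)    # number of possible draws
p_win = 1 / outcomes      # probability that a single 6-choice ticket wins
ticket_price = 15         # pesos

def expected_profit(jackpot, price=ticket_price, p=p_win):
    """Win (jackpot - price) with probability p, lose the price otherwise."""
    return p * (jackpot - price) + (1 - p) * (-price)

# Setting the expected profit to zero gives p * J = price, so J = price / p.
fair_jackpot = ticket_price * outcomes

print(f"expected profit at 206 million: {expected_profit(206_000_000):.2f} pesos")
print(f"fair jackpot: {fair_jackpot:,} pesos")
```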

Above 488 million, it is really to your advantage to buy as many (different-combination) tickets as possible.  My suggestion to Julio and some other very interested friends was to wait until the jackpot accumulated about 488 million pesos, so that they would have a real chance of earning some money.

Julio conceded, but my not very mathematically inclined friends and students complained.

First of all, of course it would never reach that much!  It had never before been as high as 206 mill (let alone 240 mill MXP, as it is now), and someone was SURE to buy tickets until they got it.  This was a golden opportunity, you see.  I replied that they forgot what expected profit meant: the individual in question would most probably spend more money than the jackpot before he won, and if he kept at it, with the same jackpot each time, he would on average still be losing money no matter how many times he won.  Unless he got truly lucky and won the jackpot before he spent too much, as in buying the first few tickets, and then quit, they argued!  I had to concede, but I also offered another solution: boycott this particular game of chance until the ticket cost descends to a price that would make the game fair.  In this case, the jackpot would have to remain at 206 mill (or 240 mill this week), and the ticket price would have to go down to about 6 (and something) pesos.

Everyone groaned!  I was in effect suggesting not to play the game, but that was not it at all.  I was merely suggesting playing the game when circumstances were more favorable, or to go into the game with the mindset of not winning.  "The game is a game of chance you are sure to lose.  Buy the ticket for social reasons, because your friends are doing it, because you like the thrill of choosing 6 numbers... or what have you.  But not because you think you're lucky and you are sure you will win, because in effect it's exactly the opposite.  The National Lottery is smart!"  My statement and somewhat lopsided grin allowed my friends a way out.  

They played, and lost.

Would it have been better if they had purchased 7, 8, 9, or 10 options? The answer is no.  The price of buying an extra option is determined by the size of the fair jackpot, in this case about 488 mill (based on the 15 peso 6 choice ticket), and is proportional to the probability of winning:

Example, determining the 7 choice fair price at 488 mill (based on the 15 peso 6 choice ticket):

 0 = (\binom{7}{6} / \binom{56}{6}) \cdot (488,000,000 - T_7) + (1 - (\binom{7}{6} / \binom{56}{6})) \cdot (-T_7)

Buying 7, 8, 9 or 10 options has an increasingly negative expected profit, respectively, at 206 mill (at anything less than 488 mill, really).  It's like being penalized more and more harshly for wanting to better your chances of winning!  

Example, determining the expected profit having solved for  T_7 above, 7 choice ticket:

 P = (\binom{7}{6} / \binom{56}{6}) \cdot (206,000,000 - T_7) + (1 - (\binom{7}{6} / \binom{56}{6})) \cdot (-T_7)
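The two examples above can be run for every ticket type with a small Python sketch. It assumes, as the text does, that each price  T_r is set so that the game is exactly fair at a 488 million jackpot. The function names are mine, and the resulting prices are the ones this assumption implies, not the lottery's published prices.

```python
from math import comb

FAIR_JACKPOT = 488_000_000   # rounded fair jackpot from the text
JACKPOT = 206_000_000        # jackpot at the time

def win_prob(r):
    """Probability of winning with an r-choice ticket."""
    return comb(r, 6) / comb(56, 6)

def fair_price(r):
    """Solve 0 = p*(J - T) + (1 - p)*(-T) for T, which gives T = p*J."""
    return win_prob(r) * FAIR_JACKPOT

def expected_profit(r, jackpot=JACKPOT):
    p, T = win_prob(r), fair_price(r)
    return p * (jackpot - T) + (1 - p) * (-T)

for r in range(6, 11):
    print(f"{r}-choice: price {fair_price(r):10.2f}, "
          f"expected profit {expected_profit(r):10.2f} pesos")
```

Since the price scales with the win probability, so does the expected loss: each extra choice multiplies both by the same factor.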

So you really are better off and losing less money if you play the 6-choice for fun  (actually as infrequently as possible).  Since you are going to lose anyway, better lose less money than more money, is what I say!