Tuesday, March 29, 2011

March Madness II

If you're into sports, I'm sure you've been hearing/talking about how crazy this year's NCAA Men's Basketball tournament has been.  Even if you're not into sports, you've probably been hearing about this. 11th seeded Virginia Commonwealth University (from Richmond, VA) has won 5 games as the underdog and leads the final field in terms of surprises.  Many experts thought that VCU shouldn't have even made the tournament.  Butler University (from Indianapolis, IN) is also in the Final Four as an #8 seed. Yes, they were in the Final Four (and championship) last year, but they were a 5 seed then and still considered a 'Cinderella team.' I'm sure this Cinderella status was helped by the fact that they came from a "mid-major" conference, a conference outside the traditional powerhouses.  The two other teams: Kentucky (a 4 seed) and University of Connecticut (a 3 seed).  While less surprising, these two teams were far from the "obvious" choices.

So, how crazy is this year's tournament?

Out of more than 5.9 million ESPN brackets, 2 people correctly picked the Final Four. Less than 7% of people picked all four of these four teams to win. Those picks were crazy.

Yeah, well most people don't know anything about basketball and just pick a bunch of high seeds or teams with mascots with funny names.  Doesn't make this crazy.

This is the first year since 1980 that no #1 seeds are in the Final Four. That's crazy.

Who says having all 4 #1 seeds would be normal?  It's only happened once before (2008). That would be crazy.

This is the first year ever (since seeding began in 1979) that there aren't any #1 or #2 seeds in the final four. It's crazy that none of the eight best teams are in the finals.

Yeah, yeah. Well on the day those eight teams played they weren't the best eight teams.  This is also the first year ever that there was a team that began with the letter V and a team that began with the letter B.  Doesn't make that special...or crazy.

According to the Vegas odds at the beginning of the tournament, Ohio State, the overall #1 seed, had a 62% chance of making it to the Elite Eight and they did not.  I could have made crazy money on that game.[1]

According to the Vegas odds at the beginning of the tournament, there was a 9.6% chance of no #1 seed making it to the Final Four.  Small, but an expectation that this will happen once a decade is not crazy small.

If this had been last year, VCU wouldn't have even made the tournament.  They only got a chance because the field expanded from 65 teams (with one "play in" game) to 68 teams (with four play-in games). It's crazy that the last team in is in the Final Four.

Well then I guess it was crazy of the selection committee to only include 65 teams.  The only reason to have VCU in the tournament is if they have a shot.  See, not crazy.

There's an 11 seed in the Final Four!  This team was, at best, ranked below 40 other teams. How crazy is that?

An 11 seed made the Final Four just 5 years ago (George Mason University).  And they were also from Virginia.  Not crazy at all.

The odds of VCU making the Final Four were 3 in 10,000.  That's crazy small.

Yes, but people would have been saying the same thing if any of the high seeds were in the Final Four.  In fact, the odds of an 8 through 16 seed making the Final Four are not actually that small: 38.6% (actually, the math on this is not quite right because this percentage includes the possibility of having more than 4 teams seeded 8 or lower in the Final Four which, since there are only 4 slots, is not possible.  Any thoughts on how to resolve this issue without doing a lot of computations are welcomed). As a comparison, this is about the same percentage as Duke (40.3%).[2]  Not so crazy small after all.

Yeah, but there are two teams seeded lower than 8 in the final four.  That's only happened once before (2000) and they were both 8 seeds. An 8 and an 11 seed?  One of which is guaranteed to be in the final?  Crazy.

I spent some time thinking about the best way to compute the Vegas odds of having 2 teams with an 8 seed or lower in the Final Four. You could find the probability of two specific teams making it by multiplying their individual probabilities and then do this for every legal combination of 8-16 seeds (remembering that teams from the same region can't both make it). This is what I ended up doing (by writing a Python program). I have yet to come up with a more clever way of thinking about this. Anyway, I got 3.59%.[3] And since I'm sure you're wondering, there's a 3.75% chance of having at least 2 teams seeded 8 or lower in the Final Four.  A small percentage, yes, but not absurd...about the same odds as Gonzaga had of making the Final Four. Considering their success in recent years, I'm not sure people would have called this crazy.
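
For what it's worth, here is one possible shortcut for both questions above (at least one low seed, and two or more). It is only a sketch under two assumptions that are mine, not Vegas's: the four regions are treated as independent, and the odds within a region are treated as mutually exclusive (only one team comes out of each region), so they can simply be added. The helper name count_distribution is made up for this example. The lists are the ones from footnote 3, and since Southeast is listed with only eight entries, the output will not necessarily match the percentages quoted in the post.

```python
from itertools import combinations

# Vegas odds for seeds 8 through 16 in each region, copied from footnote 3.
# Southeast has only eight entries as listed in the post.
east = [.004, .012, .004, .017, .001, .0004, .0002, .0001, .000001]
west = [.009, .004, .012, .016, .0005, .002, .0006, .0001, .000004]
southwest = [.024, .029, .014, .0003, .01, .0004, .0002, .0002, .00002]
southeast = [.01, .012, .039, .064, .046, .001, .0003, .00009]

# Only one team wins a region, so the chance that a region sends *some*
# 8-16 seed is just the sum of that region's entries.
region_totals = [sum(r) for r in (east, west, southwest, southeast)]


def count_distribution(probs):
    """Return dist where dist[k] = P(exactly k of the independent events occur)."""
    dist = [1.0]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1 - p)      # this region sends a 1-7 seed
            new[k + 1] += mass * p        # this region sends an 8-16 seed
        dist = new
    return dist


dist = count_distribution(region_totals)
p_at_least_one = 1 - dist[0]
p_exactly_two = dist[2]
p_at_least_two = 1 - dist[0] - dist[1]

# Sum over pairs of regions of (total_a * total_b): the same quantity the
# nested loops in footnote 3 add up, without looping over individual teams.
pair_sum = sum(a * b for a, b in combinations(region_totals, 2))

print("at least one:", round(p_at_least_one, 4))
print("exactly two: ", round(p_exactly_two, 4))
print("at least two:", round(p_at_least_two, 4))
print("pair sum:    ", round(pair_sum, 4))
```

The pair sum counts a Final Four with three low seeds three times (once per pair), which is why it comes out a bit larger than the "at least two" figure from the distribution.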

So, what do you think?  Crazy? 

[1] Vegas odds are based on the underlying idea that bookies want to take the same number of bets for both teams...well, unless they get greedy. So if they determine that the chances of team A winning are 15% (based on whatever metric they devise), the line might be something like +850 for that team.  This means that if you bet $100 and the underdog wins, you'd win $850 (a good enough return to incentivize betting for the underdog).  Whereas the line for team B, the favorite, might be -870, which means that you need to bet $870 to win $100. So consider if 100 people bet on each team. The casino takes in $10,000 from the people who bet on the underdog and $87,000 from the people who bet on the favorite for a total of $97,000. If team A, the underdog, wins, the casino pays out 100*(850+100)=$95,000 and banks $2,000.  If team B wins, the casino pays out 100*(870+100)=$97,000 and breaks even. Wash. Rinse. Optimize for maximum profit. Repeat.
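
If you'd like to play with the bookkeeping in this footnote, here is a minimal sketch. It assumes 100 bettors on each side (the head count that matches the $10,000 and $87,000 totals), with each underdog bettor staking $100 at +850 and each favorite bettor staking $870 at -870. The function name book_profit and its parameters are made up for illustration.

```python
# American odds from the footnote's example.
UNDERDOG_STAKE, UNDERDOG_WIN = 100, 850   # +850: stake $100 to win $850
FAVORITE_STAKE, FAVORITE_WIN = 870, 100   # -870: stake $870 to win $100


def book_profit(n_underdog, n_favorite, underdog_wins):
    """Money the book keeps after collecting all stakes and paying the winners."""
    taken_in = n_underdog * UNDERDOG_STAKE + n_favorite * FAVORITE_STAKE
    if underdog_wins:
        # Winners get their stake back plus their winnings.
        paid_out = n_underdog * (UNDERDOG_STAKE + UNDERDOG_WIN)
    else:
        paid_out = n_favorite * (FAVORITE_STAKE + FAVORITE_WIN)
    return taken_in - paid_out


print(book_profit(100, 100, underdog_wins=True))
print(book_profit(100, 100, underdog_wins=False))
```

Changing the two head counts shows how sensitive the book's take is to how the bets split between the teams.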

[2] This brings up a common misconception in probability theory.  If you flip a coin ten times, which is more likely:

HHHHHHHHHH or THHTTHTTTH?

In fact, both are equally likely even though the first looks much sketchier than the second. The problem is, the second "looks more random" than the first.
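
A quick way to convince yourself, assuming a fair coin and independent flips (the function name sequence_probability is just for this sketch):

```python
from fractions import Fraction
from math import comb


def sequence_probability(flips, p_heads=Fraction(1, 2)):
    """Probability of seeing exactly this sequence of H/T flips, in this order."""
    prob = Fraction(1)
    for flip in flips:
        prob *= p_heads if flip == "H" else 1 - p_heads
    return prob


print(sequence_probability("HHHHHHHHHH"))
print(sequence_probability("THHTTHTTTH"))

# Why the second one "looks more random": many different sequences share its
# head count, while only one sequence has ten heads.
print(comb(10, 10), "sequence with 10 heads")
print(comb(10, 4), "sequences with 4 heads")
```

Each specific sequence is one out of 1024. What differs is how many sequences look like it.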

[3] Here's the code I used for this computation.

# Vegas odds for seeds 8 through 16 in each region
# (Southeast has only eight entries as listed here)
East=[.004, .012, .004, .017, .001, .0004, .0002, .0001, .000001]
West=[.009, .004, .012, .016, .0005, .002, .0006, .0001, .000004]
Southwest=[.024, .029, .014, .0003, .01, .0004, .0002, .0002, .00002]
Southeast=[.01, .012, .039, .064, .046, .001, .0003, .00009]

# step through each of the possible pairs of teams from different regions that
# could represent in the Final Four and add up the probability of each pair
# (looping over the lists themselves so every listed seed is included)
regions=[East, West, Southwest, Southeast]
psum=0
for a in range(len(regions)):
     for b in range(a+1, len(regions)):
          for p in regions[a]:
               for q in regions[b]:
                    psum+=p*q

print(psum)

Monday, March 14, 2011

March Madness Baby!

It's a busy week.  Today's Sorta-Pi-Day and you have two fewer days to fill out your NCAA bracket since there are two play-in games tomorrow evening. I have lots of decisions to make, including whether to pick Ohio State to go all the way (Go Bucks!) or have them lose on the early side so that I'll be a little happy either way.



I love the office pool (even though I don't work in an office and the school I work at doesn't do a pool).  There's the classic pool where you get a certain number of points for each correct pick and the points awarded increase by some amount.  There are lots of alternatives.  Some involve giving greater weights to upsets. One of my favorites is for you to pick five players as your tournament team. You get as many points as they score in the tournament (which means that you want to pick players who can score and who play on teams that will do well and therefore play more games).  Heck, you could also run a tournament where you try to pick the losers making the first round the most important round (one pool I was a part of for a number of years gave second to last place 10% of the winnings).

Anyway, speaking of mathematics there are some great mathematical questions here: a classic What Can You Do With This?
  • How many people would need to be in your office to guarantee a perfect bracket (a nice counting problem)?  
  • How many total games are played (a nice counting problem that has a very clever solution involving no counting)?
  • Using the Vegas odds of each team winning the tournament, what's the most likely bracket outcome?  What's the probability of this being the actual outcome?
  • And finally...how many hours of basketball am I going to be watching over the next month?
Feel free to add your own mathematical questions or give your team support (even though they will inevitably be steamrolled by The Ohio State University).
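
For anyone who wants to check their answers to the first two questions, here is a small sketch. It assumes a single-elimination tournament and a pool where every game has to be picked. The function names are made up for this example.

```python
def games_played(teams):
    """Every game eliminates exactly one team and only the champion is never
    eliminated, so the number of games is one fewer than the number of teams."""
    return teams - 1


def possible_brackets(teams):
    """Each game has two possible winners, so the brackets double with every game."""
    return 2 ** games_played(teams)


for field in (64, 68):
    print(field, "teams:", games_played(field), "games,",
          possible_brackets(field), "possible brackets")
```

If everyone in the office fills out a different bracket, possible_brackets is the head count you'd need to be sure somebody is perfect.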

Cow Pi Day

Happy Pi Day!  Woo hoo!  March 14!  3.14 (btw, MIT releases admission decisions today at 1:59, the next three digits of pi).

Ok, enough happiness and joy and thoughtless love of mathematical constants that sound like yummy desserts.  I'm going to risk putting a big cow pie in your Pi Day and talk about the absurdities of the whole idea.

Base number systems


Pi (the ratio of the circumference to the diameter of ANY circle) is approximately 3.14159265358.  Well, that's what it is in base-10.  By base-10, I mean using a number system with 10 symbols (0, 1, 2, 3, 4, 5, 6, 7, 8, and 9).  In this system we use these ten symbols to count up and then when we run out we use a combination of these symbols and the power of place value to describe larger values (372 is really shorthand for 3 hundreds, 7 tens, and 2 ones).  Humans haven't always used base-10 though and there isn't anything particularly special about it.  The Mayans used a base-20 system and the Babylonians used a base-60 system.  Yes, that means you'd have to learn 60 different symbols, but the number 3599 is only 2 digits.  Some believe that when computers take over the world we'll be forced to use a base-2 number system.  Computers like this system because they can really only understand two things: "electricity is flowing" and "electricity is not flowing."  You might have seen binary numbers before (0, 1, 10, 11, 100, etc).
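
Here is a small sketch of place value for whole numbers in any base. Since bases like 60 need more symbols than we have handy, the (made-up) function to_base returns each digit as an ordinary number instead of inventing symbols.

```python
def to_base(n, base):
    """Digits of the non-negative whole number n in the given base,
    most significant digit first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)   # peel off the lowest place value
        digits.append(remainder)
    return digits[::-1]


print(to_base(372, 10))
print(to_base(3599, 60))
print(to_base(4, 2))
```

The second line is the Babylonian claim from above: 3599 is 59 sixties and 59 ones, so it only takes two base-60 digits.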

Some mathematical concepts remain regardless of what base you're in.  A prime number in base 10 will be a prime number in base 2, base 20, base 60, and base [;2^{43112609}-1;] (which is currently the largest known prime number).  Pi, on the other hand, will not always be 3.14159265358.....


Pi in base-2?


First of all, what in the world does that even mean?  Remember how someone (that would be me) mentioned earlier that 372 is really shorthand for 3 hundreds, 7 tens, and 2 ones?  We can also think about this as [;3\times10^2+7\times10^1+2\times10^0;].  Remember that [;10^0;]=1 (which can be explained using, among other methods, pattern sniffing with decreasing powers).  Well this isn't a coincidence and if we wanted to describe the number of days in a non-leap-year year in base-5, we'd write it as 2430 or [;2\times5^3+4\times5^2+3\times5^1+0\times5^0;].  Well, we might use symbols other than 2, 4, 3, and 0 but you hopefully get the idea.  Anyway, this idea can be continued into the decimal realm.  The first few digits of pi in base-10 are 3.14159 which, in its expanded form, can be written as [;3\times10^0+1\times\frac{1}{10^1}+4\times\frac{1}{10^2}+1\times\frac{1}{10^3}+5\times\frac{1}{10^4}+9\times\frac{1}{10^5};] or [;3\times10^0+1\times10^{-1}+4\times10^{-2}+1\times10^{-3}+5\times10^{-4}+9\times10^{-5};]. This just means that there are 3 ones, 1 tenth, 4 hundredths, 1 thousandth, 5 ten thousandths, and 9 hundred thousandths.  So what would this look like in base 2?  Well we'd just need to figure out how many eights, fours, twos, ones, halves, fourths, eighths, sixteenths, etc there are in pi.  Well that sounds like a fun project to figure out.

Pi Day in other bases
One nice thing about figuring out pi in other bases is that I can give you the answers without really giving anything away (the method is the hard...err fun...part).  So...

Base 2: 11.00100100001111 (1 two, 1 one, no halves and fourths, 1 eighth, etc)
Base 3: 10.01021101222201
Base 4: 3.02100333122220
Base 5: 3.03232214303343
Base 6: 3.05033005141512
Base 7: 3.06636514320361
Base 8: 3.11037552421026
Base 9: 3.12418812407442  
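
If you want to check these (or your own answers) without spoiling the fun part, going the other direction is easy: expand each digit by its place value and add. A minimal sketch, where expansion_value is a made-up name and ordinary floating point is assumed to be accurate enough for 14 places:

```python
import math


def expansion_value(text, base):
    """Value of a digit string like '3.0210' read in the given base (digits 0-9 only)."""
    whole, _, frac = text.partition(".")
    value = 0.0
    for ch in whole:
        value = value * base + int(ch)   # shift left one place, add the digit
    place = 1.0 / base
    for ch in frac:
        value += int(ch) * place         # halves, fourths, eighths, ... in base 2
        place /= base
    return value


expansions = {
    2: "11.00100100001111",
    3: "10.01021101222201",
    4: "3.02100333122220",
    5: "3.03232214303343",
    6: "3.05033005141512",
    7: "3.06636514320361",
    8: "3.11037552421026",
    9: "3.12418812407442",
}

for base, text in expansions.items():
    value = expansion_value(text, base)
    print(base, value, math.pi - value)
```

The last column shows how far each 14-place expansion falls short of pi. Bigger bases get closer because each place is worth less.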

Anyway, you can make the argument that base-10 is our normal base and that makes today special (although our calendar system really isn't a base 10 system.  It's more of a mixture of base-7, base-10, base-12, and base 28/29/30/31).  Sort of...

My point is not to rain on your Pi Day parade.  I'm all for having days where people actually want to talk about and think about math. I'm just saying that October 1st could also be Pi Day (in base 3).  Or March 2nd, 3rd, 5th, 6th, 11th, and 12th.

Maybe we should just call March Pi Month?

Monday, March 7, 2011

If It's Not Scottish Calculus, It's Crap

In memory ay me matile Colin Maclaurin, here's a summary ay whit ah did in mah Calculus BC class teh day bfer yeserday (ok, I'm afraid that's all the Scottish I know).

From our work with geometric series, we knew that [;\sum_{n=0}^{\infty} x^n =\frac{1}{1-x};] if |x| < 1


A complete side note...I'm using TeX the World to embed all the mathy stuff (thanks Sue).  While I've found it to be a bit buggy and clunky, it supposedly will work in Blogger, Google Docs, Gmail, etc. so I'm excited about the possibility.  This is my first attempt to use this so let me know if you're having trouble reading anything.

Anyway, from an algebraic perspective, I think this is quite surprising.  Basically, we're saying that this infinite polynomial is equal to a nice "simple" rational expression.
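
A quick numerical sanity check of that claim (just a sketch, with geometric_partial_sum as a made-up name): add up more and more terms of the infinite polynomial and compare with the rational expression.

```python
def geometric_partial_sum(x, terms):
    """1 + x + x^2 + ... up to (but not including) x^terms."""
    return sum(x ** n for n in range(terms))


x = 0.5
for terms in (5, 10, 20, 50):
    print(terms, geometric_partial_sum(x, terms), 1 / (1 - x))

# Outside |x| < 1 the two sides have nothing to do with each other.
print(geometric_partial_sum(2, 10), 1 / (1 - 2))
```

Inside the interval the partial sums settle onto 1/(1-x). At x = 2 they run off toward infinity while the rational expression sits at -1.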

So is this true for other functions?

I asked my students what functions would be especially interesting to look at.

[;x^2+2x-4;] is not a terribly interesting expression to write as a polynomial.  sin(x) on the other hand...that would be curious.

Here's what they came up with as a class:
  • trig functions
  • log functions
  • [;e^x;] 
  • [;\sqrt{x};] 
I then had the following dialogue:

Me: So [;e^x;] is a particularly interesting function in calculus because...

Class: its derivative is also [;e^x;]

Me: Great.  Try and find a polynomial with this same property.

They broke into small groups and began exploring.  Every group quickly realized that a finite polynomial won't cut it.  They then began trying different infinite polynomials.  I gave a few groups some guidance around starting with [;a_0+a_1x+a_2x^2+a_3x^3+...;] and then trying to find the coefficients that would make the property of f(x)=f'(x) hold.  Different groups approached this in different ways.  Some differentiated [;a_0+a_1x+a_2x^2+a_3x^3+...;] to get [;a_1+2a_2x+3a_3x^2+...;] and then set the coefficients of each power to be equal to each other.  One group did something similar with the integral instead of the derivative.  Another group did something similar to this, but in more of an informal way where they realized that as you repeatedly differentiate you'll end up with factorials in front of your coefficients and then played around with some different polynomials with factorials in them.

I didn't help at all.   To my students' credit, when I asked 15 minutes later if they wanted us to come back together and talk/get hints/share ideas, I got a resounding no.  Woo hoo!


They ended up working independently for about 30 minutes.  Every group found a polynomial that worked. Some groups realized that any multiple of their solution would have the property that f(x)=f'(x).  Two groups went further and attempted to define some equality between [;e^x;] and their polynomial.  One individual began working to find a polynomial with derivative characteristics similar to those of sin(x), i.e. f(x)=-f''(x).  Anyway, everyone independently came up with some version of
[;\sum_{n=0}^{\infty} \frac{a_0x^n}{n!};]
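
For the curious, the coefficient matching some groups used can be sketched in a few lines. Setting the coefficients of f and f' equal power by power gives [;(n+1)a_{n+1}=a_n;], and the code below just follows that rule on a truncated polynomial. The function names are made up, and a 20-term cutoff is an arbitrary choice for illustration.

```python
import math


def matching_coefficients(a0, count):
    """First `count` coefficients satisfying (n+1)*a_{n+1} = a_n, starting from a0."""
    coeffs = [a0]
    for n in range(count - 1):
        coeffs.append(coeffs[-1] / (n + 1))
    return coeffs


def differentiate(coeffs):
    """Coefficients of the derivative: a_1, 2*a_2, 3*a_3, ..."""
    return [n * a for n, a in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of a_0 + a_1*x + a_2*x^2 + ... at x."""
    return sum(a * x ** n for n, a in enumerate(coeffs))


coeffs = matching_coefficients(1.0, 20)

# The factorials show up on their own: a_n = a_0 / n!
print(coeffs[:5], [1 / math.factorial(n) for n in range(5)])

# The derivative reproduces the polynomial (minus the last term we cut off).
print(differentiate(coeffs)[:5])

# And with a_0 = 1 the value at x = 1 lands very close to e.
print(evaluate(coeffs, 1.0), math.e)
```

Starting from a different a_0 scales every coefficient, which is the "any multiple works" observation some groups made.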

I was pretty excited about this.

Sure, we still have work to do in terms of formalization to develop a more robust notion of equality (more than just having two functions with the same property).  It'll be interesting, though, to see if this helps students understand Maclaurin Series and Taylor Series.