Fick’s Law is a principle of physics, and forms the basis and foundation of diffusion and more generally mass transfer. Diffusion is the action where molecules of dye in water spread out out from their source instead of staying concentrated where the source is, even if you don’t mix up the water at all. In layman’s terms I would define it like this: “molecules diffuse from areas of high consentration to areas of low concentration. They diffuse faster when the difference in concentration between the areas is larger, and the diffuse faster when the distance between the areas is smaller.”

A more accurate definition would be stated like this: “the diffusive flux is proportional to the difference in concentration, and inversely proportional to the distance.” First of all let’s look at the word ‘flux’, a very technical term. It’s basically a way of quantifying how much ‘stuff’ is moving at any particular point in space. It’s defined by the following:

 \large{ \left[ \text{flux of stuff} \right] = \large{ \frac{\left[ \text{stuff} \right]}{\left[ \text{area} \right] \left[ \text{time} \right] } } }

So basically flux means the rate of flow of stuff per unit area. The ‘stuff’ is for whatever moving quantity you are interested in. For example in SI units molar flux would be in units of mol/(m2·s), heat flux (or more generally energy flux) would be J/(m2·s) which is equivalent to W/m2, and mass flux would be kg/(m2·s). A bit more esoterically in fluid dynamics there is momentum flux, information theory has entropy flux, electromagnetism uses electric flux and magnetic flux, and quantum mechanics even has probability flux!

Anyway, in the definition for Fick’s Law the word ‘proportional’ means a linear relationship: if the difference in concentration doubles, then the flux doubles. If the difference in concentration reduces to one half, then the flux reduces to one half as well. ‘Inversely proportional’ means the opposite: if the distance doubles, then the flux halves. If the distance halves, then the flux doubles.

For example, look at the following simple example:
simple Fick's Law example
On the left we have some medium that contains substance A at concentration C_1 , separated by a membrane of thickness L from another medium that contains substance A at some lower concentration C_2 . Substance A then diffuses from the left through the membrane to the right.

From the definition of Fick’s Law we know that it’s the difference between the concentrations on the left and the right that’s important. We’ll define this as \Delta C = C_2 - C_1.

So now if we define the flux of A through the membrane as N_A , then we can say the following:

1. N_A is proportional to \Delta C, which is equivalent to
 N_A = \left[ \text{constant} \right] \times \Delta C

2. N_A is inversely proportional to L , which is equivalent to
 N_A = \left[ \text{constant} \right] \times \frac{ 1 }{ L }

3. We can then combine the two statements to say:
 N_A = \left[ \text{constant} \right] \times \frac{ \Delta C }{ L }
Basically we’re just combining the two constants into a single conglomerate constant. We can do this because while we have stated it’s a constant, we haven’t stated what value it must be, so we can combine multiple constants and call it a single constant.

So what do we do for this constant? The above equation tells us that so long as the material of the membrane separating the two remains the same and that it is substance A that is diffusing through it remains the same, \Delta C and L can change to practically any value and the value of the constant will not not change. So it becomes a property of the material that is being diffused through.

We call this constant the diffusivity, and use the notation D_{AB} , which denotes ‘diffusivity for substance A diffusing through substance B ‘.

So now we can write Fick’s Law as a general equation:

 \huge{ N_{A} = - D_{AB} \frac{\Delta C_A}{ L } }

First of all, what’s with the negative sign? This basically tells us that the species A if flowing from areas of high concentration to low concentration, assuming that D_{AB} > 0 (which it is by definition). Otherwise we would have things spontaneously flowing from low concentration to high concentration, which violates the 2nd law of thermodynamics.

What are the units on D_{AB} ? We know the units of everything else in the equation, so it’s fairly simple to work out. Units of the molar flux N_A is mol/(m2·s), concentration difference \Delta C is mol/m3, and distance L is of course simply m. So units for D_{AB} work out be m2/s.

Some typical values of D_{AB} are shown below:

\small{\begin{array}{ccc} \text{solute} \: A & \text{medium} \: B & D_{AB} \: [\text{m}^2 / \text{s}] \\ \hline \text{water vapor} & \text{air} & 2.6 \times 10^{-5} \\ \text{napthalene} & \text{air} & 6.1 \times 10^{-6} \\ \text{sodium chloride} & \text{water} & 1.2 \times 10^{-9} \\ \text{ethanol} & \text{water} & 7 \times 10^{-10} \\ \text{helium} & \text{pyrex} & 4.5 \times 10^{-15} \\ \text{cadmium} & \text{copper} & 2.7 \times 10^{-19} \\ \text{aluminum} & \text{copper} & 1.3 \times 10^{-34} \end{array}}

There are some simplifications I’ve made, for example diffusivity in gases generally varies linearly with pressure, so I showed the diffusivity in air at atmospheric pressure. Diffusivity in liquids can also vary with the overall concentration itself, so I reported typical average values. Also diffusivity in solids can vary greatly with temperature, so I reported values for 298 K or room temperature. But still you can usually make the following generalizations: diffusivity values in gases (at atmospheric pressure) are around 1 \times 10^{-5} [\text{m} ^2 / \text{s}] , diffusivity values in liquids are around 1 \times 10^{-9} [\text{m} ^2 / \text{s}] , and diffusivity values in solids vary widely, but are always much smaller than those in liquids and gases.

The only remaining thing we have is to reformulate the equation in a differential form. This means we take the limit at the thickness L goes to zero, the right-hand-side of the equation becomes a derivative:

 \huge{ N_{A} = - D_{AB} \frac{d C_A}{ d x } }

where now x is the coordinate in the same direction as L . This is the form for one direction in a Cartesian coordinate system. The more general form that is independent of coordinate systems is written as

 \huge{ N_{A} = - D_{AB} \nabla C_A }

where \nabla is the gradient operator. This is the form we’ll start with when we actually solve the problem of the bubble collapse, because the bubble is spherical so we’ll use spherical coordinates.

The other day I was playing with some soap bubbles and I started thinking: what makes them pop? And furthermore, sometimes they don’t pop, instead they will shrink down to nothing. So what causes each of these different behaviors? Can the failure mechanism be predicted? Can you predict when it will fail?

Basically the collapse of the bubble can be attributed to one of two mechanisms. 1.) The air inside of the bubble diffuses through the bubble film, causing the bubble to shrink and eventually collapse. 2.) The liquid in the bubble film evaporates, causing the film to get thinner and thinner until it finally ruptures, causing the bubble to pop.

First I’d like to analyze the bubble shrinking and collapsing, but I’ll split it into a series of posts. The first few will be brief explanations of specific pieces of physics that are needed in order to perform the analysis. The order I think I’ll do them in is the following:

1. Fick’s Law
2. Quasi-steady-state Approximation
2. Henry’s Law
3. Ideal Gas Law
4. Laplace Pressure
5. Geometry Simplification

Then I’ll show how to put all those pieces together to derive a single equation that shows how long the bubble will last until it collapses.

Or I might change my mind on the structure and the order part-way through. We’ll see.

In my last post, I explained the game theory mathematics of the standard game of Rock Paper Scissors (RPS), and what the Nash equilibrium is and how it works. Now I will show a similar analysis, but based on a variant of RPS where winning with rock gets you 1 point, winning with scissors gets you 2 points, and winning with paper gets you 3 points.

We can solve for the Nash equilibrium by doing a probability tree of all the different possible combinations, as before with the standard RPS. In instances where you win, add that many points. When you lose, subtract that many points (your opponent gaining points is equivalent to you losing points). As before, we will choose R, P, and S according to some random distribution, where R is our probability of choosing rock, P is our probability of choosing paper, and S is our probability of choosing scissor. Similarly we define the variables R', P', and S' to be the probabilities chosen by our opponent. Doing so, we get the following expected point gain each time the game is played:

g=0RR' - 3RP' + RS' + 3PR' + 0PP' - 2PS' - 1SR' + 2SP' + 0PP'

Remove the zero terms and then group them by R‘, P', and S':

(-3R + 2S)P' + (R-2P)S' + (3P-S)R'

For the equilibrium condition, we want the total to be 0, and we want it to be independent of whatever our opponent chooses for R', P', or S'. We do that by saying that each of the terms in the parenthesis must be equal to 0. This gives us 3 equations:

\begin{matrix}   -3R + 2S = 0 \\    R - 2P = 0 \\    3P - S = 0   \end{matrix}

Solve the 3 equations for the 3 unknowns, and you get R = \frac{1}{3}, S = \frac{1}{2}, and P = \frac{1}{6}. R+P+S = 1 as it should. You can substitute these into the original equation and prove to yourself that no matter what your opponent chooses for R', P', and S', the expected return is always 0.

So… if you’re one of the 3 people on the entire internet that is still reading this thing, you may be asking yourself, “what about when my opponent doesn’t choose the Nash equilibrium? Is there an optimum R, P, S that will maximize my expected score?”

It turns out there is, and the answer is a quite simple, though getting there takes a bit of algebra. Take the formula for the expected point gain as before (terms equal to 0 eliminated):

-3RP' + RS' + 3PR' - 2PS' - 1SR' + 2SP'

Use R+P+S=1 and R'+P'+S'=1 to eliminate S and S' so it’s only in terms of R, P, R', and P', and define this as a function g(R,P):

g(R,P) = -6RP' + 2P' + 6PR' - 2P + R - R'

Now we have a surface function of two variables, R & P, with R' and P' being constants. We can imagine this being in a 2D domain with R as the x-axis and P as the y-axis. So the domain in question will be a triangle with vertices at points (1,0), which corresponds to R=1 and P=S=0. Similarly, (0,1) corresponds to P=1 and R=S=0, and (0,0) corresponds to S=1 and R=P=0.

triangle plot #1

We could then take the calculus approach and find the local maxima of g(R,P), but we really don’t have to do that. Looking at g(R,P), it is linear w.r.t. both R and P, and so g(R,P) is a flat plane, tilted in some direction determined by R' and P'. Also we know from calculating the Nash equilibrium that g\left(\frac{1}{3},\frac{1}{6}\right)=0 (i.e. when we choose the Nash equilibrium that expected point gain g(R,P)=0 ), and that when R'=\frac{1}{3} and P'=\frac{1}{6} that the plane will be coplanar with the X-Y plane [i.e. when our opponent chooses the Nash equilibrium then g(R,P) will always be zero no matter what we choose, that is only possible when the flat plane is zero everywhere].

Any deviation from those values by R' and P' will cause the plane to tilt in some direction. Since it is now tilting, there will be a direction that increases the value of g(R,P), and since g(R,P) has a constant slope everywhere, the maximum within the triangle-shaped domain must be on one of the points (1,0), (0,1), or (0,0) [or perhaps two of the points will have the same maximum value].

So the optimal choice will either be R=1 (with P=S=0), P=1 (with R=S=0), or S=1 (with R=P=0). Simply evaluating the first equation above for these three choices and seeing which one maximizes the expected point gain will give you your choice.

So for example, here is the surface for g(R,P) we get when our opponent chooses rock, paper, and scissors with equal frequency, or (R',P',S')=\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right). In our equation S' was eliminated, so for calculating g(R,P) we just need to say that (R',P')=\left(\frac{1}{3},\frac{1}{3}\right):

surface map of g(R,P)

In this plot the gray triangle is the original domain in the previous image, the red triangle is the actual surface of g(R,P), and the black arrow is a normal vector from the plane coming from the Nash equilibrium point at g\left(\frac{1}{3},\frac{1}{6}\right)=0.

It appears that the red triangle is tilted along the R=\frac{1}{3} line, so that both (R,P)=(0,0) and (R,P)=(0,1), or in other words playing scissors every time or paper every time (or some mix of the two) will give you the maximum expected point gain of \frac{1}{3} every time that you play.

Now what if our opponent chooses a strategy that is close to but not quite the Nash equilibrium? How much of a point gain can we expect by using the optimal strategy?

Here is the surface for g(R,P) if our opponent chooses (R',P')=\left(\frac{1}{3}+0.01,\frac{1}{6}-0.01\right):

g(R,P) plot 2

The slope of the response surface is much less, as reflected by the fact that our opponent’s choice of R,P, and S is much closer to the Nash equilibrium. In this case, it turns out that choosing either rock or paper every time will give the optimal point gain, in this case an expected point gain of only 0.03 each time we play.

Of course choosing rock or paper every time would be something that our opponent, if not a mindless robot, would be able to easily capitalize on. What if instead we only slightly perturb our choice of (R,P) away from the Nash equilibrium like our opponent did, but instead perturb it in the direction that gives us the maximum expected point gain?

The distance we move from (R,P)=\left(\frac{1}{3},\frac{1}{6}\right) is the same as our opponent. He moved 0.01 in the R-direction, and -0.01 in the P-direction, for a total distance of \frac{\sqrt{2}}{100}. We know that playing rock or paper every time will maximize our expected score, this corresponds to the line P+R=1. The shortest path to this line is to move perpendicular to it, which is the vector direction (1,1). So this means we should choose the point (R,P)=\left(\frac{1}{3}+0.01,\frac{1}{6}+0.01\right). If we do so, our expected point gain each time we play the game is g(R,P)=0.0012.

My next question is then, how many games do we have to play in order to have, say, a 95% confidence that our score will be greater than our opponent? Remember that what we calculated above is the expected gain, which is kind of like taking a series of many, many games and taking the average. The outcome of any 1 game is random, but after many games we expect to pull ahead slightly. However I’m really rusty on this kind of math and I don’t remember how to do it. I’ll have to review a bit.

Everyone knows the age-old game of rock, paper, and scissors, it’s been around pretty much the whole world for a century or so, and originated in China ~2000 years ago.

Generally the game is approached in 1 of 2 ways: the first is where you try and out-think your opponent, trying to exploit the inherent non-randomness of people when they play the game. This is essentially applied psychology.

The other way to approach the game is where you assume that you and your opponent are totally rational, i.e. both of you know everything about the game and there is no ‘human’ element to exploit. This is essentially game theory.

In game theory, two players are perfectly matched: neither one has any particular advantage over the other. Both players have the same chance to win each game. So the best strategy is to randomly choose rock, paper, or scissors each time (with an equal probability of each).

Using game theory and probability, you can actually prove this. Here’s one way it can be done. Define the following: if you win a game, you get 1 point. If you lose a game, you lose 1 point. R, P, and S are the probabilities of you choosing rock, paper, and scissors respectively. Similarly R', P', and S' are the probabilities your opponent chooses. Obviously R+P+S=1, R'+P'+S'=1, and 0\le R, P, S, R', P', S' \le 1.

Now we calculate the expected point gain per game by taking every possible game outcome, multiplying the point gain/loss of that particular outcome times the probability of it occurring, and then summing them all together. So the points won per game is the following:


Since we know that the best choice is to choose R, P, and S with equal probability, then R=P=S=\frac{1}{3}. Since our opponent is also perfectly rational, he will also choose the same probabilities. So the points won per game is:


Everything sums up to 0, which is what you expect when the game is evenly matched. However, this choice of (R,P,S)=\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right) lends itself to a very interesting consequence. Let’s say your opponent chooses some (R',P',S')\neq \left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right). What happens then? Let’s say he’s really stupid and chooses rock every time, or (R',P',S')=\left(1,0,0\right). Of course the obvious choice for us to win would be to choose paper every time, or (R,P,S)=\left(0,1,0\right). But if we don’t change our strategy, the following happens:


Of course, this is obvious without having to calculate it. If your opponent chooses rock every time, and you are choosing each one 1/3 of the time, then 1/3 of the time you’ll tie with rock, 1/3 of the time win with paper, and 1/3 of the time lose with scissors. So let’s see what happens when we choose (R,P,S)=\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right), but R', P', and S' are still unknown:


Rearrange the terms a bit:


So we see that for choosing (R,P,S)=\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right), it doesn’t matter what kind of strategy or ratio that our opponent chooses: the expected point gain will always be zero and we will always be evenly matched with our opponent.

This is what’s called the Nash equilibrium, named after John Nash, the subject of A Beautiful Mind. As I best understand it, for a zero-sum game (of which RPS certainly is), there always exists a state where one player can force the game to remain in equilibrium, no matter what the other player does. For the standard RPS, this comes out to be (R,P,S)=\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right). If you choose to play with this distribution, you and your opponent will always be in equilibrium no matter what strategy your opponent tries.

So that brings us to variants of RPS. This whole line of inquiry was inspired by a homework problem that my younger daughter had a few weeks ago. Basically it had a variant of RPS where if you win with rock you get 1 point, win with scissors you get 2 points, and win with paper you get 3 points. Since my daughter is in 1st grade the questions were very easy, i.e. “if you have 10 points and win with scissors, how man points do you have?” etc. But this got me thinking about how this game would actually work. Of course you want to win with paper, but you run into the same problem as regular RPS: you and your opponent are both trying to out-guess each other, so there’s no simple strategy that’s guaranteed to win.

So the next question is, is there a Nash equilibrium for this version of RPS, and if so what is it?

This post is already too long, so I’ll show how to find the Nash equilibrium for this version in my next post.

Take a look at all these photos of city streets in the rain. I found them all on google image search and I am shamelessly hotlinking them:

1 2 3 4 5 6 7 8 9

I was driving home from work today in the rain, and saw many similar scenes. However I noticed something interesting that I have never noticed before. The street lights, car lights, traffic lights, etc. all reflect off of the wet surface of the road. However since the road is not a mirror-flat surface, the reflection is smeared out or diffused due to the rough texture of the road. So my question is this: why is the smearing or diffusion of the reflection almost entirely in the vertical direction? Certainly the surface of the asphalt or concrete does not have a sufficiently heterogeneous asperity (i.e. the roughness is uniform from all directions) to account for this. Shouldn’t the light spread out as far horizontally as it does vertically? Any insights here?

Most people have heard of the Bacon number, the number of degrees of separation an actor has from Kevin Bacon, based of the Six Degrees of Separation idea. Well, this concept of looking up and assigning numbers based on the separation distance was actually borrowed from the Erdős number, which gives the degrees of collaboration between the person and the mathematician Paul Erdős.

I was thinking the other day that there is a good chance I have an Erdős number. I have published a peer-reviewed paper while in grad school, and my co-author has published dozens of papers in several fields, and his PhD advisor and then his PhD advisor are/were very famous in fluid dynamics, which is closely tied to mathematics, so there is a good probability of me having a fairly low Erdős number. You can look up the Erdős number for a person here, but since it only indexes from mathematics journals it’s not nearly as robust and complete as it could be. So while myself and my PhD advisor aren’t listed as having Erdős numbers, his advisor is, so I can just add 2 to that number to get my Erdős number. So my Erdős number trace is the following:

One of the more interesting skills learned as an engineer or a scientist is the art/skill (it’s really both) of being able to make reasonable order-of-magnitude estimates. Using a combination of knowledge, common sense, reasoning, intuition, and some quick hand calculations, a skilled engineer/scientist will often try and estimate the critical parameter or result of some system before going through the tedious calculations to get a more accurate answer. This is useful for several reasons: it gives you an idea of what orders of magnitude you will be dealing with, it gives you a ‘sanity check‘ for when you calculate a more accurate answer, and most importantly, often times an order of magnitude estimate is all that you really need.

This process is sometimes called handwaving, back-of-the-envelope calculation, guesstimate, or a ballpark estimate. They are also sometimes called Fermi problems, since the physicist Enrico Fermi was renowned for being able to perform such simple estimates and get within a factor of 2 or 3 of the actual answer (this is extremely good for a simple estimate). Two famous examples of are his calculation for how many piano tuners there are in Chicago, and his estimation of the energy yield of the first atomic bomb test by dropping some scraps of paper and seeing how far the blast blew them.

Often times when you are making such estimates, you simply round (logarithmically) each number to it’s nearest power of 10. This is because your estimates are off by factors at least that large anyway, so there’s little point in carrying through precise numerical calculations. A personal favorite anecdote on this principal came from my brother Porter when explaining why he replaced \pi with the number 1, “Why did I make \pi equal to 1? Because it’s not 10.” (In actuality though, since \sqrt{10}=3.16 and \pi=3.14 , \pi is right close to the dividing line between assigning it to the value of 1 or 10. You can choose either, or just make it 3, which is what I usually do.)

The reason why I’m talking about this is because today I stumbled upon this page for a class at MIT that deals entirely with this subject. The name of the course? “Lies and Damn Lies: The Art of Approximation in Science”. I skimmed through the first chapter and it was very well-written and interesting. You could learn a lot about this skill just by reading the chapter and working through the problems yourself.

Another closely related but slightly more accurate method of estimation uses dimensional analysis. The wikipedia article I linked to is a bit obtuse for the uninitiated, but the two examples halfway through the article are fairly simple to follow and stand well on their own. For dimensional analysis you use your knowledge of the underlying physics of a system to make reasonable assumptions about what parameters are pertinent in your analysis. A more mathematically formalized version of this is called the Buckingham π theorem, and is an extremely useful and versatile tool for the initial analysis of a system.

My favorite example is at the end of the article, where it shows how Geoffrey I. Taylor used the Buckingham π theorem to estimate, again, the energy output of the first atomic explosion. A summation of his original paper with his analysis can be found online. Basically he was able to determine the following relationship:

\displaystyle{R\approx\left(\frac{Et^2}{p}\right)^{1/5}} ,

where R is the radius of the exploding shockwave at some time after detonation, t is the time, p is the atmospheric pressure and E is the energy released upon detonation. He used recently declassified movies of the explosion to get the radius and time values, and that allowed him to estimate the energy. In fact, when he published his results it caused quite a commotion in the US Defense Department because the energy output of the atomic bomb was still classified at the time and Taylor’s result was far too accurate for their liking!

A couple of years ago I wrote a post where I talked about taking the imaginary number i to it’s own power an infinite amount of times, essentially


I showed numerically at least that it converges upon a single value in the complex plane, but then I speculated about what would happen if I did the same for other numbers? I thought it would most likely be a fractal like the Mandelbrot set or the Julia set, but I never got around to actually solving it.

Well, for the class that I’m TA’ing this semester (intro to computing for Chemical Engineers), I got hold of some simple code for making the Mandelbrot set in MATLAB to show the students a fun and interesting example of using for loops. After I did that though, I figured that it would be fairly simple to modify the code to see if taking a complex number to it’s own power created a fractal or not. So my algorithm was the following:

For any point c in the complex plane, let
In the limit as n goes to infinity, if z remains finite, then c is within the set. Otherwise it is not in the set.

Since you can’t really evaluate it to infinity, what you do instead is implement an algorithm where you perform the iteration a large but finite number if times, and see if the iteration stays inside a set radius. So here is my fractal. The first plot shows the the complex plane from -20 to +20 on both the real and imaginary axis:

Here the blue football-shaped region in the center is centered at the origin. These points are considered to be ‘within’ the set, since a large number of iterations did not give numbers that were outside of the convergence radius (I set the number of iterations to 50, and the convergence radius to 1000). Any points that are not that deep blue are considered to be ‘outside’ the set, with the color being an indication of how many iterations it took for that point to leave the convergence radius. The dark red points took the least iterations, with the orange, yellow, green, and light blue areas taking progressively more iterations.

I think the feathery wing structures going in the positive and negative imaginary directions are really interesting. They continue up and down seemingly without end, or at least I didn’t see any end for even large plots that I made. This is in sharp contrast to the Mandelbrot and Julia sets which are confined to a finite area.

But the center looks the most interesting, so let’s get a closeup of that.

There’s a lot of interesting stuff going on in this area, with some really beautiful feathery wings, and a jumbled area that just looks like ‘static’. Here’s a zoom-in that let’s us see a little bit of both better.

Wow. Those feathers really are quite beautiful if I do say so myself. That random static-looking area bugs me though, since random noise really isn’t a feature of fractals. Instead they exhibit more of an ‘ordered chaos’ showing infinite complexity. So for my last image here is a much closer zoom on the static-looking area.

This last image has a width and height of just 0.005. Here we can see that it really is infinitely complex and not just random noise. I did another with a width and height of 0.0005, and it looks very similar, so there is some degree of self-similarity, much like we see in other fractals.

So the only thing this set is missing is a name. In reality probably someone has calculated this set and given it a name, but on the slight chance that no one has, I hereby name it the Bassett set. (How’s that for hubris? I bet Benoit Mandelbrot didn’t name the set after himself, but that someone else named it after him in honor of his discovery)

I was chatting with my brother Porter the other day and he told me how he made a Tower of Hanoi game for his children to play with. It’s a fairly simple game with a well-known binary sequence to solving it, in fact it’s one of the few puzzle games where the God’s Algorithm (a shortest-number-of-moves algorithm that can be mathematically proven) exists and is known. The shortest-path solution takes exactly 2^n-1 moves to move the disks from one post to another. Porter mentioned in his next blog post that him and his son then calculated how high of a stack The Flash could do if he could move 1 million disks per second, and concluded that it would still take him longer than the age of the universe to complete the 100-disk version.

That’s a good start, but any self-respecting geek has to take things a notch further. First I needed to derive a formula that let’s me calculate how high of a stack I can complete in a given time given a rate of how many disks per second I can move. The original equation we can rearrange as:

\displaystyle{N=\frac{\ln (n+1)}{\ln  2}} ,

where N is the number of disks, and n is the number of moves. Then we can replace the n with n=\omega t, where \omega is the frequency of number of moves per unit of time, and t is the total time. So now our equation is:
\displaystyle{N=\frac{\ln (\omega  t)}{\ln  2}} .

You may notice that I have ignored the \small{+1}. This is because the \omega t will be so large that we can ignore the the \small{+1}, as it is inconsequential.

This formula let’s us do some basic calculations. If we take t to be the age of the universe in seconds (about \small{4.33\times 10^{17}s}) then we have the following number of disks we can expect to complete within that time for the given number of disks moved per second:

\begin{array}{cc} \omega ,s^{-1} & N,\text{disks} \\ \hline 1 & 58 \\ 10 & 61 \\ 100 & 65 \\ 1000 & 68 \\ 1\times 10^6 & 78 \\ 1\times 10^9 & 88\end{array}

So we can see that if an immortal Flash were to move disks for the entire age of the current universe at the rate of 1000 disks/s, then he could only expect to complete a stack of 68 disks. Increasing it to 1 billion disks/s only increases it to 88 disks! Still a far cry from completing 100 disks.

So the next question is what kind of energy/power requirement would we need to move the disks at these kind of rates? First we’ll need to make some assumptions on mass and distance. One typical move is shown in the animated gif below (if someone knows how to set it so that it will repeat more than once, please let me know):

In terms of mechanics, to move the disk from column one to column two the disk has to do six steps: 1) accelerate up until its halfway up the column, 2) decelerate until it comes to a stop having just cleared the column, 3) accelerate horizontally until its halfway to the next column, 4) decelerate until it comes to a stop above the new column, 5) accelerate down until it’s halfway down the new column, and finally 6) decelerate until it comes to a stop in the new position.

Now you may say that we don’t need to decelerate the disk as it reaches its resting point and just let it smack into the resultant stack, but when the speeds become really fast the energy will be enough to obliterate any disk, so we’ll include the final deceleration. We’ll also assume this is done in a vacuum so we can ignore wind resistance, otherwise the heat would be more than enough to ionize the disks (those that have read the well known Physics of Santa will be familiar with this). At some point the acceleration alone will become greater than the structural integrity of the disks, as well as relativistic effects at some point coming into play. We’ll get to those in a moment. For now though, we’ll simplify the 6 steps above by making them all the same length for every step. We can do this by putting the three columns in a triangular arrangement and then saying the disks are thin enough that we can neglect the change in the height of the stack as we make progress. We’ll say the columns are 10 cm apart and 10 cm high, so each of the 6 steps will be 5 cm. Also we’ll say each disk weighs 100 g.

Using classical mechanics (ignoring relativistic effects), the time to complete one step is simply 1/6 of a move, so that is:

\displaystyle{t=\frac{1}{6}\omega ^{-1}} .

The maximum velocity attained (so that we can know if we’re getting close to relativistic speeds) is:
\displaystyle{v_m=\frac{2L}{t}} .

The acceleration the disk undergoes (to see of we are overcoming structural integrity) is:
\displaystyle{a = \frac{2 L}{t^2}} .

The energy required to complete one step is:
\displaystyle{E=\frac{2m L^2}{t^2}} .

And finally the power required to keep the disks moving is:
\displaystyle{P=\frac{2m L^2}{t^3}} .

Putting all these together in a handy chart we can see the results.

\small{\begin{array}{ccccccc}\omega, s^{-1}& N & t_{step}, s & v_{\max},m/s & a,m/s^2 & E, J & P, W  \\ \hline 1 & 58 & 1.67\times 10^{-1} & 0.6 & 3.6 & 1.8\times 10^{-2} & 0.108 \\ 10 & 61 & 1.67\times 10^{-2} & 6 & 360 & 1.8 & 108 \\ 100 & 65 & 1.67\times 10^{-3} & 60 & 3.6\times 10^4 & 180 & 1.08\times 10^5 \\ 1000 & 68 & 1.67\times 10^{-4} & 600 & 3.6\times 10^6 & 1.8\times 10^4 & 1.08\times 10^8 \\ 1\times 10^6 & 78 & 1.67\times 10^{-7} & 6\times 10^5 & 3.6\times 10^{12} & 1.8\times 10^{10} & 1.08\times 10^{17} \\ 1\times 10^9 & 88 & 1.67\times 10^{-10} & 6\times 10^8 & 3.6\times 10^{18} & 1.8\times 10^{16} & 1.08\times 10^{26}\end{array}}

And the results are pretty interesting. We can determine what the trends are with the formulas above, and we can see how the various values scale with the frequency by replacing t in the equations with \frac{1}{6}\omega ^{-1}. That gives us the following scaling arguments: v_m\sim \omega , a\sim \omega^2 , E\sim \omega^2 , and P\sim \omega^3 . In other words, if we double the frequency of the steps we will double the maximum velocity, but the acceleration and the energy will change by 4x, and the required power will change by 8x! This comes out to be a huge power requirement for the higher frequencies.

To get an idea of how huge the power requirements become, here are some power equivalents on the same order of magnitude:
To get \small{0.1 W}, you would need the power consumption of about 10 DVD drive lasers.
To get \small{100 W}, you would need the electrical output of a 1×1 m solar panel in full sunlight.
To get \small{1\times 10^{5} W}, you would need the power output of a typical automobile.
To get \small{1\times 10^{8} W}, you would need the power output of a Boeing 777.
To get \small{1\times 10^{17} W}, you would need the total power received by the earth from the sun.
And to get \small{1\times 10^{26} W}, you would need about 1/3 the total power output from the sun.
(All order of magnitude power estimates are from this Wikipedia page.)

How does this highest frequency look with our other limits on velocity and acceleration? We should be OK on velocity, since 6\times 10^8 m/s is still only 1/5 the speed of light, slow enough that we can still ignore relativistic effects. As for the stress that the object undergoes due to acceleration, that’s fairly simple to calculate too. Stress is simply force over area, given in this simple formula:

\displaystyle{\sigma=\frac{F}{A}} .

Where F is the force and a is the cross-sectional area. We can get the force from Newton’s 2nd law: F=ma. For the area we will need to assume a cross-sectional area of the disks themselves. Assuming the disks are about 3 cm in diameter, that gives about 7 cm^2 in area. Calculating the stress that each disk receives we get the following:
\begin{array}{cc} \omega ,s^{-1} & \sigma , \text{Pa} \\ \hline 1 & 510 \\ 10 & 5.1\times 10^4 \\ 100 & 5.1\times 10^6 \\ 1000 & 5.1\times 10^8 \\ 1\times 10^6 & 5.1\times 10^{14} \\ 1\times 10^9 & 5.1\times 10^{20}\end{array}

Looking up the strength of various materials, the highest two were for high-impact steel and diamond, both being around 1 GPa \left(1\times 10^9\text{Pa}\right). So from the table, we could expect to move 1000 disks per second, but the disks would fail due to the acceleration if we tried to go much faster than that.

Assuming that we could make disks out of impossibilium that can take any amount of stress, then for The Flash to complete a Tower of Hanoi with 88 disks, he would need to move 1 billion disks per second at a speed of about 1/5 the speed of light for 10 trillion years, and he would require the constant power output of about 1/3 of the sun (or a series of equivalent-sized stars, since a single star won’t last that long).

That’s it for this post, in the next post I’ll try to include relativistic effects for even faster speeds.

Except for my quest to find an internet connection, the first few days here in Imazu were pretty uneventful. Usually we’re too jet-lagged to do much anyway, so it’s fine with us. Yesterday though, we took a trip to Fukui Prefecture with Ryoko’s aunt and uncle.

I’ve blogged a little about Fukui Prefecture before a few years ago, when we went to visit the nuclear power plant located on one of the many peninsulas there. Interestingly enough, that is right next to the city of Obama, which has gained some notoriety due to the name of the U.S. President. We didn’t go to Obama (I’ve been there before though when my father came to meet Ryoko’s parents. There isn’t much to the city, to tell you the truth), but I did manage to take the following picture as we were driving through Tsuruga, a nearby town:

That’s an advertisement for Obama Ramen. Unfortunately my timing was a bit off and the right edge is cut off by a telephone pole.

The first place we went to was the Fukui Prefectural Dinosaur Museum. It turns out the only place where dinosaur fossils have been found in Japan are in Fukui Prefecture, and to commemorate it the prefecture has built what I can only call a truly, truly excellent museum. I like to go to natural history museums anywhere I go, and this is one of the best I have ever seen. It’s built inside of a giant metallic sphere, which makes the museum easy to spot from several miles away, and makes for a really impressive ceiling once you get into the main body of the museum.

Next Page »