October 16, 2011

This is prompted by a back and forth I just had with Martillo on SciForums. Those of you who frequent physics related forums will know him as the author of ‘A New Light in Physics’. I’m not going to go into that stuff, which he’s been pushing for years. Instead the discussion reminded me of something which is often done by cranks, namely the conflation of the following statements:

  1. I think you are wrong
  2. I think the mainstream is right

All too often cranks seem to think that because I’m not buying what they are selling then I must be asserting the current mainstream explanation is right. Right in the sense of the universe working precisely as outlined by the model in question. I’m not sure why it’s such a common thing for them to fail to grasp. Perhaps it’s a peek into how their mind works, in that they think their work must be exactly true and therefore if I don’t agree with them then there must be some other model I think is exactly true?

In reality you’ll be hard pressed to find a particle physicist who thinks the Standard Model is the final word in particle physics until you hit quantum gravity energy scales. That isn’t to say they, we, don’t have confidence in it. When pressed it’s more likely you’ll get comments along the lines of how the SM is extremely good up to energies of about 100 GeV and thus any particle process below that can be explained to sufficient accuracy by the SM now (if you could compute the relevant integrals etc). It’s much like Newtonian gravity’s relationship to relativity. NG is great if you want to launch a satellite, it is much simpler and cheaper to work with than relativity and gets the job done. As such it wraps up gravity for slow moving, not too big objects where timing to the nanosecond isn’t important and you aren’t modelling too far into the future. It’s not exact but it’s useful for those jobs. In the case of gravity we know an improvement to NG, general relativity. It’s currently beyond our capability to find an error in it but it can’t be quantised directly so it too is just an approximation to something more fundamental.

These implicit “I don’t think it’s perfect but it’s damn good” views are all over physics, where effective models are commonplace. Having a justified confidence that a model is accurate to a certain number of decimal places in specified conditions is a long way from having a belief it’s absolutely perfect.

Contrast that with hack models which get names like ‘The Everlasting Theory’ or ‘The Final Theory’ or ‘The Ultimate Cosmology’ (sometimes with the author’s name in the title too, just for extra ego stroking). Rather than just saying “I have a superior model to the mainstream one”, it’s “I have a superior model to everything ever because it’s perfect.” Often discussions then boil down to hacks saying “The SM must be wrong, as it disagrees with my model”. For example, Sylwester, the author of ‘The Everlasting Theory’, once asserted the SM doesn’t predict a value for the strong coupling when it runs to high energy. Actually it does and he knows it, as he and I had discussed it many times, even talking specific values at specific energy scales. What he meant was it didn’t predict what his ‘everlasting theory’ predicted so it was a point against the SM. Circular reasoning doesn’t seem to be something he understands. When someone is that invested in the absolute certainty of their conclusions there’s little which can really be done in honest discussion, they are pretty much incapable of it. In such cases the whole “If you disagree with me you must be saying the SM is perfect” clearly is some form of projecting their attitudes onto others.

It cannot be (rationally) denied that things like general relativity and the Standard Model are very accurate in the domains we have tested them. Even if something like the Higgs doesn’t exist the accuracy of the SM up to 100 GeV (the Higgs lies somewhere north of that, if it exists) is a demonstrable fact. It might be the case that in 100 years when we can measure 5 extra decimal places we find problems with the SM in that range, just as we can measure errors in Newtonian physics using atomic clocks on aeroplanes, but no one in physics research would be shocked by that. It wouldn’t invalidate all of current physics or result in every physicist being fired, as some hacks have asserted will happen when the Higgs isn’t found (which they are certain of…). There’s justified reason to use the SM or GR to model phenomena we can easily produce and test, as we’ve tested them in such conditions and found them accurate. Not perfect or exact or proven true, accurate.

Renormalisation and Infinities

October 16, 2011

Renormalisation is one of those areas of quantum field theory people are generally bemused by. I won’t pretend I understand it in any great detail but one of the areas I think I can help people with is the whole “How does \infty - \infty make sense?” thing people have. For those with the book, this is a stripped down overview of part of Section 7.5 of Peskin & Schroeder, covering dimensional regularisation. Wilsonian or other methods are sufficiently unfamiliar to me that I’m not going to touch them.

Feynman Diagrams 

Let’s start with a quick run through of why people get to the above ‘calculation’. Quantum field theories start with a Lagrangian \mathcal{L}, an expression which formally summarises the kinetic and potential energies of all the particle fields in the system, from which the equations of motion of the fields can be computed. The terms in the Lagrangian tell you which fields couple to one another, what the charges of that coupling are, which fields are dynamic, which fields are massless, what the dimensions of the couplings are, what symmetries the theory has, all sorts of things.

A description of how particles interact is obtained by using the Feynman rules, which tell you how to convert a Feynman diagram into an integral. Feynman diagrams are brilliant things, they are a simple intuitive way to draw how particles come in, which ones interact with which and which particles come out at the end. Feynman’s insight was that you can take the Lagrangian and the diagram and pretty much immediately write down the corresponding huge mathematical expression which describes the contribution that diagram makes to your observable properties.

Unfortunately the huge mathematical expression is huge and typically extremely unpleasant. When a diagram includes a loop, ie a path which ends where it started, then the expression includes an integral. This is because the relevant quantities in the calculations are the momentum of various particles in the diagram. By momentum conservation if there are no loops you know the momenta of all the bits of the diagram. However, a loop means you can put in any amount of momentum around the loop and still satisfy momentum conservation. As a result you have to integrate over all possible momenta!

Unfortunately (again!) this can be a problem. Sometimes the integral doesn’t converge and gives an infinite amount. This can happen either when the momentum around the loop goes to zero or when it goes to infinity, much like the following examples,

\int \frac{\textrm{d}p}{p^{2}} \quad,\quad \int p^{2} \textrm{d}p.

 The first has problems at p=0 and the second at \vert p\vert \to \infty. Let’s consider a typical case which we can do something with.
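To make the two kinds of divergence concrete, here is a minimal Python sketch which regulates each toy integral with a cut off and evaluates it in closed form. The function names and the choice of 1 as the finite end of the first integral are my own assumptions for illustration, they are not taken from Peskin & Schroeder.

```python
def ir_integral(lower):
    """Integral of dp / p^2 from `lower` up to 1, which equals 1/lower - 1."""
    return 1.0 / lower - 1.0


def uv_integral(upper):
    """Integral of p^2 dp from 0 up to `upper`, which equals upper^3 / 3."""
    return upper ** 3 / 3.0


# The first grows without bound as the lower limit is pushed towards p = 0.
for lower in (1e-1, 1e-2, 1e-3):
    print(f"lower limit {lower:g}: {ir_integral(lower):g}")

# The second grows without bound as the upper limit is pushed towards infinity.
for upper in (1e1, 1e2, 1e3):
    print(f"upper limit {upper:g}: {uv_integral(upper):g}")
```

With a cut off in place every number printed is finite, the trouble only appears when the cut off is removed.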

Feynman Integrals

We want to compute the following integral where we are working in d dimensional space-time,

\int \frac{\textrm{d}^{d}x}{(2\pi)^{d}}\frac{1}{(x^{2}+k)^{n}} = \int \frac{\textrm{d}\Omega_{d}}{(2\pi)^{d}}\int_{0}^{\infty}\textrm{d}x\frac{x^{d-1}}{(x^{2}+k)^{n}} .

The second expression is split into two parts. The second part, the radial integral, is difficult to compute, while the first part is the surface area of a unit sphere in d dimensions (up to the factor of (2\pi)^{d}). That bit is easy to compute, in terms of the Gamma function,

\int \textrm{d}\Omega_{d} =2 \frac{\pi^{\tfrac{d}{2}}}{\Gamma(d/2)}.
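As a quick sanity check of the solid angle formula, the following Python sketch evaluates it for a few values of d. The function name unit_sphere_area is a hypothetical one chosen for this example.

```python
import math


def unit_sphere_area(d):
    """Total solid angle in d dimensions: 2 * pi^(d/2) / Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)


print(unit_sphere_area(2))     # circumference of the unit circle, 2*pi
print(unit_sphere_area(3))     # area of the ordinary unit sphere, 4*pi
print(unit_sphere_area(4))     # 2*pi^2
print(unit_sphere_area(3.98))  # the formula is perfectly happy with non-integer d
```

The last line is the important one for what follows, nothing in the formula cares whether d is an integer.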

We can just look up the value of \Gamma(d/2) for any d we want. That leaves the second integral. In our universe (setting aside string theory) we have d=4. Unfortunately when d=4 this integral can be infinite. There’s a number of problems with it. Take n=2 and suppose k=0, so that the integrand is \frac{x^{d-1}\textrm{d}x}{x^{4}}. Counting the measure there are d factors of x up top and 4 on the bottom. If d<4 this blows up at x=0. If d>4 it blows up at x \to \infty. At d=4 the integrand is \frac{\textrm{d}x}{x}, which doesn’t scale with \vert x \vert, but you are then adding up an O(1) contribution from every scale over an infinite range, so you get a logarithmic divergence at both ends. If k isn’t zero the issue with d \geq 4 still exists, so all we do is remove the issue with x=0.

Non-Integer Dimensions 

Since d=4 is a problem, it leads to the infinities people don’t like in quantum field theory, perhaps we should stay away from it. Fortunately if we don’t set d then we can actually do the integral. I won’t go into the details but when we compute it with the sphere volume we get

\int \frac{\textrm{d}^{d}x}{(2\pi)^{d}}\frac{1}{(x^{2}+k)^{n}} = (4\pi)^{-\tfrac{d}{2}}\frac{\Gamma(n-\tfrac{d}{2})}{\Gamma(n)}\left( \frac{1}{k} \right)^{n-\tfrac{d}{2}}.
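Since I skipped the derivation, here is a rough numerical check of the result in a case where everything converges, d=3 and n=2. This is only a sketch: the function names, the midpoint rule and the substitution x = t/(1-t) are my own choices for the example, not part of the textbook treatment.

```python
import math


def lhs(d, n, k, steps=20000):
    """Angular factor times a midpoint-rule estimate of the radial integral.

    The substitution x = t / (1 - t) maps [0, infinity) onto [0, 1).
    """
    angular = 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0) / (2.0 * math.pi) ** d
    h = 1.0 / steps
    radial = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        x = t / (1.0 - t)
        # dx = dt / (1 - t)^2 supplies the final factor
        radial += x ** (d - 1) / (x * x + k) ** n / (1.0 - t) ** 2
    return angular * radial * h


def rhs(d, n, k):
    """Closed form: (4 pi)^(-d/2) * Gamma(n - d/2) / Gamma(n) * k^(d/2 - n)."""
    prefactor = (4.0 * math.pi) ** (-d / 2.0)
    return prefactor * math.gamma(n - d / 2.0) / math.gamma(n) * k ** (d / 2.0 - n)


print(lhs(3, 2, 1.0), rhs(3, 2, 1.0))  # both should be close to 1 / (8 pi)
```

The numerical side only makes sense when the integral converges, the closed form is what lets us continue to values of d where it doesn’t.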

Now we can see explicitly the small x (otherwise known as infra-red) and large x (ultra-violet) divergences. The 4\pi term is not a problem, neither is \Gamma(n) because the Gamma function is never zero. That leaves \Gamma(n-\tfrac{d}{2}) and k^{\tfrac{d}{2}-n}. For simplicity we can relabel n-\tfrac{d}{2} = m so we are considering \Gamma(m) and k^{-m}. Obviously if m>0 then k=0 is a problem, as previously commented. But what about \Gamma(m)? Let’s remember its definition,

\Gamma(z) = \int_{0}^{\infty}e^{-t}t^{z-1}\textrm{d}t.

The Gamma function is an extremely interesting function which appears all over the place in physics and pure mathematics. It appears in the expressions for n-sphere volumes (as is the case here), it appears in the original paper on string theory and it plays a role in the Riemann Zeta function’s properties. For m a positive integer, 1,2,3,… we have \Gamma(m) = (m-1)!. It’s from this you can obtain 0!=1. However, if m is a negative integer, -1,-2,-3,…. then it is infinite. This can be obtained from its integral form or with a less than rigorous use of the factorial interpretation,

n! = n(n-1)! \quad \Rightarrow \quad (n-1)! = \frac{n!}{n}.

If n=1 we obtain 0! = \frac{1!}{1} = 1 but if n=0 we get (-1)! = \frac{0!}{0} = \infty, then (-2)! = \frac{(-1)!}{-1} = \infty. That’s a little arm wavey but it is indeed what you get if you compute the Gamma function for non-positive integers, including 0. Oddly enough it is finite for say -1/2, even though it’s infinite for 0 and -1. One of those weird things about mathematics. Anyway…. that means that for integer m = n - \frac{d}{2} \leq 0, ie n \leq \frac{d}{2}, the above expression is infinite. If d=4 then n needs to be larger than 2, else the denominator doesn’t grow fast enough and things don’t decay properly. Unfortunately there’s no way around this integral, we want to make sense of it. So to do that we separate the infinite from the finite!
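The behaviour of the Gamma function at and near the non-positive integers can be poked at with Python’s standard library, as in this small sketch. The particular sample points are arbitrary choices of mine.

```python
import math

# Finite at a negative half-integer: Gamma(-1/2) = -2 * sqrt(pi).
print(math.gamma(-0.5), -2.0 * math.sqrt(math.pi))

# Python refuses to evaluate Gamma exactly at a pole and raises ValueError.
for pole in (0.0, -1.0, -2.0):
    try:
        math.gamma(pole)
    except ValueError:
        print(f"Gamma({pole:g}) is not defined")

# Approaching the pole at 0, Gamma(eps) grows like 1/eps,
# so the product eps * Gamma(eps) tends to 1.
for eps in (1e-2, 1e-4, 1e-6):
    print(eps, math.gamma(eps), eps * math.gamma(eps))
```

That the product eps * Gamma(eps) settles down to a finite number is a first hint that the infinity has a very specific, controllable form.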

 Laurent Expansions

So how precisely can you do that? After all can’t you just do nonsense like \infty = \infty + \textrm{anything}  so \infty-\infty = \textrm{anything}? Well yes but what if everything is still finite, just perhaps really big? To illustrate this we go back to the Gamma function and let’s set n=2 and write d = 4-2\epsilon. Now we have an expression involving \Gamma(\epsilon). As we’ve just seen as \epsilon \to 0 this explodes. This implies there’s some sort of \frac{1}{\epsilon^{n}} thing going on. In fact this can be made rigorous using a generalisation of a Taylor expansion, known as a Laurent expansion. Rather than summing up polynomial terms for x^{n} where n is non-negative we now allow ALL of the integers!

\Gamma(\epsilon) = \sum_{n=-\infty}^{\infty} \gamma_{n}\epsilon^{n}

There’s truck loads of mathematics devoted to things of this form. In fact, the Gamma function’s properties in the complex plane and the nature of its singularities are perhaps the most examined of all ‘standard’ functions, not least because it relates to the other big name function, the Riemann Zeta function via its functional reflection formula. To cut a long and elaborate story short(er) it turns out that the Gamma function has an order 1 pole at all non-positive integers. This means the Laurent expansion doesn’t go ‘worse’ than \epsilon^{-1},

\Gamma(\epsilon) = \gamma_{-1}\frac{1}{\epsilon} + \gamma_{0} + \sum_{n=1}^{\infty} \gamma_{n}\epsilon^{n} .
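For the pole at zero the first two coefficients are standard results, \gamma_{-1} = 1 and \gamma_{0} = -\gamma_{E} where \gamma_{E} \approx 0.5772 is the Euler-Mascheroni constant. A short Python sketch shows the finite part emerging once the pole is subtracted; the helper name finite_part is hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def finite_part(eps):
    """Gamma(eps) with the 1/eps pole subtracted."""
    return math.gamma(eps) - 1.0 / eps


# As eps shrinks the remainder settles down to -EULER_GAMMA.
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    print(eps, finite_part(eps))
print("expected limit:", -EULER_GAMMA)
```

Both terms being subtracted are huge for small eps, yet their difference converges to a definite value.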

Counter Terms

So what does the above expansion tell us? The first term is ‘bad’, it blows up in the limit d \to 4. The second term doesn’t change as we change \epsilon, while all the other terms vanish in the limit. So we have bad, invariant and irrelevant. Thus if we could remove the bad we’d have something meaningful in the limit. Unlike working in the limit, where the bad swamps everything and we can’t compute anything, in this slightly perturbed setup we have a clear distinction. So the question is whether we can modify our construction in such a way to remove the \epsilon^{-1} bit but leave the \epsilon^{0} bit unchanged. This is done by adding something known as a counter term to the Lagrangian, \mathcal{L} \to \mathcal{L}+\mathcal{L}_{c.t.}. Its properties are designed to remove \epsilon^{-1} terms without altering anything else. You don’t want to do this for every calculation, a theory is renormalisable if you only need to add finitely many counter terms. In any good model you need to get out more than you put in and if you put in infinitely many counter terms you’re basically not getting any predictions. So hopefully once you’ve computed a few counter terms from the simpler processes you are all set and you don’t have to add any more. There are ways of estimating and sometimes proving whether or not a Lagrangian leads to a renormalisable theory. It’s generally easier to prove something isn’t renormalisable than prove it is. As just outlined with the Gamma function, the dimensionality of space can play a role. 2 dimensional quantum gravity is much easier than general quantum gravity for example, 2d has lots of quirky properties which make it great to work in, just ask your friendly neighbourhood string theorist (or take my word for it as I used to be one).

 Cut offs

The conclusion is that by altering your formulation to allow you to work ‘close’ to the thing you’re interested in, ie d=4, you control the infinities, which allows you to separate off the good from the bad. You introduce a cut off, stopping your calculations short of the problem region but close enough to give you insight. Other methods work on similar principles. For example, the integrals discussed above involve integrating up to infinite momentum. Instead, why not integrate up to some energy scale \Lambda, work out everything and then try to remove anything which explodes as \Lambda \to \infty? This is a momentum cut off rather than a length cut off but in relativity energy scales go up as length scales go down so there’s a similar underlying principle in \epsilon \to 0 as \Lambda \to \infty.
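A toy version of the momentum cut off can be done by hand for the d=4, n=2 radial integral, \int_{0}^{\Lambda} \frac{x^{3}\,\textrm{d}x}{(x^{2}+k)^{2}} = \tfrac{1}{2}\left[ \ln\frac{\Lambda^{2}+k}{k} + \frac{k}{\Lambda^{2}+k} - 1 \right]. The Python sketch below is a simplified stand-in of my own, not the actual counter term procedure: it just shows that the piece which explodes with \Lambda doesn’t depend on k, so the difference between two values of k stays finite. The function name and the values of k are assumptions.

```python
import math


def cutoff_integral(cutoff, k):
    """Integral of x^3 / (x^2 + k)^2 from 0 to `cutoff`, in closed form."""
    u = cutoff * cutoff + k
    return 0.5 * (math.log(u / k) + k / u - 1.0)


k1, k2 = 1.0, 5.0
for cutoff in (1e1, 1e3, 1e5):
    a = cutoff_integral(cutoff, k1)
    b = cutoff_integral(cutoff, k2)
    # a and b each grow like log(cutoff), but a - b converges
    print(cutoff, a, b, a - b)
print("limit of the difference:", 0.5 * math.log(k2 / k1))
```

Each integral on its own grows like \ln \Lambda, but with the cut off in place the ‘\infty - \infty’ is an ordinary subtraction of two finite numbers with a definite limit.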

Hopefully that helps someone understand WTF renormalisation tries to do with infinities. I’m sure someone who does renormalisation for a living will see plenty of mistakes here but it’s a mixture of cutting corners and my own partial ignorance of the details. As long as there’s enough to get people to see that it’s not just “Infinity minus infinity is whatever the hell I like, quantum field theory works!”, as some people try to claim, then this has served its purpose.

Bases and Lorentz Boosts

October 9, 2011

This post (and perhaps a few to follow) is prompted by this thread over on SciForums. It’s a discussion about Lorentz transforms and, surprise surprise, someone doesn’t get it. In this case Magneto.

The thread started with Rpenner going over the form of a Lorentz transform, specifically a boost. He did what most textbooks do and selected the boost in the x axis direction. This is mathematically simpler than a general one and helps to show that the directions orthogonal to the velocity vector do not get changed by the boost. While the choice of velocity to go in the x axis might sound like not a general case it is actually equivalent to it. Unfortunately Magneto doesn’t understand this (despite claiming to be well read in this stuff, so much so he thought he was in the running to head up a research group at CalTech, which he wasn’t). It’s a problem I can imagine others having, though most people wouldn’t be as dense as Magneto when they are given the explanation, which I’ll do now.

Bases of \mathbb{R}^{3}

For the sake of simple examples I will do everything in the familiar realm of 3 dimensional space but all of this can be generalised to any number of dimensions (even infinite in some cases!), usually by just changing 3 to N and repeating a few things enough times.

So what’s a basis of \mathbb{R}^{3}? A basis is a list of 3 vectors in terms of which all vectors can be written. If \{\mathbf{e}_{1},\mathbf{e}_{2},\mathbf{e}_{3}\} is a basis then it means any vector \mathbf{v} can be written as a linear combination of them, \mathbf{v} = A_{1}\mathbf{e}_{1}+A_{2}\mathbf{e}_{2}+A_{3}\mathbf{e}_{3}. We call A_{i} the components of \mathbf{v} in the basis \{\mathbf{e}_{1},\mathbf{e}_{2},\mathbf{e}_{3}\}. If we change basis to \{\mathbf{f}_{1},\mathbf{f}_{2},\mathbf{f}_{3}\} then we get new components because now we have \mathbf{v} = B_{1}\mathbf{f}_{1}+B_{2}\mathbf{f}_{2}+B_{3}\mathbf{f}_{3}. A simple example is that if \mathbf{f}_{i} = 2\mathbf{e}_{i} then obviously we have B_{i} = \frac{1}{2}A_{i} but more complicated combinations are allowed.

So how do we know if a set of vectors form a basis? They must be linearly independent, which means you can’t write one in terms of the others. A quick way to check this is to compute the scalar triple product, \mathbf{e}_{1} \cdot (\mathbf{e}_{2} \times \mathbf{e}_{3}). If it is zero then they are not a basis. If it is non-zero then you’re good to go. It’s often convenient to have the vectors be mutually orthogonal, which means they are at right angles to one another so \mathbf{e}_{i}\cdot \mathbf{e}_{j} = 0 if i \neq j.
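The triple product test is easy to try out. Here is a minimal Python sketch, where the function names, the tolerance and the three example sets of vectors are all my own choices for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def triple_product(e1, e2, e3):
    """e1 . (e2 x e3), the signed volume spanned by the three vectors."""
    return dot(e1, cross(e2, e3))


def is_basis(e1, e2, e3, tol=1e-12):
    """Three vectors in R^3 form a basis when the triple product is non-zero."""
    return abs(triple_product(e1, e2, e3)) > tol


standard = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
skewed = ((1, 1, 0), (0, 1, 1), (1, 0, 1))      # not orthogonal, still a basis
degenerate = ((1, 2, 3), (2, 4, 6), (0, 0, 1))  # second vector is twice the first

for name, vectors in (("standard", standard), ("skewed", skewed), ("degenerate", degenerate)):
    print(name, triple_product(*vectors), is_basis(*vectors))
```

Note the skewed example: a basis does not have to be orthogonal, orthogonality is a convenience rather than a requirement.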

All of these things are true for the x,y,z axis directions, \mathbf{e}_{x},\mathbf{e}_{y},\mathbf{e}_{z}. If we take \mathbf{e}_{1} = \mathbf{e}_{x} etc then the components of \mathbf{e}_{x} are (1,0,0) because obviously \mathbf{e}_{x} = 1\mathbf{e}_{1} + 0\mathbf{e}_{2}+0\mathbf{e}_{3} and likewise for the rest. This is the standard basis everyone is familiar with from school.


Rotations are things most people should be familiar with. In linear algebra a rotation can be represented as a matrix acting on vectors. Again, as for vectors, the components of the matrix depend upon the choice of basis. Suppose we want to consider a rotation about \mathbf{e}_{1}. This will leave \mathbf{e}_{1} invariant and will mix the other two directions. It will depend on the angle, call it \theta. Through some simple geometry (see Wikipedia) you find the matrix has the following form,

R(\theta,\mathbf{e}_{1}) = \left( \begin{array}{rrr} 1 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta \\ 0 & \sin \theta & \cos \theta \end{array}\right)

The effect of this on the basis is the following,

\begin{array}{lcrcrcl}  \mathbf{e}_{1} &\to& \mathbf{e}_{1} &&&\equiv& \mathbf{f}_{1} \\  \mathbf{e}_{2} &\to& \cos\theta\, \mathbf{e}_{2} &+& \sin \theta\, \mathbf{e}_{3} &\equiv& \mathbf{f}_{2} \\  \mathbf{e}_{3} &\to& -\sin\theta \,\mathbf{e}_{2} &+& \cos \theta \,\mathbf{e}_{3} &\equiv& \mathbf{f}_{3}  \end{array}.

Similar things occur for R(\theta,\mathbf{e}_{2}) and R(\theta,\mathbf{e}_{3}), just shuffle around the 1,2,3 labels. So what’s nice about rotations? The really important thing is that they don’t change the dot product between two vectors. This is a special case of what has previously been talked about for Lorentz transforms, the group SO(N) leaves the Euclidean metric \delta invariant.

\delta(\mathbf{X},\mathbf{Y})=\delta(R\cdot\mathbf{X},R\cdot\mathbf{Y}) \quad \Leftrightarrow \quad \delta = R^{\top} \cdot \delta \cdot R

In a less formal way it means rotations don’t change the length of vectors and they don’t change the angles between vectors and you can write the dot product between vectors in terms of those things,

\mathbf{X}\cdot \mathbf{Y} = \Vert \mathbf{X} \Vert \Vert \mathbf{Y} \Vert \cos \alpha,

where \alpha is the angle between the vectors. So what does that mean for a basis? Well we used the rotation R(\theta,\mathbf{e}_{1}) to define a set of new vectors \mathbf{f}_{i} above. Well let’s consider their dot products,

\mathbf{f}_{i} \cdot \mathbf{f}_{j} = (R(\theta,\mathbf{e}_{1}) \cdot \mathbf{e}_{i}) \cdot (R(\theta,\mathbf{e}_{1})\cdot \mathbf{e}_{j}) = \mathbf{e}_{i} \cdot \mathbf{e}_{j} .
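The invariance of the dot product can be checked numerically for the rotation about \mathbf{e}_{1} given earlier. In this Python sketch the angle and the two test vectors are arbitrary choices of mine.

```python
import math


def rotation_about_e1(theta):
    """Matrix for a rotation by theta about the first basis vector."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0],
            [0.0, c, -s],
            [0.0, s, c]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def apply(matrix, vec):
    """Matrix acting on a column vector."""
    return [dot(row, vec) for row in matrix]


R = rotation_about_e1(0.7)
X = [1.0, 2.0, 3.0]
Y = [-4.0, 0.5, 2.0]

print(dot(X, Y), dot(apply(R, X), apply(R, Y)))  # the two values agree
print(dot(X, X), dot(apply(R, X), apply(R, X)))  # lengths are preserved too
print(apply(R, [1.0, 0.0, 0.0]))                 # the rotation axis is untouched
```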

Another thing about rotations is that if two vectors have the same length then there exists a rotation (in more than 2 dimensions there are infinitely many!) which will turn one into the other.

\Vert \mathbf{X} \Vert = \Vert \mathbf{Y} \Vert \quad \Rightarrow \quad \exists \, R_{\mathbf{X},\mathbf{Y}} \; \textrm{such that} \; \mathbf{X} = R_{\mathbf{X},\mathbf{Y}} \cdot \mathbf{Y}

For example, if \mathbf{v} = (v_{1},v_{2},v_{3}) then there is a rotation which changes it to (v,0,0) where v^{2} = v_{1}^{2}+v_{2}^{2}+v_{3}^{2} and vice versa.

Lorentz Boosts

So let’s return to Rpenner’s Lorentz boost, where the velocity was (v,0,0). It results in the y and z directions being unchanged, ie the boost warps only 1 direction of 3. This can be seen from the matrix representation, which is the identity in those two directions.

\Lambda(v,0,0) = \left( \begin{array}{cccc}  \gamma & -v \gamma & 0 & 0 \\  -v \gamma & \gamma & 0 & 0\\  0 & 0 & 1 & 0\\  0 & 0 & 0 & 1  \end{array} \right)\quad \Rightarrow \quad \Lambda(v,0,0) \cdot \left( \begin{array}{c} A_{0} \\ A_{1} \\ A_{2} \\ A_{3} \end{array} \right) = \left( \begin{array}{c} \gamma A_{0} - v \gamma A_{1} \\ \gamma A_{1} - v \gamma A_{0} \\ A_{2} \\ A_{3} \end{array} \right)

Acting the above on a space-time vector will not change the last two components. Magneto’s Lorentz boost is in the \mathbf{v} = (v_{1},v_{2},v_{3}) direction. He claims it is ‘more general’ in the sense that all the components change,

\Lambda(v_{1},v_{2},v_{3}) = \left( \begin{array}{cccc}  \gamma&-v_{1}\,\gamma&-v_{2}\,\gamma&-v_{3}\,\gamma\\  -v_{1}\,\gamma&1+(\gamma-1)\dfrac{v_{1}^2}{v^2}&(\gamma-1)\dfrac{v_{1} v_{2}}{v^2}&(\gamma-1)\dfrac{v_{1} v_{3}}{v^2}\\  -v_{2}\,\gamma&(\gamma-1)\dfrac{v_{2} v_{1}}{v^2}&1+(\gamma-1)\dfrac{v_{2}^2}{v^2}&(\gamma-1)\dfrac{v_{2} v_{3}}{v^2}\\  -v_{3}\,\gamma&(\gamma-1)\dfrac{v_{3} v_{1}}{v^2}&(\gamma-1)\dfrac{v_{3} v_{2}}{v^2}&1+(\gamma-1)\dfrac{v_{3}^2}{v^2}\\  \end{array} \right)

where \gamma = \gamma(v) = (1-v^{2})^{-\tfrac{1}{2}} and we have used the notation v^{2} = v_{1}^{2}+v_{2}^{2}+v_{3}^{2} (this is not a coincidence). Clearly acting that on a general vector will change all 3 spatial components, so is Magneto correct in claiming all 3 directions are warped, rather than 1?

As just explained, if two vectors are of the same length there is a rotation which turns one into the other. Since v^{2} = v_{1}^{2}+v_{2}^{2}+v_{3}^{2} the two vectors used by Rpenner and Magneto are equal in size and since rotations are special cases of Lorentz transforms there’s a Lorentz transform which converts Rpenner’s vector into Magneto’s and vice versa. Now notice how the direction which is warped in Rpenner’s case includes a contraction factor \gamma and is time dependent. Clearly its length has changed and changes in time. So the question is can we find a direction (or even 2) in Magneto’s version which isn’t changed?

Remember that rotations don’t change inner products and that the size of a vector is expressed in terms of an inner product, \Vert \mathbf{X} \Vert^{2} = \mathbf{X} \cdot \mathbf{X}. Remember the vectors \mathbf{e}_{y} and \mathbf{e}_{z} don’t change their length under Rpenner’s boost. But we also have the fact we can rotate Rpenner’s boost vector to become Magneto’s. So we can do the following things,

  • Rotate Magneto’s setup to be the same as Rpenner’s setup
  • Do Rpenner’s boost
  • Rotate Rpenner’s setup back to Magneto’s setup

There is some vector \mathbf{Y} in Magneto’s setup which is rotated into \mathbf{e}_{y} in Rpenner’s setup, will not be changed by the boost and then is rotated back to \mathbf{Y}. Since \mathbf{e}_{y} is orthogonal to \mathbf{e}_{x} it follows that \mathbf{Y} is orthogonal to \mathbf{v}. So we have determined there is a direction which Magneto’s boost doesn’t warp! Similarly with \mathbf{e}_{z} and so there are actually 2 directions! If we did this in N dimensional space there would be N-1 such independent directions.

In fact this method of rotate, boost, unrotate is precisely why Rpenner’s method is just as general as Magneto’s. The rotation is equivalent to picking a basis set of vectors and it’s logical to pick one which makes your life easy. It’s easy to compute rotations and it’s easy to do boosts in the x direction. When you combine the rotation matrices with Rpenner’s boost matrix you end up with Magneto’s.
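The claim that rotate, boost, unrotate reproduces the general boost matrix can be checked numerically. The Python sketch below is only an illustration under my own assumptions: the helper names are made up, the velocity (0.3, 0.4, 0.5) is an arbitrary choice with v^{2}<1, and the rotation is built from an orthonormal triple whose first member points along \mathbf{v}, which requires \mathbf{v} not to lie along the z axis.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def unit(a):
    norm = math.sqrt(dot(a, a))
    return [x / norm for x in a]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def x_boost(speed):
    """Boost along the x axis, in units where c = 1."""
    g = 1.0 / math.sqrt(1.0 - speed * speed)
    return [[g, -speed * g, 0.0, 0.0],
            [-speed * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def general_boost(v):
    """Boost with velocity v = (v1, v2, v3), in units where c = 1."""
    v2 = dot(v, v)
    g = 1.0 / math.sqrt(1.0 - v2)
    rows = [[g] + [-g * vi for vi in v]]
    for i in range(3):
        spatial = [(1.0 if i == j else 0.0) + (g - 1.0) * v[i] * v[j] / v2
                   for j in range(3)]
        rows.append([-g * v[i]] + spatial)
    return rows


def rotation_to_x_axis(v):
    """4x4 matrix whose spatial block rotates v onto (|v|, 0, 0).

    Assumes v1 and v2 are not both zero.
    """
    f1 = unit(v)
    f2 = unit([-v[1], v[0], 0.0])
    f3 = cross(f1, f2)
    return [[1.0, 0.0, 0.0, 0.0]] + [[0.0] + row for row in (f1, f2, f3)]


v = [0.3, 0.4, 0.5]
R = rotation_to_x_axis(v)
speed = math.sqrt(dot(v, v))

# rotate, boost along x, then rotate back
combined = matmul(transpose(R), matmul(x_boost(speed), R))
direct = general_boost(v)
max_diff = max(abs(combined[i][j] - direct[i][j]) for i in range(4) for j in range(4))
print(max_diff)  # zero up to rounding error
```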

Of course some of you might be thinking that I should actually construct the direction in question, else my logic might not be valid. Firstly, the logic stands, regardless of whether I construct the vector. Secondly that’s precisely what I’m going to do now!

Explicit Construction

The x,y,z directions form an orthogonal basis, they are all at right angles to one another and that makes life easier in general so the first thing to do is to make an orthogonal basis which includes \mathbf{v} = (v_{1},v_{2},v_{3}). With a moment’s thought a vector orthogonal to this can be found, \mathbf{u} = (-v_{2},v_{1},0). To get the third vector we use the cross product,

\mathbf{w} = \mathbf{v} \times \mathbf{u} = (-v_{1}v_{3} , -v_{2}v_{3} , v_{1}^{2}+v_{2}^{2}).

Let’s call this set \{\mathbf{v},\mathbf{u},\mathbf{w}\} = \{\mathbf{f}_{1},\mathbf{f}_{2},\mathbf{f}_{3}\}. Since they are orthogonal they are a basis and any vector can be written in terms of them, \mathbf{X} = A_{1}\mathbf{f}_{1} + A_{2}\mathbf{f}_{2} + A_{3}\mathbf{f}_{3}. We can make them orthonormal by normalising but what is important are the directions. One gripe which someone might have is that if they represent velocities why does \mathbf{w} have squared terms. You can view them as dimensionless coefficients, with the \mathbf{e}_{i} carrying the units.
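For anyone who wants to see the construction with numbers in it, this Python sketch builds the three vectors for a sample velocity, computing the cross product directly, and then reads off the components of a vector in that basis. The function names, the sample \mathbf{v} and the test vector \mathbf{X} are my own hypothetical choices, and the formula used for the components relies on the basis being orthogonal.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def orthogonal_basis(v):
    """Return (f1, f2, f3) with f1 = v. Assumes v1 and v2 are not both zero."""
    f1 = tuple(v)
    f2 = (-v[1], v[0], 0.0)
    f3 = cross(f1, f2)
    return f1, f2, f3


def components(X, basis):
    """Coefficients A_i of X in an orthogonal (not necessarily normalised) basis."""
    return [dot(X, f) / dot(f, f) for f in basis]


basis = orthogonal_basis((0.3, 0.4, 0.5))
f1, f2, f3 = basis
print(f3)
print(dot(f1, f2), dot(f1, f3), dot(f2, f3))  # all zero up to rounding

X = (1.0, -2.0, 0.5)
A = components(X, basis)
rebuilt = [sum(A[i] * basis[i][j] for i in range(3)) for j in range(3)]
print(A, rebuilt)  # rebuilt reproduces X
```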

In the previous section I said that because the x and y directions are orthogonal a direction orthogonal to \mathbf{f}_{1} should be unchanged by Magneto’s boost. So let’s check that by acting Magneto’s boost on each one. Remember, at the end the only changes which should occur should be in the \mathbf{v} = \mathbf{f}_{1} direction, the coefficients of the other vectors shouldn’t change. We start by acting \Lambda(v_{1},v_{2},v_{3}) on the components of \mathbf{f}_{2}. Since this is space-time we have to pick a time, so we use t=T for the time component. The time component is allowed to be altered, as it’s common to all vectors.

\begin{array}{rcl}\Lambda(v_{1},v_{2},v_{3})\cdot \mathbf{f}_{2} &=& \left( \begin{array}{cccc} \gamma&-v_{1}\,\gamma&-v_{2}\,\gamma&-v_{3}\,\gamma\\  -v_{1}\,\gamma&1+(\gamma-1)\dfrac{v_{1}^2}{v^2}&(\gamma-1)\dfrac{v_{1} v_{2}}{v^2}&(\gamma-1)\dfrac{v_{1} v_{3}}{v^2}\\ -v_{2}\,\gamma&(\gamma-1)\dfrac{v_{2} v_{1}}{v^2}&1+(\gamma-1)\dfrac{v_{2}^2}{v^2}&(\gamma-1)\dfrac{v_{2} v_{3}}{v^2}\\ -v_{3}\,\gamma&(\gamma-1)\dfrac{v_{3} v_{1}}{v^2}&(\gamma-1)\dfrac{v_{3} v_{2}}{v^2}&1+(\gamma-1)\dfrac{v_{3}^2}{v^2}\\ \end{array} \right) \left( \begin{array}{c} T \\ -v_{2} \\ v_{1} \\ 0 \end{array}\right) \\ &=& \left( \begin{array}{c} \gamma T \\ -\gamma T v_{1}-v_{2} \\ -\gamma T v_{2}+ v_{1} \\ -\gamma T v_{3} \end{array}\right)\end{array}

So how has the spatial part changed? Clearly there’s some T dependent terms and some non-T dependent terms. Dropping the time component we have the 3-vectors

\left( \begin{array}{c} -\gamma T v_{1}-v_{2} \\ -\gamma T v_{2}+ v_{1} \\ -\gamma T v_{3} \end{array}\right) = -\gamma T \left( \begin{array}{c} v_{1} \\ v_{2} \\ v_{3} \end{array}\right) + \left( \begin{array}{c} -v_{2} \\ v_{1} \\ 0 \end{array}\right) = -\gamma T \mathbf{f}_{1} + \mathbf{f}_{2}

So it turns out the coefficient of \mathbf{f}_{2} doesn’t change, only a shift in the \mathbf{f}_{1} direction occurs. Doing the same calculation for \mathbf{f}_{3} gives the same result. Overall we have the effect of Magneto’s Lorentz boost being

\Lambda(v_{1},v_{2},v_{3}) \;:\; A_{1}\mathbf{f}_{1} + A_{2} \mathbf{f}_{2} + A_{3} \mathbf{f}_{3} \to \tilde{A}_{1}\mathbf{f}_{1} + A_{2} \mathbf{f}_{2} + A_{3} \mathbf{f}_{3}.

The A_{1} coefficient is altered, gaining time dependence, while A_{2},A_{3} are untouched. We can therefore conclude that Magneto’s assertions were incorrect.
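If you don’t fancy grinding through the matrix multiplication by hand, the same check can be done numerically. This Python sketch applies the general boost to a vector orthogonal to \mathbf{v} and compares the spatial part with -\gamma T \mathbf{f}_{1} + \mathbf{f}_{j}. The helper names, the sample velocity and T=2 are assumptions made for the example.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def general_boost(v):
    """Boost with velocity v = (v1, v2, v3), in units where c = 1."""
    v2 = dot(v, v)
    g = 1.0 / math.sqrt(1.0 - v2)
    rows = [[g] + [-g * vi for vi in v]]
    for i in range(3):
        spatial = [(1.0 if i == j else 0.0) + (g - 1.0) * v[i] * v[j] / v2
                   for j in range(3)]
        rows.append([-g * v[i]] + spatial)
    return rows


def apply(matrix, vec):
    return [dot(row, vec) for row in matrix]


def residual(v, f, T):
    """Largest deviation of the boosted spatial part from -gamma*T*f1 + f."""
    g = 1.0 / math.sqrt(1.0 - dot(v, v))
    boosted = apply(general_boost(v), [T] + list(f))
    expected = [-g * T * a + b for a, b in zip(v, f)]
    return max(abs(x - y) for x, y in zip(boosted[1:], expected))


v = [0.3, 0.4, 0.5]
f2 = [-v[1], v[0], 0.0]
f3 = cross(v, f2)
print(residual(v, f2, 2.0), residual(v, f3, 2.0))  # both zero up to rounding
```

Both residuals vanish, so the only thing the boost does to either vector is add a time dependent piece along \mathbf{f}_{1}.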

The one thing he did get right is that this is general in the sense of being able to recover specific cases, you can just slap in any values for the v_{i} components you want or define the \mathbf{f}_{j} appropriately. In Rpenner’s example just set \mathbf{f}_{1} = \mathbf{e}_{x} etc, this is nothing more than just applying a rotation to align your vectors nicely before you start churning through the algebra. This method generalises to \mathbb{R}^{N}, where a boost in a given direction warps that direction and leaves the N-1 perpendicular directions unchanged. None of this material is particularly complicated, it’s just something you get a handle on by working through examples and gaining some experience. Unfortunately, despite his years spent writing a book on what he thinks relativity is Magneto has little experience with this stuff. He even confuses vectors and scalars, despite such things being the fundamental concepts upon which relativity is built. Maybe one day he’ll stop writing books about relativity and actually read some instead….

Frames and light spheres Cont.

October 6, 2011

…. continued

Light Emitters

Rather than talk about abstract centres it is better to talk about specific objects. We can alter the scenario slightly, by having 2 light emitters involved, E and E’. E’ moves at speed v relative to E. E is stationary in Frame S and thus E’ is stationary in Frame S’. We have sync’d the frames such that (t,x) = (0,0) is the same point in space-time as (t’,x’) = (0,0). Therefore the emitters pass through one another at that moment. The emitter velocities are then vectors defined at (0,0). If an emitter is moving with speed v in a frame then its velocity vector is (1,v). At (0,0) the light sphere is emitted. So which emitter emits it?

Someone in Frame S will say “Emitter E is at the centre of the sphere, it did!” but someone in Frame S’ will say “No, E’ is at the centre, it did!”. Add more emitters who all meet at the same time and who are all moving in different directions and someone in each of their rest frames will say “NO, you’re all wrong! Mine did!”. Jack_ and Motor Daddy of SciForums consider this to be a problem. They claim it means relativity says there’s multiple emission events. So how is this resolved?

Light Cones

The problem is that the light sphere is not the whole story. In relativity it’s common to draw space-time diagrams, where you draw the line an object moves along through space-time. So a photon moving through space sweeps out the line x = t (or x = -t). In 1 dimension a light ‘sphere’ is just a pair of points, at time T (t,x) = (T,\pm T). In higher dimensions they make circles or spheres (obviously). But when you consider all the spheres together, growing linearly in time, you get a light cone. In 1 dimension it’s an upside down triangle with its apex at (0,0).

Suppose we now ask the different frames where in space-time the apex of the light cone is. They will all agree, pointing (metaphorically, as they can’t point through time) to the emission event when all the emitters are at the same location. When asked at any time after that where the centre of the sphere is they will each point to the emitter which is at rest in their respective frames. But what they are doing is saying where the centre of a slice through the light cone is. Just as the equation y=6 defines a line in the (x,y) plane of a graph, setting time to some value defines a line through space-time. It is that frame’s notion of ‘now’. In Newtonian physics everyone, no matter how they are moving, agrees on this slice but in relativity different frames take different slices.

In 1 dimension each slice is a line. In Frame S the ‘now’ slice for E is a horizontal line at t=T. All x values are allowed, so we can parameterise this line by (t,x) = (T,s) \equiv L(s) for parameter s. Note that when s = \pm T the point is on the light cone and that \Vert L(-s)- L(0) \Vert = \Vert L(+s)- L(0) \Vert. However, if you asked E what the equation for the ‘now’ slice of E’ is, it would be something different. To work it out you just apply the inverse Lorentz transform to the line from Frame S’. In S’ the line is (t',x') = (T,s) \equiv L'(s).

\Lambda^{-1}\cdot L'(s) = \gamma T (1,v) + \gamma s (v,1) \equiv \tilde{L}(s)

Written in this form makes it clear that the old intercept on the t’ axis, when s=0, maps to a point off the t axis, \gamma T (1,v). What about the s = \pm T points? They map to  the following,

\gamma T (1\pm v,v \pm 1) = \gamma T (1\pm v,\pm (1 \pm v)) .

So we see they too satisfy x = \pm t and thus are on the light cone. As expected, points on the light cone map to points on the light cone, even though they are no longer occurring at the same time. Furthermore Frame S sees that \Vert \tilde{L}(-s)- \tilde{L}(0) \Vert = \Vert \tilde{L}(+s)- \tilde{L}(0) \Vert. So it seems the different frames agree about the layout of points along slices too. Below is the set of lines in Frame S. The red is the light cone, the blue the Frame S’ axes, the horizontal grey is ‘now’ to Frame S and the diagonal grey is ‘now’ to Frame S’.

If we reverse the analysis we obtain the point of view of Frame S’ ; the light cone, the axes of Frame S and the ‘now’ of each frame.
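The slice calculation above is easy to check with numbers. What follows is only a minimal sketch: the function names and the values v = 0.6, T = 2 are my own illustrative choices, and I've taken \Vert \cdot \Vert to mean the Minkowski interval, which is an assumption since the norm isn't spelled out above.

```python
import math

def inverse_boost(tp, xp, v):
    """Map S' coordinates (t', x') to S coordinates (t, x), with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (tp + v * xp), g * (xp + v * tp)

def slice_in_S(s, T, v):
    """The S' 'now' slice (t', x') = (T, s), written in S coordinates."""
    return inverse_boost(T, s, v)

def interval(p, q):
    """Minkowski interval -dt^2 + dx^2 between two events p and q."""
    dt, dx = p[0] - q[0], p[1] - q[1]
    return -dt * dt + dx * dx

v, T = 0.6, 2.0                 # illustrative values only
centre = slice_in_S(0.0, T, v)  # where E' is on the slice
right = slice_in_S(+T, T, v)    # s = +T end of the slice
left = slice_in_S(-T, T, v)     # s = -T end of the slice

print("right end on x = +t:", math.isclose(right[1], right[0]))
print("left end on x = -t:", math.isclose(left[1], -left[0]))
print("symmetric about E':",
      math.isclose(interval(right, centre), interval(left, centre)))
print("S times of the two ends:", right[0], left[0])
```

The two ends land on the light cone and sit symmetrically about E’, yet Frame S assigns them different times, which is exactly the tilted ‘now’ slice described above.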

What remains now is to address whether or not, to use Jack_’s phrasing, relativity says there are multiple emission points. The true emission point is the apex of the light cone, a point in space-time, not a point in space. Previously we asked which of the emitters did the emitting, so which is it? Well we’ve just seen that it doesn’t matter, they all would produce the same light cone. All that defines a light cone is the location of the emitter, not its velocity. We’ve seen all the different frames agree on the space-time structure of the light cone. They also agree on what is inside the cone and what is outside.

What they disagree on is what points lie on a spatial slice of the cone. This is equivalent to disagreeing on what the velocity of the emitter was because the slice is defined by the choice of frame and the emitter which is stationary in a given frame is the one which appears to be in the middle of the sphere from that frame’s perspective. But as we’ve seen, it doesn’t alter the structure of events or create contradictions about the light cone. For each ‘now’ slice in Frame S’ there is a line slice through Frame S with the same layout, it’s just at an angle. This shows how all frames agree on the space-time structure and causality, which is the underlying physically important thing. Having a different ‘now’ slice might sound odd but it is no different in terms of mathematical consistency to doing a rotation.

There are ways to make this much more formal, talking about how Lorentz transforms strictly act on vectors in tangent spaces, not coordinates and then are upgraded by the flat properties of Minkowski space-time to acting on coordinates. The lack of dependence on the velocity can be phrased in terms of tangent bundles. I went over it with Jack_ on SciForums ages ago here so if anyone is interested have a read of that.

Frames and light spheres

October 5, 2011

Light spheres seem to be a sticking point for people who don’t like special relativity. Let’s have a look at them. Throughout we use units where c=1 and \gamma = \frac{1}{\sqrt{1-v^{2}}}.

Lorentz Boosts

We could work in general N dimensional space but if we’re considering Lorentz boosts then we can just use time and one space dimension. We have 2 frames, S and S’, with coordinates (t,x) and (t’,x’) respectively. A Lorentz boost by speed v in the x direction is then written as

t' = \gamma(t-vx) \quad,\quad x' = \gamma(x-vt) .

The inverse of this is just a boost in the opposite direction, so we just swap the primed and unprimed coordinates and replace v with -v,

t = \gamma(t'+vx') \quad,\quad x = \gamma(x'+vt') .
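If you’d rather see the boost and its inverse in action than take the algebra on trust, here is a minimal sketch. The function names and the sample event are my own choices for illustration, not anything standard.

```python
import math

def gamma(v):
    """Lorentz factor in units where c = 1 (needs |v| < 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost(t, x, v):
    """Map S coordinates (t, x) to S' coordinates (t', x')."""
    g = gamma(v)
    return g * (t - v * x), g * (x - v * t)

def inverse_boost(tp, xp, v):
    """Map S' coordinates back to S: the same boost with v replaced by -v."""
    return boost(tp, xp, -v)

v = 0.6              # illustrative speed
t, x = 2.0, 0.5      # illustrative event in S
tp, xp = boost(t, x, v)
t_back, x_back = inverse_boost(tp, xp, v)

print("S' coordinates:", tp, xp)
print("round trip back to S:", t_back, x_back)
```

Boosting and then applying the inverse returns the original event, up to floating point rounding.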

Light Spheres

A light sphere centred on the origin in S is then written as x^{2} = t^{2}. What does this become after a Lorentz transform? First we rearrange it to 0 = -t^{2}+x^{2} and so

- \gamma^{2}(t'+vx')^{2} + \gamma^{2}(x'+vt')^{2} = \gamma^{2}\Big( -(1-v^{2})(t')^{2} + (1-v^{2})(x')^{2} \Big) = -(t')^{2} + (x')^{2}.

So this is the same form as the original light sphere; the Lorentz boost maps a light sphere centred on the origin to a light sphere centred on the origin. We have done this with a specific Lorentz transform but it’s actually true of any Lorentz transform. This is because all Lorentz transforms, in any number of dimensions, preserve the Minkowski metric and the light sphere in question can be written in terms of the Minkowski metric \eta. In Cartesian coordinates \eta is a diagonal matrix with -1 as the first entry on the diagonal and +1 on all the others. For our case we have

\eta = \left( \begin{array}{rr} -1 & 0 \\ 0 & +1 \end{array}\right) .

Let’s do this explicitly. We write \mathbf{X} = (t,x) as the vector of coordinates of S and \mathbf{X}' for those of S’. From that it follows

-t^{2} + x^{2} = \mathbf{X}^{\top} \cdot \eta \cdot \mathbf{X} \equiv \eta(\mathbf{X},\mathbf{X}) .

So we now want to show that \eta(\mathbf{X},\mathbf{X}) = \eta(\mathbf{X}',\mathbf{X}'). To do this we need to relate \mathbf{X} and \mathbf{X}'. We define a matrix representation for the Lorentz transform via \mathbf{X}' = \Lambda \cdot \mathbf{X}. For the above explicit examples we get

\Lambda = \left( \begin{array}{rr} \gamma & - \gamma\,v \\ - \gamma\,v & \gamma \end{array}\right) \quad,\quad \Lambda^{-1} = \left( \begin{array}{rr} \gamma & \gamma\,v \\ \gamma\,v & \gamma \end{array}\right) .

Note that since \Lambda is a Lorentz transform for any value of v then so is its inverse, as it is obtained by changing v to -v. We can now state in a simple way one of the defining and essential properties of Lorentz transforms, that they preserve \eta,

\eta = \Lambda^{\top} \cdot \eta \cdot \Lambda.

For those at home you can see this explicitly for yourself for the above specific case by just putting in the numbers. Now we’re ready to prove the result in general. We start by considering \eta(\mathbf{X}',\mathbf{X}'),

\eta(\mathbf{X}',\mathbf{X}') = \eta(\Lambda \cdot \mathbf{X},\Lambda \cdot \mathbf{X}) = \mathbf{X}^{\top} \cdot(\Lambda^{\top} \cdot \eta \cdot \Lambda ) \cdot \mathbf{X} .

The term in the bracket is, by definition of the Lorentz transforms, just the metric and so we arrive at the result

\eta(\mathbf{X}',\mathbf{X}') = \mathbf{X}^{\top} \cdot \eta \cdot \mathbf{X} = \eta(\mathbf{X},\mathbf{X}) .
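For anyone who wants to put in the numbers without doing it by hand, the following is a minimal sketch of both checks: that \Lambda^{\top} \cdot \eta \cdot \Lambda gives back \eta, and that \eta(\mathbf{X},\mathbf{X}) is unchanged by the boost. The helper names, the speed v = 0.6 and the sample vector are all just illustrative choices of mine.

```python
import math

def matmul(A, B):
    """Product of two small matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def boost_matrix(v):
    """The 2x2 matrix Lambda for a boost by speed v, with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v], [-g * v, g]]

ETA = [[-1.0, 0.0], [0.0, 1.0]]

def interval(X):
    """eta(X, X) = -t^2 + x^2 for X = (t, x)."""
    t, x = X
    return -t * t + x * x

v = 0.6                                      # illustrative speed
L = boost_matrix(v)
lhs = matmul(matmul(transpose(L), ETA), L)   # Lambda^T . eta . Lambda

X = (3.0, 1.0)                               # illustrative vector in S
Xp = tuple(L[i][0] * X[0] + L[i][1] * X[1] for i in range(2))

print("Lambda^T eta Lambda =", lhs)
print("eta(X,X) =", interval(X), " eta(X',X') =", interval(Xp))
```

Up to rounding, the matrix product reproduces \eta and the two intervals agree.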

Sphere Centres

Both frames see a sphere centred on the origin, their origin. In just 1 spatial dimension this corresponds to x=0 in S and x’=0 in S’. When S has seen time T pass the light sphere has radius T (remember c=1). So the origin is then the space-time location \mathbf{X}_{\textrm{or}} = (T,0). So where does this map to under the Lorentz boost? Easy, \Lambda \cdot \mathbf{X}_{\textrm{or}} = \gamma(T,-vT). This is obviously not \mathbf{X}'_{\textrm{or}} = (T,0), the origin of S’ once time T has passed in S’. Similarly, the origin in S’ maps to \Lambda^{-1} \cdot \mathbf{X}'_{\textrm{or}} = \gamma (T,v T), which is not \mathbf{X}_{\textrm{or}}.
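The two mappings above can be checked numerically. This is a minimal sketch, taking each frame’s centre to be the point (T,0) in that frame’s own coordinates; the helper names and the values v = 0.6, T = 2 are assumptions of mine for illustration.

```python
import math

def boost_matrix(v):
    """The 2x2 matrix Lambda for a boost by speed v, with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v], [-g * v, g]]

def apply(M, X):
    """Apply a 2x2 matrix to a vector X = (t, x)."""
    return tuple(M[i][0] * X[0] + M[i][1] * X[1] for i in range(2))

v, T = 0.6, 2.0                         # illustrative values only
g = 1.0 / math.sqrt(1.0 - v * v)

centre_S = (T, 0.0)                     # sphere centre according to S
centre_S_in_Sp = apply(boost_matrix(v), centre_S)

centre_Sp = (T, 0.0)                    # sphere centre according to S'
centre_Sp_in_S = apply(boost_matrix(-v), centre_Sp)   # Lambda^{-1}

print("S centre seen from S':", centre_S_in_Sp, "vs gamma(T,-vT) =", (g * T, -g * v * T))
print("S' centre seen from S:", centre_Sp_in_S, "vs gamma(T,+vT) =", (g * T, g * v * T))
```

Neither image has a vanishing space coordinate, so each frame’s centre is off-centre according to the other frame.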


So here’s the issue, if everything is consistent how can two frames disagree? After all if you put an object like a ball into the space-time then if relativity is consistent when you ask two people in different frames to point at the ball they should agree. So how is it that if they are asked to point at the centre of the sphere they will disagree? This is the crux of the issues had by, among other people, Jack_ on SciForums. An example thread can be found here. I’ll go over that next….

Are simpler models inherently better?

October 4, 2011

It is common for cranks to be motivated in their denunciation of the mainstream or in their development of a pet theory by a specific phenomenon, X. More often than not this phenomenon is one which the mainstream community has an incomplete understanding of, allowing the crank to make unjustified or erroneous claims with less fear of being smacked down by experimental observations contradicting them. Alternatively their mind set is akin to that of a creationist, following the flawed rationale that since mainstream understanding of X is not well developed any other tentative explanation is equally worthy of consideration. In the case of creationism many creationists think that if they disprove evolution then somehow their position is more valid.

Regardless of why a crank might pick a particular X, the above claim is flawed because they have narrowed the part of physics they consider. By narrowing attention to X the number of different things the model must describe is vastly reduced and the construction of the model is bottom-up. Much of mainstream physics works in the same manner initially but given sufficient time and development top-down models are often constructed to unify the various separate results obtained by a bottom-up approach. In its most naive form a bottom-up approach is little more than ‘curve fitting’: given a set of experimental results, a few ad hoc equations are found which output those results as closely as possible and in some lucky cases it might be possible to spot a pattern or particularly nice closed form expression for the relevant formulae. With only a ‘small’ (as all phenomena other than X are ignored) amount of data to explain, the equations obtained might be quite simple. Unfortunately this tells you very little about the nature of the system. Obtaining a mathematical formula does not automatically provide someone with physical insight, nor is there any a priori reason to think the results will apply to any related but dissimilar phenomena. A top-down construction of a model is generally done by making some physical statements about the system, converting them into mathematical expressions and then deriving results from them. If the physical statements apply to other phenomena aside from X it would be reasonable to think the model can be applied to other phenomena too.

Two examples can be used to illustrate this rationale; dark matter in gravitational models and the construction of electromagnetism.

Dark matter and gravity

General relativity (GR) supplanted Newtonian gravity (NG) in the 1910s and 1920s when it successfully explained the precession of Mercury, which NG couldn’t, and the deflection of light by the Sun, which NG gave the wrong value for. Since then its applications have spread to cover the entirety of cosmology. Anyone who has looked at textbooks on GR will know it is vastly more complex than the simple gravitational force formula Newton gave but it applies to things NG cannot explain. However, observations of galaxy rotation rates cannot be explained by GR unless the galaxies are also filled with dark matter, and this addition to an already complex theory is viewed by some, particularly cranks, as the last straw, leading them to believe GR should be done away with. Thus Modified Newtonian Dynamics (MOND) was born, to explain the rotation rates of the galaxies without dark matter being needed, and it is this phenomenon which makes up the majority of MOND development.

MOND has failed in two ways; it has failed to explain the rotation curves of observed galaxies in a way which works for all of them and it has failed to correct the issues standard NG has with other phenomena. MOND’s development was/is guided almost entirely by “We must explain galaxy rotations” with little attention given to the question of whether MOND applies to anything else, and it didn’t. Even though MOND is mathematically simpler than GR it has nowhere near the predictive power of GR nor does it have the same pseudo-top-down origins of GR. MOND is not free of somewhat ad-hoc assumptions either and thus Occam’s Razor applies; both NG and GR require extension in order to even vaguely explain galaxy rotation curves accurately but in the case of GR it then explains other phenomena like the Bullet Cluster. A model must be as simple as possible but no simpler than that, and typically cranks make their models simple by not worrying if they apply to anything other than X.


The second example is how electromagnetism might be developed. Faraday realised through a great many experiments that electricity and magnetism are related to one another, and through his careful measurements people such as Maxwell were able to develop a series of equations to formalise the relationship, equations which now bear Maxwell’s name. These equations received extensive interest and it was seen that they possessed an invariance under a set of transformations we now call Lorentz transformations, and these transforms took centre stage in special relativity. The nature of the Maxwell equations prompted much important research in physics but their origin was heavily based on experimental results. However, later developments in differential geometry and Lie theory provided tools which allowed not only for a more fundamental top-down construction of electromagnetism but also its generalisations, including their application in quantum mechanics, now known as gauge theory.

The underlying principle of gauge theory is that the predictions of a model for a given system should not depend on arbitrary choices we have in how we describe it, in a manner analogous to how special relativity follows from the postulation that all inertial frames are equivalent in their descriptive power and predictions. The formalism in gauge theory is much more advanced than the basic vector calculus needed to understand and apply Maxwell’s equations to a given physical problem so would undoubtedly be regarded as more complex but the insight into other areas of mathematical physics gauge theory provides cannot be obtained by a bottom-up approach based on ‘curve fitting’ experimental results.

Why bother with peer review?

October 4, 2011

Not all major papers are published in journals, particularly nowadays given websites like ArXiv, so why should anyone bother? After all, it is not unheard of for work which was published and mainstream to turn out to be wrong, right?

Peer review is by no means perfect but it serves an important purpose. It provides a fresh set of eyes on a problem, as it’s all too easy to miss mistakes when you constantly work on something. It forces researchers to organise their results in a way which is concise and conveys the essence of their work in as convenient a way as possible. Most importantly it prevents the community being deluged with poor quality work riddled with errors. Yes, some get through but the majority do not. A journal will send a submitted paper to someone who is very familiar with the subject matter so that errors are more likely to be spotted. Subtle errors may not be noticed by others or by people new to a field of research, and the requirement that a paper passes the scrutiny of an expert means that people new to that field can be more confident in the work, as they have not yet developed the knowledge and experience to review it properly. It takes the expert some time to read and evaluate a paper but it saves the research community time overall as it prevents errors leaking into the literature.

ArXiv does not peer review but it has basic standards: someone without an association to a reputable research institution must get someone to vouch for them. ViXra was started up by people who got banned from ArXiv and they allow anything to be put on there. As this post illustrates, the material on viXra is much lower quality and generally either incoherent or riddled with mistakes an undergraduate should not be making, let alone someone claiming to have killed special relativity.