Tensor Calculus

Partial Derivative of a Tensor

The partial derivative of a tensor is not, in general, a tensor.  Depending on the circumstance, we will write the partial derivative of a tensor in any of the following ways:

\[ \partial_b X^a = \frac{\partial X^a}{\partial x^b} = X^a{}_{,b} \tag{3.1} \]

where we have taken the special case of a contravariant vector $X^a$.

We now show explicitly that the partial derivative of a contravariant vector cannot be a tensor. Consider the transformation law for such a vector:

\[ X'^a = \frac{\partial x'^a}{\partial x^b}\,X^b \tag{3.2} \]

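Differentiating with respect to the coordinate $x'^c$, we find

\[ \frac{\partial X'^a}{\partial x'^c} = \frac{\partial}{\partial x'^c}\left(\frac{\partial x'^a}{\partial x^b}\,X^b\right) \tag{3.3} \]

Using the chain rule, this becomes

\[ \frac{\partial X'^a}{\partial x'^c} = \frac{\partial x^d}{\partial x'^c}\,\frac{\partial}{\partial x^d}\left(\frac{\partial x'^a}{\partial x^b}\,X^b\right) \tag{3.4} \]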

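Expanding the derivative of the product, we get

\[ \frac{\partial X'^a}{\partial x'^c} = \frac{\partial x'^a}{\partial x^b}\frac{\partial x^d}{\partial x'^c}\,\frac{\partial X^b}{\partial x^d} + \frac{\partial^2 x'^a}{\partial x^b\,\partial x^d}\frac{\partial x^d}{\partial x'^c}\,X^b \tag{3.5} \]

If only the first term on the right-hand side were present, then this would be the usual transformation law for a tensor of type (1,1).  However, the presence of the second term prevents $\partial_b X^a$ from behaving like a tensor.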

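This problem arises from the very definition of the derivative. Differentiation involves comparing a quantity evaluated at two neighbouring points, P and Q say, dividing by some parameter representing the separation of P and Q, and then taking the limit as this parameter goes to zero.  In the case of a contravariant vector field $X^a$, this would involve computing

\[ \lim_{\delta u \to 0} \frac{[X^a]_Q - [X^a]_P}{\delta u} \tag{3.6} \]

for some appropriate parameter $\delta u$.  However, from the transformation law in the form

\[ X'^a = \frac{\partial x'^a}{\partial x^b}\,X^b \tag{3.7} \]

we see that

\[ [X'^a]_P = \left[\frac{\partial x'^a}{\partial x^b}\right]_P [X^b]_P \tag{3.8} \]

and

\[ [X'^a]_Q = \left[\frac{\partial x'^a}{\partial x^b}\right]_Q [X^b]_Q \tag{3.9} \]

The transformed difference $[X'^a]_Q - [X'^a]_P$ therefore involves the transformation matrix evaluated at two different points. Thus it is clear that $[X^a]_Q - [X^a]_P$ is not a tensor.  Similar remarks hold for the differentiation of tensors of general rank.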
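As a concrete check of this argument, here is a minimal sketch for the plane with Cartesian coordinates $(x, y)$ and polar coordinates $(r, \theta)$. The choice of field, the sample point and the helper names `polar_components` and `partial_derivatives` are illustrative assumptions, not part of the text. The field with Cartesian components $(1, 0)$ is constant, so all of its partial derivatives vanish in Cartesian coordinates. If the partial derivative were a tensor, it would then have to vanish in every coordinate system, yet in polar coordinates it does not.

```python
import math

def polar_components(r, theta):
    """Polar components (X^r, X^theta) of the field that is (1, 0) in Cartesian coordinates.

    From the vector transformation law with x' = (r, theta) and x = (x, y):
    X^r = dr/dx = cos(theta), X^theta = dtheta/dx = -sin(theta) / r.
    """
    return (math.cos(theta), -math.sin(theta) / r)

def partial_derivatives(field, point, h=1e-5):
    """Return d[a][b], a central-difference estimate of dX^a/dx^b at `point`."""
    n = len(point)
    d = [[0.0] * n for _ in range(n)]
    for b in range(n):
        plus = list(point)
        minus = list(point)
        plus[b] += h
        minus[b] -= h
        f_plus = field(*plus)
        f_minus = field(*minus)
        for a in range(n):
            d[a][b] = (f_plus[a] - f_minus[a]) / (2 * h)
    return d

r0, theta0 = 2.0, 0.5  # arbitrary sample point (assumption)
polar_partials = partial_derivatives(polar_components, (r0, theta0))

# The Cartesian partial derivatives are identically zero, yet most of these are not.
for a, name_a in enumerate(("r", "theta")):
    for b, name_b in enumerate(("r", "theta")):
        print(f"d X^{name_a} / d {name_b} = {polar_partials[a][b]: .6f}")
```

Analytically the non-zero entries are $\partial_\theta X^r = -\sin\theta$, $\partial_r X^\theta = \sin\theta / r^2$ and $\partial_\theta X^\theta = -\cos\theta / r$, which the finite differences reproduce.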

To define a tensor derivative we shall introduce a quantity called an affine connection and use it to define covariant differentiation.  We will then introduce a tensor called a metric and from it build a special affine connection, called the metric connection, and again we will define covariant differentiation but relative to this specific connection.

The Affine Connection and Covariant Differentiation

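Consider a contravariant vector field $X^a(x)$ evaluated at a point Q, with coordinates $x^a + \delta x^a$, near to a point P, with coordinates $x^a$.  Then, by Taylor's theorem,

\[ X^a(x + \delta x) = X^a(x) + \delta x^b\,\partial_b X^a \tag{3.10} \]

to first order.  If we denote the second term by $\delta X^a(x)$, i.e.

\[ \delta X^a(x) = \delta x^b\,\partial_b X^a = X^a(x + \delta x) - X^a(x) \tag{3.11} \]

then $\delta X^a(x)$ is not tensorial, since it involves subtracting tensors evaluated at two different points.  We are going to define a tensorial derivative by introducing a vector at Q that in some general sense is 'parallel' to $X^a$ at P.  Since $x^a + \delta x^a$ is close to $x^a$, we can assume that the parallel vector differs from $X^a(x)$ only by a small amount, which we denote $\bar{\delta} X^a(x)$.

By the same argument as in the previous discussion of the partial derivative, $\bar{\delta} X^a(x)$ is not tensorial, but we shall construct it in such a way as to make the difference vector

\[ X^a(x) + \delta X^a(x) - \left[X^a(x) + \bar{\delta} X^a(x)\right] = \delta X^a(x) - \bar{\delta} X^a(x) \tag{3.12} \]

tensorial.  It is natural to require that $\bar{\delta} X^a(x)$ should vanish whenever $X^a(x)$ or $\delta x^a$ does.  The simplest definition is then to assume that $\bar{\delta} X^a$ is linear in both $X^a$ and $\delta x^a$, which means that there exist multiplicative factors $\Gamma^a_{bc}$ such that

\[ \bar{\delta} X^a(x) = -\Gamma^a_{bc}(x)\,X^b(x)\,\delta x^c \tag{3.13} \]

where the minus sign is introduced to agree with convention.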

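We have therefore introduced a set of $n^3$ functions $\Gamma^a_{bc}(x)$ on the manifold, whose transformation properties have yet to be determined.  This we do by defining the covariant derivative of $X^a$ (usually written in one of the notations $\nabla_c X^a$, $X^a{}_{;c}$ or $X^a{}_{||c}$) by the limiting process

\[ \nabla_c X^a = \lim_{\delta x^c \to 0} \frac{1}{\delta x^c}\left\{ X^a(x + \delta x) - \left[X^a(x) + \bar{\delta} X^a(x)\right] \right\} \tag{3.14} \]

In other words, it is the difference between the vector $X^a(Q)$ and the vector at Q that is still parallel to $X^a(P)$, divided by the coordinate differences, in the limit as these differences tend to zero.  Using (3.10) and (3.13), we find

\[ \nabla_c X^a = \partial_c X^a + \Gamma^a_{bc}\,X^b \tag{3.15} \]

or, in terms of the semi-colon notation,

\[ X^a{}_{;c} = X^a{}_{,c} + \Gamma^a_{bc}\,X^b \tag{3.16} \]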

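Note that in this formula the differentiation index $c$ comes second in the downstairs indices of $\Gamma^a_{bc}$.  If we now demand that $\nabla_c X^a$ is a tensor of type (1,1), then a straightforward calculation (exercise) reveals that $\Gamma^a_{bc}$ must transform according to

\[ \Gamma'^a_{bc} = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^e}{\partial x'^b}\frac{\partial x^f}{\partial x'^c}\,\Gamma^d_{ef} - \frac{\partial x^d}{\partial x'^b}\frac{\partial x^e}{\partial x'^c}\frac{\partial^2 x'^a}{\partial x^d\,\partial x^e} \tag{3.17} \]

or equivalently (exercise)

\[ \Gamma'^a_{bc} = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^e}{\partial x'^b}\frac{\partial x^f}{\partial x'^c}\,\Gamma^d_{ef} + \frac{\partial x'^a}{\partial x^d}\frac{\partial^2 x^d}{\partial x'^b\,\partial x'^c} \tag{3.18} \]

If the second term on the right-hand side were absent, then this would be the usual transformation law for a tensor of type (1,2).  However, the presence of the second term reveals that the transformation law is linear inhomogeneous, so $\Gamma^a_{bc}$ is not a tensor.  Any quantity $\Gamma^a_{bc}$ that transforms according to (3.17) or (3.18) is called an affine connection (or sometimes simply a connection or affinity).  A manifold with a continuous connection prescribed on it is called an affine manifold.  From another point of view, the existence of the inhomogeneous term in the transformation law is not surprising if we are to define a tensorial derivative, since its role is to compensate for the second term that occurs in (3.5).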
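To see the inhomogeneous term at work, here is a minimal sketch for the plane. It assumes, beyond anything stated in the text, that the connection vanishes identically in Cartesian coordinates $(x, y)$, so that in polar coordinates $(r, \theta)$ the connection is given entirely by the inhomogeneous term $\frac{\partial x'^a}{\partial x^d}\frac{\partial^2 x^d}{\partial x'^b\,\partial x'^c}$. The helper names are hypothetical. The sketch builds these components and then checks by finite differences that $\partial_c X^a + \Gamma^a_{bc} X^b$ vanishes in polar coordinates for the field whose Cartesian components are $(1, 0)$, as it must for a tensor that vanishes in Cartesian coordinates.

```python
import math

def polar_connection(r, theta):
    """Gamma[a][b][k] in polar coordinates from the inhomogeneous term alone.

    Assumes the connection is zero in Cartesian coordinates, so that
    Gamma'^a_bk = (dx'^a/dx^d) * (d^2 x^d / dx'^b dx'^k),
    with x = (r cos(theta), r sin(theta)) and x' = (r, theta).
    """
    c, s = math.cos(theta), math.sin(theta)
    # jac[a][d] = dx'^a / dx^d  (rows: r, theta; columns: x, y)
    jac = [[c, s], [-s / r, c / r]]
    # hess[d][b][k] = d^2 x^d / dx'^b dx'^k
    hess = [
        [[0.0, -s], [-s, -r * c]],  # second derivatives of x = r cos(theta)
        [[0.0, c], [c, -r * s]],    # second derivatives of y = r sin(theta)
    ]
    return [[[sum(jac[a][d] * hess[d][b][k] for d in range(2))
              for k in range(2)] for b in range(2)] for a in range(2)]

def field(r, theta):
    """Polar components of the field that is (1, 0) in Cartesian coordinates."""
    return (math.cos(theta), -math.sin(theta) / r)

def covariant_derivative(r, theta, h=1e-5):
    """Return nabla[a][k] = d_k X^a + Gamma^a_bk X^b, with d_k by central differences."""
    gamma = polar_connection(r, theta)
    x = field(r, theta)
    point = (r, theta)
    nabla = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(2):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        f_plus, f_minus = field(*plus), field(*minus)
        for a in range(2):
            partial = (f_plus[a] - f_minus[a]) / (2 * h)
            nabla[a][k] = partial + sum(gamma[a][b][k] * x[b] for b in range(2))
    return nabla

gamma = polar_connection(2.0, 0.5)
nabla = covariant_derivative(2.0, 0.5)
print("Gamma^r_{theta theta} =", gamma[0][1][1])  # analytically -r
print("Gamma^theta_{r theta} =", gamma[1][0][1])  # analytically 1/r
print("covariant derivative  =", nabla)           # every entry close to zero
```

The partial derivatives of the polar components are not zero, but the connection terms cancel them, which is the compensation described above.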

We next define the covariant derivative of a scalar field to be the same as its partial derivative, i.e.

\[ \nabla_a \phi = \partial_a \phi \tag{3.19} \]

If we now demand that covariant differentiation satisfies the usual product rule of calculus, then for a covariant vector $X_a$ we find

\[ \nabla_c X_a = \partial_c X_a - \Gamma^b_{ac}\,X_b \tag{3.20} \]
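The step from the product rule to the covariant derivative of a covariant vector can be sketched as follows. The auxiliary fields $X^a$ (an arbitrary contravariant vector) and $Y_a$ (the covariant vector being differentiated) are labels chosen here for illustration. The contraction $X^a Y_a$ is a scalar, so its covariant derivative equals its partial derivative, and comparing the two expansions isolates $\nabla_c Y_a$:

```latex
\begin{align*}
\nabla_c (X^a Y_a) &= \partial_c (X^a Y_a)
  = (\partial_c X^a)\,Y_a + X^a\,\partial_c Y_a
  && \text{(scalar rule)} \\
\nabla_c (X^a Y_a) &= (\nabla_c X^a)\,Y_a + X^a\,\nabla_c Y_a
  = \left(\partial_c X^a + \Gamma^a_{bc} X^b\right) Y_a + X^a\,\nabla_c Y_a
  && \text{(product rule)} \\
\Rightarrow\quad X^a\,\nabla_c Y_a &= X^a\,\partial_c Y_a - \Gamma^a_{bc} X^b Y_a
  = X^a \left(\partial_c Y_a - \Gamma^b_{ac} Y_b\right)
  && \text{(relabel } a \leftrightarrow b \text{ in the last term)}
\end{align*}
```

Since $X^a$ is arbitrary, $\nabla_c Y_a = \partial_c Y_a - \Gamma^b_{ac} Y_b$.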

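Notice again that the differentiation index comes last in the $\Gamma$-term and that this term enters with a minus sign.  The name covariant derivative stems from the fact that the derivative of a tensor of type (p, q) is of type (p, q+1), i.e. it has one extra covariant rank.  The expression in the case of a general tensor is

\[ \nabla_c T^{a\cdots}{}_{b\cdots} = \partial_c T^{a\cdots}{}_{b\cdots} + \Gamma^a_{dc}\,T^{d\cdots}{}_{b\cdots} + \cdots - \Gamma^d_{bc}\,T^{a\cdots}{}_{d\cdots} - \cdots \tag{3.21} \]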

It follows directly from the transformation law that the sum of two connections is neither a connection nor a tensor.  However, the difference of two connections is a tensor of type (1,2), because the inhomogeneous term cancels out in the transformation.  For the same reason, the anti-symmetric part of a connection $\Gamma^a_{bc}$, namely

\[ T^a{}_{bc} = \Gamma^a_{bc} - \Gamma^a_{cb} \tag{3.22} \]

is a tensor (called the torsion tensor).  If the torsion tensor vanishes, then the connection is symmetric, i.e.

\[ \Gamma^a_{bc} = \Gamma^a_{cb} \tag{3.23} \]
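The cancellation behind these statements can be written out in a few lines. The shorthand $J$, $K$ and $I$ for the pieces of the connection transformation law, and the second connection $\tilde{\Gamma}^a_{bc}$, are labels introduced here for illustration only:

```latex
\begin{align*}
\Gamma'^a_{bc} &= J^a{}_d\,K^e{}_b\,K^f{}_c\,\Gamma^d_{ef} + I^a{}_{bc},
  \qquad
  J^a{}_d \equiv \frac{\partial x'^a}{\partial x^d},\quad
  K^e{}_b \equiv \frac{\partial x^e}{\partial x'^b},\quad
  I^a{}_{bc} \equiv \frac{\partial x'^a}{\partial x^d}
                    \frac{\partial^2 x^d}{\partial x'^b\,\partial x'^c} \\
\Gamma'^a_{bc} - \tilde{\Gamma}'^a_{bc}
  &= J^a{}_d\,K^e{}_b\,K^f{}_c \left(\Gamma^d_{ef} - \tilde{\Gamma}^d_{ef}\right)
  && \text{(the two copies of } I^a{}_{bc} \text{ cancel: a tensor)} \\
\Gamma'^a_{bc} + \tilde{\Gamma}'^a_{bc}
  &= J^a{}_d\,K^e{}_b\,K^f{}_c \left(\Gamma^d_{ef} + \tilde{\Gamma}^d_{ef}\right) + 2\,I^a{}_{bc}
  && \text{(neither the tensor law nor the connection law)} \\
\Gamma'^a_{bc} - \Gamma'^a_{cb}
  &= J^a{}_d\,K^e{}_b\,K^f{}_c \left(\Gamma^d_{ef} - \Gamma^d_{fe}\right)
  && \text{(} I^a{}_{bc} = I^a{}_{cb} \text{ because partial derivatives commute)}
\end{align*}
```

The last line is the tensor transformation law for the anti-symmetric part of the connection.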

Affine Geodesics

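If $T^{a\cdots}{}_{b\cdots}$ is any tensor, then we introduce the notation

\[ \nabla_X T^{a\cdots}{}_{b\cdots} = X^c\,\nabla_c T^{a\cdots}{}_{b\cdots} \tag{3.24} \]

that is, $\nabla_X$ of a tensor is its covariant derivative contracted with $X^c$.  A contravariant vector field $X^a$ determines a local congruence of curves,

\[ x^a = x^a(u) \]

where the tangent vector field to the congruence is

\[ \frac{dx^a}{du} = X^a \]

We next define the absolute derivative of a tensor $T^{a\cdots}{}_{b\cdots}$ along a curve C of this congruence, written $\frac{D}{Du}\left(T^{a\cdots}{}_{b\cdots}\right)$, by the relation

\[ \frac{D}{Du}\left(T^{a\cdots}{}_{b\cdots}\right) = \nabla_X T^{a\cdots}{}_{b\cdots} \tag{3.25} \]

The tensor $T^{a\cdots}{}_{b\cdots}$ is said to be parallely propagated, or parallel transported, along the curve C if

\[ \frac{D}{Du}\left(T^{a\cdots}{}_{b\cdots}\right) = 0 \tag{3.26} \]

This is a first-order ordinary differential equation for $T^{a\cdots}{}_{b\cdots}$, and so given an initial value for $T^{a\cdots}{}_{b\cdots}$, say $T^{a\cdots}{}_{b\cdots}(P)$, equation (3.26) determines a tensor along C which is everywhere 'parallel' to $T^{a\cdots}{}_{b\cdots}(P)$.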

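Using this notation, an affine geodesic is defined as a privileged curve along which the tangent vector is propagated parallel to itself.  In other words, the parallely propagated vector at any point of the curve is parallel, that is, proportional, to the tangent vector at that point:

\[ \frac{D}{Du}\left(\frac{dx^a}{du}\right) = \lambda(u)\,\frac{dx^a}{du} \tag{3.27} \]

Using (3.25), the equation for an affine geodesic can be written in the form

\[ \nabla_X X^a = \lambda X^a \tag{3.28} \]

where $X^a = dx^a/du$ is the tangent vector, or equivalently (exercise)

\[ \frac{d^2 x^a}{du^2} + \Gamma^a_{bc}\,\frac{dx^b}{du}\frac{dx^c}{du} = \lambda\,\frac{dx^a}{du} \tag{3.29} \]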
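One way to see the equivalence of the last two forms, sketched here on the assumption that $X^a = dx^a/du$ along the curve, is to expand the contracted covariant derivative and recognise the chain rule in the first term:

```latex
\begin{align*}
\nabla_X X^a &= X^c\,\nabla_c X^a
  = X^c\,\partial_c X^a + \Gamma^a_{bc}\,X^b X^c \\
X^c\,\partial_c X^a &= \frac{dx^c}{du}\,\frac{\partial}{\partial x^c}\!\left(\frac{dx^a}{du}\right)
  = \frac{d}{du}\!\left(\frac{dx^a}{du}\right)
  = \frac{d^2 x^a}{du^2} \\
\Rightarrow\quad \nabla_X X^a &= \frac{d^2 x^a}{du^2}
  + \Gamma^a_{bc}\,\frac{dx^b}{du}\frac{dx^c}{du}
\end{align*}
```

Setting this equal to $\lambda\,dx^a/du$ gives the component form of the affine geodesic equation.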

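The last result is very important, and so we shall establish it afresh from first principles using the notation of the last section.  Let the neighbouring points P and Q on C be given by $x^a(u)$ and

\[ x^a(u + \delta u) = x^a(u) + \frac{dx^a}{du}\,\delta u \tag{3.30} \]

to first order in $\delta u$, respectively.  This is essentially a Taylor expansion. We define

\[ \delta x^a = \frac{dx^a}{du}\,\delta u \tag{3.31} \]

The vector $X^a(x)$ at P is now the tangent vector $(dx^a/du)(u)$.  The vector at Q parallel to $dx^a/du$ is, by (3.13) and (3.31),

\[ \frac{dx^a}{du} - \Gamma^a_{bc}\,\frac{dx^b}{du}\frac{dx^c}{du}\,\delta u \tag{3.32} \]

The vector already at Q is

\[ \frac{dx^a}{du}(u + \delta u) = \frac{dx^a}{du} + \frac{d^2 x^a}{du^2}\,\delta u \tag{3.33} \]

to first order in $\delta u$.  These last two vectors must be parallel, so we require

\[ \frac{dx^a}{du} + \frac{d^2 x^a}{du^2}\,\delta u = \left[1 + \lambda(u)\,\delta u\right]\left(\frac{dx^a}{du} - \Gamma^a_{bc}\,\frac{dx^b}{du}\frac{dx^c}{du}\,\delta u\right) \tag{3.34} \]

where we have written the proportionality factor as $1 + \lambda(u)\,\delta u$ without loss of generality, since the equation must hold in the limit $\delta u \to 0$.  Subtracting $dx^a/du$ from each side, dividing by $\delta u$ and taking the limit $\delta u \to 0$ produces the equation we obtained before:

\[ \frac{d^2 x^a}{du^2} + \Gamma^a_{bc}\,\frac{dx^b}{du}\frac{dx^c}{du} = \lambda\,\frac{dx^a}{du} \tag{3.35} \]

Note that $\Gamma^a_{bc}$ appears in the equation multiplied by the symmetric quantity $\frac{dx^b}{du}\frac{dx^c}{du}$, and so even if we had not assumed that $\Gamma^a_{bc}$ was symmetric, the equation picks out its symmetric part only.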

If the curve is parameterized in such a way that $\lambda$ vanishes (that is, by the above, so that the tangent vector is transported into itself), then the parameter is a privileged parameter called an affine parameter, often conventionally denoted by $s$, and the affine geodesic equation reduces to

\[ \frac{D}{Ds}\left(\frac{dx^a}{ds}\right) = 0 \tag{3.36} \]

or equivalently

\[ \frac{d^2 x^a}{ds^2} + \Gamma^a_{bc}\,\frac{dx^b}{ds}\frac{dx^c}{ds} = 0 \tag{3.37} \]
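As an illustration of the affinely parameterized geodesic equation, the following minimal sketch integrates it numerically in plane polar coordinates $(r, \theta)$. It assumes the connection components $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$ (those obtained when the connection vanishes in Cartesian coordinates); the integrator, step count and function names are choices made here, not part of the text. With this connection the geodesics should be straight lines, which the sketch checks for one set of initial data.

```python
import math

def geodesic_rhs(state):
    """Geodesic equation as a first-order system in the affine parameter s.

    state = (r, theta, dr/ds, dtheta/ds). Only the plane-polar connection
    components assumed in this sketch enter:
    Gamma^r_{theta theta} = -r, Gamma^theta_{r theta} = Gamma^theta_{theta r} = 1/r.
    """
    r, theta, vr, vtheta = state
    ar = r * vtheta ** 2             # -Gamma^r_{theta theta} (dtheta/ds)^2
    atheta = -2.0 * vr * vtheta / r  # -2 Gamma^theta_{r theta} (dr/ds)(dtheta/ds)
    return (vr, vtheta, ar, atheta)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(base, slope, factor):
        return tuple(b + factor * k for b, k in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate_geodesic(state, s_end, steps):
    """Advance the state from s = 0 to s = s_end in `steps` equal RK4 steps."""
    h = s_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Start at Cartesian (1, 0) with Cartesian velocity (0, 1):
# in polar form r = 1, theta = 0, dr/ds = 0, dtheta/ds = 1.
r, theta, _, _ = integrate_geodesic((1.0, 0.0, 0.0, 1.0), s_end=1.0, steps=1000)
x, y = r * math.cos(theta), r * math.sin(theta)
print(x, y)  # should stay on the straight line x = 1, with y close to s_end
```

Although $r(s)$ and $\theta(s)$ are both non-linear functions of $s$, the connection terms combine so that the curve is the straight line $x = 1$, $y = s$.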

An affine parameter $s$ is only defined up to an affine transformation (exercise)

\[ s \to \alpha s + \beta \tag{3.38} \]

where $\alpha$ and $\beta$ are constants.  We can use the affine parameter $s$ to define the affine length of the geodesic between two points $P_1$ and $P_2$ by $\int_{P_1}^{P_2} ds$, and so we can compare lengths on the same geodesic.  However, we cannot compare lengths on different geodesics (without a metric) because of the arbitrariness in the parameter $s$.  From the existence and uniqueness theorem for ordinary differential equations, it follows that corresponding to every direction at a point there is a unique geodesic passing through the point, as shown below.

Similarly, as long as the points are sufficiently 'close', any point can be joined to any other point by a unique geodesic.  However, in the large, geodesics may focus, that is, meet again as shown in the following diagram.