differentials

An elementary, informal viewpoint on differentials
with emphasis on noncommutative multiplication.

Version of Thursday 6 July 2023.
Dave Barber's other pages.

§1 Introduction. Within the field of mathematics, elementary calculus textbooks routinely employ symbols such as dx and dy when discussing differentiation. In the opinion of the present author, these textbooks often fail to explain clearly what these symbols mean, and why they are introduced.

As a result, students might think that dx indicates to multiply variable d by variable x. There can be confusion about what d²x, dx², (dx)², and d(x²) could denote, and how they might differ from one another. Further, d seems to appear alone in the expression d/dx. Additionally, the student might be left with the impression that differentials are used only for numerical approximation, and that something like dy is calculated merely by multiplying a derivative by dx.

To begin the explanation, it is convenient to distinguish two kinds of differentiation:

In respectful differentiation, a function is differentiated with respect to one of its variables. Another way to say this is that the output is differentiated with respect to one of the inputs. This operation produces a derivative, which is a function in its own right.
In unrespectful differentiation, an expression is differentiated without respect to anything. The result is a differential, which is an expression in its own right.

It is possible to first establish differentials, and use them to build derivatives. However, in modern mathematics, the usual method is the opposite: derivatives first, then differentials. One reason is that the second method is felt to be more analytically rigorous than the first. Another reason is that derivatives have immediate use in science and engineering, as well as in other areas of mathematics, and are likely to be of great use to the students, not all of them mathematics majors, who take elementary calculus courses. Differentials are more abstract, and often any immediate concrete application is not conspicuous.

In contrast to mathematical tradition, this report does not employ typographical juxtaposition to indicate multiplication; a centered dot · is used instead. Were juxtaposition chosen, numerous parentheses would have become necessary to disambiguate exactly what was being multiplied by what. The dots are a much cleaner notation for the purposes here.

§2 Sundry topics. Consider this elementary expression:

y = x² + 3 · x − 5

In an expression like this, the right-hand member is usually regarded as a function with x as the input (an independent variable) and y is the output (a dependent variable). Variable y is said to be "a function of" variable x.

Respectful differentiation of y with respect to x leads to the derivative, typically notated:

dy/dx = 2 · x + 3

So exactly what are dx and dy? They are new variables introduced to facilitate differentiation. These variables have names that are two letters, rather than one, so that dx is not d · x and dy is not d · y. In fact, there is no variable d. It would have been perfectly correct to have given dx and dy single-letter names such as m and n, or 𝜌 and 𝜃. However, the two-letter names are chosen because dx is regarded as being related to x in some way; and dy to y.

Is that a division slash between dy and dx? Yes it is, although to regard this as an algebraic division operation is a delicate matter, requiring caution.

Elementary calculus textbooks sometimes say that dy/dx is no more and no less than an incomposite symbol denoting the derivative of y with respect to x. Students are left to ponder why this obviously non-incomposite notation was introduced, particularly since the much simpler symbol y′ is likely to have already appeared in the textbook. This report hopes to shed light on the matter.

This report assumes that:

addition is associative;
addition is commutative;
multiplication is associative;
multiplication of a real number with any quantity is commutative;
multiplication is distributive over addition.

Multiplication of two quantities, if neither of them is a real number, is allowed to be non-commutative.

To reduce wordiness, the brief term "comm-mult" will be used to represent "commutative multiplication", "multiplicative commutativity", "multiplicatively commutative", and the like. Hence the opposite, "non-comm-mult".

If x is a real number, then dx will usually be chosen from the real numbers; analogously for complex numbers; vectors; matrices; quaternions; and so forth. The last three of these are non-comm-mult.

Technically, the value of dx is not restricted by the value of x, or vice versa. However, it is customary that dx will be limited to values near zero. Often the sum x + dx will be of interest. It is important to remember that only under comm-mult can x · dx be assumed to equal dx · x.

If x is to be designated as independent and y dependent, then in the usual case dx will be classified as independent and dy dependent. However, the manipulation of differentials often does not require any such designation.

§3 Some basic differentials. Many useful differentials can be calculated by adopting some algebraic postulates. Assume that x, y, and z are variables; while a, b, and c are constants. The reader who has experience with respectful differentiation will find that these formulas look rather familiar.

Definition: The name of the differential of a variable or constant is simply d followed by the quantity's name. For instance, the differential of x is dx. Parentheses around the original variable's name do not invalidate the symbol, and sometimes provide clarity: d(x) and dx are the same.

Definition: The differential of a constant is zero. In other words, da = 0.

Definition: The differential of a sum equals the sum of the differentials:

If z = x + y, then dz = dx + dy.

Here, the original variables are x, y, and z; while the created variables are dx, dy, and dz. This can be written in a more concise manner:

d(x + y) = dx + dy.

Perhaps surprisingly, d(x + y) is a very long name for exactly the same variable as dz.

§4 Differential of a product. Definition:

d(x · y) = x · dy + dx · y

If calculations are being done in a non-comm-mult algebra, it would be wrong to write dy · x for the first addend, or y · dx for the second. (The corresponding caution is well known to apply when finding the derivative of the cross product of two three-dimensional vectors.)
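The caution about factor order can be checked numerically. The following Python sketch is not part of the original report: it takes 2×2 real matrices as one example of a non-comm-mult algebra, and the helper names (mul, add, sub, max_abs) and the sample values are assumptions chosen only for illustration.

```python
# 2x2 matrices as nested tuples: a simple non-comm-mult algebra.
def mul(p, q):
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def add(p, q):
    return tuple(tuple(p[i][j] + q[i][j] for j in range(2)) for i in range(2))

def sub(p, q):
    return tuple(tuple(p[i][j] - q[i][j] for j in range(2)) for i in range(2))

def max_abs(p):
    return max(abs(v) for row in p for v in row)

# Illustrative values; dx and dy are chosen near zero.
x = ((1.0, 2.0), (3.0, 4.0))
y = ((0.0, 1.0), (-1.0, 2.0))
h = 1e-6
dx = ((h, 2 * h), (0.0, -h))
dy = ((3 * h, 0.0), (h, h))

# Actual change in the product when x and y are nudged by dx and dy.
change = sub(mul(add(x, dx), add(y, dy)), mul(x, y))

right_order = add(mul(x, dy), mul(dx, y))  # x · dy + dx · y
wrong_order = add(mul(dy, x), mul(y, dx))  # dy · x + y · dx

# Only the second-order term dx · dy separates change from right_order.
err_right = max_abs(sub(change, right_order))
# A first-order discrepancy separates change from wrong_order.
err_wrong = max_abs(sub(change, wrong_order))
```

With these values, err_right is of the order of h², while err_wrong is of the order of h. That is the behavior expected if x · dy + dx · y is the correct first-order part of the change and the reversed form is not.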

This follows:

d(x · y · z) = x · y · dz + x · dy · z + dx · y · z

The formula for the differential of the product of a constant and a variable simplifies because da = 0:

d(a · x) = a · dx + da · x
d(a · x) = a · dx

Similarly, d(x · a) = dx · a.

The differential of a square relies on the product rule:

d(x²) = d(x · x) = x · dx + dx · x

If comm-mult, d(x²) = 2 · x · dx. With that, write y = x² to give this:

dy = 2 · x · dx

If dx ≠ 0, divide both sides of the equation by dx to obtain a familiar expression for the derivative of y = x² with respect to x:

dy/dx = 2 · x

Caution: Under non-comm-mult, the whole matter of derivatives becomes far more involved, with many subtle questions to answer. By contrast, differentials behave quite manageably.

Now that d(x²) is established, it is good to point out that dx² = (dx)² = dx · dx. This is ordinary arithmetic multiplication.

In much literature, a seemingly isolated d appears in expressions like:

d/dx (a · x² + b · x + c)

"d/dx" means "the derivative with respect to x of whatever follows", and is merely a typographical variant of the following format:

d(a · x² + b · x + c)/dx

The symbol d²x will be addressed later.

§5 Higher powers. Because xⁿ = x · xⁿ⁻¹, repeated application of the product rule gives the following, from which the general pattern can be observed:

non-comm-mult

d(x²) = x · dx
+ dx · x

d(x³) = x² · dx
+ x · dx · x
+ dx · x²

d(x⁴) = x³ · dx
+ x² · dx · x
+ x · dx · x²
+ dx · x³

d(x⁵) = x⁴ · dx
+ x³ · dx · x
+ x² · dx · x²
+ x · dx · x³
+ dx · x⁴

comm-mult

d(x²) = 2 · x · dx
d(x³) = 3 · x² · dx
d(x⁴) = 4 · x³ · dx
d(x⁵) = 5 · x⁴ · dx
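The general pattern for positive powers can be written as a short routine. This Python sketch is an illustration rather than anything taken from the report: it assumes square matrices stored as nested lists, and the function names (mat_mul, mat_add, mat_pow, power_differential) are hypothetical.

```python
def mat_mul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(p, q):
    return [[a + b for a, b in zip(rp, rq)] for rp, rq in zip(p, q)]

def mat_pow(p, e):
    """p raised to a non-negative integer power by repeated multiplication."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = mat_mul(result, p)
    return result

def power_differential(x, dx, n):
    """Sum of x^(n-1-k) · dx · x^k for k = 0 .. n-1, keeping factor order."""
    size = len(x)
    total = [[0.0] * size for _ in range(size)]
    for k in range(n):
        term = mat_mul(mat_mul(mat_pow(x, n - 1 - k), dx), mat_pow(x, k))
        total = mat_add(total, term)
    return total

# Illustrative check for n = 5 with dx near zero.
x = [[1.0, 2.0], [3.0, 4.0]]
h = 1e-6
dx = [[h, 2 * h], [0.0, -h]]
shifted = mat_pow(mat_add(x, dx), 5)
original = mat_pow(x, 5)
predicted = power_differential(x, dx, 5)
error = max(
    abs((s - o) - p)
    for rs, ro, rp in zip(shifted, original, predicted)
    for s, o, p in zip(rs, ro, rp)
)
```

Here error measures how far (x + dx)⁵ − x⁵ lies from the five-term sum; it should be small compared with the entries of the sum itself, since only terms containing two or more factors of dx are left over.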

Little simplification is available under non-comm-mult, and this limitation has a major effect on functions defined with power series. For example, with comm-mult, a convenient result for the sine is obtained: d(sin x) = dx · cos x. But with non-comm-mult, no such brief form emerges.

Under non-comm-mult, attempts to define division induce complications, but at least reciprocals work without trouble:

d(x⁻¹) = − x⁻¹ · dx · x⁻¹ when x ≠ 0
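The reciprocal formula can be tried out on quaternions, which are non-comm-mult. The Python sketch below is an added illustration under stated assumptions: quaternions are stored as (real, i, j, k) tuples, and the helper names (qmul, qinv, qadd, qsub) and sample values are hypothetical.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qinv(q):
    """Reciprocal of a nonzero quaternion: conjugate divided by squared norm."""
    a, b, c, d = q
    n2 = a * a + b * b + c * c + d * d
    return (a / n2, -b / n2, -c / n2, -d / n2)

def qadd(p, q):
    return tuple(u + v for u, v in zip(p, q))

def qsub(p, q):
    return tuple(u - v for u, v in zip(p, q))

# Illustrative values; dx is chosen near zero.
x = (1.0, 2.0, 3.0, 4.0)
h = 1e-6
dx = (h, -h, 2 * h, 0.5 * h)
xi = qinv(x)

actual = qsub(qinv(qadd(x, dx)), xi)        # change in the reciprocal
noncomm = tuple(-v for v in qmul(qmul(xi, dx), xi))  # − x⁻¹ · dx · x⁻¹
comm = tuple(-v for v in qmul(qmul(xi, xi), dx))     # − x⁻² · dx

err_noncomm = max(abs(v) for v in qsub(actual, noncomm))
err_comm = max(abs(v) for v in qsub(actual, comm))
```

With these values err_noncomm is far smaller than err_comm, which suggests that the comm-mult shortcut − x⁻² · dx does not carry over to quaternions while the two-sided form does.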

Non-comm-mult would confound attempts to use the chain rule in finding the derivatives of composite functions, but for differentials the chain rule is merely substitution, and difficulty is avoided. For instance, here are two substitutive ways to combine formulas from above to find d(x⁻²) when x ≠ 0:

d(x⁻²) = d((x⁻¹)²)
= x⁻¹ · d(x⁻¹) + d(x⁻¹) · x⁻¹
= x⁻¹ · (− x⁻¹ · dx · x⁻¹) + (− x⁻¹ · dx · x⁻¹) · x⁻¹
= − x⁻² · dx · x⁻¹ − x⁻¹ · dx · x⁻²

d(x⁻²) = d((x²)⁻¹)
= − (x²)⁻¹ · d(x²) · (x²)⁻¹
= − x⁻² · (dx · x + x · dx) · x⁻²
= − x⁻² · dx · x · x⁻² − x⁻² · x · dx · x⁻²
= − x⁻² · dx · x⁻¹ − x⁻¹ · dx · x⁻²

As an alternative to substitution, the product rule could have been used:

d(x⁻²) = d(x⁻¹ · x⁻¹)
= x⁻¹ · (−x⁻¹ · dx · x⁻¹) + (−x⁻¹ · dx · x⁻¹) · x⁻¹
= − x⁻² · dx · x⁻¹ − x⁻¹ · dx · x⁻²

Higher negative powers form a pattern:

non-comm-mult

d(x⁻³) = − x⁻³ · dx · x⁻¹
− x⁻² · dx · x⁻²
− x⁻¹ · dx · x⁻³

d(x⁻⁴) = − x⁻⁴ · dx · x⁻¹
− x⁻³ · dx · x⁻²
− x⁻² · dx · x⁻³
− x⁻¹ · dx · x⁻⁴

d(x⁻⁵) = − x⁻⁵ · dx · x⁻¹
− x⁻⁴ · dx · x⁻²
− x⁻³ · dx · x⁻³
− x⁻² · dx · x⁻⁴
− x⁻¹ · dx · x⁻⁵

comm-mult

d(x⁻³) = −3 · x⁻⁴ · dx
d(x⁻⁴) = −4 · x⁻⁵ · dx
d(x⁻⁵) = −5 · x⁻⁶ · dx

As a verification:

d(1) = d(x⁰)
= d(x⁻² · x²)
= x⁻² · d(x²) + d(x⁻²) · x²
= x⁻² · (x · dx + dx · x) + (− x⁻² · dx · x⁻¹ − x⁻¹ · dx · x⁻²) · x²
= x⁻¹ · dx + x⁻² · dx · x − x⁻² · dx · x − x⁻¹ · dx
= 0

There is a relationship between positive and negative powers, as shown in the following example:

d(x⁻³) = − x⁻³ · dx · x⁻¹ − x⁻² · dx · x⁻² − x⁻¹ · dx · x⁻³
= − x⁻³ · (dx · x² + x · dx · x + x² · dx) · x⁻³
= − x⁻³ · d(x³) · x⁻³

More nearly symmetrical is to write it this way:

x³ · d(x⁻³) = − d(x³) · x⁻³
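Because both sides of this relationship are linear in dx, it can be checked with exact arithmetic instead of small increments. The following Python sketch is an added illustration: it assumes 2×2 matrices of fractions as the non-comm-mult algebra, and the function names and sample matrices are hypothetical.

```python
from fractions import Fraction

def mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(p, q):
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]

def neg(p):
    return [[-v for v in row] for row in p]

def inverse(p):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    return [[p[1][1] / det, -p[0][1] / det],
            [-p[1][0] / det, p[0][0] / det]]

def power(p, n):
    """p raised to a non-negative integer power."""
    result = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    for _ in range(n):
        result = mul(result, p)
    return result

def d_positive_power(x, dx, n):
    """d(x^n): sum of x^(n-1-k) · dx · x^k for k = 0 .. n-1."""
    total = [[Fraction(0)] * 2 for _ in range(2)]
    for k in range(n):
        total = add(total, mul(mul(power(x, n - 1 - k), dx), power(x, k)))
    return total

def d_negative_power(x, dx, n):
    """d(x^-n): minus the sum of x^-(n+1-k) · dx · x^-k for k = 1 .. n."""
    xi = inverse(x)
    total = [[Fraction(0)] * 2 for _ in range(2)]
    for k in range(1, n + 1):
        total = add(total, mul(mul(power(xi, n + 1 - k), dx), power(xi, k)))
    return neg(total)

# Illustrative values; dx need not be small for this algebraic identity.
x = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
dx = [[Fraction(1), Fraction(2)], [Fraction(0), Fraction(-1)]]

lhs = mul(power(x, 3), d_negative_power(x, dx, 3))                 # x³ · d(x⁻³)
rhs = neg(mul(d_positive_power(x, dx, 3), power(inverse(x), 3)))   # − d(x³) · x⁻³
```

Using fractions avoids rounding, so lhs and rhs can be compared with plain equality.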

These complicated differential expressions exemplify why this report uses the dot, and not juxtaposition, to indicate multiplication.

§6 Higher differentials. To find the differential of some expression that is already a differential is a purely mechanical procedure. As for notation, d(dx) can also be written as ddx or d²x; the last of these is the usual practice.

Example:

d²(x · y) = d(d(x · y))
= d(x · dy + dx · y)
= d(x · dy) + d(dx · y)
= (x · d²y + dx · dy) + (dx · dy + d²x · y)
= x · d²y + 2 · dx · dy + d²x · y

Some further differentials of products:

d³(x · y) = x · d³y
+ 3 · dx · d²y
+ 3 · d²x · dy
+ d³x · y

d⁴(x · y) = x · d⁴y
+ 4 · dx · d³y
+ 6 · d²x · d²y
+ 4 · d³x · dy
+ d⁴x · y

d⁵(x · y) = x · d⁵y
+ 5 · dx · d⁴y
+ 10 · d²x · d³y
+ 10 · d³x · d²y
+ 5 · d⁴x · dy
+ d⁵x · y

The coefficients can be found in Pascal's triangle.
These formulas do not simplify under comm-mult.
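The pattern in these formulas can be generated mechanically. This Python sketch is an added illustration, not the author's method: it uses the binomial coefficient from the standard library, and the names d_name and product_differential_terms are hypothetical.

```python
from math import comb

SUPERSCRIPTS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")

def d_name(order, var):
    """Name of the differential of the given order, e.g. d_name(2, 'x') is 'd²x'."""
    if order == 0:
        return var
    if order == 1:
        return "d" + var
    return "d" + str(order).translate(SUPERSCRIPTS) + var

def product_differential_terms(n):
    """Terms of dⁿ(x · y), with coefficients from row n of Pascal's triangle."""
    terms = []
    for k in range(n + 1):
        coeff = comb(n, k)
        factors = [d_name(k, "x"), d_name(n - k, "y")]
        if coeff != 1:
            factors.insert(0, str(coeff))
        terms.append(" · ".join(factors))
    return terms

# Print the terms of d⁵(x · y), one per line.
for term in product_differential_terms(5):
    print(term)
```

In every term the differential of x stays to the left of the differential of y, so the same output applies whether or not multiplication commutes.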

As another example, the first four differentials of x⁻¹ are shown on another page.

In a generalized discussion, d¹x is sometimes written for dx, and d⁰x for x. Also, d^m(dⁿx) = d^(m+n)x.

Strictly speaking, d^mxⁿ is not ambiguous; it indicates that the variable whose name is d^mx will be raised to the nth power. However, parentheses are suggested in order to prevent mistakes: (d^mx)ⁿ. For instance, d³x² = (d³x)² = d³x · d³x.

§7 Extricating a derivative. A lengthy explanation of how to do this is given on another page. What follows here is a summary.

An example begins with y as a function of x:

y = x · x · x · x

A major point of this report is the claim that the following are suitable as definitions for the first three derivatives of y with respect to x in a non-comm-mult algebra. They must of course be interpreted as implicit functions.

dy = x · x · x · dx
+ x · x · dx · x
+ x · dx · x · x
+ dx · x · x · x

d²y = 2 · x · x · dx · dx
+ 2 · x · dx · x · dx
+ 2 · dx · x · x · dx
+ 2 · x · dx · dx · x
+ 2 · dx · x · dx · x
+ 2 · dx · dx · x · x

d³y = 6 · x · dx · dx · dx
+ 6 · dx · x · dx · dx
+ 6 · dx · dx · x · dx
+ 6 · dx · dx · dx · x

If comm-mult is introduced, the above formulas simplify into well-known elementary forms:

The original formula: y = x⁴
First derivative: dy = 4 · x³ · dx → dy/dx = 4 · x³
Second derivative: d²y = 12 · x² · dx² → d²y/dx² = 12 · x²
Third derivative: d³y = 24 · x · dx³ → d³y/dx³ = 24 · x
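The non-comm-mult expansions of dy, d²y, and d³y above can be reproduced by treating each product as a word of symbols. The Python sketch below is an added illustration. It assumes that d(dx) = 0, which is consistent with the absence of d²x terms in those expansions but is an assumption of this sketch rather than a statement from the report; the function name differentiate is hypothetical.

```python
from collections import Counter

def differentiate(poly):
    """Apply d to a sum of words, each word a tuple of 'x' and 'dx' symbols.

    Product rule: each 'x' in turn is replaced by 'dx', with order kept.
    Assumption: d(dx) = 0, so existing 'dx' factors contribute nothing.
    """
    result = Counter()
    for word, coeff in poly.items():
        for i, symbol in enumerate(word):
            if symbol == "x":
                new_word = word[:i] + ("dx",) + word[i + 1:]
                result[new_word] += coeff
    return result

y = Counter({("x", "x", "x", "x"): 1})   # y = x · x · x · x
dy = differentiate(y)
d2y = differentiate(dy)
d3y = differentiate(d2y)

# Print d²y in the report's notation, one term per line.
for word, coeff in sorted(d2y.items()):
    print(coeff, "·", " · ".join(word))
```

Under comm-mult all words with the same number of dx factors coincide, so the coefficients simply add: the totals are 4 for dy, 12 for d²y, and 24 for d³y, matching the elementary forms.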

§8 Integrals. The differentials developed here are precisely what go under the integral sign.

Under comm-mult, recall d(x²) = 2 · x · dx. Then ∫ 2 · x · dx equals x² plus an arbitrary constant, as expected.

Under non-comm-mult, recall d(x²) = x · dx + dx · x. Then ∫ (x · dx + dx · x) equals x² plus an arbitrary constant, again as expected. Note, however, that ∫ dx · x and ∫ x · dx might not exist individually, even though their sum does. This is because the value of each, as a definite integral, could depend on the path of integration. With quaternions for example, which are non-comm-mult, this non-existence does occur.
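The path dependence can be seen in a small quaternion experiment. The Python sketch below is an added illustration under stated assumptions: quaternions are stored as (real, i, j, k) tuples, integrals are approximated by midpoint Riemann sums along piecewise-linear paths, and all names (qmul, integrate, path_a, path_b, and so on) are hypothetical.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qadd(p, q):
    return tuple(a + b for a, b in zip(p, q))

def qscale(s, q):
    return tuple(s * a for a in q)

def integrate(path, integrand, steps=100):
    """Midpoint Riemann sum of integrand(x, dx) along a piecewise-linear path."""
    total = (0.0, 0.0, 0.0, 0.0)
    for start, end in zip(path, path[1:]):
        delta = tuple(e - s for s, e in zip(start, end))
        dx = qscale(1.0 / steps, delta)
        for n in range(steps):
            x = qadd(start, qscale((n + 0.5) / steps, delta))
            total = qadd(total, integrand(x, dx))
    return total

ZERO = (0.0, 0.0, 0.0, 0.0)
I = (0.0, 1.0, 0.0, 0.0)
J = (0.0, 0.0, 1.0, 0.0)
END = (0.0, 1.0, 1.0, 0.0)

path_a = [ZERO, I, END]   # 0 → i → i + j
path_b = [ZERO, J, END]   # 0 → j → i + j

def x_dx(x, dx):
    return qmul(x, dx)                          # x · dx

def both(x, dx):
    return qadd(qmul(x, dx), qmul(dx, x))       # x · dx + dx · x

x_dx_a = integrate(path_a, x_dx)   # close to −1 + k
x_dx_b = integrate(path_b, x_dx)   # close to −1 − k
both_a = integrate(path_a, both)   # close to −2, which is (i + j)²
both_b = integrate(path_b, both)   # close to −2 as well
```

The two paths share their endpoints, yet the integral of x · dx alone differs between them in its k component, while the integral of the sum agrees and equals the change in x² from 0 to i + j.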

An open question is whether it is a good idea to write d⁻¹(x · dx + dx · x) for ∫ (x · dx + dx · x).

§9 Alternate notation. Some researchers might prefer something more compact. Consider:

− x⁻³ · dx · x⁻¹ − x⁻² · dx · x⁻² − x⁻¹ · dx · x⁻³

Substitute the single-letter name D = dx:

− x⁻³ · D · x⁻¹ − x⁻² · D · x⁻² − x⁻¹ · D · x⁻³

Now the dots for multiplication can be removed with no loss of clarity:

− x⁻³Dx⁻¹ − x⁻²Dx⁻² − x⁻¹Dx⁻³

Further names can be introduced, such as E = dy, F = dz, D² = d²x, and so forth.

Personal note: Many years ago, the author took an introductory calculus course that did not explain differentials thoroughly. Neither were they properly covered in his subsequent instruction, including a course in advanced calculus; hence this page.
