generalized multidimensional matrix multiplication

⊕⊗
A generalized multidimensional matrix multiplication.
Version of Tuesday 14 June 2016.
Dave Barber's other pages.

§ 1. Widely studied, and extensively used, is the matrix multiplication of elementary linear algebra. This operation takes two inputs that are two-dimensional (hereafter "2-D") matrices; the output is also a 2-D matrix.

Later we will define more precisely what a matrix is, but for now note that it houses components (which are often real numbers) in a rectangular grid. When 2-D, the components are said to be organized into rows and columns. To extend the applicability of multiplication:

a column vector of m elements is deemed 2-D with m rows and 1 column;
a row vector of n elements is deemed 2-D with 1 row and n columns;
a scalar can (but need not) be deemed 2-D with 1 row and 1 column.

Because matrices often contain many components, they are frequently manipulated by computer programs, often of a numerical nature. A programmer would use some sort of data structure, probably an array, to store the component information.

Can matrix multiplication be extended to matrices of three or more dimensions? Of course it can be, and it certainly has been. However, such operations are, relative to the 2-D version, infrequently seen. Moreover, there are various ways that multiplication for n-D matrices might be defined, and no one of them has risen to prominence. In this report, we offer a very general approach subsuming some of the definitions that already exist.

This report is an outgrowth of another project, the present author's mat_gen_dim, which developed an n-D array storage method for the C++ programming language. The manner of implementating therein the outer product, and of implementing n-D generalizations of contraction and the inner product, led to a broad definition of n-D matrix multiplication which deserved a separate mathematically-oriented description. The mat_gen_dim pages discuss the operations in a computer programming parlace, which reads altogether differently from this report.

As "generalized multidimensional matrix multiplication" is an unwieldy phrase, we will often use the symbol ⊕⊗ for the operation being defined. The rationale for this notation will become clear later.

Sections 2 through 5 of this report deal with preliminaries; ⊕⊗ itself is covered beginning in section 6. Note that we never denote any kind of multiplication by typographical juxtaposition, always preferring to use an explicit symbol of some sort.

§ 2. What is a matrix? A collection of components, typically numbers, which can be individually accessed by use of an index. "Individually accessed" means that reading or changing one component does not affect the others. Also, the value of one component does not constrain the set of possible values for other components.

An index is an ordered n-tuple of integers; without loss of generality we choose in this report to limit them to positive values. Every matrix has a fixed dimensionality, which is a nonnegative integer. If a matrix is n-dimensional, any index to be used with it must have exactly n elements. Note our terminology: matrices have components while indices have elements.

Many writers use the word subscript where we are using index, from the typographical custom of using subscripts for indices. We avoid that here because subscripts become difficult to read when indices are complicated, especially when subscripts themselves have subscripts. Another reason is that tensor algebra partitions an index's elements into opposing categories denoted by superscripts or subscripts, and we do not want to be appearing to suggest any tensorial interpretation. However, our approach to matrices can be used to carry out many operations of tensor algebra.

In this report, and in most applications elsewhere, matrices have finitely many components. Index elements reflect this, being restricted to certain values. When a matrix is created, not only is its dimensionality established, but also the ranges of its index elements; we use the double dot notation to denote an inclusive range of integers. For instance, this ordered n-tuple of index ranges:

(1 .. 3, 1 .. 5, 1 .. 4)

describes a 3-D matrix the index for any component of which must have three elements. The first element can be 1, 2, or 3; the second 1, 2, 3, 4, or 5; and the third 1, 2, 3, or 4. There are 60 = 3 × 5 × 4 possible combinations, all of which are valid, and each of which designates a different component of the matrix. Hence the matrix has 60 components, and this number will not change. We follow the popular mathematical convention that the minimum value of any index element can be 1; on the other hand are systems that always start with 0. Further, environments such as the Ada programming language and mat_gen_dim employ no universal base.

If matrix A has the index ranges above, a notation for a particular component is A[2, 1, 4], and for a general component is A[i₁, i₂, i₃]. Note the square brackets for component selection. Meanwhile, A[2, 1, 4, 7] is wrong for having too many components, and A[2, 9, 4] is wrong for being out of range in the middle element. If the dimensionality of matrix B is denoted as b, an index for B can be written as B[i₁, i₂ … i_b].

If matrix C has zero dimensions, its sole component is written C[ ]. Why one component and not zero? Because the number of components in a matrix is determined by multiplying the number of indices in each dimension, and the multiplicative identity for integers is one, not zero.

When all of a matrix's index ranges are equal, for instance (1 .. 7, 1 .. 7, 1 .. 7, 1 .. 7), we say that the matrix is equilateral. Any matrix of 0 or 1 dimensions is equilateral by convention. Although equilateral matrices are favored in tensor algebra, the matrix multiplication to be defined in this report has little need for for that characteristic. To emphasize that point, example matrices will be equilateral as little as possible.

§ 3. Two matrices are comorphic if they have the same dimensionality and the same ranges for respective index elements. Between two comorphic matrices, a component of the first corresponds to a component of the second if the two components have the same index.

Two comorphic matrices are equal if every component of the first equals the corresponding component of the second. Symbolically, A = B means that A[i₁, i₂ … i_d] = B[i₁, i₂ … i_d]. The mutual dimensionality of A and B is represented by d, and all i_n must (of course) be within their respective index ranges.

Two comorphic matrices can be added, giving an output comorphic with the inputs. Corresponding components of the inputs are added, and the sum placed in the corresponding location of the output. More briefly, components are added in parallel. Here is an example using an HTML table for some 2-D matrices:

2 −9 64
−3 14 x
+
5 1 36
9 18 y
=
7 −8 100
6 32 x + y

Symbolically, A + B = C means that A[i₁, i₂ … i_d] + B[i₁, i₂ … i_d] = C[i₁, i₂ … i_d].

For many operations, the notion of conformability can be defined. Broadly, it means that input matrices (and other items) have compatible dimensionalities, index ranges, et cetera. (Operations differ in their requirements, so "compatible" might not mean "equal".) In the case of addition, conformability is the same as comorphism.

Subtraction is completely analogous to addition, and frequently useful, but multiplication in parallel is almost never encountered. Valuable instead is multiplication of a matrix by a scalar, for example:

1 −2
5 −6
0 8
× 12 =
12 −24
60 −72
0 96

If y is a scalar, A × y = B has the component-by-component meaning A[i₁, i₂ … i_d] × y = B[i₁, i₂ … i_d]. Meanwhile, division of a matrix by a scalar amounts to multiplying the matrix by the scalar's reciprocal. Any attempt to define the opposite operation, dividing a scalar by a matrix, poses many questions not easily answered.

The multiplication of a matrix by a scalar is always conformable. Beyond that, the operation is distributive over addition; hence with matrices A and B, and with scalars y and z:

A × (y + z) = (A × y) + (A × z)
(A + B) × y = (A × y) + (B × z)

Negation is no problem: −A = A × −1.

§ 4. The outer product takes two matrices as input and delivers one as output. The input matrices need not be of the same dimensionality, nor need their index ranges match in any way. Indeed, the outer product exists for any two matrices; and by induction, for countably many matrices. Hence comformability is assured. We use the circle-times symbol as a prefix notation for this operation; thus the outer product of A and B is written ⊗(A, B).

In ⊗(A, B) = C, each component of A is multiplied by each component of B, and that product will be a component of C. If A has 15 components and B has 63, then C must have 15 × 63 = 945 components; the outer product can be quite a large matrix. The dimensionality of C is the sum of the dimensionalities of A and B. More specifically, the index of each component of C is the catenation of the indices of the contributing components of A and B:

A[i₁, i₂ … i_a] × B[j₁, j₂ … j_b] = C[i₁, i₂ … i_a, j₁, j₂ … j_b].

Similarly, the index ranges of C are catenative of the index ranges of A and B. A lengthy example of the outer product is on a separate page.

The outer product is associative, meaning this:

⊗(⊗(A, B), C) = ⊗(A, ⊗(B, C))

This it is unambiguous to write ⊗(A, B, C). Observe by contrast that commutativity fails; in other words ⊗(A, B) generally does not equal ⊗(B, A). They will have the same dimensionality, but they might not be comorphic. Still, the outer product does distribute over addition whenever the matrices are conformable for addition. This means:

⊗(A, B + C) = ⊗(A, B) + ⊗(A, C)

An identity for outer multiplication is the 0-D matrix whose sole component is the number 1. Denoting that as U (for unit), this means:

⊗(A, U) = A = ⊗(U, A)

The outer product has never gained much currency in the field of linear algebra. A likely reason is that although the outer product is often huge, it contains no more information than the factors that comprise it. Yet for us, it is a valuable steppingstone toward defining ⊕⊗.

§ 5. Besides the outer product, we need a contraction operation in order to establish ⊕⊗. As a preliminary we define the bunting, which is an ordered n-tuple of boolean values; we use Greek uppercase letters to represent buntings. Each boolean within the bunting is called a flag.

Contraction, which is tricky, involves three items:

input: a matrix, for brevity called X here in section 5 of this report;
input: a bunting, called Φ;
output: a matrix, called Y.

Of the two criteria to make contraction conformable, the first is that the number of flags in Φ must equal the dimensionality of X.

Individual flags can be addressed with square brackets: Φ[1], Φ[2], et cetera. The number of true flags is termed the dimensionality of the contraction. Our notation for contraction uses the circle-plus character in prefix position: ⊕(X, Φ) = Y.

Among the flags of Φ, the first is associated with the first of the X's index ranges, the second with the second, et cetera. Here is an example with 8-D input and 5-D output:

X's index ranges: ( 1 .. 5, 1 .. 4, 1 .. 5, 1 .. 6, 1 .. 5, 1 .. 8, 1 .. 4, 1 .. 5 )
↕ ↕ ↕ ↕ ↕ ↕ ↕ ↕
Φ: ( true, false, true, false, true, false, false, false )
↕ ↕ ↕ ↕ ↕
Y's index ranges: ( 1 .. 4, 1 .. 6, 1 .. 8, 1 .. 4, 1 .. 5 )

X's index ranges:	(	1 .. 5,	1 .. 4,	1 .. 5,	1 .. 6,	1 .. 5,	1 .. 8,	1 .. 4,	1 .. 5	)
		↕	↕	↕	↕	↕	↕	↕	↕
Φ:	(	true,	false,	true,	false,	true,	false,	false,	false	)
			↕		↕		↕	↕	↕
Y's index ranges:	(		1 .. 4,		1 .. 6,		1 .. 8,	1 .. 4,	1 .. 5	)

The second of the two criteria for conformability is that all the index ranges corresponding to true flags must be equal. By contrast, the index ranges corresponding to the false flags need not equal anything in particular.

Y is a matrix whose index ranges are collected from those of X corresponding to the false flags, in this example (1 .. 4, 1 .. 6, 1 .. 8, 1 .. 4, 1 .. 5), so Y is a 5-D matrix. On the other hand, X's index ranges corresponding to trues are absorbed by the operation. Useful fact:

The dimensionality of the input matrix
minus
the dimensionality of the contraction
equals
the dimensionality of the output matrix.

Each component of Y is a summation of certain components of X. Specifically, every component of X satisfying the following two properties is an addend:

the X component's indices in the false positions equal the Y component's indices;
the X component's indices in the true positions all equal one another.

Continuing the example above, here is the addition formula:

Y[i₂, i₄, i₆, i₇, i₈] = X[1, i₂, 1, i₄, 1, i₆, i₇, i₈]
+ X[2, i₂, 2, i₄, 2, i₆, i₇, i₈]
+ X[3, i₂, 3, i₄, 3, i₆, i₇, i₈]
+ X[4, i₂, 4, i₄, 4, i₆, i₇, i₈]
+ X[5, i₂, 5, i₄, 5, i₆, i₇, i₈]

Remarks:

The no-true contraction is always legal, in other words, conformable. The one-true contraction is legal if X has at least one dimension. With more trues, index ranges must match.
If there are no trues, then Y will have the same dimensionality and index ranges as X, and will be filled with the additive identity, namely zero.
If the number of trues is more than 1, then some components of X will not be used.
No component of X will be an addend for more than one component of Y.
If there are no falses, then the sole component of Y will be the sum of the components of X's principal diagonal; this is a generalized trace.
Unless X is filled with zeroes, no combination of flags can cause Y to equal X.
Two 1-D contractions in succession generally give a result different from one 2-D contraction, and similarly for higher dimensionalities of contraction.
Contraction is distributive over addition: ⊕(A + B, Φ) = ⊕(A, Φ) + ⊕(B, Φ).

If several contractions are performed on a matrix, then in general the sequence in which they are performed will affect the result. However, the result can be independent of the sequence when the contractions are performed on separate indices. For instance:

Let matrix A have the index ranges (1 .. 2, 1 .. 3, 1 .. 4, 1 .. 3, 1 .. 2).
Form matrix B₁ by contracting A in the two dimensions that have subscript range 1 .. 3 (the second and fourth indices).
Form matrix B₂ by contracting B₁ in the one dimension that has subscript range 1 .. 4 (now the second index, but originally the third).
Form matrix C₁ by contracting A in the one dimension that has subscript range 1 .. 4 (the third index).
Form matrix C₂ by contracting C₁ in the two dimensions that have subscript range 1 .. 3 (now the second and third indices, but originally the second and fourth).

Then B₂ = C₂. When a matrix is equilateral, or more nearly so, this principle still applies; but care is required to keep track of which indices are involved in which contractions, because indices are shifting to the left.

§ 6. Now the background is laid for the definition of how to perform our generalized multidimensional matrix multiplication, which is a simple two-step procedure:

Calculate the outer product of any two or more matrices.
Perform any conformable contraction on that product.

A notation for the inner product is thus:

⊕(⊗(A, B), Φ)

but fewer parentheses will suffice:

⊕⊗(A, B, Φ)

and with more matrices, it becomes:

⊕⊗(A, B, C, Φ) = ⊕(⊗(A, B, C), Φ)
⊕⊗(A, B, C, D, Φ) = ⊕(⊗(A, B, C, D), Φ)
et cetera

In fact, a contraction by itself can be regarded as a boundary case of ⊕⊗, because the unit U can be introduced as a second factor:

⊕(A, Φ) = ⊕(⊗(A, U), Φ) = ⊕⊗(A, U, Φ)

Note that the ⊗ symbol was chosen for the outer product because its internal operation is multiplication; and ⊕ was chosen for contraction because its internal operation is addition. We suggest retaining the ⊕⊗ symbol sequence for this generalization of the multidimensional matrix product, because the characters are distinctive and they emphasize the bipartite nature of the operation; and because many kinds of matrix multiplication have been defined elsewhere, with notations of all sort.

Because ⊗ and ⊕ are individually distributive over addition, so is ⊕⊗:

⊕⊗(A, B + C, Φ) = ⊕⊗(A, B, Φ) + ⊕⊗(A, C, Φ)

The author is open to suggestions about how to pronounce ⊕⊗. Incidentally, the HTML notation for it is:

&oplus;&otimes;

§ 7. Any two or more n-D matrices have an outer product, and for any matrix there exists a bunting enabling a contraction. Therefore ⊕⊗ exists for any two or more matrices; depending on how many of their index ranges match, there may be several (but finitely many) valid ⊕⊗s, each with a different bunting. The choice of buntings is governed by the rules for conformability.

This guarantee of existence, and potential for multiple values, is one justification for characterizing ⊕⊗ as generalized multidimensional matrix multiplication. Another justification is that ⊕⊗ subsumes ordinary matrix multiplication. Here is an example of that:

Let Φ = (false, true, true, false).

Input matrix A:

A[1, 1] = 2 A[1, 2] = 3 A[1, 3] = −1
A[2, 1] = 4 A[2, 2] = −2 A[2, 3] = 5

Input matrix B:

B[1, 1] = 2 B[1, 2] = −1 B[1, 3] = 0 B[1, 4] = 6
B[2, 1] = 1 B[2, 2] = 3 B[2, 3] = −5 B[2, 4] = 1
B[3, 1] = 4 B[3, 2] = 1 B[3, 3] = −2 B[3, 4] = 2

Intermediate matrix C = ⊗(A, B):

C[1, 1, 1, 1] = 4 C[1, 1, 1, 2] = −2 C[1, 1, 1, 3] = 0 C[1, 1, 1, 4] = 12
C[1, 1, 2, 1] = 2 C[1, 1, 2, 2] = 6 C[1, 1, 2, 3] = −10 C[1, 1, 2, 4] = 2
C[1, 1, 3, 1] = 8 C[1, 1, 3, 2] = 2 C[1, 1, 3, 3] = −4 C[1, 1, 3, 4] = 4
C[1, 2, 1, 1] = 6 C[1, 2, 1, 2] = −3 C[1, 2, 1, 3] = 0 C[1, 2, 1, 4] = 18
C[1, 2, 2, 1] = 3 C[1, 2, 2, 2] = 9 C[1, 2, 2, 3] = −15 C[1, 2, 2, 4] = 3
C[1, 2, 3, 1] = 12 C[1, 2, 3, 2] = 3 C[1, 2, 3, 3] = −6 C[1, 2, 3, 4] = 6
C[1, 3, 1, 1] = −2 C[1, 3, 1, 2] = 1 C[1, 3, 1, 3] = 0 C[1, 3, 1, 4] = −6
C[1, 3, 2, 1] = −1 C[1, 3, 2, 2] = −3 C[1, 3, 2, 3] = 5 C[1, 3, 2, 4] = −1
C[1, 3, 3, 1] = −4 C[1, 3, 3, 2] = −1 C[1, 3, 3, 3] = 2 C[1, 3, 3, 4] = −2
C[2, 1, 1, 1] = 8 C[2, 1, 1, 2] = −4 C[2, 1, 1, 3] = 0 C[2, 1, 1, 4] = 24
C[2, 1, 2, 1] = 4 C[2, 1, 2, 2] = 12 C[2, 1, 2, 3] = −20 C[2, 1, 2, 4] = 4
C[2, 1, 3, 1] = 16 C[2, 1, 3, 2] = 4 C[2, 1, 3, 3] = −8 C[2, 1, 3, 4] = 8
C[2, 2, 1, 1] = −4 C[2, 2, 1, 2] = 2 C[2, 2, 1, 3] = 0 C[2, 2, 1, 4] = −12
C[2, 2, 2, 1] = −2 C[2, 2, 2, 2] = −6 C[2, 2, 2, 3] = 10 C[2, 2, 2, 4] = −2
C[2, 2, 3, 1] = −8 C[2, 2, 3, 2] = −2 C[2, 2, 3, 3] = 4 C[2, 2, 3, 4] = −4
C[2, 3, 1, 1] = 10 C[2, 3, 1, 2] = −5 C[2, 3, 1, 3] = 0 C[2, 3, 1, 4] = 30
C[2, 3, 2, 1] = 5 C[2, 3, 2, 2] = 15 C[2, 3, 2, 3] = −25 C[2, 3, 2, 4] = 5
C[2, 3, 3, 1] = 20 C[2, 3, 3, 2] = 5 C[2, 3, 3, 3] = −10 C[2, 3, 3, 4] = 10

Output matrix D = ⊕(C, Φ) = ⊕(⊗(A, B), Φ) = ⊕⊗(A, B, Φ) = traditional matrix product of A and B:

D[1, 1] = 3 D[1, 2] = 6 D[1, 3] = −13 D[1, 4] = 13
D[2, 1] = 26 D[2, 2] = −5 D[2, 3] = 0 D[2, 4] = 32

Elements of C that are completely ignored in the contraction are shaded in red in the table above; those used are shaded in green. This highlights the fact that, although the most convenient way to define ⊕⊗ is by way of an outer product followed by a contraction, such is far from the most efficient way to implement ⊕⊗. In matrices of practical size, the proportion of wasted calculation can easily exceed 99 percent. The mat_gen_dim software uses a direct approach to figure ⊕⊗ for two matrix inputs, the key function there being called inner_product. There, unused products are not calculated.

Each component of D is the dot product of one row of A and one column of B. Nomenclature varies, but the dot product is often termed "the" inner product or the "standard" inner product. Thus ordinary matrix multiplication, which yields a matrix full of dot products, might be regarded as a multidimensional generalization of the inner product. That explains why the mat_gen_dim software refers to ⊕⊗ by the name inner_product, although another reason is that the C++ language in which the software is written does not allow characters such as ⊕ and ⊗ in the names of functions. Our avenue of generalizing the inner product, by increasing the dimensionality, is separate from, but not in conflict with, the approach employed in the study of inner product spaces.

§ 8. Matrices can be associative under ⊕⊗ with suitable choices of flags. Although a general theory has yet to emerge, examples are plentiful. Consider these ordinary matrices:

Matrix A with index ranges (1 .. 4, 1 .. 4)
Matrix B with index ranges (1 .. 3, 1 .. 3, 1 .. 3, 1 .. 4)
Matrix C with index ranges (1 .. 3, 1 .. 4, 1 .. 4)

The components of these matrices need not have any particular values. The aim is to find four buntings (Ψ₁, Ψ₂, Θ₁, and Θ₂), which need not have any particular relationship among one another, that will not only establish conformability in the following expression but that will also make the equation true:

⊕⊗(⊕⊗(A, B, Ψ₁), C, Ψ₂) = ⊕⊗(A, ⊕⊗(B, C, Θ₁), Θ₂)

A simple computer search turned up dozens of quadruples of buntings that result in associativity, including these five:

Yielding a 1-D matrix:

Ψ₁ = ( false, false, true, true, true, false )
Ψ₂ = ( true, true, true, false, true, true )
Θ₁ = ( true, true, true, false, false, false, false )
Θ₂ = ( true, true, true, false, true, true )

Yielding a 2-D matrix:

Ψ₁ = ( true, true, false, false, false, true )
Ψ₂ = ( true, true, true, true, false, false )
Θ₁ = ( true, true, true, false, true, false, false )
Θ₂ = ( true, true, true, false, false )

Yielding a 3-D matrix:

Ψ₁ = ( false, false, true, true, false, false )
Ψ₂ = ( true, false, false, true, false, true, true )
Θ₁ = ( true, true, false, false, false, false, false )
Θ₂ = ( true, false, false, true, false, true, true )

Yielding a 4-D matrix:

Ψ₁ = ( false, false, true, false, true, false )
Ψ₂ = ( true, false, false, true, false, true, false )
Θ₁ = ( true, false, true, false, false, false, false )
Θ₂ = ( true, false, false, true, false, true, false )

Yielding a 5-D matrix:

Ψ₁ = ( false, false, false, true, true, false )
Ψ₂ = ( true, false, false, false, false, true, false )
Θ₁ = ( false, true, true, false, false, false, false )
Θ₂ = ( true, false, false, false, false, true, false )

Aside from the trivial case of making all flags false, unknown is whether five buntings to satisfy this three-member equation exist:

⊕(⊗(A, B, C), Ω) = ⊕⊗(⊕⊗(A, B, Ψ₁), C, Ψ₂) = ⊕⊗(A, ⊕⊗(B, C, Θ₁), Θ₂)

There is a family of associative extensions to the ordinary 2-D matrix multiplication that appears in section 7 above. Let A, B, and C be matrices all of the same dimensionality d. Let Φ be a bunting with 2 × d elements, half of them true and half false; with all the true flags occupying consecutive positions. Then

⊕⊗(⊕⊗(A, B, Φ), C, Φ) = ⊕⊗(A, ⊕⊗(B, C, Φ), Φ)

whenever the individual ⊕⊗ operations are conformable; in other words, whenever the left member and right member exist. To illustrate, here is a table of the five configurations from the 4-D case, where p, q, r and s are positive integers:

If these matrices have
these index ranges: and if: then:
A₀(1 .. q, 1 .. q, 1 .. q, 1 .. q)
B₀(1 .. r, 1 .. r, 1 .. r, 1 .. r)
C₀(1 .. s, 1 .. s, 1 .. s, 1 .. s) Φ₀ =
(true, true, true, true,
false, false, false, false) ⊕⊗(⊕⊗(A₀, B₀, Φ₀), C₀, Φ₀)
=
⊕⊗(A₀, ⊕⊗(B₀, C₀, Φ₀), Φ₀)
A₁(1 .. p, 1 .. q, 1 .. q, 1 .. q)
B₁(1 .. q, 1 .. r, 1 .. r, 1 .. r)
C₁(1 .. r, 1 .. s, 1 .. s, 1 .. s) Φ₁ =
(false, true, true, true,
true, false, false, false) ⊕⊗(⊕⊗(A₁, B₁, Φ₁), C₁, Φ₁)
=
⊕⊗(A₁, ⊕⊗(B₁, C₁, Φ₁), Φ₁)
A₂(1 .. p, 1 .. p, 1 .. q, 1 .. q)
B₂(1 .. q, 1 .. q, 1 .. r, 1 .. r)
C₂(1 .. r, 1 .. r, 1 .. s, 1 .. s) Φ₂ =
(false, false, true, true,
true, true, false, false) ⊕⊗(⊕⊗(A₂, B₂, Φ₂), C₂, Φ₂)
=
⊕⊗(A₂, ⊕⊗(B₂, C₂, Φ₂), Φ₂)
A₃(1 .. p, 1 .. p, 1 .. p, 1 .. q)
B₃(1 .. q, 1 .. q, 1 .. q, 1 .. r)
C₃(1 .. r, 1 .. r, 1 .. r, 1 .. s) Φ₃ =
(false, false, false, true,
true, true, true, false) ⊕⊗(⊕⊗(A₃, B₃, Φ₃), C₃, Φ₃)
=
⊕⊗(A₃, ⊕⊗(B₃, C₃, Φ₃), Φ₃)
A₄(1 .. p, 1 .. p, 1 .. p, 1 .. p)
B₄(1 .. q, 1 .. q, 1 .. q, 1 .. q)
C₄(1 .. r, 1 .. r, 1 .. r, 1 .. r) Φ₄ =
(false, false, false, false,
true, true, true, true) ⊕⊗(⊕⊗(A₄, B₄, Φ₄), C₄, Φ₄)
=
⊕⊗(A₄, ⊕⊗(B₄, C₄, Φ₄), Φ₄)

If these matrices have these index ranges:	and if:	then:
A₀(1 .. q, 1 .. q, 1 .. q, 1 .. q) B₀(1 .. r, 1 .. r, 1 .. r, 1 .. r) C₀(1 .. s, 1 .. s, 1 .. s, 1 .. s)	Φ₀ = (true, true, true, true, false, false, false, false)	⊕⊗(⊕⊗(A₀, B₀, Φ₀), C₀, Φ₀) = ⊕⊗(A₀, ⊕⊗(B₀, C₀, Φ₀), Φ₀)
A₁(1 .. p, 1 .. q, 1 .. q, 1 .. q) B₁(1 .. q, 1 .. r, 1 .. r, 1 .. r) C₁(1 .. r, 1 .. s, 1 .. s, 1 .. s)	Φ₁ = (false, true, true, true, true, false, false, false)	⊕⊗(⊕⊗(A₁, B₁, Φ₁), C₁, Φ₁) = ⊕⊗(A₁, ⊕⊗(B₁, C₁, Φ₁), Φ₁)
A₂(1 .. p, 1 .. p, 1 .. q, 1 .. q) B₂(1 .. q, 1 .. q, 1 .. r, 1 .. r) C₂(1 .. r, 1 .. r, 1 .. s, 1 .. s)	Φ₂ = (false, false, true, true, true, true, false, false)	⊕⊗(⊕⊗(A₂, B₂, Φ₂), C₂, Φ₂) = ⊕⊗(A₂, ⊕⊗(B₂, C₂, Φ₂), Φ₂)
A₃(1 .. p, 1 .. p, 1 .. p, 1 .. q) B₃(1 .. q, 1 .. q, 1 .. q, 1 .. r) C₃(1 .. r, 1 .. r, 1 .. r, 1 .. s)	Φ₃ = (false, false, false, true, true, true, true, false)	⊕⊗(⊕⊗(A₃, B₃, Φ₃), C₃, Φ₃) = ⊕⊗(A₃, ⊕⊗(B₃, C₃, Φ₃), Φ₃)
A₄(1 .. p, 1 .. p, 1 .. p, 1 .. p) B₄(1 .. q, 1 .. q, 1 .. q, 1 .. q) C₄(1 .. r, 1 .. r, 1 .. r, 1 .. r)	Φ₄ = (false, false, false, false, true, true, true, true)	⊕⊗(⊕⊗(A₄, B₄, Φ₄), C₄, Φ₄) = ⊕⊗(A₄, ⊕⊗(B₄, C₄, Φ₄), Φ₄)

The expression defining associativity:

⊕⊗(⊕⊗(A, B, Φ), C, Φ) = ⊕⊗(A, ⊕⊗(B, C, Φ), Φ)

is complicated, and a simpler notation comes to mind:

⊕⊗(A, B, C, Φ)

However, there is a danger that this latter expression would be confused with

⊕(⊗(A, B, C), Φ)

which is one contraction of the outer product of three matrices, as opposed to two contractions each being of an outer product of two matrices.

It follows from the table above that when an n-D matrix A is equilateral, there are at least n + 1 choices for a bunting Φ that will make the following true:

⊕⊗(⊕⊗(A, A, Φ), A, Φ) = ⊕⊗(A, ⊕⊗(A, A, Φ), Φ)

Thus equilateral matrices exhibit power associativity. Once a particular bunting is chosen, it makes sense to talk about raising A to a positive integer power. A notation for this is:

⊕⊗(A¹, Φ) = A
⊕⊗(A², Φ) = ⊕⊗(A, A, Φ)
⊕⊗(A^m+1, Φ) = ⊕⊗(A^m, A, Φ)

At this point, we can employ a Taylor series to define a sine function:

sin (A, Φ) = ⊕⊗(A¹, Φ) ÷ 1!
− ⊕⊗(A³, Φ) ÷ 3!
+ ⊕⊗(A⁵, Φ) ÷ 5!
− ⊕⊗(A⁷, Φ) ÷ 7!
et cetera

If we regard the components of A as a collection of variables, then ⊕⊗(Aⁿ, Φ) becomes an nth-degree polynomial in those variables. Within the terms of the series, the dividends thus grow exponentially, but the divisors grow factorially, so all the respective matrix components will ultimately converge.

The cosine is more difficult than the sine, because the usual expansion requires adding a (two-sided) multiplicative identity value, which we have not yet established because among matrices such an identity usually fails to exist. To see more of what is going on, we need to introduce the one-sided identity:

L is a left identity for a particular Φ when ⊕⊗(L, A, Φ) for all A.
R is a right identity for a particular Φ when ⊕⊗(A, R, Φ) for all A.

Some functions have more than one right identity, or more than one left identity. If a function has both a left identity and a right identity, they are equal, and there are no other identities.

For an example of what can happen, define the matrix M as what one might suppose an identity matrix to be when its index ranges are (1 .. n, 1 .. n, 1 .. n):

M[1, 1, 1] = M[2, 2, 2] = … = M[n, n, n] = 1
M[others] = 0

For ⊕⊗, M is a right identity when Φ = (false, false, true, true, true, false), and is a left identity when Φ = (false, true, true, true, false, false), but neither case has a two-sided identity.

This can be generalized to d dimensions, with the obvious modification to M, and with writing superscripts to indicate repeated flags in the buntings:

M is a right identity for ⊕⊗ when Φ = (false^d−1, true^d, false¹).
M is a left identity for ⊕⊗ when Φ = (false¹, true^d, false^d−1).

In two dimensions, the left and right identities merge, and the cosine becomes possible.

Beyond these, it is difficult to find any useful or interesting identity values in the equilateral associative environment.

This is obvious:

⊕⊗(⊕⊗(A, B, Φ), C, Φ) = ⊕⊗(A, ⊕⊗(B, C, Φ), Φ)
implies
⊕⊗(⊕⊗(A, A, Φ), A, Φ) = ⊕⊗(A, ⊕⊗(A, A, Φ), Φ)

Surprisingly, numerical experimentation suggests the converse:

⊕⊗(⊕⊗(A, A, Φ), A, Φ) = ⊕⊗(A, ⊕⊗(A, A, Φ), Φ)
is conjected to imply
⊕⊗(⊕⊗(A, B, Φ), C, Φ) = ⊕⊗(A, ⊕⊗(B, C, Φ), Φ)

§ 9. Consider for example:

matrix A with index ranges (1 .. 3, 1 .. 5, 1 .. 2, 1 .. 2)
matrix B with index ranges (1 .. 6, 1 .. 4)
Φ = (false, false, true, true; false, false)

In Φ a semicolon has been substituted for one of the commas. This does not affect the meaning of the bunting, but it informally separates the four flags that correspond to index ranges of A from the two associated with B. With that in mind, define

Φ_A = (false, false, true, true)
Φ_B = (false, false)

Because Φ_B has no trues, we obtain this:

⊕⊗(A, B, Φ) = ⊕(⊗(A, B), Φ) = ⊗(⊕(A, Φ_A), B)

In the left and middle members ⊗ is performed before ⊕, but in the right member ⊕ occurs before ⊗. Also, in the right member B no longer contributes to the contraction.

Naturally, the interchange of subordinate operations still works if the left operand of ⊕⊗ is trueless. With:

Ψ = (false, false; false, false, true, true)
Ψ_A = (false, false, true, true)

this follows:

⊕⊗(B, A, Ψ) = ⊗(B, ⊕(A, Ψ_A))

§ 10. The reader may feel that notation such as this:

⊕⊗(⊕⊗(A, B, Ψ₁), C, Ψ₂) = ⊕⊗(A, ⊕⊗(B, C, Θ₁), Θ₂)

is cumbersome. In response, observe that all three of the following would almost have to mean the same thing:

⊕⊗(A, B, Φ)
⊕⊗(A, Φ, B)
⊕⊗(Φ, A, B)

This prompts the following condensations:

⊕⊗(⊕⊗(A, B, Ψ₁), C, Ψ₂) ⇒ ⊕⊗(A, B, Ψ₁, C, Ψ₂)
⊕⊗(A, ⊕⊗(B, C, Θ₁), Θ₂) ⇒ ⊕⊗(Θ₂, A, Θ₁, B, C)

If in some context the inputs of ⊕⊗ will always be exactly one bunting and two matrices, then the bunting could serve as an infix operator symbol:

⊕⊗(A, B, Φ) ⇒ A Φ B
⊕⊗(⊕⊗(A, B, Ψ₁), C, Ψ₂) ⇒ (A Ψ₁ B) Ψ₂ C
⊕⊗(A, ⊕⊗(B, C, Θ₁), Θ₂) ⇒ A Θ₂ (B Θ₁ C)

In these briefer notations, care must be exercised if A, B, or C is a 1-D matrix whose components happen to be booleans instead of numbers; then the matrix could be mistaken for a bunting.

Y[i₂, i₄, i₆, i₇, i₈]	=	X[1, i₂, 1, i₄, 1, i₆, i₇, i₈]
	+	X[2, i₂, 2, i₄, 2, i₆, i₇, i₈]
	+	X[3, i₂, 3, i₄, 3, i₆, i₇, i₈]
	+	X[4, i₂, 4, i₄, 4, i₆, i₇, i₈]
	+	X[5, i₂, 5, i₄, 5, i₆, i₇, i₈]

A[1, 1] = 2	A[1, 2] = 3	A[1, 3] = −1
A[2, 1] = 4	A[2, 2] = −2	A[2, 3] = 5

B[1, 1] = 2	B[1, 2] = −1	B[1, 3] = 0	B[1, 4] = 6
B[2, 1] = 1	B[2, 2] = 3	B[2, 3] = −5	B[2, 4] = 1
B[3, 1] = 4	B[3, 2] = 1	B[3, 3] = −2	B[3, 4] = 2

C[1, 1, 1, 1] = 4	C[1, 1, 1, 2] = −2	C[1, 1, 1, 3] = 0	C[1, 1, 1, 4] = 12
C[1, 1, 2, 1] = 2	C[1, 1, 2, 2] = 6	C[1, 1, 2, 3] = −10	C[1, 1, 2, 4] = 2
C[1, 1, 3, 1] = 8	C[1, 1, 3, 2] = 2	C[1, 1, 3, 3] = −4	C[1, 1, 3, 4] = 4
C[1, 2, 1, 1] = 6	C[1, 2, 1, 2] = −3	C[1, 2, 1, 3] = 0	C[1, 2, 1, 4] = 18
C[1, 2, 2, 1] = 3	C[1, 2, 2, 2] = 9	C[1, 2, 2, 3] = −15	C[1, 2, 2, 4] = 3
C[1, 2, 3, 1] = 12	C[1, 2, 3, 2] = 3	C[1, 2, 3, 3] = −6	C[1, 2, 3, 4] = 6
C[1, 3, 1, 1] = −2	C[1, 3, 1, 2] = 1	C[1, 3, 1, 3] = 0	C[1, 3, 1, 4] = −6
C[1, 3, 2, 1] = −1	C[1, 3, 2, 2] = −3	C[1, 3, 2, 3] = 5	C[1, 3, 2, 4] = −1
C[1, 3, 3, 1] = −4	C[1, 3, 3, 2] = −1	C[1, 3, 3, 3] = 2	C[1, 3, 3, 4] = −2
C[2, 1, 1, 1] = 8	C[2, 1, 1, 2] = −4	C[2, 1, 1, 3] = 0	C[2, 1, 1, 4] = 24
C[2, 1, 2, 1] = 4	C[2, 1, 2, 2] = 12	C[2, 1, 2, 3] = −20	C[2, 1, 2, 4] = 4
C[2, 1, 3, 1] = 16	C[2, 1, 3, 2] = 4	C[2, 1, 3, 3] = −8	C[2, 1, 3, 4] = 8
C[2, 2, 1, 1] = −4	C[2, 2, 1, 2] = 2	C[2, 2, 1, 3] = 0	C[2, 2, 1, 4] = −12
C[2, 2, 2, 1] = −2	C[2, 2, 2, 2] = −6	C[2, 2, 2, 3] = 10	C[2, 2, 2, 4] = −2
C[2, 2, 3, 1] = −8	C[2, 2, 3, 2] = −2	C[2, 2, 3, 3] = 4	C[2, 2, 3, 4] = −4
C[2, 3, 1, 1] = 10	C[2, 3, 1, 2] = −5	C[2, 3, 1, 3] = 0	C[2, 3, 1, 4] = 30
C[2, 3, 2, 1] = 5	C[2, 3, 2, 2] = 15	C[2, 3, 2, 3] = −25	C[2, 3, 2, 4] = 5
C[2, 3, 3, 1] = 20	C[2, 3, 3, 2] = 5	C[2, 3, 3, 3] = −10	C[2, 3, 3, 4] = 10

D[1, 1] = 3	D[1, 2] = 6	D[1, 3] = −13	D[1, 4] = 13
D[2, 1] = 26	D[2, 2] = −5	D[2, 3] = 0	D[2, 4] = 32

sin (A, Φ)	= ⊕⊗(A¹, Φ) ÷ 1!
	− ⊕⊗(A³, Φ) ÷ 3!
	+ ⊕⊗(A⁵, Φ) ÷ 5!
	− ⊕⊗(A⁷, Φ) ÷ 7!
	et cetera