COMBINATORIAL MUSIC THEORY
Journal of the Audio Engineering Society, vol. 39, pp. 427-448. (1991 June).
© 1991, Audio Engineering Society and Andrew Duncan. All rights reserved.

Andrew Duncan
aduncan@cs.ucsb.edu
71035.1100@compuserve.com
[Graphs (this part) |
Scales and Chords |
The Fingerboard |
Symmetries]
SYNOPSIS
Musical patterns may be investigated with the mathematical tools more
commonly applied in science and engineering. For example, the cyclic
autocorrelation of a musical scale describes its interval content. Fingering
patterns on string instruments are embedded in a space with an unusual
topology. Ideas from crystallography may be applied to the description of
structure-preserving transformations of melodies. These phenomena are
explored for the particularly common case of the twelve-note equally-tempered
scale.
"Nothing can be farther from the working musician's mind than
counting, nothing farther from the working mathematician's mind than singing, and
yet there is something common to both." [1]
0 INTRODUCTION
The purpose of this paper is to examine some unique properties of commonly used
musical patterns. It assumes some familiarity with musical ideas: octaves and
intervals, the major scale and the concept of key. It also assumes some knowledge
of abstract algebra: elementary number theory, groups, and graphs. In order to
address a readership of wide-ranging backgrounds, I present these ideas with the
mathematical and musical aspects evolving in parallel.
We will see that there is a natural way to describe the internal structure of a
musical scale that is closely related to the process of autocorrelation
used in digital signal theory. Attempting to pull musical patterns into a second
dimension will reveal that these patterns may be thought of as embedded in a
12-point space with peculiar connectivity. The automorphisms, or
structure-preserving self-mappings of this space, will correspond to precisely
those musical transformations that preserve melodic and harmonic relations
between notes, or interchange those relations. These mappings are analogous to
the symmetry groups of tilings or crystals.
Where a term is used in a more restrictive sense than is common, or already has a
well-defined or circumscribed meaning, it appears (where defined) in bold.
In a case where I would stress or accent (or shout) a word when describing these
ideas verbally, the word is in italics. I hope this does not make for a
bouncy ride.
1 SOUNDS
Sounds are vibrations in the air: variations in air pressure about a mean. The
average ambient air pressure is roughly 100,000 Newtons of force per square meter of
area, and the variations caused by ordinary conversation are about seven orders
of magnitude (powers of ten) smaller. We will consider a sound to be represented
by a real function of time describing this variation. In most cases of interest,
this function can be broken up into a sum of pure sine waves of different
frequencies. These sine waves are the components of the sound. The term
frequency refers to the number of times per second that the oscillation
goes through a full cycle. Frequency is measured in cycles per second, or
Hertz (Hz). It is conventional to describe the range of the human ear as
20 Hz - 20,000 Hz (20 kHz), although most people's hearing falls rapidly after 15
kHz or so. For comparison, the notes on a piano range from 27 Hz (low A) to 4.2
kHz (highest C).
In certain cases of particular interest, such as sounds produced by vibrating
strings, columns of air, or reeds, the frequencies of the sound's components are
related in a particularly simple way: they are all integer multiples of a base
frequency. The sinusoidal component with this lowest frequency is called the
fundamental of the sound. When we refer to the frequency of a sound, we
mean the frequency of the fundamental. For example, although the fundamental
frequency of the piano's middle C is 261 Hz, there are also components at higher
frequencies that give "body" to the note.
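As a minimal sketch of such a sound, the following Python fragment builds a sum of sinusoids at integer multiples of the 261 Hz fundamental quoted above. The amplitudes (1/k for the k-th component) and the function names are assumptions chosen purely for illustration; the text does not specify them.

```python
import math

F0 = 261.0  # fundamental of middle C in Hz, as quoted in the text

# Hypothetical amplitudes: 1/k for the k-th component, for illustration only.
AMPLITUDES = [1.0, 1.0 / 2, 1.0 / 3, 1.0 / 4]


def harmonic_frequencies(f0, count):
    """Frequencies of the first `count` components: integer multiples of f0."""
    return [k * f0 for k in range(1, count + 1)]


def pressure(t, f0=F0, amplitudes=AMPLITUDES):
    """Pressure variation at time t (seconds) as a sum of sine components."""
    return sum(
        a * math.sin(2 * math.pi * k * f0 * t)
        for k, a in enumerate(amplitudes, start=1)
    )


print(harmonic_frequencies(F0, 4))  # [261.0, 522.0, 783.0, 1044.0]
```

Because every component is an integer multiple of the fundamental, the summed waveform repeats once every 1/261 of a second, which is why the fundamental alone fixes the frequency we assign to the sound.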
We use the measurement of frequency to give an ordering to the set of all
sinusoids, and to a large number of sounds, via their fundamental component. This
ordering corresponds closely to our intuitive notion of pitch:
"low-pitched" sounds have low frequencies, and so with high. Perception is
somewhat more complicated, but the ordering that frequency brings is of
fundamental importance to music theory.
2 THE TWELVE NOTES
The "universe" in which we work can be defined in a multi-step process. First,
we define the most fundamental musical relationship: the octave. Two
frequencies are said to be separated by an octave when their ratio is 2:1. We say
that 2 Hz is an octave above 1 Hz. To the ear, notes separated by any number of
octaves sound somehow "the same". (There are well-known physical reasons for
this, which we will take as given.) Fig. 1 shows the frequency axis. (Note that
this figure is a fractal: it looks "the same" at all levels of magnification.)

Fig. 1
Any point on this line corresponds to some frequency, some pitch. In this way,
frequency is used as a coordinate to locate pitches in an absolute way. On
the axis are marked all the notes related by octaves to the frequency of 1 Hz.
These frequencies are all powers of two: those frequencies greater than 1 Hz have
positive exponents; those less than one have negative exponents. These pitches
are our first landmarks on the frequency axis. Note that they are certainly not
the only available frequencies. For the moment all frequencies are
equally accessible.

Fig. 2
We next proceed to deform the frequency axis and its labels in various convenient
ways. In Fig. 2a, we have taken to using the exponents (of two) to represent
the frequencies, rather than the frequencies themselves. This has the effect of
converting multiplication into addition: the movement of an octave is now the
addition or subtraction of 1. This is really a very fundamental change.
We do this to conform to the ear's feeling
that movement of one octave constitutes a particular size "step" or jump, which
is always the same "size" wherever it occurs. Algebraically, we perceive (or
learn to perceive) pitch logarithmically. Acknowledging this, we stretch and
squeeze the axis until these octaves are the same distance apart, in Fig.
2b.

Fig. 3
We will have repeated occasion to divide the octave into equal parts. For various
reasons, it is very fruitful to divide the octave into twelve equal steps. To
avoid fractions, we multiply everything by 12, as shown in Fig. 3. Our
frequency axis is now essentially completed. It relates directly to conventional
musical ideas: for example, the frequencies of the keys of a piano are
integer points on this axis. However, we still are considering the axis to
be continuous. Observe that two notes that are an octave apart now have numbers
that differ by 12.
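The change of coordinates built up so far can be sketched in a few lines of Python. The function names are illustrative assumptions; the only content taken from the text is that the coordinate is twelve times the exponent of two, with 1 Hz at the origin.

```python
import math


def semitone_coordinate(freq_hz):
    """Position on the deformed axis: 12 times the base-2 logarithm of the frequency."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return 12 * math.log2(freq_hz)


def frequency_from_coordinate(x):
    """Inverse mapping: the frequency in Hz at axis position x."""
    return 2.0 ** (x / 12)


# Each doubling of frequency (one octave) adds 12 to the coordinate.
for f in (1.0, 2.0, 4.0, 256.0):
    print(f, semitone_coordinate(f))
```

Multiplying a frequency by 2 adds exactly 12 to its coordinate, wherever on the axis it occurs; this is the sense in which the octave is always the same "size".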

Fig. 4
The next step quite overshadows the previous two: we twist the axis into a circle
(Fig. 4). We define two notes to be equivalent if we get the same remainder
when dividing either by 12. Thus 12 becomes 0, 13 becomes 1, and so forth. Doing this
splits up the set of all frequencies into classes, each containing precisely
those frequencies related by octaves. For example, one such class is {... -11, 1,
13, ...}. For simplicity of notation, we would designate this class by (1). We
still have an (uncountably) infinite number of these classes, one for every point
on the circle. In music theory, such a class is often referred to as a
pitch-class (abbreviated PC). In the following, we will use the term note
to refer specifically to a set of equivalent pitches. For instance, when we refer to
the note C, we do not have in mind a particular C (middle C, or any
other), but the idea of "C-ness". (More below on the letter names for notes.)
The circumference of this circle is one octave: we let this octave stand for
all octaves.
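The twisting of the axis into a circle amounts to reduction modulo 12. A minimal sketch, with hypothetical function names, follows; it relies on Python's `%` operator returning a non-negative result for a positive modulus, so negative coordinates land on the circle as well.

```python
def pitch_class(x, steps_per_octave=12):
    """Representative on the circle, in the range [0, steps_per_octave)."""
    return x % steps_per_octave


def equivalent(x, y, steps_per_octave=12):
    """True when x and y differ by a whole number of octaves."""
    return pitch_class(x - y, steps_per_octave) == 0


# The class written (1) in the text contains -11, 1, 13, ...
print([pitch_class(x) for x in (-11, 1, 13)])  # [1, 1, 1]
```

The same function accepts non-integer coordinates, which matches the text at this stage: the circle is still continuous, with one class for every point on it.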

Fig. 5
Finally, we decide to restrict ourselves to the integer notes on the
circle. Thus we have divided the octave into twelve notes, separated by twelve
equal steps. This is the 12-tone equally tempered scale. Fig. 5 shows
the notes arranged in a circle. (We now consider there to be nothing between
them.) In the further interest of notational simplicity, we remove the brackets
denoting an equivalence class, and refer to the notes with bare numbers. These
numbers still represent a coordinate system for the 12 notes. Later we
will find uses for a coordinate-free representation.
It is certainly easy to envision dividing the octave up into a different number
of notes. The number twelve is a justly popular choice. We refer to the count of
notes in an octave as the order of a musical system. (The word "size" will
be used differently later on.)
3 CONVENTIONAL NOTE NAMES
Historically, there have been letter names also associated with each note. One
such convention, called the scientific tuning, gives the name C to the
note 0. (Recall that this note represents the frequencies {... 1/2, 1, 2, 4,
...}.) Using this convention, middle C would have a frequency of 256 Hz. As noted
above, middle C is customarily put at (approximately) 261 Hz. This is another
convention called A 440 tuning. Still another convention, concert
tuning, puts middle C somewhere else.
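The two quoted values for middle C can be checked with a short calculation. This sketch assumes that "A 440" fixes the A above middle C at 440 Hz, that A lies nine equal-tempered steps above C, and that middle C sits eight octaves above 1 Hz in the scientific convention; the variable names are illustrative.

```python
import math

# Scientific tuning: C is a power of two; middle C is 2**8 Hz.
SCIENTIFIC_MIDDLE_C = 2.0 ** 8

# A 440 tuning (assumed): the A above middle C is 440 Hz,
# and middle C lies 9 equal-tempered steps below it.
A440 = 440.0
STEPS_FROM_C_TO_A = 9
A440_MIDDLE_C = A440 * 2.0 ** (-STEPS_FROM_C_TO_A / 12)

# Distance between the two conventions, measured in steps of the 12-step axis.
OFFSET_IN_STEPS = 12 * math.log2(A440_MIDDLE_C / SCIENTIFIC_MIDDLE_C)

print(SCIENTIFIC_MIDDLE_C)       # 256.0
print(round(A440_MIDDLE_C, 2))   # 261.63
print(OFFSET_IN_STEPS)
```

The offset comes out to well under one step, so the two conventions place the same twelve notes at slightly shifted absolute frequencies without changing any relationship between them.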
For convenience, we introduce the convention of naming the top (zero) note "C".
Ascending through the melodic circle, the names for the notes are as follows: C,
C#/Db, D, D#/Eb, E, F, F#/Gb, G, G#/Ab, A, A#/Bb, B, (and back to C again). (It
is a matter of musical context whether, for example, one refers to the second
note as C# or Db.) Fig. 5 shows the cyclic twelve-note scale with the letter
names for each note. Observe that we have assigned the note name C to represent
the note number 0. This is a common choice, but is not a requirement.
Why do we use these names instead of, for instance, naming them A through L? The
notes whose names contain no accidentals (sharps or flats) form a
particularly interesting subset of the twelve notes: the diatonic scale.
More about this scale in the following.
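The naming convention of this section can be written down as a small lookup table. The table follows the list of names given above, with C assigned to note 0; the identifiers are illustrative assumptions.

```python
# Letter names for the twelve notes, with C assigned to note 0 as in the text.
NOTE_NAMES = [
    "C", "C#/Db", "D", "D#/Eb", "E", "F",
    "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B",
]


def note_name(n):
    """Letter name of note number n, reduced modulo 12."""
    return NOTE_NAMES[n % 12]


# Notes whose names carry no accidental (no sharp or flat).
NATURALS = [n for n, name in enumerate(NOTE_NAMES) if "#" not in name]

print(note_name(13))  # C#/Db
print(NATURALS)       # [0, 2, 4, 5, 7, 9, 11]
```

The seven note numbers printed last are the subset singled out by the letter names A through G.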
4 ADJACENCY AND GRAPHS
We may define two notes to be adjacent if their difference is 1. This
relation between notes can be represented by a graph,
conventionally known as C12, as shown here in Fig. 6.

Fig. 6
We often blur the distinction
between notes and the vertices that represent them. For example, we start to think
of the top vertex as being the "note" 0. What is the difference between
this graph, and for example Fig. 5? A graph is a picture of a relationship: the
round dots, or vertices, represent objects of some sort (for us, notes)
and the lines, or edges, connect any two objects that have the given
relationship (in our case musical adjacency). Thus we have abstracted away all
but the essential elements.
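A graph of this kind is easy to represent directly. The sketch below builds the adjacency relation on the twelve notes, treating a difference of 1 in either direction around the circle as adjacency; the names are hypothetical.

```python
ORDER = 12  # number of notes in the octave


def adjacent(a, b, order=ORDER):
    """True when notes a and b differ by one step in either direction."""
    return (a - b) % order in (1, order - 1)


# One edge per pair of neighbouring notes; the last edge joins 11 back to 0.
EDGES = [(n, (n + 1) % ORDER) for n in range(ORDER)]

# Every vertex should touch exactly two edges.
DEGREES = [sum(n in edge for edge in EDGES) for n in range(ORDER)]

print(EDGES)
print(DEGREES)
```

Nothing in this representation records where a vertex is drawn on the page; only the relationship between notes survives, which is the abstraction the graph is meant to capture.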
Note that since we have numbered the vertices, the graph is at least implicitly
directed: we distinguish between clockwise and counterclockwise motion
around the graph. One edge in Fig. 6 has an arrow along it defining our
"forward" direction. This distinction affects our later considerations of
symmetry. In particular, the mirror image of a directed graph cannot properly be
superimposed on the original, as the direction of traversal along the edges will
be reversed. Where the distinction is important, we will denote directed graphs
by bold italic symbols (e.g. C12), in the same way vectors are
sometimes denoted.
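The remark about mirror images can be checked mechanically. In this sketch the directed graph is taken to have an arrow from each note n to n + 1 (mod 12), and the mirror image is taken to be the reflection that sends n to -n (mod 12); both choices are assumptions made for illustration.

```python
ORDER = 12

# Directed edges: an arrow from each note to the next one "forward".
DIRECTED_EDGES = {(n, (n + 1) % ORDER) for n in range(ORDER)}


def mirror(n, order=ORDER):
    """Reflection of the circle that fixes note 0 and swaps n with -n."""
    return (-n) % order


# Image of every arrow under the reflection, and every arrow reversed.
MIRRORED = {(mirror(a), mirror(b)) for a, b in DIRECTED_EDGES}
REVERSED = {(b, a) for a, b in DIRECTED_EDGES}


def undirected(edges):
    """Forget the direction of each edge."""
    return {frozenset(edge) for edge in edges}


print(MIRRORED == REVERSED)                                  # True
print(MIRRORED == DIRECTED_EDGES)                            # False
print(undirected(MIRRORED) == undirected(DIRECTED_EDGES))    # True
```

The reflection maps the undirected graph onto itself but turns every arrow around, so it is a symmetry of the undirected graph and not of the directed one.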