COMBINATORIAL MUSIC THEORY

Journal of the Audio Engineering Society, vol. 39, pp. 427-448. (1991 June).
© 1991, Audio Engineering Society and Andrew Duncan. All rights reserved.





Andrew Duncan
aduncan@cs.ucsb.edu
71035.1100@compuserve.com





SYNOPSIS

Musical patterns may be investigated with the mathematical tools more commonly applied in science and engineering. For example, the cyclic autocorrelation of a musical scale describes its interval content. Fingering patterns on string instruments are embedded in a space with an unusual topology. Ideas from crystallography may be applied to the description of structure-preserving transformations of melodies. These phenomena are explored for the particularly common case of the twelve-note equally-tempered scale.

"Nothing can be farther from the working musician's mind than counting, nothing farther from the working mathematician's mind than singing, and yet there is something common to both." [1]


0   INTRODUCTION

The purpose of this paper is to examine some unique properties of commonly used musical patterns. It assumes some familiarity with musical ideas: octaves and intervals, the major scale and the concept of key. It also assumes some knowledge of abstract algebra: elementary number theory, groups, and graphs. In order to address a readership of wide-ranging backgrounds, I present these ideas with the mathematical and musical aspects evolving in parallel.

We will see that there is a natural way to describe the internal structure of a musical scale that is closely related to the process of autocorrelation used in digital signal theory. Attempting to pull musical patterns into a second dimension will reveal that these patterns may be thought of as embedded in a 12-point space with peculiar connectivity. The automorphisms, or structure-preserving self-mappings of this space, will correspond to precisely those musical transformations that preserve melodic and harmonic relations between notes, or interchange those relations. These mappings are analogous to the symmetry groups of tilings or crystals.

Where a term is used in a more restrictive sense than is common, or already has a well-defined or circumscribed meaning, it appears (where defined) in bold. In a case where I would stress or accent (or shout) a word when describing these ideas verbally, the word is in italics. I hope this does not make for a bouncy ride.


1   SOUNDS

Sounds are vibrations in the air: variations in air pressure about a mean. The average ambient air pressure is roughly 100,000 newtons of force per square meter of area, and the variations caused by ordinary conversation are about seven orders of magnitude (powers of ten) smaller. We will consider a sound to be represented by a real function of time describing this variation. In most cases of interest, this function can be broken up into a sum of pure sine waves of different frequencies. These sine waves are the components of the sound. The term frequency refers to the number of times per second that the oscillation goes through a full cycle. Frequency is measured in cycles per second, or hertz (Hz). It is conventional to describe the range of the human ear as 20 Hz - 20,000 Hz (20 kHz), although most people's hearing falls off rapidly after 15 kHz or so. For comparison, the notes on a piano range from 27 Hz (low A) to 4.2 kHz (highest C).

In certain cases of particular interest, such as sounds produced by vibrating strings, columns of air, or reeds, the frequencies of the sound's components are related in a particularly simple way: they are all integer multiples of a base frequency. The sinusoidal component with this base (lowest) frequency is called the fundamental of the sound. When we refer to the frequency of a sound, we mean the frequency of the fundamental. For example, although the fundamental frequency of the piano's middle C is about 261 Hz, there are also components at higher frequencies that give "body" to the note.
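As a minimal sketch of this idea, the following Python fragment lists the first few components of a note and evaluates the corresponding pressure variation as a sum of sine waves. The function names and the amplitudes are illustrative assumptions, not measurements of a real piano.

```python
import math

def harmonic_frequencies(fundamental_hz, count):
    """Frequencies of the first `count` components: integer multiples of the fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]

def pressure_variation(t, fundamental_hz, amplitudes):
    """Sum of sine waves at time t (in seconds); amplitudes[k] scales component k + 1.

    The amplitudes are arbitrary illustrative weights, not measured values.
    """
    freqs = harmonic_frequencies(fundamental_hz, len(amplitudes))
    return sum(a * math.sin(2 * math.pi * f * t) for f, a in zip(freqs, amplitudes))

# Middle C, using the approximate 261 Hz figure quoted in the text.
print(harmonic_frequencies(261.0, 4))  # [261.0, 522.0, 783.0, 1044.0]
```

Changing the amplitudes changes the "body" of the note but leaves the fundamental, and hence the frequency we assign to the sound, untouched.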

We use the measurement of frequency to give an ordering to the set of all sinusoids, and to a large number of sounds, via their fundamental component. This ordering corresponds closely to our intuitive notion of pitch: "low-pitched" sounds have low frequencies, and so with high. Perception is somewhat more complicated, but the ordering that frequency brings is of fundamental importance to music theory.


2   THE TWELVE NOTES

The "universe" in which we work can be defined in a multi-step process. First, we define the most fundamental musical relationship: the octave. Two frequencies are said to be separated by an octave when their ratio is 2:1. We say that 2 Hz is an octave above 1 Hz. To the ear, notes separated by any number of octaves sound somehow "the same". (There are well-known physical reasons for this, which we will take as given.) Fig. 1 shows the frequency axis. (Note that this figure is a fractal: it looks "the same" at all levels of magnification.)



Fig. 1


Any point on this line corresponds to some frequency, some pitch. In this way, frequency is used as a coordinate to locate pitches in an absolute way. On the axis are marked all the notes related by octaves to the frequency of 1 Hz. These frequencies are all powers of two: those frequencies greater than 1 Hz have positive exponents; those less than one have negative exponents. These pitches are our first landmarks on the frequency axis. Note that they are certainly not the only available frequencies. For the moment all frequencies are equally accessible.



Fig. 2


We next proceed to deform the frequency axis and its labels in various convenient ways. In Fig. 2a, we have taken to using the exponents (of two) to represent the frequencies, rather than the frequencies themselves. This has the effect of converting multiplication into addition: the movement of an octave is now the addition or subtraction of 1. This is really a very fundamental change. We do this to conform to the ear's feeling that movement of one octave constitutes a particular size "step" or jump, which is always the same "size" wherever it occurs. Algebraically, we perceive (or learn to perceive) pitch logarithmically. Acknowledging this, we stretch and squeeze the axis until these octaves are the same distance apart, in Fig. 2b.
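The change of coordinate can be sketched in a few lines of Python. The name octave_coordinate is my own label for the exponent of two, introduced only for this illustration.

```python
import math

def octave_coordinate(frequency_hz):
    """Exponent of two for a frequency, so that 1 Hz sits at coordinate 0."""
    return math.log2(frequency_hz)

# Doubling the frequency always adds 1, wherever on the axis we start.
low_step = octave_coordinate(2.0) - octave_coordinate(1.0)
high_step = octave_coordinate(880.0) - octave_coordinate(440.0)
print(low_step, high_step)
```

Both differences come out as one octave (up to floating-point rounding), even though the first jump spans 1 Hz and the second spans 440 Hz.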



Fig. 3


We will have repeated occasion to divide the octave into equal parts. For various reasons, it is very fruitful to divide the octave into twelve equal steps. To avoid fractions, we multiply everything by 12, as shown in Fig. 3. Our frequency axis is now essentially completed. It relates directly to conventional musical ideas: for example, the frequencies of the keys of a piano are integer points on this axis. However, we still are considering the axis to be continuous. Observe that two notes that are an octave apart now have numbers that differ by 12.
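A hedged sketch of the finished axis, together with its inverse, follows. The names note_number and frequency_of are assumptions made for the example, and the axis keeps 1 Hz at position 0 as in the figures.

```python
import math

STEPS_PER_OCTAVE = 12

def note_number(frequency_hz):
    """Position on the rescaled axis: twelve times the exponent of two."""
    return STEPS_PER_OCTAVE * math.log2(frequency_hz)

def frequency_of(note):
    """Inverse mapping: the frequency in Hz at a given (possibly fractional) position."""
    return 2.0 ** (note / STEPS_PER_OCTAVE)

# An octave is a difference of 12 on this axis.
print(note_number(512.0) - note_number(256.0))
# The axis is still continuous: fractional positions are perfectly valid.
print(frequency_of(96.5))
```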



Fig. 4


The next step quite overshadows the previous two: we twist the axis into a circle (Fig. 4). We define two notes to be equivalent if we get the same remainder when dividing either by 12. Thus 12 becomes 0, 13 becomes 1, and so forth. Doing this splits up the set of all frequencies into classes, each containing precisely those frequencies related by octaves. For example, one such class is {... -11, 1, 13, ...}. For simplicity of notation, we would designate this class by (1). We still have an (uncountably) infinite number of these classes, one for every point on the circle. In music theory, such a class is often referred to as a pitch-class (abbreviated PC). In the following, we will use the term note to refer specifically to a set of equivalent pitches. For instance, when we refer to the note C, we do not have in mind a particular C (middle C, or any other), but the idea of "C-ness". (More below on the letter names for notes.) The circumference of this circle is one octave: we let this octave stand for all octaves.
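The twist into a circle is just reduction modulo 12. The sketch below uses a hypothetical helper, pitch_class, and relies on the fact that Python's % operator returns a non-negative remainder for a positive modulus, so negative positions land in the right class.

```python
def pitch_class(note, order=12):
    """Representative in [0, order) of the class containing `note`.

    `order` is the number of steps per octave; 12 is the case treated here.
    """
    return note % order

# The class designated (1) in the text contains -11, 1 and 13.
print([pitch_class(n) for n in (-11, 1, 13)])  # [1, 1, 1]
# The circle is still continuous at this stage, so fractional notes are allowed.
print(pitch_class(13.5))  # 1.5
```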



Fig. 5


Finally, we decide to restrict ourselves to the integer notes on the circle. Thus we have divided the octave into twelve notes, separated by twelve equal steps. This is the 12-tone equally tempered scale. Fig. 5 shows the notes arranged in a circle. (We now consider there to be nothing between them.) In the further interest of notational simplicity, we remove the brackets denoting an equivalence class, and refer to the notes with bare numbers. These numbers still represent a coordinate system for the 12 notes. Later we will find uses for a coordinate-free representation.

It is certainly easy to envision dividing the octave up into a different number of notes. The number twelve is a justly popular choice. We refer to the count of notes in an octave as the order of a musical system. (The word "size" will be used differently later on.)


3   CONVENTIONAL NOTE NAMES

Historically, there have been letter names also associated with each note. One such convention, called the scientific tuning, gives the name C to the note 0. (Recall that this note represents the frequencies {... 1/2, 1, 2, 4, ...}.) Using this convention, middle C would have a frequency of 256 Hz. As noted above, middle C is customarily put at (approximately) 261 Hz. This is another convention called A 440 tuning. Still another convention, concert tuning, puts middle C somewhere else.
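The two figures for middle C can be checked with a short calculation. This is only a sketch: it assumes equal temperament, and it uses the standard fact that the A at 440 Hz lies nine steps above middle C. The variable names are mine.

```python
import math

def note_number(frequency_hz):
    """Position on the axis of Fig. 3, with 1 Hz at position 0."""
    return 12 * math.log2(frequency_hz)

scientific_middle_c = 2.0 ** 8             # 256 Hz: a power of two, so it is a "C"
a440_middle_c = 440.0 * 2.0 ** (-9 / 12)   # nine equal steps below A = 440 Hz

# How far A 440 tuning is shifted along the axis relative to scientific tuning.
offset = note_number(a440_middle_c) - note_number(scientific_middle_c)
print(round(a440_middle_c, 2), round(offset, 2))  # 261.63 0.38
```

Under scientific tuning the piano keys fall on integer points of the axis; under A 440 tuning the whole keyboard is shifted up by a little over a third of a step.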

For convenience, we introduce the convention of naming the top (zero) note "C". Ascending through the melodic circle, the names for the notes are as follows: C, C#/Db, D, D#/Eb, E, F, F#/Gb, G, G#/Ab, A, A#/Bb, B, (and back to C again). (It is a matter of musical context whether, for example, one refers to the second note as C# or Db.) Fig. 5 shows the cyclic twelve-note scale with the letter names for each note. Observe that we have assigned the note name C to represent the note number 0. This is a common choice, but is not a requirement.
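The naming convention amounts to a lookup table indexed by note number. In this sketch the table NOTE_NAMES and the helper name_of are illustrative names; the choice of C for 0 follows the convention just stated.

```python
NOTE_NAMES = ["C", "C#/Db", "D", "D#/Eb", "E", "F",
              "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]

def name_of(note):
    """Letter name of an integer note, using the convention that 0 is C."""
    return NOTE_NAMES[note % 12]

print(name_of(0), name_of(13), name_of(-1))  # C C#/Db B
```

Choosing a different note to carry the name C would simply rotate the table.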

Why do we use these names instead of, for instance, naming them A through L? The notes whose names contain no accidentals (sharps or flats) form a particularly interesting subset of the twelve notes: the diatonic scale. More about this scale in the following.


4   ADJACENCY AND GRAPHS

We may define two notes to be adjacent if their difference is 1. This relation between notes can be represented by a graph, conventionally known as C12, as shown here in Fig. 6.
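The adjacency relation can be written out explicitly. The sketch below assumes that "difference" is taken around the circle (modulo 12), so that 11 and 0 count as adjacent; the names adjacent and cycle_graph are my own.

```python
ORDER = 12

def adjacent(a, b, order=ORDER):
    """True if notes a and b differ by one step in either direction around the circle."""
    return (a - b) % order in (1, order - 1)

# Adjacency list of the graph: each note maps to its two neighbours.
cycle_graph = {n: [m for m in range(ORDER) if adjacent(n, m)] for n in range(ORDER)}

print(cycle_graph[0])  # [1, 11]
print(sum(len(nbrs) for nbrs in cycle_graph.values()) // 2)  # 12 edges
```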



Fig. 6


We often blur the distinction between a note and the vertex that represents it. For example, we start to think of the top vertex as being the "note" 0. What is the difference between this graph and, for example, Fig. 5? A graph is a picture of a relationship: the round dots, or vertices, represent objects of some sort (for us, notes) and the lines, or edges, connect any two objects that have the given relationship (in our case musical adjacency). Thus we have abstracted away all but the essential elements.

Note that since we have numbered the vertices, the graph is at least implicitly directed: we distinguish between clockwise and counterclockwise motion around the graph. One edge in Fig. 6 has an arrow along it defining our "forward" direction. This distinction affects our later considerations of symmetry. In particular, the mirror image of a directed graph cannot properly be superimposed on the original, as the direction of traversal along the edges will be reversed. Where the distinction is important, we will denote directed graphs by bold italic symbols (e.g. C12), in the same way vectors are sometimes denoted.
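The effect of a mirror image on the directed graph can be checked directly. In this sketch the rotation by 5 steps and the reflection n -> -n are arbitrary representatives chosen for illustration, and all names are assumptions of the example.

```python
ORDER = 12

# Directed edges: each note points to the next one in the "forward" direction.
forward_edges = {(n, (n + 1) % ORDER) for n in range(ORDER)}
reversed_edges = {(b, a) for a, b in forward_edges}

def image(edges, mapping):
    """Apply a vertex mapping to both ends of every directed edge."""
    return {(mapping(a), mapping(b)) for a, b in edges}

def rotate(n):
    """Rotate the circle by five steps (any fixed number of steps would do)."""
    return (n + 5) % ORDER

def mirror(n):
    """Reflect the circle about the axis through note 0."""
    return (-n) % ORDER

print(image(forward_edges, rotate) == forward_edges)   # True: direction preserved
print(image(forward_edges, mirror) == reversed_edges)  # True: direction reversed
```

A rotation carries every arrow onto another forward arrow, while the reflection carries every arrow onto a backward one, which is why the mirror image cannot be superimposed on the original directed graph.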


