Frequency and pitch
The human perception of sound frequency as a position on a scale

Pitch is a perceptual property of sounds that allows their ordering on a frequency-related scale, or more commonly, pitch is the quality that makes it possible to judge sounds as "higher" and "lower" in the sense associated with musical melodies. Pitch is a major auditory attribute of musical tones, along with duration, loudness, and timbre.

Pitch may be quantified as a frequency, but pitch is not a purely objective physical property; it is a subjective psychoacoustical attribute of sound. Historically, the study of pitch and pitch perception has been a central problem in psychoacoustics, and has been instrumental in forming and testing theories of sound representation, processing, and perception in the auditory system.

Perception

Pitch and frequency

Pitch is an auditory sensation in which a listener assigns musical tones to relative positions on a musical scale based primarily on their perception of the frequency of vibration. Pitch is closely related to frequency, but the two are not equivalent. Frequency is an objective, scientific attribute that can be measured. Pitch is each person's subjective perception of a sound wave, which cannot be directly measured. However, this does not necessarily mean that most people won't agree on which notes are higher and lower.

The oscillations of sound waves can often be characterized in terms of frequency. Pitches are usually associated with, and thus quantified as, frequencies (in cycles per second, or hertz), by comparing the sounds being assessed against sounds with pure tones (ones with periodic, sinusoidal waveforms). Complex and aperiodic sound waves can often be assigned a pitch by this method.

According to the American National Standards Institute, pitch is the auditory attribute of sound according to which sounds can be ordered on a scale from low to high. Since pitch is such a close proxy for frequency, it is almost entirely determined by how quickly the sound wave is making the air vibrate and has almost nothing to do with the intensity, or amplitude, of the wave. That is, "high" pitch means very rapid oscillation, and "low" pitch corresponds to slower oscillation. Despite that, the idiom relating vertical height to sound pitch is shared by most languages. At least in English, it is just one of many deep conceptual metaphors that involve up/down. The exact etymological history of the musical sense of high and low pitch is still unclear. There is evidence that humans do actually perceive that the source of a sound is slightly higher or lower in vertical space when the sound frequency is increased or reduced.

In most cases, the pitch of complex sounds such as speech and musical notes corresponds very nearly to the repetition rate of periodic or nearly-periodic sounds, or to the reciprocal of the time interval between repeating similar events in the sound waveform.

The pitch of complex tones can be ambiguous, meaning that two or more different pitches can be perceived, depending upon the observer. When the actual fundamental frequency can be precisely determined through physical measurement, it may differ from the perceived pitch because of overtones, also known as upper partials, harmonic or otherwise. A complex tone composed of two sine waves of 1000 and 1200 Hz may sometimes be heard as up to three pitches: two spectral pitches at 1000 and 1200 Hz, derived from the physical frequencies of the pure tones, and the combination tone at 200 Hz, corresponding to the repetition rate of the waveform. In a situation like this, the percept at 200 Hz is commonly referred to as the missing fundamental, which is often the greatest common divisor of the frequencies present.
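The repetition rate in the two-tone example can be checked with a short calculation. The sketch below assumes the partials are given as whole numbers of hertz; the function name `repetition_rate_hz` is purely illustrative. As the text notes, the greatest common divisor is only often, not always, the pitch a listener reports.

```python
from functools import reduce
from math import gcd


def repetition_rate_hz(partials_hz):
    """Greatest common divisor of integer partial frequencies, in Hz.

    For a sum of sine waves at these frequencies, this is the rate at
    which the combined waveform repeats.
    """
    return reduce(gcd, partials_hz)


print(repetition_rate_hz([1000, 1200]))  # 200
```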

Pitch depends to a lesser degree on the sound pressure level (loudness, volume) of the tone, especially at frequencies below 1,000 Hz and above 2,000 Hz. The pitch of lower tones gets lower as sound pressure increases. For instance, a tone of 200 Hz that is very loud seems one semitone lower in pitch than if it is just barely audible. Above 2,000 Hz, the pitch gets higher as the sound gets louder. These results were obtained in the pioneering works by S. Stevens and W. Snow. Later investigations, e.g. by A. Cohen, showed that in most cases the apparent pitch shifts were not significantly different from pitch-matching errors. When averaged, the remaining shifts followed the directions of Stevens' curves but were small (2% or less by frequency, i.e. not more than a semitone).

Theories of pitch perception

Theories of pitch perception try to explain how the physical sound and specific physiology of the auditory system work together to yield the experience of pitch. In general, pitch perception theories can be divided into place coding and temporal coding. Place theory holds that the perception of pitch is determined by the place of maximum excitation on the basilar membrane.

A place code, taking advantage of the tonotopy in the auditory system, must be in effect for the perception of high frequencies, since neurons have an upper limit on how fast they can phase-lock their action potentials. However, a purely place-based theory cannot account for the accuracy of pitch perception in the low and middle frequency ranges. Moreover, there is some evidence that some non-human primates lack auditory cortex responses to pitch despite having clear tonotopic maps in auditory cortex, showing that tonotopic place codes are not sufficient for pitch responses.

Temporal theories offer an alternative that appeals to the temporal structure of action potentials, mostly the phase-locking and mode-locking of action potentials to frequencies in a stimulus. The precise way this temporal structure helps code for pitch at higher levels is still debated, but the processing seems to be based on an autocorrelation of action potentials in the auditory nerve. However, it has long been noted that a neural mechanism that may accomplish a delay (a necessary operation of a true autocorrelation) has not been found. At least one model shows that a temporal delay is unnecessary to produce an autocorrelation model of pitch perception, appealing to phase shifts between cochlear filters; however, earlier work has shown that certain sounds with a prominent peak in their autocorrelation function do not elicit a corresponding pitch percept, and that certain sounds without a peak in their autocorrelation function nevertheless elicit a pitch. To be a more complete model, autocorrelation must therefore apply to signals that represent the output of the cochlea, as via auditory-nerve interspike-interval histograms. Some theories of pitch perception hold that pitch has inherent octave ambiguities, and therefore is best decomposed into a pitch chroma (a periodic value around the octave, like the note names in Western music) and a pitch height, which may be ambiguous and which indicates the octave the pitch is in.
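As a rough signal-level illustration of what an autocorrelation computes, the following sketch estimates the repetition rate of a synthetic two-tone waveform (1000 Hz plus 1200 Hz). It is not a model of the auditory nerve: the sample rate, search range, window length, and function names are all assumptions chosen for the example, and the autocorrelation is taken directly on the waveform rather than on cochlear output.

```python
import math

SAMPLE_RATE = 48_000  # samples per second (assumed for illustration)


def two_tone(n_samples, f1=1000.0, f2=1200.0):
    """Sum of two equal-amplitude sine waves."""
    return [
        math.sin(2 * math.pi * f1 * n / SAMPLE_RATE)
        + math.sin(2 * math.pi * f2 * n / SAMPLE_RATE)
        for n in range(n_samples)
    ]


def autocorrelation_pitch(signal, min_hz, max_hz, window):
    """Return the frequency whose lag maximizes the autocorrelation.

    Only lags corresponding to min_hz..max_hz are searched, and every
    lag is scored over the same fixed window of samples.
    """
    min_lag = int(SAMPLE_RATE / max_hz)
    max_lag = int(SAMPLE_RATE / min_hz)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(signal[n] * signal[n + lag] for n in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return SAMPLE_RATE / best_lag


window = 960                     # four periods of 200 Hz at 48 kHz
signal = two_tone(window + 320)  # long enough for the largest lag searched
print(autocorrelation_pitch(signal, min_hz=150, max_hz=500, window=window))  # 200.0
```

The search range matters: the same waveform also has a strong autocorrelation peak near a lag of 1/1100 s, which a wider range would have to weigh against the 200 Hz peak.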

Just-noticeable difference

The just-noticeable difference (jnd) (the threshold at which a change is perceived) depends on the tone's frequency content. Below 500 Hz, the jnd is about 3 Hz for sine waves, and 1 Hz for complex tones; above 1000 Hz, the jnd for sine waves is about 0.6% (about 10 cents). The jnd is typically tested by playing two tones in quick succession with the listener asked if there was a difference in their pitches. The jnd becomes smaller if the two tones are played simultaneously as the listener is then able to discern beat frequencies. The total number of perceptible pitch steps in the range of human hearing is about 1,400; the total number of notes in the equal-tempered scale, from 16 to 16,000 Hz, is 120.
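The figures in this paragraph can be reproduced with two small helper functions, given here as a minimal sketch (the function names are illustrative). A cent is one hundredth of an equal-tempered semitone, so an octave spans 1,200 cents; the range from 16 to 16,000 Hz is slightly less than ten octaves, so the step count is rounded.

```python
import math


def ratio_to_cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


def equal_tempered_steps(low_hz, high_hz, steps_per_octave=12):
    """Rounded number of equal-tempered steps between two frequencies."""
    return round(steps_per_octave * math.log2(high_hz / low_hz))


print(round(ratio_to_cents(1.006), 1))   # 10.4  (a 0.6% change in frequency)
print(equal_tempered_steps(16, 16_000))  # 120
```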

Aural illusions

The relative perception of pitch can be fooled, resulting in aural illusions. There are several of these, such as the tritone paradox, but most notably the Shepard scale, where a continuous or discrete sequence of specially formed tones can be made to sound as if the sequence continues ascending or descending forever.

Definite and indefinite pitch

Not all musical instruments make notes with a clear pitch. Unpitched percussion instruments (a class of percussion instrument) do not produce particular pitches. A sound or note of definite pitch is one whose pitch a listener can discern, at least in principle and often relatively easily. Sounds with definite pitch have harmonic or nearly harmonic frequency spectra.

A sound generated on any instrument produces many modes of vibration that occur simultaneously. A listener hears numerous frequencies at once. The vibration with the lowest frequency is called the fundamental frequency; the other frequencies are overtones. Harmonics are an important class of overtones with frequencies that are integer multiples of the fundamental. Whether or not the higher frequencies are integer multiples, they are collectively called the partials, referring to the different parts that make up the total spectrum.

A sound or note of indefinite pitch is one that a listener finds impossible or relatively difficult to identify as to pitch. Sounds with indefinite pitch do not have harmonic spectra or have altered harmonic spectra—a characteristic known as inharmonicity.

It is still possible for two sounds of indefinite pitch to clearly be higher or lower than one another. For instance, a snare drum sounds higher pitched than a bass drum though both have indefinite pitch, because its sound contains higher frequencies. In other words, it is possible and often easy to roughly discern the relative pitches of two sounds of indefinite pitch, but sounds of indefinite pitch do not neatly correspond to any specific pitch.

Labeling pitches

Pitches are labeled using:

  • Letters, as in Helmholtz pitch notation
  • A combination of letters and numbers—as in scientific pitch notation, where notes are labelled upwards from C0, the 16 Hz C
  • Numbers that represent the frequency in hertz (Hz), the number of cycles per second

For example, one might refer to the A above middle C as a′, A4, or 440 Hz. In standard Western equal temperament, the notion of pitch is insensitive to "spelling": the description "G4 double sharp" refers to the same pitch as A4; in other temperaments, these may be distinct pitches. Human perception of musical intervals is approximately logarithmic with respect to fundamental frequency: the perceived interval between the pitches "A220" and "A440" is the same as the perceived interval between the pitches A440 and A880. Motivated by this logarithmic perception, music theorists sometimes represent pitches using a numerical scale based on the logarithm of fundamental frequency. For example, one can adopt the widely used MIDI standard to map fundamental frequency, f, to a real number, p, as follows

$$p = 69 + 12 \times \log_2\left(\frac{f}{440\ \text{Hz}}\right)$$

This creates a linear pitch space in which octaves have size 12, semitones (the distance between adjacent keys on the piano keyboard) have size 1, and A440 is assigned the number 69. Distance in this space corresponds to musical intervals as understood by musicians. An equal-tempered semitone is subdivided into 100 cents. The system is flexible enough to include "microtones" not found on standard piano keyboards. For example, the pitch halfway between C (60) and C♯ (61) can be labeled 60.5.
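The mapping and its inverse can be sketched in a few lines. The function and constant names are illustrative; only the reference values (A440 assigned the number 69, 12 steps per octave) come from the text.

```python
import math

A4_HZ = 440.0   # reference frequency
A4_NUMBER = 69  # number assigned to A440


def frequency_to_pitch(f_hz):
    """Map a fundamental frequency in Hz to a real-valued pitch number."""
    return A4_NUMBER + 12 * math.log2(f_hz / A4_HZ)


def pitch_to_frequency(p):
    """Inverse mapping: pitch number back to frequency in Hz."""
    return A4_HZ * 2 ** ((p - A4_NUMBER) / 12)


print(frequency_to_pitch(440.0))           # 69.0
print(frequency_to_pitch(880.0))           # 81.0  (one octave = 12 steps)
print(round(pitch_to_frequency(60), 2))    # 261.63  (middle C)
print(round(pitch_to_frequency(60.5), 2))  # 269.29  (halfway between C and C♯)
```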

Pitch standards and standard pitch

A pitch standard (also concert pitch) is the conventional pitch reference a group of musical instruments are tuned to for a performance. Concert pitch may vary from ensemble to ensemble, and has varied widely over musical history.

Standard pitch is a more widely accepted convention. The A above middle C is usually set at 440 Hz (often written as "A = 440 Hz" or sometimes "A440"), although other frequencies, such as 442 Hz, are also often used as variants. Another standard pitch, the so-called Baroque pitch, has been set in the 20th century as A = 415 Hz—approximately an equal-tempered semitone lower than A440 to facilitate transposition. The Classical pitch can be set to either 427 Hz (about halfway between A415 and A440) or 430 Hz (also between A415 and A440 but slightly sharper than the quarter tone). And ensembles specializing in authentic performance set the A above middle C to 432 Hz or 435 Hz when performing repertoire from the Romantic era.

Transposing instruments have their origin in the variety of pitch standards. In modern times, they conventionally have their parts transposed into different keys from voices and other instruments (and even from each other). As a result, musicians need a way to refer to a particular pitch in an unambiguous manner when talking to each other.

For example, the most common type of clarinet or trumpet, when playing a note written in their part as C, sounds a pitch that is called B♭ on a non-transposing instrument like a violin (which indicates that at one time these wind instruments played at a standard pitch a tone lower than violin pitch). To refer to that pitch unambiguously, a musician calls it concert B♭, meaning, "...the pitch that someone playing a non-transposing instrument like a violin calls B♭."

A440 (pitch standard)

A440 (also known as Stuttgart pitch) is the musical pitch corresponding to an audio frequency of 440 Hz, which serves as a tuning standard for the musical note of A above middle C, or A4 in scientific pitch notation. It is standardized by the International Organization for Standardization as ISO 16. While other frequencies have been (and occasionally still are) used to tune the first A above middle C, A440 is now commonly used as a reference frequency to calibrate acoustic equipment and to tune pianos, violins, and other musical instruments.

History and use

Before standardization on 440 Hz, many countries and organizations had followed, since the 1860s, the French standard of 435 Hz, which had also been the Austrian government's 1885 recommendation. Johann Heinrich Scheibler recommended A440 as a standard in 1834 after inventing the "tonometer" to measure pitch, and it was approved by the Society of German Natural Scientists and Physicians the same year.

The American music industry reached an informal standard of 440 Hz in 1926, and some began using it in instrument manufacturing.

In 1936, the American Standards Association recommended that the A above middle C be tuned to 440 Hz. This standard was taken up by the International Organization for Standardization in 1955 (reaffirmed by them in 1975) as ISO 16.

It is designated A4 in scientific pitch notation because it occurs in the octave that starts with the fourth C key on a standard 88-key piano keyboard. In MIDI, A440 is note 69 (0x45 hexadecimal).

Modern practices

A440 is widely used as concert pitch in the United Kingdom and the United States. In continental Europe the frequency of A4 commonly varies between 440 Hz and 444 Hz. In the period instrument movement, a consensus has arisen around a modern baroque pitch of 415 Hz (with 440 Hz corresponding to A♯), a 'baroque' pitch for some special church music (in particular, some German church music, e.g. the pre-Leipzig period cantatas of Bach) known as Chorton pitch at 466 Hz (with 440 Hz corresponding to A♭), and classical pitch at 427–430 Hz.
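The relationship between these period pitches and A440 follows from equal-tempered semitone arithmetic, sketched below with an illustrative helper. One semitone below 440 Hz is about 415 Hz, so an ensemble tuned to A = 415 hears 440 Hz as A♯; one semitone above is about 466 Hz, so at A = 466 the 440 Hz tone falls on A♭.

```python
def shift_semitones(f_hz, semitones):
    """Frequency reached by moving an equal-tempered number of semitones."""
    return f_hz * 2 ** (semitones / 12)


print(round(shift_semitones(440.0, -1), 2))  # 415.3
print(round(shift_semitones(440.0, +1), 2))  # 466.16
```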

The US time and frequency station WWV broadcasts a 440 Hz signal at two minutes past every hour, with WWVH broadcasting the same tone at the first minute past every hour. This was added in 1936 to aid orchestras in tuning their instruments.

History of pitch standards in Western music

Historically, various standards have been used to fix the pitch of notes at certain frequencies. Various systems of musical tuning have also been used to determine the relative frequency of notes in a scale.

Pre-19th century

Until the 19th century, there was no coordinated effort to standardize musical pitch, and the levels across Europe varied widely. Pitches did not just vary from place to place, or over time—pitch levels could vary even within the same city. The pitch used for an English cathedral organ in the 17th century, for example, could be as much as five semitones lower than that used for a domestic keyboard instrument in the same city.

Even within one church, the pitch used could vary over time because of the way organs were tuned. Generally, the end of an organ pipe would be hammered inwards to a cone, or flared outwards, to raise or lower the pitch. When the pipe ends became frayed by this constant process they were all trimmed down, thus raising the overall pitch of the organ.

From the early 18th century, pitch could also be controlled with the use of tuning forks (invented in 1711), although again there was variation. For example, a tuning fork associated with Handel, dating from 1740, is pitched at A = 422.5 Hz, while a later one from 1780 is pitched at A = 409 Hz, about a quarter-tone lower. A tuning fork that belonged to Ludwig van Beethoven around 1800, now in the British Library, is pitched at A = 455.4 Hz, well over a half-tone higher.

Overall, there was a tendency towards the end of the 18th century for the frequency of the A above middle C to be in the range of 400 to 450 Hz.

The frequencies quoted here are based on modern measurements and would not have been precisely known to musicians of the day. Although Mersenne had made a rough determination of sound frequencies as early as the 17th century, such measurements did not become scientifically accurate until the 19th century, beginning with the work of German physicist Johann Scheibler in the 1830s. The unit of frequency formerly called the cycle per second (cps) was renamed the hertz (Hz) in the 20th century in honor of Heinrich Hertz.

Pitch inflation

During historical periods when instrumental music rose in prominence (relative to the voice), there was a continuous tendency for pitch levels to rise. This "pitch inflation" seemed largely a product of instrumentalists competing with each other, each attempting to produce a brighter, more "brilliant", sound than that of their rivals. On at least two occasions, pitch inflation had become so severe that reform became needed. At the beginning of the 17th century, Michael Praetorius reported in his encyclopedic Syntagma musicum that pitch levels had become so high that singers were experiencing severe throat strain and lutenists and viol players were complaining of snapped strings. The standard voice ranges he cites show that the pitch level of his time, at least in the part of Germany where he lived, was at least a minor third higher than today's. Solutions to this problem were sporadic and local, but generally involved the establishment of separate standards for voice and organ (German: Chorton, lit. 'choir tone') and for chamber ensembles (German: Kammerton, lit. 'chamber tone'). Where the two were combined, as for example in a cantata, the singers and instrumentalists might perform from music written in different keys. This system kept pitch inflation at bay for some two centuries.

Concert pitch rose further in the 19th century as may be seen reflected in the tuning forks of France. The pipe organ tuning fork in Versailles Chapel in 1795 is 390 Hz, but in the Paris Opera an 1810 tuning fork gives A = 423 Hz, an 1822 fork gives A = 432 Hz, and an 1855 fork gives A = 449 Hz. At La Scala in Milan, the A above middle C rose as high as 451 Hz.

19th- and 20th-century standards

The strongest opponents of the upward tendency in pitch were singers, who complained that it was putting a strain on their voices. Largely due to their protests, the French government passed a law on February 16, 1859, which set the A above middle C at 435 Hz. This was the first attempt to standardize pitch on such a scale, and was known as the diapason normal. It became quite a popular pitch standard outside France as well, and has also been known at various times as French pitch, continental pitch or international pitch (the last of these not to be confused with the 1939 "international standard pitch" described below). An 1885 conference in Vienna established this value among Italy, Austria, Hungary, Russia, Prussia, Saxony, Sweden and Württemberg. This was included as "Convention of 16 and 19 November 1885 regarding the establishment of a concert pitch" in the Treaty of Versailles in 1919 which formally ended World War I. The diapason normal resulted in middle C being tuned at about 258.65 Hz.
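The middle C figure follows from the A, assuming equal temperament, in which middle C lies nine semitones below the A above it. A minimal sketch (the function name is illustrative):

```python
def middle_c_from_a(a_hz):
    """Equal-tempered middle C, nine semitones below the A above it."""
    return a_hz * 2 ** (-9 / 12)


print(round(middle_c_from_a(435.0), 2))  # 258.65  (diapason normal)
print(round(middle_c_from_a(440.0), 2))  # 261.63  (A440, for comparison)
```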

An alternative pitch standard known as philosophical or scientific pitch fixes middle C at 256 Hz (that is, 2⁸ Hz), which results in the A above it being approximately 430.54 Hz in equal temperament tuning. The appeal of this system is its mathematical idealism (the frequencies of all the Cs being powers of two). This system never received the same official recognition as the French A = 435 Hz and has not been widely used. This tuning has been promoted unsuccessfully by the LaRouche movement's Schiller Institute under the name Verdi tuning since Italian composer Giuseppe Verdi had proposed a slight lowering of the French tuning system. However, the Schiller Institute's recommended tuning for A of 432 Hz is for the Pythagorean ratio of 27:16, rather than the logarithmic ratio of equal temperament tuning.
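The two values for A can be compared directly. The sketch below starts from middle C at 256 Hz and applies either nine equal-tempered semitones or the Pythagorean ratio 27:16; the variable names are illustrative.

```python
from fractions import Fraction

MIDDLE_C_HZ = 2 ** 8  # 256 Hz

# Equal temperament: A is nine semitones above middle C.
a_equal_tempered = MIDDLE_C_HZ * 2 ** (9 / 12)

# Pythagorean tuning: A stands in the ratio 27:16 to middle C.
a_pythagorean = MIDDLE_C_HZ * Fraction(27, 16)

print(round(a_equal_tempered, 2))  # 430.54
print(a_pythagorean)               # 432
```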

British attempts at standardisation in the 19th century gave rise to the old philharmonic pitch standard of about A = 452 Hz (different sources quote slightly different values), replaced in 1896 by the considerably "deflated" new philharmonic pitch at A = 439 Hz. The high pitch was maintained by Sir Michael Costa for the Crystal Palace Handel Festivals, causing the withdrawal of the principal tenor Sims Reeves in 1877, though at singers' insistence the Birmingham Festival pitch was lowered (and the organ retuned) at that time. At the Queen's Hall in London, the establishment of the diapason normal for the Promenade Concerts in 1895 (and retuning of the organ to A = 435.5 at 15 °C (59 °F), to be in tune with A = 439 in a heated hall) caused the Royal Philharmonic Society and others (including the Bach Choir, and the Felix Mottl and Arthur Nikisch concerts) to adopt the continental pitch thereafter.

In England the term low pitch was used from 1896 onward to refer to the new Philharmonic Society tuning standard of A = 439 Hz at 68 °F, while "high pitch" was used for the older tuning of A = 452.4 Hz at 60 °F. Although the larger London orchestras were quick to conform to the new, low pitch, provincial orchestras continued using the high pitch until at least the 1920s, and most brass bands were still using the high pitch in the mid-1960s. Highland pipe bands continue to use an even sharper tuning, around A = 470–480 Hz, over a semitone higher than A440. As a result, bagpipes are often perceived as playing in B♭ despite being notated in A (as if they were transposing instruments in D-flat), and are often tuned to match B♭ brass instruments when the two are required to play together.

The Stuttgart Conference of 1834 recommended C264 (A440) as the standard pitch based on Scheibler's studies with his Tonometer. For this reason A440 has been referred to as Stuttgart pitch or Scheibler pitch.

In 1939, an international conference recommended that the A above middle C be tuned to 440 Hz, now known as concert pitch. As a technical standard this was taken up by the International Organization for Standardization in 1955 and reaffirmed by them in 1975 as ISO 16. The difference between this and the diapason normal is due to confusion over the temperature at which the French standard should be measured. The initial standard was A = 439 Hz, but this was superseded by A = 440 Hz, possibly because 439 Hz was difficult to reproduce in a laboratory since 439 is a prime number.