The essential elements of music
Music comes in all shapes and forms and enriches our lives in a variety of ways. It intensifies our emotions, helps us focus, accompanies us on road trips, and motivates us for workouts. But what exactly is it composed of?
Due to its subjective nature, a multitude of definitions of music have been put forward. However, the composer Edgard Varèse most famously defined it as “organized sound”. In The Liberation of Sound, he writes:
… I decided to call my music “organized sound” and myself, not a musician, but “a worker in rhythms, frequencies, and intensities.”
The basic elements
The fundamental building blocks of music are loudness, pitch, duration (or rhythm), tempo, timbre, and reverberation. When combined carefully, they give rise to higher-level concepts such as meter, harmony, melody, and key. Thus, music differs from random sound in the combination of its basic elements and the relations that form between them. Let’s briefly define them.
Pitch corresponds to the frequency of a particular tone and to its relative position in the musical scale. The higher the frequency, the higher the perceived tone.
Rhythm refers to the durations of a series of notes or tones and how they group together to form units.
Tempo refers to the overall pace of the song or piece.
Loudness relates to how much energy an instrument creates, i.e. how much air it displaces. The displaced air travels as sound waves toward our eardrums, where it is turned into perceivable sound.
Timbre, perhaps the most interesting element, distinguishes between the tonal colors of different instruments. If someone plays a note of a particular pitch, duration, and loudness with a trumpet, followed by someone else playing that very same note with a clarinet, the difference in tonal quality becomes apparent — this is timbre.
Reverberation is defined as the perception of how distant the sound source is from us, combined with how large a room or hall the music is in. This is often referred to as spaciousness, or simply ‘echo’.
Loudness and pitch are both constructs of the mind, or brain-interpreted properties. That is, they don’t exist in the real world. For instance, if you turn up the volume knob on your stereo, the amplitude of the vibrations of air molecules that move toward your eardrum will increase; however, it takes a brain to interpret and notice this change.
The ratio between the softest sound we can detect and the loudest one we can hear without permanent damage is one to a million, measured as sound pressure in the air. On the decibel (dB) scale, this span is 120dB, and it is referred to as our dynamic range. An example of 0dB is a mosquito flying in a quiet room, ten feet away from your ears, whereas 120dB would be a jet engine heard on the runway from 300 feet away or a typical rock concert. If a recording has a dynamic range of 80dB, it means that the difference between the softest and loudest sound on that track is 80dB. One needs to be aware that the dB scale is logarithmic: doubling the intensity of the sound source results in a 3dB increase. Therefore, 126dB carries four times the sound intensity of 120dB, even though perceived loudness grows more slowly than intensity.
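The decibel arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes the standard conventions of 20·log10 for pressure ratios and 10·log10 for intensity ratios.

```python
import math

def pressure_ratio_to_db(ratio):
    """Level difference in dB for a ratio of sound pressures."""
    return 20 * math.log10(ratio)

def intensity_ratio_to_db(ratio):
    """Level difference in dB for a ratio of sound intensities."""
    return 10 * math.log10(ratio)

def db_to_intensity_ratio(db):
    """Intensity ratio that corresponds to a level difference in dB."""
    return 10 ** (db / 10)

# One-to-a-million pressure ratio: the dynamic range of human hearing.
dynamic_range_db = pressure_ratio_to_db(1_000_000)

# Doubling the intensity of a sound source.
doubling_db = intensity_ratio_to_db(2)

# How much more intense 126dB is than 120dB.
ratio_6db = db_to_intensity_ratio(126 - 120)

print(round(dynamic_range_db, 1))  # 120.0
print(round(doubling_db, 2))       # 3.01
print(round(ratio_6db, 2))         # 3.98
```

The last value is roughly four because 6dB amounts to two successive doublings of intensity.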
Meter arises from the grouping of tones with one another across time. For instance, a waltz meter organizes tones into groups of three, whereas a march meter organizes them into groups of two or four. This information is extracted by our brains from the overall rhythm and loudness cues.
Key refers to the tonal hierarchy of importance in a musical piece. It’s a human construct and doesn’t exist in the real world; it’s purely a function of our experiences with specific musical styles, idioms, and mental schemas.
Melody is the main theme of a musical piece — the part that tends to get stuck in our minds when we listen to a song that we are particularly fond of. Interestingly, the actual pitch of notes, i.e. the frequency, is not necessarily relevant to melody. It is the relative distance between the notes, or interval, that matters. For instance, not every “Happy Birthday” starts on the same pitch (or note); yet, if the relative distance between the notes remains the same, we are able to identify the song with relative ease.
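To make the point about intervals concrete, here is a small sketch in Python. It assumes notes are written as MIDI note numbers (one step per semitone), which is a convenience chosen for this example and not something the text above relies on. The helper name `intervals` is likewise hypothetical.

```python
def intervals(midi_notes):
    """Return the semitone distance between each pair of consecutive notes."""
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]

# Opening phrase of "Happy Birthday" starting on G4 (MIDI 67): G G A G C B
starting_on_g = [67, 67, 69, 67, 72, 71]

# The same phrase sung five semitones higher, starting on C5 (MIDI 72).
starting_on_c = [note + 5 for note in starting_on_g]

print(intervals(starting_on_g))  # [0, 2, -2, 5, -1]
print(intervals(starting_on_c))  # [0, 2, -2, 5, -1]
```

Every absolute pitch differs between the two versions, yet the interval sequence is identical, which is why both are heard as the same tune.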
Finally, harmony is related to the pitches of different tones and the relationship between them. This leads to the ability to set up tonal contexts, which lead to certain musical expectations that a composer can either meet or violate for expressive purposes. An example of such a violation is a deceptive cadence, a chord progression in which the dominant chord is followed by a chord other than the tonic.
Overall, sound is a mental image created by our brains. A source, typically a musical instrument or the vocal cords, generates a sound wave that carries the elements above. The wave displaces air molecules as it travels through its surroundings and eventually reaches our eardrum, which in turn starts vibrating at the frequency of the sound. Humans can hear frequencies from roughly 20Hz to 20kHz; our ears are only sensitive within this range.
Elaborating on timbre
Timbre is regarded as one of music’s most mysterious and ill-defined properties and thus deserves some further discussion. Now this requires a little physics, but we shall keep it light.
The general tuning standard used today is called A440. It assigns a frequency of 440Hz to the note A found above middle C on a piano. Having this system in place helps us collaborate with fellow musicians across the globe by having our instruments’ tuning standardized. However, since there are only 12 notes in an octave, the note A appears on several keys of a piano. Their frequencies are related to 440Hz by repeated halving or doubling, i.e. 55Hz, 110Hz, 220Hz, 880Hz, etc. A related idea, essential to the perception of timbre, is the overtone (or harmonic) series: the integer multiples of a note’s fundamental frequency. When you hear a saxophone play a note at 220Hz (first harmonic), you actually perceive not only the note at 220Hz, but also its overtones at 440Hz (second harmonic), 660Hz (third harmonic), 880Hz (fourth harmonic), etc.
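As a minimal sketch, the difference between octaves (repeated doubling or halving of a frequency) and harmonics (integer multiples of a fundamental) can be computed directly. The function names below are invented for this example.

```python
A4 = 440.0  # tuning reference in Hz (the A440 standard)

def octaves_of(freq, below=3, above=1):
    """Frequencies of the same note in neighbouring octaves (powers of two)."""
    return [freq * 2 ** k for k in range(-below, above + 1)]

def harmonics_of(fundamental, count=4):
    """First `count` harmonics: integer multiples of the fundamental."""
    return [fundamental * n for n in range(1, count + 1)]

print(octaves_of(A4))       # [55.0, 110.0, 220.0, 440.0, 880.0]
print(harmonics_of(220.0))  # [220.0, 440.0, 660.0, 880.0]
```

The two lists overlap only where a harmonic number happens to be a power of two. The third harmonic at 660Hz, for example, is not an A at all.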
What makes each instrument unique is the intensity of each of these overtone frequencies. Clarinets, for instance, are characterized by having high amounts of energy in the odd harmonics — third, fifth, seventh, and so on. Trumpets, by contrast, have relatively even amounts of energy in the odd and even harmonics. Timbre can also change on the same instrument. For instance, bowing a violin in the center yields mostly odd harmonics and thus can sound similar to a clarinet. However, when bowed one third of the way down, the violin emphasizes the third harmonic and its multiples (sixth, ninth, twelfth, etc.). This is also the principle by which synthesizers work. They essentially generate frequencies with a specific overtone profile to either mimic an already existing instrument or produce less explored, at times otherworldly, sounds.
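The synthesizer principle described above is often called additive synthesis, and a rough sketch of it fits in a few lines of Python. The harmonic weights below are invented for illustration and are not measured spectra of real instruments. A real synthesizer would also shape how the sound evolves over time, which is left out here.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (assumed, CD-quality rate)

def additive_tone(fundamental, harmonic_weights, duration=0.01):
    """Sum sine waves at integer multiples of `fundamental`.

    harmonic_weights[0] scales the first harmonic, harmonic_weights[1]
    the second, and so on. Returns a list of float samples.
    """
    n_samples = round(SAMPLE_RATE * duration)
    samples = []
    for i in range(n_samples):
        t = i / SAMPLE_RATE
        value = sum(
            weight * math.sin(2 * math.pi * fundamental * (k + 1) * t)
            for k, weight in enumerate(harmonic_weights)
        )
        samples.append(value)
    return samples

# Hypothetical overtone profiles for the same 220Hz note.
odd_heavy = [1.0, 0.0, 0.5, 0.0, 0.3]   # energy only in harmonics 1, 3, 5
even_mix = [1.0, 0.6, 0.5, 0.4, 0.3]    # energy spread over odd and even

clarinet_like = additive_tone(220.0, odd_heavy)
trumpet_like = additive_tone(220.0, even_mix)

print(len(clarinet_like), len(trumpet_like))  # 441 441
```

Both tones share the same pitch and duration. Only the relative strength of the overtones differs, and that difference is what the ear would register as a change in timbre.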
As we have seen, the careful concoction of the basic elements of music can give rise to higher-level concepts that are ultimately perceived by our brains as one coherent piece of music. For interested readers who would like to learn more about these concepts and how artists exploit the functionality of our brains to make their compositions as resonant and enjoyable as possible, I recommend having a look at the references listed below.
Varèse, E., 1966, The Liberation of Sound. Perspectives of New Music, Vol. 5, No. 1, pp. 11–19.
Levitin, D., 2006, This Is Your Brain on Music: The Science of a Human Obsession. New York, Dutton.
Powell, J., 2011, How Music Works: The Science and Psychology of Beautiful Sounds, from Beethoven to the Beatles and Beyond. New York, Little, Brown Spark.