The Evolution of Song

Jonathan L. Friedmann, Ph.D.

The earliest rudiments of musical expression were most likely vocal. This basic premise connects diverse speculations about music’s origins. Whether music—broadly defined as structured, controlled, and purposeful sound—began with grunts of aggression, wails of pain, mating howls, or infant-directed communication, the vocal instrument was the source from which it sprang. Despite the lack of records stretching back hundreds of thousands of years, speculative musicologists have sketched cursory evolutions of vocal music. According to Alfred Einstein, the eon-spanning process had three stages: pathogenic (emotion-born), logogenic (language-born), and melogenic (melody-born). This hypothesis, presented in his 1954 essay “Word and Music,” is unique for its qualitative editorializing. In Einstein’s view, the combination of voice and music becomes increasingly problematic as the stages unfold.

The first stage, pathogenic music, represents the “starkest expression of pure emotion.” Einstein viewed the spontaneous, wordless tones of so-called “primitives” as the most pristine type of vocal music. Beyond romanticizing the “noble savage,” he argued that “the meaningful word weakens rather than strengthens such pure expression, since convention tends to attenuate it.” The union of word and music pollutes the original purity.

The degrading effect is less pronounced in stage two: logogenic music. In word-born song, melodic shape, movement, phrasing, and cadences are directed by the ebb and flow of a text, rather than a consistent beat or meter. It is a form of musical grammar—sometimes called speech-melody or stylized speaking—wherein accents and inflections are stressed through unobtrusive, arhythmic, word-serving melodic figures. Such is the mode of Greek epic poetry, Gregorian plainchant, and Jewish scriptural cantillation. Logogenic music has its own disadvantage—namely, the neutralizing of emotion. Because the music serves the text with formulaic motives (described by Einstein as a “minimum of music”), the same sounds are invariably used to transmit texts of varying thematic and emotional content. In this sense, it is the opposite of pathogenic vocalizing.

The third stage, melogenic music, is song proper: a short poem or set of words fitted to a metrical tune. By and large, musical considerations, like rhythm and melody, outweigh textual concerns. Although songs often grow from or reflect upon emotional states, the rules of style and form tend to restrain raw feelings. The structure limits the number of syllables available, and the measured phrases reduce word options. The result is filtered sentiment, a contrast to both unfettered pathogenic music and text-first logogenic music.

Without doubt, Einstein’s scheme has its weaknesses. Not only is the evolution of song non-linear (all three forms still exist today), but blending is also not uncommon. For instance, blues singing, which adheres to highly conventional forms, is known for its “pure emotion.” Within a strict melogenic framework, short phrases and repeated words convey rich layers of emotional content. Even so, Einstein’s three-stage outline raises awareness of the potential impediments of the various types of vocal music. Knowledge of these built-in barriers can help the performer or songwriter transcend them in their own musical quests.

Jonathan L. Friedmann, Ph.D.

Musician and naturalist Bernie Krause identifies two categories of organism-derived sounds: biophony, sounds created by non-human animals, and anthrophony, sounds produced by human beings. Some of these sounds are “musical” in the inclusive sense of displaying structured and intentional patterns that unfold over time. Precisely which sounds fit under this broad definition is debatable. However, on a basic level, we are intuitively attentive to musical sounds around us, both creaturely and human-made. What is perhaps less obvious—and more fundamental—is the extent to which our sense of music is physiologically derived.

This anthrogenic (human-born) appreciation centers on two essential musical elements: rhythm and melody. Both originate with inborn “instruments.” Heartbeats and breathing lay the foundations of rhythm. The voice sets the template for melody. As individuals mature and cultures progress, these internal mechanisms are translated into external instruments, which are themselves imitations and expansions of the organ-instruments within.

Rhythmic awareness begins in the womb. The underlying neural structures of hearing develop early in utero. By the end of the third trimester, a baby can distinguish a wide range of frequencies. Among the sounds she hears are her own heartbeat, at 120 to 160 beats per minute, and her mother’s, at 60 to 80 beats per minute. When the infant is born, the tempo of breathing is added to the mix. As the child develops, rhythmic exposure and experimentation are diversified: rocking, clapping, banging, shaking, walking, stomping, dancing. It is no coincidence that excited music is fast-paced, mimicking quick breaths and heartbeats, while relaxed music is slow-paced, mimicking calm breaths and heartbeats. Techno, dirges, marches, meditations, and all manner of musical styles play off these natural rhythms.

The same is true of melody. The mother’s voice, which also resonates in the womb, is our first introduction to melodic patterns. Newborns show a preference for music (organized sound) over noise (confused sound), and for vocal music over instrumental music. Mothers instinctively communicate through “motherese” (high-pitched, sliding, infant-directed intonations), which, through exaggeration, reinforces characteristics of the native language. The infant, in turn, babbles in language-patterned speech-song long before she can form words. These verbal and verbal-imitative vocables set the framework of melody, both sung and instrumental. In every culture, melody is deeply rooted in the phrasing, inflections, and articulations of the spoken vernacular.

We cannot escape the physiological/anthrogenic basis of music perception and production. Rhythmic and melodic sense are born with us. Our hearts, breath, and voice invariably inform which sounds—human and non-human—we hear as music, and which ones we do not.

Hard (Melodic) Cases Make Bad (Melodic) Law

Jonathan L. Friedmann, Ph.D.

“Hard cases make bad law.” This legal maxim cautions against seeking general principles in the extremes. A case that is hard, either because it is unusually complicated or emotionally loaded, occupies disputed territory outside of the uncontroversial center. General law is derived from average situations and common concerns; difficult cases neither fit within its parameters nor contribute to them. Similarly, aesthetic outsiders offer little to normative notions of art. Duchamp’s Fountain and Cage’s 4’33” might be fertile topics for discussion, but without a basic consensus about what constitutes art, they would simply be an out-of-place urinal and a prolonged awkward silence.

Philosophers of art often give undue attention to fringe examples and provocative excursions, as if the existence of rule breakers sends aesthetics into a whirlwind of subjectivity. Who is to say whether Piss Christ is any more or less magnificent than Venus de Milo? The absurdity of this question reiterates the importance of the artistic center and its values. There is, of course, room for divergent approaches and variegated judgments; but art is generally recognized as art. (Incidentally, the outsider pieces cited above—Fountain, Cage’s 4’33” and Piss Christ—have each been accused of not being art.)

The extent to which artistic conceptions are natural is demonstrated by melody. Certain elements are present in almost every Western tonal melody, from Baroque to mariachi to soul to grunge. These include repeating devices (e.g., melodic intervals and rhythms), a range within an octave-and-a-half, conjunct motion with occasional leaps (steps and skips), gravity (ascent, climax, and descent), and harmonic movement resolving to the root. These and other components are conventional to the point of being intuitive: any spontaneously imagined tune will likely contain most or all of them. This does not mean that adventures are forbidden in mainstream melodies. Standard components can be periodically stretched, as long as the overall integrity of the melody remains intact.

“Hard cases” in the world of melody are those that actively disregard this musical intuition. Twelve-tone serialism is a prime example, with its lack of tonal center, tone rows (non-repetitive arrangements of the notes of the chromatic scale), and regulated obscuration of patterns. Such musical experiments are conscious departures from the norm: they take account of the conventional building blocks, and proceed to knock them over. As with peculiar litigations, they can be thought-provoking and foster debate; but their influence on melodic standards and recognition is minimal at best.

Schoenberg vs. The People

Jonathan L. Friedmann, Ph.D.

Arnold Schoenberg invented his twelve-tone method to replace normative conceptions of melody. In so doing, he discarded or otherwise obscured the most attractive and enduring elements of music: repetition, anticipation, and predictability. Musical satisfaction derives from our ability to identify phrases, discern tensions, predict resolutions, detect climaxes, perceive suspensions, and recognize other structural features. We are pleased when these expectations are fulfilled and surprised when anticipations are foiled or delayed. The relative unpredictability of Schoenberg’s system tosses all of this out.

According to the rules of twelve-tone technique, the chromatic scale must be organized in a tone row wherein no note is sounded more often than another. This eliminates intuitive patterns, annihilates key signatures, and contradicts millennia-old musical tendencies. When the row occurs again, as it does with mathematical regularity, its wide intervals, variation, and turbulent character do little to please the pattern-hungry ears of the average auditor.
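For readers who like to see a rule spelled out, the tone-row constraint described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sharp-only note spellings and the function names `is_tone_row` and `random_tone_row` are invented for this sketch, and shuffling the notes at random is a stand-in for the composer's deliberate ordering, not a description of how Schoenberg built his rows.

```python
import random

# The twelve notes of the chromatic scale (sharp spellings chosen for simplicity).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def is_tone_row(notes):
    """Return True if `notes` sounds each of the twelve notes exactly once."""
    return len(notes) == 12 and set(notes) == set(PITCH_CLASSES)


def random_tone_row(seed=None):
    """Return one arbitrary ordering of the twelve notes.

    Hypothetical helper: a random shuffle merely produces *a* valid row;
    a composer would choose the ordering on purpose.
    """
    rng = random.Random(seed)
    row = list(PITCH_CLASSES)
    rng.shuffle(row)
    return row


if __name__ == "__main__":
    row = random_tone_row(seed=1)
    print(" ".join(row))
    print("valid tone row:", is_tone_row(row))
```

A sequence that repeats any note before all twelve have sounded, such as a C major scale padded out to twelve notes, fails the check, which is the formal version of the statement that no note is sounded more often than another.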

Despite its novelty and intellectual intrigue, Schoenberg’s method has been called “senseless,” “unbearable,” “torturous,” and worse. In 1930 the Musical Times of London declared, “The name of Schoenberg is, as far as the British public is concerned, mud.” Two decades later the Boston Herald published this invective: “The case of Arnold Schoenberg vs. the people (or vice versa, as the situation may be) is one of the most singular things in the history of music. For here is a composer . . . who operates on the theory that if you know how to put a bunch of notes on a piece of score paper you are, presto, a composer” (Rudolph Elie, November 11, 1950).

Witty attacks like these are far too numerous to begin listing here. But are charges of misanthropy warranted? According to psychologist David Huron, Schoenberg’s system is less atonal (without a tonal center) than it is contratonal: it deliberately circumvents tonal implications. If the twelve notes were put into a randomizing computer program, they would occasionally occur in sequences resembling melody as we know it. But Schoenberg and his twentieth-century disciples meticulously avoided even hints of such patterns. As such, they expunged from their music precisely that which human ears have evolved to enjoy.

Lest this seem an overstatement, Huron and his colleague Joy Ollen found that roughly ninety-four percent of music contains clear and verbatim repetition within the first few seconds. This figure derives from examples spanning five continents and inclusive of styles ranging from Navajo war songs to Estonian bagpipes to Punjabi pop. It is probable that Schoenberg’s music wouldn’t even be recognized as music in many of these cultures.

This does not, of course, mean that twelve-tone serialism is without its admirers, or that Schoenberg’s name is unanimously considered “mud.” Some of his works even approach accessibility (in their own way), notably Moses und Aron and A Survivor from Warsaw. But general responses echo those of the Boston Herald, which went on to state: “[His music] never touches any emotion save curiosity, never arouses any mood save speculation on how the conductor can conduct it and how the musicians can count the bars.”

The Rise and Fall of Melody

Jonathan L. Friedmann, Ph.D.

Music exhibits the human propensity to imitate nature and the delight we take in that imitation. Rhythm is a stylization of natural motion. Beating hearts, falling rain, rustling leaves, prancing animals and other organic patterns inspire rhythmic mimesis. Birdsong has influenced musicians throughout history, from indigenous folk singers to classical composers like Mahler and Messiaen. Harmonic dissonances and consonances are unconsciously sensed as simulations of human passions. Since the beginning, natural forces have molded and been woven into music’s very essence.

The bond between music and nature did not escape Italian Renaissance composer and music theorist Franchinus Gaffurius. A noted humanist and personal friend of Leonardo da Vinci, Gaffurius was keenly interested in how people derive musical sounds from their environment and utilize those sounds to achieve specific aims. Among his contributions to the naturalistic conception of music is the notion of “musical gravity,” which he introduced in his major treatise Practica musicae (1496): “A descent from high to low causes a greater sense of repose.” With this simple statement, Gaffurius encapsulated the instinct of tonal music to resolve in a cadence to the tonic, or first scale degree.

This movement is imitative in two important ways. First, the downward movement of the musical line resembles forces that regulate motion in the natural world. The descending pull reinforces our orientation toward the tonic and causes us to feel as though we have arrived at the ground level. Second, it simulates a sense of emotional resolution or closure. By bringing us back to the home or tonic note, melody gives a sensation of gratifying release.

Acknowledging the tendency of musical phrases to descend and rest at the tonic, composers of tonal music employ various methods to protract the time leading to the inevitable conclusion. What often results is a series of ascensions, which generate tension and energy, followed by the much-anticipated resolution, which bestows satisfaction proportional to the duration the listener has waited for it.

Music theorists since Aristotle have recognized tension as one of music’s fundamental properties. Like a coiled spring that is pushed and pulled, musical passages portray a cyclic dance, passing through increases and decreases in intensity on their way to a resting position. Human beings seem hardwired to perceive this musical interplay. We feel musical tension on a primal level, as if it were a visceral or kinesthetic experience. When musical suspense reaches its height, our muscles tighten, and with musical resolution, our muscles relax. Of course, no tone, interval, or harmony is intrinsically tense. The impression of tension stems from culturally derived expectations, which may differ from place to place. But, regardless of cultural variation, musical gravity almost universally wields its power on melodic structure, alleviating tension through downward movement.

The mutually reinforcing elements of musical gravity and tension and release go a long way toward explaining our affinity for melody. These forces are an imitation of nature, both in terms of mimicking the rise and fall of objects and in terms of replicating emotional life. Moreover, the usual melodic path toward repose appeases our longing for closure. Through a succession of notes, melody creates and resolves drama in a clean and logical manner that is a human ideal.

