Auditory Submodalities: The Internal Voice Most Practitioners Ignore

Most auditory submodality work in NLP gets skipped. Practitioners learn to elicit visual submodalities fluently, find the brightness and distance drivers, run the swish or mapping across, and declare the job done. The auditory channel gets a passing mention during elicitation and then drops out of the intervention. This is a significant gap because, for roughly 30-40% of clients, the auditory coding, not the visual, is the primary driver of emotional response.

The client who says “I keep telling myself I’m going to fail” is not speaking metaphorically. There is a literal internal voice with specific auditory submodalities: a particular volume, pitch, tempo, spatial location, and tonal quality. Those parameters determine how the message lands. The same words, “you’re going to fail,” spoken in a high-pitched, squeaky voice from behind the left ear produce a different response than the same words spoken in a deep, authoritative voice from directly inside the head. Change the auditory coding, and the emotional impact changes with it.

This matters for submodality work broadly because a practitioner who defaults to visual interventions will produce incomplete results with auditory-dominant clients. The visual shift brings partial relief. The internal voice continues running its original coding. The client reports that “something still doesn’t feel right,” and the practitioner, having exhausted the visual tools, has nowhere to go.

Eliciting Auditory Submodalities

The elicitation follows the same logic as visual elicitation but requires different questions. Most clients have never been asked to describe the qualities of their internal voice, so the questions may need to be more specific.

Start with location. “When you hear that internal voice, where does it come from? Inside your head? Behind you? To one side? Above?” Location is often the most accessible auditory submodality for clients who have not done this work before.

Then pitch. “Is the voice high or low?” Tempo: “Does it speak quickly or slowly?” Volume: “Is it loud or quiet?” Tone: “Is the voice warm, cold, harsh, mocking, flat?” Whose voice: “Is it your voice? Someone else’s? If someone else’s, whose?” This last question is clinically significant. A critical internal voice that speaks in a parent’s or former teacher’s tone carries different weight than one in the client’s own voice.

Additional auditory submodalities to check: is the voice constant or intermittent? Does it have rhythm? Is it monotone or does it shift pitch? Is there an echo quality? Does it sound close (like a whisper) or distant (like it’s coming through a wall)?

Document everything. The auditory profile is as detailed as the visual one and just as variable across clients.
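For practitioners who keep session notes electronically, the profile can be recorded in a fixed structure so that no submodality is skipped. The following is only a minimal sketch: the class name `AuditoryProfile`, its field names, and the `missing` helper are illustrative assumptions, not part of any established NLP recording standard.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class AuditoryProfile:
    """One client's reported auditory submodalities for a single internal voice."""
    location: str
    pitch: str
    tempo: str
    volume: str
    tone: str
    whose_voice: str
    # Additional qualities (continuity, rhythm, echo, closeness) are optional.
    extras: dict[str, str] = field(default_factory=dict)

    def missing(self) -> list[str]:
        """Return the core submodalities left blank during elicitation."""
        core = asdict(self)
        core.pop("extras")
        return [name for name, value in core.items() if not value.strip()]


# Hypothetical example: everything elicited except whose voice it is.
profile = AuditoryProfile(
    location="inside the head",
    pitch="low",
    tempo="slow",
    volume="loud",
    tone="harsh",
    whose_voice="",
    extras={"echo": "none"},
)
print(profile.missing())
```

A blank entry flagged by `missing` is a prompt to go back and ask the question, not a finding in itself.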

The Critical Difference Between Auditory and Visual Drivers

Visual submodality shifts tend to produce immediate, noticeable changes in state. The client pushes an image away, and the feeling diminishes within seconds. Auditory shifts work differently. They often produce a delayed response, with the emotional shift arriving five to ten seconds after the submodality change. This delay causes practitioners to underestimate the impact and move on too quickly.

The delay occurs because auditory processing runs on a different temporal scale than visual processing. An image change is instantaneous. An auditory change unfolds over time: the voice needs to speak its message at the new pitch or from the new location before the full effect registers. Give it time. Ask the client to let the adjusted voice run for a full sentence or two before reporting the shift.

The other critical difference: auditory submodalities interact strongly with the kinaesthetic response. A harsh internal voice at high volume does not just produce an auditory experience. It generates a physical contraction, often in the throat or chest. Shifting the voice to a lower pitch and moving it to a more distant location frequently releases the physical contraction spontaneously, without any direct kinaesthetic intervention.

Intervention: Changing the Critical Voice

The most common auditory submodality intervention targets the self-critical internal voice. The protocol is straightforward but requires precision.

Step 1: Baseline. Have the client activate the critical voice and report its full submodality profile: location, pitch, volume, tempo, tone, and whose voice it is.

Step 2: Identify the driver. Test each auditory submodality one at a time while keeping the others constant. Change only the pitch. What happens to the feeling? Reset. Change only the location. Reset. Change only the volume. The driver is the submodality whose shift produces the largest emotional change.

Step 3: Shift the driver to a neutral or absurd coding. Two strategies work.

The neutral strategy shifts the driver to a value that removes emotional impact. If the driver is pitch, move the voice from its authoritative low pitch to a mid-range, neutral tone, like a news anchor reading the weather. If the driver is location, move the voice from inside the head to across the room.

The absurd strategy shifts the driver to a value that makes the message impossible to take seriously. Change the voice to a cartoon character. Speed it up until it sounds like a chipmunk. Give it a ridiculous accent. The words remain the same, but the coding makes them laughable. This strategy works faster, but some clients resist it, feeling that it trivializes their experience. Read the client.

Step 4: Repetition. Run the shift five to seven times with a break state between each. By the fifth repetition, the critical voice should automatically appear with the new coding when activated.

Step 5: Test. Ask the client to think about a situation that normally triggers the critical voice. Does the voice appear? If so, in what coding? If the new coding holds under contextual activation, the intervention has generalized.
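The one-at-a-time testing in Step 2 can be sketched as a simple comparison. The sketch below assumes the practitioner asks the client to rate the intensity of the feeling on a 0 to 10 scale at baseline and again after each isolated shift. That rating scale is an assumption added for illustration and is not part of the protocol as described above, and the function name `find_driver` and the sample numbers are hypothetical.

```python
def find_driver(baseline: float, shifted: dict[str, float]) -> tuple[str, float]:
    """Return the submodality whose isolated shift changed the rating most.

    baseline: client-reported intensity (assumed 0-10) with the original coding.
    shifted:  intensity reported after changing ONLY that submodality,
              with every other submodality reset to its original value.
    """
    if not shifted:
        raise ValueError("no submodalities were tested")
    # Size of the emotional change produced by each isolated shift.
    changes = {name: abs(rating - baseline) for name, rating in shifted.items()}
    # On a tie, the first submodality tested is returned.
    driver = max(changes, key=changes.get)
    return driver, changes[driver]


# Hypothetical session notes: the critical voice rates 8 at baseline.
baseline = 8.0
shifted = {"pitch": 7.0, "location": 3.0, "volume": 6.0, "tempo": 8.0}
driver, change = find_driver(baseline, shifted)
print(driver, change)  # location 5.0
```

Because each rating is taken with only one submodality changed, the largest difference points to the driver. The numbers are a record-keeping aid and do not replace the practitioner's observation of the client's state.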

Auditory Submodalities in the Swish Pattern

The standard swish pattern is visual: a cue image swishes to a desired self-image. For auditory-dominant clients, an auditory swish can be more effective. The cue is the critical voice at full intensity. The desired state is a resourceful internal voice (different words, different coding). On the swish, the critical voice fades rapidly while the resourceful voice rises in volume and clarity.

The mechanics are the same: rapid shift, one direction only, break state between repetitions. The auditory swish takes slightly longer per repetition because the voice needs time to “speak” at each coding level, but five to seven repetitions still complete the intervention within fifteen minutes.

Building Auditory Awareness in Clients

Many clients are not initially aware of their auditory submodalities. They report “a feeling” without recognizing the internal voice that precedes it. Building awareness requires specific exercises.

Start with a simple noticing exercise: “For the next two minutes, listen to whatever internal dialogue is running. Do not change it. Just notice it. What is the voice saying? Where is it located? What does it sound like?”

Most clients discover, sometimes with surprise, that they have been running continuous internal commentary without conscious awareness. This discovery alone shifts their relationship to the voice. What was experienced as “how I feel” becomes “what a voice in my head is saying in a particular tone from a particular location.” That externalization creates the distance needed for submodality work to proceed.

Clients who practice this noticing exercise between sessions develop the auditory awareness that makes practitioner-guided interventions faster and more precise. Assign it as homework before any session where auditory submodality work is planned.