Ring Modulator

Steve Harris wrote:

Ages ago I remember seeing someone (probably Rob Hordijk ;-) describing how you could implement a pitch shift with a ring modulator and a phase shift: something like the difference between the signal ringmodded with a sine wave at the frequency you want to shift by, and the signal phase shifted by 90 degrees and ringmodded with the sine. Can anyone tell me what it actually was, please? Or better still, an algorithm for implementing pitch shift on a general-purpose CPU.

Rob Hordijk wrote:

Sending this reply to the list as it might be of interest to others as well.

Actually there's pitchshift and spectrumshift, two effects that are related in a way but are in fact different. With pitchshift you shift a tone up or down together with all its harmonics; this effect is available on many effect machines. It's based either on a clever manipulation of a somewhat longer delayline or on a "Fourier series analysis/resynthesis" principle.

Spectrumshift (technically called "single sideband modulation") you can do with two sine wave oscillators 90 degrees out of phase, a phaseshifting allpass filter that shifts all frequencies on one output by 90 degrees with respect to another output, and a couple of ringmodulators combining the phaseshifted signals with the sine waves. The effect is that all frequencies (= harmonics) are shifted up or down by a fixed number of Hz, this number being the frequency of the two sine waves. This means that the stronger the effect, the more the harmonics lose their harmonic relation. (This in contrast to pitchshifting!) So the effect is that you "morph" a sound between the dry sound and a "ringmodulation" effect sound at extreme shifts. A shift of a few Hz can, for example, make a static synthetic sound more lively.

The problem is making the phasedifference filter. Such a filter only works right in a limited frequency range, so to make the effect high quality one should use a bank of bandpass filters, each followed by a phasedifference filter giving approximately 90 degrees of phaseshift for that band. But now it gets really interesting if you start to randomly modulate the shifts in each band. This can give very beautiful "very big choir" chorus effects. If you use two spectrumshifters in each band, one shifting in the opposite direction to the other, and mix those to the left and right channels, the effect becomes very spatial as well. One would typically use 33 third-octave (1/3-octave) bandpass filters, 66 spectrumshifters and 33 random modulators. In the calculations one can modify a spectrumshifter to give two outputs, one shifted up and the other shifted down, so only 33 are needed.

In addition you can give every band a random(ly modulated) delay of 10 msec, with random predelays between 0 and 300 msec for each band. This shifts the formants in time in relation to each other and thickens the effect even more. It's clear that you need a lot of DSP power, and memory for the 33 delaylines, to do this!
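The single-band spectrumshifter described above (before the filterbank extension) could be sketched roughly as follows in plain Python. This is only an illustration under stated assumptions: the function names `hilbert_taps` and `spectrum_shift` are made up for this sketch, and instead of an allpass phasedifference network it uses a Hamming-windowed FIR Hilbert transformer as the 90-degree phase shifter, which is a different (and simpler to write) technique. It returns both the up-shifted and the down-shifted output from one set of ringmodulator products, as mentioned in the post.

```python
import math


def hilbert_taps(half_len=50):
    """Hamming-windowed FIR approximation of a 90-degree phase shifter.

    Assumption of this sketch: an FIR Hilbert transformer stands in for
    the allpass phasedifference filter mentioned in the post.
    """
    taps = []
    for j in range(2 * half_len + 1):
        n = j - half_len
        # Ideal Hilbert response: 2/(pi*n) for odd n, zero for even n.
        ideal = 2.0 / (math.pi * n) if n % 2 else 0.0
        window = 0.54 - 0.46 * math.cos(math.pi * j / half_len)
        taps.append(ideal * window)
    return taps


def spectrum_shift(signal, shift_hz, sample_rate, half_len=50):
    """Return (up, down): the input shifted by +shift_hz and -shift_hz."""
    taps = hilbert_taps(half_len)
    up, down = [], []
    for k in range(len(signal)):
        # In-phase path: a plain delay that matches the filter's delay.
        in_phase = signal[k - half_len] if k >= half_len else 0.0
        # Quadrature path: the same signal, phase shifted by about 90 degrees.
        quad = 0.0
        for j, h in enumerate(taps):
            if k - j < 0:
                break
            quad += h * signal[k - j]
        # Two sine oscillators 90 degrees apart, one ringmodulator per path.
        angle = 2.0 * math.pi * shift_hz * k / sample_rate
        ring_cos = in_phase * math.cos(angle)
        ring_sin = quad * math.sin(angle)
        up.append(ring_cos - ring_sin)    # difference: upper sideband only
        down.append(ring_cos + ring_sin)  # sum: lower sideband only
    return up, down
```

Note what a fixed shift in Hz does to a harmonic series: harmonics at 100, 200 and 300 Hz shifted up by 50 Hz land at 150, 250 and 350 Hz, which are no longer multiples of a common fundamental, whereas a pitchshift by a factor of 1.5 would give 150, 300 and 450 Hz. The FIR approximation degrades near 0 Hz and near half the sample rate, which is the same limited-range problem the post describes for the phasedifference filter.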

Some people (including myself) have attempted to make a frequency shifter patch, but for lack of a good allpass phasedifference filter on the NM the results were not of very high "quality". Still interesting sounding, though... Have a look in the patch archives.

A traditional pitchshifter works by filling a fixed-length delayline at a fixed rate and reading it at a variable rate. The rate difference causes the pitchshift. At a certain moment the reading point overtakes the writing point in the delayline, and this would give a severe click. To compensate for this one calculates the absolute difference between the reading point and the writing point in the block of memory making up the delayline. If reading and writing point are the same the result is 0, and if they are at maximum absolute difference the result is normalized to 1 (absolute difference in samples divided by half the number of samples in the block of memory). The result value is multiplied by the value read from the delayline, and that's the output. But this also amplitude modulates the output. To compensate for this one actually reads two values instead of one, from points exactly half the length of the delayline apart. This means that if the multiplication factor of one of them is 0 it is 1 for the other, so the AM is cancelled out.

The length of the delayline influences the quality a bit, and that's why you can find optimization parameters for different intervals on the pitchshifter effects in the better effects machines. These parameters in effect set the length of the delayline (think of the phase differences between the waveforms from the two reading points) and the number of reading points (you need at least two, but you can imagine that using more reading points and different windows can increase the quality).
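As a rough illustration of the delayline method just described, here is a minimal sketch in plain Python. The function name `delay_line_pitch_shift` and its parameters are hypothetical, linear interpolation between samples is an assumption of this sketch (the post does not say how fractional positions are read), and only the basic two-reading-point version with the triangular gain is shown.

```python
def delay_line_pitch_shift(signal, ratio, buffer_len=400):
    """Two-tap delayline pitchshifter; ratio 1.5 shifts up a fifth."""
    buf = [0.0] * buffer_len
    half = buffer_len / 2.0
    delay = 0.0  # distance in samples from the writing point back to tap A
    out = []

    def read(d, write_pos):
        # Fractional read position, linearly interpolated (an assumption).
        pos = (write_pos - d) % buffer_len
        i = int(pos)
        frac = pos - i
        i %= buffer_len
        nxt = (i + 1) % buffer_len
        return buf[i] * (1.0 - frac) + buf[nxt] * frac

    def gain(d):
        # 0 when reading and writing points coincide, 1 when they are at
        # maximum distance (half the buffer apart).
        return min(d, buffer_len - d) / half

    for n, x in enumerate(signal):
        write_pos = n % buffer_len
        buf[write_pos] = x                       # write at a fixed rate
        delay_b = (delay + half) % buffer_len    # tap B: half a buffer away
        out.append(gain(delay) * read(delay, write_pos)
                   + gain(delay_b) * read(delay_b, write_pos))
        # Reading at `ratio` samples per written sample changes the delay
        # by (1 - ratio) per step; it wraps when the taps cross the writer.
        delay = (delay + 1.0 - ratio) % buffer_len
    return out
```

Because the two gains always add up to 1, a ratio of 1.0 simply yields the input delayed by half the buffer. With other ratios the two taps read the waveform at different phases, so the result depends on how the buffer length relates to the period of the input, which is the reason for the "optimization parameters" mentioned above.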

The Fourier series analysis/resynthesis makes a spectral plot (an analysis) of a monophonic (!) sound, extracts the fundamental (the first peak in the spectral plot) and uses the spectral plot to synthesize a signal with a different fundamental. This way you can do pitch correction, as the new pitch does not necessarily have a fixed frequency ratio to the original pitch, as it does with the delayline method. If the analysis and resynthesis are done with enough resolution the result can be very natural. Or you can abuse it to make Cher effects (oh no, not again...!). Plosives, etc. are a bit of a problem as they do not have a fundamental, so there might be a "voiced/unvoiced" detection algorithm as well. One would typically do 50 to 100 FFTs a second.
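A toy version of this analysis/resynthesis idea, for a single frame, might look like the sketch below. Everything here is an assumption made for illustration: the names `magnitude_spectrum`, `first_peak_bin` and `repitch_frame` are hypothetical, a naive DFT replaces the FFT to keep the code short, phases are discarded, there is no windowing or overlap between frames, and the fundamental is assumed to fall exactly on an analysis bin.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes, scaled so a unit sine on a bin reads 1.0."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc) * 2.0 / n)
    return mags


def first_peak_bin(mags, threshold_ratio=0.1):
    """First local maximum above a fraction of the largest magnitude."""
    floor = threshold_ratio * max(mags)
    for k in range(1, len(mags) - 1):
        if mags[k] > floor and mags[k] >= mags[k - 1] and mags[k] > mags[k + 1]:
            return k
    raise ValueError("no peak found (unvoiced frame?)")


def repitch_frame(frame, sample_rate, new_f0, max_harmonics=8):
    """Analyse one frame and resynthesize it with a new fundamental."""
    mags = magnitude_spectrum(frame)
    k0 = first_peak_bin(mags)                 # the fundamental
    old_f0 = k0 * sample_rate / len(frame)
    amps = []
    h = 1
    while (h <= max_harmonics and h * k0 < len(mags)
           and h * new_f0 < sample_rate / 2.0):
        amps.append(mags[h * k0])             # amplitude of harmonic h
        h += 1
    # Additive resynthesis: same harmonic amplitudes on the new fundamental.
    out = [sum(a * math.sin(2.0 * math.pi * (i + 1) * new_f0 * t / sample_rate)
               for i, a in enumerate(amps))
           for t in range(len(frame))]
    return old_f0, amps, out
```

Since the new fundamental is a free parameter rather than a fixed ratio, it can be set per frame, which is what makes pitch correction possible. A frame without a clear peak raises an error here, standing in for the "voiced/unvoiced" decision.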

Roland Kuit

David McClain wrote:

Rob's descriptions are Golden, as always! I would add only one comment:

Delay-line pitch shifting does shift all harmonics in a harmonic (or multiplicative) manner, but it also shifts the formant frequencies, which can give rise to unnatural sounds. A male voice pitch shifted in this manner can sound chipmunkish, even without being sped up.

There is another technique of pitch shifting that shifts the harmonics as the delay line method does, but does not shift the formants. Of course this is even more difficult and requires a harmonic spectral analysis. The harmonic components are then both shifted in frequency and scaled in amplitude so as to preserve the formant frequencies, and the sound is resynthesized from this modified spectrum. The effect sounds more natural. This is another of the analyses supported by SMS tools ( http://www.iua.upf.es/~sms/ ), Kyma, the new Roland XV-5080, and several other advanced systems.
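The difference between the two behaviours can be illustrated with a small sketch that works directly on a list of harmonic amplitudes. The names `envelope_at` and `shift_harmonics` are hypothetical, and the spectral envelope is assumed to be a straight-line interpolation between the measured harmonics; real systems such as those listed above estimate the envelope in their own, more refined ways.

```python
def envelope_at(freq, old_f0, old_amps):
    """Spectral envelope at `freq`, interpolated between old harmonics."""
    position = freq / old_f0 - 1.0  # 0.0 at harmonic 1, 1.0 at harmonic 2, ...
    if position <= 0.0:
        return old_amps[0]
    if position >= len(old_amps) - 1:
        return old_amps[-1]
    i = int(position)
    frac = position - i
    return old_amps[i] * (1.0 - frac) + old_amps[i + 1] * frac


def shift_harmonics(old_f0, old_amps, new_f0, preserve_formants=True):
    """Return (frequency, amplitude) pairs for the shifted harmonics."""
    top = old_f0 * len(old_amps)  # highest analysed frequency
    result = []
    h = 1
    while h * new_f0 <= top:
        freq = h * new_f0
        if preserve_formants:
            # Rescale: take the amplitude the envelope has at the new frequency.
            amp = envelope_at(freq, old_f0, old_amps)
        else:
            # Delay-line behaviour: each harmonic keeps its own amplitude,
            # so the envelope (and its formants) moves with the pitch.
            amp = old_amps[h - 1] if h <= len(old_amps) else 0.0
        result.append((freq, amp))
        h += 1
    return result
```

For example, with a 100 Hz tone whose harmonic amplitudes are 0.2, 1.0, 0.2 and 0.1 (a formant peak at 200 Hz) and a new fundamental of 150 Hz, the formant-preserving variant gives the 150 Hz harmonic an amplitude of about 0.6 and the 300 Hz harmonic about 0.2, so the energy stays concentrated around 200 Hz. Without the rescaling, the strongest harmonic ends up at 300 Hz, meaning the formant has moved up with the pitch.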