Adventures In Pitch Shifting
 
I utilize audio sample playback in a few ways in my Soundb0ard application - I have one shot sample playback, and I have a Looper which uses a form of granular synthesis to time stretch.
Sample files are stored in PCM Wav files which have a header, followed by the audio data stored in an array of numbers, one per sample. Normal playback entails playing those samples back at sample rate at which it was recorded, e.g. 44,100 samples per second.
In order to pitch shift a sample playback, i.e. slow it down or speed it up, you have a few options. You can think of pitch shifting as resampling, e.g. to play a sample back at twice the speed, you could resample at half the original sample rate, i.e. remove half the samples, and then playback the resampled audio at the original sample rate; or to slow down playback to half-speed, you could play every sample twice.
However, what happens if you want a fractional pitch, such as 1.1 x original speed or 0.8? The naive way, which I’ve been using up till now, was to progress through the array at the fractional speed, i.e. instead of moving through the array 1 sample at a time, I would maintain a float read_idx, that would increment at the sample ratio, e.g. 1.1 x and then calculate the playback value as a linear interpolation between the two closest points in the audio data array. This works ok for some ratios, but some can sound a bit too gnarly.
Recently via a reddit thread I came across this wonderful resource - Sean Luke, 2021, Computational Music Synthesis, first edition, available for free at http://cs.gmu.edu/~sean/book/synthesis/
"But it turns out that there exists a method which will, at its limit, interpolate along the actual band-limited function, and act as a built-in brick wall antialiasing filter to boot. This method is windowed sinc interpolation."
Windowed Sinc Interpolation relies on this Sinc Function:
 
"you can use sinc to exactly reconstruct this continuous signal from your digital samples."
The links on this page can explain the math better, but basically in order to convert the frequency / sample rate, you walk through your original samples as the new sample rate and apply this sinc operation over a window of neighboring samples before and after your current sample, applying and summing the result of the sinc function.
From the Sean Luke book above, i converted this algorithm into code:
 
My first implementation didn’t work. The pitched signal was recognisable but was amped too high and sounded a lil janky. I think I mixed up some indexes with the value they should be representing.
I then found this amazing Ron's Digital Signal Processing Page, which has a clear concise implementation in Basic:
 
I implemented this in C++, and the code was clearer to read. After applying the repitch my signal was still clean but no matter what pitch ratio I used, my return signal was always double the original pitch. I must have made a calculation wrong. Possibly to do with handling stereo values.
Lazily I turned to Google Gemini…
> can you give me some example c++ code that will change the frequency of an array of samples using sinc ?
..
<boom>>
> can you expand that example to handle a stereo signal?
<boom>>
> using an interleaved stereo signal, please
<boom>>
> can you improve the algorithm using a hann window?
<boom>>
Ok, quite impressed. I dropped the code into my Looper, and it worked great.
Heres the before, with linear playback:
Heres the after using windowed sinc.
I think it sounds cleaner and better, so i think the implementation works? I’ll play with it a while and see if I prefer it. Here’s the current code:
 
Job done?
No, there are some performance trade-offs.
I initially implemented it for the granular playback system, which meant only dealing with small arrays of data. However this meant I was doing redundant work, recalculating the same values upon each loop.
I moved the window sinc operation to be run once when you call the RePitch function. This becomes a performance bottleneck as those samples can be large arrays, and you dont want this being run on your audio thread as if it takes too long to run, you’ll experience audio drop outs. I looked to a newer feature of C++ to run the repitch algorithm, using std::async from <future>.