rouncer at May 12th, 2012 07:39 — #1
You've got the fast Fourier transform, and my question is: how would I go about getting the chords, bends, note starts and note stops out of a played guitar, so that I can drive synthesis from it?
I read in a forum post that you look for the harmonic with the highest amplitude and treat that as the fundamental, so you detect this and it gives you the amplitude and pitch for the string. Could I do all the strings at once like this (even with bending)?
Thanks for any replies.
smile_ at May 12th, 2012 10:46 — #2
The harmonic with the highest amplitude isn't always the fundamental. Quite often the second or third harmonic is the loudest. Things become even more complicated when several notes are played at once. IMO, automatic note detection is comparable in complexity to optical recognition.
My two cents:
use a logarithmic frequency scale instead of the linear one the FFT gives you,
for each frequency F with power A, add scores C1*A, C2*A, C3*A... to frequencies F, F/2, F/3..., then choose the frequency with the highest score as the fundamental.
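A minimal sketch of the scoring idea above, in case it helps. The weights (standing in for C1, C2, C3...), the 48 bins per octave, the 55 Hz reference and the example peaks are all made-up assumptions for illustration, not values from this thread.

```python
import math
from collections import defaultdict

def score_fundamentals(peaks, weights=(1.0, 0.8, 0.6, 0.4),
                       bins_per_octave=48, f_ref=55.0):
    """Return (best_frequency_hz, scores) for (frequency_hz, power) peaks.

    weights, bins_per_octave and f_ref are illustrative assumptions.
    """
    scores = defaultdict(float)
    for freq, power in peaks:
        # A peak at F votes for F, F/2, F/3, ... as possible fundamentals.
        for k, weight in enumerate(weights, start=1):
            candidate = freq / k
            # Quantise the candidate onto a logarithmic frequency grid.
            log_bin = round(bins_per_octave * math.log2(candidate / f_ref))
            scores[log_bin] += weight * power
    best_bin = max(scores, key=scores.get)
    return f_ref * 2 ** (best_bin / bins_per_octave), dict(scores)

# Made-up peaks for a 110 Hz note whose second harmonic is the loudest.
peaks = [(110.0, 0.3), (220.0, 1.0), (330.0, 0.6), (440.0, 0.4)]
fundamental, _ = score_fundamentals(peaks)
print(f"estimated fundamental: {fundamental:.1f} Hz")
```

With these peaks the 220 Hz harmonic is the loudest, but 110 Hz collects votes from all four peaks and ends up with the highest score.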
rouncer at May 12th, 2012 11:49 — #3
Hmm, maybe this will be harder than I thought. I'll poke around some more. Thanks for the input; I didn't really understand the two cents, but I got the rest.
smile_ at May 12th, 2012 12:48 — #4
Mmm... for example: for the note A, the loudest harmonic can be x2, with the fundamental weaker than harmonics x2, x3 and x4. To detect the fundamental you must take several harmonics into account. Also, the logarithmic-scale analysis I mean is not the FFT: the FFT has too much resolution in the high-frequency area and too little in the low-frequency area.
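To put rough numbers on the resolution point, here is a small sketch that counts how many FFT bins fit into one semitone at a low and a high note. The 44100 Hz sample rate and 4096-point FFT are assumptions picked for illustration; E2 (82.41 Hz) is the open low E string of a guitar in standard tuning.

```python
def bins_per_semitone(freq_hz, sample_rate=44100, fft_size=4096):
    """How many linear FFT bins span one semitone starting at freq_hz."""
    bin_width = sample_rate / fft_size               # constant in Hz
    semitone_width = freq_hz * (2 ** (1 / 12) - 1)   # grows with frequency
    return semitone_width / bin_width

for name, freq in [("E2", 82.41), ("E6", 1318.51)]:
    print(f"{name}: {bins_per_semitone(freq):.2f} bins per semitone")
```

Under these assumptions a semitone at E2 is narrower than a single bin, while a semitone at E6 spans several bins, which is the mismatch a logarithmic frequency scale avoids.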
reedbeta at May 12th, 2012 13:42 — #5
I've messed around with pitch detection a bit, but as Smile says it's a lot harder than it looks. One idea I had was to use the regular FFT for coarsely identifying peaks in the spectrum, then switch to the discrete-time Fourier transform plus golden section maximization to get the precise frequency of the peak. The DTFT uses discrete time samples but continuous frequencies, as opposed to the FFT which discretizes both time and frequency domains equally. However, even if you identify the peaks correctly it's not trivial to extract which notes are being played. I never did get it to work very well.
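A rough sketch of that coarse-then-fine idea for a single peak, using only the standard library. Assumptions: a direct evaluation at the bin frequencies stands in for a real FFT in the coarse step, a Hann window is applied so the magnitude is single-peaked across the search bracket, and the function names, bracket width and tolerance are all hypothetical choices.

```python
import cmath
import math

def dtft_magnitude(samples, freq_hz, sample_rate):
    """Magnitude of the DTFT of samples at one arbitrary frequency."""
    step = -2j * math.pi * freq_hz / sample_rate
    return abs(sum(x * cmath.exp(step * n) for n, x in enumerate(samples)))

def golden_section_max(func, lo, hi, tol=1e-3):
    """Maximise a single-peaked function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = func(c), func(d)
    while hi - lo > tol:
        if fc > fd:                      # maximum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = func(c)
        else:                            # maximum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = func(d)
    return (lo + hi) / 2

def refine_peak(samples, sample_rate, f_min, f_max):
    """Coarse search on the FFT bin grid, then continuous refinement."""
    n = len(samples)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(samples)]
    bin_width = sample_rate / n
    first = max(1, math.ceil(f_min / bin_width))
    last = min(n // 2 - 1, math.floor(f_max / bin_width))
    # Coarse step: the values an FFT would give at these bins.
    coarse = max(range(first, last + 1),
                 key=lambda k: dtft_magnitude(windowed, k * bin_width, sample_rate))
    centre = coarse * bin_width
    # Fine step: search one bin either side of the coarse peak.
    return golden_section_max(
        lambda f: dtft_magnitude(windowed, f, sample_rate),
        centre - bin_width, centre + bin_width)

sample_rate = 8000
tone = [math.sin(2 * math.pi * 441.3 * i / sample_rate) for i in range(512)]
print(f"refined peak: {refine_peak(tone, sample_rate, 100, 1000):.2f} Hz")
```

This only pins down one spectral peak. As the post says, deciding which notes a set of peaks belongs to is a separate and harder problem.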
goz at July 19th, 2012 17:08 — #6
I know this is an old post, but if you are looking for f0 (pitch), the easiest method is the harmonic product spectrum. It's really pretty simple. You take an FFT of the signal you are looking at. Next, make a copy of that spectrum at half the size (keep only every other sample) and multiply it, bin by bin, with a copy of the original spectrum (if you work with log magnitudes, that multiplication becomes an addition). Then take another copy of the original spectrum, this time keeping only every third sample, and combine it in the same way, and so on a few times.
In the resulting "FFT", the largest peak is the pitch.
This method works and is incredibly quick, but you get really rubbish resolution (it's not bad for most things though).
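A small sketch of the harmonic product spectrum, assuming a Hann window, four harmonics and a 50 Hz lower search limit (none of which come from this thread). It multiplies the decimated magnitude spectra, which is the same as adding them on a log scale. To stay within the standard library a slow direct DFT stands in for the FFT; with a real FFT library only `dft_magnitudes` would change.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitudes of bins 0 .. n/2 - 1 (slow direct DFT, stand-in for an FFT)."""
    n = len(samples)
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    return [abs(sum(x * twiddle[(k * i) % n] for i, x in enumerate(samples)))
            for k in range(n // 2)]

def harmonic_product_spectrum(magnitudes, harmonics=4):
    """Multiply the spectrum by copies of itself decimated by 2, 3, ..."""
    length = len(magnitudes) // harmonics
    hps = list(magnitudes[:length])
    for h in range(2, harmonics + 1):
        decimated = magnitudes[::h]      # keep every h-th sample
        for k in range(length):
            hps[k] *= decimated[k]
    return hps

def estimate_pitch_hps(samples, sample_rate, harmonics=4, f_min=50.0):
    n = len(samples)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(samples)]
    hps = harmonic_product_spectrum(dft_magnitudes(windowed), harmonics)
    first = max(1, math.ceil(f_min * n / sample_rate))
    best = max(range(first, len(hps)), key=hps.__getitem__)
    return best * sample_rate / n        # resolution is one FFT bin

# Synthetic 220 Hz note whose second harmonic is the loudest partial.
sample_rate, n = 8000, 1024
amps = (0.3, 1.0, 0.6, 0.4)
note = [sum(a * math.sin(2 * math.pi * 220.0 * (h + 1) * i / sample_rate)
            for h, a in enumerate(amps)) for i in range(n)]
print(f"HPS estimate: {estimate_pitch_hps(note, sample_rate):.2f} Hz")
```

The estimate can only land on an FFT bin centre, so its resolution is sample_rate / n (about 7.8 Hz here), which is the coarse resolution mentioned above.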
The method I'm currently using for speech pitch tracking is based on autocorrelation. Basically, you take the autocorrelation of the window (as in a window-functioned window of audio) you wish to analyse, and then you measure the distance from zero lag to the next highest peak. (This is complicated, as usually the peak will lie between two samples, so you need to break out the interpolation.) This distance is the period of your fundamental frequency (pitch) in samples. It is very fiddly getting this right, but it gives by FAR the best resolution I've come across. It's a very intensive calculation though! The bonus of this method is that by applying a Levinson-Durbin recursion to the resulting autocorrelation data you can easily convert to an LPC representation. Find the roots of the LPC polynomial and you have the formants F1 through Fn as well.
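A bare-bones sketch of the autocorrelation approach, not the actual tracker described above. Assumptions: a Hann window, a 60 to 500 Hz search range, and three-point parabolic interpolation as the interpolation step (the post does not say which interpolation is used). The LPC and formant part is left out.

```python
import math

def autocorrelation(frame, max_lag):
    """Direct (slow) autocorrelation for lags 0 .. max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def estimate_pitch_autocorr(samples, sample_rate, f_min=60.0, f_max=500.0):
    n = len(samples)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(samples)]
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = min(int(sample_rate / f_min), n - 2)
    acf = autocorrelation(windowed, max_lag + 1)
    # Highest peak after zero lag, restricted to plausible pitch periods.
    lag = max(range(min_lag, max_lag + 1), key=acf.__getitem__)
    # The true peak usually sits between samples: fit a parabola through
    # the three points around the maximum and take its vertex.
    a, b, c = acf[lag - 1], acf[lag], acf[lag + 1]
    denom = a - 2 * b + c
    offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
    return sample_rate / (lag + offset)

# Synthetic 147.3 Hz tone with two weaker harmonics.
sample_rate, n = 8000, 1024
amps = (1.0, 0.5, 0.33)
tone = [sum(a * math.sin(2 * math.pi * 147.3 * (h + 1) * i / sample_rate)
            for h, a in enumerate(amps)) for i in range(n)]
print(f"autocorrelation estimate: {estimate_pitch_autocorr(tone, sample_rate):.2f} Hz")
```

The pitch period here is about 54.3 samples, so without the interpolation step the answer would snap to a whole-sample lag. Note that the taper of the window pulls the autocorrelation peak very slightly toward shorter lags, which is part of what makes this fiddly to get exactly right.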