Hi all,
I'm a newbie at audio programming.
I'm trying to capture speech from the microphone and stream fragments of a capture buffer over the network.
I send the fragments by catching the notification event.
My capture buffer is 2560 bytes and is separated into 4 segments.
Each segment holds 20 ms of speech. The sample rate is 16 kHz with 16 bits per sample.
Here is the code of my notification thread:
DWORD capturePos, readPos;
m_captureBuffer->GetCurrentPosition(&capturePos, &readPos);

// Bytes available between our offset and the read cursor (with wrap-around)
DWORD diff = (readPos >= m_NextOffset)
    ? readPos - m_NextOffset
    : readPos + m_BufferSize - m_NextOffset;
if (diff < m_NotifySize) return;

// Lock the capture buffer
VOID* lockedBufferPointer = NULL;
DWORD lockedBufferSize = 0;
m_captureBuffer->Lock(m_NextOffset, m_NotifySize, &lockedBufferPointer, &lockedBufferSize, NULL, NULL, 0L);

// Put data to the output sink
const int SIZE = lockedBufferSize / 2;  // number of 16-bit samples
// write to wave file
// edit the samples

// Unlock the capture buffer
m_captureBuffer->Unlock(lockedBufferPointer, lockedBufferSize, NULL, 0);

// Move the capture offset along
m_NextOffset += lockedBufferSize;
m_NextOffset %= m_BufferSize;
My problem is that the speech I'm sending is very noisy after capturing.
What could I be doing wrong in the capturing?
I don't know how to go on, and it's hard to find good information about capturing with DirectSound.
OK, I have it. The fragment was too small at 20 ms. Now I use 60 ms fragments and it sounds good.
I've made music programs before; I split the audio up into sine waves with the Fourier transform.
Don't worry, loading a WAV yourself is heaps easier than using the DirectX products.
[EDIT] Just learn from the samples, it's what I do. [/EDIT]
But there's no Dolby in DirectSound.
I've sometimes used furious transforms. It never turns out well...
It's very easy to get 'clicky' sounds with improper sampling and/or mixing.
In this specific case, I think the playback quality was being affected by network latency. Good to see you've got a solution already.
I've implemented it with no phase smearing, or very little.
Very good results.
I also coded the "furious transform" myself, even though I can't even say the name properly. Hehe.
I'm actually implementing a new wave file format that stores the oscillator positions of every harmonic over time, so it will help with latency for pre-recorded sounds (samples), just not live inputs; they are the problem.
I don't recommend using the FFT; the DFT is simpler and does a better job.
Rouncer, the FFT is just a fast algorithm to compute the DFT. It gives the same results.
The FFT is more complex to implement, but you can also just use libfftw, which is very easy.
Maybe it does change the sound...
No, it does not. As Reedbeta said, it's the same thing, just a faster algorithm. Think bubble sort and quicksort: both algorithms compute the same result eventually, but quicksort is faster (most of the time, at least).
FFTs are a little tricky to use if you don't fully understand which one to pick. For example, an FFT that is speed-optimized to give only 8-bit results but takes 16-bit inputs will sound bad due to round-off errors and truncation. Those are fine for graphics but suck for audio.
I know you're probably right, but in my experiments making a phase vocoder (which turned out successful, if you remember me asking silly questions about the FFT a while back), the longer a section you pass to it, the more wooshy the sound comes back out. Coding the DFT myself at last, I could actually pick off each harmonic one by one, and that's what gave it to me.
But I'll definitely learn the FFT eventually, so I don't really know what I'm talking about yet.
The woosh you get out of it sounds a little like reverb, except it's a bit different and it sounds real nasty, and I'm making music with it now; it's totally awesome. Harmonic-domain effects are the future of digital.
Turning every sine wave into a square wave individually is a new kind of expensive distortion, and there's opportunity for chorusing too.