Is the mathematical method referred to in the MIR project the same convolution method used for solving differential equations? I would be interested to know more exactly (mathematically) how this process works... or perhaps you could direct me to some deeper, more technical literature on the subject.
-
MIR math
-
alanjoseph,
I hate to be a spoil-sport, but inasmuch as Herb & the gang are onto something Really Right (TM), I think they should stop talking about implementation details of MIR publicly until it is a fait accompli. Features (what are they providing/you're buying) -- sure. How to do it -- silence please. The world's full of guys (and maybe even gals) who can implement anything someone else can explain to them, and they never seem to mind profiting from someone else's original IP.
I'm as curious as anyone else, but I'd like to see MIR succeed or fail on its merits, not due to a competitor -- especially an incumbent competitor -- appropriating their creative work. IMHO, too much has already been said.
Rag on me all you want on this subject, I honestly think this is necessary.
Best I can offer you Herb -- I'm still too mad about the other stuff.
dot
-
Hi Alan
I don't know what MIR will be using, but the convolution so often
mentioned in connection with impulse response is the normal math
discipline used for all sorts of things.
Here's a link to a DSP guide ( includes texts on convolution):
http://www.dspguide.com
And here are some rather old books from my schooldays [:)]
For a pure math description:
'Advanced Engineering Mathematics' - Erwin Kreyszig
For DSP description/application:
'Discrete-Time Signal Processing' - Van Den Enden/Verhoeckx
'Digital Signal Processing' - DeFatta/Luca/Hodgkiss
Bjarne
-
I've been trying to understand how convolution actually works. Can somebody please explain it to me in layman's terms?
Sure I understand the modeling side - you take an impulse response/responses of something and derive its characteristics (transfer function, etc.) from that. What I don't understand is what is done to an incoming signal once the model has been built, i.e. what is it that takes so much computation?
In other words, standard reverbs run everything through filters and delays. These processors are clearly doing much more than that.
-
Well, it's not that easy to describe it in layman's terms, but I'll give it a shot.
Any environment has a certain frequency response, which is basically how the environment responds to different frequencies. So to get the sound of the trumpet in a hall, for example, you need the frequency response of that hall. The problem now is how to get that frequency response.
One way is to play a pure tone, then record the response (amplitude and phase), play another tone, record again, etc., until you've recorded a range of tones suitable for your purposes.
The magical convolution formula allows another way. You simply record the impulse response of the environment. Technically an impulse is a theoretically infinitely loud sound of no duration, with an area of one when you mathematically integrate it. In the real world it's a gunshot, or something similar. So you go into the hall, fire a gunshot or whatever you want to use as the "impulse", and record the result. That recording carries the hall's frequency response (a Fourier transform of it gives the response explicitly), and convolving the dry trumpet with it gives you the sound of the trumpet in the hall.
Apparently it's far easier for people to use this second method, which is why convolution is so popular these days.
By the way, I have a degree in Electrical Engineering, which is why I know all this crazy stuff.
Anthony Lombardi
www.mp3.com/alombardi
-
The reason convolution takes so much computer power is that there need to be N multiply-accumulate operations for every input sample, where N is the number of samples in the impulse response. So, for reverb, which requires a long impulse response (longer reverb tails mean more impulse response samples), this is a lot of multiply-accumulates. Compare this to a simple filter, where there may be only one such operation (or a handful) for each input sample.
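To put rough numbers on this, here is a back-of-the-envelope sketch. The 3-second tail and the 44.1 kHz sample rate are assumed values picked for illustration, not figures from MIR or any other product.

```python
# Rough cost of direct (time-domain) convolution.
# Assumed, illustrative values: 44.1 kHz sample rate, 3-second impulse response.
SAMPLE_RATE = 44_100          # samples per second
IR_SECONDS = 3                # length of the reverb tail in seconds

def macs_per_second(sample_rate, ir_seconds):
    """Multiply-accumulates needed per second of audio, for one channel."""
    n = sample_rate * ir_seconds      # N = samples in the impulse response
    return n * sample_rate            # N operations for each input sample

ir_samples = SAMPLE_RATE * IR_SECONDS
total = macs_per_second(SAMPLE_RATE, IR_SECONDS)
print(ir_samples)   # 132300 multiply-accumulates per input sample
print(total)        # 5834430000 per second of audio
```

Numbers of this size are why convolution software commonly turns to FFT-based fast convolution rather than computing the direct sum.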
-
Well - I'll try dusting off my old math and DSP [:)]
In order to describe the convolution process, a few concepts need to be explained first:
Any description of the real world (from an engineering standpoint) frequently involves the concept of a system. Anything, be it a pendulum, a motor or a concert hall, is ideally described as an isolated system. Even the weather is perceived as a set of systems, and the reason the weather boys are always in trouble is that they cannot isolate it from the surrounding world - there are too many outside impacts on the systems. There is also another problem: they are not linear systems!
(At this point I should explain that I'm not a qualified meteorologist - and how the hell do you spell it anyway - I'm just making it up as I go along.)
The concept of linear systems is, in short:
If you feed the system a signal x and get an output y, you expect the following to be true: 2x yields 2y at the output. Take an amplifier: if you increase the input, you expect the output to increase proportionally (at least within the normal operating range). The linearity of the system might collapse when exposed to extreme conditions.
There are other conditions that qualify a linear system (for instance, the response to the sum of two signals must equal the sum of the individual responses), but I think you get the picture. In real life, systems are in fact seldom truly linear - but frequently they can be approximated as linear. In our context of convolution, a concert hall is considered a linear system.
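The "2x yields 2y" test can be tried on a toy example. In this sketch both systems are hypothetical: an ideal amplifier, and the same amplifier with its output clipped to stand in for the "extreme conditions" where linearity collapses. The gain and limit values are arbitrary.

```python
# Two toy "systems" acting on a list of samples (both hypothetical examples).
def amplifier(signal, gain=3.0):
    """Ideal amplifier: every sample is scaled by the same gain."""
    return [gain * s for s in signal]

def clipping_amplifier(signal, gain=3.0, limit=5.0):
    """Same amplifier, but the output cannot exceed +/- limit."""
    return [max(-limit, min(limit, gain * s)) for s in signal]

def doubles_with_input(system, signal):
    """Check the 2x -> 2y property for one particular signal."""
    y = system(signal)
    y_doubled_input = system([2 * s for s in signal])
    return y_doubled_input == [2 * v for v in y]

x = [0.5, 1.0, -1.5]
print(doubles_with_input(amplifier, x))           # True
print(doubles_with_input(clipping_amplifier, x))  # False
```

Passing this check for one signal does not prove a system is linear, but failing it for any signal shows that it is not.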
So now we have a system, e.g. a concert hall h(t); a signal x(t) which is fed to the system, e.g. an orchestra playing; and an output, or response to the signal, from the system, y(t). In our case the output will be the signal itself and some more.
x(t) -> system -> y(t)
A more meaningful way of describing the system response (an orchestra can produce all sorts of signals and frequency content, so it's not a very good way to describe the system response) is to use two standardized signals: the unit impulse, also called a delta function, and the unit step function.
The unit impulse is a very short, in theory infinitely short, pulse, and the unit step function is a very long, in theory infinitely long, signal. The unit impulse or delta function is good for describing a system's frequency response, since the very short pulse generates a complete frequency spectrum. On the other hand, the unit step function is good for measuring a system's response time. As you might have guessed, it's the delta function which is of most interest to audio systems.
So far I've been using the notation x(t) to indicate that we're dealing with continuous signals in the time domain, hence the t. This means that our audio signal is a function of time - just like when you watch it with a scope, you'll see the signal moving up and down as time goes by. Now we're going to move to the discrete-time domain representation, x[n], where n is the sample number. This is like looking at a sampled signal in your wave editor - you can zoom in on a particular sample and read its value. On the x-axis you have all the samples and on the y-axis the corresponding values or amplitudes. Instead of a continuous signal, the sampling process has chopped the signal up into a number of discrete points, of which the number per second is the sampling frequency.
The impulse signal thus becomes delta[n] and looks like this: 1 0 0 ..... 0, in principle for eternity. That is, the first sample is 1 and the rest are zero.
One important property of the delta function is that when transformed to the frequency domain, by a mathematical process like the Laplace or Fourier transform, it shows a frequency spectrum that covers the whole spectrum. This means that if we know a system's response to such a signal, we know how the system responds to all frequencies. The output from a system, given a delta function input, is called the system's impulse response h[n]. And this is the function that is used in the convolution process. So we make a noise in the concert hall with an approximation of the delta function, e.g. a gunshot, or better a truncated version to get it as short as possible, then we sample the output from the concert hall via a mic and we have a discrete version of the impulse response, h[n]. From this we are now able to calculate the system's response to any signal, e.g. a sample of a violin. And this finally brings us to the convolution itself, which can be described as the process of taking a discrete signal sample by sample and running it through the impulse response - the process of doing so is not entirely straightforward. This is why it's called convolution - the 2 signals, our violin sample and the impulse response from the concert hall, are folded with each other.
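Two of the claims above can be checked numerically: that a unit impulse contains every frequency equally, and that feeding it into a system returns the system's impulse response. This is only a sketch. The 8-sample length, the naive DFT helper and the made-up hall response are assumptions for illustration.

```python
import cmath

def dft_magnitudes(signal):
    """Magnitude of each bin of a naive discrete Fourier transform."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

def convolve(x, h):
    """Direct convolution: y[n] = sum over k of x[k] * h[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for j, hj in enumerate(h):
            y[k + j] += xk * hj
    return y

delta = [1, 0, 0, 0, 0, 0, 0, 0]     # unit impulse, 8 samples long
hall = [0.0, 2.0, 4.0, 2.0, 0.0]     # made-up impulse response

# Every frequency bin has the same magnitude: the impulse excites them all equally.
print([round(m, 6) for m in dft_magnitudes(delta)])

# Feeding the impulse into the system returns the impulse response itself
# (the remaining output samples are zeros).
print(convolve(delta, hall)[:len(hall)])
```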
The impulse response is moved across the violin sample step by step - but flipped around.
That is, a sequence such as 1 0 0 ..... 0 is flipped and becomes 0 ..... 0 0 1 - now in the real world there is no such thing as infinitely long or short signals - indeed if this were the case we would have to wait till the end of time before we got an answer from the program doing the convolution. Instead a finite version of the impulse response is used. To keep things simple we will use a very short version of both the violin sample and the impulse response:
Let the violin sample consist of 5 samples, x[n] = 1 2 3 4 0, and the impulse response of 5 samples, h[n] = 0 2 4 2 0.
Remember the digits are the amplitudes of the given samples. The sample numbers are enumerated from 0 and onward.
Now let's convolve:
Step 1.
Flip the impulse response: h'[n] = 0 2 4 2 0 (notice the h' here, meaning flipped). In this case there's no difference, since h[n] is symmetrical.
Step 2.
Position h'[n] over x[n] at the leftmost position, x[0], multiply each x[n] by the h'[n] sample covering it, and add the products together. Notice that in this first position only the rightmost sample of h'[n], namely h'[4], is actually covering x[n] - namely x[0]. h'[0] through h'[3] lie before x[n] starts.
Step 3.
Move h'[n] one step to the right and repeat the process in step 2 - now h'[n] is covering x[0] and x[1].
Step 4.
Repeat step 3 until h'[n] has moved right to the end of x[n], so that only h'[0] is covering x[4].
You have to imagine h'[n] as a sort of window you slide across the violin sample, seeing only the part of the violin sample that it covers.
Each pass through step 2 produces one point in the output signal.
Sample number one in the violin, x[0], has the value 1. This gives us the first sample in the output:
y[0] = 1*0 = 0 - only one sample of x[n] is involved at this point
...
y[4] = 1*0 + 2*2 + 3*4 + 4*2 + 0*0 = 24 - now all samples in x[n] are covered and thus involved
...
y[8] = 0*0 = 0 - now we have reached the end, with only x[4] being covered by h'[0], while h'[1] - h'[4] have moved beyond it to the right.
I have left out the calculations for most of y[n], but the principle is there and you can carry out the rest. What is described here is actually the rising part of a sawtooth wave being sent through a very simple sort of lowpass filter.
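For anyone who would rather let the computer carry out the rest, steps 1 to 4 can be sketched as follows. The function name and the zero-padding trick are illustrative choices, not anything prescribed by the steps themselves. Padding x with zeros simply lets the flipped window hang over both ends.

```python
def convolve_by_sliding(x, h):
    """Convolve x with h by sliding the flipped h across x (steps 1 to 4)."""
    flipped = h[::-1]                          # step 1: flip the impulse response
    out_len = len(x) + len(h) - 1
    # Pad x with zeros so the window can hang over both ends.
    pad = [0] * (len(h) - 1)
    padded = pad + list(x) + pad
    y = []
    for shift in range(out_len):               # steps 3 and 4: move right one step at a time
        window = padded[shift:shift + len(h)]  # the part of x the window sees
        y.append(sum(a * b for a, b in zip(window, flipped)))  # step 2
    return y

x = [1, 2, 3, 4, 0]   # violin sample
h = [0, 2, 4, 2, 0]   # impulse response
print(convolve_by_sliding(x, h))   # [0, 2, 8, 16, 24, 22, 8, 0, 0]
```

The values at positions 0, 4 and 8 match the hand calculations for y[0], y[4] and y[8].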
All that remains is to normalize the output signal - you might have noticed how some amplitudes become exceedingly large, e.g. 24.
Also notice that the output signal is longer than either of the two signals involved in the convolution process - this is always the case: length(y) = length(x) + length(h) - 1.
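A short sketch of these last two points follows. It writes the convolution as the sum y[n] = x[0]*h[n] + x[1]*h[n-1] + ..., which is the same operation seen from the output side. The post does not say which kind of normalization is meant, so scaling to a peak of 1.0 is an assumption made here.

```python
def convolve(x, h):
    """y[n] = sum of x[k] * h[n - k] over every k where both indices are valid."""
    return [
        sum(x[k] * h[n - k] for k in range(len(x)) if 0 <= n - k < len(h))
        for n in range(len(x) + len(h) - 1)
    ]

def peak_normalize(y, peak=1.0):
    """Scale y so its largest absolute value equals peak (assumed convention)."""
    largest = max(abs(v) for v in y)
    return list(y) if largest == 0 else [v * peak / largest for v in y]

x = [1, 2, 3, 4, 0]
h = [0, 2, 4, 2, 0]
y = convolve(x, h)
print(len(y) == len(x) + len(h) - 1)   # True: 5 + 5 - 1 = 9
print(max(peak_normalize(y)))          # 1.0
```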
Well - that's about it. Unfortunately I cannot make drawings and formulae in the forum editor, which surely would be helpful - but I hope it's understandable anyway.
(Maybe VSL could host a PDF version, including drawings and formulae?)
Feel free to ask questions.
kind regards
/Bjarne
-
Wow, Sapkiller, that's some kind of explanation! Awesome! Though you lost me after the second sentence.
I'm glad there are brains organized like yours though, because someone has to figure this stuff out. My brain has the mathematical hemisphere completely atrophied and decayed, with only a little of the music hemisphere still working.
William
-
Thanks very much everybody! I'm going to have to read Sapkiller's answer a few times for it all to sink in (you know how Stephen Hawking's book starts out: "I was told by my publisher that the number of sales would be inversely proportional to the number of equations I use!"), but I sort of followed it the first time. These aren't really complicated equations, but the x(t) language takes a while to translate, since I've forgotten all the algebra I was taught.
dbudde's answer certainly helped me understand what I did get of Sapkiller's explanation!