Extracting amplitude/tempo from mp3/sceAudio stream
When using libmad through a sceAudio channel, how do you find data about the chunk currently playing? The sort of data that would be needed for a visualizer, i.e. the current amplitude or tempo of the stream.
I've looked through the FA++ 1.0 source, but couldn't find anything that reads data back from an audio stream, except a commented-out function pspAudioGetFreq(), which is apparently deprecated now.
Any help would be much appreciated.
If you mean equalizer stuff, there are techniques to do that, but they are far from trivial. Google for fast Fourier transform and beat detection / tempo detection.
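As a rough illustration of the non-FFT end of that spectrum, one classic trick is to compare the energy of the current block of samples against a running average and flag a 'beat' when it spikes. A minimal sketch (not from any of the libraries mentioned here; block and history sizes are made up):
Code: Select all
/* Crude beat detection: flag a beat when the energy of the current block
 * of samples clearly exceeds the average energy of the last ~second. */
#define HISTORY_LEN 43   /* roughly one second of 1024-sample blocks at 44.1kHz */

static float energyHistory[HISTORY_LEN];
static int   historyIdx = 0;

/* samples: interleaved signed 16-bit PCM, count: number of shorts */
int DetectBeat(const signed short* samples, int count)
{
    float energy = 0.0f;
    float avg = 0.0f;
    int i;

    for (i = 0; i < count; i++) {
        float s = samples[i] / 32768.0f;
        energy += s * s;
    }
    energy /= (float)count;

    /* average energy of the recent history */
    for (i = 0; i < HISTORY_LEN; i++)
        avg += energyHistory[i];
    avg /= (float)HISTORY_LEN;

    energyHistory[historyIdx] = energy;
    historyIdx = (historyIdx + 1) % HISTORY_LEN;

    /* a block noticeably louder than the recent average counts as a beat */
    return (avg > 0.0f) && (energy > 1.3f * avg);
}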
Edit: I just realised that might not be what you asked. Looks like you need access to the decoded audio data? You can access that in your 'output' callback that you can specify in mad_decoder_init. At least according to the single source file 'documentation' that libmad provides. :)
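For reference, it looks roughly like this if you follow the pattern in libmad's minimad.c (input/error callbacks and all error handling omitted; this is only a sketch):
Code: Select all
#include <mad.h>

/* Called by libmad once per decoded frame; pcm->samples[] holds the
 * synthesized PCM as mad_fixed_t, one array per channel. */
static enum mad_flow output(void *data,
                            struct mad_header const *header,
                            struct mad_pcm *pcm)
{
    unsigned int i;
    for (i = 0; i < pcm->length; i++) {
        mad_fixed_t left  = pcm->samples[0][i];
        mad_fixed_t right = (pcm->channels == 2) ? pcm->samples[1][i] : left;
        /* feed left/right to the visualizer and/or convert for sceAudio here */
        (void)left; (void)right;
    }
    return MAD_FLOW_CONTINUE;
}

/* Registered like in minimad.c:
 * mad_decoder_init(&decoder, &ctx, input_cb, 0, 0, output, error_cb, 0); */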
I'm currently playing the mp3 through a modified version of AhMan's iRShell MP3 player, which doesn't use mad_decoder_init, but I did see this bit of code, which I think should make Sample a PCM stream:
Code: Select all
for(i=0;i<Synth.pcm.length;i++) {
    signed short Sample;

    /* Left channel */
    Sample=MadFixedToSshort(Synth.pcm.samples[0][i]);
    *(OutputPtr++)=Sample&0xff;
    *(OutputPtr++)=(Sample>>8);

    /* Right channel. If the decoded stream is monophonic then
     * the right output channel is the same as the left one.
     */
    if (MAD_NCHANNELS(&Frame.header) == 2)
        Sample=MadFixedToSshort(Synth.pcm.samples[1][i]);
    *(OutputPtr++)=Sample&0xff;
    *(OutputPtr++)=(Sample>>8);

    /* Flush the output buffer if it is full. */
    if (OutputPtr == OutputBufferEnd) {
        int vol = MAXVOLUME*mp3_volume/100;
        sceAudioOutputPannedBlocking(mp3_handle, vol, vol, (char*)OutputBuffer[OutputBuffer_flip]);
        OutputBuffer_flip^=1;
        OutputPtr=OutputBuffer[OutputBuffer_flip];
        OutputBufferEnd=OutputBuffer[OutputBuffer_flip]+OUTPUT_BUFFER_SIZE;
    }
}
By using this code, where getSample(0,1) gets the 1st value from the left audio stream:
Code: Select all
locShort[0] = getSample(0,1);
unsigned char valCh;
valCh = (locShort[0]&0xff);
value = value/NUM_AVERAGING_SAMPLES;
pspDebugScreenSetXY(0,0);
printf("%i",(int)valCh);
I get a positive value between 0 and 255, but it doesn't seem to bear any correlation to the music being played. Have I got the wrong end of the stick, or should I just be manipulating the data in a different way?
You're right, the average of the sample values does not really mean anything (well... it is a lowpass filter of some sort). As suggested earlier, one way to do beat detection is to use a Fourier transform, but there is an easier way too :)
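(As a quick aside: if all you want is a single 'how loud is it right now' number, the RMS of the decoded 16-bit samples already behaves far better than looking at one byte. A minimal sketch, fed with the same buffer you hand to sceAudio:)
Code: Select all
#include <math.h>

/* Root-mean-square amplitude of an interleaved signed 16-bit buffer,
 * returned in the range 0.0 .. 1.0. */
float BufferRms(const signed short* samples, int count)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < count; i++) {
        double s = samples[i] / 32768.0;
        sum += s * s;
    }
    return (count > 0) ? (float)sqrt(sum / count) : 0.0f;
}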
Usually it is enough to just use a set of band-pass filters. That is what I used in my flower demo. Set up the band-pass filters at reasonable frequencies (say from 60Hz to 10kHz). You can use a linear or logarithmic distribution; I think I used logarithmic or something like that :)
I think I used 8 band-pass filters for the flower demo (I don't have the source at hand so I cannot say for sure). I then took the maximum of two neighbouring channels so that in the end I had four values to feed to my effect.
In addition to that I calculated the first derivative of the band-pass values too, which is sometimes more interesting. I also dynamically adjust the min/max of the range, which really helps for the higher frequencies (look at the usual output of Winamp and you'll notice that there is a noticeable peak at the low freqs).
I can check later if I can find the flower sources. I also used libmad for playback.
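If anyone wants to roll their own filter bank, a common building block is a second-order (biquad) band-pass. This sketch uses the standard 'audio EQ cookbook' coefficients; it is not the exact filter from the flower demo:
Code: Select all
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

typedef struct
{
    float b0, b1, b2, a1, a2;   /* normalized coefficients (a0 == 1) */
    float x1, x2, y1, y2;       /* previous inputs/outputs */
} Biquad;

/* Band-pass with 0 dB peak gain at 'center' Hz (RBJ cookbook formulas). */
void BiquadBandpassInit( Biquad* f, float center, float q, float sampleRate )
{
    float w0 = 2.0f * (float)M_PI * center / sampleRate;
    float alpha = sinf( w0 ) / (2.0f * q);
    float a0 = 1.0f + alpha;
    f->b0 =  alpha / a0;
    f->b1 =  0.0f;
    f->b2 = -alpha / a0;
    f->a1 = -2.0f * cosf( w0 ) / a0;
    f->a2 = (1.0f - alpha) / a0;
    f->x1 = f->x2 = f->y1 = f->y2 = 0.0f;
}

float BiquadProcess( Biquad* f, float x0 )
{
    float y0 = f->b0 * x0 + f->b1 * f->x1 + f->b2 * f->x2
             - f->a1 * f->y1 - f->a2 * f->y2;
    f->x2 = f->x1; f->x1 = x0;
    f->y2 = f->y1; f->y1 = y0;
    return y0;
}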
Actually, you get something very close to an FFT inherently while decoding MP3, which is why WinAMP has always had that visualisation, even back when doing just the decoding took 90% CPU. No time for a separate FFT just for viz...
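With libmad specifically, the decoder's 32 polyphase subband samples sit in the frame structure after mad_frame_decode(), so a crude 32-band 'spectrum' can be pulled out without any extra transform. This is only a sketch based on the public struct mad_frame layout (sbsample[channel][sample][subband]); double-check the indexing against your frame.h:
Code: Select all
#include <math.h>
#include <mad.h>

/* Rough 32-band levels straight from libmad's polyphase subbands.
 * 'frame' must already have been run through mad_frame_decode(). */
void SubbandLevels(const struct mad_frame* frame, float levels[32])
{
    int sb, s;
    for (sb = 0; sb < 32; sb++) {
        float peak = 0.0f;
        for (s = 0; s < 36; s++) {
            float v = fabsf((float)frame->sbsample[0][s][sb] / (float)MAD_F_ONE);
            if (v > peak)
                peak = v;
        }
        levels[sb] = peak;
    }
}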
Of course, not every MP3 playback library will expose this.
Obviously we use classic FFT code to go from the time domain to the frequency domain in FA++.
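(For completeness: the time-domain to frequency-domain step can be prototyped with a plain DFT over a short window. This is just an illustration, not the FA++ code; it is O(N*N), so swap in a real FFT for anything serious.)
Code: Select all
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Magnitude spectrum of one window of mono samples via a direct DFT.
 * 'mags' receives n/2 values. Fine for a prototype with small n (e.g. 256). */
void DftMagnitudes(const signed short* samples, int n, float* mags)
{
    int k, t;
    for (k = 0; k < n / 2; k++) {
        float re = 0.0f, im = 0.0f;
        for (t = 0; t < n; t++) {
            float s = samples[t] / 32768.0f;
            float phase = -2.0f * (float)M_PI * k * t / n;
            re += s * cosf(phase);
            im += s * sinf(phase);
        }
        mags[k] = sqrtf(re * re + im * im) / n;
    }
}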
A.T., if you need details, just send me a pm.
Shazz/FA++
The flower demo does exactly the sort of thing I'm looking to do; if you come across the source code, it would be very beneficial to me, and much appreciated.
groepaz wrote: i kinda doubt moppi will release the source for this (maybe in some years... :=P). however, i would assume it's using its own mp3 player internally, which pretty much makes that visualisation stuff trivial.
This is how it works :)
You should call InitAnalyzer() once before feeding in any data. I did it when loading a new mp3. It also works as a reset if you want to do something like that.
First, the analyzer itself:
Code: Select all
#include <math.h>              /* cosf, sqrtf, fabsf, M_PI */

typedef unsigned int uint;     /* in case your toolchain doesn't define it */

#define FILTER_SAMPLING_RATE 44100
#define FILTER_GAIN 1.0
#define FILTER_Q 0.5

typedef struct
{
    float b0, b1, b2, a0, a1, a2;
    float x1, x2, y1, y2;
    float cutoff;
} SAudioFilter;

/* Two-pole resonant filter; x1/x2 hold the two previous outputs. */
void InitFilter( SAudioFilter* filt, float cutoff )
{
    filt->cutoff = cutoff;
    float steep = 0.99f;
    float r = steep * 0.99609375;
    float f = cosf( M_PI * cutoff / FILTER_SAMPLING_RATE );
    filt->a0 = (1 - r) * sqrtf( r * (r - 4 * (f * f) + 2) + 1 );
    filt->b1 = 2 * f * r;
    filt->b2 = -(r * r);
    filt->x1 = 0;
    filt->x2 = 0;
}

float ProcessFilter( SAudioFilter* filt, float x0 )
{
    float outp = filt->a0 * x0 + filt->b1 * filt->x1 + filt->b2 * filt->x2;
    filt->x2 = filt->x1;
    filt->x1 = outp;
    return outp;
}
#define AUDIO_BAND_COUNT 8
#define ANALYZER_BUF_SIZE 2048
#define FILTER_TIME_BLUR 0.2f

signed short g_analyzerBuf[ANALYZER_BUF_SIZE];
uint g_analyzerIdx = 0;

SAudioFilter g_filters[AUDIO_BAND_COUNT];
float g_bandValues[AUDIO_BAND_COUNT];
float g_bandDeltaValues[AUDIO_BAND_COUNT];
float g_bandNormValues[AUDIO_BAND_COUNT];
float g_bandNormDeltaValues[AUDIO_BAND_COUNT];
float g_bandMin[AUDIO_BAND_COUNT];
float g_bandMax[AUDIO_BAND_COUNT];

int AA_GetBandCount()
{
    return AUDIO_BAND_COUNT;
}

float AA_GetBandValue( int iBand )
{
    return g_bandValues[iBand];
}

float AA_GetBandDeltaValue( int iBand )
{
    return g_bandDeltaValues[iBand];
}

float AA_GetBandNormValue( int iBand )
{
    return g_bandNormValues[iBand];
}

float AA_GetBandNormDeltaValue( int iBand )
{
    return g_bandNormDeltaValues[iBand];
}

float AA_GetBandMin( int iBand )
{
    if( g_bandMax[iBand] < g_bandMin[iBand] )
        return g_bandMax[iBand];
    return g_bandMin[iBand];
}

float AA_GetBandMax( int iBand )
{
    if( g_bandMin[iBand] > g_bandMax[iBand] )
        return g_bandMin[iBand];
    return g_bandMax[iBand];
}
void UpdateAnalyzer( signed short* buffer, uint count )
{
    uint c, i;
    float f, val;
    float tempBars[AUDIO_BAND_COUNT];
    float newVal;
    float delta;
    float blur = 0.4f;
    float range;
    float newNormVal;
    float normDelta;

    for( i = 0; i < AUDIO_BAND_COUNT; i++ )
        tempBars[i] = 0.0f;

    // Run every (stereo-averaged) sample through all band filters and keep the peak.
    for( c = 0; c < count; c += 2 )
    {
        val = (((float)buffer[c] + (float)buffer[c + 1]) * 0.5f) / 32768.0f;
        for( i = 0; i < AUDIO_BAND_COUNT; i++ )
        {
            f = fabsf( ProcessFilter( &g_filters[i], val ) );
            if( f > tempBars[i] )
                tempBars[i] = f;
        }
    }

    for( i = 0; i < AUDIO_BAND_COUNT; i++ )
    {
        newVal = (1.0f - FILTER_TIME_BLUR) * g_bandValues[i] + FILTER_TIME_BLUR * tempBars[i];
        if( tempBars[i] > newVal )
            newVal = tempBars[i];
        delta = newVal - g_bandValues[i];

        // Decay min/max towards each other.
        g_bandMax[i] = g_bandMax[i] * 0.9997f;                  // max creeps down (a bit faster)
        g_bandMin[i] = 1.0f - (1.0f - g_bandMin[i]) * 0.9999f;  // min creeps up

        if( delta < 0.0f )
        {
            // Falling: pull min down if the new value went below it.
            if( newVal < g_bandMin[i] )
                g_bandMin[i] = (1.0f - FILTER_TIME_BLUR) * g_bandMin[i] + FILTER_TIME_BLUR * newVal;
        }
        else
        {
            // Rising: push max up if the new value went above it.
            if( newVal > g_bandMax[i] )
                g_bandMax[i] = (1.0f - FILTER_TIME_BLUR) * g_bandMax[i] + FILTER_TIME_BLUR * newVal;
        }

        if( g_bandMax[i] < 0.0f )
            g_bandMax[i] = 0.0f;
        if( g_bandMax[i] > 1.0f )
            g_bandMax[i] = 1.0f;
        if( g_bandMin[i] < 0.0f )
            g_bandMin[i] = 0.0f;
        if( g_bandMin[i] > 1.0f )
            g_bandMin[i] = 1.0f;

        g_bandDeltaValues[i] = (1.0f - blur) * g_bandDeltaValues[i] + blur * fabsf( delta );
        g_bandValues[i] = newVal;

        range = g_bandMax[i] - g_bandMin[i];
        newNormVal = newVal;
        if( range > 0.0001f )
        {
            if( newNormVal < g_bandMin[i] )
                newNormVal = g_bandMin[i];
            else if( newNormVal > g_bandMax[i] )
                newNormVal = g_bandMax[i];
            newNormVal = (newNormVal - g_bandMin[i]) / range;
        }
        if( newNormVal > 1.0f )
            newNormVal = 1.0f;

        normDelta = newNormVal - g_bandNormValues[i];
        g_bandNormDeltaValues[i] = normDelta;
        g_bandNormValues[i] = newNormVal;
    }
}
void InitAnalyzer()
{
    uint i;
    for( i = 0; i < AUDIO_BAND_COUNT; i++ )
    {
        float center = (float)i / (float)AUDIO_BAND_COUNT;
        center *= center;
        float cutoff = 40.0f + center * (17000.0f - 40.0f);
        InitFilter( &g_filters[i], cutoff );
    }
    for( i = 0; i < AUDIO_BAND_COUNT; i++ )
    {
        g_bandValues[i] = 0;
        g_bandDeltaValues[i] = 0;
        g_bandMin[i] = 1.0f;
        g_bandMax[i] = 0.0f;
    }
}
The analyser is updated while converting the data from mad format to the format the PSP eats. (The mp3 playing code itself is shamelessly lifted from someone's mp3 player example/lib; I can't remember any more where it came from, and it does not state who wrote it.) The loop looks like this:
Code: Select all
for (i = 0; i < Synth.pcm.length; i++) {
signed short SampleL;
signed short SampleR;
// Left channel
SampleL = MadFixedToSshort(Synth.pcm.samples[0][i]);
if (MAD_NCHANNELS(&Frame.header) == 2)
SampleR = MadFixedToSshort(Synth.pcm.samples[1][i]);
else
SampleR = SampleL;
// Update the analyzer
g_analyzerBuf[g_analyzerIdx++] = SampleL;
g_analyzerBuf[g_analyzerIdx++] = SampleR;
if( g_analyzerIdx >= ANALYZER_BUF_SIZE )
{
UpdateAnalyzer( g_analyzerBuf, g_analyzerIdx );
g_analyzerIdx = 0;
}
if (samplesOut < numSamples) {
_buf[samplesOut * 2] = SampleL;
_buf[samplesOut * 2 + 1] = SampleR;
samplesOut++;
} else {
OutputBuffer[samplesInOutput * 2] = SampleL;
OutputBuffer[samplesInOutput * 2 + 1] = SampleR;
samplesInOutput++;
}
}
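(Side note: if you want to check the analyzer away from the PSP first, you can drive it with a synthetic tone. A throwaway PC-side test, meant to be appended to the analyzer source above:)
Code: Select all
#include <stdio.h>

int main(void)
{
    signed short buf[ANALYZER_BUF_SIZE];
    int block, i, band;

    InitAnalyzer();
    for (block = 0; block < 200; block++) {
        /* fill the buffer with a 1kHz tone, interleaved stereo */
        for (i = 0; i < ANALYZER_BUF_SIZE; i += 2) {
            float t = (float)(block * ANALYZER_BUF_SIZE / 2 + i / 2) / 44100.0f;
            signed short s = (signed short)(sinf(2.0f * 3.14159265f * 1000.0f * t) * 12000.0f);
            buf[i] = s;       /* left */
            buf[i + 1] = s;   /* right */
        }
        UpdateAnalyzer(buf, ANALYZER_BUF_SIZE);
    }
    for (band = 0; band < AA_GetBandCount(); band++)
        printf("band %d: value %.3f norm %.3f\n",
               band, AA_GetBandValue(band), AA_GetBandNormValue(band));
    return 0;
}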
And then a simple test program:
Code: Select all
void DrawMeters()
{
    sceGuDisable( GU_CULL_FACE );
    sceGuDisable( GU_TEXTURE_2D );

    uint count = AA_GetBandCount();
    sceGuColor(0xffffffff);

    Vert2* verts = (Vert2*)sceGuGetMemory( count * 4 * sizeof( Vert2 ) );
    Vert2* v = verts;

    // Update vertices.
    for( uint i = 0; i < count; i++ )
    {
        v->x = 10.0f + i * 15.0f;
        v->y = 10.0f;
        v->z = 0;
        v->color = 0x800000ff;
        v++;
        v->x = 10.0f + i * 15.0f + 10.0f;
        v->y = 10.0f + 2 + AA_GetBandNormValue( i ) * 100.0f;
        v->z = 0;
        v->color = 0x800000ff;
        v++;
    }
    sceGumDrawArray( GU_SPRITES, GU_TEXTURE_32BITF|GU_COLOR_8888|GU_VERTEX_32BITF|GU_TRANSFORM_3D, count * 2, 0, verts );

    sceGuEnable( GU_CULL_FACE );
    sceGuEnable( GU_TEXTURE_2D );
}
The function AA_GetBandValue() returns the value computed by the band-pass filter. That value usually varies within a certain range depending on the music. The functions AA_GetBandMin() and AA_GetBandMax() return the range the band-pass values vary within. The function AA_GetBandNormValue() returns a value which is normalized within that min/max. All these functions return values between 0.0 and 1.0.
Basically the min and max values work like slow springs: the min value tries to reach the top and the max value tries to move down (slowly), and the moving real values push them around. Simple and stupid, but it works extremely well :) The nice side effect is that if you have a quieter part in the music, the range shrinks, and then when the music gets more intense again you get a nice burst in the visualisation. You can make the springs slower if that is not what you want.
Finally there is AA_GetBandDeltaValue(). It is not used in the flower demo, but it works great for some kind of flashing-lights thing, or if you want to control some physics kind of things. There is no science other than black magic behind that delta thing; it just emerged from one of my visualisation projects once upon a time. Here's an image of it:
http://www.pingstate.nu/omnilayer/yksi/ ... _botti.jpg
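A hypothetical way to hook that delta up to a flashing-light value (not from the demo, just to illustrate the idea):
Code: Select all
float AA_GetBandDeltaValue( int iBand );   /* from the analyzer code above */

/* Call once per frame: jump up with the band delta, then decay smoothly.
 * 'g_flash' can drive a light/sprite alpha in the 0..1 range. */
static float g_flash = 0.0f;

void UpdateFlash( int band )
{
    float d = AA_GetBandDeltaValue( band );
    if( d > g_flash )
        g_flash = d;        /* jump on a sudden rise */
    g_flash *= 0.90f;       /* decay towards zero */
}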
groepaz wrote: i kinda doubt moppi will release the source for this (maybe in some years... :=P). however, i would assume it's using its own mp3 player internally, which pretty much makes that visualisation stuff trivial.
There you've got it. Some cool and friendly dev-helps-dev-and-shares-his-code in your face from memon ;)
@memon: That's really great of you to share this code snippet with us :) It will be very handy when I try to do mp3 playback for PMP Mod :) Thanks a lot and thumbs up.
hehe, i was more relating to the source of the full demo (as the original poster implied) - which would imho be a lot more interesting than something which can be found with a bit of googling anyway :=P
Still, it's exactly the code that was asked for in this thread, and it's really helpful when you have no experience in audio visualisation yet :)
Anyway, the full code of the demo would most likely overstrain a lot of people here ;) (Apart from the fact that there would only be thousands of copy-and-paste 'yeah-I-made-sum-ubercool-demo' kiddys, so it really is better to keep it closed source.)
hehe :) i might actually be happy with some non-working pseudocode - just to verify if my guesses on the technique used were correct :=P
groepaz, this is getting a bit off topic, but here's the flower thing in a nutshell:
The main stem:
1) pick 2 random numbers; the first is the length of a curve and the second is the curvature
2) linearly interpolate along that arc
3) if a certain amount of distance has been passed, roll the dice again and, based on the random number, choose a branch or one of the petals/flowers
4) loop :)
The branches are handled the same way as the main stem, except that a branch is only one arc, whereas the main stem always regenerates itself when it has finished one arc. Each "particle" has a lifetime where it first grows, then lives, and then dies. The deeper in the tree, the shorter the lifetime.
The real meat of the effect is how it is synced to the music, imho. Basically what I do is vary the probability of certain items (especially the branches) based on the music, and the speed of growth (that linear interpolation) is also based on the music.
There is also some great philosophical thinking behind the whole thing, but I will not bother with that today ;)
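Here is a rough reconstruction of that stem logic in code, just to make the steps concrete. This is my own sketch from the description above, not Moppi's source; 'musicEnergy' stands in for one of the band values from the analyzer earlier in the thread:
Code: Select all
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch of the stem logic: pick a random arc (length + curvature), walk
 * along it, and every so often roll a dice -- weighted by the music -- to
 * decide whether to spawn a branch or a petal/flower. */
typedef struct
{
    float x, y;        /* current tip position */
    float heading;     /* current direction (radians) */
    float arcLen;      /* length of the current arc */
    float curvature;   /* radians per unit walked */
    float walked;      /* distance walked on the current arc */
    float sinceSpawn;  /* distance since the last spawn roll */
} Stem;

static float frand(void) { return (float)rand() / (float)RAND_MAX; }

static void NewArc( Stem* s )
{
    s->arcLen    = 20.0f + frand() * 60.0f;    /* 1) random length... */
    s->curvature = (frand() - 0.5f) * 0.05f;   /*    ...and random curvature */
    s->walked    = 0.0f;
}

/* Advance the stem by 'step' units; growth speed and spawn probability
 * both scale with musicEnergy (0..1). */
static void GrowStem( Stem* s, float step, float musicEnergy )
{
    float move = step * (0.5f + musicEnergy);   /* music drives growth speed */

    s->x += cosf( s->heading ) * move;          /* 2) interpolate along the arc */
    s->y += sinf( s->heading ) * move;
    s->heading += s->curvature * move;
    s->walked += move;
    s->sinceSpawn += move;

    if( s->sinceSpawn > 15.0f )                 /* 3) enough distance: roll the dice */
    {
        s->sinceSpawn = 0.0f;
        if( frand() < 0.2f + 0.6f * musicEnergy )   /* music drives the probability */
            printf( "spawn branch at %.1f,%.1f\n", s->x, s->y );
        else
            printf( "spawn petal/flower at %.1f,%.1f\n", s->x, s->y );
    }

    if( s->walked >= s->arcLen )                /* 4) main stem: loop with a new arc */
        NewArc( s );
}

int main(void)
{
    Stem s = { 0 };   /* start at the origin, heading along +x */
    int i;
    NewArc( &s );
    for( i = 0; i < 200; i++ )
        GrowStem( &s, 1.0f, 0.5f );   /* constant 'music energy' just for this demo run */
    return 0;
}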