Extracting amplitude/tempo from mp3/sceAudio stream
When using libmad through a sceAudio channel, how do you find data about the chunk currently playing? The sort of data that would be needed for a visualizer, i.e. the current amplitude or tempo of the stream.
I've looked through the FA++ 1.0 source, but couldn't find anything that reads data back from an audio stream, except a commented-out function pspAudioGetFreq(), which is apparently deprecated now.
Any help would be much appreciated.
If you mean equalizer stuff, there are techniques to do that, but they are far from trivial. Google for fast Fourier transform and beat detection / tempo detection.
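As a rough illustration of the non-FFT end of that spectrum, one classic trick is to compare the energy of the current block of samples against a running average and flag a 'beat' when it spikes. A minimal sketch (not from any of the libraries mentioned here; block and history sizes are made up):
Code: Select all
/* Crude beat detection: flag a beat when the energy of the current block
 * of samples clearly exceeds the average energy of the last ~second. */
#define HISTORY_LEN 43   /* roughly one second of 1024-sample blocks at 44.1kHz */

static float energyHistory[HISTORY_LEN];
static int   historyIdx = 0;

/* samples: interleaved signed 16-bit PCM, count: number of shorts */
int DetectBeat(const signed short* samples, int count)
{
    float energy = 0.0f;
    float avg = 0.0f;
    int i;

    for (i = 0; i < count; i++) {
        float s = samples[i] / 32768.0f;
        energy += s * s;
    }
    energy /= (float)count;

    /* average energy of the recent history */
    for (i = 0; i < HISTORY_LEN; i++)
        avg += energyHistory[i];
    avg /= (float)HISTORY_LEN;

    energyHistory[historyIdx] = energy;
    historyIdx = (historyIdx + 1) % HISTORY_LEN;

    /* a block noticeably louder than the recent average counts as a beat */
    return (avg > 0.0f) && (energy > 1.3f * avg);
}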
Edit: I just realised that might not be what you asked. Looks like you need access to the decoded audio data? You can access that in your 'output' callback that you can specify in mad_decoder_init. At least according to the single source file 'documentation' that libmad provides. :)
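For reference, it looks roughly like this if you follow the pattern in libmad's minimad.c (input/error callbacks and all error handling omitted; this is only a sketch):
Code: Select all
#include <mad.h>

/* Called by libmad once per decoded frame; pcm->samples[] holds the
 * synthesized PCM as mad_fixed_t, one array per channel. */
static enum mad_flow output(void *data,
                            struct mad_header const *header,
                            struct mad_pcm *pcm)
{
    unsigned int i;
    for (i = 0; i < pcm->length; i++) {
        mad_fixed_t left  = pcm->samples[0][i];
        mad_fixed_t right = (pcm->channels == 2) ? pcm->samples[1][i] : left;
        /* feed left/right to the visualizer and/or convert for sceAudio here */
        (void)left; (void)right;
    }
    return MAD_FLOW_CONTINUE;
}

/* Registered like in minimad.c:
 * mad_decoder_init(&decoder, &ctx, input_cb, 0, 0, output, error_cb, 0); */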
I'm currently playing the mp3 through a modified version of AhMan's iRShell MP3 player, which doesn't use mad_decoder_init, but I did see this bit of code, which I think should make Sample a PCM stream:
Code: Select all
for(i=0;i<Synth.pcm.length;i++) {
    signed short Sample;

    /* Left channel */
    Sample=MadFixedToSshort(Synth.pcm.samples[0][i]);
    *(OutputPtr++)=Sample&0xff;
    *(OutputPtr++)=(Sample>>8);

    /* Right channel. If the decoded stream is monophonic then
     * the right output channel is the same as the left one.
     */
    if (MAD_NCHANNELS(&Frame.header) == 2)
        Sample=MadFixedToSshort(Synth.pcm.samples[1][i]);
    *(OutputPtr++)=Sample&0xff;
    *(OutputPtr++)=(Sample>>8);

    /* Flush the output buffer if it is full. */
    if (OutputPtr == OutputBufferEnd) {
        int vol = MAXVOLUME*mp3_volume/100;
        sceAudioOutputPannedBlocking(mp3_handle, vol, vol, (char*)OutputBuffer[OutputBuffer_flip]);
        OutputBuffer_flip^=1;
        OutputPtr=OutputBuffer[OutputBuffer_flip];
        OutputBufferEnd=OutputBuffer[OutputBuffer_flip]+OUTPUT_BUFFER_SIZE;
    }
}
By using this code, where getSample(0,1) gets the 1st value from the left audio stream:
Code: Select all
locShort[0] = getSample(0,1);
unsigned char valCh;
valCh = (locShort[0]&0xff);
value = value/NUM_AVERAGING_SAMPLES;
pspDebugScreenSetXY(0,0);
printf("%i",(int)valCh);
I get a positive value between 0 and 255, but it doesn't seem to bear any correlation to the music being played. Have I got the wrong end of the stick, or should I just be manipulating the data in a different way?
You're right, the average of the sample values does not really mean anything (well... it is a lowpass filter of some sort). As suggested earlier, one way to do beat detection is to use a Fourier transform, but there is an easier way too :)
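(As a quick aside: if all you want is a single 'how loud is it right now' number, the RMS of the decoded 16-bit samples already behaves far better than looking at one byte. A minimal sketch, fed with the same buffer you hand to sceAudio:)
Code: Select all
#include <math.h>

/* Root-mean-square amplitude of an interleaved signed 16-bit buffer,
 * returned in the range 0.0 .. 1.0. */
float BufferRms(const signed short* samples, int count)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < count; i++) {
        double s = samples[i] / 32768.0;
        sum += s * s;
    }
    return (count > 0) ? (float)sqrt(sum / count) : 0.0f;
}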
Usually it is enough to just use a set of band-pass filters. That is what I used in my flower demo. Set up the band-pass filters at reasonable frequencies (say from 60Hz to 10kHz). You can use a linear or logarithmic distribution; I think I used logarithmic or something like that :)
I think I used 8 band-pass filters for the flower demo (I don't have the source at hand so I cannot say for sure). I then took the maximum of two neighbouring channels so that in the end I had four values to feed to my effect.
In addition to that I calculated the first derivative of the band-pass values too, which is sometimes more interesting. I also dynamically adjust the min/max of the range, which really helps for the higher frequencies (look at the usual output of Winamp and you'll notice that there is a noticeable peak at the low freqs).
I can check later if I can find the flower sources. I also used libmad for playback.
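If anyone wants to roll their own filter bank, a common building block is a second-order (biquad) band-pass. This sketch uses the standard 'audio EQ cookbook' coefficients; it is not the exact filter from the flower demo:
Code: Select all
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

typedef struct
{
    float b0, b1, b2, a1, a2;   /* normalized coefficients (a0 == 1) */
    float x1, x2, y1, y2;       /* previous inputs/outputs */
} Biquad;

/* Band-pass with 0 dB peak gain at 'center' Hz (RBJ cookbook formulas). */
void BiquadBandpassInit( Biquad* f, float center, float q, float sampleRate )
{
    float w0 = 2.0f * (float)M_PI * center / sampleRate;
    float alpha = sinf( w0 ) / (2.0f * q);
    float a0 = 1.0f + alpha;
    f->b0 =  alpha / a0;
    f->b1 =  0.0f;
    f->b2 = -alpha / a0;
    f->a1 = -2.0f * cosf( w0 ) / a0;
    f->a2 = (1.0f - alpha) / a0;
    f->x1 = f->x2 = f->y1 = f->y2 = 0.0f;
}

float BiquadProcess( Biquad* f, float x0 )
{
    float y0 = f->b0 * x0 + f->b1 * f->x1 + f->b2 * f->x2
             - f->a1 * f->y1 - f->a2 * f->y2;
    f->x2 = f->x1; f->x1 = x0;
    f->y2 = f->y1; f->y1 = y0;
    return y0;
}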
Actually, you get something very close to an FFT inherently while decoding MP3, which is why WinAMP has always had that visualisation, even back when doing just the decoding took 90% CPU. No time for a separate FFT just for viz...
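With libmad specifically, the decoder's 32 polyphase subband samples sit in the frame structure after mad_frame_decode(), so a crude 32-band 'spectrum' can be pulled out without any extra transform. This is only a sketch based on the public struct mad_frame layout (sbsample[channel][sample][subband]); double-check the indexing against your frame.h:
Code: Select all
#include <math.h>
#include <mad.h>

/* Rough 32-band levels straight from libmad's polyphase subbands.
 * 'frame' must already have been run through mad_frame_decode(). */
void SubbandLevels(const struct mad_frame* frame, float levels[32])
{
    int sb, s;
    for (sb = 0; sb < 32; sb++) {
        float peak = 0.0f;
        for (s = 0; s < 36; s++) {
            float v = fabsf((float)frame->sbsample[0][s][sb] / (float)MAD_F_ONE);
            if (v > peak)
                peak = v;
        }
        levels[sb] = peak;
    }
}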
Of course, not every MP3 playback library will expose this.
Obviously we use classic FFT code to go from the time domain to the frequency domain in FA++.
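(For completeness: the time-domain to frequency-domain step can be prototyped with a plain DFT over a short window. This is just an illustration, not the FA++ code; it is O(N*N), so swap in a real FFT for anything serious.)
Code: Select all
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Magnitude spectrum of one window of mono samples via a direct DFT.
 * 'mags' receives n/2 values. Fine for a prototype with small n (e.g. 256). */
void DftMagnitudes(const signed short* samples, int n, float* mags)
{
    int k, t;
    for (k = 0; k < n / 2; k++) {
        float re = 0.0f, im = 0.0f;
        for (t = 0; t < n; t++) {
            float s = samples[t] / 32768.0f;
            float phase = -2.0f * (float)M_PI * k * t / n;
            re += s * cosf(phase);
            im += s * sinf(phase);
        }
        mags[k] = sqrtf(re * re + im * im) / n;
    }
}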
A.T., if you need details, just send me a pm.
Shazz/FA++
The flower demo does exactly the sort of thing I'm looking to do; if you come across the source code, it would be very beneficial to me, and much appreciated.
groepaz wrote: i kinda doubt moppi will release the source for this (maybe in some years... :=P). however, i would assume it's using its own mp3 player internally, which pretty much makes that visualisation stuff trivial.
This is how it works :)
You should call InitAnalyzer() once before feeding in any data. I did it when loading a new mp3. It also works as a reset if you want to do something like that.
First, the analyzer itself:
Code: Select all
#include <math.h>              /* cosf, sqrtf, fabsf, M_PI */

typedef unsigned int uint;     /* in case your toolchain doesn't define it */

#define FILTER_SAMPLING_RATE 44100
#define FILTER_GAIN 1.0
#define FILTER_Q 0.5

typedef struct
{
    float b0, b1, b2, a0, a1, a2;
    float x1, x2, y1, y2;
    float cutoff;
} SAudioFilter;

/* Two-pole resonant filter; x1/x2 hold the two previous outputs. */
void InitFilter( SAudioFilter* filt, float cutoff )
{
    filt->cutoff = cutoff;
    float steep = 0.99f;
    float r = steep * 0.99609375;
    float f = cosf( M_PI * cutoff / FILTER_SAMPLING_RATE );
    filt->a0 = (1 - r) * sqrtf( r * (r - 4 * (f * f) + 2) + 1 );
    filt->b1 = 2 * f * r;
    filt->b2 = -(r * r);
    filt->x1 = 0;
    filt->x2 = 0;
}

float ProcessFilter( SAudioFilter* filt, float x0 )
{
    float outp = filt->a0 * x0 + filt->b1 * filt->x1 + filt->b2 * filt->x2;
    filt->x2 = filt->x1;
    filt->x1 = outp;
    return outp;
}
#define AUDIO_BAND_COUNT 8
#define ANALYZER_BUF_SIZE 2048
#define FILTER_TIME_BLUR 0.2f

signed short g_analyzerBuf[ANALYZER_BUF_SIZE];
uint g_analyzerIdx = 0;

SAudioFilter g_filters[AUDIO_BAND_COUNT];
float g_bandValues[AUDIO_BAND_COUNT];
float g_bandDeltaValues[AUDIO_BAND_COUNT];
float g_bandNormValues[AUDIO_BAND_COUNT];
float g_bandNormDeltaValues[AUDIO_BAND_COUNT];
float g_bandMin[AUDIO_BAND_COUNT];
float g_bandMax[AUDIO_BAND_COUNT];

int AA_GetBandCount()
{
    return AUDIO_BAND_COUNT;
}

float AA_GetBandValue( int iBand )
{
    return g_bandValues[iBand];
}

float AA_GetBandDeltaValue( int iBand )
{
    return g_bandDeltaValues[iBand];
}

float AA_GetBandNormValue( int iBand )
{
    return g_bandNormValues[iBand];
}

float AA_GetBandNormDeltaValue( int iBand )
{
    return g_bandNormDeltaValues[iBand];
}

float AA_GetBandMin( int iBand )
{
    if( g_bandMax[iBand] < g_bandMin[iBand] )
        return g_bandMax[iBand];
    return g_bandMin[iBand];
}

float AA_GetBandMax( int iBand )
{
    if( g_bandMin[iBand] > g_bandMax[iBand] )
        return g_bandMin[iBand];
    return g_bandMax[iBand];
}
void UpdateAnalyzer( signed short* buffer, uint count )
{
    uint c, i;
    float f, val;
    float tempBars[AUDIO_BAND_COUNT];
    float newVal;
    float delta;
    float blur = 0.4f;
    float range;
    float newNormVal;
    float normDelta;

    for( i = 0; i < AUDIO_BAND_COUNT; i++ )
        tempBars[i] = 0.0f;

    // Run every (stereo-averaged) sample through all band filters and keep the peak.
    for( c = 0; c < count; c += 2 )
    {
        val = (((float)buffer[c] + (float)buffer[c + 1]) * 0.5f) / 32768.0f;
        for( i = 0; i < AUDIO_BAND_COUNT; i++ )
        {
            f = fabsf( ProcessFilter( &g_filters[i], val ) );
            if( f > tempBars[i] )
                tempBars[i] = f;
        }
    }

    for( i = 0; i < AUDIO_BAND_COUNT; i++ )
    {
        newVal = (1.0f - FILTER_TIME_BLUR) * g_bandValues[i] + FILTER_TIME_BLUR * tempBars[i];
        if( tempBars[i] > newVal )
            newVal = tempBars[i];
        delta = newVal - g_bandValues[i];

        // Decay min/max towards each other.
        g_bandMax[i] = g_bandMax[i] * 0.9997f;                  // max creeps down (a bit faster)
        g_bandMin[i] = 1.0f - (1.0f - g_bandMin[i]) * 0.9999f;  // min creeps up

        if( delta < 0.0f )
        {
            // Falling: pull min down if the new value went below it.
            if( newVal < g_bandMin[i] )
                g_bandMin[i] = (1.0f - FILTER_TIME_BLUR) * g_bandMin[i] + FILTER_TIME_BLUR * newVal;
        }
        else
        {
            // Rising: push max up if the new value went above it.
            if( newVal > g_bandMax[i] )
                g_bandMax[i] = (1.0f - FILTER_TIME_BLUR) * g_bandMax[i] + FILTER_TIME_BLUR * newVal;
        }

        if( g_bandMax[i] < 0.0f )
            g_bandMax[i] = 0.0f;
        if( g_bandMax[i] > 1.0f )
            g_bandMax[i] = 1.0f;
        if( g_bandMin[i] < 0.0f )
            g_bandMin[i] = 0.0f;
        if( g_bandMin[i] > 1.0f )
            g_bandMin[i] = 1.0f;

        g_bandDeltaValues[i] = (1.0f - blur) * g_bandDeltaValues[i] + blur * fabsf( delta );
        g_bandValues[i] = newVal;

        range = g_bandMax[i] - g_bandMin[i];
        newNormVal = newVal;
        if( range > 0.0001f )
        {
            if( newNormVal < g_bandMin[i] )
                newNormVal = g_bandMin[i];
            else if( newNormVal > g_bandMax[i] )
                newNormVal = g_bandMax[i];
            newNormVal = (newNormVal - g_bandMin[i]) / range;
        }
        if( newNormVal > 1.0f )
            newNormVal = 1.0f;

        normDelta = newNormVal - g_bandNormValues[i];
        g_bandNormDeltaValues[i] = normDelta;
        g_bandNormValues[i] = newNormVal;
    }
}
void InitAnalyzer()
{
    uint i;
    for( i = 0; i < AUDIO_BAND_COUNT; i++ )
    {
        float center = (float)i / (float)AUDIO_BAND_COUNT;
        center *= center;
        float cutoff = 40.0f + center * (17000.0f - 40.0f);
        InitFilter( &g_filters[i], cutoff );
    }
    for( i = 0; i < AUDIO_BAND_COUNT; i++ )
    {
        g_bandValues[i] = 0;
        g_bandDeltaValues[i] = 0;
        g_bandMin[i] = 1.0f;
        g_bandMax[i] = 0.0f;
    }
}
The analyser is updated while converting the data from mad format to the format the PSP eats. (The mp3 playing code itself is shamelessly lifted from someone's mp3 player example/lib; I can't remember any more where it came from, and it does not state who wrote it.) The loop looks like this:
Code: Select all
for (i = 0; i < Synth.pcm.length; i++) {
signed short SampleL;
signed short SampleR;
// Left channel
SampleL = MadFixedToSshort(Synth.pcm.samples[0][i]);
if (MAD_NCHANNELS(&Frame.header) == 2)
SampleR = MadFixedToSshort(Synth.pcm.samples[1][i]);
else
SampleR = SampleL;
// Update the analyzer
g_analyzerBuf[g_analyzerIdx++] = SampleL;
g_analyzerBuf[g_analyzerIdx++] = SampleR;
if( g_analyzerIdx >= ANALYZER_BUF_SIZE )
{
UpdateAnalyzer( g_analyzerBuf, g_analyzerIdx );
g_analyzerIdx = 0;
}
if (samplesOut < numSamples) {
_buf[samplesOut * 2] = SampleL;
_buf[samplesOut * 2 + 1] = SampleR;
samplesOut++;
} else {
OutputBuffer[samplesInOutput * 2] = SampleL;
OutputBuffer[samplesInOutput * 2 + 1] = SampleR;
samplesInOutput++;
}
}
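(Side note: if you want to check the analyzer away from the PSP first, you can drive it with a synthetic tone. A throwaway PC-side test, meant to be appended to the analyzer source above:)
Code: Select all
#include <stdio.h>

int main(void)
{
    signed short buf[ANALYZER_BUF_SIZE];
    int block, i, band;

    InitAnalyzer();
    for (block = 0; block < 200; block++) {
        /* fill the buffer with a 1kHz tone, interleaved stereo */
        for (i = 0; i < ANALYZER_BUF_SIZE; i += 2) {
            float t = (float)(block * ANALYZER_BUF_SIZE / 2 + i / 2) / 44100.0f;
            signed short s = (signed short)(sinf(2.0f * 3.14159265f * 1000.0f * t) * 12000.0f);
            buf[i] = s;       /* left */
            buf[i + 1] = s;   /* right */
        }
        UpdateAnalyzer(buf, ANALYZER_BUF_SIZE);
    }
    for (band = 0; band < AA_GetBandCount(); band++)
        printf("band %d: value %.3f norm %.3f\n",
               band, AA_GetBandValue(band), AA_GetBandNormValue(band));
    return 0;
}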
And then a simple test program:
Code: Select all
void DrawMeters()
{
    sceGuDisable( GU_CULL_FACE );
    sceGuDisable( GU_TEXTURE_2D );

    uint count = AA_GetBandCount();
    sceGuColor(0xffffffff);

    Vert2* verts = (Vert2*)sceGuGetMemory( count * 4 * sizeof( Vert2 ) );
    Vert2* v = verts;

    // Update vertices.
    for( uint i = 0; i < count; i++ )
    {
        v->x = 10.0f + i * 15.0f;
        v->y = 10.0f;
        v->z = 0;
        v->color = 0x800000ff;
        v++;
        v->x = 10.0f + i * 15.0f + 10.0f;
        v->y = 10.0f + 2 + AA_GetBandNormValue( i ) * 100.0f;
        v->z = 0;
        v->color = 0x800000ff;
        v++;
    }
    sceGumDrawArray( GU_SPRITES, GU_TEXTURE_32BITF|GU_COLOR_8888|GU_VERTEX_32BITF|GU_TRANSFORM_3D, count * 2, 0, verts );

    sceGuEnable( GU_CULL_FACE );
    sceGuEnable( GU_TEXTURE_2D );
}
The function AA_GetBandValue() returns the value computed by the band-pass filter. That value usually varies within a certain range depending on the music. The functions AA_GetBandMin() and AA_GetBandMax() return the range the band-pass values vary within. The function AA_GetBandNormValue() returns a value which is normalized within that min/max. All these functions return values between 0.0 and 1.0.
Basically the min and max values work like slow springs: the min value tries to reach the top and the max value tries to move down (slowly), and the moving real values push them around. Simple and stupid, but it works extremely well :) The nice side effect is that if you have a quieter part in the music, the range shrinks, and then when the music gets more intense again you get a nice burst in the visualisation. You can make the springs slower if that is not what you want.
Finally there is AA_GetBandDeltaValue(). It is not used in the flower demo, but it works great for some kind of flashing-lights thing, or if you want to control some physics kind of things. There is no science other than black magic behind that delta thing; it just emerged from one of my visualisation projects once upon a time. Here's an image of it:
http://www.pingstate.nu/omnilayer/yksi/ ... _botti.jpg
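A hypothetical way to hook that delta up to a flashing-light value (not from the demo, just to illustrate the idea):
Code: Select all
float AA_GetBandDeltaValue( int iBand );   /* from the analyzer code above */

/* Call once per frame: jump up with the band delta, then decay smoothly.
 * 'g_flash' can drive a light/sprite alpha in the 0..1 range. */
static float g_flash = 0.0f;

void UpdateFlash( int band )
{
    float d = AA_GetBandDeltaValue( band );
    if( d > g_flash )
        g_flash = d;        /* jump on a sudden rise */
    g_flash *= 0.90f;       /* decay towards zero */
}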
groepaz wrote: i kinda doubt moppi will release the source for this (maybe in some years... :=P). however, i would assume it's using its own mp3 player internally, which pretty much makes that visualisation stuff trivial.
There you've got it. Some cool and friendly dev-helps-dev-and-shares-his-code in your face from memon ;)
@memon: That's really great of you to share this code snippet with us :) It will be very handy when I try to do mp3 playback for PMP Mod :) Thanks a lot and thumbs up.
hehe, i was more relating to the source of the full demo (as the original poster implied) - which would imho be a lot more interesting than something which can be found with a bit of googling anyway :=P
Still, it's exactly the code that was asked for in this thread, and it's really helpful when you have no experience in audio visualisation yet :)
Anyway, the full code of the demo would most likely overstrain a lot of people here ;) (Apart from the fact that there would only be thousands of copy-and-paste 'yeah-I-made-sum-ubercool-demo' kiddys, so it really is better to keep it closed source.)
hehe :) i might actually be happy with some non-working pseudocode - just to verify if my guesses on the technique used were correct :=P
groepaz, this is getting a bit off topic, but here's the flower thing in a nutshell:
The main stem:
1) pick 2 random numbers; the first is the length of a curve and the second is the curvature
2) linearly interpolate along that arc
3) if a certain amount of distance has been passed, roll the dice again and, based on the random number, choose a branch or one of the petals/flowers
4) loop :)
The branches are handled the same way as the main stem, except that a branch is only one arc, whereas the main stem always regenerates itself when it has finished one arc. Each "particle" has a lifetime where it first grows, then lives, and then dies. The deeper in the tree, the shorter the lifetime.
The real meat of the effect is how it is synced to the music, imho. Basically what I do is vary the probability of certain items (especially the branches) based on the music, and the speed of growth (that linear interpolation) is also based on the music.
There is also some great philosophical thinking behind the whole thing, but I will not bother with that today ;)
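Here is a rough reconstruction of that stem logic in code, just to make the steps concrete. This is my own sketch from the description above, not Moppi's source; 'musicEnergy' stands in for one of the band values from the analyzer earlier in the thread:
Code: Select all
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch of the stem logic: pick a random arc (length + curvature), walk
 * along it, and every so often roll a dice -- weighted by the music -- to
 * decide whether to spawn a branch or a petal/flower. */
typedef struct
{
    float x, y;        /* current tip position */
    float heading;     /* current direction (radians) */
    float arcLen;      /* length of the current arc */
    float curvature;   /* radians per unit walked */
    float walked;      /* distance walked on the current arc */
    float sinceSpawn;  /* distance since the last spawn roll */
} Stem;

static float frand(void) { return (float)rand() / (float)RAND_MAX; }

static void NewArc( Stem* s )
{
    s->arcLen    = 20.0f + frand() * 60.0f;    /* 1) random length... */
    s->curvature = (frand() - 0.5f) * 0.05f;   /*    ...and random curvature */
    s->walked    = 0.0f;
}

/* Advance the stem by 'step' units; growth speed and spawn probability
 * both scale with musicEnergy (0..1). */
static void GrowStem( Stem* s, float step, float musicEnergy )
{
    float move = step * (0.5f + musicEnergy);   /* music drives growth speed */

    s->x += cosf( s->heading ) * move;          /* 2) interpolate along the arc */
    s->y += sinf( s->heading ) * move;
    s->heading += s->curvature * move;
    s->walked += move;
    s->sinceSpawn += move;

    if( s->sinceSpawn > 15.0f )                 /* 3) enough distance: roll the dice */
    {
        s->sinceSpawn = 0.0f;
        if( frand() < 0.2f + 0.6f * musicEnergy )   /* music drives the probability */
            printf( "spawn branch at %.1f,%.1f\n", s->x, s->y );
        else
            printf( "spawn petal/flower at %.1f,%.1f\n", s->x, s->y );
    }

    if( s->walked >= s->arcLen )                /* 4) main stem: loop with a new arc */
        NewArc( s );
}

int main(void)
{
    Stem s = { 0 };   /* start at the origin, heading along +x */
    int i;
    NewArc( &s );
    for( i = 0; i < 200; i++ )
        GrowStem( &s, 1.0f, 0.5f );   /* constant 'music energy' just for this demo run */
    return 0;
}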