Swizzling DXT1


Post by ACh »

I wasn't able to find any mention in the forum of getting this working; apologies if this is already known.

It looks like it is in fact possible to use DXT1 compressed textures in swizzled mode. When reading from swizzled DXT1, the GE reads 32 bytes of data per swizzle block (presumably because 4bpp in DXT1 expands to 16bpp uncompressed, so 32 compressed bytes decode to the usual 128-byte swizzle block). The 4x4 pixel blocks are rearranged in the same manner as for uncompressed 16-bit textures, in units of 8 pixels across by 8 down; so if you have a 32x8 texture (00, 01 etc. represent 4x4 pixel blocks, each 8 bytes):

00 01 02 03 04 05 06 07
10 11 12 13 14 15 16 17
the data would be swizzled as:

00 01 10 11 02 03 12 13 04 05 14 15 06 07 16 17
The trick, however, is that while the GE only reads 32 bytes, it advances its internal pointer by 128 bytes (just as for uncompressed reads), so you have to leave 96 bytes of space after each 32-byte swizzled block, and you lose all the space you gained from compression.
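
As a rough illustration of the layout just described, here is a minimal sketch that computes where each 8-byte DXT1 block ends up in the padded swizzled buffer. The function names and the `swz_` variables are made up for this example; the 128-byte stride and the 32 bytes of payload come straight from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of the 4x4 DXT1 block at block
 * coordinates (bx, by) in a padded swizzled buffer.  Each 8x8-pixel
 * swizzle block holds 2x2 DXT1 blocks (32 bytes) and is followed by
 * 96 unused bytes, giving a 128-byte stride. */
size_t dxt1_swizzled_offset(int width, int bx, int by)
{
    assert(width > 0 && width % 8 == 0);
    assert(bx >= 0 && bx < width / 4 && by >= 0);
    const size_t swz_per_row = (size_t)width / 8;  /* swizzle blocks per row */
    const size_t swz_index = (size_t)(by / 2) * swz_per_row + (size_t)(bx / 2);
    return swz_index * 128          /* start of the swizzle block   */
         + (size_t)(by % 2) * 16    /* upper or lower block row     */
         + (size_t)(bx % 2) * 8;    /* left or right 4x4 block      */
}

/* Hypothetical helper: total bytes needed for the padded layout. */
size_t dxt1_swizzled_size(int width, int height)
{
    assert(width > 0 && width % 8 == 0 && height > 0);
    return (size_t)(width / 8) * (size_t)((height + 7) / 8) * 128;
}
```

For the 32x8 example, `dxt1_swizzled_size(32, 8)` comes out at 512 bytes, the same as an uncompressed 16-bit 32x8 texture, against 128 bytes for the unswizzled DXT1 data.
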

If you're clever, you can interleave two textures within the same memory block (only two, since the GE requires 64-byte texture alignment):

00 01 02 03 04 05 06 07 | AA AB AC AD AE AF AG AH
10 11 12 13 14 15 16 17 | BA BB BC BD BE BF BG BH
20 21 22 23 24 25 26 27 | CA CB CC CD CE CF CG CH
30 31 32 33 34 35 36 37 | DA DB DC DD DE DF DG DH
--- becomes ---
00 01 10 11 .. .. .. .. AA AB BA BB .. .. .. .. 02 03 12 13 .. .. .. .. AC AD BC BD .. .. .. ..
04 05 14 15 .. .. .. .. AE AF BE BF .. .. .. .. 06 07 16 17 .. .. .. .. AG AH BG BH .. .. .. ..
20 21 30 31 .. .. .. .. CA CB DA DB .. .. .. .. 22 23 32 33 .. .. .. .. CC CD DC DD .. .. .. ..
24 25 34 35 .. .. .. .. CE CF DE DF .. .. .. .. 26 27 36 37 .. .. .. .. CG CH DG DH .. .. .. ..
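
The two-texture interleave in the diagram above could be sketched roughly as follows. `swizzle_dxt1_into_slot` and its `slot` parameter are hypothetical names for this example; the sketch assumes both textures have the same dimensions, that width and height are multiples of 8, and that the source data is already in the word order the PSP expects (no word swapping is done here).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy a linear DXT1 texture into slot 0 or slot 1
 * of a shared swizzled buffer.  Slot 0 uses bytes 0..31 of every
 * 128-byte swizzle block and slot 1 uses bytes 64..95, matching the
 * diagram above.  The remaining bytes are left untouched. */
void swizzle_dxt1_into_slot(const uint8_t *src, uint8_t *dest,
                            int width, int height, int slot)
{
    assert(width > 0 && width % 8 == 0);
    assert(height > 0 && height % 8 == 0);
    assert(slot == 0 || slot == 1);

    const int swz_x_count = width / 8;
    const int swz_y_count = height / 8;
    const size_t src_pitch = (size_t)width * 2;  /* bytes per row of 4x4 blocks */

    for (int sy = 0; sy < swz_y_count; sy++) {
        for (int sx = 0; sx < swz_x_count; sx++) {
            /* Two adjacent DXT1 blocks from each of two block rows */
            const uint8_t *s0 = src + (size_t)(2 * sy) * src_pitch
                                    + (size_t)sx * 16;
            const uint8_t *s1 = s0 + src_pitch;
            uint8_t *d = dest + ((size_t)sy * swz_x_count + sx) * 128
                              + (size_t)slot * 64;
            memcpy(d, s0, 16);
            memcpy(d + 16, s1, 16);
        }
    }
}
```

Calling it once with `slot` 0 and once with `slot` 1 on the same destination fills both halves; the second texture would then presumably be bound at `dest + 64`.
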
Or, if the texture is no more than 32 pixels high, you can even interleave its own rows in the empty space, and set the texture stride to 4 pixels:

00 01 02 03 04 05 06 07
10 11 12 13 14 15 16 17
20 21 22 23 24 25 26 27
30 31 32 33 34 35 36 37
40 41 42 43 44 45 46 47
50 51 52 53 54 55 56 57
60 61 62 63 64 65 66 67
70 71 72 73 74 75 76 77
--- becomes ---
00 01 10 11 20 21 30 31 40 41 50 51 60 61 70 71
02 03 12 13 22 23 32 33 42 43 52 53 62 63 72 73
04 05 14 15 24 25 34 35 44 45 54 55 64 65 74 75
06 07 16 17 26 27 36 37 46 47 56 57 66 67 76 77
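
The row-interleaved layout above can be expressed as an offset calculation. This is only a sketch read off the diagram; `dxt1_row_interleaved_offset` is a made-up name, and the texture is assumed to be at most 32 pixels high so that its (up to four) rows of swizzle blocks fit into one 128-byte stride.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of the 4x4 DXT1 block at block
 * coordinates (bx, by) when the rows of swizzle blocks are interleaved
 * into each other's padding.  Each column of swizzle blocks occupies
 * 128 bytes; each row of swizzle blocks takes a 32-byte slice of it. */
size_t dxt1_row_interleaved_offset(int height, int bx, int by)
{
    assert(height > 0 && height <= 32);
    assert(bx >= 0 && by >= 0 && by < (height + 3) / 4);
    const size_t swz_x = (size_t)bx / 2;   /* column of swizzle blocks */
    const size_t swz_y = (size_t)by / 2;   /* row of swizzle blocks    */
    return swz_x * 128
         + swz_y * 32
         + (size_t)(by % 2) * 16
         + (size_t)(bx % 2) * 8;
}
```

For a 32x32 texture the last block (7, 7) lands at offset 504, so the whole texture fits in 512 bytes, the same as the unswizzled DXT1 data.
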
However, swizzling only seems to give a marginal speedup; in my tests, swizzled DXT1 was about 10% faster than unswizzled, and 40-50% slower than swizzled 16-bit (5551) textures. So there's probably not much use for it; if you need the memory, go with unswizzled DXT1, and if you want speed, go with uncompressed textures.

FWIW, here's the code I use to swizzle:

void swizzle_texture(const void * const src, void * const dest,
                     const int width, const int height, const int format)
{
    // ...
    if (format == GU_PSM_DXT1) {
        // Note: the following assumes width%8 == 0
        const int wblocks = width/8;       // 8x8-pixel swizzle blocks per row
        const int hblocks = (height+7)/8;  // Rows of swizzle blocks
        const uint8_t *sbase = (const uint8_t *)src;
        uint32_t *dptr = (uint32_t *)dest;
        int yblock;
        // One row of 4x4 DXT1 blocks is width*2 bytes; each pass uses two rows
        for (yblock = 0; yblock < hblocks; yblock++, sbase += 2*(width*2)) {
            const uint32_t *sptr0 = (const uint32_t *) sbase;
            const uint32_t *sptr1 = (const uint32_t *)(sbase + width*2);
            int xblock;
            // Source advances 16 bytes (two DXT1 blocks) per step;
            // destination advances 128 bytes (32 words)
            for (xblock = 0; xblock < wblocks;
                 xblock++, sptr0 += 4, sptr1 += 4, dptr += 32
            ) {
                dptr[0] = sptr0[1];  // Swap words for the PSP
                dptr[1] = sptr0[0];
                dptr[2] = sptr0[3];
                dptr[3] = sptr0[2];
                dptr[4] = sptr1[1];
                dptr[5] = sptr1[0];
                dptr[6] = sptr1[3];
                dptr[7] = sptr1[2];
            }
            if (height <= 32) {
                dptr -= (32*wblocks) - 8;  // Interleave rows
            }
        }
    }  // if (format == GU_PSM_DXT1)
}
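
The "swap words" step in the code above can be pulled out into a standalone pass. Standard DXT1 stores the two 16-bit endpoint colors in the first 32-bit word of each 8-byte block and the 2-bit indices in the second; the swap in the code suggests the GE wants the two words in the opposite order. A minimal sketch, with `dxt1_swap_words` being a made-up name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap the two 32-bit words of every 8-byte DXT1
 * block in place, mirroring the dptr[0] = sptr0[1]; dptr[1] = sptr0[0];
 * pattern in the swizzle code above. */
void dxt1_swap_words(uint32_t *blocks, size_t nblocks)
{
    assert(blocks != NULL || nblocks == 0);
    for (size_t i = 0; i < nblocks; i++) {
        const uint32_t tmp = blocks[2 * i];
        blocks[2 * i] = blocks[2 * i + 1];
        blocks[2 * i + 1] = tmp;
    }
}
```

Running this over a buffer once lets a later swizzle pass copy blocks verbatim; running it twice restores the original data.
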