Hey there!
I've been working on Parallel Split Shadow Mapping (aka Cascaded Shadow Mapping) for some time and I finally got something to show ya. For those who don't know what that is, take a look:
http://appsrv.cse.cuhk.edu.hk/~fzhang/pssm_project/
http://hax.fi/asko/PSSM.html
http://appsrv.cse.cuhk.edu.hk/~fzhang/pssm_vrcia/
Basically, the scene camera's view frustum gets split into a few parts and each split gets its own Shadow Map. Note that my frustum splits are scaled a bit to reduce errors at the split borders. The idea is that the nearer a split is to the viewer, the higher its effective shadow map resolution, which gives better quality where it matters most. This is good for large scenes, e.g. the game Crysis (which also uses PSSM in combination with Variance Shadow Mapping -> PSVSM).
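For anyone wondering where the split positions come from, here is a minimal Python sketch of the "practical split scheme" from the PSSM paper, which blends a logarithmic and a uniform distribution. The function name, the `blend` parameter and the example near/far values are my own picks for illustration, not taken from my engine code, and the border scaling mentioned above is not included.

```python
def split_distances(near, far, num_splits, blend=0.5):
    """Return num_splits + 1 view-space distances running from near to far.

    blend = 1.0 gives a purely logarithmic scheme, blend = 0.0 a purely
    uniform one. The default of 0.5 is an assumption for this sketch.
    """
    distances = []
    for i in range(num_splits + 1):
        fraction = i / num_splits
        log_split = near * (far / near) ** fraction
        uniform_split = near + (far - near) * fraction
        distances.append(blend * log_split + (1.0 - blend) * uniform_split)
    return distances


if __name__ == "__main__":
    # Hypothetical camera: near plane at 1, far plane at 1000, four splits.
    for index, distance in enumerate(split_distances(1.0, 1000.0, 4)):
        print(index, round(distance, 2))
```

Split i then covers the range from `distances[i]` to `distances[i + 1]`. A higher `blend` pushes more splits towards the viewer.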
There are a few differences between my implementation and the techniques proposed in the paper.
I don't use "geometry approximation", that is, the light camera encloses the whole frustum split, not just the shadow casters within that split.
I also don't use the far plane configuration yet. That means clamping the far plane of the camera's view frustum to the scene's bounding box to remove empty space on the Shadow Maps - but that's easy to do.
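Just to illustrate the far plane idea, here is a rough Python sketch: project the eight corners of the scene's bounding box onto the camera's view direction and use the largest depth as the new far plane. All names are hypothetical, the view direction is assumed to be unit length, and this is not code from my implementation.

```python
from itertools import product


def clamped_far_plane(camera_pos, camera_forward, camera_near, camera_far,
                      box_min, box_max):
    """Return a far distance that does not reach past the scene's bounding box.

    camera_forward is assumed to be a unit-length view direction.
    """
    max_depth = 0.0
    # zip pairs up (min, max) per axis; product enumerates the 8 box corners.
    for corner in product(*zip(box_min, box_max)):
        offset = [c - p for c, p in zip(corner, camera_pos)]
        depth = sum(o * f for o, f in zip(offset, camera_forward))
        max_depth = max(max_depth, depth)
    # Never go beyond the original far plane or in front of the near plane.
    return max(camera_near, min(camera_far, max_depth))
```

With a camera at the origin looking down +Z and a scene box that ends at z = 120, the far plane shrinks from whatever it was to 120, so the splits no longer waste Shadow Map space on emptiness.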
The paper also proposes one shadow map pass per split plus additional final passes to render the scene. This is required so that pixels have access to the correct Shadow Map. In other words: a frustum with four splits requires 4 Shadow Map passes + 4 scene passes = a hell of a lot of passes. This is possible with Truevision3D, but sh**ty, because you'd need to render to the main buffer many times in a single frame. This would screw up the Z-Buffer and also the FPS statistics, because TVEngine.Clear(true) does some FPS computations. Another variant would be to render the final scene passes to another Render Surface and draw it to the main buffer later on, which means one additional pass - yuck.
I came up with the idea of saving the 3 extra scene passes by selecting the right Shadow Map within the shader. This requires more shader instructions, but saves the additional frustum culling operations and geometry rendering on the CPU side - which is faster, believe me!
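The selection logic itself is simple. Here it is sketched in Python rather than shader code, assuming the pixel's view-space depth and the far distance of each split are available; the names and example distances are made up for illustration and the real pixel shader looks different.

```python
def select_split(view_depth, split_far_distances):
    """Return the index of the first split whose far distance covers view_depth."""
    for index, far_distance in enumerate(split_far_distances):
        if view_depth <= far_distance:
            return index
    # Pixels beyond the last split fall back to the last Shadow Map.
    return len(split_far_distances) - 1


if __name__ == "__main__":
    # Hypothetical far distances for four splits.
    far_distances = [10.0, 50.0, 200.0, 1000.0]
    print(select_split(60.0, far_distances))
```

The pixel then samples only the Shadow Map with that index, so one scene pass is enough.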

Actually, all my techniques use four frustum splits. The paper proposes one texture per Shadow Map, but I had the idea to "merge" all Shadow Maps into a single one. Since I only use split counts that are a power of 2, that's possible. This requires less memory and fewer shader instructions (just a few, 23->19 in the PS), but needs large texture resolutions. It would also be good to store each depth map in its own channel of an ARGB texture, but I'm planning to implement Variance Shadow Mapping (http://forum.beyond3d.com/showthread.php?p=975976), which requires two channels per Depth Map, so I dropped that idea.
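To give an idea of how the lookup into a merged Shadow Map can work, here is a small Python sketch that remaps a split's own texture coordinates into its tile of the big texture. It assumes the four maps sit in a 2x2 grid ordered row by row; that layout and the function name are assumptions for this example, not necessarily what my shader does.

```python
def atlas_uv(split_index, u, v, tiles_per_side=2):
    """Map (u, v) in [0, 1] of one split into its tile of the merged texture.

    Assumes tiles are laid out row by row in a tiles_per_side x tiles_per_side grid.
    """
    column = split_index % tiles_per_side
    row = split_index // tiles_per_side
    return (u + column) / tiles_per_side, (v + row) / tiles_per_side


if __name__ == "__main__":
    # The centre of each split's map lands in the centre of its tile.
    for split_index in range(4):
        print(split_index, atlas_uv(split_index, 0.5, 0.5))
```

In a shader this boils down to one scale and one offset per split, which fits with needing a single sampler instead of four. The downside is that each split only gets a quarter of the big texture, hence the large resolutions.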
This is what the "normal" way looks like:

But I did it this way:

Note that both textures are A16R16G16B16F, but I later switched to R32F because my card supports it. Some people say that the first format is hardware filtered, but it's not over here on an "old" Geforce 7800 GTX.
I also did some benchmarking of both techniques, which you may be interested in:


Both results are from a 1280x1024 scene without anti-aliasing. Both Shadow Maps were R32F in this test. This shows that my technique is a bit faster, but keep in mind that there are FOUR Shadow Maps in the other technique. Also note that there is no filtering (PCF/VSM) at the moment.
Aww.. and before I forget it:

That's with the highest settings, but without filtering it truly sucks.

I plan to optimize a few things, like the far plane configuration and the geometry approximation and maybe the code itself. I'll keep you up to date!