There’s something everyone must run into at some point. You’re happily rendering UI straightforwardly and then all the sudden for one reason or another you need to composite two layers. But you want it to look exactly (give or take some numerical precision issues) like it did before.
I’ve done this before but never in a fully satisfactory way. This time I decided I wanted the perfect solution. I decided to write this out while I work through it.
When compositing layers, the first problem you run into is that the alpha values in the top layer are nonsense. Well, what should they even be, anyway? The alpha values are meaningless until you need to composite that layer. You probably have no control over those values. They might be all 0, or all 1. They might be a complete mess.
Let’s say I have layer TOP and BOT and BOT just contains a wallpaper. I clear TOP to 0,0,0,0 and then draw an opaque triangle texture on it. Now to composite I draw TOP normally, as long as I replicate the triangle’s alpha values into TOP.
Suppose I scale the triangle a bit so it’s linear filtered. Now what we will run into is texels (perhaps 1,1,1,0) from outside the triangle texture getting incorporated into the filtering. You have to use premultiplied alpha to fix that (a 1,1,1,0 pixel will be come 0,0,0,0 after multiplying rgb *= a). This is well-known. I just wanted to get it out of the way and make you visualize the fuzzy-edged alpha values that will get produced by this. If we had a white triangle, we may end up with (0.5,0.5,0.5,0.5) at the edge (with premultiplied alpha, that’s a half-transparent full-white).
When this is drawn normally onto BOT, everything still works normally.
Now, let’s draw a big red semitransparent box over all that. Let’s do it at alpha=0.1 … now what should the alpha values be where the semitransparent box overlays the fuzzy edges in the triangle? We have 0.5 alpha clashing with 0.1 alpha. Do you want to to multiply them? That’s wrong… it suggests we now have an alpha of 0.05 which is basically invisible. The art on TOP should have produced higher alpha values. Do you want to add them? You’ll exceed 1.0 at some point. Addition/subtraction isn’t logical and it won’t work.
There’s not a lot you can do with the ROP (blending) units on the GPU so you may be tempted to go to the shader. But there’s no such thing as a free lunch… you can’t just do what amounts to blending on the shader without losing something, and it turns out that’s performance. GPUs just aren’t meant to do this. Maybe you know a way to do it with a cutting edge GPU and API, but: I doubt it, and there’s still no way it’s as fast as if you used the very old GPU programming models that were designed to be efficient from the outset. And anyway you still have to figure out the math!
In theory, if we weren’t compositing, after drawing the triangle edges we would have something like this:
WALLPAPER * 0.5 + TRIANGLE * 0.5
and after drawing the box we would have
(WALLPAPER * 0.5 + TRIANGLE * 0.5) * 0.9 + BOX * 0.1
-> WALLPAPER * 0.45 + TRIANGLE * 0.45 + BOX * 0.1
We note that the wallpaper represents the BOT layer and so when we composite (using only the information created while rendering TOP) we need to end up with a TOP alpha of 0.55 in order to blend the layers correctly. So somehow we need to end up with an 0.55. In particular we needed to get:
0.55 = 1.0 - 0.5 * (1-0.1)
dest_alpha = 1.0 - dest_alpha * (1-src_alpha)
So now we have to figure out how to produce that with normal GPU resources. Right off the bat, we have the ROP blending capabilities problem. While we could multiply DST and ONE_MINUS_SRC, there’s no way to get the extra 1.0- in there. Maybe we can do some algebra to help?
dest_alpha = (1.0 - dest_alpha) + (dest_alpha * src_alpha)
This isn’t helping, there’s just no way to deal with the 1.0-. Even though we have ONE_MINUS_DST_ALPHA, we can’t use it without multiplying it with SRC_ALPHA or DST_ALPHA which we don’t want (that’s simply how the blenders work.)
So what we’ll have to do is invert our sense of alpha so we don’t have this problem. Now we’d need to have 0.45 in there (we’ll un-invert this sense in the compositing stage, trust me). So to be clear:
0.45 = 0.5 * (1-0.1)
dest_alpha = dest_alpha * (1-src_alpha)
In other words, it’s where we got stuck before but without the 1.0- because that’s always baked into the dest_alpha.
Well, this looks much easier. now we can use a ONE_MINUS_SRC_ALPHA * DST_ALPHA configuration in the blender. But there’s one snag. Instead of clearing to 0,0,0,0 (fully transparent, ordinarily) we now need to clear to 0,0,0,1 (fully transparent, with the inverted alpha sense).
Now, how to composite it? Well, this is what’s needed (with the un-inverting taken into consideration):
dest_color = (dest_color * src_alpha) + (src_color * (1.0-src_alpha))
And that’s simply a blender configuration of:
SRC_COLOR * ONE_MINUS_SRC_ALPHA + DST_COLOR * SRC_ALPHA
The inverting of the alpha sense happens here, where you can see things are plainly backwards-seeming.
In all my years of doing this kind of stuff, I think I’ve never seen it written out this way before. You’d think the key characteristics of having an inverted sense of alpha and clearing to 0,0,0,1 would be memorable. But no… Finally, after writing all this, I thought of enough keywords to find an article that mentioned it, in the 2nd half:
http://hacksoflife.blogspot.com/2010/02/alpha-blending-back-to-front-front-to.html
A “little bit of head scratching” indeed.
I posted this anyway so you could see my analytical method involving trial numbers and so I’d have something to use to stop losing gruedorf.
UPDATE 2022:
It seems I forgot something here. Ordinarily when using premultiplied alpha, it affects the blending you use during composition. Without it you’d have:
SRC_COLOR * SRC_ALPHA + DST_COLOR * ONE_MINUS_SRC_ALPHA
With it you have:
SRC_COLOR * ONE + DST_COLOR * ONE_MINUS_SRC_ALPHA
So let me correct the analysis above. I said we’d need this:
dest_color = (dest_color * src_alpha) + (src_color * (1.0-src_alpha))
But actually src_color has src_alpha already incorporated via the premultiplication (actually it’s all muddled due to the inverse sense of alpha). So really what we need is this:
dest_color = (dest_color * src_alpha) + ((src_color/(1.0-src_alpha) * (1.0-src_alpha))
-> dest_color = (dest_color * src_alpha) + src_color
And thus a blender configuration of
SRC_COLOR * ONE + DST_COLOR * SRC_ALPHA