AFAIK the last column (39) was plain second-line AFLI anyway in NUFLI, as there's no underlay. I wonder how you manage to squeeze in 7 color updates per line, as the AFLI line consumes 40 DMA cycles and the sprites eat quite a few DMA cycles too. If you need to cut down on color update writes in one line to save cycles, doesn't this restriction carry over onto the upcoming lines, and thus again build dependencies outside the current AFLI block and sprite block boundaries? Decisions made in a prior line constrain the colour choices available on the current line, and so on.
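To put rough numbers on that cycle budget, here is a back-of-the-envelope sketch. It is not anyone's actual NUFLI code. It assumes a PAL machine (63 cycles per raster line), 40 cycles stolen on a bad line, 2 cycles stolen per active sprite, and 4 cycles per absolute STA with the value already sitting in a register. It deliberately ignores the 3-cycle BA lead-in before each DMA stall, the cost of loading values, and where exactly the writes have to land, so treat the result as an optimistic upper bound. The function names are made up for illustration.

```python
# Rough per-line CPU budget on a PAL C64 (optimistic upper bound only).
PAL_CYCLES_PER_LINE = 63  # 6569 PAL VIC-II
BADLINE_DMA_CYCLES = 40   # character pointer fetches on a bad line
SPRITE_DMA_CYCLES = 2     # CPU cycles stolen per active sprite
STA_ABS_CYCLES = 4        # STA $D0xx, value already in A/X/Y


def free_cpu_cycles(bad_line: bool, sprites: int) -> int:
    """CPU cycles left on one raster line after VIC-II DMA (hypothetical helper)."""
    if not 0 <= sprites <= 8:
        raise ValueError("the VIC-II has 8 sprites")
    stolen = (BADLINE_DMA_CYCLES if bad_line else 0) + SPRITE_DMA_CYCLES * sprites
    return max(PAL_CYCLES_PER_LINE - stolen, 0)


def max_register_writes(bad_line: bool, sprites: int) -> int:
    """Upper bound on plain STA writes that fit into the remaining cycles."""
    return free_cpu_cycles(bad_line, sprites) // STA_ABS_CYCLES


if __name__ == "__main__":
    for bad_line in (True, False):
        for sprites in (0, 6, 8):
            print(bad_line, sprites,
                  free_cpu_cycles(bad_line, sprites),
                  max_register_writes(bad_line, sprites))
```

Under these assumptions a bad line with all eight sprites active leaves only 7 CPU cycles, which is why 7 colour writes on such a line looks impossible without moving some of them to neighbouring lines.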
I would love to see that compute shader code :) Guess we'll have to wait for that article?
I'm hoping to be able to release it all later this month.
But ok, joking aside, I get that the core routine that runs on the C64 is... NUFLI. The improvements are more on the preparation side, where it for sure makes a f**kload of sense to presort sprite colour register writes. So I tend to think of this more as a "new Mufflon" materializing soon than as a new "mode", since it's still NUFLI.