CSDb User Forums


2024-11-10 22:46
cobbpg

Registered: Jan 2022
Posts: 36
Taking NUFLI one step further

I'm working on a new converter for creating images the C64 can display with as much freedom as possible: NUFLI Studio preview video.

The images are NUFLI with the sprite colour limitations lifted: all sprite colours outside the FLI bug area can potentially change in every row. While this is impossible in general due to CPU time limitations, the solution is to generate the speedcode that updates the registers ahead of time. The code generator can make informed decisions about dropping the least impactful register changes to fit everything into the available budget. In practice, most pictures don't require all the 10 possible colour updates on every scanline, far from it.

Another big innovation compared to Mufflon is the improvement in conversion speed. Lifting the sprite colour change limitations makes it easy to fully parallelise the brute force colour search step for each 48x2 pixel area (or 24x2 pixels over the FLI bug), and this allowed me to implement it as a compute shader. The whole process takes about 0.25 seconds on my three-year-old gaming laptop. Besides, when using the internal editor, only the affected areas are recomputed, and they can be previewed in VICE right away (note that the video shows some lag that's probably introduced by OBS somehow; it doesn't happen outside recording).
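
To make the parallel decomposition concrete, here is a minimal sketch of how the picture could be cut into independent work items. It assumes a 320x200 bitmap, a 24-pixel-wide FLI bug area on the left and six 48-pixel sections to its right; the rightmost character column is left out. The function name and tuple layout are made up for illustration and are not taken from the tool.

```python
def section_rects(width=320, height=200):
    """Return (x, y, w, h) for every section that can be searched on its own.

    Assumed layout: 24 px FLI bug area, then six 48 px wide sections.
    The last 8 px column is not covered by this sketch.
    """
    assert width >= 24 + 6 * 48
    rects = []
    for y in range(0, height, 2):              # one row of sections per two scanlines
        rects.append((0, y, 24, 2))            # 24x2 section over the FLI bug
        for x in range(24, 24 + 6 * 48, 48):   # six 48x2 sections
            rects.append((x, y, 48, 2))
    return rects


if __name__ == "__main__":
    rects = section_rects()
    print(len(rects), "independent sections")
```

Since no section depends on the result of another, each rectangle can be handed to its own shader invocation or worker thread.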

My hope is that making the feedback practically instantaneous (even when using an external image editor) opens the door for pixel artists to develop a much better intuition about this image format. Also, removing the limitation of only changing sprite colours every second row should make it a lot less frustrating to work with.

I'm not sure what to call this image format. It is 95% NUFLI; the only real difference is that the speedcode is also generated ahead of time (when run on NTSC, the displayer routine patches it to make it work), so the files are 4096 bytes bigger (they load from $1000 instead of $2000; the rest of the structure is almost identical). I'm leaning towards "NUFLI2" to make it somewhat search engine friendly, but I'm open to ideas.

At the moment I'm in the process of writing a manual for the tool and a deep-dive article about the technical details. Hopefully neither of those will take too long!
 
... 76 posts hidden ...
 
2024-11-12 10:48
Frantic

Registered: Mar 2003
Posts: 1648
Quote: add 'F' for 'flexible'?
FNUFLI or NUFLIF?


FNUFLI was a good one, I think.
2024-11-12 11:29
PopMilo

Registered: Mar 2004
Posts: 146
Looking great imho !
I would love to see that compute shader code :)
Guess we'll have to wait for that article ?
2024-11-12 18:17
cobbpg

Registered: Jan 2022
Posts: 36
Quoting Bitbreaker
afaik the last column (39) was plain second-line AFLI anyway in NUFLI, as there's no underlay. I wonder how you manage to squeeze in 7 colour updates per line, as the AFLI line consumes 40 DMA cycles and the sprites consume quite a few DMA cycles too. If you need to cut down on colour update writes in one line for the sake of saving cycles, doesn't this restriction carry over to the upcoming lines, again building dependencies beyond the current AFLI block and sprite block boundaries? Decisions made in a prior line affect the amount of colour choice on the current line, and so on.

You can define up to 17 colour updates per AFLI block (i.e. two scanlines): 4 for the FLI bug, 2*6 for the underlay in columns 3-38, and the border.

My code generation algorithm starts by making a complete list of register writes to realise the desired picture. Some of these are fixed to a certain scanline, others are what I call deferrable, i.e. they can be issued within a range of scanlines (basically where they are unused, hence their value doesn't matter, or they are the sprite Y updates). Every register write has its address, value and valid cycle interval recorded.
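
As an illustration only, a record for one such register write might look like the sketch below. The field names, the split into a scanline range plus a per-line cycle interval, and the helper functions are assumptions made for this example; the post only says that address, value and valid cycle interval are recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterWrite:
    address: int      # target register, e.g. 0xD027 (sprite 0 colour)
    value: int        # byte to store
    first_line: int   # earliest scanline on which the write may be issued
    last_line: int    # latest scanline on which it may be issued
    first_cycle: int  # start of the valid cycle interval within a line
    last_cycle: int   # end of the valid cycle interval (inclusive)

    @property
    def deferrable(self) -> bool:
        # A write is deferrable when it may go out on more than one scanline.
        return self.last_line > self.first_line


def due_on_line(writes, line):
    """Writes that can no longer be postponed past this scanline."""
    return [w for w in writes if w.last_line == line]


def optional_on_line(writes, line):
    """Deferrable writes that may be issued now but can still wait."""
    return [w for w in writes if w.first_line <= line < w.last_line]
```

A generator walking the screen top to bottom could emit everything from `due_on_line` first and use `optional_on_line` to fill whatever cycles remain.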

I go line by line from the top of the screen and start generating code such that all the timing and value constraints are met. Each update costs 4 or 6 cycles depending on whether an existing register value can be reused or not. I also take advantage of the fact that only the lower 4 bits matter for colours, so e.g. if we're supposed to write $3c to $d011 and we need to set something mid grey at the same time, we can potentially reuse the value.
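
The cost rule can be sketched as follows. This is a simplified model, not the actual generator: it assumes the 4-cycle case is a bare absolute store from a CPU register that already holds a usable value, and the 6-cycle case is an immediate load followed by that store. The function names and the way CPU register contents are passed in are made up for the example.

```python
# VIC-II colour registers $D020-$D02E; only their low 4 bits are significant.
COLOUR_REGISTERS = set(range(0xD020, 0xD02F))


def value_matches(address, wanted, held):
    """True if storing `held` to `address` has the same effect as storing `wanted`."""
    if address in COLOUR_REGISTERS:
        return (wanted & 0x0F) == (held & 0x0F)
    return wanted == held


def write_cost(address, wanted, cpu_registers):
    """Cycles for one update, given the values currently held in A, X and Y."""
    if any(value_matches(address, wanted, held) for held in cpu_registers):
        return 4  # store only: a suitable value is already loaded
    return 6      # load immediate (2 cycles) + store absolute (4 cycles)


if __name__ == "__main__":
    # A holds $3c for $d011; mid grey ($0c) for a sprite colour can reuse it.
    print(write_cost(0xD027, 0x0C, [0x3C]))
```

The reverse does not hold: a register holding $0c cannot stand in for a $3c write to $d011, because all eight bits matter there.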

It turns out that for a typical picture you can successfully fit the required code into the time budget on most scanlines. When I run out of time, I remove an update or two based on some heuristics (e.g. I start with removing bug updates, then if that's not sufficient, I'll remove some from the middle section as well, and always pick the one whose removal adds the smallest error) and try again until the resulting code fits. I'm always guaranteed to get 6 updates just like plain NUFLI, but typically it's more due to repeated colours.
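
A rough sketch of that retry loop is shown below. It is deliberately simplified: each update carries a fixed cycle cost, whereas in the real generator the cost depends on which values can be reused, so the code would have to be regenerated after every removal. The `Update` fields and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Update:
    name: str
    cost: int         # cycles needed for this update (4 or 6)
    error: float      # error added to the picture if the update is dropped
    in_fli_bug: bool  # True for updates that only affect the FLI bug area


def fit_to_budget(updates, budget):
    """Drop updates until the line fits; returns (kept, dropped)."""
    kept = list(updates)
    dropped = []
    while kept and sum(u.cost for u in kept) > budget:
        # Prefer sacrificing FLI bug updates; fall back to the middle section.
        bug_updates = [u for u in kept if u.in_fli_bug]
        pool = bug_updates if bug_updates else kept
        victim = min(pool, key=lambda u: u.error)  # smallest added error
        kept.remove(victim)
        dropped.append(victim)
    return kept, dropped
```

The list of dropped updates is also what a tool could use to report where the picture had to deviate from the artist's intent.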

What matters is that the process is guaranteed to produce an executable, and it gives you feedback about the CPU time used (the green/red bars on the right hand side). This way the artist can easily tell if they are allowed to add more colour in some area or have to cut back on changes in another to guarantee a certain outcome. In practice this system works surprisingly well.

As for update removals affecting subsequent lines, that's pretty much limited to the FLI bug, since with 6 updates you can always make sure to leave the underlays in the middle section in the desired state. Well, unless you change the border colour too, but just don't do that in the most contested areas. :P

Quoting PopMilo
I would love to see that compute shader code :)
Guess we'll have to wait for that article ?

The code is just a bunch of exceedingly boring nested loops, so don't have high expectations of brilliance. ;) It's very similar to what Mufflon does: just try all combinations and pick the one with the least overall error for each section. And yeah, I'll document everything first, which should also make the code easier to understand. I'm hoping to be able to release it all later this month.
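
For readers who want a feel for the "try all combinations" idea before the article is out, here is a heavily simplified sketch. It ignores all real NUFLI constraints (bitmap cells, underlay sprites) and just picks the set of palette entries that minimises the summed squared RGB error for a handful of pixels. The function names and the three-entry palette are made up; this is not the shader code.

```python
from itertools import combinations


def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_colour_set(pixels, palette, count):
    """Try every `count`-sized subset of the palette and keep the best one.

    Each pixel is mapped to the nearest colour of the subset; the subset
    with the lowest total error wins. Returns (palette indices, error).
    """
    best_choice, best_error = None, None
    for choice in combinations(range(len(palette)), count):
        error = sum(
            min(squared_distance(p, palette[i]) for i in choice)
            for p in pixels
        )
        if best_error is None or error < best_error:
            best_choice, best_error = choice, error
    return best_choice, best_error


if __name__ == "__main__":
    palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]  # toy palette
    pixels = [(10, 10, 10), (250, 250, 250), (0, 0, 0)]
    print(best_colour_set(pixels, palette, 2))
```

With a 16-colour palette and small sections the search space stays modest per section, which is why running many of these searches side by side is attractive.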
2024-11-12 19:26
Jetboy

Registered: Jul 2006
Posts: 338
Quoting cobbpg
I'm hoping to be able to release it all later this month.


Christmas is coming early this year :) I just can't wait :)
2024-11-13 15:24
Jammer

Registered: Nov 2002
Posts: 1336
SNUFLI like Speedcode NUFLI? And has some ring to it ;)
2024-11-13 15:47
cobbpg

Registered: Jan 2022
Posts: 36
Or RUFLI for Rejuvenated Underlay FLI, pronounced like "roughly". :P
2024-11-13 16:20
Jetboy

Registered: Jul 2006
Posts: 338
PROFLI?

FLIPROQUO?

COBBPGFLI ?

ONUFLI (Optimised NUFLI, or Oh my god! FLI)?
2024-11-13 17:30
sebalozlepsi

Registered: Mar 2010
Posts: 23
NUFLIX
2024-11-13 21:28
Copyfault

Registered: Dec 2001
Posts: 478
The example video link in the first post is really something! Looks VERY promising and the "response time" you have while... "pixelling" is simply next level!

Not meant too seriously, but you should maybe keep this for the next Performers-Demo and name the mode 'NextFli' ;)

But ok, joking aside, I get that the core routine that runs on the C64 is... NUFLI. The improvements are more on the preparation side, where it for sure makes a f**kload of sense to presort sprite colour reg writes... so, somehow I tend to think more of a "new Mufflon" to materialize soon, instead of a new "mode", since it's still NUFLI. Ofc, you already threw in the things that have to be implemented in a future version: full VIC register f**kery! THIS would really justify calling it a new mode, and I know at least one person who has had this idea for a few years (Ptoing, DO YOU READ?)

Please DO NOT read any ill-meant criticism into this text; it's meant purely constructively, and I bow in deep respect at seeing that these NUFLI mode dependencies have been cracked (in a way) so that you are able to feed this through shader logic!


Looking forward to reading *that* article you're writing on all of this :)

Cheeeerz

Copyfault
2024-11-14 03:21
Martin Piper

Registered: Nov 2007
Posts: 726
Quoting Copyfault
But ok, joking aside, I get that the core routine that runs on the C64 is... NUFLI. The improvements are more on the preparation side, where it for sure makes a f**kload of sense to presort sprite colour reg writes... so, somehow I tend to think more of a "new Mufflon" to materialize soon, instead of a new "mode", since it's still NUFLI.


The question of what to name something when the C64 display mode effect is the same, but the data preparation is what makes it special, is an interesting one.

When I did the coding for the video sequences in Hunter's Moon, it just used multicolour bitmap with sprites over the top. There's nothing special about the display mode. But the algorithm compressing all the data, making it fast enough to decompress while optimising bitmap and sprite data, that's the special part. But does the data processing deserve to get a name?
All times are CET.