CSDb User Forums


2024-11-10 22:46
cobbpg

Registered: Jan 2022
Posts: 52
Taking NUFLI one step further

I'm working on a new converter for creating images the C64 can display with as much freedom as possible: NUFLI Studio preview video.

The images are NUFLI with the sprite colour limitations lifted: all sprite colours outside the FLI bug area can potentially change in every row. While this is impossible in general due to CPU time limitations, the solution is to generate the speedcode that updates the registers ahead of time. The code generator can make informed decisions about dropping the least impactful register changes to fit everything into the available budget. In practice, most pictures don't require all the 10 possible colour updates on every scanline, far from it.

Another big innovation compared to Mufflon is the improvement in conversion speed. Lifting the sprite colour change limitations makes it easy to fully parallelise the brute force colour search step for each 48x2 pixel area (or 24x2 pixels over the FLI bug), and this allowed me to implement it as a compute shader. The whole process takes about 0.25 seconds on my three-year-old gaming laptop. Besides, when using the internal editor, only the affected areas are recomputed, and they can be previewed in VICE right away (note that the video shows some lag that's probably introduced by OBS somehow; it doesn't happen outside recording).

My hope is that making the feedback practically instantaneous (even when using an external image editor) opens the door for pixel artists to develop a much better intuition about this image format. Also, removing the limitation of only changing sprite colours every second row should make it a lot less frustrating to work with.

I'm not sure what to call this image format. This is 95% NUFLI, the only real difference is that the speedcode is also generated ahead of time (when run on NTSC, there's a patching step done by the displayer routine to make it work), so the files are 4096 bytes bigger (they load from $1000 instead of $2000, the rest of the structure is almost identical). I'm leaning towards "NUFLI2" to make it somewhat search engine friendly, but I'm open to ideas.

At the moment I'm in the process of writing a manual for the tool and a deep-dive article about the technical details. Hopefully neither of those will take too long!
 
 
2024-11-12 10:08
Bitbreaker

Registered: Oct 2002
Posts: 510
Quoting cobbpg

Quoting Bitbreaker
What still remains is the dependencies that AFLI blocks drag into the next sprite block and vice versa? As sprite colors are updated in odd lines and AFLI is updated in even lines, the decisions on colors made for either AFLI or sprites always have an impact on the decisions on the next lines, as one sprite block overlaps 2 AFLI blocks and one AFLI block overlaps with 2 sprite blocks.

No, this is solved, as this tool allows you to change underlay colours (outside the FLI bug) on every one of the 200 scanlines. There's one caveat: the rightmost 8 pixels of the rightmost column (i.e. between pixels 304 and 311 in each row) can only change on odd lines due to VIC timing limitations. They could be changed fully on NTSC, but not on PAL, sadly.


afaik the last column (39) was plain second line AFLI anyway in NUFLI, as there's no underlay. I wonder how you manage to squeeze in 7 color updates per line, as the AFLI line consumes 40 DMA cycles, and the sprites also consume quite a few DMA cycles. If you need to cut down on color update writes in one line to save cycles, this restriction nevertheless carries over onto the upcoming lines and thus again builds dependencies outside the current AFLI block and sprite block boundaries? Decisions made in a prior line affect the amount of colour choice on the current line and so on.
Curious about the outcome and a first release to play around with. If the sprite/AFLI-block overlap stuff drops, the format is also way better to draw by hand, though still painful with different pixel sizes regarding underlay and hires.
2024-11-12 10:48
Frantic

Registered: Mar 2003
Posts: 1661
Quote: add 'F' for 'flexible'?
FNUFLI or NUFLIF?


FNUFLI was a good one, I think.
2024-11-12 11:29
PopMilo

Registered: Mar 2004
Posts: 146
Looking great imho !
I would love to see that compute shader code :)
Guess we'll have to wait for that article ?
2024-11-12 18:17
cobbpg

Registered: Jan 2022
Posts: 52
Quoting Bitbreaker
afaik the last column (39) was plain second line AFLI anyway in NUFLI, as there's no underlay. I wonder how you manage to squeeze in 7 color updates per line, as the AFLI line consumes 40 DMA cycles, and the sprites also consume quite a few DMA cycles. If you need to cut down on color update writes in one line to save cycles, this restriction nevertheless carries over onto the upcoming lines and thus again builds dependencies outside the current AFLI block and sprite block boundaries? Decisions made in a prior line affect the amount of colour choice on the current line and so on.

You can define up to 17 colour updates per AFLI block (i.e. two scanlines): 4 for the FLI bug, 2*6 for the underlay in columns 3-38, and the border.

My code generation algorithm starts by making a complete list of register writes to realise the desired picture. Some of these are fixed to a certain scanline, others are what I call deferrable, i.e. they can be issued within a range of scanlines (basically where they are unused, hence their value doesn't matter, or they are the sprite Y updates). Every register write has its address, value and valid cycle interval recorded.
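As a rough illustration of the bookkeeping described above, here is a minimal Python sketch of one entry in such a list. The class name, the field names and the `is_deferrable` rule (a valid window at least one 63-cycle PAL line long) are assumptions made for this example, not the tool's actual data model.

```python
from dataclasses import dataclass

PAL_CYCLES_PER_LINE = 63  # CPU cycles per raster line on a PAL C64


@dataclass(frozen=True)
class RegisterWrite:
    address: int      # target register, e.g. 0xD027 (sprite 0 colour)
    value: int        # byte to store
    first_cycle: int  # earliest cycle at which the write may land
    last_cycle: int   # latest cycle at which the write may land

    def is_deferrable(self) -> bool:
        # Assumption for this sketch: a write counts as deferrable when
        # its valid window is at least one whole scanline long.
        return self.last_cycle - self.first_cycle >= PAL_CYCLES_PER_LINE


# A write pinned to a narrow window on one line, and one that can float
# across three lines because the register is unused in between.
fixed = RegisterWrite(0xD021, 0x0B, first_cycle=630, last_cycle=640)
floating = RegisterWrite(0xD027, 0x0C, first_cycle=630, last_cycle=630 + 3 * 63)
```

A scheduler can then place the fixed writes first and slot the deferrable ones into whatever cycles are left over inside their windows.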

I go line by line from the top of the screen and start generating code such that all the timing and value constraints are met. Each update costs 4 or 6 cycles depending on whether an existing register value can be reused or not. I also take advantage of the fact that only the lower 4 bits matter for colours, so e.g. if we're supposed to write $3c to $d011 and we need to set something mid grey at the same time, we can potentially reuse the value.
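The 4-or-6-cycle rule can be sketched as a small cost function. This is only an illustration under assumptions: the function names are made up, the CPU state is modelled as a plain list of the values held in A, X and Y, and $d020-$d02e is taken as the range of VIC-II colour registers where only the low nibble counts.

```python
COLOUR_REGS = range(0xD020, 0xD02F)  # VIC-II colour registers $d020-$d02e


def value_matches(address, wanted, held):
    """True if storing `held` to `address` has the same effect as `wanted`."""
    if address in COLOUR_REGS:
        # Colour registers only use the lower 4 bits.
        return (wanted & 0x0F) == (held & 0x0F)
    return wanted == held


def write_cost(address, value, cpu_regs):
    """Cycles needed for one register update.

    cpu_regs: values currently held in A, X and Y.
    """
    if any(value_matches(address, value, held) for held in cpu_regs):
        return 4  # a single STA/STX/STY absolute
    return 6      # load immediate (2 cycles) + store absolute (4 cycles)


# $3c was just loaded for $d011, so mid grey ($0c) for a sprite colour is free
# to reuse it, while a dark grey ($0b) needs a fresh load.
reuse = write_cost(0xD027, 0x0C, [0x3C])
fresh = write_cost(0xD027, 0x0B, [0x3C])
```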

It turns out that for a typical picture you can successfully fit the required code into the time budget on most scanlines. When I run out of time, I remove an update or two based on some heuristics (e.g. I start with removing bug updates, then if that's not sufficient, I'll remove some from the middle section as well, and always pick the one whose removal adds the smallest error) and try again until the resulting code fits. I'm always guaranteed to get 6 updates just like plain NUFLI, but typically it's more due to repeated colours.
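The drop-and-retry loop might look roughly like the sketch below. It is a simplification under stated assumptions: the `Update` record, its `error` field and the fixed per-update cycle cost are hypothetical, whereas the real generator regenerates the code after each removal, so costs can change between attempts.

```python
from typing import NamedTuple


class Update(NamedTuple):
    name: str
    cycles: int        # 4 or 6, kept fixed in this sketch
    in_fli_bug: bool   # True for updates in the FLI bug area
    error: float       # hypothetical visual error added if dropped


def fit_to_budget(updates, budget):
    """Drop updates until the line fits; returns (kept, dropped)."""
    kept = list(updates)
    dropped = []
    while kept and sum(u.cycles for u in kept) > budget:
        # FLI bug updates go first; within a group, the one whose
        # removal adds the smallest error is removed.
        victim = min(kept, key=lambda u: (not u.in_fli_bug, u.error))
        kept.remove(victim)
        dropped.append(victim)
    return kept, dropped


line = [
    Update("bug0", 6, True, 5.0),
    Update("mid0", 6, False, 1.0),
    Update("mid1", 4, False, 9.0),
    Update("bug1", 6, True, 2.0),
]
kept, dropped = fit_to_budget(line, budget=12)
```

With a budget of 12 cycles, both bug updates are dropped (cheapest error first) before any middle-section update is touched.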

What matters is that the process is guaranteed to produce an executable, and it gives you feedback about the CPU time used (the green/red bars on the right hand side). This way the artist can easily tell if they are allowed to add more colour in some area or have to cut back on changes in another to guarantee a certain outcome. In practice this system works surprisingly well.

As for update removals affecting subsequent lines, that's pretty much limited to the FLI bug, since with 6 updates you can always make sure to leave the underlays in the middle section in the desired state. Well, unless you change the border colour too, but just don't do that in the most contested areas. :P

Quoting PopMilo
I would love to see that compute shader code :)
Guess we'll have to wait for that article ?

The code is just a bunch of exceedingly boring nested loops, so don't have high expectations of brilliance. ;) It's very similar to what Mufflon does: just try all combinations and pick the one with the least overall error for each section. And yeah, I'll document everything first, which should also make the code easier to understand. I'm hoping to be able to release it all later this month.
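For readers who want a feel for "try all combinations and pick the least error", here is a heavily simplified Python sketch for a single section. It is not the actual shader: the function name, the RGB squared-distance metric and the flat "pick `slots` colours for the whole section" model are assumptions, and the real search also has to respect the hires and underlay structure of the format.

```python
from itertools import combinations


def best_colours(pixels, palette, slots):
    """Exhaustively pick `slots` palette entries minimising total error.

    pixels, palette: lists of (r, g, b) tuples.
    Returns (total_error, tuple_of_palette_indices).
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best = None
    for combo in combinations(range(len(palette)), slots):
        # Each pixel is rendered with the closest colour of this combination.
        err = sum(min(dist(p, palette[i]) for i in combo) for p in pixels)
        if best is None or err < best[0]:
            best = (err, combo)
    return best


palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
section = [(10, 10, 10), (250, 250, 250), (0, 0, 0)]
result = best_colours(section, palette, slots=2)
```

Since each section is scored independently of the others, the same function can be run for all sections at once, which is what makes the step a natural fit for a compute shader.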
2024-11-12 19:26
Jetboy

Registered: Jul 2006
Posts: 363
Quoting cobbpg
I'm hoping to be able to release it all later this month.


Christmas is coming early this year :) I just can't wait :)
2024-11-13 15:24
Jammer

Registered: Nov 2002
Posts: 1343
SNUFLI like Speedcode NUFLI? And has some ring to it ;)
2024-11-13 15:47
cobbpg

Registered: Jan 2022
Posts: 52
Or RUFLI for Rejuvenated Underlay FLI, pronounced like "roughly". :P
2024-11-13 16:20
Jetboy

Registered: Jul 2006
Posts: 363
PROFLI?

FLIPROQUO?

COBBPGFLI ?

ONUFLI (Optimised NUFLI, or Oh my god! FLI)?
2024-11-13 17:30
sebalozlepsi

Registered: Mar 2010
Posts: 23
NUFLIX
2024-11-13 21:28
Copyfault

Registered: Dec 2001
Posts: 487
The example video link in the first post is really something! Looks VERY promising and the "response time" you have while... "pixelling" is simply next level!

Not meant too seriously, but you should maybe keep this for the next Performers demo and name the mode 'NextFli' ;)

But ok, joking aside, I get that the core routine that runs on the C64 is... NUFLI. The improvements are more on the preparation side, where it for sure makes a f**kload of sense to presort sprite colour reg writes... so, somehow I tend to think of this more as a "new Mufflon" materializing soon than as a new "mode", since it's still NUFLI. Ofc, you already threw in the things that have to be implemented in a future version: full VIC register f**kery! THIS would really justify calling it a new mode, and I know at least one person who has had this idea for a few years (Ptoing, DO YOU READ?)

Please DO NOT read any evil criticism into this text, it's really meant solely as constructive feedback, and I bow in deep respect on seeing that these NUFLI-mode dependencies have been cracked (in a way) so you are able to feed this through shader logic!


Looking forward to reading *that* article you're writing on all of this :)

Cheeeerz

Copyfault