CSDb User Forums


2024-11-10 22:46
cobbpg

Registered: Jan 2022
Posts: 15
Taking NUFLI one step further

I'm working on a new converter for creating images the C64 can display with as much freedom as possible: NUFLI Studio preview video.

The images are NUFLI with the sprite colour limitations lifted: all sprite colours outside the FLI bug area can potentially change in every row. While this is impossible in general due to CPU time limitations, the solution is to generate the speedcode that updates the registers ahead of time. The code generator can make informed decisions about dropping the least impactful register changes to fit everything into the available budget. In practice, most pictures don't require all the 10 possible colour updates on every scanline, far from it.

Another big innovation compared to Mufflon is the improvement in conversion speed. Lifting the sprite colour change limitations makes it easy to fully parallelise the brute force colour search step for each 48x2 pixel area (or 24x2 pixels over the FLI bug), and this allowed me to implement it as a compute shader. The whole process takes about 0.25 seconds on my three-year-old gaming laptop. Besides, when using the internal editor, only the affected areas are recomputed, and they can be previewed in VICE right away (note that the video shows some lag that's probably introduced by OBS somehow; it doesn't happen outside recording).

My hope is that making the feedback practically instantaneous (even when using an external image editor) opens the door for pixel artists to develop a much better intuition about this image format. Also, removing the limitation of only changing sprite colours every second row should make it a lot less frustrating to work with.

I'm not sure what to call this image format. This is 95% NUFLI; the only real difference is that the speedcode is also generated ahead of time (when run on NTSC, there's a patching step done by the displayer routine to make it work), so the files are 4096 bytes bigger (they load from $1000 instead of $2000; the rest of the structure is almost identical). I'm leaning towards "NUFLI2" to make it somewhat search engine friendly, but I'm open to ideas.

At the moment I'm in the process of writing a manual for the tool and a deep-dive article about the technical details. Hopefully neither of those will take too long!
 
 
2024-11-11 21:14
Jetboy

Registered: Jul 2006
Posts: 286
WOW! That is AMAZING!
Will need to take a closer look when you make it available.

Unity seems like a bit of overkill for a converter, but I understand that it provides a lot of foundations and lets you concentrate on the conversion process more than if you had to write everything from scratch.

Then again, when I think about it, I use some tools written in Unity on a daily basis, e.g. Stud.io, software for building Lego digitally.

I was thinking about moving my converter to shaders too, but I would have to skip error diffusion, as a simple implementation requires the image to be processed sequentially (there is some hope, as there are possible solutions I'm investigating right now, but there are too many distractions, so I have no ETA).

Then maybe I should follow the mantra I learned from a friend: "better done than perfect". But I digress.

If you need beta testers, I'm up for it. I can also help with beta testing the manual.
2024-11-11 22:00
cobbpg

Registered: Jan 2022
Posts: 15
Unity is absolutely overkill for this purpose, and ideally I'd rewrite this in C++ with some IMGUI solution if I had more time. But there's a bunch of advantages to using Unity even for a tool like this:

• can build for all major platforms with minimal effort, without having to deal with build systems directly
• can use a single shader language for all builds
• can use C# both for the UI and parts of the algorithm that aren't worth pushing through the GPU
• the latter bits can be compiled to native code and potentially vectorised thanks to the Burst compiler, again with very little additional effort
• decent profiling tools available out of the box
• hot reload system so application state can be retained through recompiles during development (although this has been getting too slow lately)

The downside is that the builds are unnecessarily huge, and due to the startup time overhead they are also less useful as command line tools, even if they can be run in headless mode.
2024-11-12 08:31
enthusi

Registered: May 2004
Posts: 677
add 'F' for 'flexible'?
FNUFLI or NUFLIF?
2024-11-12 10:08
Bitbreaker

Registered: Oct 2002
Posts: 504
Quoting cobbpg

Quoting Bitbreaker
What still remains is the dependencies that AFLI blocks drag into the next sprite block and vice versa? As sprite colors are updated on odd lines and AFLI is updated on even lines, the decisions on colors made for either AFLI or sprites always have an impact on the decisions on the next lines, as one sprite block overlaps 2 AFLI blocks and one AFLI block overlaps with 2 sprite blocks.

No, this is solved, as this tool allows you to change underlay colours (outside the FLI bug) on every one of the 200 scanlines. There's one caveat: the rightmost 8 pixels of the rightmost column (i.e. between pixels 304 and 311 in each row) can only change on odd lines due to VIC timing limitations. They could be changed fully on NTSC, but not on PAL, sadly.


afaik the last column (39) was plain second line AFLI anyway in NUFLI, as there's no underlay. I wonder how you manage to squeeze in 7 color updates per line, as the AFLI line consumes 40 DMA cycles and the sprites also consume quite a few DMA cycles. If you need to cut down on color update writes in one line to save time, doesn't this restriction nevertheless carry over onto the upcoming lines and thus again build dependencies outside the current AFLI block and sprite block boundaries? Decisions made in a prior line affect the amount of colour choice on the current line and so on.
Curious about the outcome and a first release to play around with. If the sprite/AFLI block overlap stuff goes away, the format is also way better to draw by hand, though still painful with the different pixel sizes of underlay and hires.
2024-11-12 10:48
Frantic

Registered: Mar 2003
Posts: 1646
Quote: add 'F' for 'flexible'?
FNUFLI or NUFLIF?


FNUFLI was a good one, I think.
2024-11-12 11:29
PopMilo

Registered: Mar 2004
Posts: 146
Looking great imho!
I would love to see that compute shader code :)
Guess we'll have to wait for that article?
2024-11-12 18:17
cobbpg

Registered: Jan 2022
Posts: 15
Quoting Bitbreaker
afaik the last column (39) was plain second line AFLI anyway in NUFLI, as there's no underlay. I wonder how you manage to squeeze in 7 color updates per line, as the AFLI line consumes 40 DMA cycles and the sprites also consume quite a few DMA cycles. If you need to cut down on color update writes in one line to save time, doesn't this restriction nevertheless carry over onto the upcoming lines and thus again build dependencies outside the current AFLI block and sprite block boundaries? Decisions made in a prior line affect the amount of colour choice on the current line and so on.

You can define up to 17 colour updates per AFLI block (i.e. two scanlines): 4 for the FLI bug, 2*6 for the underlay in columns 3-38, and the border.

My code generation algorithm starts by making a complete list of register writes to realise the desired picture. Some of these are fixed to a certain scanline, others are what I call deferrable, i.e. they can be issued within a range of scanlines (basically where they are unused, hence their value doesn't matter, or they are the sprite Y updates). Every register write has its address, value and valid cycle interval recorded.
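
For illustration, the record described above could look roughly like the following. This is only a sketch: the class name, field names, helper functions and the example scanline and cycle numbers are assumptions made up for this example, not taken from the actual tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterWrite:
    # All names here are hypothetical; the post only says that address,
    # value and valid cycle interval are recorded for each write.
    address: int      # target register, e.g. 0xD027 (sprite 0 colour)
    value: int        # byte to store
    first_line: int   # earliest scanline on which the write may be issued
    last_line: int    # latest scanline on which it may be issued
    first_cycle: int  # start of the valid cycle interval within a line
    last_cycle: int   # end of the valid cycle interval within a line

    @property
    def deferrable(self) -> bool:
        # A fixed write has a one-line window; a deferrable one has a range.
        return self.last_line > self.first_line


def due_on(writes, line):
    """Writes whose window closes on this line, so they cannot wait longer."""
    return [w for w in writes if w.last_line == line]


def available_on(writes, line):
    """Writes that may be issued on this line, whether fixed or deferrable."""
    return [w for w in writes if w.first_line <= line <= w.last_line]


# Made-up example data: one fixed colour write and one deferrable sprite Y write.
writes = [
    RegisterWrite(0xD027, 0x0C, 10, 10, 20, 40),  # fixed to scanline 10
    RegisterWrite(0xD001, 0x5B, 8, 12, 0, 62),    # sprite 0 Y, any of lines 8-12
]
```

With this shape, a line-by-line generator can ask which writes are forced on the current line (`due_on`) and which ones it may pull in early if there are cycles to spare (`available_on`).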

I go line by line from the top of the screen and start generating code such that all the timing and value constraints are met. Each update costs 4 or 6 cycles depending on whether an existing register value can be reused or not. I also take advantage of the fact that only the lower 4 bits matter for colours, so e.g. if we're supposed to write $3c to $d011 and we need to set something mid grey at the same time, we can potentially reuse the value.
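
The 4 and 6 cycle figures are consistent with 6502 timing: a store to an absolute address takes 4 cycles, and loading an immediate value first adds 2. A minimal sketch of the cost rule, including the low-nibble trick for colour registers, might look like this. The function names and the way CPU register contents are tracked (a plain dict) are assumptions for the example, not the tool's actual code.

```python
# VIC-II colour registers: border, backgrounds, sprite multicolours,
# sprite colours ($D020-$D02E). Only their lower 4 bits are significant.
COLOUR_REGISTERS = set(range(0xD020, 0xD02F))


def value_matches(address, held, wanted):
    """True if a value already held in a CPU register can be stored as-is."""
    if address in COLOUR_REGISTERS:
        return (held & 0x0F) == (wanted & 0x0F)
    return held == wanted


def write_cost(address, value, cpu_regs):
    """Cycles needed for one register update.

    cpu_regs is a hypothetical tracker such as {'a': 0x3C, 'x': None, 'y': None},
    where None means the content is unknown.
    """
    for held in cpu_regs.values():
        if held is not None and value_matches(address, held, value):
            return 4  # store only (STA/STX/STY absolute)
    return 6          # load immediate (2 cycles) + store absolute (4 cycles)


# The example from the post: A holds $3c for $d011, mid grey is colour $c.
regs = {'a': 0x3C, 'x': None, 'y': None}
cost_grey = write_cost(0xD027, 0x0C, regs)   # reuses A, low nibble matches
cost_d011 = write_cost(0xD011, 0x0C, regs)   # not a colour register, full byte differs
```

Here `cost_grey` comes out as 4 and `cost_d011` as 6, since $d011 needs the exact byte while a colour register only looks at the lower nibble.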

It turns out that for a typical picture you can successfully fit the required code into the time budget on most scanlines. When I run out of time, I remove an update or two based on some heuristics (e.g. I start with removing bug updates, then if that's not sufficient, I'll remove some from the middle section as well, and always pick the one whose removal adds the smallest error) and try again until the resulting code fits. I'm always guaranteed to get 6 updates just like plain NUFLI, but typically it's more due to repeated colours.
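
The removal loop described above can be sketched as follows. This is a simplification under stated assumptions: each update carries a fixed cycle cost and a precomputed error, whereas in the real generator the cost depends on register reuse, and writes that must never be dropped are not modelled. The names `Update` and `fit_to_budget` are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    name: str          # label for the example only
    cycles: int        # assumed fixed cost of issuing this update
    error: float       # visual error added if the update is dropped
    in_fli_bug: bool   # True if it only affects the FLI bug area


def fit_to_budget(updates, budget):
    """Drop updates until the total cost fits into the cycle budget.

    FLI bug updates are removed first; within the candidate pool the
    update whose removal adds the smallest error goes first.
    Returns (kept, dropped).
    """
    kept = list(updates)
    dropped = []
    while kept and sum(u.cycles for u in kept) > budget:
        bug_updates = [u for u in kept if u.in_fli_bug]
        pool = bug_updates or kept
        victim = min(pool, key=lambda u: u.error)
        kept.remove(victim)
        dropped.append(victim)
    return kept, dropped


line_updates = [
    Update("bug0", 6, 5.0, True),
    Update("bug1", 6, 1.0, True),
    Update("mid0", 6, 0.5, False),
    Update("mid1", 4, 9.0, False),
]
kept, dropped = fit_to_budget(line_updates, budget=12)
```

With a budget of 12 cycles, both bug updates are dropped (the cheaper-to-lose `bug1` first) and the two middle-section updates survive, even though `mid0` has the smallest error overall, because bug updates are considered first.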

What matters is that the process is guaranteed to produce an executable, and it gives you feedback about the CPU time used (the green/red bars on the right hand side). This way the artist can easily tell if they are allowed to add more colour in some area or have to cut back on changes in another to guarantee a certain outcome. In practice this system works surprisingly well.

As for update removals affecting subsequent lines, that's pretty much limited to the FLI bug, since with 6 updates you can always make sure to leave the underlays in the middle section in the desired state. Well, unless you change the border colour too, but just don't do that in the most contested areas. :P

Quoting PopMilo
I would love to see that compute shader code :)
Guess we'll have to wait for that article ?

The code is just a bunch of exceedingly boring nested loops, so don't have high expectations of brilliance. ;) It's very similar to what Mufflon does: just try all combinations and pick the one with the least overall error for each section. And yeah, I'll document everything first, which should also make the code easier to understand. I'm hoping to be able to release it all later this month.
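
As a rough idea of what "try all combinations and pick the one with the least overall error" means, here is a heavily simplified sketch. It does not model the real NUFLI constraints (per-cell hires colours, double-width underlay pixels); it just picks the best subset of a palette for one section of pixels. The tiny palette, the squared RGB distance and the function names are all assumptions for the example.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared distance between two RGB triples (assumed error metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_colour_set(pixels, palette, count):
    """Exhaustively try every set of `count` palette entries for one section.

    Each pixel is mapped to the nearest colour in the candidate set, and the
    set with the smallest summed error wins. Returns (indices, error).
    """
    best_choice, best_error = None, None
    for choice in combinations(range(len(palette)), count):
        error = sum(min(sq_dist(p, palette[i]) for i in choice) for p in pixels)
        if best_error is None or error < best_error:
            best_choice, best_error = choice, error
    return best_choice, best_error


# Made-up 4-colour palette (not the real C64 colours) and a 3-pixel section.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]
section = [(10, 0, 0), (250, 250, 250), (240, 10, 10)]
choice, error = best_colour_set(section, palette, count=3)
```

Since every section is scored on its own, independent sections can be searched in parallel, which is what makes this kind of loop a good fit for a compute shader.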
2024-11-12 19:26
Jetboy

Registered: Jul 2006
Posts: 286
Quoting cobbpg
I'm hoping to be able to release it all later this month.


Christmas is coming early this year :) I just can't wait :)
2024-11-13 15:24
Jammer

Registered: Nov 2002
Posts: 1335
SNUFLI like Speedcode NUFLI? And has some ring to it ;)
2024-11-13 15:47
cobbpg

Registered: Jan 2022
Posts: 15
Or RUFLI for Rejuvenated Underlay FLI, pronounced like "roughly". :P
All times are CET.