2020-12-09 09:57
Raistlin

Registered: Mar 2007
Posts: 554
Fast and Short Generalised Memset

I wondered... has anyone ever tried writing a generalised "memset" for demo use..?

A function/macro that you can call with either:-

StartAddress, EndAddress, byte
- fills [StartAddress, EndAddress)

or

StartAddress, Length, byte
- fills [StartAddress, StartAddress+Length)

I guess as a macro it could then have the optimization of being able to do 256-byte aligned fills faster (eg. filling $0400-07ff rather than $0400-07e7).

I'm sure someone must've done one of these..?


I also wonder how the performance of such a generalised solution, if made as a function rather than a macro, would compare to a partially unrolled version..? eg.:-

lax #$00
Fill0400Loop:
sta $0400,x
sta $0500,x
sta $0600,x
sta $0700,x
inx
bne Fill0400Loop
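
For a rough feel of the numbers, here is a back-of-envelope cycle estimate in Python. It compares the four-page loop above with a hypothetical generic routine that stores through a zero-page pointer (`sta (ptr),y`). The shape of the generic loop is an assumption made for illustration, not anyone's actual code. The cycle counts are the documented 6510 timings, and the model ignores badlines, sprite DMA, interrupts, pointer setup, call/return overhead and branches that cross a page.

```python
# Documented 6510 instruction timings used by this estimate.
STA_ABS_X = 5   # sta $0400,x  (stores always take 5 cycles)
STA_IND_Y = 6   # sta (ptr),y  (stores always take 6 cycles)
INX = 2         # inx / iny / dex
INC_ZP = 5      # inc ptr+1
BNE_TAKEN = 3   # branch taken, no page crossing
BNE_FALL = 2    # branch not taken
LOAD_IMM = 2    # lax #imm / ldx #imm / ldy #imm


def unrolled_4_pages():
    """Cycles for the 4x 'sta abs,x' loop that fills $0400-$07ff."""
    per_iter = 4 * STA_ABS_X + INX + BNE_TAKEN
    # 256 iterations; the last bne falls through instead of branching.
    return LOAD_IMM + 256 * per_iter - (BNE_TAKEN - BNE_FALL)


def generic_pages(pages):
    """Cycles for an assumed generic page loop:

            ldy #0
            ldx #pages
      loop: sta (ptr),y
            iny
            bne loop
            inc ptr+1
            dex
            bne loop
    """
    inner = 256 * (STA_IND_Y + INX + BNE_TAKEN) - (BNE_TAKEN - BNE_FALL)
    outer = INC_ZP + INX + BNE_TAKEN
    # The very last outer bne falls through as well.
    return 2 * LOAD_IMM + pages * (inner + outer) - (BNE_TAKEN - BNE_FALL)
```

Under these assumptions the unrolled loop comes out at 6401 cycles and the generic one at 11303 cycles for the same 1024 bytes, so roughly 6.25 versus 11 cycles per byte.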
2020-12-09 10:14
Krill

Registered: Apr 2002
Posts: 2839
Quoting Raistlin
I wondered... has anyone ever tried writing a generalised "memset" for demo use..?
Of course.

But you need a generalised memset in demos mostly only for initialisation, and then performance isn't much of an issue and you'd go for size.

When performance really is an issue, the old democoding rule applies. Use as much memory as possible, only optimise for size when needed.
So i guess this kind of code golf match would need a fourth parameter controlling how much you want to unroll the code. At the extreme end you'd have just LDA #0 with lots of STA mem16 and an rts.
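
To put rough numbers on that trade-off, the following Python sketch estimates code size and cycle count as a function of an unroll factor. The routine shapes (`ldx`/`dex`/`bne` loop with several `sta abs,x`, and the fully unrolled `sta mem16` list) and the function names are assumptions for illustration. It counts only documented 6510 timings for 16-bit targets and ignores badlines, sprite DMA and interrupts.

```python
def looped_fill(n_bytes, unroll):
    """Return (code_bytes, cycles) for an assumed partially unrolled loop:

            lda #value
            ldx #chunk
      loop: sta area0,x     ; repeated 'unroll' times, one per chunk
            ...
            dex
            bne loop
            rts
    """
    chunk, rem = divmod(n_bytes, unroll)
    if rem or not 1 <= chunk <= 256:
        raise ValueError("n_bytes must split into equal chunks of 1..256 bytes")
    size = 2 + 2 + 3 * unroll + 1 + 2 + 1
    # Per iteration: unroll x sta abs,x (5) + dex (2) + bne taken (3);
    # the final bne falls through (1 cycle less), rts costs 6.
    cycles = 2 + 2 + chunk * (5 * unroll + 2 + 3) - 1 + 6
    return size, cycles


def fully_unrolled_fill(n_bytes):
    """Return (code_bytes, cycles) for: lda #value / n x sta mem16 / rts."""
    size = 2 + 3 * n_bytes + 1
    cycles = 2 + 4 * n_bytes + 6
    return size, cycles
```

For a 1024-byte fill this model gives (20 bytes, 6409 cycles) at 4x unrolling, (32 bytes, 5769 cycles) at 8x, and (3075 bytes, 4104 cycles) for the fully unrolled extreme.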
2020-12-09 13:01
TWW

Registered: Jul 2009
Posts: 541
Yes.

The issue is, as Krill states, memory.

Earlier I solved this with a "speedmode variable", set at the beginning of each project, which selected either an unrolled or a looped routine. The generalized routine then optimizes for fewer than 8 bytes (unrolled), between 8 and 256 bytes (single loop) and over 256 bytes (loop for all pages and finally loop for remainder). Handled by parameters and macro (well actually a pseudo 8-D).

Syntax looked like this:
    :MEMSET destination_address ; number_of_bytes ; value ; safemode ; speedmode


    :MEMSET $fb ; #120 ; #%01101110 ; 0  - Fills from $fb to $fb + 120 with #%01101110 and turns off safemode
    :MEMSET $2000 ; #$7800               - Fills from $2000 to $9800 with #0
    :MEMSET 4096 ; #$2000 ; #$80 ; ; 1   - Fills from $1000 to $3000 with #$80 with safemode according to global value and speedmode enabled


I'd post the code but it's embarrassingly long and was written a long time ago; it can probably be rewritten much more smoothly today with the increased functionality in kickass. PM me if you want it.

Now I just use the generalised routine and unroll (partially or fully) according to the need to squeeze cycles vs. memory footprint.
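
The three size classes described above can be sketched as a small Python model. This is not the original macro: the names `plan_fill` and `memset` are hypothetical, and treating exactly 8 and exactly 256 bytes as "single loop" is an assumption, since the post only says "between 8 and 256". The `memset` function simply replays the chosen plan on a `bytearray` standing in for C64 memory.

```python
def plan_fill(length):
    """Pick a fill strategy for 'length' bytes (boundaries are assumed)."""
    if length <= 0:
        raise ValueError("length must be positive")
    if length < 8:
        return ("unrolled", length)           # one store per byte
    if length <= 256:
        return ("single_loop", length)        # one indexed loop
    pages, remainder = divmod(length, 256)
    return ("page_loop", pages, remainder)    # whole pages, then the rest


def memset(mem, start, length, value=0):
    """Fill mem[start, start+length) with 'value' following plan_fill."""
    kind, *args = plan_fill(length)
    if kind == "page_loop":
        pages, remainder = args
        for page in range(pages):             # loop over all full pages
            for y in range(256):
                mem[start + page * 256 + y] = value
        for y in range(remainder):            # final loop for the remainder
            mem[start + pages * 256 + y] = value
    else:
        for i in range(length):               # unrolled or single loop
            mem[start + i] = value
```

For example, `plan_fill(1000)` yields `("page_loop", 3, 232)`, and `memset(mem, 0x1000, 0x2000, 0x80)` touches exactly $1000 to $2fff, matching the third syntax example.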
2020-12-09 13:30
Raistlin

Registered: Mar 2007
Posts: 554
Ahh, interesting. It sounds like you've done quite a bit of work in the area already.
2020-12-09 14:37
JackAsser

Registered: Jun 2002
Posts: 1989
https://xkcd.com/1205/
2020-12-09 14:39
Raistlin

Registered: Mar 2007
Posts: 554
Pah, JackAsser, time has no meaning in THE MATRIX.
2020-12-09 14:48
JackAsser

Registered: Jun 2002
Posts: 1989
Quote: Pah, JackAsser, time has no meaning in THE MATRIX.

😂😂😂
2020-12-09 19:18
Copyfault

Registered: Dec 2001
Posts: 466
What Krill said/wrote - literally!

Maybe I did not get the real thinking behind it, but wouldn't mem init be done by the decruncher anyway? Or is the focus on a plain init without decrunching as part of the very first init? Just wondering...
2020-12-10 00:01
Krill

Registered: Apr 2002
Posts: 2839
Quoting Copyfault
but wouldn't mem init be done by the decruncher anyway?
This is something i'd not rely on.

In a demo, you ideally want to load the next part while the current one is running. That means the next part's code should take as little mem as possible in its decrunched state.

Once the current part transitions over to the next, you can shuffle code around and initialise memory.

Having a part's pre-init code tight, with few and small zeroed gaps, makes for a better pack ratio, too. Also it's a good idea to have a part contained entirely in one file, for loader (and cruncher) reasons. =)
All times are CET.