[CSDb] - User Forums - Release id #167152 : Krill's Loader, repository version 164


2018-08-13 21:37

Krill

Registered: Apr 2002
Posts: 2980

Release id #167152 : Krill's Loader, repository version 164

If no problems emerge (I know they will, but anyway)... I can explain a bit about the full on-the-fly GCR block read+decode+checksumming.

... 36 posts hidden ...

2018-08-16 16:01

chatGPZ

Registered: Dec 2001
Posts: 11386

i know. terrible ideas there

2018-08-16 20:09

Cruzer

Registered: Dec 2001
Posts: 1048

Just type "make" they said. Easiest thing in the world. (Haven't tried with this version yet though.)

2018-08-17 00:09

Krill

Registered: Apr 2002
Posts: 2980

Quoting Cruzer

Just type "make" they said. Easiest thing in the world. (Haven't tried with this version yet though.)

Not sure what you're getting at, but you don't have to "make", just use the pre-compiled binaries in the archive. If they don't take your fancy for some reason or other, why, just... tell me and your wishes might *just* tip the scale to warrant different defaults. :)

2018-08-17 06:04

Radiant

Registered: Sep 2004
Posts: 639

Nice, will have to check this out if I ever get the time to do some serious C64 coding again. Thanks for sticking with proper tools.

2018-08-21 00:29

Smasher

Registered: Feb 2003
Posts: 520

first test: I compiled this amazing loader trying all supported packers.
this is the length of the compiled file "loader-c64.prg" (no raw load):

BITNAX:     $29d
BYTEBOOZER: $2a2
DOYNAX:     $2bb
EXOMIZER:   $370
LEVELCRUSH: $2be
NUCRUNCH:   $2d9
PUCRUNCH:   $324
SUBSIZER:   $384
TINYCRUNCH: $1f2

so you can import the resident at $0100-$03ff (assuming that's a smart place to put it) with some packers, but not with all of them.
I hope this info is useful to someone... I'll now dedicate some time to the speed tests.

2018-08-21 09:10

Krill

Registered: Apr 2002
Posts: 2980

Bitnax should be $0280 and tinycrunch $01d5, as advertised. Seems like you've left LOAD_RAW_API enabled.

The best placement for the resident part is one where it ends at $03ff (inclusive).
That way, you have as much stack space left as possible and can still use $0400 for screen memory.

This obviously doesn't work for the meatier decrunchers, as the resident portion is close to or bigger than $0300 bytes with them.
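
As a rough illustration of this placement rule, the start address follows from subtracting the resident size from $0400. The Python sketch below is only a back-of-the-envelope helper and not part of the loader: the name `place_resident` is made up, sizes are treated as plain resident byte counts, and it assumes the stack pointer would be lowered so the stack only uses the part of page 1 below the resident code.

```python
SCREEN_START = 0x0400      # default screen memory, must stay free
STACK_PAGE_START = 0x0100  # 6502 stack page is $0100-$01ff
STACK_PAGE_END = 0x01FF


def place_resident(size):
    """Return (start, fits, stack_left) for a resident part ending at $03ff.

    Assumption for illustration: 'fits' means the code stays within
    $0100-$03ff, and the stack keeps whatever part of page 1 lies
    below the start address.
    """
    start = SCREEN_START - size
    fits = start >= STACK_PAGE_START
    stack_left = max(0, min(start, STACK_PAGE_END + 1) - STACK_PAGE_START)
    return start, fits, stack_left


if __name__ == "__main__":
    # Sizes quoted in the thread: Bitnax $0280, tinycrunch $01d5, Exomizer $370.
    for name, size in (("BITNAX", 0x0280), ("TINYCRUNCH", 0x01D5), ("EXOMIZER", 0x0370)):
        start, fits, stack_left = place_resident(size)
        print(f"{name}: start ${start:04x}, fits={fits}, stack bytes left=${stack_left:02x}")
```

With the advertised Bitnax size of $0280, this gives a start address of $0180 and leaves $80 bytes of stack; with $0370 the start address would drop below $0100, so it does not fit.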

As for speed, tinycrunch should be fastest, followed by Bitnax, then Doynax-LZ.
This is because these three have a block-based read-data interface, as opposed to having a JSR for every incoming byte of the packed file.
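
To give a feel for why a per-byte JSR adds up, here is a small Python estimate of the call overhead alone. It is a sketch under assumptions: `call_overhead_cycles` is a made-up helper, it counts only the 12 cycles of one JSR plus one RTS per call, and the block-based case assumes one call per 254 payload bytes (the data size of a standard 1541 sector), which may not match how the loader actually hands data to the decruncher.

```python
import math

JSR_RTS_CYCLES = 12  # 6 cycles for JSR + 6 cycles for RTS on the 6502
BLOCK_PAYLOAD = 254  # assumed bytes delivered per call in the block-based case


def call_overhead_cycles(packed_size, bytes_per_call):
    """Cycles spent purely on JSR/RTS when fetching packed_size bytes."""
    calls = math.ceil(packed_size / bytes_per_call)
    return calls * JSR_RTS_CYCLES


if __name__ == "__main__":
    size = 10000  # hypothetical packed file size in bytes
    print("per-byte interface:", call_overhead_cycles(size, 1), "cycles")
    print("block interface:   ", call_overhead_cycles(size, BLOCK_PAYLOAD), "cycles")
```

The real difference also depends on what each call does inside, so this only bounds the pure call overhead.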

2018-08-21 10:15

Smasher

Registered: Feb 2003
Posts: 520

mmmh, yes it could be I left RAW enabled in "config.inc". I'll recheck that later tonight. Enabling or disabling that API just changes the compiled size, with no impact on the speed, correct?
or a better question (since I don't want to play with all the settings): as far as you know, is NTSC_COMPATIBILITY the only setting that has an impact on speed?
yes, even if tinycrunched files are ~2x bigger than exomized ones it seems your loader is way faster with TC!

2018-08-21 10:32

Krill

Registered: Apr 2002
Posts: 2980

Right, LOAD_RAW_API should have no or minimal impact on speed.

The options which do make it somewhat slower are NTSC_COMPATIBILITY, as you said, and also LOAD_UNDER_D000_DFFF and LOAD_VIA_KERNAL_FALLBACK.

Exomizer, despite big speed improvements from version 2 to 3, is still among the slowest, if not the slowest.

For tinycrunch vs. *nax, which one is faster at combined loading + depacking may depend more on the actual corpus of test files. The difference in pack ratio versus the difference in depacking speed may or may not tilt the scale in favour of one or the other, depending on the actual file.
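
The trade-off can be sketched with a toy model in Python. Everything here is hypothetical: the function `combined_time`, the two unnamed packers, and all sizes and rates are invented for illustration, and the model simply adds load time and depack time rather than overlapping them as a real loader would.

```python
def combined_time(packed_bytes, unpacked_bytes, load_rate, depack_rate):
    """Seconds to load the packed data plus seconds to depack it.

    Simplifying assumption: loading and depacking happen one after
    the other, with rates given in bytes per second.
    """
    return packed_bytes / load_rate + unpacked_bytes / depack_rate


if __name__ == "__main__":
    LOAD_RATE = 8000    # hypothetical raw load rate, bytes/s
    UNPACKED = 40000    # hypothetical unpacked file size, bytes

    # Packer A: better ratio, slower depacker. Packer B: worse ratio, faster depacker.
    # File 1 compresses moderately with both packers.
    a1 = combined_time(16000, UNPACKED, LOAD_RATE, 10000)
    b1 = combined_time(24000, UNPACKED, LOAD_RATE, 40000)
    # File 2 compresses very well with A but hardly at all with B.
    a2 = combined_time(8000, UNPACKED, LOAD_RATE, 10000)
    b2 = combined_time(36000, UNPACKED, LOAD_RATE, 40000)

    print("file 1:", "A wins" if a1 < b1 else "B wins")
    print("file 2:", "A wins" if a2 < b2 else "B wins")
```

In this made-up setup the faster depacker wins on the first file and the better pack ratio wins on the second, which is the kind of file-dependent outcome described here.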

2018-08-28 19:55

Sparta

Registered: Feb 2017
Posts: 49

Krill, first of all congratulations, your loader is truly a masterpiece. I spent considerable time deciphering it and I think I now understand what you are doing. The GCR loop is an amazing feat. One of its major advantages over checksum verification integrated into either side of the transfer loop is that you do not need to wait until the last block of a track has been transferred before changing tracks. Shrydar stepping cuts the delay to 12 bycles. This delay, however, can be eliminated completely. The following (Spartan) method provides seamless, uninterrupted transfer of data across neighboring tracks. This is how it works in the latest version of my loader, developed for personal use:

		lda	$1c00		//First half-track step
		sec
		rol
		and	#$03
		eor	$1c00
		sta	$1c00		//Update VIA 2 Port B

		sec			//Calculate second half step...
		rol
		and	#$03
		eor	$1c00
		sta	LastStep+1	//…and save it for later

Then start data transfer immediately:

		ldy	#$00
		…
		lsr
		dey
TrBranch:	bmi	Loop		//Send #$81 bytes first, then the remaining #$7f

		bit	$1800
		bpl	*-3
		sta	$1800		//Last 2 bits completed

		lda	#$d0		//Replace "BMI" with "BNE"
		sta	TrBranch
LastStep:	lda	#$00
		sta	$1c00		//Update VIA 2 Port B
		cpy	#$00		//
		bne	Loop2		//Back to transfer if not done
					//C64 loop has a similar delay built in

This can be adapted to almost any transfer loop, reducing the delay to a few cycles.
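
For readers who want to check the arithmetic, the two key pieces of the listing can be modelled in Python. This is only a sketch: `half_step` mimics the SEC / ROL / AND #$03 / EOR $1c00 sequence on an 8-bit value, assuming $1c00 reads back unchanged between the load and the EOR, and `split_transfer` counts loop passes, assuming the elided loop body sends one byte per pass and leaves Y alone. Both function names are made up.

```python
def half_step(port_b):
    """Model SEC / ROL / AND #$03 / EOR $1c00 applied to the port value."""
    a = ((port_b << 1) | 1) & 0xFF  # SEC + ROL: shift left, carry enters bit 0
    a &= 0x03                       # AND #$03: keep only the two stepper bits
    return a ^ port_b               # EOR $1c00: advances the 2-bit phase by one


def split_transfer():
    """Count bytes sent before and after the BMI falls through."""
    y = 0x00                        # LDY #$00
    first = 0
    while True:
        first += 1                  # one byte per pass (assumed loop body)
        y = (y - 1) & 0xFF          # DEY
        if y < 0x80:                # BMI not taken once bit 7 of Y is clear
            break
    second = 0
    while y != 0:                   # CPY #$00 / BNE Loop2, then BNE in the loop
        second += 1
        y = (y - 1) & 0xFF
    return first, second


if __name__ == "__main__":
    value = 0x60                    # arbitrary example port value, phase 0
    once = half_step(value)
    twice = half_step(once)         # what would be stored at LastStep+1
    print(f"${value:02x} -> ${once:02x} -> ${twice:02x}")
    first, second = split_transfer()
    print(f"bytes before second half step: ${first:02x}, after: ${second:02x}")
```

In this model each call to `half_step` increments the two low bits modulo 4 and leaves the upper six bits untouched, and the transfer splits into $81 and $7f bytes as the comment in the listing says.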

2018-08-28 21:20

Krill

Registered: Apr 2002
Posts: 2980

Sparta: Thanks! :)

I've considered something like your method (Spartan Stepping :D), but ultimately decided against it.

Its central concept is issuing the second half-track step in the middle of the block transfer.

However, this poses a few problems in a general-purpose standard format loader:
- The computer-side resident code needs to be aware of the slight delay in the middle and wait accordingly, which would increase resident code size ("C64 loop has a similar delay built in", as commented in your example).
- The computer-side code needs to be aware that the currently-transferred block is, indeed, the final file block of the current track, otherwise the extra delay would be in vain (and possibly a net loss due to just-missed following blocks). This would increase resident code size and also require that information to be sent to the resident code somehow, meaning extra protocol overhead.
- The drive-side code is extremely tight as it is (tightest code I ever made, and I've squeezed and squeezed again to fit in everything I needed to fit). It might not be possible to use this approach without throwing out some other functionality.

All times are CET.