CSDb User Forums


2018-08-13 21:37
Krill

Registered: Apr 2002
Posts: 2982
Release id #167152 : Krill's Loader, repository version 164

If no problems emerge (i know they will, but anyways)... I can explain a bit about the full on-the-fly GCR block read+decode+checksumming.
2018-08-14 02:57
chatGPZ

Registered: Dec 2001
Posts: 11391
a toplevel make target that builds all tools would be nice. and perhaps updated tools that actually build with a somewhat recent ca65 :=)
it'd be nice to get rid of the perl dependency too - although i am not sure what a sane alternative would look like (i'd probably resort to sed/grep in the makefile, only because windows versions of these are a bit easier to just drop somewhere)
2018-08-14 03:23
Krill

Registered: Apr 2002
Posts: 2982
Quoting Groepaz
a toplevel make target that builds all tools would be nice.
Okay, good idea. :)

Quoting Groepaz
and perhaps updated tools that actually build with a somewhat recent ca65 :=)
Uhm, what? :) Which of the tools require ca65, i.e., run on realthing? :) Or did the ld65 map output format change _again_ without me taking note of that? o.O

Quoting Groepaz
it'd be nice to get rid of the perl dependency too - although i am not sure what a sane alternative would look like (i'd probably resort to sed/grep in the makefile, only because windows versions of these are a bit easier to just drop somewhere)
Yeah, maybe. If only those weren't such a pain to use. :) Python, maybe?
2018-08-14 14:24
chatGPZ

Registered: Dec 2001
Posts: 11391
python has the same problems, i think (is it a single exe requiring no install?) (the point is: right now "our" dev environment is completely freestanding requiring no msys or cygwin or any of that, and no installing either - breaking that is not an option :))

the problem with ca65 are at least nucrunch and tinycrunch, those will not build.
2018-08-14 14:28
Oswald

Registered: Apr 2002
Posts: 5095
- what is Shrydar stepping?
- how is multiple drives achieved ?
- what is subsizer support ?
2018-08-14 14:52
Krill

Registered: Apr 2002
Posts: 2982
Quoting Groepaz
python has the same problems, i think (is it a single exe requiring no install?) (the point is: right now "our" dev environment is completely freestanding requiring no msys or cygwin or any of that, and no installing either - breaking that is not an option :))
A point i don't quite understand. Why is installing some package or another a problem? :) But anyhow, there seems to be no simple sane multi-platform single-exe scripting language around, or is there?

Quoting Groepaz
the problem with ca65 are at least nucrunch and tinycrunch, those will not build.
The stuff that won't build with current ca65 versions isn't required to pack files for loader usage, though. Anyhow, will be fixed in a patch release soonish. Thanks for the bug report. :)
2018-08-14 15:09
chatGPZ

Registered: Dec 2001
Posts: 11391
Quote:
Why is installing some package or another a problem? :)

because this environment is meant to be used by people who are not programmers, and who should be able to use this without reading instructions or doing the install dance. so far it's all about "put that shit somewhere, type make" - and i'd like to keep it that way. _I_ don't care, and i already have perl and python installed, and probably even rust soon =P (*)

Quote:
But anyhow, there seems to be no simple sane multi-platform single-exe scripting language around, or is there?

which is why i use grep/sed for these kind of things :) (there is a single .exe basicv2 interpreter.... =D)

Quote:
The stuff that won't build with current ca65 versions isn't required to pack files for loader usage, though.

perhaps - but right now when doing "make" in the respective directories it just fails. a toplevel makefile that just builds what is needed might be an acceptable workaround :)

(*) another requirement is that i can easily crosscompile windows .exe on my linux system. not sure if that rules out rust too - i havent tried this :)
2018-08-14 15:09
Krill

Registered: Apr 2002
Posts: 2982
Quoting Oswald
- what is Shrydar stepping?
See Release id #160665 : Stepper Test 1.0

Quoting Oswald
- how is multiple drives achieved ?
See "Error message: "Found more than one drive on IEC bus" with C128D metal and recent demos" and "Reversibly mute serial devices by a tight loop drive code which disable interrupts and set ATNA according to ATN IN in a loop"

Quoting Oswald
- what is subsizer support ?
See Subsizer 0.6 (note that a later version is shipped with the loader)
2018-08-14 15:19
chatGPZ

Registered: Dec 2001
Posts: 11391
(**) it also needs to build without hassle on macOS
2018-08-14 15:37
JackAsser

Registered: Jun 2002
Posts: 2014
Quote: (**) it also needs to build without hassle on macOS

Hassle is subjective, hazel is a tree.
2018-08-14 15:56
Krill

Registered: Apr 2002
Posts: 2982
Quoting Groepaz
because this environment is meant to be used by people who are not programmers, and who should be able to use this without reading instructions or doing the install dance. so far it's all about "put that shit somewhere, type make" - and i'd like to keep it that way.
For the particular case of the non-coders being able to rebuild your demo (that's what you implied, right?), there is no such perl or similar requirement, because you'd simply ship your pre-compiled loader-c64.prg, install-c64.prg and loadersymbols-c64.inc along with the demo source code. :)
2018-08-14 16:38
chatGPZ

Registered: Dec 2001
Posts: 11391
no, generated files in the repo are not an option either
2018-08-14 17:01
ChristopherJam

Registered: Aug 2004
Posts: 1409
Quoting Groepaz
(*) another requirement is that i can easily crosscompile windows .exe on my linux system. not sure if that rules out rust too - i havent tried this :)

Ironically that's probably easier for you than for me at this point; I used a Docker container running ubuntu so I could build the windows .exe for nucrunch; directly crosscompiling rust to windows binaries from macOS is currently nontrivial.
2018-08-14 17:17
ChristopherJam

Registered: Aug 2004
Posts: 1409
Quote: python has the same problems, i think (is it a single exe requiring no install?) (the point is: right now "our" dev environment is completely freestanding requiring no msys or cygwin or any of that, and no installing either - breaking that is not an option :))

the problem with ca65 are at least nucrunch and tinycrunch, those will not build.


tinycrunch requires no additional python libraries beyond the python standard library*, and indeed still works with python 2 - so for macOS or ubuntu no further installs should be required. Windows I've no idea.

Thanks for your ca65 bug reports re nucrunch and tinycrunch; I didn't realise newer versions of ca65 had made breaking changes.


(*unless you want to rebuild the test Koala, but I've shipped that)
2018-08-15 09:48
Krill

Registered: Apr 2002
Posts: 2982
Quoting Groepaz
no, generated files in the repo are not an option either
IMHO, that's needlessly restrictive. While it's generally accepted that adding auto-generated files to a repository is a bad idea for various reasons, i find it perfectly okay to add files that are generated manually once in a while, even if they're generated from other code in the same repository.
2018-08-15 13:40
chatGPZ

Registered: Dec 2001
Posts: 11391
maybe yes, maybe no :) i like it that way, and i am also exercising certain things there which i like to see elsewhere.
2018-08-15 22:07
Digger

Registered: Mar 2005
Posts: 438
Yay, can't wait to test it on my really moody drive!
2018-08-16 06:29
Lazycow

Registered: Dec 2013
Posts: 2
Really fast! But the loadertest-c64.d64 hangs with vice 2.4 on NTSC. Emulation problem or bug? (I don't have an NTSC machine to verify)
2018-08-16 08:46
Krill

Registered: Apr 2002
Posts: 2982
Quoting Lazycow
Really fast! But the loadertest-c64.d64 hangs with vice 2.4 on NTSC. Emulation problem or bug? (I don't have an NTSC machine to verify)
The option NTSC_COMPATIBILITY is disabled for the pre-built images and binaries (because it would slightly slow down PAL performance, see loader/include/config.inc). Enabling it should make it work with both PAL and NTSC.
2018-08-16 10:02
JackAsser

Registered: Jun 2002
Posts: 2014
Quote: Really fast! But the loadertest-c64.d64 hangs with vice 2.4 on NTSC. Emulation problem or bug? (I don't have an NTSC machine to verify)

Not that it matters in this case probably, but generally don’t report bugs / problems using 2yr old emulators.
2018-08-16 10:59
Krill

Registered: Apr 2002
Posts: 2982
Quoting JackAsser
2yr old emulators.
FWIW, 1541U (up to and including current versions) detection code as found in loader/src/install.s ;)
drvch1541u: .byte "m-e", .lobyte($0205), .hibyte($0205); read forward
            sei
            ldx #$ff
            stx $0300
            stx $1803; set all port pins as outputs
            lda #$a4; bit 0 may be forced to GND (1541-II) or connected to track 0 sensor (1541-C, normally 0 = not on track 0)
            sta $1801
            cmp $1801
            bne is1541u
            anc #$8a; and #imm, but no asl/rol, bit 7 of result goes to carry
            beq is1541u
            bcc is1541u
            txa
            arr #$7f; bit 6 of result goes to carry
            ror $0300
is1541u:    inc $1803; set all port pins as inputs
            cli
            rts
drvchkued:
If $0300 has a negative value after execution, 1541U detected.
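To make the flag logic easier to follow, here is a minimal Python sketch of what the listing computes when ANC and ARR behave as documented for the NMOS 6502 (binary mode) and the port at $1801 reads back the value written. The function names (`anc`, `arr`, `ror`, `detect`) and the `port_readback` parameter are hypothetical helpers for illustration, not part of the loader. How exactly a 1541U deviates is not spelled out in this thread, so the sketch only models the "real drive" path and a failed port read-back.

```python
# Illustrative model only: documented NMOS 6502 behaviour of the ANC and ARR
# illegal opcodes (binary mode), applied to the constants in the listing above.

def anc(a, imm):
    """ANC #imm: A = A AND imm, carry = bit 7 of the result."""
    a &= imm
    return a, (a >> 7) & 1

def arr(a, imm, carry):
    """ARR #imm: A = (A AND imm) rotated right through carry, carry = bit 6 of the result."""
    a = ((a & imm) >> 1) | (carry << 7)
    return a, (a >> 6) & 1

def ror(value, carry):
    """ROR on a memory byte: the old carry enters bit 7, bit 0 becomes the new carry."""
    return (value >> 1) | (carry << 7), value & 1

def detect(port_readback=0xA4):
    """Return the final value of $0300 for a given read-back of $1801 (hypothetical parameter)."""
    flag = 0xFF                      # ldx #$ff / stx $0300
    a = 0xA4                         # lda #$a4 / sta $1801
    if port_readback != a:           # cmp $1801 / bne is1541u
        return flag
    a, carry = anc(a, 0x8A)          # anc #$8a
    if a == 0 or carry == 0:         # beq is1541u / bcc is1541u
        return flag
    a = 0xFF                         # txa
    a, carry = arr(a, 0x7F, carry)   # arr #$7f
    flag, _ = ror(flag, carry)       # ror $0300
    return flag

result = detect()
print(hex(result), "negative" if result & 0x80 else "positive")
```

With correct opcode behaviour the final ROR shifts a 0 into bit 7, so $0300 ends up positive. Any early exit, or an ARR that leaves the carry set, leaves bit 7 set, which is the "1541U detected" case.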
2018-08-16 14:07
chatGPZ

Registered: Dec 2001
Posts: 11391
Quote:
2yr old emulators

if only - 2.4 was released 2012 :)

thanks for pointing out the emulator code that can be safely removed :)
2018-08-16 14:19
Krill

Registered: Apr 2002
Posts: 2982
Quoting Groepaz
thanks for pointing out the emulator code that can be safely removed :)
If you mean the 1541U detection code, take a look at this option:
.define ONLY_1541_AND_COMPATIBLE   0 ; reduces host-side install code by omitting any native custom drive code for non-1541 compatible
                                     ; drives, treats any drive as 1541, using an incompatible drive will cause undefined behaviour
2018-08-16 15:18
chatGPZ

Registered: Dec 2001
Posts: 11391
but that would remove support for 1581 etc too, i guess?
2018-08-16 15:25
Krill

Registered: Apr 2002
Posts: 2982
Quoting Groepaz
but that would remove support for 1581 etc too, i guess?
Indeed. To keep it simple, there's no option to remove detection of specific models of the 1541 family only, once that family is detected. I'm sure you can figure out what to nix for that. :)
2018-08-16 15:27
chatGPZ

Registered: Dec 2001
Posts: 11391
broken emulators are not part of the 1541 family though :)
2018-08-16 15:35
Krill

Registered: Apr 2002
Posts: 2982
Yes. But you know the story. After i found out about these incompatibilities the hard way, i grudgingly decided to work around those (because 1541U is quite relevant these days, and some older firmwares for 1541U1 will not be fixed), then add 1541U as a separate drive model, which is now also detected.

The idea is that anybody using the custom drive code API could then decide to use some other code if 1541U is detected and its broken implementation of some illegals or the VIA port would prohibit use of the regular 1541 code.
2018-08-16 16:01
chatGPZ

Registered: Dec 2001
Posts: 11391
i know. terrible ideas there
2018-08-16 20:09
Cruzer

Registered: Dec 2001
Posts: 1048
Just type "make" they said. Easiest thing in the world. (Haven't tried with this version yet though.)
2018-08-17 00:09
Krill

Registered: Apr 2002
Posts: 2982
Quoting Cruzer
Just type "make" they said. Easiest thing in the world. (Haven't tried with this version yet though.)
Not sure what you're getting at, but you don't have to "make", just use the pre-compiled binaries in the archive. If they don't take your fancy for some reason or other, why, just... tell me and your wishes might *just* tip the scale to warrant different defaults. :)
2018-08-17 06:04
Radiant

Registered: Sep 2004
Posts: 639
Nice, will have to check this out if I ever get the time to do some serious C64 coding again. Thanks for sticking with proper tools.
2018-08-21 00:29
Smasher

Registered: Feb 2003
Posts: 521
first test: I compiled this amazing loader trying all supported packers.
this is the length of the compiled file "loader-c64.prg" (no raw load):
BITNAX:     $29d
BYTEBOOZER: $2a2
DOYNAX:     $2bb
EXOMIZER:   $370
LEVELCRUSH: $2be
NUCRUNCH:   $2d9
PUCRUNCH:   $324
SUBSIZER:   $384
TINYCRUNCH: $1f2

so you can import the resident at $0100-$03ff (assuming that's a smart place to put it) with some packers, but not with all of them.
I hope this info is useful for someone... I'll now dedicate some time to the speed tests.
2018-08-21 09:10
Krill

Registered: Apr 2002
Posts: 2982
Bitnax should be $0280 and tinycrunch $01d5, as advertised. Seems like you've left LOAD_RAW_API enabled.

The best place to fit the resident part is so that it ends at $03ff incl.
That way, you have as much stack space left as possible and can still use $0400 for screen memory.

This obviously doesn't work for the meatier decrunchers, as the resident portion is close to or bigger than $0300 bytes with them.
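As a rough illustration of the placement rule, the following Python sketch takes the file lengths reported earlier in the thread (apparently measured with LOAD_RAW_API still enabled) and computes where the resident part would start if it ends at $03ff inclusive. It treats the reported lengths as the resident size, which is an assumption: whether they include the 2-byte PRG load address is not stated. The names `SIZES` and `placement` are made up for this example.

```python
# Lengths of loader-c64.prg as reported in the thread (assumed to equal the resident size).
SIZES = {
    "BITNAX": 0x29D, "BYTEBOOZER": 0x2A2, "DOYNAX": 0x2BB,
    "EXOMIZER": 0x370, "LEVELCRUSH": 0x2BE, "NUCRUNCH": 0x2D9,
    "PUCRUNCH": 0x324, "SUBSIZER": 0x384, "TINYCRUNCH": 0x1F2,
}
END = 0x03FF      # last byte of the resident part, just below screen memory at $0400
LOWEST = 0x0100   # start of the stack page; anything lower would land in zero page

def placement(size):
    """Return (start address, fits in $0100-$03ff, bytes of the stack page left below the code)."""
    start = END + 1 - size
    fits = start >= LOWEST
    stack_left = min(max(start - LOWEST, 0), 0x100)
    return start, fits, stack_left

for name, size in SIZES.items():
    start, fits, stack_left = placement(size)
    print(f"{name:<11} start=${start:04x} fits={fits} stack_left=${stack_left:02x}")
```

With these numbers, Exomizer, Pucrunch and Subsizer would start below $0100 and therefore do not fit in the range, which matches the remark about the meatier decrunchers.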

As for speed, tinycrunch should be fastest, followed by Bitnax, then Doynax-LZ.
This is because these three have a block-based read-data interface, as opposed to having a JSR for every incoming byte of the packed file.
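A back-of-the-envelope sketch of why the per-byte interface costs time: on a 6502, JSR and RTS take 6 cycles each, so a call per byte adds at least 12 cycles before the called routine does any work. The Python below only counts that call/return overhead; the actual cost of the loader's byte-fetch routine is not given in this thread, and `call_overhead` is a hypothetical helper name.

```python
JSR_CYCLES = 6   # 6502 JSR absolute
RTS_CYCLES = 6   # 6502 RTS

def call_overhead(num_bytes):
    """Cycles spent purely on JSR/RTS when every byte is fetched through one call."""
    return num_bytes * (JSR_CYCLES + RTS_CYCLES)

# A standard 1541 data block carries 254 payload bytes (256 minus the 2-byte link).
print(call_overhead(254))
```

A block-based interface pays that call overhead once per block rather than once per byte.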
2018-08-21 10:15
Smasher

Registered: Feb 2003
Posts: 521
mmmh, yes it could be I left RAW enabled in "config.inc". I'll recheck that later tonite. enabling or disabling that API just changes the compiled size, but no impact on the speed, correct?
or a better question (since I don't want to play with all the settings): only NTSC_COMPATIBILITY has impact on the speed performance AFAYK?
yes, even if tinycrunched files are ~2x bigger than exomized ones it seems your loader is way faster with TC!
2018-08-21 10:32
Krill

Registered: Apr 2002
Posts: 2982
Right, LOAD_RAW_API should have no or minimal impact on speed.

The options which do make it somewhat slower are NTSC_COMPATIBILITY, as you said, and also LOAD_UNDER_D000_DFFF and LOAD_VIA_KERNAL_FALLBACK.

Exomizer, despite big speed improvements from version 2 to 3, is still among the slowest, if not the slowest.

For tinycrunch vs. *nax, it may depend more on the actual corpus of test files what's faster with combined loading + depacking. The pack ratio diff vs. depacking speed diff ratio may or may not tilt the scale in favour of one or the other, depending on the actual file.
2018-08-28 19:55
Sparta

Registered: Feb 2017
Posts: 49
Krill, first of all congratulations, your loader is truly a masterpiece. I spent considerable time with deciphering it and I think now I understand what you are doing. The GCR loop is an amazing feat. One of its major advantages vs checksum verification integrated in either side of the transfer loop is that you do not need to wait with changing tracks until after transfer of the last block in a track is completed. Shrydar stepping cuts the delay to 12 bycles. This, however, can be completely eliminated. The following (Spartan) method provides a seamless and uninterrupted transfer of data across neighboring tracks. This is how it works in the latest version of my loader developed for personal use:
		lda	$1c00		//First half-track step
		sec
		rol
		and	#$03
		eor	$1c00
		sta	$1c00		//Update VIA 2 Port B

		sec			//Calculate second half step...
		rol
		and	#$03
		eor	$1c00
		sta	LastStep+1	//…and save it for later

Then start data transfer immediately:
		ldy	#$00
		…
		lsr
		dey
TrBranch:	bmi	Loop		//Send #$81 bytes first, then the remaining #$7f

		bit	$1800
		bpl	*-3
		sta	$1800		//Last 2 bits completed

		lda	#$d0		//Replace "BMI" with "BNE"
		sta	TrBranch
LastStep:	lda	#$00
		sta	$1c00		//Update VIA 2 Port B
		cpy	#$00		//
		bne	Loop2		//Back to transfer if not done
					//C64 loop has a similar delay built in

This can be adapted to almost any transfer loop, reducing the delay to a few cycles.
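The `sec / rol / and #$03 / eor $1c00` sequence in the first listing is compact enough to be worth unpacking. Bits 0-1 of $1c00 (VIA 2 port B) hold the stepper motor phase, and the sequence advances that 2-bit phase by one while leaving the other bits alone. The Python sketch below models just that arithmetic; `half_step` and the sample port value are made up for illustration, and it assumes the upper port bits do not change between the read and the write.

```python
def half_step(port_b):
    """Model of: lda $1c00 / sec / rol / and #$03 / eor $1c00 (result is what gets stored)."""
    a = ((port_b << 1) | 1) & 0xFF   # sec + rol: shift left, carry (1) enters bit 0
    a &= 0x03                        # and #$03: keep only the two stepper bits
    return a ^ port_b                # eor $1c00: flip stepper bits in the port value

port = 0x4C                  # hypothetical port value with stepper phase 0
first = half_step(port)      # written to $1c00 immediately
second = half_step(first)    # stored into the LastStep operand, written mid-transfer
print(hex(first), hex(second))
```

Two applications advance the phase by two, i.e. one full track, which is why the second value can be precomputed and written later from inside the transfer loop.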
2018-08-28 21:20
Krill

Registered: Apr 2002
Posts: 2982
Sparta: Thanks! :)

I've considered something like your method (Spartan Stepping :D), but ultimately decided against it.

Its central concept is issuing the second half-track step in the middle of the block transfer.

However, this poses a few problems in a general-purpose standard format loader:
- The computer-side resident code needs to be aware of the slight delay in the middle and wait accordingly, which would increase resident code size ("C64 loop has a similar delay built in", as commented in your example).
- The computer-side code needs to be aware that the currently-transferred block is, indeed, the final file block of the current track, otherwise the extra delay would be in vain (and possibly a net loss due to just-missed following blocks). This would increase resident code size and also require that information to be sent to the resident code somehow, meaning extra protocol overhead.
- The drive-side code is extremely tight as it is (tightest code i ever made, and i've squeezed and squeezed again to fit in everything i needed to fit). It might not be possible to use this approach without throwing out some other functionality.
2018-08-28 22:07
Sparta

Registered: Feb 2017
Posts: 49
Yes, you got it. Spartan stepping uses the transfer loop to pace half-track steps instead of a timer. :)

I respectfully disagree with your second point. The computer-side code does not need to know whether the currently transferred block is the final block of a track. Thus, code can be simplified. Fetching and transferring a block takes roughly 27000-29000 cycles depending on speed zones. Spartan stepping adds 17 cycles to this (72 vs. 72.06 bycles/block transfer). I do not think this causes a significant delay resulting in missing the next block. The total loss while loading a full 35-track disk is 664*17= 11288 cycles, spread out evenly. Shrydar stepping, on the other hand, adds 12*256*34=104448 cycles delay. The difference is about 10-fold.
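The figures in this comparison can be reproduced with a few lines of Python. All constants are taken from the post as given (664 blocks, 17 cycles per block, 12 x 256 cycles per track change, 34 track changes); the variable names are only for illustration.

```python
BLOCKS_PER_DISK = 664          # blocks on a full 35-track disk, as used in the post
SPARTAN_CYCLES_PER_BLOCK = 17  # extra cycles added to every block transfer
TRACK_CHANGES = 34             # 35 tracks -> 34 track changes
SHRYDAR_CYCLES_PER_CHANGE = 12 * 256

spartan_total = BLOCKS_PER_DISK * SPARTAN_CYCLES_PER_BLOCK
shrydar_total = SHRYDAR_CYCLES_PER_CHANGE * TRACK_CHANGES

print(spartan_total, shrydar_total, round(shrydar_total / spartan_total, 2))
```

The ratio comes out at roughly 9.25, consistent with the "about 10-fold" estimate.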

After the on-the-fly GCR loop and the 72-cycles/byte transfer loop, Spartan stepping was the first thing that resulted in a significant speed improvement in my loader.

I can see your point in your third comment. Your code's complexity-to-tightness ratio is extremely high. :)
2018-08-29 11:14
ChristopherJam

Registered: Aug 2004
Posts: 1409
shrydar here.

Yes, I have wondered about doing a half track step mid transfer too, but I think the pertinent performance metric is percentage time saved, which even at interleave of three is only 0.5% (3072 cycles every 600,000 - and that's assuming no errors, and either perfectly aligned tracks or out of order loading).
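For reference, the percentage quoted here follows directly from the two cycle counts in the post; this small Python check takes both numbers as given and the names are illustrative.

```python
STEP_DELAY_CYCLES = 3072      # 12 * 256, the per-track stepping delay from the post
CYCLES_PER_TRACK = 600_000    # time to load one track at interleave 3, as stated in the post

saving_percent = 100 * STEP_DELAY_CYCLES / CYCLES_PER_TRACK
print(f"{saving_percent:.2f}%")
```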

Either shrydar stepping or spartan stepping is a huge improvement over the old "wait until you're about to try and read the next block, then spend 60+ bycles on stepping and stabilisation" mind. The biggest win is almost certainly from allowing the head to settle during the transfer.

I'm still undecided about when to do the second step in Marmaload; my own loader development's been on hold while I've been distracted by crunchers and demo effects.

At this rate I suspect that'll remain the case until I've at least one production out the door using Krill's instead - we'll see :)
2018-08-29 11:30
Krill

Registered: Apr 2002
Posts: 2982
Quoting Sparta
I respectfully disagree with your second point. The computer-side code does not need to know whether the currently transferred block is the final block of a track.
You're probably right there. But the first and third points alone seem to prohibit Spartan Stepping in my case. And yes, what Shrydar aka ChristopherJam said. :)
2018-08-29 16:32
Sparta

Registered: Feb 2017
Posts: 49
Quoting ChristopherJam
The biggest win is almost certainly from allowing the head to settle during the transfer.


Agreed on this. In my loader Sparkle, which will never be as versatile as Krill's, I think I am going to settle (huh) with the best of both worlds. I.e. I will time the second half-track step in the transfer loop about 12 bycles after the first one to allow enough time for the head to settle. Call it the Spartan Shrydar Step. :))

P.S. I was aware of the mysterious Shrydar's identity. Google knows everything. :)
2018-08-30 10:10
bubis
Account closed

Registered: Apr 2002
Posts: 19
Quote: Quoting ChristopherJam
The biggest win is almost certainly from allowing the head to settle during the transfer.


Agreed on this. In my loader Sparkle, which will never be as versatile as Krill's, I think I am going to settle (huh) with the best of both worlds. I.e. I will time the second half-track step in the transfer loop about 12 bycles after the first one to allow enough time for the head to settle. Call it the Spartan Shrydar Step. :))

P.S. I was aware of the mysterious Shrydar's identity. Google knows everything. :)


We just don't know who you are, my Hungarian fellow. :)
2019-03-13 06:39
map

Registered: Feb 2002
Posts: 27
Quoting Groepaz
python has the same problems, i think (is it a single exe requiring no install?) (the point is: right now "our" dev environment is completely freestanding requiring no msys or cygwin or any of that, and no installing either - breaking that is not an option :))

the problem with ca65 are at least nucrunch and tinycrunch, those will not build.


One possibility here might be to use PyInstaller with the option --onefile to create an .exe from the .py.
https://pypi.org/project/PyInstaller/
Using the UPX packer you can minimize the filesize of the .exe.
2019-03-14 08:43
Krill

Registered: Apr 2002
Posts: 2982
Quoting Groepaz
python has the same problems, i think (is it a single exe requiring no install?) (the point is: right now "our" dev environment is completely freestanding requiring no msys or cygwin or any of that, and no installing either - breaking that is not an option :))
Apparently, this does seem to exist: http://winpython.github.io/ - "The easiest way to run Python [...] out of the box on any Windows PC, without installing anything!", "WinPython lives entirely in its own directory, without any OS installation" and similar claims. There's some small print, though, so YMMV*.

* "Your metrage may vary" in PAL-land.
2019-03-14 08:47
Krill

Registered: Apr 2002
Posts: 2982
Quoting map
Using the UPX packer you can minimize the filesize of the .exe.
Expect lots of Windows users to bemoan that their assorted snake-oil protection suite puts it into quarantine.
2019-03-15 06:11
ChristopherJam

Registered: Aug 2004
Posts: 1409
Perhaps 'someone' should port tc_encode (the tinycrunch encoder) to c64? :D


Also, it's worth noting that the build issues with nucrunch and tinycrunch under newer versions of ca65 were fixed some time ago now.
2019-03-15 06:54
map

Registered: Feb 2002
Posts: 27
Quoting Krill
Quoting map
Using the UPX packer you can minimize the filesize of the .exe.
Expect lots of Windows users to bemoan that their assorted snake-oil protection suite puts it into quarantine.


True, but UPX is completely optional here.

Another hint: in case the target is to run the Python script using wine - the wine project only recently added the required libraries for the latest Python3. Not all Linux Distros have updated their packages yet.
All times are CET.