CSDb User Forums


2007-07-06 23:20
Shadow

Registered: Apr 2002
Posts: 351
Fastloading alternative

I'm currently working on the loading routine for my DTV demo.
It will read in a packed file, unpack it to DTV extended memory, then read the next file and so on until the entire demo is loaded.

To make it compatible with 64HDD, 1541-III-DTV etc. (not everyone has their DTV connected to a normal 1541 disk drive), I'm making one version that uses the standard loading routines. I'm testing with 64HDD and it seems to work.

	lda #fname2-fname1
	ldx #<fname1
	ldy #>fname1
	jsr $ffbd     // call setname
	lda #$01
	ldx $ba       // last used device number
	bne !skip+
	ldx #$08      // default to device 8
!skip:   
	ldy #$01      // not $00 means: load to address stored in file
	jsr $ffba     // call setlfs
	lda #$00      // $00 means: load to memory (not verify)
	jsr $ffd5     // call load
	bcs error    // if carry set, a load error has happened
	jmp unpackfile1
error:
	rts
unpackfile1: // unpack the first file
// Then load next file etc.
...

fname1:	.text "0.*"
fname2:	.text "1.*"
...


However, as you all know, the built-in disk routines are extremely slow, so it will take quite a while to load the demo.

So I'm thinking of doing a version for those that use a real 1541 that uses some kind of fastloading.
My question is if anyone has a code snippet that does the same as the above, but FASTER! :)
I don't need any IRQ loader or any other such fancy stuff, since I do all loading before the demo starts. The only thing I require of the loader is that it leaves the normal text screen at $0400-$07ff visible (I'll use that to show loading progress).

Any ideas?
2007-07-07 15:07
MagerValp

Registered: Dec 2001
Posts: 1010
Grab the covertbitops 2-bit loader. The source is very easy to understand and modify, and about 5x faster than the kernal.
2007-07-07 16:46
Oswald

Registered: Apr 2002
Posts: 4725
well, there's dreamload for high compatibility. approach doc baccardi with the idea of supporting DTV too :)
2007-07-08 15:28
tlr

Registered: Sep 2003
Posts: 1464
You can make things simpler for you by allowing the user to select fast load or kernal load at boot.

If there is a lot of data to load, and 1541-only plus a blank screen is acceptable, I'd like to mention DTV Speed Load, which is quite fast.
It is possible to use this with open screen on the DTV as you can disable badlines, but you'll need to hack a bit to make bit 0 and 1 of $dd00 constant.

Otherwise I second Mv's and Oswald's recommendations.
2007-07-09 08:32
Shadow

Registered: Apr 2002
Posts: 351
Thanks all! I've decided to try to port and rewrite the covertbitops loader to fit my needs.
2007-07-09 21:32
Shadow

Registered: Apr 2002
Posts: 351
Managed to get the covertbitops loader running. However, while it is faster, it is nowhere close to 5x speed.
I timed the loading of my 21kb picture (about 85 blocks or so):

Normal kernal routine: 56 seconds
Fastloader routines: 28 seconds

So I'm getting only a 2x speedup...
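
For reference, those timings work out roughly as follows. This is only a quick sketch: it assumes about 85 blocks of 254 payload bytes each (the block count is the estimate from the post, and the helper name is made up for illustration).

```python
BLOCK_PAYLOAD = 254  # data bytes per sector; the other 2 bytes hold the track/sector link


def bytes_per_second(blocks, seconds):
    """Average payload throughput for a load of `blocks` blocks."""
    return blocks * BLOCK_PAYLOAD / seconds


kernal = bytes_per_second(85, 56)   # standard kernal load, as timed above
fast = bytes_per_second(85, 28)     # fastloader, as timed above
speedup = fast / kernal

print(f"kernal: {kernal:.0f} B/s, fastloader: {fast:.0f} B/s, speedup: {speedup:.1f}x")
```

So the fastloader is moving somewhere under 800 bytes per second here, which is the number to watch when experimenting with the disk layout.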
2007-07-09 22:13
TNT
Account closed

Registered: Oct 2004
Posts: 189
Try different interleave values when writing files to disk. This will affect kernal loading speed, but it's so slow anyway that nobody notices...
2007-07-09 22:18
cadaver

Registered: Feb 2002
Posts: 1142
Some tools also write with interleave 1 (c1541?) which will slow the loading down. In theory default interleave 10 should be OK for 2-bit fastloading, but I give no guarantee that the 2-bit loader presented in the fastloading rant is in any way optimal, and it's been a long time since I looked at it.
2007-07-10 00:53
MagerValp

Registered: Dec 2001
Posts: 1010
My bad, the speed I get in my old benchmark is 1100 bytes per second (roughly 3x) on the 1541 and 2600 bytes per second on the 1581. I optimized Lasse's loader a bit and added custom GCR decoding. To get more speed you need to implement a more efficient handshake with block xfer, and not do a full handshake for each byte. You'll also get a speed boost by doing GCR decoding on the C64 side, with nice big lookup tables.
2007-07-10 11:39
Shadow

Registered: Apr 2002
Posts: 351
I have to admit that my knowledge of the inner workings of the 1541 is non-existent, so all this about interleave and GCR decoding is total mumbo jumbo to me! :)
Anyway, things seem to work and the fastloader is faster than the original at least, so all is good. ;)
2007-07-10 13:11
MagerValp

Registered: Dec 2001
Posts: 1010
It's simple enough: let's say you load a file that starts at track 17 sector 0. The 1541 loads the first sector into ram, decodes the GCR data, and then sends over the 254 decoded bytes to the C64. It then requests the next sector of the file. As the disk keeps on spinning, the next sector under the read head is not 17:1. When you set the interleave, you try to predict what the next sector will be (with a little safety margin to allow for variance in drive mechs, etc) to minimize the time the 1541 has to wait for the sector to become available. With the covertbitops 2-bit loader a good interleave is somewhere around 10 or 11, iirc - that is, after 17:0 the 2nd sector will be saved to 17:10 or 17:11, then 17:20 or 17:2, etc.
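
To make the interleave idea concrete, here is a small sketch of the order in which a file's blocks would land on one track. It is a simplified model, not what any particular tool or the drive ROM does: it assumes a plain modulo wrap and steps forward to the next free sector on a collision, and the function name is made up for illustration. Real tools may handle the wraparound differently.

```python
def sector_order(num_sectors, interleave, start=0):
    """Order in which consecutive file blocks land on one track (simplified model)."""
    used = set()
    order = []
    sector = start
    while len(order) < num_sectors:
        # If the target sector is already taken, fall forward to the next free one.
        while sector in used:
            sector = (sector + 1) % num_sectors
        used.add(sector)
        order.append(sector)
        sector = (sector + interleave) % num_sectors
    return order


TRACK_17_SECTORS = 21  # tracks 1-17 hold 21 sectors each on a 1541 disk

print(sector_order(TRACK_17_SECTORS, 10)[:6])  # -> [0, 10, 20, 9, 19, 8]
print(sector_order(TRACK_17_SECTORS, 1)[:6])   # -> [0, 1, 2, 3, 4, 5]
```

With interleave 1 the loader wants sector 1 right after sector 0, but by the time it has transferred the data that sector has already passed the head, so it waits almost a full revolution per block.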

As for fast GCR decoding, that's a little more complicated, and a fair amount of work to get right. Basically it's replacing the routine that decodes the 325 bytes read by the 1541 into the 256 bytes that get sent over to the C64. The drive ROM routine that does the conversion is notoriously slow.
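
For anyone who wants to see what the conversion actually does, here is a minimal sketch of the encoding itself in Python. It only illustrates the 4-bytes-to-5-bytes mapping using the standard Commodore GCR code table; it is not the drive ROM routine and not a fast decoder, and the function names are made up for illustration. Note that 325 GCR bytes decode to 260 bytes: the 256 data bytes plus a few bytes of framing.

```python
# Commodore GCR: each 4-bit nybble is written to disk as a 5-bit code.
GCR_ENCODE = [
    0b01010, 0b01011, 0b10010, 0b10011,
    0b01110, 0b01111, 0b10110, 0b10111,
    0b01001, 0b11001, 0b11010, 0b11011,
    0b01101, 0b11101, 0b11110, 0b10101,
]
GCR_DECODE = {code: nybble for nybble, code in enumerate(GCR_ENCODE)}


def gcr_encode(data):
    """Encode 4 data bytes (8 nybbles) into 5 GCR bytes (8 five-bit codes)."""
    assert len(data) == 4
    bits = 0
    for byte in data:
        bits = (bits << 5) | GCR_ENCODE[byte >> 4]
        bits = (bits << 5) | GCR_ENCODE[byte & 0x0F]
    return bits.to_bytes(5, "big")


def gcr_decode(gcr):
    """Decode 5 GCR bytes back into 4 data bytes."""
    assert len(gcr) == 5
    bits = int.from_bytes(gcr, "big")
    out = bytearray()
    for shift in range(35, -1, -10):  # high-nybble code positions: 35, 25, 15, 5
        hi = GCR_DECODE[(bits >> shift) & 0x1F]
        lo = GCR_DECODE[(bits >> (shift - 5)) & 0x1F]
        out.append((hi << 4) | lo)
    return bytes(out)


sample = bytes([0x07, 0x12, 0xAB, 0xFF])
assert gcr_decode(gcr_encode(sample)) == sample
print(325 * 4 // 5)  # -> 260
```

The awkward part on a 6502 is that the 5-bit codes straddle byte boundaries, which is why the shifting and masking costs so many cycles in the drive.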
2007-07-10 13:47
Shadow

Registered: Apr 2002
Posts: 351
Interesting stuff, I can see how the interleaving can have quite an impact on loading now.
I created my disk by using D64 Editor 0.028 on PC and then writing it to disk with a MMC64. So the question is whether D64 Editor puts the files with a correct interleave or not...
2007-07-10 18:46
Steppe

Registered: Jan 2002
Posts: 1501
When in doubt, use C64copy to create your d64 images. There's an option to set the interleave, IIRC.
2007-09-10 17:33
Conrad

Registered: Nov 2006
Posts: 759
This post is a bit off-topic, as I don't really want to start new threads about the same bloody hardware ;)

I've coded up a routine in the drive code which reads (or at least, that's what I wish it WOULD do) GCR bytes via CLV and lda $1c01 as normal, and then immediately decodes them on the fly.

To make it more understandable:

bvc *
lda $1c01
clv

(gcr decoder code)   (about 33 cycles :( )
(store byte into buffer)

bvc *
lda $1c01
clv

(etc...)


Due to GCR bytes being 5 bits for each nybble, the whole subroutine reads $1c01 five times to decode 4 bytes (using 4 different decoder routines corresponding to the GCR bit patterns stored in an 8-bit value), which are then stored in the data buffer to send through the serial.

The problem though is that it's showing weird byte results in the buffer. The GCR-decoder routines work fine as I've tested them separately, so I think the problem is that the routines use too many cycles in between each CLV and $1c01 read, meaning it can't keep up with the disk. Could someone confirm if this is true or not?

I'm testing the code on a 1541, so the cpu is running at 1MHz, about the same speed as the spin motor iirc.

EDIT: furthermore, I didn't forget to use the GCR conversion tables (lda $f8a0,x ora $f8c0,x) just to let you know.
2007-09-10 19:35
TNT
Account closed

Registered: Oct 2004
Posts: 189
Yes, your routine is too slow. Target below 26 cycles on average to be safe. That includes waiting for GCR byte, reading it, clearing overflow flag and storing data byte.
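
A rough sketch of where a budget like that comes from, in Python. The bit-cell times per speed zone below are the commonly cited 1541 values and are an assumption here, not something stated in this thread; the 1 MHz clock is the figure given above.

```python
CPU_HZ = 1_000_000  # 1541 CPU clock, as stated in the post above

# Assumed bit-cell times per speed zone, in microseconds (commonly cited values).
BIT_CELL_US = {
    "tracks 1-17": 3.25,
    "tracks 18-24": 3.50,
    "tracks 25-30": 3.75,
    "tracks 31-35": 4.00,
}


def cycles_per_gcr_byte(bit_cell_us):
    """CPU cycles available between two consecutive GCR bytes from the head."""
    return 8 * bit_cell_us * CPU_HZ / 1_000_000


for zone, cell in BIT_CELL_US.items():
    print(f"{zone}: {cycles_per_gcr_byte(cell):.0f} cycles per GCR byte")
```

Under these assumptions the outer tracks leave only 26 cycles per byte, so a decoder that takes about 33 cycles on its own, before the wait loop, read and store, will miss bytes there.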
2007-09-10 19:55
Conrad

Registered: Nov 2006
Posts: 759
Thank you TNT, that's what I needed to know :)

Damn, 26 cycles is a very small limit. One of the GCR decoders I coded sums up to 46. :( Guess I'll have to decode them afterwards.
2007-09-11 11:36
MagerValp

Registered: Dec 2001
Posts: 1010
1541 isn't fast enough to decode in realtime. You can do a partial decode though (shift bytes around, etc), so the 2nd pass is faster.
2007-09-11 13:18
Oswald

Registered: Apr 2002
Posts: 4725
krill has a lot of innovative stuff in his loaders. for example interleave independence. a vague attempt to describe it: I guess he first scans the whole track for the sector chaining, then he simply loads the sectors in whatever order they come. on the c64 side the loader puts each sector out with the correct offset. for gcr decoding he doesn't decode the bytes straight, instead he aligns the bits to be sent in parallel (2bit irq loader). so sending a nybble looks like: synch sta whateverreg asl sta whateverreg synch
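
A toy model of that idea in Python, just to show the bookkeeping. It is a sketch of the guess described above, not Krill's actual code: the function names, the `links` dictionary and the fixed 254-byte payload (ignoring a shorter last block) are all assumptions made for illustration.

```python
PAYLOAD = 254  # data bytes per sector after the 2-byte track/sector link


def chain_positions(links, first):
    """Walk the sector chain once and map each sector to its index in the file.

    `links` maps sector -> next sector (None marks the last block).
    """
    positions = {}
    sector, index = first, 0
    while sector is not None:
        positions[sector] = index
        sector = links[sector]
        index += 1
    return positions


def load_out_of_order(arrivals, positions):
    """Place each arriving sector's payload at its final offset in memory."""
    memory = bytearray(len(positions) * PAYLOAD)
    for sector, payload in arrivals:
        offset = positions[sector] * PAYLOAD
        memory[offset:offset + len(payload)] = payload
    return bytes(memory)


# File chain on one track: sector 0 -> 10 -> 20.
positions = chain_positions({0: 10, 10: 20, 20: None}, first=0)

# Sectors arrive in whatever order they pass under the head.
arrivals = [(10, b"B" * PAYLOAD), (20, b"C" * PAYLOAD), (0, b"A" * PAYLOAD)]
image = load_out_of_order(arrivals, positions)
assert image == b"A" * PAYLOAD + b"B" * PAYLOAD + b"C" * PAYLOAD
```

Once the chain is known up front, the interleave the file was written with no longer matters much, because no revolution is spent waiting for one specific sector.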
2007-09-11 14:57
Peiselulli

Registered: Oct 2006
Posts: 77
Use the Fastloader that is implemented in my kernal.
It is based on TLR's Speed loader and works very well for 1541 and DTV.
Link:

DTV Speed Load

With this, you get a speed-up by a factor of 25 or so.
But it is not an interrupt loader (which is no problem for you, as you said).

EDIT: Sorry, that was posted by TLR before.
2007-09-11 16:34
Oswald

Registered: Apr 2002
Posts: 4725
factor 25 sounds too much, iirc even AR warp load is slower.
2007-09-11 18:07
tlr

Registered: Sep 2003
Posts: 1464
Correct. The original turbo was named 25x, but the speed is probably more like 16x. It could be made a bit faster on the DTV though as there are faster CPU modes + blitter.
2007-09-12 09:37
Krill

Registered: Apr 2002
Posts: 1974
You have to distinguish between the plain AR fast load (16x) using the original CBM disk format with GCR, and Warp*25 (25x) using a special format and encoding (MFM), which is not compatible with the original CBM disk format.
All times are CET.