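; dividend is in BIGNUM, little-endian
; divisor is in accu
        sta DIVISOR
        lda #0                  ; remainder starts at 0
        ldx #BIGNUMBYTES        ; start at the most significant byte
longdivmod
        ldy #8
        asl BIGNUM - 1,x        ; top dividend bit into carry
-       rol                     ; shift dividend bit into remainder
        cmp DIVISOR
        bcc +
        sbc DIVISOR             ; carry is set here and stays set
+       rol BIGNUM - 1,x        ; quotient bit in, next dividend bit out
        dey
        bne -
        dex
        bne longdivmod
; quotient is in BIGNUM, little-endian
; remainder is in accu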
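For anyone who wants to check the routine off-target, here is a rough Python model of it. It is only a sketch: the function name `long_divmod` and the restriction to divisors of at most 128 (so that the remainder survives the 8-bit `rol`) are my own assumptions, not something stated in the listing.

```python
def long_divmod(bignum, divisor):
    """Divide a little-endian list of bytes in place by a small divisor.

    Returns the remainder; the quotient replaces the dividend bytes.
    Hypothetical model of the 6502 listing, not the original code.
    """
    assert 1 <= divisor <= 128                 # assumed limit for an 8-bit remainder
    rem = 0                                    # lda #0
    for x in range(len(bignum), 0, -1):        # ldx #BIGNUMBYTES ... dex / bne
        byte = bignum[x - 1]
        carry = byte >> 7                      # asl BIGNUM - 1,x
        byte = (byte << 1) & 0xFF
        for _ in range(8):                     # ldy #8 ... dey / bne
            rem = ((rem << 1) | carry) & 0xFF  # rol (accumulator)
            carry = 1 if rem >= divisor else 0 # cmp DIVISOR
            if carry:
                rem -= divisor                 # sbc DIVISOR
            next_carry = byte >> 7             # rol BIGNUM - 1,x
            byte = ((byte << 1) | carry) & 0xFF
            carry = next_carry
        bignum[x - 1] = byte
    return rem


if __name__ == "__main__":
    value = 0x0123456789ABCDEF
    digits = list(value.to_bytes(8, "little"))
    remainder = long_divmod(digits, 23)
    # Compare against Python's own big-integer arithmetic.
    assert (int.from_bytes(bytes(digits), "little"), remainder) == divmod(value, 23)
```

The point of the model is the bit bookkeeping: the initial `asl` plus eight `rol`s push all eight dividend bits out of the byte while the eight quotient bits are shifted in behind them, so no separate quotient buffer is needed.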
A little writeup on the basic idea, and on how you ended up with this routine, would help.
For 5 bits, there are only 32 possibilities. If you're willing to store a piece of code for each, some of them can be made faster. (Powers of two, obviously. Also, in the first iteration of x, the "sbc" won't get executed until 2^(8-y) >= divisor, so you can do a preloop that just does the rols.)
That is, if you're doing this on the C64. If this is some drive GCR thing (and the 5-bit provision makes me think it might be), then of course memory's too precious. But you might still be able to squeeze out a few cycles with a log2(x) lookup table, for x=0..31. (EDIT: Actually, if memory's not an issue, you can have two 8 KB lookup tables, for div and mod of 8-bit by 5-bit. You'll still have to shift, though.)
How would such a table improve speed? As far as I can tell, it could only be used for the first iteration.
Gunnar, regarding "line slope division": I discovered long ago that log/exp is accurate enough for that.
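div:    ldy #160                ; 160 bits (20 bytes) to process
iterl:  clc
        ldx #20
:       rol acc,x               ; shift all 21 bytes left, acc+20 first
        dex
        bpl :-
        sec
        lda acc                 ; acc+0 holds the running remainder
        sbc den
        bcc nosub
        sta acc
        inc acc+20              ; set the new quotient bit
nosub:  dey
        bne iterl
        rts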
It is "2 to 5 bits of divisor", actually. I'm extracting permutation indices from a large number, and the sets to permute have 3 to slightly over 16 items. Unranking the permutations can be done in O(n) as well, and in-place. :)
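To make the use case concrete, here is a small Python sketch of one well-known way to unrank a permutation in place in O(n), using one divmod by a small number per item (the swap-based scheme usually credited to Myrvold and Ruskey). This is only my illustration; the name `unrank_permutation` is made up, and the scheme in the post above may well differ.

```python
def unrank_permutation(items, rank):
    """Permute `items` in place according to `rank`, 0 <= rank < len(items)!.

    Each step divides the remaining rank by the current set size n, which
    is exactly the "big number divided by a small divisor" operation from
    this thread. Hypothetical sketch, not the poster's actual code.
    """
    for n in range(len(items), 1, -1):
        rank, r = divmod(rank, n)                       # long division by n
        items[n - 1], items[r] = items[r], items[n - 1] # fix position n - 1
    return items


if __name__ == "__main__":
    # Every rank below 4! should give a different arrangement of 4 items.
    seen = {tuple(unrank_permutation(list("abcd"), k)) for k in range(24)}
    assert len(seen) == 24
```

With set sizes from 3 to a little over 16, the divisors are the values n = 2..17 or so, which fits the "2 to 5 bits of divisor" description.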
Or the very last iteration? :) At least I don't see how this table would apply to arguments larger than a byte.
; [init zp_{mod,div} to point to {mod,div}_lookup]
loop:
        ldy BIGNUM, x
        lda (zp_div), y         ; quotient byte for (remainder, byte)
        sta RESULT, x
        lda (zp_mod), y         ; new remainder selects the next table page
        ora #(>div_lookup)
        sta zp_div+1
        eor #(>div_lookup) ^ (>mod_lookup)
        sta zp_mod+1
        inx
        cpx #BIGNUMBYTES
        bne loop
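As I read the loop above, the tables are indexed by the previous remainder (which picks a 256-byte page) and the current byte, so each lookup handles a whole byte and carries the remainder forward. A hedged Python sketch of that reading follows; `build_tables` and `table_divmod` are hypothetical names, and I am assuming one table pair per fixed divisor and most-significant-byte-first processing.

```python
def build_tables(divisor):
    """Build div/mod tables indexed as [remainder][byte] for one divisor.

    Only remainders below the divisor can occur, so `divisor` pages are
    enough; 32 pages of 256 bytes would be 8 KB per table.
    """
    div_lookup = [[(r * 256 + b) // divisor for b in range(256)]
                  for r in range(divisor)]
    mod_lookup = [[(r * 256 + b) % divisor for b in range(256)]
                  for r in range(divisor)]
    return div_lookup, mod_lookup


def table_divmod(bignum_be, div_lookup, mod_lookup):
    """Divide a big-endian byte sequence using one table lookup per byte."""
    rem = 0                                    # page 0 to start with
    result = []
    for byte in bignum_be:                     # most significant byte first
        result.append(div_lookup[rem][byte])   # lda (zp_div),y / sta RESULT,x
        rem = mod_lookup[rem][byte]            # lda (zp_mod),y, then repoint pages
    return result, rem


if __name__ == "__main__":
    value = 0x0123456789ABCDEF
    div_t, mod_t = build_tables(23)
    quotient, remainder = table_divmod(value.to_bytes(8, "big"), div_t, mod_t)
    assert (int.from_bytes(bytes(quotient), "big"), remainder) == divmod(value, 23)
```

Because the remainder is always smaller than the divisor, every quotient entry fits in a byte, which is why the table applies to arguments of any length and not just to the first or last byte.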
; div16
; input:
; - 16-bit number in dividend, little-endian
; - 8-bit in divisor
; output:
; - 16-bit number in dividend
; - AC: remainder
; - XR: 0
; - YR: unchanged
div16   ldx #16
        lda #0
-       asl dividend
        rol dividend+1
        rol a
        cmp divisor
        bcc +
        sbc divisor
        inc dividend
+       dex
        bne -
        rts

divisor  .byte 0
dividend .byte 0, 0