You are not logged in -
nap
CSDb User Forums
Forums
>
C64 Coding
>
Fast large multiplies
2012-06-09
19:45
Repose
Account closed
Registered: Oct 2010
Posts: 227
Fast large multiplies
I've discovered some interesting optimizations for multiplying large numbers, if the multiply routine time depends on the bits of the mulitplier. Usually if there's a 1 bit in the multiplier, with a standard shift and add routine, there's a "bit" more time or that bit.
The method uses several ways of transforming the input to have less 1 bits. Normally, if every value appears equally, you average half 1 bits. In my case, that becomes the worst case, and there's about a quarter 1 bits. This can speed up any routine, even the one that happens to be in rom, by using pre- and post- processing of results. The improvement is about 20%.
Another speedup is optimizing the same multiplier applied to multiple multiplicands. This saves a little in processing the multiplier bits once. This can save another 15%.
Using the square table method will be faster but use a lot of data and a lot of code.
Would anyone be interested in this?
... 150 posts hidden. Click
here
to view all posts....
2023-12-14
13:26
Repose
Account closed
Registered: Oct 2010
Posts: 227
Oh, you definitely have added some great ideas!
However, I researched it before and there's a few people who came up with that idea before any of us, not c64 sceners either.
2024-02-13
08:13
Repose
Account closed
Registered: Oct 2010
Posts: 227
I've finally published my new fastest routine, 2023 version. It's even faster now, 187.1.
https://codebase64.org/doku.php?id=base:fastest_multiplication_..
And independently benchmarked here
https://github.com/tobyLobster/multiply_test/?tab=readme-ov-fil..
Both faster and shorter, so not bad.
2024-02-13
13:37
ChristopherJam
Registered: Aug 2004
Posts: 1424
Nice work!
2024-02-13
20:58
Repose
Account closed
Registered: Oct 2010
Posts: 227
Of course, only now do I realize a dumb trick:
x1 = p_sqr_lo
umult16
; set multiplier as x1
lda x1
;sta p_sqr_lo
sta p_sqr_hi
eor #$ff
sta p_neg_sqr_lo
sta p_neg_sqr_hi
; set multiplier as x0
; *x1 is no longer preserved
To easily save 3 cycles and 1 byte zp.
Going a bit further, if you add 6 zp, using two sets of pointers, then x0 and x1 can be p_sqr_lo1 and p_sqr_lo2. And one more step, make the caller set up both pointers, but that's a bit more difficult as we can't assume the caller can use A for the calculation eor #$ff (yes, A will be trashed when you call the umult16, but I mean when the caller has the other half of x_ ready).
So, save 3, 6 or 12 cycles.
One more hack is passing y0 in Y, but in the worst case, that's not convenient to the caller, if it happens to be in X, then it's easier to stx y0 rather than txa:tay.
So, I would say saving 3 or 6 cycles are good options, and 3 for sure. That brings the routine down to 180.1 :) Saving 14.5 cycles and being 10% faster definitely makes me happy, losing another 6 bytes zp not so much.
2025-01-01
13:13
Repose
Account closed
Registered: Oct 2010
Posts: 227
Another year and I'm just a day late, but I do have a 2024 version of multiplies which are faster again, breaking the world record 3 times in a row. I still have some work to do to fully verify and explore all the possibilities, but it's coming soon. I think it will be another 6% faster at best.
I keep wondering, how could I not see this? How can I open my mind better to the possibilities? You have to always go back to first principles and explore every variation.
So, this is just a pre-announcement.
Previous
-
1
| ... |
6
|
7
|
8
|
9
|
10
|
11
|
12
|
13
|
14
|
15
| 16 - Next
Refresh
Subscribe to this thread:
You need to be logged in to post in the forum.
Search the forum:
Search
All forums
C64 Coding
C64 Composing
C64 Pixeling
C64 Productions
CSDb Bug Reports
CSDb Development
CSDb Discussions
CSDb Entries
CSDb Feedback
CSDb Info
CSDb moderators
CSDb Questions
Messages to moderators
Requests
for
in
Writer & text
Text
Writer
All times are CET.
Search CSDb
All
Releases
Groups
Sceners
Events
BBS
SIDs
-------
Forum
Comments
Advanced
Users Online
AmiDog
Mike
Honesty/Covenant/Ons..
Knut Clausen/SHAPE/F..
t0m3000/bo0M!^hf^ibex
NthSt4r
REBEL 1/HF
Colt45RPM
Didi/Laxity
CA$H/TRiAD
GuyGavin/HF
Guests online: 229
Top Demos
1
Next Level
(9.7)
2
13:37
(9.7)
3
Codeboys & Endians
(9.7)
4
Mojo
(9.6)
5
Coma Light 13
(9.6)
6
Edge of Disgrace
(9.6)
7
Signal Carnival
(9.6)
8
Wonderland XIV
(9.5)
9
Uncensored
(9.5)
10
Comaland 100%
(9.5)
Top onefile Demos
1
Nine
(9.7)
2
Layers
(9.6)
3
Cubic Dream
(9.6)
4
Party Elk 2
(9.6)
5
Copper Booze
(9.5)
6
Scan and Spin
(9.5)
7
Onscreen 5k
(9.5)
8
Grey
(9.5)
9
Dawnfall V1.1
(9.5)
10
Rainbow Connection
(9.5)
Top Groups
1
Artline Designs
(9.3)
2
Booze Design
(9.3)
3
Oxyron
(9.3)
4
Performers
(9.3)
5
Censor Design
(9.3)
Top Fullscreen Graphicians
1
Joe
(9.7)
2
Sulevi
(9.6)
3
The Sarge
(9.6)
4
Veto
(9.5)
5
Facet
(9.5)
Home
-
Disclaimer
Copyright © No Name 2001-2025
Page generated in: 0.064 sec.