CSDb User Forums


2021-05-01 22:49
Krill

Registered: Apr 2002
Posts: 2839
Alt-history no-cost design changes with great value

Which things in the C-64 could have been implemented or connected differently without conceivable extra cost, for coding advantages?

Thinking of things like shuffling the chip register bits like VIC's $d011 and $d016 differently (such that some effects can be achieved with fewer register writes or less twiddling).
Or putting some IO register at $01 (and moving the memory configuration somewhere else, somehow).
Maybe also having different PLA memory configurations (not necessarily more).
Or connecting external signals to the CIA port pins in a different order.

Discuss! =)
2021-05-01 23:16
XmikeX
Account closed

Registered: Oct 2018
Posts: 2
FUCKINGS TO C64 ..! .. err, my apologies.

... What I mean is .. How about something like .. replace 6510 with 6502C from Atari and get a true Halt .. instead of that BA stuff ! (??)

--

.. now .. Let's continue with the beloved C-128, and the missing "improvements" there ! .. of course !

C-128 was slated to have ...

- FULL 4 MHZ operation, across the board, no more Z80 waits!
- A real 6551 chip (at $d700, as seen in CP/M3 source) !
- MMU that could handle Full 256k of RAM, as noted in rom copy routine to missing ram banks !
- VDC with IRQ pin AND full 64k RAM ! (we only saw this in the 128 DCR, but as Strobe reminded me... they did not connect the /IRQ pin ! .. and yes, i know the 64k part is easy to add in the flat 128s )

MOS was incapable of 4 MHz anything, so I won't bemoan the lack of 4 MHz support chips, etc !

(this thread is carryover from IRCNET #c-64 discussion)
2021-05-01 23:33
Mixer

Registered: Apr 2008
Posts: 422
C-64
- SID noise reduction.
- DMA for memory transfers/register updates.
2021-05-01 23:46
Mac Bacon

Registered: Feb 2016
Posts: 6
Sprite pos registers would be easier to index if they were in x0,x1,x2,x3,x4,x5,x6,x7,y0,y1,y2,y3,y4,y5,y6,y7 order.
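To illustrate the indexing point, here is a small Python sketch comparing the real interleaved layout with the reordered one proposed above. The function names are made up for the example; only the $d000 base and the x0,y0,x1,y1,... order are facts about the real chip.

```python
VIC_BASE = 0xD000  # start of the VIC-II register block


def real_xy_regs(sprite):
    """Real VIC-II: X and Y registers interleaved (x0,y0,x1,y1,...)."""
    return VIC_BASE + 2 * sprite, VIC_BASE + 2 * sprite + 1


def proposed_xy_regs(sprite):
    """Hypothetical reordering from the post: x0..x7 followed by y0..y7."""
    return VIC_BASE + sprite, VIC_BASE + 8 + sprite


if __name__ == "__main__":
    for n in range(8):
        rx, ry = real_xy_regs(n)
        px, py = proposed_xy_regs(n)
        print(f"sprite {n}: real x=${rx:04X} y=${ry:04X}   "
              f"proposed x=${px:04X} y=${py:04X}")
```

With the proposed order the sprite number itself could serve as the index (STA $d000,X and STA $d008,X), whereas the real layout needs the index doubled first.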
2021-05-02 00:35
Krill

Registered: Apr 2002
Posts: 2839
Quoting Mixer
C-64
- SID noise reduction.
- DMA for memory transfers/register updates.
Pretty sure a DMA controller would have added a few dollars to the BOM. =)

What is required for SID noise reduction?

Quoting Mac Bacon
Sprite pos registers would be easier to index if they were in x0,x1,x2,x3,x4,x5,x6,x7,y0,y1,y2,y3,y4,y5,y6,y7 order.
Can you elaborate with some examples?
2021-05-02 00:35
Zaz

Registered: Mar 2004
Posts: 33
Nah, the C64 is perfect!
2021-05-02 05:31
Martin Piper

Registered: Nov 2007
Posts: 634
Colour RAM address like screen RAM
2021-05-02 07:25
MagerValp

Registered: Dec 2001
Posts: 1055
Ditching 1540 compatibility, which failed anyway.
2021-05-02 09:31
Oswald

Registered: Apr 2002
Posts: 5017
this comes from Graham (he told me on irc aeons ago): why have the top 2 bits of char pointers ANDed to 00 in ECM mode ? just leave them as is.

also it probably wouldn't cost much to have a PLA setting where VICII is not forced to see char rom at $1000 and $9000.

I'm totally with Xmikex on the Halt thing. When I learned how Atari does it I was like WTF, why is it so complicated on c64 then. VICII could just easily halt the cpu on every 2nd 2mhz cycle, and just stop doing that in the borders for 2mhz.

proper CIA's with shift registers working into serial bus ?

also dropping potmeters from SID and lightpen from VICII for something more useful?


how about a border disable bit in VICII ? it probably wouldn't need more than a handful of transistors :) ... or letting badlines go into the top/bottom border area..

edit: multicolor mode #2, where it's always multicolor, so 16 colors possible for d800.

edit#2: also how about that unused lowest bit in d018? wouldn't it be nice sometimes to just inc d018 ? :)
2021-05-02 10:16
TWW

Registered: Jul 2009
Posts: 541
Another colour ram bank allowing double buffering
2021-05-02 11:03
tlr

Registered: Sep 2003
Posts: 1714
A simpler way to sync the CPU to the VIC-II. Either via some kind of halt facility, or at least having an up-counting timer to measure how far from the time of IRQ assertion we are.
2021-05-02 11:30
chatGPZ

Registered: Dec 2001
Posts: 11108
Quote:
proper CIA's with shift registers working into serial bus ?

the CIA shift registers work just fine
2021-05-02 11:52
Krill

Registered: Apr 2002
Posts: 2839
Quoting TWW
Another colour ram bank allowing double buffering
Quoting Martin Piper
Colour RAM address like screen RAM
Again, pretty sure these things like extra colour RAM or extending the entire data bus to 12 bits would have added quite a bit to the price of the machine.
Please stay with no-cost changes (or very little cost if you will).

Quoting MagerValp
Ditching 1540 compatibility, which failed anyway.
So the ROM loader would be slightly faster, but still very slow? =)

Quoting Oswald
this comes from Graham (he told me on irc aeons ago): why have the top 2 bits of char pointers ANDed to 00 in ECM mode ? just leave them as is.
Very good!
Quoting Oswald
also how about that unused lowest bit in d018? wouldn't it be nice sometimes to just inc d018 ? :)
As that bit is always 1, inc works pretty well, no? :)
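A toy Python model may make the INC trick clearer. It assumes only that bit 0 of $d018 is not stored and reads back as 1 (which is why the register reads $15 after reset); the class and method names are invented for this sketch.

```python
class D018:
    """Toy model of VIC-II register $d018 with its unused bit 0."""

    def __init__(self, value=0x14):
        self.write(value)

    def write(self, value):
        self.latched = value & 0xFE   # bit 0 is not stored

    def read(self):
        return self.latched | 0x01    # the unused bit reads back as 1

    def inc(self):
        # INC is a read-modify-write: read, add one, write back
        self.write((self.read() + 1) & 0xFF)

    def char_bank(self):
        return (self.latched >> 1) & 0x07  # bits 1-3: character memory pointer


if __name__ == "__main__":
    reg = D018()
    for _ in range(4):
        print(f"reads ${reg.read():02X}, charset bank {reg.char_bank()}")
        reg.inc()
```

Because the value read always has bit 0 set, adding one carries straight into bit 1, so each INC steps the character memory pointer by one.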

Quoting Oswald
multicolor mode #2, where it's always multicolor, so 16 colors possible for d800.
You mean changing multicolour char mode, which renders hires chars for colours 0-7, so it would always put out multicolour chars?
2021-05-02 14:36
TWW

Registered: Jul 2009
Posts: 541
Quoting Krill
Quoting TWW
Another colour ram bank allowing double buffering
Quoting Martin Piper
Colour RAM address like screen RAM
Again, pretty sure these things like extra colour RAM or extending the entire data bus to 12 bits would have added quite a bit to the price of the machine.
Please stay with no-cost changes (or very little cost if you will).



Use bank switching and change between 2 ColRAM banks located @ $d800. 2k ram instead of 1k and PLA to handle the switching.
2021-05-02 14:55
ChristopherJam

Registered: Aug 2004
Posts: 1378
Quote: Another colour ram bank allowing double buffering

I gather c128 has that :)
2021-05-02 15:24
Jammer

Registered: Nov 2002
Posts: 1289
- TXY/TYX
- 8bit Colour RAM which uses higher nybble either for hires/mc in charmode, or luma level per char in bitmap mode (but that would probably cost a lot back then)
- parallel drive interface (we know it was compatibility decision, not cost decision)
- SID made according to original specs + filter per channel ;)
- hardware PCM channel
2021-05-02 15:26
ChristopherJam

Registered: Aug 2004
Posts: 1378
Agreed on deinterleaving the sprite position registers

Re-arranging the pulse width bits on SID to put the eight high bits in a single register would have saved a fair bit of code - most tunes could happily leave the low four bits untouched during playback.
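As a rough illustration of the pulse width idea, the sketch below splits a 12-bit pulse width the way the real SID does (low 8 bits, then high 4 bits) and the way suggested here (high 8 bits in one register). The function names are hypothetical.

```python
def real_pw_regs(pw):
    """Real SID layout: low 8 bits in one register, high 4 bits in the next."""
    return pw & 0xFF, (pw >> 8) & 0x0F


def proposed_pw_regs(pw):
    """Hypothetical layout from the post: high 8 bits together, low 4 apart."""
    return (pw >> 4) & 0xFF, pw & 0x0F


def sweep_proposed(start_hi, steps):
    """Pulse widths reached by stepping only the hypothetical high register."""
    return [((start_hi + i) & 0xFF) << 4 for i in range(steps)]


if __name__ == "__main__":
    print([hex(v) for v in real_pw_regs(0x8A5)])
    print([hex(v) for v in proposed_pw_regs(0x8A5)])
    print([hex(v) for v in sweep_proposed(0x80, 3)])
```

Under this assumed layout a player could modulate the pulse width in steps of 16 with a single register write per frame.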

Putting the lowest bit of sprite X position into a separate register instead of the highest would likely have resulted in a lot of routines that left the LSB zero, but would make full screen sprite positioning a hell of a lot saner if you could live with the slightly coarser movement.

Alternately, instead of separate registers for sprite x-msb, sprite priority, sprite x/y expand, sprite MCM, have a mode register for each sprite that combines those five attributes (and probably the collision flags too).

Soooo much saner for multiplexers that actually change modes.

Allow disabling of the character set images in one of the memory maps (surely just a change in PLA programming).

Bugfix stx $xxxx,y and sty $xxxx,x
2021-05-02 15:34
Krill

Registered: Apr 2002
Posts: 2839
As for re-grouping registers, i always thought the SID's 7 per-voice registers should have been interleaved.
Then updating registers in a per-voice loop could be done with, e.g., X = 2, 1, 0 rather than subtracting 7.
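A short sketch of the two address mappings, for comparison. The real layout places the 7 registers of each voice consecutively from $d400; the interleaved mapping is only one possible reading of the suggestion, and the function names are made up.

```python
SID_BASE = 0xD400
VOICES = 3
REGS_PER_VOICE = 7


def real_addr(voice, reg):
    """Real SID: all 7 registers of a voice are consecutive (stride 7)."""
    return SID_BASE + REGS_PER_VOICE * voice + reg


def interleaved_addr(voice, reg):
    """Hypothetical: the same register of all three voices is adjacent."""
    return SID_BASE + VOICES * reg + voice


if __name__ == "__main__":
    for reg in range(REGS_PER_VOICE):
        real = [f"${real_addr(v, reg):04X}" for v in range(VOICES)]
        alt = [f"${interleaved_addr(v, reg):04X}" for v in range(VOICES)]
        print(f"reg {reg}: real {real}   interleaved {alt}")
```

With the interleaved mapping a per-voice loop could index every register with X = 2, 1, 0 (for example STA $d400+3*reg,X) and end on a plain DEX / BPL.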
2021-05-02 15:48
chatGPZ

Registered: Dec 2001
Posts: 11108
Quote:
parallel drive interface (we know it was compatibility decision, not cost decision)

it was a pure cost decision, the cable and connectors for ieee488 cost a small fortune back then.
2021-05-02 18:42
ChristopherJam

Registered: Aug 2004
Posts: 1378
Eh, I've never found the SID register stride that big a deal - you just interleave all your voice state at strides of seven as well. Especially now we know we can
TXA
AXS #7

at loop end :)
2021-05-02 18:55
Krill

Registered: Apr 2002
Posts: 2839
Quoting ChristopherJam
Eh, I've never found the SID register stride that big a deal - you just interleave all your voice state at strides of seven as well. Especially now we know we can
TXA
AXS #7

at loop end :)
Not a big deal, but may cost precious bytes and/or cycles. =D While rearranging them would have no downsides, afaict right now.
2021-05-02 21:33
Oswald

Registered: Apr 2002
Posts: 5017
"You mean changing multicolour char mode, which renders hires chars for colours 0-7, so it would always put out multicolour chars?"

yes :) but not changing the original mode, which is cool as it is, but offer a 2nd multicolor mode which does this.
2021-05-02 21:42
ws

Registered: Apr 2012
Posts: 228
If, let's say, the screen ram address were set, and the color ram address were always set accordingly to a mirrored/inverted high address of that value, like for example:
set screen ram to $0400, color ram would be at $8400,
or if you set screen ram to $8400, color ram would be at $0400 (explanation to make the "mirror" point clear here), thus being able to leave the color ram chips away, wouldn't that actually have reduced cost?
sacrificing just 1000 bytes - while actually, with screen color being read from normal ram, both nybbles could have been used? And even if you didn't change the bus so all 8 bits would be read (i understand that would be necessary?) - why did they add actual ram for the colors in the first place? Just to add 1000 bytes?
2021-05-02 22:03
tlr

Registered: Sep 2003
Posts: 1714
Quote: If, let's say, the screen ram address were set, and the color ram address were always set accordingly to a mirrored/inverted high address of that value, like for example:
set screen ram to $0400, color ram would be at $8400,
or if you set screen ram to $8400, color ram would be at $0400 (explanation to make the "mirror" point clear here), thus being able to leave the color ram chips away, wouldn't that actually have reduced cost?
sacrificing just 1000 bytes - while actually, with screen color being read from normal ram, both nybbles could have been used? And even if you didn't change the bus so all 8 bits would be read (i understand that would be necessary?) - why did they add actual ram for the colors in the first place? Just to add 1000 bytes?


The reason is memory bandwidth. The color ram has a separate 4-bit data bus. If you were to read the color data from the main ram instead, then you'd need to find 40 more cycles in every bad line.
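The arithmetic behind this can be sketched roughly as follows. The numbers assume a PAL machine with 63 cycles per raster line and ignore details such as the BA lead time and sprite fetches, so treat it as a back-of-the-envelope estimate only.

```python
CYCLES_PER_LINE = 63   # PAL VIC-II (6569); NTSC chips have 64 or 65
CHARS_PER_ROW = 40


def cpu_cycles_left(extra_fetches_per_char=0):
    """Rough count of cycles left on a bad line after the matrix fetches.

    extra_fetches_per_char=0 models the real machine, where the colour
    nybble arrives on its own 4-bit bus during the screen code fetch.
    extra_fetches_per_char=1 models colours fetched from main RAM.
    """
    stolen = CHARS_PER_ROW * (1 + extra_fetches_per_char)
    return CYCLES_PER_LINE - stolen


if __name__ == "__main__":
    print("real machine:       ", cpu_cycles_left(0))
    print("colours in main RAM:", cpu_cycles_left(1))
```

A negative result for the second case means the extra fetches would not fit into a single raster line at all.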
2021-05-02 22:26
ws

Registered: Apr 2012
Posts: 228
@tlr: ah, i understand. thank you for explaining!
2021-05-03 08:43
Oswald

Registered: Apr 2002
Posts: 5017
one more pla and one more color ram chip could solve the problem, anyway I dont think we would get much boost from double bufferable color ram. the best scrolling games which milk every performance there is are good enough.
2021-05-03 09:14
tlr

Registered: Sep 2003
Posts: 1714
For the record: the plus4 has color data fetched from the main ram. This is solved by having two bad lines each char row. I guess you wouldn't want that for the c64? :)
2021-05-03 09:22
cadaver

Registered: Feb 2002
Posts: 1153
Oswald: Right, double buffer still means you lose the CPU cycles, and in most cases you can race the raster or split the update in halves. Selective non-update FTW :)
2021-05-03 12:12
Jammer

Registered: Nov 2002
Posts: 1289
Quoting cadaver
Oswald: Right, double buffer still means you lose the CPU cycles, and in most cases you can race the raster or split the update in halves. Selective non-update FTW :)


Yeah. Too bad we can't do anything like this with dedicated sample channel which would handle playback timing internally ;)
2021-05-03 12:49
ChristopherJam

Registered: Aug 2004
Posts: 1378
VSP fix... and a bit that lets you put the sprite pointers at (eg) end of bank regardless of where the screen is, so that AGSP routines don't scroll the sprite pointers onto the visible area.
2021-05-03 12:54
chatGPZ

Registered: Dec 2001
Posts: 11108
Regarding sprite pointers i always wondered how they ended up in regular RAM - you'd think those are registers.
2021-05-03 13:10
tlr

Registered: Sep 2003
Posts: 1714
Quote: Regarding sprite pointers i always wondered how they ended up in regular RAM - you'd think those are registers.

I guess that would have taken up more die space. They are only needed when the sprite data is fetched, and there are 4 read slots for each sprite anyway. Packing it to 3 slots/sprite would require more logic too.

It's a good thing the pointers can be switched with single writes, we should all be happy!
2021-05-03 13:21
chatGPZ

Registered: Dec 2001
Posts: 11108
Quote:
They are only needed when the sprite data is fetched

That could be said about a couple more things with sprites though, so why one ended up in RAM but not the other? :) It DOES make sense though if you think about double buffering the screen content... so who knows
2021-05-03 14:00
Copyfault

Registered: Dec 2001
Posts: 466
Maybe someone already wrote this, so sorry in case I missed it: having all the memory-changing opcodes with index in play writing to the non-fixed hi-byte address first and then writing to the correct address in the following cycle would really be nice ;) (thinking of INC abs,X; STA abs,X and the like)
2021-05-03 14:24
Krill

Registered: Apr 2002
Posts: 2839
Quoting Copyfault
Maybe someone already wrote this, so sorry in case I missed it: having all the memory-changing opcodes with index in play writing to the non-fixed hi-byte address first and then writing to the correct address in the following cycle would really be nice ;) (thinking of INC abs,X; STA abs,X and the like)
Er... you mean writing twice, to two different memory locations on page boundary crossing?
2021-05-03 14:27
chatGPZ

Registered: Dec 2001
Posts: 11108
How would that be nice and not just be a major WTF?
2021-05-03 14:48
JackAsser

Registered: Jun 2002
Posts: 1989
Route CHAREN, HIRAM and LORAM signals to the expansion bus. If you don't want a wider connector (more expensive carts) then just replace one of the +5V and 2 of the 4 GNDs.
2021-05-03 14:49
Krill

Registered: Apr 2002
Posts: 2839
I guess any changes in the CPU behaviour would have been anything but cheap, though. Picking another existing CPU, ok, if produced in-house at MOS. :)

Quoting Groepaz
Quote:
parallel drive interface (we know it was compatibility decision, not cost decision)

it was a pure cost decision, the cable and connectors for ieee488 cost a small fortune back then.
Yes, and with that fast serial failure...

Would have been nice to have a different layout in drive-side $1800 to write to the serial bus.
E.g., with DATA and CLK out on, say, bits 4 and 0, one could bitbang out a byte in just 4+3*6 cycles (not 4+3*8) via STA $1800 : LSR : STA $1800 etc.
And if not that, having ATNA and CLK out on bits 1 and 0 would have been the next best thing. =)
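To check that a single shift between writes really covers all eight bits, here is a small Python model of the suggested layout. The bit positions 4 and 0 are the hypothetical ones from the post, the cycle counts are the standard 6502 timings for STA absolute and LSR A, and line inversion on the real bus is ignored.

```python
DATA_BIT = 4   # hypothetical position of DATA out in $1800
CLK_BIT = 0    # hypothetical position of CLK out in $1800

STA_ABS = 4    # cycles for STA $1800
LSR_A = 2      # cycles for LSR on the accumulator


def send_byte(value):
    """Model STA : LSR : STA : LSR : STA : LSR : STA.

    Returns the (data, clk) pairs seen on the bus and the cycles used.
    """
    a = value & 0xFF
    pairs = []
    cycles = 0
    for i in range(4):
        if i:
            a >>= 1
            cycles += LSR_A
        pairs.append(((a >> DATA_BIT) & 1, (a >> CLK_BIT) & 1))
        cycles += STA_ABS
    return pairs, cycles


def receive_byte(pairs):
    """Rebuild the byte: write i carries original bit i on CLK, bit i+4 on DATA."""
    value = 0
    for i, (data, clk) in enumerate(pairs):
        value |= clk << i
        value |= data << (i + 4)
    return value


if __name__ == "__main__":
    pairs, cycles = send_byte(0xA7)
    print(pairs, cycles, hex(receive_byte(pairs)))
```

Each write exposes one bit from the low nybble and one from the high nybble, so four writes with three shifts in between transfer the whole byte in 4+3*6 = 22 cycles.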
2021-05-03 14:55
ChristopherJam

Registered: Aug 2004
Posts: 1378
Quoting Groepaz
How would that be nice and not just be a major WTF?

You could clear RAM at a hair over 2.5 cycles per byte! Epic!
2021-05-03 14:57
Shadow
Account closed

Registered: Apr 2002
Posts: 355
Quote: A simpler way to sync the CPU to the VIC-II. Either via some kind of halt facility, or at least having an up-counting timer to measure how far from the time of IRQ assertion we are.

Yeah, this one for sure! So convenient when coding on the Atari 8-bit machines or the VCS/2600 for that matter to just do a STA WSYNC - and you are in perfect sync again.
Cycle/raster-exact code on the C64 is such a nightmare in comparison. First you have to use some convoluted methods with double IRQs or whatnot to actually get in sync, and then it's a struggle to actually keep it once you start having badlines and sprites etc.
That's why, when someone suggests about anything I code "But couldn't thing X be in the sideborder?", my answer is always a resolute "NO!" :)
2021-05-03 15:07
Krill

Registered: Apr 2002
Posts: 2839
Quoting Shadow
That's why, when someone suggests about anything I code "But couldn't thing X be in the sideborder?", my answer is always a resolute "NO!" :)
Too lame or lazy, in other words. :)

I think setting up a CIA timer counting 63 cycles once, then querying that after VIC raster interrupts and having a little delay slide isn't really that much hassle. Especially since you can easily re-use that piece of code.
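The idea can be modelled in a few lines of Python. This is only a cycle-level sketch under assumed numbers (a 63-cycle PAL line and at most 7 cycles of interrupt jitter), not actual C64 code; on the real machine the measured value would typically select an entry point into a slide of short instructions.

```python
LINE_CYCLES = 63   # PAL line length, matching the timer period in the post
MAX_JITTER = 7     # assumed worst-case interrupt latency in cycles


def timer_value(cycle):
    """Continuously running down-counter with a period of one raster line."""
    return (LINE_CYCLES - 1) - (cycle % LINE_CYCLES)


def stabilise(irq_cycle, jitter):
    """Return the cycle at which code continues after the delay slide."""
    read_cycle = irq_cycle + jitter            # handler reads the timer late
    expected = timer_value(irq_cycle)          # value seen with zero jitter
    measured = (expected - timer_value(read_cycle)) % LINE_CYCLES
    delay = MAX_JITTER - measured              # the later we are, the less we wait
    return read_cycle + delay


if __name__ == "__main__":
    for j in range(MAX_JITTER + 1):
        print(f"jitter {j}: continue at cycle {stabilise(1000, j)}")
```

Whatever the jitter, execution continues at the same cycle relative to the raster line, which is all the stabiliser needs to achieve.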
2021-05-03 15:41
Dwangi

Registered: Dec 2001
Posts: 129
Maybe not an answer to the original question.

But I miss the Z-register.
2021-05-03 15:53
Copyfault

Registered: Dec 2001
Posts: 466
Another what-if idea: grouping CSEL (not RSEL, mind!) and the YSCROLL-bits together in one VIC-II-control register would sometimes help to do badline- and sideborder-stuff in one go.
2021-05-03 16:19
Krill

Registered: Apr 2002
Posts: 2839
Quoting Copyfault
Another what-if idea: grouping CSEL (not RSEL, mind!) and the YSCROLL-bits together in one VIC-II-control register would sometimes help to do badline- and sideborder-stuff in one go.
Just have a global border-disable bit as Oswald suggested, and be done with it. :)
2021-05-03 17:19
Copyfault

Registered: Dec 2001
Posts: 466
Quoting Krill
Quoting Copyfault
Another what-if idea: grouping CSEL (not RSEL, mind!) and the YSCROLL-bits together in one VIC-II-control register would sometimes help to do badline- and sideborder-stuff in one go.
Just have a global border-disable bit as Oswald suggested, and be done with it. :)
Global border-disable flag is maybe the most steroidal what-if-scenario :)

But grouping CSEL and YSCROLL is not the same and may offer other things, like: opening the sideborder and repeating the textline (and suppressing badlines in the following) \o/
2021-05-03 17:23
Krill

Registered: Apr 2002
Posts: 2839
Badline-disable bit, then! =)
2021-05-03 17:58
chatGPZ

Registered: Dec 2001
Posts: 11108
Hungarians suggesting global border disable. Mind: blown
2021-05-03 20:54
Count Zero

Registered: Jan 2003
Posts: 1821
Krill is already on a maker track and just collecting further suggestions? Much here sounds like you guys want some Amiga500 at 1MHz ... or a DTV or so? :)
2021-05-04 12:40
Codetsu

Registered: Feb 2017
Posts: 3
Yep, the DTV has all that and more.
No dreams, it's here, ready to code.
2021-05-04 13:12
Krill

Registered: Apr 2002
Posts: 2839
Quoting Count Zero
Krill is already on a maker track and just collecting further suggestions? Much here sounds like you guys want some Amiga500 at 1MHz ... or a DTV or so? :)
No, this is just idle musing.

Retro ≠ Vintage,
DTV ≠ C-64.
2021-05-04 19:20
Slammer

Registered: Feb 2004
Posts: 416
Quote: Agreed on deinterleaving the sprite position registers

Re-arranging the pulse width bits on SID to put the eight high bits in a single register would have saved a fair bit of code - most tunes could happily leave the low four bits untouched during playback.

Putting the lowest bit of sprite X position into a separate register instead of the highest would likely have resulted in a lot of routines that left the LSB zero, but would make full screen sprite positioning a hell of a lot saner if you could live with the slightly coarser movement.

Alternately, instead of separate registers for sprite x-msb, sprite priority, sprite x/y expand, sprite MCM, have a mode register for each sprite that combines those five attributes (and probably the collision flags too).

Soooo much saner for multiplexers that actually change modes.

Allow disabling of the character set images in one of the memory maps (surely just a change in PLA programming).

Bugfix stx $xxxx,y and sty $xxxx,x


I understand the logic behind the 'one control register for each sprite' but I would miss easy sprite stretching.
2021-06-07 23:11
dyme

Registered: Nov 2018
Posts: 14
Gravedigging, but one easy change rarely gets mentioned:

It would have been fairly easy to enable changing the hi-byte of the memory bus for accesses to the zeropage (and stack) by writing it into a fixed memory address akin to $01, or maybe some cia register.

Being able to treat every single 256-byte block as zeropage would bring such a performance boost, not only because of the 3 instead of 4 cycles per lda/sta, but also because being able to stx addr,y and sty addr,x saves a lot of swapping registers through memory. Also SO MUCH more zeropage-pointers for zeropage indirect addressing.

Also chaining / interleaving functions by stack combined with switching the stack page midchain could... ah... I don't know, something awesome, I guess.
2021-06-08 02:42
Deev

Registered: Feb 2002
Posts: 206
Quote: "You mean changing multicolour char mode, which renders hires chars for colours 0-7, so it would always put out multicolour chars?"

yes :) but not changing the original mode, which is cool as it is, but offer a 2nd multicolor mode which does this.


A very easy improvement would just be to switch:

Light-red/red
Mid-grey/purple
Light-blue/blue

When pixelling char graphics, this would allow for a fixed light colour and a fixed dark colour, with a choice of light-red, mid-grey, light-blue, green and (possibly) cyan in between.

I haven't thought about this for too long, so there could be some downsides, but a lot of existing games I have looked at wouldn't need to change much (if at all), whilst this selection of changeable colours also allows for greater flexibility.

You could achieve a look similar to Creatures without using colour splits and you could still use hires chars.
2021-06-08 08:45
ChristopherJam

Registered: Aug 2004
Posts: 1378
A VIC flag that enables each byte of pixel data read (regardless of whether it's from a bitmap or a charset) being XORed with the previous value instead of replacing it.

"previous value" reset to zero at start of each scanline.

Only 8 bits of additional internal VIC state (even then only if the current implementation destructively shifts out the bits read during pixel output), and free EOR fill for left to right polygon plotters.
2021-06-08 09:24
Krill

Registered: Apr 2002
Posts: 2839
Quoting ChristopherJam
A VIC flag that enables each byte of pixel data read (regardless of whether it's from a bitmap or a charset) being XORed with the previous value instead of replacing it.

"previous value" reset to zero at start of each scanline.

Only 8 bits of additional internal VIC state (even then only if the current implementation destructively shifts out the bits read during pixel output), and free EOR fill for left to right polygon plotters.
Nice idea, but mustn't the XOR be bit-wise for a left-to-right filler to work?

No idea how the pixel pipeline works internally, but it sounds like it could be done rather easily with a few more gates. Just that they thought of these kinds of things only a little later, in full-blown blitters. :)
2021-06-08 10:38
wil

Registered: Jan 2019
Posts: 42
Guys, if you plan to go back in time to add improvements to the C64 *please* be careful!

Remember what happened last time when we went back in time to improve C128's VDC. We wanted to add a blitter function instead of the limited hardware sprites and a separate memory to avoid cycle stealing. And we ended up losing the sprites for good and the separate memory is damn slow to access now...
2021-06-08 12:04
Oswald

Registered: Apr 2002
Posts: 5017
that polygon filler will never work, as VICII never reads bytes continuously nor vertically nor horizontally
2021-06-08 12:05
Krill

Registered: Apr 2002
Posts: 2839
Quoting Oswald
that polygon filler will never work, as VICII never reads bytes continuously nor vertically nor horizontally
Pixels are output one by one along a scan.
2021-06-08 12:30
Oswald

Registered: Apr 2002
Posts: 5017
Quote: Quoting Oswald
that polygon filler will never work, as VICII never reads bytes continuously nor vertically nor horizontally
Pixels are output one by one along a scan.


my bad, haven't read CJ's suggestion fully... edit: or did I ? CJ talks about eoring bytes, however eoring horizontally per pixel as you suggest would work.

then just max 2 extra onboard bits for the eor mechanism or 1 for hires, should be doable with a few dozen transistors :) or less
2021-06-08 13:01
ChristopherJam

Registered: Aug 2004
Posts: 1378
Quoting wil
Remember what happened last time when we went back in time to improve C128's VDC. We wanted to add a blitter function instead of the limited hardware sprites and a separate memory to avoid cycle stealing. And we ended up losing the sprites for good and the separate memory is damn slow to access now...


No direct access to VRAM on the Commander X16 either. Most un-c64 like.
2021-06-08 13:08
Krill

Registered: Apr 2002
Posts: 2839
Quoting ChristopherJam
No direct access to VRAM on the Commander X16 either. Most un-c64 like.
It does have auto-increment on VRAM pointers, though, so you can pump in sequential data pretty efficiently with a single store per datum.
2021-06-08 13:10
ChristopherJam

Registered: Aug 2004
Posts: 1378
Quoting Krill
but mustn't the XOR be bit-wise for a left-to-right filler to work?

Not at all! It needs to be bytewise for pattern fill to work. This is exactly what I implemented in software for the 3d renderer in Effluvium

Quoting Oswald
that polygon filler will never work, as VICII never reads bytes continuously nor vertically nor horizontally

Well, that's the point - it needs to XOR with the previous byte displayed, not the previous one in memory. The byte immediately before in memory is on the wrong line altogether in bitmap mode. And we need to do a horizontal fill to keep the cost down; a vertical fill would need a 320 byte buffer internal to VIC, and I can't see that being cheap.

(also, it's easier to do pattern fills with a horizontal fill than with a vertical one)
2021-06-08 13:23
Krill

Registered: Apr 2002
Posts: 2839
I see that it needs to be byte-wise (or word-wise) for vertical fill (with a buffer spanning the entire horizontal width), but i don't see why simple bit-wise from left to right wouldn't work for horizontal fill.

The latter i've done in software in the zoomscroller of +H2K (but of course it uses a 256-entry lookup table to be somewhat efficient).

Edit: Oh, pattern fill... now that's a more complex beast anyways, certainly not a cheap add-on. :)
2021-06-08 15:52
Oswald

Registered: Apr 2002
Posts: 5017
I don't understand how a byte-wise horizontal eor fill would work in higher res than bytes ? :)
2021-06-08 16:53
Krill

Registered: Apr 2002
Posts: 2839
Quoting Oswald
I don't understand how a byte-wise horizontal eor fill would work in higher res than bytes ? :)
Filling is performed right-to-left.
Current fill state is held in the N flag.
Pick one out of two lookup-tables according to that state bit.
Read next unfilled input into an index register.
Look up filled output.
Store filled output.
Repeat. =)

(So yeah, two 256-entry tables, actually, but they're just inverted versions of each other. Probably you could use just one with shifting the previously-stored output still in the accu, then conditional inversion of the looked-up next filled value according to carry flag.)
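The table-driven scheme described in this post might look roughly like this in Python. It is a reconstruction from the description only, not the actual +H2K routine, and all names are invented: bit 0 is treated as the rightmost pixel, and bit 7 of each filled byte plays the role of the N flag.

```python
def build_table(start_state):
    """Table of bytes after a right-to-left running XOR from start_state."""
    table = []
    for b in range(256):
        state, out = start_state, 0
        for bit in range(8):            # bit 0 is the rightmost pixel
            state ^= (b >> bit) & 1
            out |= state << bit
        table.append(out)
    return table


FILL_OFF = build_table(0)   # used while the fill state is "off"
FILL_ON = build_table(1)    # bitwise inverse of FILL_OFF


def fill_line(line):
    """Fill a list of bytes right to left, carrying the state in bit 7."""
    out = [0] * len(line)
    state = 0
    for i in range(len(line) - 1, -1, -1):
        out[i] = (FILL_ON if state else FILL_OFF)[line[i]]
        state = out[i] >> 7             # what the N flag would reflect
    return out


if __name__ == "__main__":
    edges = [0b00001000, 0b00000000, 0b00010000]
    print([f"%{b:08b}" for b in fill_line(edges)])
```

Each edge is a single set bit here; the fill switches on at the right edge and off again at the left one.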
2021-06-08 18:54
Oswald

Registered: Apr 2002
Posts: 5017
Quote: Quoting Oswald
I don't understand how a byte-wise horizontal eor fill would work in higher res than bytes ? :)
Filling is performed right-to-left.
Current fill state is held in the N flag.
Pick one out of two lookup-tables according to that state bit.
Read next unfilled input into an index register.
Look up filled output.
Store filled output.
Repeat. =)

(So yeah, two 256-entry tables, actually, but they're just inverted versions of each other. Probably you could use just one with shifting the previously-stored output still in the accu, then conditional inversion of the looked-up next filled value according to carry flag.)


I don't get how CJ's HW version would work with only an 8-bit store for the previous result byte and a byte-wise eor.
2021-06-08 20:53
Krill

Registered: Apr 2002
Posts: 2839
Quoting Oswald
I don't get how CJ's HW version would work with only an 8-bit store for the previous result byte and a byte-wise eor.
Not so sure either. While VIC does have a line buffer, it only buffers screen data worth 40 bytes (whose update you can prevent by disabling badlines).

For vertical filling, the entire width (320 pixels) would have to be buffered for XORing a line later.

Edit: Hmm well, maybe he meant repurposing the line buffer for actual bit/pixel storage. Then the additional bits would only be needed with destructive shifting-out, for buffering that shifted-out byte. But then that extra buffer could well be only one bit, i guess. :)
And colours/screen data would be lost, so... each line a badline? =)
2021-06-08 22:36
ChristopherJam

Registered: Aug 2004
Posts: 1378
Why so complicated you lot? I was pretty clear I only need one additional byte of state.
Here's an extract from the fill code from Effluvium:

rqZapLine0
	lda rqkLBuf +$00
	sta rqkCSet0+$00*64,y
	eor rqkLBuf +$01
	sta rqkCSet0+$01*64,y
	eor rqkLBuf +$02
	sta rqkCSet0+$02*64,y
	eor rqkLBuf +$03
	sta rqkCSet0+$03*64,y
	eor rqkLBuf +$04
	sta rqkCSet0+$04*64,y
	eor rqkLBuf +$05
	sta rqkCSet0+$05*64,y
	eor rqkLBuf +$06
	sta rqkCSet0+$06*64,y
	eor rqkLBuf +$07
	sta rqkCSet0+$07*64,y

Obviously in this instance I was using a contiguous eorbuffer rather than plotting into the charset itself, but you can see how few operations the fill uses.
2021-06-09 00:56
Krill

Registered: Apr 2002
Posts: 2839
Not quite immediately obvious how this translates to hardware.

Can you elaborate?
2021-06-09 07:41
Oswald

Registered: Apr 2002
Posts: 5017
so how does that fill a line horizontally ? if I have

%00001100 %00000000 %00110000

this will fill with your code to:

%00001100 %00001100 %00111100

and not to:

%00001111 %11111111 %11110000
2021-06-09 08:26
Krill

Registered: Apr 2002
Posts: 2839
Okay, so... horizontal pattern fill in hardware, with one additional state byte.

The fill itself would be simple bitwise XOR from one pixel to the next, and that additional state byte is the pattern mask, applied (XOR/AND/OR) to the output with an 8-pixel granularity?
2021-06-09 10:35
oziphantom

Registered: Oct 2014
Posts: 478
I want cart memory without kernal.
2021-06-09 11:06
ChristopherJam

Registered: Aug 2004
Posts: 1378
No, the state byte is the result of the most recent xor. Perhaps an example would help.

Start by filling buffer with this:
(base 4, using _ for 0 to make it a bit easier to read)
____ __12 12__ ____ __11 11__ ____
____ _121 2___ ____ __12 12__ ____
____ 1212 ____ ____ _111 1___ ____
____ 2121 ____ ____ _212 1___ ____
___2 121_ ____ ____ 1111 ____ ____
__21 21__ ____ ____ 1212 ____ ____

Note that for each of the two edges in the desired output, we're writing two bytes to each line; one updating the pattern in the right side of a byte, the next updating the pattern in the left.

Then, scanning from left to right, replace each input byte with the XOR of itself with the last byte output:
____ __12 1212 1212 12_3 _3_3 _3_3
____ _121 2121 2121 2133 3333 3333
____ 1212 1212 1212 13_3 _3_3 _3_3
____ 2121 2121 2121 2333 3333 3333
___2 1212 1212 1212 _3_3 _3_3 _3_3
__21 2121 2121 2121 3333 3333 3333

Now we have three regions, black on the left, a 1/2 checker in the middle, and a 03/33 stipple on the right.
Here's the eorfiller, assuming the data is in a charset
    ldx #127
loop:
    lda #0              ; fill state starts at zero for each line
    eor cset+128*0,x
    sta cset+128*0,x
    eor cset+128*1,x
    sta cset+128*1,x
    eor cset+128*2,x
    sta cset+128*2,x
    eor cset+128*3,x
    sta cset+128*3,x
    ...
    dex
    bpl loop


Widening Oswald's example a little, you do this:

%00001111 %11110000 %00000000 %00101010 %10000000

this will fill with my code to:

%00001111 %11111111 %11111111 %11010101 %01010101
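For anyone who wants to play with the examples above, here is a small Python sketch of the byte-wise scheme: each byte is replaced by the XOR of itself and the previously output byte, with the state byte reset at the start of the line. The helper names and the base-4 parsing are only conveniences for this sketch.

```python
def xor_fill(line):
    """Replace each byte with the XOR of itself and the previous output byte."""
    out = []
    prev = 0                      # state byte, reset at the start of each line
    for b in line:
        prev ^= b
        out.append(prev)
    return out


def from_base4(text):
    """Parse rows written like '__12 12__' (base-4 pixel pairs, '_' is 0)."""
    return [int(group.replace("_", "0"), 4) for group in text.split()]


def to_base4(data):
    """Format bytes back into the same base-4 notation."""
    digits = "_123"
    return " ".join(
        "".join(digits[(b >> shift) & 3] for shift in (6, 4, 2, 0))
        for b in data
    )


if __name__ == "__main__":
    row = "____ __12 12__ ____ __11 11__ ____"
    print(to_base4(xor_fill(from_base4(row))))
    wide = [0b00001111, 0b11110000, 0b00000000, 0b00101010, 0b10000000]
    print(" ".join(f"%{b:08b}" for b in xor_fill(wide)))
```

Running it on the first row of the base-4 buffer and on the widened binary example gives the same filled rows as listed in the post.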
2021-06-09 12:43
Oswald

Registered: Apr 2002
Posts: 5017
this is fuckin genius man, and why is this better for Effluvium than vertical eorfill ?
2021-06-09 12:44
Oswald

Registered: Apr 2002
Posts: 5017
but there is extra overhead of having to make 2 bytes per edge, so why not just per pixel horizontal HW fill ?
2021-06-09 13:08
ChristopherJam

Registered: Aug 2004
Posts: 1378
Cheers, Oswald.

Well, for vertical fill you still need two writes if you're going to get a pattern fill - and tracking the pattern phase for that one hurt my brain too much haha.

Per pixel HW fill would be fine if you just want solid colour, but why not go the extra six bits and get patterns :)

The extra cost of the second write is fairly low compared to calculating the X offset anyway.
All times are CET.