Good Apple 3   [2015]

Released by :
Sokrates

Release Date :
16 February 2015

Type :
C64 One-File Demo

User rating: awaiting 8 votes (6 left)

Credits :
Code .... Sokrates of The Tuneful Eight

Download :

Look for downloads on external sites:
 Pokefinder.org


User Comment
Submitted by Jak T Rip on 4 March 2015
Very nice, considering you haven't done much C64 code at all yet!
User Comment
Submitted by Sokrates on 26 February 2015
>To me it seems that the compression implementation stands out way
>more than the actual (de)compression.
Yes, that's the tricky part. For reference: the compression calculation
for the 70 images shown takes 4.5 seconds on a standard platform
(an Intel-based laptop).

>0-time is also a bit misleading: the screen data still has to be moved.
This is a special case because the C64 can switch its screen buffer
to any 0x0400-aligned address. On other computers you can't do that. So the fastest
animation would be: keep <64K of images in memory, then switch between
screen buffers. But for me, moving data is not part of decompression
any more. It's only used to make the already decompressed data visible.

What I have learned here:
* This idea is (still) used quite often
* Other people have also made implementations for automatic calculation
So currently I still think I was the first one, but of course that could
just be due to missing information.

For (some of) you it might be interesting that:
* this can be calculated automatically
* if you want to write your own implementation, you now know what to do

>Maybe use it on a large scroller font rather? :)
For my "long term" project I get a reduction from 86 KB to 15.5 KB.
So the results strongly depend on the given data.
User Comment
Submitted by enthusi on 25 February 2015
Ah yes! I recall :)
I guess for 70 frames there is no way to brute-force it, as you hit a googol of permutations :) To me it seems that the compression implementation stands out way more than the actual (de)compression. 0-time is also a bit misleading: the screen data still has to be moved. There are RLE methods that are even faster than the fastest lda-sta loop. Maybe use it on a large scroller font rather? :)
User Comment
Submitted by Sokrates on 25 February 2015
My first version was made in 2012. The current version is MUCH
faster in calculation time than the original one. Btw: I shared a
(refactored) version with you, enthusi, back in 2013.

>Thomas Jentzsch i.e. implemented something smarter here:
>http://atariage.com/forums/blog/144/entry-10014-optimize-data-o..
Thanks, good link! He used the greedy algorithm.

Since I found no papers about this: if anyone has additional
information about similar work, can you please share it?

> And PAL was making a joke ;-)
How could I know? I don't speak any Norwegian... :-)
User Comment
Submitted by Stone on 24 February 2015
@Sokrates: I don't modify the tables, they are used as-is. The table optimization was done by a brute force method. And PAL was making a joke ;-)
User Comment
Submitted by enthusi on 24 February 2015
The pic I showed is ptoing's work, done by eye. However, I wrote a brute-force code in python that was indeed not TOO fast and came up with a similar result :)
Thomas Jentzsch, for example, implemented something smarter here:
http://atariage.com/forums/blog/144/entry-10014-optimize-data-o..
So this is quite interesting for image-data in a way. Not so much for animations I think, though.
User Comment
Submitted by Sokrates on 24 February 2015
The idea is still used very often, which is great!
But how do you calculate your data? I would guess:
by looking at it closely and then compressing the data by hand.

>http://en.wikipedia.org/wiki/Shortest_common_supersequence
Good one! To be more precise, the problem is very closely related
to the shortest common superstring problem (not identical, since
additional optimizations can be done) and can be solved by brute
force or by approximation (I guess this was your point, Krill).

>But with just 70 fragments to shuffle around, easy to brute-force.
But of course you know that the computation time for this kind of
problem grows pretty fast with the amount of input data? :-)
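
For illustration, the approximation route could look roughly like this greedy merge (a simplified Python sketch with made-up names, not the actual tool used for this release; it ignores the case where one frame is fully contained inside another):

# Greedy shortest-common-superstring approximation (illustrative sketch only).
# Frames are byte strings; the result contains every frame as a contiguous
# substring. Runtime is far from optimal, but it shows the idea.

def overlap(a: bytes, b: bytes) -> int:
    # length of the longest suffix of a that is also a prefix of b
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def greedy_superstring(frames):
    parts = list(frames)
    while len(parts) > 1:
        # merge the pair with the largest pairwise overlap
        best_k, i, j = max((overlap(a, b), i, j)
                           for i, a in enumerate(parts)
                           for j, b in enumerate(parts) if i != j)
        merged = parts[i] + parts[j][best_k:]
        parts = [p for n, p in enumerate(parts) if n not in (i, j)] + [merged]
    return parts[0]

The greedy choice is only an approximation, but it runs in polynomial time instead of blowing up factorially.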

So what's new: to my knowledge, ZERO decompression time was not
achieved with automatically calculated data before (prove me wrong!).

Assume you change your data: e.g. new scaled lookup tables for
Stone and PAL, new digits for enthusi, other instruments
for Dane. It's very simple to change things when the data can be
calculated automatically.

And maybe you get better results, e.g. by mixing all the data together
(lookup tables, graphical data, music data etc.).

@Bitbreaker: yes, it can be optimized for the graphical use case. But
then you cannot mix this data with other data types.
User Comment
Submitted by Bitbreaker on 24 February 2015
It could still be made faster and with a better ratio:
As each frame can have a maximum of 16 symbols, as already stated, only 4 bits are needed to represent a char. Thus two consecutive frames can be put together into 1000 bytes as low and high nibbles. 50% mem saved. Now have 2 charsets, where one is built so that it only displays the relevant symbols from the high nibbles, and another one that displays the symbols in the low nibbles. This way you copy 1000 bytes to your desired screen and switch $d018 to display the high nibbles. The second frame is then displayed by switching $d018 only. No copy action for that frame. So it needs half the cycles and compresses better. And even things like delta coding or your overlap scheme are not yet included.
Now make a compo about it :-)
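
A rough sketch of the data side of that scheme (plain Python with hypothetical helper names; the two custom charsets and the $d018 switching are only hinted at in the comments):

# Pack two consecutive 1000-byte frames into one block of low/high nibbles.
# Each frame may use at most 16 symbols, so 4 bits per char are enough.
# Displaying them needs two charsets: one whose glyphs depend only on the
# high nibble, one only on the low nibble, selected via $d018.

def pack_frames(frame_a, frame_b):
    assert len(frame_a) == len(frame_b) == 1000
    assert max(frame_a) < 16 and max(frame_b) < 16
    return bytes((hi << 4) | lo for hi, lo in zip(frame_a, frame_b))

def unpack_frames(packed):
    return (bytes(b >> 4 for b in packed),    # frame shown via the "high nibble" charset
            bytes(b & 0x0F for b in packed))  # frame shown via the "low nibble" charset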
User Comment
Submitted by Dane on 23 February 2015
I've used the same technique a few times to compress instrument-data in some tunes where memory has been an issue. (yeah, every byte counts..)
User Comment
Submitted by Jupp3 on 23 February 2015
I actually wrote a similar "compression" scheme for Stacken Blochen (it would seek the optimal order (maximizing overlap) for the animation frame rows).

In the end, I would "just" pack 2 pixels (depth + color) into one byte, as the number of animations wouldn't make it too big anyway...
User Comment
Submitted by enthusi on 23 February 2015
We arranged the digits for Assembloids2600 with considerable overlap:
https://www.youtube.com/watch?v=hllx7ZiAmZg
User Comment
Submitted by PAL on 23 February 2015
This is very similar to the way I put my table entries into the stack and know what to search and find down the road due to the one bit in my secret up code that is fixed in another secret up code and stored on the third. My FLD all wave routine uses this faster than that puny star wars scroller. I know it is a bit complex for you coders but enjoy the puzzle.
User Comment
Submitted by Stone on 23 February 2015
I used the same trick in my Star Wars scroller to fit all the lookup tables in memory.
User Comment
Submitted by Krill on 23 February 2015
I think i see what you did there.
As most screens begin and end with runs of the same bytes, you reordered all the unpacked screens so that the overlap of adjacent screens is maximized, thus minimizing the overall size while being able to copy individual screens without further decompression.
This can be considered a form of compression, but it's still nothing new: http://en.wikipedia.org/wiki/Shortest_common_supersequence
And that problem is not only NP-hard, it's NP-complete. But with just 70 fragments to shuffle around, easy to brute-force.
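
For a handful of fragments, such a brute-force search might look like this (illustrative Python sketch with made-up names; with 70 frames the factorial number of orderings makes it hopeless, as pointed out elsewhere in this thread):

# Exhaustive search over frame orderings, scoring each order by the packed
# size when adjacent frames are overlapped maximally. Factorial growth:
# only usable for very few fragments.
from itertools import permutations

def overlap(a: bytes, b: bytes) -> int:
    # length of the longest suffix of a that is also a prefix of b
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def packed_size(order) -> int:
    total = len(order[0])
    for prev, cur in zip(order, order[1:]):
        total += len(cur) - overlap(prev, cur)
    return total

def best_order(frames):
    return min(permutations(frames), key=packed_size)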
User Comment
Submitted by Sokrates on 23 February 2015
>With that definition all those crunched executables on the C-64 unpack in zero time as well,
>as the programs only start after decompression.
I guess this is a misunderstanding. AFTER loading the file "goodapple3.prg" and BEFORE
starting it with the first SYS call, the data is already uncompressed in memory.

>And i still don't see what new invention your algorithm implements :)
Err - zero decompression time?

>To be honest, i haven't bothered checking the actual compression scheme,
I can guess that from your comments :-)

>but i suspect it's LZ-related (as optimal basic RLE compression isn't NP-hard).
No, completely different to that.
User Comment
Submitted by Krill on 23 February 2015
With that definition all those crunched executables on the C-64 unpack in zero time as well, as the programs only start after decompression.
And i still don't see what new invention your algorithm implements :)
To be honest, i haven't bothered checking the actual compression scheme, but i suspect it's LZ-related (as optimal basic RLE compression isn't NP-hard).
User Comment
Submitted by Sokrates on 23 February 2015
>Not sure what you're trying to prove here.
Zero decompression time.

>The decompression part isn't a new idea, as far as i can tell,
Something similar was done in the early days, when some code and data
were "reused". That was done "by hand" for small data sizes. This here is
calculated automatically and can be applied to large data. So: where have you
seen something like this before?

>and of course uncompressing takes more than 0 time and is slower than mere copying.
Please note that the data IS already uncompressed in the memory EVEN BEFORE the program starts!

>And what part of it is NP-hard?
Finding the size-optimal data set that contains all the given data (in the same order).

>So is it a sliding window type of thing, where each frame is fully contained within the 1000-byte window?
Yes, sounds like a pretty good explanation!
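
In that framing, one way to check the property (hypothetical Python names, illustrative only) is that every original frame appears verbatim as one contiguous 1000-byte slice of the packed data:

# Each frame (a byte string) must be a contiguous 1000-byte "window" into
# the packed data, starting at its stored offset (sketch only).
def frames_are_windows(packed: bytes, frames, offsets, frame_size=1000):
    return all(packed[o:o + frame_size] == f
               for f, o in zip(frames, offsets))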
User Comment
Submitted by Stone on 23 February 2015
So is it a sliding window type of thing, where each frame is fully contained within the 1000-byte window?
User Comment
Submitted by Krill on 23 February 2015
Not sure what you're trying to prove here. The decompression part isn't a new idea, as far as i can tell, and of course uncompressing takes more than 0 time and is slower than mere copying.

And what part of it is NP-hard? Finding the optimal compressed representation of any given string, using your algorithm?
User Comment
Submitted by Sokrates on 23 February 2015
>what's the difference between your compression and simply cutting the "empty parts" from the top and bottom of each frame?
My compression is lossless: no data is changed, removed, or reordered.
User Comment
Submitted by Flavioweb on 23 February 2015
I'm not trying to be funny... just trying to understand...
...but... in this case, what's the difference between your compression and simply cutting the "empty parts" from the top and bottom of each frame?
User Comment
Submitted by Sokrates on 23 February 2015
>rocket science...
Well, maybe not a difficult problem for you, Bitbreaker :-)

For the rest of us it's NP-hard:
http://en.wikipedia.org/wiki/NP-hard
User Comment
Submitted by Sokrates on 23 February 2015
>all it does is letting the screen overlap in mem regarding the empty area mostly.
Phew, that went pretty fast from "impossible" to "all it does" :-)
Please note that this method works for any array types, sizes, and
dimensions.

>instant 50% size & double speed win.
The visual part was just an example. Of course a lot of optimization
can be applied. Actually, the compression ratio of this example is
not a good one compared to other use cases. I just liked the idea
of having some visual feedback.
User Comment
Submitted by Oswald on 23 February 2015
this type of mode animation could be sped up by emulating 2-char-high chars: double each char row and use a different cset for top/bottom. then with one sta, 2 rows can be updated :P instant 50% size & double speed win. guess you don't have 'packed nibbles'.
User Comment
Submitted by Bitbreaker on 23 February 2015
Had a peek into the code; all it does is let the screens overlap in memory, mostly in the empty areas. So that is where the bytes are saved, rocket science... m( No idea why the loop for setting the source pointers is unrolled, when some simple tables would suffice.
User Comment
Submitted by Sokrates on 23 February 2015
>...but, above all, why do you copy from memory to screen when it would be enough to change some pointers instead?

The addresses of the images are not multiples of 0x0400. To give visual feedback I have to copy them to the screen memory.
User Comment
Submitted by Sokrates on 23 February 2015
>but not even that, its impossible to have 70 displayable screens in memory.
I take your disbelief as a great compliment :-)
Thanks for that!

>so how do you fit 70x1000 uncompressed bytes into 65536 ?
No, not in 65536 bytes, in 42325 bytes :-)

That is exactly the point of this new method. The code is
straightforward; it can be disassembled to see that this
is no fake.

As I said before: if someone has an interesting use case
(a lot of read-only arrays and running out of memory) I can
do some calculations...
User Comment
Submitted by Flavioweb on 23 February 2015
...but, above all, why do you copy from memory to screen when it would be enough to change some pointers instead?
User Comment
Submitted by Oswald on 23 February 2015
so how do you fit 70x1000 uncompressed bytes into 65536? but not even that, it's impossible to have 70 displayable screens in memory.

User Comment
Submitted by Sokrates on 23 February 2015
>Why copy data a second time and not just decompress to a second buffer and switch screens via $d018?

At the start of the program every frame is already uncompressed in memory. That's how zero decompression time is achieved.

So this part
> decompress 1000 bytes
is reduced to setting a pointer to the next image. Therefore the data is only copied once.
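
Modeled in Python for illustration (made-up names, not the actual 6502 code): advancing to the next frame is only an offset change into the already uncompressed data, and the single copy is the one that makes the frame visible.

# All frames already live uncompressed, overlapping, in one blob ("packed").
# "Decompressing" the next frame is just picking its offset; the only copy
# is the one into the visible screen buffer.
FRAME_SIZE = 1000

def play(packed: bytes, offsets, screen: bytearray):
    for off in offsets:                                     # "set a pointer to the next image"
        screen[:FRAME_SIZE] = packed[off:off + FRAME_SIZE]  # display copy only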
User Comment
Submitted by Bitbreaker on 23 February 2015
> nextImageLoop:
> decompress 1000 bytes
> write 1000 bytes to screen buffer
> goto nextImageLoop

Why copy data a second time and not just decompress to a second buffer and switch screens via $d018?
User Comment
Submitted by Dr.j on 17 February 2015
Very interesting ;-)
let's go for 5-minute animations on the c64 ;-)
User Comment
Submitted by chatGPZ on 16 February 2015
reminds me of Spider Anim
User Comment
Submitted by Sokrates on 16 February 2015
This is another version of the new compression method, to clarify
some issues. I can imagine that this is getting boring for most
of you, so this one is probably the last in this series.

This version demonstrates zero decompression time.

Here is how I define when decompression is done:
original array "ao"
compressed array "ac"
decompressed array "ad"

An array is decompressed when:
|accessTime(ao[i]) - accessTime(ad[i])| = d
where "i" is any random index within the size of the array, and
d should be zero (except for minimal deviations, e.g. caused by page-crossing effects),
AND
internalRepresentation(ao) == internalRepresentation(ad)

This is quite a strict definition and of course not a common one.
But I think it fits quite well for the dedicated hardware and the
best-speed use case. Just to be complete: data is compressed when:
size(allOriginalData) > size(allCompressedData)

Pseudo code for this demo:

nextImageLoop:
decompress 1000 bytes
write 1000 bytes to screen buffer
goto nextImageLoop

The demo includes 70 frames, each frame size is 1000 bytes.
The original data of 70000 bytes is compressed in this demo
to 42325 bytes (60.46 percent of original size).

I got some feedback on how to improve the compression ratio.
Thanks for that, but please note that the goal here is NOT to
have an excellent compression ratio. The goal is to make
decompression AS FAST AS possible. The gained memory can be
used for additional code or data during the runtime of a
game/demo without any slowdown caused by data accesses.

Please also note that my decompression definition is met BEFORE an
image is written to the screen buffer. I assume in other demos this
point is only reached AFTER writing to the screen buffer.
So the large code in this demo that writes an image into the
screen buffer (using loop unrolling) is NOT part of the decompression
anymore. Its only purpose is to give some visual feedback.

Note furthermore that this method is not restricted to any particular data
type (byte, int, float, etc.). The only restriction is: the data must
be read-only.

So here we go: zero decompression time according to the given definition.