Released At: Switch On 2018
Mixed Demo Competition at Switch On 2018: #1
Code .... Algorithm of Algotech, Offence, Onslaught, svenonacid
Music .... Jammer of 1mandivision, Artstate, EXclusive ON, MultiStyle Labs, PriorArt, Samar Productions
Graphics .... Almighty God of Excess, Level 64, Onslaught, Sputnik World
Sampling .... Algorithm of Algotech, Offence, Onslaught, svenonacid
Submitted by algorithm on 10 June 2018
Ok, here are the production and technical notes for this demo. Let's start off with some questions and answers.
So, isn't using easyflash cheating?
Yes, it is to an extent, but only in regards to storage and data-retrieval speed. The CPU does not miraculously get overclocked. What you are still seeing is full-screen video decompression and sample decompression in realtime on a 1MHz C64, fitting everything into less than 1MB of storage.
So what is new compared to the previous Bananarama easyflash demo?
Internally, the Bananarama demo was just a simplified animation player pulling frames from easyflash without any additional decompression (apart from the tile lookups). Frames were decoded directly from the easyflash cart in one of two compression methods (CSAM and TileVQ 2x2). That is pretty much it. The main loop was not used, and the audio and video decompression was done inside the IRQ in chunks, with no additional threads.
This new demo has quite a few advancements. Most notable is the higher-fidelity audio, which is encoded using my most up-to-date audio encoder (SSDPCM2 V3). The sample rate is also increased by over 30%. This obviously means more CPU usage for updating the samples as well as decoding them, leaving less CPU time free (along with more complexity in the audio decode in comparison).
The internal framework is more advanced and has tasks running from the IRQ, with audio decode constantly occurring in the IRQ using fixed CPU time, as in the Bananarama demo. Decoded data is pushed to the stack, with the NMI reading the stack backwards and the stack pointer being relocated per frame.
Video decode and other CPU-intensive tasks that would take more than a frame to process are pushed from the IRQ to interrupt the main loop and be processed there. Each of these tasks can access easyflash banks separately from the other tasks.
The main loop area is usually reserved for the loader, which can load and decompress separate subsections of the demo from easyflash to RAM. Again, this is independent of the easyflash banks accessed by the other tasks and can happen at the same time without any conflicts.
In a nutshell: data is loaded and decompressed from easyflash to RAM while audio and video are being decompressed from easyflash in the background. This allows storage to be maximised.
The Bananarama demo ran at a constant 25fps. This demo, however, runs far lower in some areas.
The increased complexity of the audio decode and the higher sample rate use much more CPU time in comparison. The main killer is the NMI updates: each update takes over 30 cycles on average, including the overhead of starting the NMI. Multiply this by 216 updates per frame and that equates to over 100 rasterlines' worth of CPU time (just for playback from a rolling 256-byte buffer, not including the decompression). It was a decision I made to ensure that audio had priority. In some sections the framerate is deliberately reduced in order to give the loader more CPU time to depack the next subsection in time.
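As a sanity check on those figures (assuming the standard 63 cycles per PAL rasterline):

```python
# Rough per-frame cost of NMI sample playback, using the numbers quoted above
cycles_per_update = 30      # average cycles per NMI update, incl. NMI overhead
updates_per_frame = 216     # sample updates per 50Hz frame
cycles_per_line = 63        # cycles per PAL rasterline

total_cycles = cycles_per_update * updates_per_frame   # 6480 cycles
rasterlines = total_cycles / cycles_per_line           # ~103 rasterlines
print(rasterlines)
```

Out of roughly 312 rasterlines per PAL frame, that is about a third of the machine gone before any decompression happens.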
In order to increase the apparent frame rate of the video decode, I made the decision to write to the same buffer and to decode horizontally. This gives partial updates of the next frame mixed with the previous one (for scenes without drastic differences, it appears smoother because the new frame gradually replaces the old one).
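The idea can be sketched as follows. This is an illustrative model, not the actual decoder (which works on chars from easyflash, not pixel rows):

```python
def partial_update(screen, next_frame, start_row, rows_per_call):
    """Overwrite one horizontal band of the shared on-screen buffer with
    rows from the next frame. Repeated calls sweep down the screen, so the
    display gradually blends the old frame into the new one."""
    end = min(start_row + rows_per_call, len(next_frame))
    for row in range(start_row, end):
        screen[row] = next_frame[row][:]
    return end  # resume point for the next call
```

Each call costs only a fraction of a full-frame decode, which is what makes the motion look smoother than the true decode rate.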
Eww, the frame is bopping up and down and the colors are randomly changing. Is it broken?
There is a disclaimer at the beginning of the demo about what may cause this. If the demo is captured by some amateur and put on YouTube without heeding it, or viewed on a desktop PC with 60Hz monitor output, or even worse converted to 25/30Hz, the whole video effect in the demo is destroyed. The image will appear to move randomly up and down and left and right, and colors will appear to change randomly and look broken. The "truecolor" merging in two of the parts will be totally broken as well, as will the chromatic aberration effects in the video decode.
What is ideally recommended is a real C64 output at 50.125Hz onto a CRT. With a solid frame update there will be none of the image bopping or the random color changes. However, there will still be flicker; bear this in mind (less so if the CRT has some phosphor delay).
Now onto the more detailed description.
SAMPLE PLAYBACK AND DECODE
An optimised quad-delta per-frame waveform recreator (SSDPCM2 V3) is used. I won't go into further detail on this method (it was described in my previous C64 demo, which used it at 16khz). The original audio was split into 4-bar segments and resequenced to recreate the full audio from start to end, with all vocals and variations, in a smaller space. Each of these 4-bar patterns was additionally packed using the SSDPCM2 V3 method into a quarter of its original size.
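The SSDPCM2 bitstream itself is not described here, so the following is only a generic sketch of the "quad-delta" idea: each byte carries four 2-bit codes, and each code selects one of four step values chosen per block by the encoder (the delta table and accumulator width are assumptions):

```python
def decode_ssdpcm2_like(deltas, packed, start=0):
    # deltas: four per-block step values picked by the encoder (assumption)
    out, acc = [], start
    for byte in packed:
        for shift in (6, 4, 2, 0):        # four 2-bit codes per byte
            code = (byte >> shift) & 3
            acc = (acc + deltas[code]) & 0xFF   # 8-bit accumulator
            out.append(acc)
    return out
```

Four samples per byte is what gives the 4:1 packing mentioned above; the encoder's job is choosing the four deltas per block that minimise the reconstruction error.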
The sample rate is 11khz, just high enough to be able to hear the cymbals with less aliasing. After resequencing and packing, the whole audio takes up less than 450k of storage. This is roughly equivalent to a 16kbit/s bitrate (2 kilobytes per second).
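The arithmetic behind that bitrate (bits-per-sample is my assumption based on the 2-bit SSDPCM2 codes):

```python
sample_rate = 11_000        # Hz, approximate
bits_per_sample = 2         # SSDPCM2 code size (assumption)

raw_bytes_per_sec = sample_rate * bits_per_sample // 8   # 2750 bytes/s
# The quoted ~2 KB/s lands below this raw rate because the audio is
# resequenced from 4-bar patterns, so repeated bars are stored only once.
```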
The audio decoder uses a stack-pushing method to place decoded bytes onto the stack. This is done ahead of the NMI that is reading the stack backwards. To ensure there are no issues, the stack data for the code and other processes is relocated per frame update, so that there are no conflicts and the NMI constantly reads only actual sample data, with no corruption.
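A model of the scheme, with the caveat that the real thing is 6502 IRQ/NMI assembly and the exact pointer management here is illustrative:

```python
class StackSampleBuffer:
    """Model of the rolling 256-byte sample buffer living in the 6502
    stack page ($0100-$01FF)."""
    def __init__(self):
        self.page = bytearray(256)
        self.sp = 0xFF              # playback pointer, walks downwards

    def push_decoded(self, samples, top):
        # The decoder writes a frame's worth of samples below `top`,
        # staying ahead of the playback pointer.
        for i, s in enumerate(samples):
            self.page[(top - i) & 0xFF] = s

    def nmi_play(self):
        # Each NMI fetches one sample and steps the pointer down,
        # like PLA walking the hardware stack backwards.
        s = self.page[self.sp]
        self.sp = (self.sp - 1) & 0xFF
        return s
```

The attraction on a 6502 is that the stack pointer is a free, self-decrementing index register, so playback needs no separate pointer bookkeeping in the NMI.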
The actual output method is via d418 with a pre-filter setup, allowing 6 bits or so of output resolution. This method is not fully stable, however, in particular on C64s with an old SID chip (6581). For this reason a new SID (8580) is recommended, as it is more stable in regards to its filters - although again, there can be some variance.
Digiboost mods on the SID chip will destroy the audio quality, however; keep that in mind. There is currently no support for this. Dual-SID setups can also be an issue, due to the autodetector detecting the other chip, or external capacitors on the board affecting the quality.
THE LOADER AND FRAMEWORK
I have implemented a framework for this demo that allows multiple tasks to use different easyflash banks without interfering with each other. The state of the $01 register can also be different within each task.
The loader can depack data using either the doynax or lzwvl depackers. lzwvl was used for most of the depacking of subsection data to RAM due to its very fast speed, at the expense of worse compression.
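Both depackers are LZ-family. The actual doynax and lzwvl bitstream formats are not described here, so this sketch only shows the shared principle at the token level: literals pass through, matches re-copy bytes from earlier in the output:

```python
def lz_depack(tokens):
    # tokens: ("lit", byte) or ("match", length, dist) - illustrative only,
    # NOT the doynax or lzwvl bitstream.
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, length, dist = tok
            for _ in range(length):
                out.append(out[-dist])   # copy from `dist` bytes back
    return bytes(out)
```

The speed/ratio trade-off mentioned above comes down to how these tokens are encoded: simpler, byte-aligned codes depack faster but compress worse than tightly bit-packed ones.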
As mentioned before, the framework can push tasks to interrupt the main loop and then continue from the main loop when finished. This is useful for processing data that may take more than one frame.
VIDEO DECODE AND PLAYBACK
In order to get any form of "decent" video playback alongside all the sample decode and playback in realtime, I had to resort to the good old CSAM-compressed images. These were packed further by tiling them to either 1x2 or 2x2 chars, taking half or a quarter of the original CSAM size.
If you do not know what CSAM is: it is my video/frame encoder, which uses my own implementation of clustering via genetic encoding to allow only 256 chars to represent a whole sequence of video frames, with some other options such as masking/weighting, DCT matching, etc.
It works by creating a random population, each member being a character; scoring the fitness of each member; giving the members a lifespan based on their fitness; bringing more population into the group; mating each member with a close match; and then, based on the new fitness, assigning a lifespan plus age. This is constantly refined.
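The steps above can be sketched as a toy version. The real CSAM scores chars against whole frame sequences with masking/weighting and tracks lifespans and ages per member; everything here (fitness metric, selection scheme, parameters) is a simplification:

```python
import random

def char_error(member, tile):
    # bit differences between a candidate 8-byte char and a source tile
    return sum(bin(a ^ b).count("1") for a, b in zip(member, tile))

def fitness(member, tiles):
    # how well this char can stand in for its closest source tile
    return -min(char_error(member, t) for t in tiles)

def mate(a, b):
    # crossover: each 8-pixel row comes from one parent or the other
    return tuple(random.choice(rows) for rows in zip(a, b))

def evolve(tiles, pop_size=32, generations=40):
    pop = [tuple(random.randrange(256) for _ in range(8))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, tiles), reverse=True)
        survivors = pop[:pop_size // 2]        # short lifespans cull the unfit
        children = [mate(random.choice(survivors), random.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children             # new members join the group
    return max(pop, key=lambda m: fitness(m, tiles))
```

Run 256 times (or with a shared population), this kind of search yields a charset where every frame tile has a close stand-in among the 256 survivors.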
In the demo, all of the "background" video sequences are encoded using CSAM together with either 2x2 or 1x2 tile reduction. Frames are decoded directly from the easyflash banks.
For "post-processing", due to the limited amount of CPU time available (after audio decode and frame decode), I am relying entirely on the output of the C64 being as it should be (50.125Hz).
For this reason, there is a disclaimer at the beginning of the demo; see the answer above for everything that a bad capture or a 60Hz display will break.
For YouTube or similar video captures, it is recommended that you use a capture card and know what you are doing. If using an emulator, at least apply some post-processing to merge the frames together. I have seen some awful captures made on a low-powered PC with direct desktop capture, resulting in audio stuttering and frame freezes.
"Deblocking" is done by merely shifting d011 and d016 every second frame. Combined with different palettes per frame, this shifting has the side effect of giving an impression of chromatic changes at edges.
Some of the parts utilise triple-frame merging; with a stable 50Hz display, this merges the frames to give an impression of more colors.
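The effect relies on the viewer's eye (and CRT phosphor) averaging successive refreshes, which is why a stable 50Hz display matters. A sketch of what that averaging amounts to, treating pixels as RGB triples:

```python
def merge_frames(frames):
    # frames: equal-length lists of (r, g, b) pixels shown on successive
    # 50Hz refreshes. Three frames of restricted palettes read as one
    # frame with intermediate colors once averaged.
    n = len(frames)
    return [tuple(sum(px[c] for px in stack) // n for c in range(3))
            for stack in zip(*frames)]
```

A frame-rate conversion to 25/30Hz drops or repeats members of each triple, which is exactly why captures that ignore the disclaimer look broken.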
Most of the demo was done over a few evenings just before the start of the Switch On party. However, the audio and framework (as well as the encoding of the sequences) were done quite some time ago and only picked up again recently.
In order to simplify production of the separate subsections of the demo, I used a dispatch system where individual parts could be displayed smoothly and then hand control back to the background animation player at any point in time, with loading and attachment to the existing IRQ.
Thanks to agod and jammer for the additional data (graphics and audio for the end part)