[CSDb] - User Forums - Compotime: Show me your (vector)balls

2013-05-24 11:28

Bitbreaker

Registered: Oct 2002
Posts: 500

Compotime: Show me your (vector)balls

After several comments came up claiming that such an amiga-ball can be filled faster, i now want to call out a filler-compo for our coders.

Requirements:

The vector must be rendered in hires, background is white, foreground is dark red.

There's a raster-irq running that splits the screen at $2d and $f2 to set the background and border color to white and black, as seen in the screenshot. That means there is one charline free at the bottom, which is where the benchmark results are displayed with the system charset. Displaying the result with screencodes is enough for us coders, but hex or decimal values are okay too.

The animation will be precalculated to see the power of your filler only. Therefore a data.bin is provided that contains all animationsteps for all faces with culling etc. already done.

The data structure may be altered to your needs, but not the animation itself, obvious isn't it?

The structure of data.bin is as follows:
byte x1 | $80
byte y1
byte x2
byte y2
byte x3
byte y3
byte x4 (optional, depending on if we have a triangle or quad)
byte y4 (optional, depending on if we have a triangle or quad)

As you can see, faces can have 3 or 4 vertices. The first vertex of each face is marked with bit 7 set, which lets you determine whether a face consists of 3 or 4 vertices and gives a break-out point for a finished frame, which is marked with the value $ff. If there are further questions about the data format, don't hesitate to contact Bitbreaker.

The filling must happen fullframe and fullsize, meaning no interlacing or other cheap tricks that reduce resolution.

A counter for benchmarking must be implemented to count the frames until 256 frames have been displayed, it must be made visible in the bottom line.

The lowest value achieved counts (as there might be some jitter), for that, each entry must run in an endless loop.

The whole mem can be used, but every free byte of mem gives extra kudos.

Deadline is June 25th 0:00.

If the deadline is extended, a severe drama is expected, if not, you are out. Also i'll participate with an own entry, make a drama about it! :-)

Entries must be handed in to Bitbreaker and must not be released beforehand. They will all be released after the deadline, for maximum thrill and drama :-)

Each entry must be executable with run.

SO DO YOU HAVE THE BALLS?

2013-05-24 11:56

JackAsser

Registered: Jun 2002
Posts: 1989

<nevermind> just saw the $ff stop marker... :P

2013-05-24 11:57

Bitbreaker

Registered: Oct 2002
Posts: 500

Nope. but feel free to sum them up and add that info, as stated, you are free to change that to your needs :-)

2013-05-24 12:07

HCL

Registered: Feb 2003
Posts: 716

Haha, funny compo :). So, if i really made some good balls before (as vector bobs), that will not count.. I'll see if i can make a filler again.. no promises though :).

2013-05-24 12:18

JackAsser

Registered: Jun 2002
Posts: 1989

Bitbreaker: Did a reference renderer in Java that "almost" work... Have you verified the data? Are you sure the first x-position of a quad can't be at x=127 (i.e. a valid $ff)?

Of course I coded it wrong... but I dunno why just yet. :P Anyway, when fixed I'll upload the renderer for everyone so that the code explains the data-format.

2013-05-24 12:23

Bitbreaker

Registered: Oct 2002
Posts: 500

$7f does not occur as there's a 4 pixel safety margin around the object. If it did, my current implementation would choke hard i guess :-)

2013-05-24 12:44

Bitbreaker

Registered: Oct 2002
Posts: 500

The procedure to read in the vertices is: read in vertices until x is negative (bit 7 set). If you read in 3 vertices till then, draw a triangle else a quad. I just thought i save you a few bytes in the initial dataset :-)
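
A minimal sketch of that read loop in Java (the language of the reference renderer posted later in this thread). `FaceReader` and `readFrame` are hypothetical names, the input is assumed to be well formed (every frame ends with $ff), and the start offset is left to the caller because a .prg-style file may carry a load address in front of the data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the compo package.
final class FaceReader {
    /** Reads one frame starting at offset start; each face is returned as {x1,y1,x2,y2,...}. */
    static List<int[]> readFrame(byte[] data, int start) {
        List<int[]> faces = new ArrayList<>();
        int p = start;
        while ((data[p] & 0xff) != 0xff) {            // $ff marks the end of the frame
            int[] coords = new int[8];                // room for a quad
            int n = 0;
            do {
                coords[n++] = data[p++] & 0x7f;       // x, with the marker bit stripped
                coords[n++] = data[p++] & 0xff;       // y
                // the next face's first x and the $ff marker both have bit 7 set
            } while (n < 8 && (data[p] & 0x80) == 0);
            faces.add(Arrays.copyOf(coords, n));      // 6 values = triangle, 8 = quad
        }
        return faces;
    }

    public static void main(String[] args) {
        byte[] demo = {(byte) 0x85, 10, 20, 30, 40, 50, (byte) 0xff};
        for (int[] face : readFrame(demo, 0)) {
            System.out.println(Arrays.toString(face));
        }
    }
}
```

A face with 6 coordinate values is a triangle, one with 8 is a quad.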

2013-05-24 13:53

ChristopherJam

Registered: Aug 2004
Posts: 1378

Do entries need to render from data.bin directly, or can they be preprocessed?

If preprocessing is allowed, can it be at assembly time, or must it be after the demo has loaded?

What timezone is the deadline in?

2013-05-24 14:00

Oswald

Registered: Apr 2002
Posts: 5017

this compo rather tests line drawing speed than filling. :) I see many obvious optimizations, dont know if I'd give away if write it there, most decent coders know them already. some of these are almost on the animation side... no wonder jackie is testing it in java :)

maybe rules should be extended that frames may not deviate more than X% from some original rendering ? algorithm will make a lossy char anim of this in no time :)

2013-05-24 14:10

Bitbreaker

Registered: Oct 2002
Posts: 500

Quote: this compo rather tests line drawing speed than filling. :) I see many obvious optimizations, dont know if I'd give away if write it there, most decent coders know them already. some of these are almost on the animation side... no wonder jackie is testing it in java :)

maybe rules should be extended that frames may not deviate more than X% from some original rendering ? algorithm will make a lossy char anim of this in no time :)

data.bin can be preprocessed even before it is linked to the part; as said, it can be adapted to your needs. It's just important that faces appear at the same place as they do when bluntly using data.bin (a bit of inaccuracy is okay, as you can also see from my screenshot). Deadline is June 25th 2013 0:00 MET.
any blocky animation will be detected within no time and a serious drama will be generated upon it, be sure about that! :-) If not it will be downvoted and the drama begins!

2013-05-24 14:33

HCL

Registered: Feb 2003
Posts: 716

Oh, so it's not about doing the fastest filler, it's about getting peoples votes? That changed the field drastically :P.

2013-05-24 14:46

ChristopherJam

Registered: Aug 2004
Posts: 1378

Oops, I missed that line; my bad. Thanks for the clarifications, on both counts.

2013-05-24 15:25

ChristopherJam

Registered: Aug 2004
Posts: 1378

Are we permitted to plot into a 16x16 grid of characters, or must it be in standard bitmap mode?

2013-05-24 15:58

enthusi

Registered: May 2004
Posts: 675

So the code is supposed to handle exactly those vertices in whatever format? Preprocessing should not include data as i.e. horizontal on/offs?

2013-05-24 17:24

Bitbreaker

Registered: Oct 2002
Posts: 500

you are free to choose any mode, and yes, it fits into a 16x16 charfield for maybe a good reason :-) Actually i don't want to restrict too much and keep space for playing around :-)
And yes, you may create any new data from the data.bin that suits your needs better. You'll understand that just dividing all numbers by two might be a bad idea with regard to the resulting picture, but feel free to build any new dataset from it, e.g. swap x/y, change clockwiseness or whatever, or save more bytes if you feel like it, as long as the outcome on the screen is the same as when using the original data. Of course there's no need then to include the original data.bin if you found your own, better format. Also you can preprocess the data with whatever you want; it need not be the task of your .prg to do so, but of course it can be, if you got BALLS :-)

2013-05-24 20:25

JackAsser

Registered: Jun 2002
Posts: 1989

To preview the data you may use this piece of java-code which reads data.bin directly:

import java.awt.*;
import java.io.*;

public class Reference extends Frame {
  private static final int ZOOM = 4;
  private byte data[];
  private int dataPointer = 2;

  private int next() {
    int r = ((int)data[dataPointer++])&0xff;
    if (dataPointer >= data.length)
      dataPointer=2;
    return r;
  }

  private int peek() {
    return ((int)data[dataPointer])&0xff;   
  }

  public Reference() {
    super("Bitbreaker's filler compo");
    
    try {
      File f = new File("data.bin");
      FileInputStream fis = new FileInputStream(f);
      data = new byte[(int)f.length()];
      fis.read(data);
      fis.close();
    }
    catch (IOException e) {
      e.printStackTrace();
      System.exit(1);
    }

    setLayout(new BorderLayout());
    final Panel p = new Panel() {
      public void paint(Graphics g) {
        g.setColor(Color.WHITE);
        g.fillRect(0,0,getWidth(),getHeight());
        g.setColor(Color.RED);
        
        int x[] = new int[4];
        int y[] = new int[4];

        while(peek()!=0xff) {
          int nbrVertices = 0;
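          do {
            // Per the format in the first post, x comes first (bit 7 marks
            // the first vertex of a face), followed by y.
            x[nbrVertices] = (next() & 0x7f)*ZOOM;
            y[nbrVertices] = (next())*ZOOM;
            nbrVertices++;
          } while ((peek()&0x80)==0);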
          g.fillPolygon(x, y, nbrVertices);
        }
        next();
      }

      public Dimension getPreferredSize() {
        return new Dimension(128*ZOOM, 128*ZOOM);
      }
    };
    add(p, BorderLayout.CENTER);
    pack();
    setResizable(false);
    setVisible(true);

    Thread t = new Thread() {
      public void run() {
        while(true) {
          try {
            p.repaint();
            Thread.sleep(1000/50);
          }
          catch (InterruptedException e) {
          }
        }
      }
    };
    t.start();
  }

  public static void main(String[] args) {
    new Reference();
  }
}

2013-05-24 21:55

chatGPZ

Registered: Dec 2001
Posts: 11108

not very interested in coding hires fillers atm - but i might use this dataset as input for my mod converter and make a sequel of It's All Your Fault with it =P

2013-05-25 09:26

Bitbreaker

Registered: Oct 2002
Posts: 500

I can also provide oneway encrypted datajunk i created when fiddling around with crunchers like doynax :-)

2013-05-25 10:15

Burglar

Registered: Dec 2004
Posts: 1031

so this will result in many compo fillers ;)

nice idea, too bad I wouldnt know where to start writing a superfast filler :/

2013-05-25 10:43

Cruzer

Registered: Dec 2001
Posts: 1048

Too short deadline for me I'm afraid, but I'll release a version later that beats you all. :)

2013-05-25 13:26

Mixer

Registered: Apr 2008
Posts: 422

Make the ball actually round, while you're all at it :)

2013-05-26 20:30

Skate

Registered: Jul 2003
Posts: 490

Do you mind if i use SuperCPU? :D

I'm so busy these days but I'd really like to attend that compo. No promises but i may join the fun.

2013-05-28 23:08

Cruzer

Registered: Dec 2001
Posts: 1048

Decided to give it a shot anyway, and I already got my first unoptimized version up'n'running. First question - why is it bigger than in your original demo? Wasn't this supposed to be a compo about making a faster version of the exact same ball?

2013-05-28 23:11

chatGPZ

Registered: Dec 2001
Posts: 11108

the original demo isnt an animation player :)

2013-05-29 06:30

HCL

Registered: Feb 2003
Posts: 716

Hmm.. also got an unoptimized version running.. 7-8 frames at worst, but then the lines and fill/clear code are looped. Never mind.. the setup feels a bit strange though.. I wonder if those $5000 bytes of animation will rule out optimizations that would be possible if it was real-time.. because just as Cruzer said, i thought we were competing against the original, which was in real-time :P. Very strange this.. :)

2013-05-29 06:54

Bitbreaker

Registered: Oct 2002
Posts: 500

Well, i have also done an animation version of this ball, with the same filling algorithm as used in the realtime version. The reason to take animated data is to make things comparable and faster. Also i said it already many times: you may change the data to your needs, as long as it produces the same output. I really don't say this for no reason, i changed the data.bin heavily for my purposes. And hey, you can use all mem, so don't complain about those $5000 bytes, it is just what our merry musicians would usually waste, right? :-P

2013-05-29 06:57

Bitbreaker

Registered: Oct 2002
Posts: 500

Quote: the original demo isnt an animation player :)

I don't think it's so much bigger than in the realtime version, at least when being in the front and at its largest zoom. Also, it still fits into a 16x16 with safety margin, i don't see how this brings trouble :-)

2013-05-29 06:59

Trash

Registered: Jan 2002
Posts: 122

@ Cruzer & HCL:
In my mind the compo was about the filler and nothing else.

What I'm not really clear about is what benchmark result that should be shown, what unit should be used, fps seems illogical optmized I expect a topcoder to reach 25 fps, average cycles per frame demands som calculations that eats rastertime or am I just stupid?

2013-05-29 07:06

Bitbreaker

Registered: Oct 2002
Posts: 500

Quoting Trash

@ Cruzer & HCL:
In my mind the compo was about the filler and nothing else.

What I'm not really clear about is what benchmark result that should be shown, what unit should be used, fps seems illogical optmized I expect a topcoder to reach 25 fps, average cycles per frame demands som calculations that eats rastertime or am I just stupid?

The benchmark works like this:

The raster irq counts up a counter each time it is called. The filler counts up rendered frames. When it has rendered 256 frames, it reads the counter and spits out the number.

Cheapest way to do so:
lda fcnt_l
sta $07e0
lda fcnt_h
sta $07e1

That'd be enough to make me happy.

This way, you can easily calculate fps or such on your own, but no need to waste too many valuable cycles and bytes on that.
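
For anyone who wants the fps figure, the conversion can be sketched in Java as below. The class and method names are hypothetical. It assumes the counter is incremented exactly once per video frame and that the IRQ rate is passed in by the caller (roughly 50 Hz on a PAL machine); if the IRQ fired at both screen splits, the counter would have to be halved first.

```java
// Hypothetical helper for interpreting the benchmark counter.
final class BenchmarkMath {
    static final int FRAMES_RENDERED = 256;   // from the compo rules

    /** Rendered frames per second, given the counter value and the IRQ rate in Hz. */
    static double framesPerSecond(int counter, double irqRateHz) {
        return FRAMES_RENDERED * irqRateHz / counter;
    }

    /** Average number of video frames spent on one rendered frame. */
    static double videoFramesPerRender(int counter) {
        return counter / (double) FRAMES_RENDERED;
    }

    public static void main(String[] args) {
        int counter = 0x0200;                 // example value: fcnt_h = $02, fcnt_l = $00
        System.out.println(framesPerSecond(counter, 50.0));
        System.out.println(videoFramesPerRender(counter));
    }
}
```

A lower counter value means a faster filler; a counter of $0100 would correspond to a fully one-framed routine under these assumptions.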

2013-05-29 09:29

JackAsser

Registered: Jun 2002
Posts: 1989

Quote: @ Cruzer & HCL:
In my mind the compo was about the filler and nothing else.

What I'm not really clear about is what benchmark result that should be shown, what unit should be used, fps seems illogical optmized I expect a topcoder to reach 25 fps, average cycles per frame demands som calculations that eats rastertime or am I just stupid?

Yep, about the filler, problem is that you have $5000 worth of animation data which prohibits unrolled filler code to some degree.

2013-05-29 10:18

Bitbreaker

Registered: Oct 2002
Posts: 500

Quote: Yep, about the filler, problem is that you have $5000 worth of animation data which prohibits unrolled filler code to some degree.

aren't those limitations there to make our life more exciting? :-) My yet version has still $1000 bytes free, and i have yet no idea on how to waste them. So stop the whining already and optimize! :-P

2013-05-29 12:06

JackAsser

Registered: Jun 2002
Posts: 1989

Quote: aren't those limitations there to make our life more exciting? :-) My yet version has still $1000 bytes free, and i have yet no idea on how to waste them. So stop the whining already and optimize! :-P

2013-05-29 12:33

Oswald

Registered: Apr 2002
Posts: 5017

I think one could calculate the 3d to save that $5000 bytes, and still may be faster trough the gain of unrolling :) Dot spheres are VERY cheap to calc. then averege the face Z coords to get face visibility. (if no perspective) half of faces is enough as opposite faces has reverse visibility. etc.
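
One possible reading of the averaged-Z visibility test, sketched in Java. Everything here is an assumption for illustration: the body is centred on the origin, there is no perspective, the viewer looks along +z so that smaller z is nearer, and `OrthoCulling` is a made-up name.

```java
// Hypothetical sketch of back-face rejection by averaged z, without perspective.
final class OrthoCulling {
    /** True if the face points towards the viewer under the assumptions stated above. */
    static boolean isFrontFacing(double[] vertexZ) {
        double sum = 0;
        for (double z : vertexZ) {
            sum += z;
        }
        return sum < 0;   // the sign of the sum equals the sign of the average
    }

    /** z values of the opposite face on a point-symmetric body: every z is negated. */
    static double[] oppositeFace(double[] vertexZ) {
        double[] out = new double[vertexZ.length];
        for (int i = 0; i < vertexZ.length; i++) {
            out[i] = -vertexZ[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] face = {-3.0, -2.0, -1.0};
        System.out.println(isFrontFacing(face));
        System.out.println(isFrontFacing(oppositeFace(face)));
    }
}
```

Because the opposite face always gets the reverse answer, only half of the faces need to be tested.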

2013-05-29 12:42

Bitbreaker

Registered: Oct 2002
Posts: 500

If you do not use perspective it

a) looks ugly
b) you possibly have to draw more faces

2013-05-29 12:57

HCL

Registered: Feb 2003
Posts: 716

..and it would be cheating doing it in realtime. That's the new future, better get used to it ;).

2013-05-29 13:11

Oswald

Registered: Apr 2002
Posts: 5017

Bitbreaker, I think in case of a spherical symmetric body perspective is not so important, see EoD for proof. Can you recreate the screenshot in post#1 without it so we can see?:) not sure about more faces, do perspective hide a face earlier than without?

2013-05-29 13:32

HCL

Registered: Feb 2003
Posts: 716

Quoting Oswald

do perspective hide a face earlier than without?

Yes, that's about all it does :).

2013-05-29 13:38

Bitbreaker

Registered: Oct 2002
Posts: 500

Not possible, as it bloats the data.bin by another 33% and makes my code go BOOM then :-) But here's a quickly done comparison (though no idea if wings3d uses similar values for perspective compared to mine):

Also, if you think your approach is faster: prove it! :-)

2013-05-29 14:33

JackAsser

Registered: Jun 2002
Posts: 1989

No perspective on the ball will be almost unoticable. If u do it realtime noone would be able to tell i can assure u.

2013-05-29 16:27

Oswald

Registered: Apr 2002
Posts: 5017

one cool thing without perspective I notice on this gif is that you can draw many lines with one bigger line instead.

HCL, perspective also adds distortion which helps the mind see the 3dness more, doesnt it?

2013-05-29 16:35

JackAsser

Registered: Jun 2002
Posts: 1989

Quote: one cool thing without perspective I notice on this gif is that you can draw many lines with one bigger line instead.

HCL, perspective also adds distortion which helps the mind see the 3dness more, doesnt it?

Just rotate it a bit and symmetry is gone...

2013-05-29 16:50

Cruzer

Registered: Dec 2001
Posts: 1048

Let's stick to the original precalculations, otherwise the drama will be too intense. But I wouldn't mind getting the full set of coords, including hidden ones. I think it would be easier to pack that way.

2013-05-29 17:07

JackAsser

Registered: Jun 2002
Posts: 1989

Quote: Let's stick to the original precalculations, otherwise the drama will be too intense. But I wouldn't mind getting the full set of coords, including hidden ones. I think it would be easier to pack that way.

I agree

2013-05-29 18:43

Bitbreaker

Registered: Oct 2002
Posts: 500

So here's the C-programme that generates the data.bin for you all to play around and in the end miss the deadline for more drama :-P Attention, contains a Makefile!!1! /o\

2013-05-29 20:01

Cruzer

Registered: Dec 2001
Posts: 1048

Thanx, Bitbreaker! It compiled and executed on first try. Amazing!

2013-05-30 10:23

enthusi

Registered: May 2004
Posts: 675

Why not such a vector sphere? ;-)

(bad looking example)

2013-05-30 10:53

JackAsser

Registered: Jun 2002
Posts: 1989

@enthusi: because that would look undistorted and would be too easy to implement since there are no quads in it. :P But yeah... I agree. Also getting the rotated coordinates for such a sphere in real time is easier than Bitbreaker's I think, not 100% sure though, but it feels more simple. :)

But then again... it's not AAAAMIIIIGAAAAAH!

2013-05-30 11:56

Oswald

Registered: Apr 2002
Posts: 5017

jackie, where's the symmetry in this triangle sphere? recently on pouet I saw in a thread that one method to make an 'iso' sphere is to normalize a cube's vertices. still seems more work than the dot sphere cheats: ie. average many cube points out of 8 then scale them with LUTs. dot sphere is only 3 adds and a lookup, cube is 2 adds and a lookup, hmm :) this is starting to sound like this method leads to new sphere dot records :) anyway I may be wrong, it's been ages since I did a dot sphere and I'm too tired to really think it through.

2013-05-30 13:00

JackAsser

Registered: Jun 2002
Posts: 1989

Quote: jackie, where's the symmetry in this triangle sphere? recently on pouet I saw in a thread that one method to make an 'iso' sphere is to normalize a cube's vertices. still seems more work than the dot sphere cheats: ie. average many cube points out of 8 then scale them with LUTs. dot sphere is only 3 adds and a lookup, cube is 2 adds and a lookup, hmm :) this is starting to sound like this method leads to new sphere dot records :) anyway I may be wrong, it's been ages since I did a dot sphere and I'm too tired to really think it through.

Dunno where the symmetry is tbh, and I know how to generate one by triangulating each surface and then normalizing the middle point to the radius of the mathematical sphere. However, there are other "spheres" such as http://en.wikipedia.org/wiki/Pentakis_dodecahedron which might be easier to deduce from the rotated world axises. F.e. each vertex may be a simple integer multiple of the x-, y- and z axis summed (like your cube example, but with more complex numbers). Also that would exploit x, y and z symmetry, just like the cube (i.e. +x,+y,+z is just a flipped version of -x,-y,-z).

2013-05-30 13:27

HCL

Registered: Feb 2003
Posts: 716

Honestly i don't see how this would lead to faster sphere calculations, but it's still interesting if it does :).

Thats sphere pic looks strange.. No symmetry as far as i can see.

Quote:

Pentakis_dodecahedron

Oh, that seems to be the one in EoD.

2013-05-30 15:05

Cruzer

Registered: Dec 2001
Posts: 1048

Quoting HCL

Quote:
Pentakis_dodecahedron
Oh, that seems to be the one in EoD.

Sure does, almost looks like screenshots from EoD. :)

And now, back to the compo... Must the routine be vblank synched and/or double buffered, or is it ok with artifacts from filling while displaying the gfx? The latter is faster of cuz, so it's important with clear rules about this.

2013-05-30 17:42

Bitbreaker

Registered: Oct 2002
Posts: 500

I do double buffering, which looks quite okay on the real machine; anything less will look rather ugly i'd say :-) vblank would be overkill and we'd also lose granularity when comparing the results.

2013-05-31 21:26

ChristopherJam

Registered: Aug 2004
Posts: 1378

A few of the polygons near the edge of the sphere are inside out, likely a bug in the backface removal. May we reject these?

2013-05-31 22:22

Cruzer

Registered: Dec 2001
Posts: 1048

@Bitbreaker: Agree, thanks for clearing it up.

@ChristopherJam: How on earth did you detect that? And I vote against removing them, since the compo is about rendering the given animation fastest, not about making the prettiest vectorball.

2013-06-01 03:25

ChristopherJam

Registered: Aug 2004
Posts: 1378

@Cruzer It's more that some plot routines either crash or don't set any pixels at all if the ordering is clockwise instead of counterclockwise.

e.g. if you assume the highest point is the start of a chain running down the right hand side to the lowest point, and the end of a chain running up the left hand side, you can use each chain to make a set of masks

left mask
00111
01111
01111
11111

right mask
11100
11110
11111
11000

left AND right:
00100
01110
01111
11000

If the left and right are the wrong way around, left AND right will produce all zeros, as the right mask will become 0 before the left mask becomes 1

So, even a routine that uses all the data might fail to plot CW polys, so we at least need a ruling as to whether we need to to flip CW polys to make them visible.
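
The mask example above can be reproduced with a small Java sketch. It is only an illustration on a 5-pixel-wide row, with hypothetical names, where column 0 is the leftmost pixel and maps to the highest bit.

```java
// Hypothetical illustration of the left/right mask idea on a 5-pixel row.
final class SpanMasks {
    static final int WIDTH = 5;

    /** Bits set from column x to the right edge, e.g. x=2 gives 00111. */
    static int leftMask(int x) {
        return (1 << (WIDTH - x)) - 1;
    }

    /** Bits set from the left edge up to and including column x, e.g. x=2 gives 11100. */
    static int rightMask(int x) {
        return ((1 << WIDTH) - 1) & ~((1 << (WIDTH - 1 - x)) - 1);
    }

    /** Pixels to fill on one row between the left chain and the right chain. */
    static int span(int leftX, int rightX) {
        return leftMask(leftX) & rightMask(rightX);
    }

    public static void main(String[] args) {
        int[] left = {2, 1, 1, 0};    // x of the left chain on each row of the example
        int[] right = {2, 3, 4, 1};   // x of the right chain on each row of the example
        for (int row = 0; row < left.length; row++) {
            String bits = Integer.toBinaryString(span(left[row], right[row]) | (1 << WIDTH));
            System.out.println(bits.substring(1));   // drop the helper bit, keep 5 digits
        }
    }
}
```

With the chains swapped, any row where the supposed left edge lies to the right of the supposed right edge, for example `span(3, 1)`, comes out as zero, which is why a reversed polygon sets no pixels there.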

2013-06-01 07:44

Bitbreaker

Registered: Oct 2002
Posts: 500

I'd say optimize them away, as there can't be too many and it will not change the result much speedwise and optically (except when crashing the machine :-) ) As for me they didn't cause any trouble, but i also sorted out some faces that will never be shown, like some of them are also at zero height if i remember right :-)
Now i am just pondering what i shall do with my $1800 free bytes, add music? 8-)

2013-06-01 07:48

ChristopherJam

Registered: Aug 2004
Posts: 1378

Sweet, thanks. As you say, there were only a few, and they were basically slivers anyway.

Yup, the zero area polygons are already gone :D

2013-06-06 04:43

Cruzer

Registered: Dec 2001
Posts: 1048

Quoting Bitbreaker

Now i am just pondering what i shall do with my $1800 free bytes, add music? 8-)

Is it oneframed yet? Otherwise free bytes can always be used for optimizations. :)

2013-06-06 06:36

Bitbreaker

Registered: Oct 2002
Posts: 500

Quote: Quoting Bitbreaker
Now i am just pondering what i shall do with my $1800 free bytes, add music? 8-)
Is it oneframed yet? Otherwise free bytes can always be used for optimizations. :)

It is doublebuffered, fast and fully functional, and it is $1a00 free bytes meanwhile :-) However the biggest free block is somewhat around $800 bytes, the rest is spread all over. So far i can't see any possibility to improve speed by more memory usage. To do so, i'd need a few more pages of zeropage :-) Code unrolling will make things slower, so will do the introduction of lookup tables. I really wasted big amounts of mem for aligned and interleaved stuff though.

2013-06-06 06:47

Hein

Registered: Apr 2004
Posts: 933

Quote: It is doublebuffered, fast and fully functional, and it is $1a00 free bytes meanwhile :-) However the biggest free block is somewhat around $800 bytes, the rest is spread all over. So far i can't see any possibility to improve speed by more memory usage. To do so, i'd need a few more pages of zeropage :-) Code unrolling will make things slower, so will do the introduction of lookup tables. I really wasted big amounts of mem for aligned and interleaved stuff though.

what cruzer is trying to say is that he's almost having it oneframed. :)

2013-06-06 07:03

Bitbreaker

Registered: Oct 2002
Posts: 500

aka animated? ;-)

2013-06-06 13:24

Cruzer

Registered: Dec 2001
Posts: 1048

Quoting Hein

what cruzer is trying to say is that he's almost having it oneframed. :)

Most definitely, and I don't even know what to use the remaining 17000 cycles each frame for. ;)

2013-06-06 13:42

Bitbreaker

Registered: Oct 2002
Posts: 500

Is it in nufli quality?!

2013-06-06 15:23

ChristopherJam

Registered: Aug 2004
Posts: 1378

If it is, it better only be used for anti-aliasing. The rules clearly state that "background is white, foreground is dark red" ;)

2013-06-06 16:37

algorithm

Registered: May 2002
Posts: 702

Not looked at it. But how many frames of animation in data.bin?

2013-06-06 16:46

Bitbreaker

Registered: Oct 2002
Posts: 500

there are 128 frames in there.

2013-06-06 17:47

Hein

Registered: Apr 2004
Posts: 933

Quote: Quoting Hein
what cruzer is trying to say is that he's almost having it oneframed. :)
Most definitely, and I don't even know what to use the remaining 17000 cycles each frame for. ;)

Fill them up with memory?

2013-06-07 06:30

HCL

Registered: Feb 2003
Posts: 716

I think we will see a VQ-ed version, one-framed. Now how to set up CSAM to do the job again!? :P

2013-06-07 13:01

Perplex

Registered: Feb 2009
Posts: 254

The animation won't look very good when played back at 50+ Hz, though. It rotates much too fast for that. I suggest the remaining cycles are to be used for interpolating new frames inbetween the provided keyframes. ;-)

2013-06-07 13:21

HCL

Registered: Feb 2003
Posts: 716

@Perplex: Agree, very good idea. My idea is no longer to make it as fast as possible, as Cruzer already made it one-framed, with one hell of a margin as well :P. Instead i'm going for the most beautiful speed, do it with NUFLI anti-aliasing perhaps in interlace to get more color depth. I'm going for teh votes, not for the speed :).

2013-06-07 14:09

Bitbreaker

Registered: Oct 2002
Posts: 500

to achieve better votes, one should place some boobs beneath the ball.

2013-06-08 06:38

algorithm

Registered: May 2002
Posts: 702

@Perplex. Yes, the additional cycles can use the interleaved method of displaying frame 1 as frame 1, frame 2 as frame 1 and 2 interleaved, frame 3 as frame 3 which in turn would double the time it takes for a whole animation transition (and make it appear smoother)

@hcl. CSAM is not that well suited for outline or line based sources (although with some tinkering, charmaps can be extracted and processed further). I do have an unreleased version that has an additional mean-removal before the VQ which works well for preserving edges more efficiently.

2013-06-08 07:00

Oswald

Registered: Apr 2002
Posts: 5017

not many things look uglier than playing animations like that, maybe 8x8 plasmas with +4 pixel x/y shift flicker.

2013-06-08 07:16

algorithm

Registered: May 2002
Posts: 702

At 50fps it would not look that bad. Interpolating frames via the software approach would not work via pre-generated codebooks via VQ methods hence why i suggested the interleave method (although interpolated frames do not necessarily require to be placed in the codebook at all - just the closest fit to the existing codebook data) but would still require the 256 bytes (16x16) of charlookup data

2013-06-08 19:01

chatGPZ

Registered: Dec 2001
Posts: 11108

Quote:

not many things look uglier than playing animations like that, maybe 8x8 plasmas with +4 pixel x/y shift flicker.

using lossy compression and playing it back like that? =P

2013-06-13 12:13

HCL

Registered: Feb 2003
Posts: 716

Ok, posted a first version.. Just to put some pressure on the rest of you :).

2013-06-13 20:12

Cruzer

Registered: Dec 2001
Posts: 1048

Thanks! Had actually kinda lost the motivation and started on something else already. :)

2013-06-14 06:39

HCL

Registered: Feb 2003
Posts: 716

Well, if i had it one-framed already, i would also start on something else ;). ..or stable 2-framed also for that matter.. I didn't do anything magic there at all so if anyone has just something better than 1992-standard, you will probably beat me. But we're supposed to beat Bitbreaker right, so just had to come up with something.

No matter how i think around this, it boils down to something very similar to Cruzer's "hard-line" from YKTR. Feels lame to implement that, since that is probably where Cruzer starts, and optimizes more from there :P.. But since there is an average of ~90 short lines per frame, it is very important to keep a low overhead on the line drawing. I tried a faster line routine with pre-calced div-values but it ended up slower anyway by just fetching those values from the tables (!).

2013-06-14 06:52

Oswald

Registered: Apr 2002
Posts: 5017

I have similar feelings. I would precalc all possible lines, including the 'trick' of setting many pixels by one sta. each line would be as long as the longest of the used angle, put rts as needed. also precalc shifted versions. then through ZP setup you can access the whole 16x16 matrix. You can save half of the zp setup by using the 7th bit in Y. store the animation as jumptables already into the speedcodes. Then unroll the filler. then realize you're out of mem :)

I did not start on this since I think cruzer and others would do the same. And as counterintuitive as it is, maybe its true that Bitbreakers raster fill method is better for this.

2013-06-14 07:09

HCL

Registered: Feb 2003
Posts: 716

@Oswald: Damn, seems you're a bit ahead of me in the details of the hard-line, but i figured most of it :). Yeah, it's some hell of a job to set up all that, and i'm afraid if i do it i will make other things slower so i don't gain anything in the end(?).. Perhaps i will try anyway, i think there is just enough memory, but then again perhaps i'm refusing to see some part of it all..

That's sort of why i posted this one. It's just plain and simple, and it does the job not too badly. It is a tiny bit faster than Bitbreaker's original, so if i did the realtime calculations perhaps it would end up on par.. so what did i prove then?

2013-06-14 08:37

Oswald

Registered: Apr 2002
Posts: 5017

cool :) so if a plain lineroutine is faster, then unrolling will do even more.

did some estimates. just by looking at the screenshot lets say one line is max 24 pixels.

ldy #$00 ;can be skipped sometimes or replaced by iny/dey
lda (),y
ora mask,x ;save the set up of 8 masks when shifting horizontally
sta (),y

makes 9 bytes. 24 pixel radius needs about 74 angles (half of 360 is enough)

24*9*74 = 15984 bytes, roughly 16k. pretty good! it would be smaller as near-vertical lines don't set all 24 pixels. also, through table juggling it's still possible to set more than one pixel with one lda ora sta AND shift horizontally, but that's only worth it for near-horizontal lines.

storing jump addies instead of coords is not possible tho, as it would blow up the anim data (p1->p2, p2->p3, p3->p4, instead of p1->p2->p3->p4)
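
The estimate above, written out as a tiny Java sketch so the inputs can be varied. The names are hypothetical, and the three inputs (24 pixels per line, 9 bytes per pixel, 74 slopes) are simply the figures from this post.

```java
// Hypothetical calculator for the size of fully unrolled line speedcode.
final class SpeedcodeEstimate {
    /** Total bytes if every slope gets its own unrolled routine of the given length. */
    static int totalBytes(int pixelsPerLine, int bytesPerPixel, int slopes) {
        return pixelsPerLine * bytesPerPixel * slopes;
    }

    public static void main(String[] args) {
        // ldy #imm (2) + lda (zp),y (2) + ora abs,x (3) + sta (zp),y (2) = 9 bytes per pixel
        int bytesPerPixel = 2 + 2 + 3 + 2;
        int total = totalBytes(24, bytesPerPixel, 74);
        System.out.println(total + " bytes, about " + (total / 1024.0) + " KB");
    }
}
```

This is an upper bound under those figures, since shorter or near-vertical lines need fewer plotted pixels.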

2013-06-14 12:30

HCL

Registered: Feb 2003
Posts: 716

..No it's not a looped line, it's a 1992 standard unrolled bresenham:

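lda # ;bitmask of the current pixel (immediate operand left blank in the post)
ora LineBuf,y
sta LineBuf,y
txa ;error term is kept in x
sbc ydiff
bcs *+7 ;no underflow: skip ahead to tax
iny ;underflow: advance to the next byte in LineBuf
adc xdiff
bcc *-3 ;still negative: step once more
tax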

The lines are up to 30 pixels long (dx or dy), so you need some more space than you mentioned..
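For readers who do not read 6502 fluently, here is a rough Python model of the fragment above. It is a sketch under assumptions: the initial error value and the use of a signed error term (instead of the carry flag) are choices made for illustration, not taken from HCL's source. It plots one pixel per column and steps y as often as needed, which is what the inner iny/adc/bcc loop does.

```python
def edge_pixels(x0, y0, dx, dy):
    """Return one (x, y) per column for an edge going right and down.

    dx >= 1 is the width in columns, dy >= 0 the total height.
    The error term plays the role of the value kept in X on the 6502.
    """
    pixels = []
    err = dx - 1          # assumed start value, not from the post
    y = y0
    for x in range(x0, x0 + dx):
        pixels.append((x, y))
        err -= dy         # sbc ydiff
        while err < 0:    # iny / adc xdiff / bcc loop
            y += 1
            err += dx
    return pixels


if __name__ == "__main__":
    print(edge_pixels(0, 0, 4, 2))   # flat edge
    print(edge_pixels(0, 0, 2, 6))   # steep edge, y advances several rows per column
```

Only one pixel per column is produced even for steep edges, which is all a vertical eor-fill needs.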

One optimization i can give away is to skip the "ora" in the code above, which i thought would be possible for quite a few lines. Would be easy to store as a single bit in the animation (the last unused bit per vertex-pair), though my measurements showed it's just ~20% of the lines that it applies to -> no go.

Also have not come up with any better way to store the vertexes, that doesn't eat up more precious memory :P.

2013-06-14 13:14

Oswald

Registered: Apr 2002
Posts: 5017

sorry, I meant unrolling even the slope calcs into ldy #'s :)

why looping on calculating the slope? for such lines use log div, for the rest the non looping bresenham. but you should know that :)

2013-06-14 15:34

ChristopherJam

Registered: Aug 2004
Posts: 1378

I was considering cheating horrendously by storing for the top and bottom of each character-aligned 8 pixel wide slice of each polygon a y-value and an index into a table of edge patterns (short lists of y offset/bitmask pairs), but given that there are over 10,000 such slices over the 120 frames, I don't think I'll have the memory for that approach; even 4 bit y deltas plus 12 bit pattern indices don't leave me enough space for the ~3800 pattern definitions required

Back to the drawing board.

2013-06-14 15:45

Oswald

Registered: Apr 2002
Posts: 5017

Quote: I was considering cheating horrendously by storing for the top and bottom of each character-aligned 8 pixel wide slice of each polygon a y-value and an index into a table of edge patterns (short lists of y offset/bitmask pairs), but given that there are over 10,000 such slices over the 120 frames, I don't think I'll have the memory for that approach; even 4 bit y deltas plus 12 bit pattern indices don't leave me enough space for the ~3800 pattern definitions required

Back to the drawing board.

damn i dont understand a word :)

2013-06-14 15:51

Cruzer

Registered: Dec 2001
Posts: 1048

Yup, as you correctly guessed I started out with the hardliner concept. And I'm only at about 2.5 frames so far, but hey, it's just about beating BitBreaker. :)

I haven't done any specialized line routines yet, so the pixels are just being drawn one by one. But I have some plans similar to Oswald's (I think) about drawing multiple pixels in one sta, as well as omitting the eor for the lines that don't share any bytes with other lines. Why are you guys using ora btw?

But optimizations like that are of cuz only worth it if they aren't eaten up by increased administration costs, which are already a huge part of it with a lot of small lines, so I have also worked on getting them reduced. I'm still using BitBreaker's original data, but if I reorganized them so the lines wouldn't have to start from scratch each time in guessing stuff like which of the 4 main directions it's pointing, whether it's flat, etc. it would help a lot.

2013-06-14 20:42

Sorex
Account closed

Registered: Nov 2003
Posts: 43

Quoting Bitbreaker

any blocky animation will be detected within no time and a serious drama will be generated upon it, be sure about that! :-)

I'm not good at this stuff at all so I'm not planning to compete.

And I don't know if speedy fillers use tricks like 8 pixel fills to speed things up. If so, does that count as a blocky animation as well, even when it actually "draws" it each frame?

Or is only 1 pixel fill allowed?

2013-06-14 21:32

HCL

Registered: Feb 2003
Posts: 716

@Oswald: yeah, that's what i did, i even had a precalced div table.. but just finding the values in the table ate up the benefit of the faster line. Though the bresenham is faster on flat lines, plus the steep lines are quite short.

@Cruzer: of course it is EOR, not ORA. So, do you have any mem left for further development?

2013-06-15 11:50

Cruzer

Registered: Dec 2001
Posts: 1048

@HCL: About $2a00 free + various holes in the data/code. And it might be possible to gain some more by packing the coords, but of cuz only if the freed up data can be used for optimizing the whole thing more than the depacking takes.

2013-06-15 21:35

ChristopherJam

Registered: Aug 2004
Posts: 1378

@Cruzer, are you already down to 46,500 cycles? (I'm assuming 19656-43*25=18581 cycles available per frame; should really be less once the raster IRQ is factored in)
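The cycle budget quoted above works out as follows. A minimal sketch: the constants are the ones used in the post (a PAL frame of 312 lines at 63 cycles each, 25 badlines costing 43 cycles each), and the raster IRQ overhead mentioned in the post is not modelled.

```python
CYCLES_PER_FRAME = 312 * 63   # PAL C64: 19656 cycles per frame
BADLINE_PENALTY = 43          # cycles lost per badline, figure used in the post
BADLINES = 25                 # one per character row


def cycles_available(frames):
    """CPU cycles left for drawing over `frames` video frames (IRQ cost ignored)."""
    return frames * (CYCLES_PER_FRAME - BADLINE_PENALTY * BADLINES)


if __name__ == "__main__":
    print(cycles_available(1))
    print(cycles_available(2.5))
```

At 2.5 frames per animation step this gives roughly the 46,500 cycles asked about in the post.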

I think I've only just worked out a way to get that low, and I've less than $1e00 bytes remaining :-/ All my grandiose plans for getting below 2 frames turned out to need at least 70k of ram, unless you count VQ :p

2013-06-16 13:58

Kisiel
Account closed

Registered: Jul 2003
Posts: 56

so maybe it's a good idea to use a memory expansion, like the 1541U aka REU?
VICE has this so it's not a problem.

2013-06-16 17:48

chatGPZ

Registered: Dec 2001
Posts: 11108

with a reu you can just do an animation, whats the point then? booooring

2013-06-16 18:09

Cruzer

Registered: Dec 2001
Posts: 1048

We could also just all agree to run Vice at 400% speed. Why waste your time with nerdy optimizations when there's easier ways to get to the same result? :)

2013-06-16 18:28

Cresh

Registered: Jan 2004
Posts: 354

2013-06-16 19:27

chatGPZ

Registered: Dec 2001
Posts: 11108

Quote:

We could also just all agree to run Vice at 400% speed. Why waste your time with nerdy optimizations when there's easier ways to get to the same result? :)

i have it running at 75fps using GL \o/ does anybody care? =P

2013-06-17 08:47

Axis/Oxyron

Registered: Apr 2007
Posts: 91

@Groepaz: Damn, you have beaten me. My version runs in 50 fps on Amiga.

2013-06-17 14:30

Danzig

Registered: Jun 2002
Posts: 428

@Axis & @Groepaz: Now please sit down and port Groepaz's GL version to Storm-C using GL on a plain A1200 with an 020. My bet: 2 fps :D

2013-06-17 15:02

Axis/Oxyron

Registered: Apr 2007
Posts: 91

My bet: Wont start at all.

?Out of chipmem error

Ready.

2013-06-17 15:48

chatGPZ

Registered: Dec 2001
Posts: 11108

i have ported SDL with software GL before.... maybe we can use that? 2fps is mandatory for amiga afterall =)

2013-06-17 18:49

Danzig

Registered: Jun 2002
Posts: 428

Quoting Axis/Oxyron

My bet: Wont start at all.

?Out of chipmem error

Ready.

Yep, your bet is closer to reality I guess... Still I remember laughing my ass off, when I first saw the GL-samples with Storm-C back then. And that was 68030/50 already!

Quoting Groepaz

i have ported SDL with software GL before.... maybe we can use that? 2fps is mandatory for amiga afterall =)

Yeah, I already knew that the topic is a typo by bitbreaker. He was supposed to call it: "SLOW me your (vector)balls". Damn you, Bierkeule!

2013-06-17 19:36

HCL

Registered: Feb 2003
Posts: 716

:D

Well, i wasted a whole day on implementing the hard-liner.. and all i got was ~16 frames or so on 256 animation steps :(. Overhead ate my hardliner.. :P. Ok, now at least i have done it, perhaps i can reduce some shit somewhere.. But no matter what, i'm still generations behind Cruzer..

2013-06-18 11:08

Mixer

Registered: Apr 2008
Posts: 422

My version is round faced and runs on PC! I might join in, but I'll cheat! naah, no time.

I've been thinking whether this could be done by rearranging and organizing the animation coordinates, drawing each red face on sprites with a somewhat hardcoded sprite filler and then just multiplexing the sorted sprites on screen. Perhaps one could even shrink all the continuous vertical edges and expand them with a sprite stretcher, however stretching eats precious cycles. A similar approach might work with chars, but I guess the overhead makes it pointless.

2013-06-18 11:14

Oswald

Registered: Apr 2002
Posts: 5017

Quote: :D

Well, i wasted a whole day on implementing the hard-liner.. and all i got was ~16 frames or so on 256 animation steps :(. Overhead ate my hardliner.. :P. Ok, now at least i have done it, perhaps i can reduce some shit somewhere.. But no matter what, i'm still generations behind Cruzer..

dont get it, drawing 256 phases in 16 frames is like 20 rasterlines to draw one phase? :)

2013-06-18 11:51

HCL

Registered: Feb 2003
Posts: 716

@Oswald: No, i *gained* some 16 frames. Calm down boy ;).

@Mixer: I think the sprite-shit will be hard to get working.. The faces are just a tad bit bigger than one sprite. That also rules out the popular sprite-filler (by updating sprite x-pos each rasterline), since it would require masking the right edge with a white sprite, and there are possibly more than 4 red areas on one rasterline..

2013-06-20 11:17

ChristopherJam

Registered: Aug 2004
Posts: 1378

Well, I finally got something working last night, albeit running in around 2.8 frames. Got it down to 2.6 today and sent off a draft to @Bitbreaker.

Now to try to improve it further :D

2013-06-20 13:58

Oswald

Registered: Apr 2002
Posts: 5017

2.5 is surprisingly good, considering that about 3/4 of a frame alone is the filling :)

2013-06-20 20:21

Bitbreaker

Registered: Oct 2002
Posts: 500

So guys, i see you have been struggling hard and there's already two drafts in my inbox, while i was laying at the beach and enjoying my holidays :-)

2013-06-21 02:35

Skate

Registered: Jul 2003
Posts: 490

I was hoping to find some time to join this compo. but there is just 4 days to go and i couldn't even start yet. :/

2013-06-21 06:09

Oswald

Registered: Apr 2002
Posts: 5017

Quote: So guys, i see you have been struggling hard and there's already two drafts in my inbox, while i was laying at the beach and enjoying my holidays :-)

you had a release version before the compo even started, so what?

2013-06-21 06:49

Bitbreaker

Registered: Oct 2002
Posts: 500

Quote: you had a release version before the compo even started, so what?

right, but a slow one though that i improved since the compo started. I'd say everyone has some filler-routines ready to adapt to that compo.

2013-06-21 08:43

ChristopherJam

Registered: Aug 2004
Posts: 1378

Ideas yes, code no. I started from scratch when the compo was announced with no more code than a raster IRQ initialiser and a few lines of Python that dump arrays out to .a65 sources or .prgs

2013-06-21 09:40

Oswald

Registered: Apr 2002
Posts: 5017

Quote: right, but a slow one though that i improved since the compo started. I'd say everyone has some filler-routines ready to adapt to that compo.

I'm eager to see which method will win, eorfill, or per scanline filling :)

2013-06-21 18:13

chatGPZ

Registered: Dec 2001
Posts: 11108

and i am still tempted to attempt a plain animation.... weather doesnt contribute a lot to it actually happening though =)

2013-06-23 11:27

The Syndrom

Registered: Aug 2005
Posts: 56

this would've been my approach aswell - I'm just too lazy to generate all single frames as a first step.

2013-06-23 11:50

Mr. SID

Registered: Jan 2003
Posts: 421

I have a python script that renders all the frames to png files. If anyone is interested...

2013-06-23 12:59

chatGPZ

Registered: Dec 2001
Posts: 11108

please, its the main show stopper for me atm aswell, cant be arsed to render the frames =)

2013-06-23 13:35

Stone

Registered: Oct 2006
Posts: 168

Perplex rendered these: https://www.dropbox.com/sh/wkdj7d7cjo5wob9/YQqvLkyYeN/amigaball..

2013-06-23 16:50

Mr. SID

Registered: Jan 2003
Posts: 421

It's completely based on Bitbreaker's C source:
http://galway.c64.org/~sid/ball.py

2013-06-23 18:45

BYB

Registered: Jan 2011
Posts: 18

thanks for uploading the png's.
Maybe it's possible to save some mem by vertically mirroring half of the frames. So Hires consists of only back- and foreground colors, the ball should be shaped ofcoz. Sprites allowed?

2013-06-23 21:04

Perplex

Registered: Feb 2009
Posts: 254

Quote: Perplex rendered these: https://www.dropbox.com/sh/wkdj7d7cjo5wob9/YQqvLkyYeN/amigaball..

Here's the Ruby code I used if anyone's curious: https://gist.github.com/lhz/5748054

2013-06-24 21:45

Cruzer

Registered: Dec 2001
Posts: 1048

Deadline got here faster than expected, so I didn't get to implement any further optimizations. But looking forward to the other entries!

2013-06-25 07:16

Bitbreaker

Registered: Oct 2002
Posts: 500

Can i receive all the animated versions now and the versions of those who shouted the loudest that this is an easy job? :-) It is time to deliver, guys!

2013-06-25 10:48

HCL

Registered: Feb 2003
Posts: 716

WTF!? June.25 00:00, that's a weirdo time for a deadline!! You set it to just any date at 23.59, that's how you do it.. Now i just remembered June.25, and that's today, and *boff* the deadline already passed :-O. I post my last balls now, and the drahma about the deadline starts right here!! ;).

2013-06-25 10:56

Bitbreaker

Registered: Oct 2002
Posts: 500

yay, drama \o/ of course i'll accept your actual entry past the deadline for the sake of that! :-) Sorry for confusing :-)

2013-06-25 11:00

Cruzer

Registered: Dec 2001
Posts: 1048

Quoting Bitbreaker

If the deadline is extended, a severe drama is expected

WTF, if I knew it would be extended I could have done so much more, bla bla bla... No, it's ok :)

2013-06-25 11:44

Martin Piper

Registered: Nov 2007
Posts: 634

I wonder if anyone used a cart?

2013-06-25 12:01

ChristopherJam

Registered: Aug 2004
Posts: 1378

Wait, what? Deadline's gone?

I got caught by the same midnight confusion :(

Can I send the code as it was earlier today?

2013-06-25 13:14

Martin Piper

Registered: Nov 2007
Posts: 634

Using lossless delta animation each frame was down to about 300 bytes. With the number of frames that meant too much memory. :(

2013-06-25 13:21

The Human Code Machine

Registered: Sep 2005
Posts: 110

There are only 128 frames which will be repeated twice and this should fit into the memory!
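The disagreement above comes down to which frame count is multiplied by the roughly 300 bytes per delta frame. A quick sketch of that arithmetic, assuming the 300-byte average from the post and ignoring the space needed for the player code and the first full frame:

```python
BYTES_PER_DELTA_FRAME = 300   # average quoted in the post
C64_RAM = 64 * 1024           # total RAM, not all of it usable in practice


def delta_anim_bytes(frames, per_frame=BYTES_PER_DELTA_FRAME):
    """Rough size of a delta-packed animation with `frames` frames."""
    return frames * per_frame


if __name__ == "__main__":
    for frames in (256, 128):
        total = delta_anim_bytes(frames)
        print(frames, "frames:", total, "bytes, fits in 64k:", total < C64_RAM)
```

With 256 distinct frames the data alone exceeds 64k, while 128 frames played twice stay well below it.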

2013-06-25 17:14

chatGPZ

Registered: Dec 2001
Posts: 11108

Quote:

Using lossless delta animation each frame was down to about 300 bytes. With the number of frames that meant too much memory. :(

damn, no need to try myself then :) funny enough, i somehow got distracted and caught myself tweaking some old line drawer of mine =P

2013-06-25 18:12

HCL

Registered: Feb 2003
Posts: 716

Haha.. quite funny this. Someone sets up a goal, a bunch of guys aims their weapons, and in the end, hit targets of a great variation more or less far from the goal :D.

2013-06-25 18:31

Bitbreaker

Registered: Oct 2002
Posts: 500

it is said that there is an entry from Metalvotze that is worth waiting for. So i'll upload all entries + results as soon as it's done.

2013-06-25 18:42

Bitbreaker

Registered: Oct 2002
Posts: 500

Quote: Haha.. quite funny this. Someone sets up a goal, a bunch of guys aims their weapons, and in the end, hit targets of a great variation more or less far from the goal :D.

They are getting old and losing their eyesight! :-)

2013-06-26 06:21

Bitbreaker

Registered: Oct 2002
Posts: 500

Finally, there's the results, and thanks to Metalvotze, none of the serious competitors will be last :-)

So here come the results (executables):

place   handle          frames

god     bitbreaker      $24f
----------------------------
1.      christopherjam  $25f
2.      axis            $29c
3.      cruzer          $29e
4.      hcl             $2a4
5.      drago           $dead

As you see, i decided to be out of compo :-) Congrats to ChristopherJam pushing the eor-filler that hard to nearly reach my result and a big thanks to all that participated! Now it is time to discuss and boast in detail i guess? :-) Now show me your inballievable code!
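To make the hex counters in the table easier to compare, the sketch below converts them to video frames per animation step. It assumes 256 displayed animation steps per benchmark run, as required by the compo rules, and leaves out the joke entry. The helper name is made up for illustration.

```python
RESULTS = {
    "bitbreaker": 0x24F,
    "christopherjam": 0x25F,
    "axis": 0x29C,
    "cruzer": 0x29E,
    "hcl": 0x2A4,
}
STEPS = 256   # animation steps shown per benchmark run


def frames_per_step(counter, steps=STEPS):
    """Average number of video frames needed to render one animation step."""
    return counter / steps


if __name__ == "__main__":
    for handle, counter in RESULTS.items():
        print(f"{handle:15s} {counter:4d} frames  {frames_per_step(counter):.2f} per step")
```

The gap between first and last place is 85 frames over the whole run, about a third of a frame per animation step.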

2013-06-26 07:37

Axis/Oxyron

Registered: Apr 2007
Posts: 91

Congrats to Christopher. Great work dude!

After taking a look into the code of all entries I have to say: "We are all bloody uncreative".
The code looks something like 90% identical (eor filler mostly unrolled and slopetable lines) with only small changes in the details like how the stored data is converted to slope spans.
I really hoped that someone comes up with a nice innovation. Like special code for flat lines gathering multiple pixels per store or some tricks to avoid the eor per linepixel.
I guess this has to wait until the next compo.
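For anyone following along who has not written one: the core of the eor filler mentioned above can be modelled in a few lines. This is a generic sketch of the technique, not code from any of the entries. Edges are marked by XORing one pixel per column into a buffer, and the fill pass keeps a running XOR from top to bottom.

```python
def eor_fill(column):
    """Fill one byte-wide column: running XOR of the edge marks, top to bottom."""
    acc = 0
    out = bytearray(len(column))
    for y, edge_byte in enumerate(column):
        acc ^= edge_byte   # an edge pixel toggles the fill state of its bit
        out[y] = acc
    return out


if __name__ == "__main__":
    col = bytearray(8)
    col[2] ^= 0b00011000   # top edge of a shape
    col[5] ^= 0b00011000   # row just below the shape switches the fill off again
    print([f"{b:08b}" for b in eor_fill(col)])
```

On the C64 the loop body is typically unrolled into long runs of eor and sta, which is the speedcode the post refers to.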

2013-06-26 08:48

Bitbreaker

Registered: Oct 2002
Posts: 500

So that is how it is done in my case:

All coordinates are shifted 4 pixels to the right, so that only 15 columns have to be treated (thanks to THCM for that hint) also all faces are resorted so that they are drawn from right to left, thus the left edge of all faces never will come into contact with edges from other faces, and ora'ing with already present content in the buffer can be omitted on left side.
The use of slope tables seems to be common practice, so not much to tell here. Funnily enough one gets along with rather small tables here, but the input data is bloated up by that process. To save bytes the format of data.bin was adapted quite a bit.
Stuff has been aligned to convenient places so that most of all the jump pointers can be easily calculated with 2 cycle illegal opcodes. Also a lot of code is squeezed into the zeropage and from $ff9c on, so that both code segments can be accessed via normal branches (address wraparound).

Here's the source, have fun digging through it.
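The right-to-left ordering described above can be illustrated with a small span-fill model. This is a sketch of the general idea only, assuming non-overlapping faces drawn from right to left and bit 7 as the leftmost pixel of a byte. It is not derived from Bitbreaker's source, and the function name is hypothetical.

```python
def fill_span(row, x1, x2):
    """Set pixels x1..x2 (inclusive) in one bitmap row of bytes."""
    first, last = x1 >> 3, x2 >> 3
    left_mask = 0xFF >> (x1 & 7)
    right_mask = (0xFF << (7 - (x2 & 7))) & 0xFF
    if first == last:
        row[first] |= left_mask & right_mask   # tiny span: merge with neighbours
        return
    row[first] = left_mask        # plain store: nothing further left is drawn yet
    for i in range(first + 1, last):
        row[i] = 0xFF             # full bytes, 8 pixels per store
    row[last] |= right_mask       # may share a byte with the face to the right


if __name__ == "__main__":
    row = bytearray(4)
    fill_span(row, 22, 27)   # right-hand face first
    fill_span(row, 3, 20)    # then the face to its left
    print(row.hex())
```

Only the right-hand edge byte needs a read-modify-write, which is the saving the post describes.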

2013-06-26 08:56

ChristopherJam

Registered: Aug 2004
Posts: 1378

Thanks, guys!

And yes, I was hoping to get away from using an eorfiller too, but my span fill attempts either took up way too much ram with tables, or were too slow, despite presorting the polys from left to right so I never needed to mask the right hand sides.

All of my slopes are taken from a single 384 byte table, but the offsets into it are all precomputed and my memory usage is pretty dire. I'm damn impressed that bitbreaker managed to get the best time in only 133 blocks!

Each edge is stored as two bytes for the address to jump into an eorfill routine, the low byte of a base pointer into the eorbuffer, and an offset into the slope table.

The eorfill routines plot up to 32 pixels, each taking its Y value from basepointer+slope[x*4+offset]. They plot two pixels at a time if the slope is low enough.

I too shifted four pixels left and four pixels up, so I only use 15x15 chars.

If I knew I could have gotten away with a single charset for display, that would have saved me a few kb to unroll some of the loops further.

Very impressed with the cleanness of some of the other entries.

2013-06-26 09:48

Axis/Oxyron

Registered: Apr 2007
Posts: 91

My implementation is a pretty straight forward eor filler with slopetable lines.

Completely unrolled speedcode for clear and fill that only accesses the bytes that get touched in at least one frame of the animation.
I shifted the coordinates -3 in x. My prototype reported the least amount of touched bytes in that position.
My address generation in the lineroutine is optimized down to 1 inc-zp every 16 pixels, because all the #$80 fiddling is encoded into my slopetables.
The rest is just classic code-optimization of the lineinit.
I just realized, I could have saved a lot with storing the line speedcode with multiple widths. So I dont need to patch and restore an rts into the linecode.

2013-06-26 10:16

ChristopherJam

Registered: Aug 2004
Posts: 1378

Sadly I only thought of including the $80 fiddling into the slope tables this morning, but I did have multiple speedcode fragments rather than doing RTS patching. All my fragments just JMP back into the main loop in zero page.

2013-06-26 11:39

Shadow
Account closed

Registered: Apr 2002
Posts: 355

Aw damnit, I missed the deadline with my joke entry.
Oh well, here it is anyway:

http://ag1976.com/tmp/amiga_petscii.prg

50 fps baby, oh yeah! :D

2013-06-26 11:58

chatGPZ

Registered: Dec 2001
Posts: 11108

so, as if anyone cares, ofcourse i tried anyway :)

in: 263168 bytes (128 file(s)) out: 69281 bytes (1 file(s)-128 frame(s)) left: 26.33%

so that'd *almost* fit, using plain delta+RLE on the bitmaps. using screen+charset properly might actually make it fit, but since my little packer doesnt do that automatically i couldnt be arsed to test =P
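A minimal sketch of what "plain delta+RLE" could look like, for illustration only. The packer used for the figures above is not shown in the thread, so the XOR delta, the (count, value) pair format and all function names here are assumptions.

```python
def rle_encode(data):
    """Encode bytes as (run_length, value) pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(packed):
    """Inverse of rle_encode."""
    out = bytearray()
    for i in range(0, len(packed), 2):
        out += bytes([packed[i + 1]]) * packed[i]
    return bytes(out)


def delta_pack(prev, cur):
    """XOR the frame against the previous one, then run-length encode the result."""
    return rle_encode(bytes(a ^ b for a, b in zip(prev, cur)))


def delta_unpack(prev, packed):
    """Rebuild the current frame from the previous frame and the packed delta."""
    return bytes(a ^ b for a, b in zip(prev, rle_decode(packed)))


if __name__ == "__main__":
    prev = bytes(64)
    cur = bytes([0] * 30 + [0xFF] * 4 + [0] * 30)
    packed = delta_pack(prev, cur)
    print(len(cur), "->", len(packed), "bytes")
```

Unchanged regions collapse into long zero runs, which is why frames that differ only slightly pack well.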

oh, and it runs so fast that it looks totally crap, making the whole animation attempt somewhat pointless =) was interesting to get some figures for code- and data size of both attempts though, cheers =)

2013-06-26 12:12

Shadow
Account closed

Registered: Apr 2002
Posts: 355

Groepaz: Yeah, I noticed that on my PETSCII animated version as well - the animation is really not well suited for 50fps display, it just turns into a pink blur... Running VICE at 50% speed actually makes it look better!

2013-06-26 14:17

BYB

Registered: Jan 2011
Posts: 18

Where to download? I would like to see all the other versions too. Actually i only saw the petscii one, really nice work and idea. :) Ah, i found the competition entries up there :)

2013-06-26 14:29

Shadow
Account closed

Registered: Apr 2002
Posts: 355

Bitbreaker posted a link to the executables in the same post with the results (#134)

2013-06-26 18:09

HCL

Registered: Feb 2003
Posts: 716

Big congratulations to ChristopherJam who is the true winner of this compo, and also the only entry with correct double buffering!! No wonder you had memory problems if you managed to do that.

Omg, i ended up on last position :(. This definitely ends my era as a 1337 coder.. but i still claim that i once was.

..So here comes a few excuses. I wasted ~2 weeks on a huffman-packed animation of the line-buffer, which turned out alot slower than i expected. Well, at least i *tried* something else than an eor-filler, but later went back to implement something like Cruzer's hard-liner from 2004. I didn't really get any further than Cruzer did back then i suppose, or rather i didn't even get there probably :P. From the benchmarks it looks like Me+Cruzer+Axis implemented almost exactly the same thing.

I really would have wanted to go further from here, but there was no more time, and also my energy was starting to drain. I still have most of the zeropage unused, and the data.bin is untouched, but shifted 4 pixels like the rest of you also did. The last optimization that gained me some $18 frames or so was to unroll the vertex read-loop for one face, and thus also duplicating the line-init (via macro) to operate on various vertex combos. Gained more than i expected.

Ok, time to check out the other entries to see if there is something interesting.. Should be for a lamer like me :P.

2013-06-26 20:16

Cruzer

Registered: Dec 2001
Posts: 1048

Congrats to Bitbreaker and ChristopherJam, truly impressed. Guess it's back to the drawing board. Great to see that the filled vector standard has reached a new level compared to the 1992 style that ruled for many years.

2013-06-26 20:27

HCL

Registered: Feb 2003
Posts: 716

Cruzer's implementation looks really clean and almost unoptimized. I bet you could gain some easy frames here and there..

Axis's ball is crapped on places.. I call for a ~$10 frames penalty ;). Well, it's probably just an easy bug fix, but it looks sloppy.. 3 bytes per vertex was kinda innovative, don't you think?

CJam, WTF? Looped clear + eor-fill, and still you beat us with margin. Ok, some zp-code there, and lots of precalced data, but i'm still like WTF?! Gotta learn the lesson :P.

Bitbreaker's ball was fast, ok, but you have also had two compos to optimize it ;). Besides i still think that span-filling is slower if you realtime-calc the vectors, hmm. that requires a proof i suppose :P ..and you *are* having double buffers, so it would be a piece of cake to do it bug-free then!?

I don't know if this applies to some of your lines, but i draw the lines backwards.. Then i don't need to find an address to put the RTS and then restore it. The line always finishes on the right place with RTS, and i just have to find out where to start. WTF, i did the worst result, i shouldn't come with tips and trix.. i should try to learn instead, it's just hard to change roles :P.
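The backwards-drawing trick above boils down to computing an entry point instead of patching an RTS. A small sketch of that address calculation follows; the fragment size, fragment count and load address are hypothetical values picked for illustration, not taken from HCL's code.

```python
FRAGMENT_BYTES = 19      # hypothetical size of one unrolled pixel fragment
MAX_PIXELS = 32          # hypothetical number of fragments in the routine
ROUTINE_START = 0x4000   # hypothetical address of the first fragment

# The single RTS sits right after the last fragment and is never moved.
RTS_ADDRESS = ROUTINE_START + MAX_PIXELS * FRAGMENT_BYTES


def entry_address(pixels):
    """Address to jump to so that exactly `pixels` fragments run before the RTS."""
    if not 0 <= pixels <= MAX_PIXELS:
        raise ValueError("line too long for the unrolled routine")
    return RTS_ADDRESS - pixels * FRAGMENT_BYTES


if __name__ == "__main__":
    for n in (1, 8, 32):
        print(n, hex(entry_address(n)))
```

A table of such entry addresses, indexed by line length, would replace the patch and restore steps per line.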

2013-06-27 06:36

Bitbreaker

Registered: Oct 2002
Posts: 500

Quoting HCL

Bitbreaker's ball was fast, ok, but you have also had two compos to optimize it ;). Besides i still think that span-filling is slower if you realtime-calc the vectors, hmm. that requires a proof i suppose :P ..and you *are* having double buffers, so it would be a piece of cake to do it bug-free then!?

What do you mean by bug free? without tearing? In fact the tearing is not too heavy on the real machine. I might vsync but lose a few frames by that. Redoing the clearing to make it happen linewise and not columnwise would help, to make the syncing happen within a tighter range.
As for realtime calced vectors i'd need ~$33e for just the filling and calculations coming along with that, if i remember correctly.
Also i had to do many things on that filler from scratch or can we take that as a hook for some serious drahma please? :-P

2013-06-27 07:28

PopMilo

Registered: Mar 2004
Posts: 145

Thanks!

Thank you Bitbreaker for making this compo, thank you all who showed how to code this thingy...

My take on this - we need more of these small, focused competitions!

2013-06-27 07:33

ChristopherJam

Registered: Aug 2004
Posts: 1378

An eorfiller with double buffers can be tear free without losing any frames by only changing $d016 in the IRQ, but any plotter that spends most of its time writing to the final bitmap requires triple buffers for the same effect - and if you've unrolled code dedicated to each buffer, that eats memory fast.

@HCL, the looped clear+fill costs me an extra few thousand cycles per frame, but it frees up memory for more unrolled speedcode fragments, which saves me more time than I lose to the loops.

As it is I'd hoped to spend an evening tuning how much memory I allocate to each of the clear, fill, and edge routines to optimise cycles, but like you I misread the deadline!

The speedcode uses a mix of inlined and JSRs for incrementing the column pointer, with inlines on the more commonly used cases towards the ends of the routines, but JSRs to save RAM on the less used cases. Again, the distribution's probably not all that optimal.

I draw lines left to right or right to left depending whether the slope is up or down; this way I only have positive slopes in my slope table.

2013-06-27 18:10

chatGPZ

Registered: Dec 2001
Posts: 11108

darn. i find myself working on the damned packer now, because i think a both faster AND shorter version (than the ones shown) is within reach. damn urge to code useless crap it is =)

2013-06-27 18:17

The Syndrom

Registered: Aug 2005
Posts: 56

apart from the animation-approach I even think Shadow's very nice version can be improved, if you split the charsets into 4 or 8 to gain full resolution. You'd still need some kind of preprocessor to sort/match the charsets to the 128 frames.

2013-06-27 20:14

Shadow
Account closed

Registered: Apr 2002
Posts: 355

Actually, by just using a custom-charset instead of the standard C64 one, I think I could get something that looks considerably better (especially given the very fast rotation speed at 50 fps).
Hmm... maybe I will give it a try just for fun!

2013-06-27 21:10

algorithm

Registered: May 2002
Posts: 702

Having it look good at 50fps would require either more of a smoother transition between each frame (256 frames+) or some type of interpolation.

Notice frame by frame how some frames are near identical or/and with char flipping can be reused.

2013-06-27 21:54

Shadow
Account closed

Registered: Apr 2002
Posts: 355

Quote:

Actually, by just using a custom-charset instead of the standard C64 one, I think I could get something that looks considerably better

I hereby retract the statement above. It looked like crap! I would probably need more intelligence in the charset analyzer, the "as-few-bits-diff-as-possible"-method didn't cut it at all.

2013-06-28 07:08

Bitbreaker

Registered: Oct 2002
Posts: 500

dear animators,

here's my petscii attempt

and ugly attempt (charset with 256 individual chars for all frames)

2013-06-28 07:14

PopMilo

Registered: Mar 2004
Posts: 145

@Bitbreaker: Good enough to totally distract me on workplace!
Looks decent on larger surfaces... Not bad at all.

2013-06-28 10:06

Cruzer

Registered: Dec 2001
Posts: 1048

How about using half sized chars (8x4 px)? The animation should still fit in memory with 128 frames and 15x30 cells without the corners.

2013-06-28 10:27

Bitbreaker

Registered: Oct 2002
Posts: 500

Quote: How about using half sized chars (8x4 px)? The animation should still fit in memory with 128 frames and 15x30 cells without the corners.

Would be possible, but it might still look crap, and actually i don't bother much pushing those frames through my various converters :-) With one charset each 16 frames it still looks kind of crap :-)

2013-06-28 15:57

chatGPZ

Registered: Dec 2001
Posts: 11108

i doubt anyone would have noticed the crap even in the one charset version =) it moves so fast, you cant really make it worse by making the charset a bit inaccurate =D (still, i would aim for non-lossy compression.... lossy is crap by definition =P)

2013-06-30 05:48

Oswald

Registered: Apr 2002
Posts: 5017

interesting, now either everyone was wrong with the eor fillers so far, or bitbreaker found a corner case for span fillers. I didn't think I would see this moving this fast though, and that includes hcl's version :)

2013-06-30 07:24

Axis/Oxyron

Registered: Apr 2007
Posts: 91

It is ofcourse an extreme combination of corner cases.
Hires leads to the fact that the eor fillers cant use their profit of drawing the lines only in half the resolution.
An object without any shared edges helps the span filler to avoid drawing the edges twice and avoids a lot of masking and oring at the edges.
Also the structure of the object is pretty optimal for span fillers.
So, the same fillers with different data would lead to total different results.
With a low poly object in lores I am sure Cruzer will run circles around all of us.

2013-07-01 07:09

Bitbreaker

Registered: Oct 2002
Posts: 500

Quoting Axis/Oxyron

It is ofcourse an extreme combination of corner cases.
Hires leads to the fact that the eor fillers cant use their profit of drawing the lines only in half the resolution.
An object without any shared edges helps the span filler to avoid drawing the edges twice and avoids a lot of masking and oring at the edges.
Also the structure of the object is pretty optimal for span fillers.
So, the same fillers with different data would lead to total different results.
With a low poly object in lores I am sure Cruzer will run circles around all of us.

No one wants lowres anymore nowadays :-) Also i am not sure if the span filler performs worse on low poly stuff. Of course edges are shared then (which is expensive in the current implementation, as it implies some overhead per line and face), but bigger areas that i can fill with 5 cycles per 8 pixels get filled pretty fast in return.

2013-07-01 11:17

Oswald

Registered: Apr 2002
Posts: 5017

nobody? :) I love early / mid 90's style pc/amiga LUT effects, which is usually not possible in hires.

It would be interesting to see how your span filler performs on a cube compared to eor fillers. as you say it's indeed superior when looking at larger areas (5 cycles vs 8 or more). one could ofcourse buffer the edges to avoid calculating them twice, but that has some overhead as well, so one probably has to go through all the pain and code it entirely to see whether it's worth it. span fillers lose out on the edges to eor fillers (the preparations to jump into the correct speedcode span & masking the edges), it's hard to guess which would win.

2013-07-01 18:34

Bitbreaker

Registered: Oct 2002
Posts: 500

A cube at around same size is done in $14a frames, including different patterns per face, and without using slopetables but calculating the slopes on the fly. If anyone has the need to try i can provide the data used for that testcase.

2013-07-01 18:40

chatGPZ

Registered: Dec 2001
Posts: 11108

Quote:

nobody? :) I love early / mid 90's style pc/amiga LUT effects, which is usually not possible in hires.

just do it i say :)

2013-07-02 06:23

Oswald

Registered: Apr 2002
Posts: 5017

Quote: A cube at around the same size is done in $14a frames, including different patterns per face, and without using slopetables but calculating the slopes on the fly. If anyone wants to try, i can provide the data used for that testcase.

$14a for 128 rotation phases? that's ~2.5 frames per phase, a nicely optimized eorfill version should be 2 or a bit more.

2013-07-02 06:36

Bitbreaker

Registered: Oct 2002
Posts: 500

*sigh*

$14a for $100 rendered frames, just as with the vectorball where it is also $100 rendered frames.
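
The arithmetic behind the two readings of that number, sketched in Python as a calculator. It assumes the benchmark counter counts screen refreshes, as described in the compo rules; the function name is made up for illustration.

```python
def refreshes_per_rendered_frame(counter, rendered_frames):
    """Average number of screen refreshes spent per rendered animation frame."""
    return counter / rendered_frames


counter = 0x14A  # benchmark result for the cube, 330 decimal

# Actual setup: $100 (256) rendered frames.
print(refreshes_per_rendered_frame(counter, 0x100))  # 1.2890625

# Oswald's assumption of 128 rotation phases.
print(refreshes_per_rendered_frame(counter, 128))    # 2.578125
```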

2013-07-02 07:56

Oswald

Registered: Apr 2002
Posts: 5017

well, then it's pretty much killing the eor filler method, and you even have dither for free :)

2013-07-02 11:26

ChristopherJam

Registered: Aug 2004
Posts: 1378

Hey, eor filler has dither for free if you fill from left to right instead of top to bottom (eg final part of Effluvium).

2013-07-02 12:52

Oswald

Registered: Apr 2002
Posts: 5017

how do you fill from left to right?

I remember my very first naive filler tried doing something like that with tables, so if a byte before filling looked like:

00110000

then feed it into a table and you get:

00111111

fill left to right ;)
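
A minimal sketch of such a per-byte lookup table in Python: everything to the right of the leftmost set bit gets filled. The post does not say how a second edge in the same byte or the carry into the following bytes was handled, so this sketch leaves both out; the function name is made up.

```python
def build_fill_table():
    """256-entry table: set every bit from the leftmost set bit down to bit 0."""
    table = []
    for b in range(256):
        # bit_length() is the position just above the leftmost set bit,
        # so (1 << bit_length) - 1 sets that bit and all bits to its right.
        table.append((1 << b.bit_length()) - 1)
    return table


fill_table = build_fill_table()
print(format(fill_table[0b00110000], "08b"))  # 00111111
```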

2013-07-02 13:02

JackAsser

Registered: Jun 2002
Posts: 1989

Quote: well, then it's pretty much killing the eor filler method, and you even have dither for free :)

EOR-fill dithering is free in the fill-phase but requires "thick" vertical lines when drawing the lines.

2013-07-02 13:36

ChristopherJam

Registered: Aug 2004
Posts: 1378

@Oswald, it requires two lda:eor:sta sequences for each row the edge crosses: one to set the right-hand pixels of the byte containing the edge, one for the left-hand pixels of the next byte. I guess that's what @JackAsser means by 'thick lines'?

Eg, for an edge that starts 43 MCM pixels into a row, flipping from solid 0x00 to solid 0xff:

lda#$03:eor row+10:sta row+10
lda#$fc:eor row+11:sta row+11

(assuming contiguous bytes for each row of the eorbuffer; I can't actually remember whether effluvium had a fullscreen eorbuffer or if I just did one row at a time)
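
A Python model of the two EOR marks above plus the left-to-right fill pass. The running-EOR fill loop, the 40-byte row and the function names are assumptions for illustration (the post only gives the two marks); the delta parameter is likewise a guess at how a pattern change would be expressed.

```python
BYTES_PER_ROW = 40
PIXELS_PER_BYTE = 4  # multicolor: 2 bits per pixel


def mark_edge(row, pixel, delta=0xFF):
    """EOR the edge marks for an edge starting at MCM pixel `pixel`.

    delta is old_pattern ^ new_pattern; $ff flips solid $00 to solid $ff.
    """
    byte, offset = divmod(pixel, PIXELS_PER_BYTE)
    # Bits of the pixels at and to the right of the edge inside this byte.
    right_mask = (1 << (2 * (PIXELS_PER_BYTE - offset))) - 1
    row[byte] ^= delta & right_mask
    if byte + 1 < len(row):
        # Remaining bits go into the next byte so the running value ends up at delta.
        row[byte + 1] ^= delta & ~right_mask & 0xFF


def fill_left_to_right(row):
    """Running EOR along the row: each output byte is the EOR of all marks so far."""
    acc = 0
    out = bytearray(len(row))
    for i, b in enumerate(row):
        acc ^= b
        out[i] = acc
    return out


row = bytearray(BYTES_PER_ROW)
mark_edge(row, 43)
print(hex(row[10]), hex(row[11]))  # 0x3 0xfc
filled = fill_left_to_right(row)
print(hex(filled[9]), hex(filled[10]), hex(filled[11]), hex(filled[39]))  # 0x0 0x3 0xff 0xff
```

With these assumptions the marks for pixel 43 come out as $03 in byte 10 and $fc in byte 11, matching the two lines in the post.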

2013-07-02 14:03

Oswald

Registered: Apr 2002
Posts: 5017

I don't get it, but two runs are more expensive than span filling or the eor filling method. What jackasser means.. I guess it's easier to see how graham did it in natural wonders, it takes a lot to explain :)

2013-07-02 21:20

HCL

Registered: Feb 2003
Posts: 716

$14a sounds too fast for an eor-filler where ~$a0 is spent on the eor-shit.. (if i got the calculations right :P). But i think that "glenz" might be the thing that could keep the eor-filler alive. Drawing many lines *should* be faster with the eor-filler since nothing else than the line itself needs to be in fast registers, then multicolor also gives a benefit. So perhaps it turns out that both Bitbreaker and Cruzer found their corner cases where their mad implementations are just superior to anything else ;).

2013-07-03 07:30

Bitbreaker

Registered: Oct 2002
Posts: 500

@HCL: no doubt, at glenz my routines would suck hard, that is where eor has its true benefit :-)

2013-09-20 12:05

PopMilo

Registered: Mar 2004
Posts: 145

Quote: Would be possible, but it might still look crap, and actually i don't bother much pushing those frames through my various converters :-) With one charset each 16 frames it still looks kind of crap :-)

@Bitbreaker: Any chance on 'lending' any of those converters ? :)

Would like to test char-based animation on another type of graphic...

All times are CET.